Assume \(\Omega\subseteq \mathbb{R}^n\) is a region, and that we have a rectangular box \(B := I_1\times\cdots\times I_n\) which contains \(\Omega\). Imagine also that we have a function (usually called a characteristic function) \(\chi_\Omega\colon B\to \{T,F\}\) which returns a boolean value telling us whether a given input lies in \(\Omega\), or not.
Now, using uniform distributions on the intervals \(I_i\) for \(i=1,\ldots,n\), if we sample \(N\) points from the box \(B\) and count the sample points which lie in \(\Omega\) we can estimate the volume \(\Omega\). Let \(p\) be the number of sample points which lie in \(\Omega\). Then the ratio \(\frac{p}{N}\) approximates the ratio of the volume of \(\Omega\) by the volume of the box \(B\).
I am going to use the following lisp function.
(defun monte-carlo (experiment n &optional (scale 1.0))
(do ((i 0 (1+ i))
(p 0 (if (funcall experiment) (1+ p) p)))
((> i n) (/ (* p scale) n))))
MONTE-CARLO
Here the argument function experiment
needs to do both
the sampling and the testing. Let us test it on a function. Let \(f(x) =
\frac{1}{\sqrt{2\pi}} exp(-x^2/2)\) and assume that I would like
to integrate \(f(x)\) on \(\[-1.96,1.96\]\).
I will then write:
(defun experiment-erf ()
(let* ((x (- (random 3.92) 1.96))
(y (random 1.0))
(z (/ (exp (- 0 (* x x 0.5)))
(sqrt (* 8 (atan 1))))))
(< y z)))
EXPERIMENT-ERF
Notice that the bounding box \(\[-1.96,1.96\]\times \[0,1\]\) has area
\(3.92\). So, if we call our function
monte-carlo
with a specific sample size, the return value
is going to be an approximation of the value \[ \frac{1}{3.92}\int_{-1.96}^{1.96}
\frac{e^{-x^2/2}}{\sqrt{2\pi}} dx \] So, the integral we are
looking for is going to be obtained by multiplying the return value by
\(3.92\).
(monte-carlo 'experiment-erf 500 3.92)
0.89376
The result should be approximately \(0.95\). Let me statistically analyze the function simply calculating its average and standard deviation over 100 runs.
(defparameter collection
(loop for i from 1 to 100 collect (monte-carlo 'experiment-erf 500 3.92)))
COLLECTION
In order to analyze the ensemble, I will need the following function:
(defun analyze (collection)
(let* ((N (length collection))
(average (/ (reduce '+ collection) N))
(deviation (sqrt (/ (loop for x in collection
sum (let ((u (- x average))) (* u u))) (1- N)))))
(format nil "Average: ~A~% Deviation: ~A~%" average deviation)))
ANALYZE
And now, if we analyze our data we get
(analyze collection)
Average: 0.95953757
Deviation: 0.079398565
As one can easily see, the monte carlo integration method is not very efficient in a 1-dimensional integral as even the ordinary Riemann sum would probably give a better convergence rate with respect to the sample size. So, let us move to a higher dimensional problem.
This time, I would like to approximate the area within an ellipse. Specifically the area of the region \(\frac{x^2}{3} + \frac{y^2}{5} \leq 1\). Then I will need the following experiment function
(defun experiment-ellipse ()
(let* ((a (sqrt 3))
(b (sqrt 5))
(x (- (random (* 2 a)) a))
(y (- (random (* 2 b)) b))
(z (+ (/ (* x x) 3) (/ (* y y) 5))))
(< z 1)))
EXPERIMENT-ELLIPSE
Now, let me put this function through a test: We know that the real value of the area is \(\pi\sqrt{15}\) which is approximately 12.167336 and from a single run we get
(monte-carlo 'experiment-ellipse 200 (* (sqrt 15) 4))
11.309112
and if we look at 100 runs we will get
(setf collection
(loop for i from 1 to 100 collect (monte-carlo 'experiment-ellipse 200 (* 4 (sqrt 15)))))
(analyze collection)
Average: 12.214616
Deviation: 0.45041347
In practice, it would not be feasible to actually test the statistical quality of the result we will obtain using ensambles as I did above. Let me see what happens if I increase the sample size
(time (monte-carlo 'experiment-ellipse 999999 (* 4 (sqrt 15))))
12.171126
Evaluation took:
0.163 seconds of real time
0.160010 seconds of total run time (0.160010 user, 0.000000 system)
98.16% CPU
388,385,688 processor cycles
0 bytes consed
Not bad for this sample size, considering the undeserved reputation of lisp in numerical computations.