The Kitchen Sink and Other Oddities

Atabey Kaygun

A Simple Monte-Carlo Integration Implementation in Lisp

Description of the problem

Assume \(\Omega\subseteq \mathbb{R}^n\) is a region, and that we have a rectangular box \(B := I_1\times\cdots\times I_n\) which contains \(\Omega\). Imagine also that we have a function (usually called a characteristic function) \(\chi_\Omega\colon B\to \{T,F\}\) which returns a boolean value telling us whether a given input lies in \(\Omega\), or not.

Now, using uniform distributions on the intervals \(I_i\) for \(i=1,\ldots,n\), if we sample \(N\) points from the box \(B\) and count the sample points which lie in \(\Omega\) we can estimate the volume \(\Omega\). Let \(p\) be the number of sample points which lie in \(\Omega\). Then the ratio \(\frac{p}{N}\) approximates the ratio of the volume of \(\Omega\) by the volume of the box \(B\).

An implementation

I am going to use the following lisp function.

(defun monte-carlo (experiment n &optional (scale 1.0)) 
   (do ((i 0 (1+ i))
        (p 0 (if (funcall experiment) (1+ p) p)))
       ((> i n) (/ (* p scale) n))))
MONTE-CARLO

Here the argument function experiment needs to do both the sampling and the testing. Let us test it on a function. Let \(f(x) = \frac{1}{\sqrt{2\pi}} exp(-x^2/2)\) and assume that I would like to integrate \(f(x)\) on \(\[-1.96,1.96\]\).
I will then write:

(defun experiment-erf ()
   (let* ((x (- (random 3.92) 1.96))
          (y (random 1.0))
          (z (/ (exp (- 0 (* x x 0.5))) 
                (sqrt (* 8 (atan 1))))))
         (< y z)))
EXPERIMENT-ERF

Notice that the bounding box \(\[-1.96,1.96\]\times \[0,1\]\) has area \(3.92\). So, if we call our function monte-carlo with a specific sample size, the return value is going to be an approximation of the value \[ \frac{1}{3.92}\int_{-1.96}^{1.96} \frac{e^{-x^2/2}}{\sqrt{2\pi}} dx \] So, the integral we are looking for is going to be obtained by multiplying the return value by \(3.92\).

(monte-carlo 'experiment-erf 500 3.92)
0.89376

The result should be approximately \(0.95\). Let me statistically analyze the function simply calculating its average and standard deviation over 100 runs.

(defparameter collection 
               (loop for i from 1 to 100 collect (monte-carlo 'experiment-erf 500 3.92)))
COLLECTION

In order to analyze the ensemble, I will need the following function:

(defun analyze (collection)
   (let* ((N (length collection))
          (average (/ (reduce '+ collection) N))
          (deviation (sqrt (/ (loop for x in collection 
                                    sum (let ((u (- x average))) (* u u))) (1- N)))))
         (format nil "Average: ~A~%    Deviation: ~A~%" average deviation)))
ANALYZE

And now, if we analyze our data we get

(analyze collection)
Average: 0.95953757
Deviation: 0.079398565

As one can easily see, the monte carlo integration method is not very efficient in a 1-dimensional integral as even the ordinary Riemann sum would probably give a better convergence rate with respect to the sample size. So, let us move to a higher dimensional problem.

This time, I would like to approximate the area within an ellipse. Specifically the area of the region \(\frac{x^2}{3} + \frac{y^2}{5} \leq 1\). Then I will need the following experiment function

(defun experiment-ellipse ()
   (let* ((a (sqrt 3))
          (b (sqrt 5))
          (x (- (random (* 2 a)) a))
          (y (- (random (* 2 b)) b))
          (z (+ (/ (* x x) 3) (/ (* y y) 5))))
         (< z 1)))
EXPERIMENT-ELLIPSE

Now, let me put this function through a test: We know that the real value of the area is \(\pi\sqrt{15}\) which is approximately 12.167336 and from a single run we get

(monte-carlo 'experiment-ellipse 200 (* (sqrt 15) 4))
11.309112

and if we look at 100 runs we will get

(setf collection 
     (loop for i from 1 to 100 collect (monte-carlo 'experiment-ellipse 200 (* 4 (sqrt 15)))))
(analyze collection)
Average: 12.214616
Deviation: 0.45041347

In practice, it would not be feasible to actually test the statistical quality of the result we will obtain using ensambles as I did above. Let me see what happens if I increase the sample size

(time (monte-carlo 'experiment-ellipse 999999 (* 4 (sqrt 15))))
12.171126

Evaluation took:
 0.163 seconds of real time
 0.160010 seconds of total run time (0.160010 user, 0.000000 system)
 98.16% CPU
 388,385,688 processor cycles
 0 bytes consed

Not bad for this sample size, considering the undeserved reputation of lisp in numerical computations.