The Kitchen Sink and Other Oddities

Atabey Kaygun

Splitting Streams

I was watching Gary Fredericks’ talk at Clojure West on splitting streams of random numbers. He has a solution to the problem that works for him. I will do something a little more general, and show along the line that every random stream of objects can be split including what he calls linear random streams. There is no need to fiddle with the PRNG’s to make them split-able.

As per his talk, random sequence here means deterministic pseudo-random sequence: given an initial seed, the pseudo-random sequence will be the same as long as one uses the same PRNG.

What do we mean by splitting?

Given a stream of objects stream, a split (the mathematical name for the operation is `de-shuffle’ by the way) would create two streams stream-1 and stream-2 which are mutually disjoint and ordered. This means first that objects in both stream-1 and stream-2 are chosen from stream by preserving their order. In other words, if a appears before b in stream and if both appear in the same stream stream-i then a should appear before b also in stream-i. Mutual disjoint-ness condition states that there is no overlap in stream-1 and stream-2 in terms of the indices of the objects chosen: every object x in stream is either in stream-1 or in stream-2 but not both.

Streams of random numbers

I will use stateful random number generators. I will need the state later for splitting.

(defun make-a-number-stream (seed m)
   (cons (seed-random-state seed)
         (lambda (rng) (values (random m rng) rng))))
MAKE-A-NUMBER-STREAM

Here is the take function:

(defun take (stream n)
   (loop repeat n collect (funcall (cdr stream) (car stream))))
TAKE

Let me test it:

(let ((str (make-a-number-stream 0 15)))
   (take str 20))
(12 5 0 3 11 3 7 9 3 5 2 4 7 6 8 8 12 10 1 6)

If I call the same random stream (another stream created with the same seed) I would get the same sequence:

(let ((str (make-a-number-stream 0 15)))
   (take str 20))
(12 5 0 3 11 3 7 9 3 5 2 4 7 6 8 8 12 10 1 6)

but the results would be different if I use a different seed

(let ((str (make-a-number-stream 1 15)))
   (take str 20))
(5 11 12 8 9 11 5 0 0 1 12 7 13 12 6 9 2 4 14 5)

Splitting a stream

In order to split a stream, I am going to need an additional random stream of binary values of 0’s and 1’s. Then the original stream will be split into two using this new stream:

(defun filter (str seed val)
  (let* ((rng (seed-random-state seed))
         (cut (lambda () (= val (random 2 rng)))))
     (cons (seed-random-state (car str))
           (lambda (rng)
               (do ((x (funcall (cdr str) rng)
                       (funcall (cdr str) rng)))
                   ((funcall cut) x))))))
FILTER
(defun split-stream (stream seed)
   (values 
      (filter stream seed 1)
      (filter stream seed 0)))
SPLIT-STREAM

Remember the last example of the previous section? Let us split that. Aaand, voila!

(let ((str (make-a-number-stream 1 15)))
   (multiple-value-bind (left right)
         (split-stream str 10)
     (list (take left 10) (take right 10))))
((5 11 8 11 5 0 1 7 13 9) (12 9 0 12 12 6 2 4 14 5))

Addendum

Garry explained to me that performance-wise what I did is horrible, and does not scale at all for the applications he has in mind. It is essentially a filtering procedure, and after say \(n\) split, each split stream would be \(2^n\) times slower than the original which, as anyone can imagine, is bad. But then again, what I did is very general and that level of generality comes with a price. This reminds me an Alan Perlis quote “LISP programmers know the value of everything and the cost of nothing.”