Today, I’d like to show you a recursive algorithm that gives the longest common subsequence of two sequences. I got the algorithm from Rosetta Code. The alternate Common Lisp code you see at there is mine :)
The idea is simple: we have two cursors. One at the first sequence A, another on the second sequence B. If the elements under the cursors are equal then the longest common sequence is the common element spliced with the longest common sequence we obtain as we move the cursors to the right. If they are not equal, then we perform two calculations:
we move the cursor to the right on A, but leave the cursor on B unchanged. Then we calculate the longest common subsequence after the cursors
we move the cursor to the right on B, but leave the cursor on A unchanged. Then we calculate the longest common subsequence after the cursors
The longer of the result is the winner.
Here is an implementation of the algorithm I described above:
(defun lcs (xs ys)
(labels ((longer (a b) (if (> (length a) (length b)) a b)))
(cond ((or (null xs) (null ys)) nil)
((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys))))
(t (longer (lcs (cdr xs) ys) (lcs xs (cdr ys)))))))
LCS
Let me test it
(defvar a (loop repeat 14 collect (random 10)))
A
(6 9 5 1 3 5 4 0 7 4 9 1 1 3)
(defvar b (loop repeat 17 collect (random 12)))
B
(7 10 2 8 9 11 1 10 7 9 11 8 10 0 5 6 11)
(time (lcs a b))
(9 1 7 9)
Evaluation took:
4.437 seconds of real time
4.440000 seconds of total run time (4.440000 user, 0.000000 system)
100.07% CPU
10,625,212,475 processor cycles
1,040,384 bytes consed
As we can immediately see, it takes too much time even for a small pair of sequences. One possible solution is to use memoization.
Here is a macro that I wrote earlier that implements memoized version of a function.
(defmacro mem-defun (name args body)
(let ((hash-name (gensym)))
`(let ((,hash-name (make-hash-table :test 'equal)))
(defun ,name ,args
(or (gethash (list ,@args) ,hash-name)
(setf (gethash (list ,@args) ,hash-name)
,body))))))
MEM-DEFUN
So, when we apply it to the function we have we get
(mem-defun lcs (xs ys)
(labels ((longer (a b) (if (> (length a) (length b)) a b)))
(cond ((or (null xs) (null ys)) nil)
((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys))))
(t (longer (lcs (cdr xs) ys) (lcs xs (cdr ys)))))))
LCS
Let me test it again:
(time (lcs a b))
(9 1 7 9)
Evaluation took:
0.006 seconds of real time
0.008000 seconds of total run time (0.008000 user, 0.000000 system)
133.33% CPU
12,112,737 processor cycles
476,712 bytes consed
This is a clear winner.