The Kitchen Sink and Other Oddities

Atabey Kaygun

Longest common subsequence of two sequences

Description of the problem

Today, I’d like to show you a recursive algorithm that gives the longest common subsequence of two sequences. I got the algorithm from Rosetta Code. The alternate Common Lisp code you see at there is mine :)

The algorithm

The idea is simple: we have two cursors. One at the first sequence A, another on the second sequence B. If the elements under the cursors are equal then the longest common sequence is the common element spliced with the longest common sequence we obtain as we move the cursors to the right. If they are not equal, then we perform two calculations:

  1. we move the cursor to the right on A, but leave the cursor on B unchanged. Then we calculate the longest common subsequence after the cursors

  2. we move the cursor to the right on B, but leave the cursor on A unchanged. Then we calculate the longest common subsequence after the cursors

The longer of the result is the winner.

An implementation

Here is an implementation of the algorithm I described above:

(defun lcs (xs ys)
  (labels ((longer (a b) (if (> (length a) (length b)) a b)))
     (cond ((or (null xs) (null ys)) nil)
           ((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys))))
       (t (longer (lcs (cdr xs) ys) (lcs xs (cdr ys)))))))

LCS

Let me test it

(defvar a (loop repeat 14 collect (random 10)))

A

(6 9 5 1 3 5 4 0 7 4 9 1 1 3)

(defvar b (loop repeat 17 collect (random 12)))

B

(7 10 2 8 9 11 1 10 7 9 11 8 10 0 5 6 11)

(time (lcs a b))

(9 1 7 9)


Evaluation took:
 4.437 seconds of real time
 4.440000 seconds of total run time (4.440000 user, 0.000000 system)
 100.07% CPU
 10,625,212,475 processor cycles
 1,040,384 bytes consed

As we can immediately see, it takes too much time even for a small pair of sequences. One possible solution is to use memoization.

Another implementation with memoization

Here is a macro that I wrote earlier that implements memoized version of a function.

(defmacro mem-defun (name args body)
  (let ((hash-name (gensym)))
    `(let ((,hash-name (make-hash-table :test 'equal)))
       (defun ,name ,args 
         (or (gethash (list ,@args) ,hash-name)
             (setf (gethash (list ,@args) ,hash-name)
                   ,body))))))

MEM-DEFUN

So, when we apply it to the function we have we get

(mem-defun lcs (xs ys)
  (labels ((longer (a b) (if (> (length a) (length b)) a b)))
     (cond ((or (null xs) (null ys)) nil)
           ((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys))))
       (t (longer (lcs (cdr xs) ys) (lcs xs (cdr ys)))))))

LCS

Let me test it again:

(time (lcs a b))

(9 1 7 9)


Evaluation took:
  0.006 seconds of real time
  0.008000 seconds of total run time (0.008000 user, 0.000000 system)
  133.33% CPU
  12,112,737 processor cycles
  476,712 bytes consed

This is a clear winner.