Wikipedia offers the following method to compute the longest common subsequence of two sequences (using the so-called "Generic LCS Delta" algorithm):
LCS-Delta(X,Y) m <- LENGTH[X]; n <- LENGTH[Y]; for i <- 1 to m, do c[i,0] <-0; for j <- 0 to n, do c[0,j] <-0; b <- c; for i <- 1 to m, do { for j <- 1 to n do { if (xi = yj) { c[i,j] <- c[i-1,j-1]+1; b[i,j] <- "UP-LEFT"; } else if (c[i-1,j] >= c[i,j-1]) { c[i,j] <- c[i-1,j]; b[i,j] <- "UP"; } else { c[i,j] <- c[i,j-1]; b[i,j] <- "LEFT"; } } } return c and b
This algorithm fails for various trivial cases, such as (X, Y) = ({1}, {-1}), (X, Y) = ({1}, {}), and (X, Y) = ({}, {1}). The problem is that the highlighted lines should read as follows:
for i <- 1 to m, do c[i,0] <- "LEFT";
for j <- 0 to n, do c[0,j] <- "UP";
If you neglect this initialization, differences at the beginning of the sequence are ignored.
Actually, I am probably being too hard on the wikipedia article since it suggests that the differences be computed from the LCS array using the following, somewhat confusing, algorithm:
The following algorithm will give the list of edits needed to make Y equal to X (the diff algorithm):
Notation: [n,m] refers to a LCS component at X[n] and Y[n]
Assumptions:
The LCS as found by findLCS() is placed on a stack moving from the lower-right corner and
following the direction arrows to form the LCS.
oldElem <---- [-1,-1]
while stack is not empty
elem <---- pop from stack
// diff does not require LCS components but add to list here if desired
for i=oldElem[n] to elem[n]
add delete X(i) from list of edits
for i=oldElem[m] to elem[m]
add insert Y(i) to list of edits
The change that I suggest makes it possible to recover the LCS by reversing the links in the table rather than using a stack. I have an implementation in Java.