# Knuth Morris Pratt Prefix Tableby Isai Damier, Android Engineer @ Google

```/*********************************************************************
* Author: Isai Damier
* Title: Knuth Morris Pratt String Search Algorithms Prefix Table
* Project: geekviewpoint
* Package: algorithm.search.string
*
* This is the portion of the KMP algorithm that pre-processes the
* needle/pattern, P. It does so by matching the needle against shifts
* of itself. The resulting skip table S provides information for
* skipping fruitless shifts of the haystack/text.
*
* Let's start by setting up some fundamentals for the demonstration
* ahead. Say we are looping through P as in
*
*   for(int i=0; i<P.length; i++) visit(P[i]);
*
* Let "prefix" mean the portion of P we already visited before
* reaching char P[i].
*
* Then for P={abababca},
*  - the prefix X<i> of P when i=0 is X<0> = {}
*  - the prefix X<i> of P when i=1 is X<1> = {a}
*  - the prefix X<i> of P when i=2 is X<2> = {ab}
*  - the prefix X<i> of P when i=3 is X<3> = {aba} ...
*
* Now for the fun stuff: Let X<i> be the prefix of P with length i.
* Then we define the overlap between X and P as the longest suffix of
* X to match a prefix of P. (Note: this algorithm does not treat X as
* a proper suffix of itself.)
*
* For example, the overlap between X<5> and P is aba because
*
*     X<5>: ab|aba
*     P   :    aba|babca
*
* We could not use
*
*     X<5>: ababa
*     P   : ababa|bca
*
* because this algorithm does not treat X as its own suffix.
*
* Following is the filling of the skip table S:
*
*   i=0 X={} S={-1} Note: P[0] has no prefix
*
*   i=1 X={a} S={-1,0} Note: X has no suffix
*
*   i=2 X={ab} S={-1,0,0} Note: no match
*
*   i=3 X={aba} S={-1,0,0,1}
*     Note:
*        X<3>: ab|a
*        P   :    a|bababca
*     match ends after P[0], so start next shift at i=1
*
*   i=4 X={abab} S={-1,0,0,1,2}
*     Note:
*        X<4>: ab|ab
*        P      : ab|ababca
*     match ends after P[1], so start next shift at i=2
*
*   i=5 X={ababa} S={-1,0,0,1,2,3}
*     Note:
*        X<5>: ab|aba
*        P      : aba|babca
*     match ends after P[2], so start next shift at i=3
*
*   i=6 X={ababab} S={-1,0,0,1,2,3,4}
*     Note:
*        X<5>: ab|abab
*        P   :    abab|abca
*     match ends after P[3], so start next shift at i=4
*
*   i=7 X={abababc} S={-1,0,0,1,2,3,4,0}
*     Note: no match so restart at i=0
*
* For people with a background in computer engineering or integrated
* circuit design, this method is essentially sketching a state
* transition table/diagram for the input P. (as a refresher visit
* http://teahlab.com/ and browse or use the costume search).
*
********************************************************************/
private void prefixTable(char[] P, int[] T) {
int pos = 2, m = 0;
T[0] = -1;
while (pos < P.length) {
if (P[pos - 1] == P[m]) {
T[pos++] = ++m;
} else if (0 < T[m]) {
m = T[m];
} else {
pos++; //same as T[pos++] = 0;
}
}
}```