[Click] Click-fastclassifier

Wed Aug 27 10:29:43 EDT 2003

Hi Jan,

The classifier optimization algorithm is not really documented anywhere
outside the code, despite the fact that it has taken more blood, sweat,
tears, and recoding than any other algorithm in Click :)

Note that classifier optimization actually takes place in
classifier.cc. Click-fastclassifier just changes the decision tree into
straight-line code.

Here's the comment describing what Classifier does now. Down at the bottom
of classifier.cc, you can see several older versions of the optimization
code.

/* Optimize Classifier decision trees by removing useless branches. If we have
   a path like:

   0: x>=5?  ---Y-->  1: y==2?  ---Y-->  2: x>=6?  ---Y-->  3: ...
       \
        --N-->...

   and every path to #1 leads from #0, then we can move #1's "Y" branch to
   point at state #3, since we know that the test at state #2 will always
   succeed.

   There's an obvious exponential-time algorithm to check this. Namely, given
   a state, enumerate all paths that could lead you to that state; then check
   the test against all tests on those paths. This terminates -- the
   classifier structure is a DAG -- but clearly in exptime.

   We reduce the algorithm to polynomial time by storing a bounded number of
   paths per state. For every state S, we maintain a set of up to
   MAX_DOMLIST==4 path subsets D1...D4, so *every* path to state S is a
   superset of at least one Di. (There is no requirement that S contains Di as
   a contiguous subpath. Rather, Di might leave out edges.) We can then shift
   edges as follows. Given an edge S.x-->T, check whether T is resolved (to
   the same answer) by every one of the path subsets D1...D4 corresponding to
   S. If so, then the edge S.x-->T is redundant; shift it to destination
   corresponding to the answer to T. (In the example above, we shift #1.Y to
   point to #3, since that is the destination of the #2.Y edge.) */

This algorithm can be thought of as an extensive modification of BPF+'s
redundant predicate elimination, described in
http://citeseer.nj.nec.com/begel99bpf.html . One modification is that Click
classifiers use a simpler base language, consisting of one operation ("load
one word of packet data, AND with mask, compare against value"), rather
than BPF's little machine language. Another modification is that Click
generalizes edge domination.

I should really write a paper about this stuff. :(

There's a body of literature on decision tree optimization that may or may
not be relevant. (It wasn't when I looked at it.)

Anyway, are you having a _problem_ with the classifier optimization?

Eddie