[Click] click-fastclassifier breaks alignment

rootkit85 at yahoo.it
Mon Apr 26 06:51:50 EDT 2010


On Fri, Apr 23, 2010 at 4:54 PM, Eddie Kohler <kohler at cs.ucla.edu> wrote:
> Hi Matteo,
>
> I think there is something else going on.  Click-fastclassifier is designed
> to handle alignment correctly, but this requires that the input
> configuration has been passed through click-align.  Do you pass your config
> through click-align BEFORE passing it to click-fastclassifier?  I think the
> answer is Yes.  In that case, then, can you confirm whether the alignment
> click-align produces is correct?  Can you send your configuration both
> before & after click-align?
>
> E
>
>
> On 4/23/10 5:36 AM, rootkit85 at yahoo.it wrote:
>>
>> click-fastclassifier generates code which produces _lots_ of unaligned
>> accesses on the MIPS platform.
>> For example this code snippet:
>>
>> FastClassifier_a_aeth::length_unchecked_push(Packet *p)
>> {
>>  const unsigned *data = (const unsigned *)(p->data() - 2);
>>  step_0:
>>  if ((data[3] & 0xffff) != 0x800)
>>    goto step_4;
>> [...]
>>
>> is wrong even if the packet is properly aligned, as it reads 32-bit
>> words through a pointer whose address % 4 != 0.
>> This could easily be fixed by using uint8 in the byte-matching code.
>> As the fastclassifier tool is quite a lot of code, do you know of a
>> quick way to change it in order to fix this?
>>
>> Thanks a lot,
>> Matteo Croce
>
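
To illustrate the uint8 suggestion in the quoted message: the generated
test compares the 16 bits at packet offset 12 (the EtherType) against
0x0800. A minimal sketch of the same test done byte by byte, which needs no
alignment at all, could look like this. The helper name is hypothetical and
is not something click-fastclassifier emits; the sketch also assumes the
quoted code was generated for a big-endian host, so that the low 16 bits of
data[3] are packet bytes 12 and 13.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Click or click-fastclassifier.
// Returns true if the Ethernet frame starting at `frame` carries
// EtherType 0x0800 (IPv4). Only single bytes are read, so the check
// is safe for any alignment of `frame`.
static inline bool ethertype_is_ip(const unsigned char *frame)
{
    // EtherType occupies bytes 12 and 13 of the Ethernet header,
    // stored most-significant byte first.
    return frame[12] == 0x08 && frame[13] == 0x00;
}
```

The cost is two byte loads and two compares instead of one word load, one
mask and one compare.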

Well, I have a remark about Click's alignment process.
click-align adds Align elements to the configuration; these call
Packet::push(), which in turn calls memmove().
On some architectures, like the one I'm using (MIPS 24K), it seems that
doing ~50 unaligned accesses is faster than copying 1500 bytes of data.
I managed to run an unaligned configuration and gained ~15 MB/sec of real
throughput.
So the best solution to the alignment problem would be to access memory
through char*.
I get a huge improvement in both throughput and processing time with
this small patch:

--- a/elements/standard/classifier.cc	2010-04-12 18:56:27.000000000 +0200
+++ b/elements/standard/classifier.cc	2010-04-26 12:45:20.677465574 +0200
@@ -1127,7 +1127,8 @@
   }

   do {
-      uint32_t data = *((const uint32_t *)(packet_data + ex[pos].offset));
+      const unsigned char *datac = packet_data + ex[pos].offset;
+      uint32_t data = (datac[0] << 24) | (datac[1] << 16) | (datac[2] << 8) | datac[3];
       data &= ex[pos].mask.u;
       pos = ex[pos].j[data == ex[pos].value.u];
   } while (pos > 0);
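
One caveat on the patch above: it assembles the word most-significant byte
first, so it yields the same value as the original uint32_t load only on a
big-endian host. A rough sketch of two alternatives follows. Both helper
names are hypothetical and not part of Click. The first copies the four
bytes with memcpy, which keeps host byte order and is safe for any
alignment; the second is the same big-endian assembly as in the patch, with
casts added so the shift by 24 is done on an unsigned 32-bit value.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of Click.
// Loads 4 bytes in host byte order from a possibly unaligned address.
// memcpy has no alignment requirement; compilers typically turn this
// into whatever unaligned-safe load sequence the target supports.
static inline uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Hypothetical helper, not part of Click.
// Assembles 4 bytes most-significant first (big-endian order), like the
// patch does. Each byte is widened to uint32_t before shifting, so a
// byte >= 0x80 is never shifted into the sign bit of a signed int.
static inline uint32_t load_u32_be(const unsigned char *p)
{
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16)
         | (uint32_t(p[2]) << 8)  |  uint32_t(p[3]);
}
```

Which one is faster on the MIPS 24K I have not measured; the throughput
numbers above are for the patch as posted.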


