Help hunting a mysterious bug!
Juan Luis Baptiste
juancho at linuxmail.org
Wed Mar 5 01:55:16 EST 2003
Hi Eddie,
I have now finished the DNS Application Level Gateway for use with the NAT-PT elements (AddressTranslator,
ProtocolTranslator46 and ProtocolTranslator64), but I'm running into a REALLY strange problem. I hope someone
can give me a hint on this.
This is the setup:
     IPv6 Network                                            IPv4 Network
-----------------------     -------------------------     ---------------------
|      Machine A      |_____|    Click (NAT-PT)     |_____|     Machine B     |
|3ffe:1ce1:2:0:200::2 |     |3ffe:1ce1:2:0:200::2 / |     |   172.25.79.220   |
|      (1.0.0.1)      |     |     172.25.79.156     |     |                   |
-----------------------     -------------------------     ---------------------
Inside Click:
IPv6 IPv4
------
IPv6 routing | | IPv4 routing
.... ---> elements ---> AT ---> pt64 ---> DNSAlg ---> elements ---> ....
(ie. | | (ie.
.... <--- LookupIP6Route, | | LookupIPRoute, <--- ....
etc) | | etc)
| | | |
----DNSAlg <---| |<--- pt46 <---------------------
------
AT = AddressTranslator
pt46 = ProtocolTranslator46
pt64 = ProtocolTranslator64
The idea of DNSAlg is that you don't need to refer to machines by IP address; you refer to
them by domain name instead. For example, if you are on machine A and want to telnet to machine B,
instead of running telnet ::172.25.79.220 you would run telnet ipv4.test.com, and DNSAlg
would intercept the AAAA query and translate it into an A query so that the DNS server in the
IPv4 network can answer it. The response to the A query is then translated again by
DNSAlg into an AAAA query response (in this case changing 172.25.79.220 to ::172.25.79.220
in the answer section).
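To make the address rewrite in the answer section concrete, here is a minimal sketch of turning the 4-byte A RDATA into the 16-byte AAAA RDATA for ::a.b.c.d. It is not the actual DNSAlg code: a_to_aaaa_rdata and format_compat are hypothetical names used only for illustration, and the sketch assumes the IPv4-compatible form (96 zero bits followed by the IPv4 address).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of Click or DNSAlg.
// Builds the 16-byte AAAA RDATA for ::a.b.c.d from the 4-byte A RDATA.
std::array<std::uint8_t, 16> a_to_aaaa_rdata(const std::array<std::uint8_t, 4>& v4) {
    std::array<std::uint8_t, 16> v6{};   // zero-initialised: the first 96 bits stay 0
    for (std::size_t i = 0; i < 4; ++i)
        v6[12 + i] = v4[i];              // the IPv4 address fills the low 32 bits
    return v6;
}

// Hypothetical helper: prints an IPv4-compatible address as "::a.b.c.d".
std::string format_compat(const std::array<std::uint8_t, 16>& v6) {
    for (std::size_t i = 0; i < 12; ++i)
        assert(v6[i] == 0 && "not an IPv4-compatible address");
    char buf[32];
    std::snprintf(buf, sizeof buf, "::%u.%u.%u.%u",
                  static_cast<unsigned>(v6[12]), static_cast<unsigned>(v6[13]),
                  static_cast<unsigned>(v6[14]), static_cast<unsigned>(v6[15]));
    return std::string(buf);
}
```

Note that the AAAA RDATA is 12 bytes longer than the A RDATA, so a real rewrite also has to adjust the RDLENGTH field and the packet length, and the reverse direction shrinks the packet by the same amount.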
The other direction works in almost the same way: instead of telnetting to the IPv4 address
that represents the IPv6 machine in the IPv4 network (telnet 1.0.0.1), you would
run telnet ipv6.test.com. The A query is translated into an AAAA query and the response
is translated too, in this case replacing 3ffe:1ce1:2:0:200::2 with 1.0.0.1 (or whatever the AT mappings
say). Inverse queries are handled similarly.
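For this direction the IPv4 address cannot be computed from the IPv6 one; it has to come from a mapping table. The sketch below shows the kind of lookup involved. MappingTable is a hypothetical stand-in, not the real AddressTranslator interface, and it assumes a simple static one-to-one mapping.

```cpp
#include <array>
#include <cstdint>
#include <map>

using V6 = std::array<std::uint8_t, 16>;
using V4 = std::array<std::uint8_t, 4>;

// Hypothetical stand-in for the AT mappings (IPv6 address -> IPv4 address).
class MappingTable {
public:
    void add(const V6& v6, const V4& v4) { table_[v6] = v4; }

    // Writes the IPv4 address for the A answer into `out` and returns true,
    // or returns false and leaves `out` untouched if no mapping exists.
    bool lookup(const V6& v6, V4& out) const {
        std::map<V6, V4>::const_iterator it = table_.find(v6);
        if (it == table_.end())
            return false;
        out = it->second;
        return true;
    }

private:
    std::map<V6, V4> table_;   // std::array compares lexicographically
};
```

With 3ffe:1ce1:2:0:200::2 registered against 1.0.0.1, a lookup on the AAAA answer would return 1.0.0.1 for the A answer. An unmapped address returns false, so the caller can decide whether to drop the response or pass it through.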
DNSAlg works fine for 6-to-4 query translations (AAAA->A), 4-to-6 query response translations
and 4-to-6 query translations (A->AAAA), but with the 6-to-4 query response translations I
have the following problem:
Click segfaults after an almost exact number of translated query responses! Generally at the second
or third response, no more, no less. When I run Click under gdb with a breakpoint in the
DNSAlg code where the AAAA query responses are translated, and tell gdb to continue each time it stops there,
at the second or third query response I get a segfault in
ProtocolTranslator46::handle_ip4(), in a call to Packet::length(). The strange thing is that
when translating responses from IPv6 to IPv4, ProtocolTranslator46 is NEVER called, as you can see
in the diagram above (ProtocolTranslator46 is used only when translating from IPv4 to IPv6, not from
IPv6 to IPv4).
The DNS queries were sent one at a time, so the segfault cannot be caused by
another query entering Click before the previous query response has left. Another strange
thing is that if I add or remove click_chatter() messages in the DNSAlg code, the number of queries needed
to segfault Click changes.
I have also reviewed the config file looking for loops, but haven't found any.
Strange, huh?
Does anyone have an idea of what could be happening here?
Thanks for any hints anyone can give me.
Cheers,
Juan Luis Baptiste