[Click] Error compiling my package in Click for kernel level

Cliff Frey cliff at meraki.com
Thu Feb 9 18:58:43 EST 2012


What is the crash?

On Thu, Feb 9, 2012 at 3:41 PM, Harkeerat Bedi <hsbedi at memphis.edu> wrote:

> Thank you Cliff for your suggestions. I am trying them out now. I used
> Valgrind and it was helpful. Now my package is running in kernel level. It
> does crash sometimes, and I am trying to figure out why.
>
> However, I observe a kernel crash every time I try to use
> "click-uninstall" to unload my config. I have not defined the cleanup()
> function for my element because the external C++ code/classes which I use
> with my element have their own destructors defined.
>
> I was not able to gather any debugging info for this crash yet. Can you
> think of any reason why this might be happening?
>
> Thank you.
>
> Regards,
> Harkeerat Bedi
> University of Memphis
>
> On Tue, Feb 7, 2012 at 11:04 AM, Cliff Frey <cliff at meraki.com> wrote:
>
>> I'm glad that the suggestions helped out.
>>
>> You could try various forms of kernel debugging, for instance using the
>> SysRq key (can be configured when compiling the Linux kernel yourself).
>> You might be able to turn on hung_task_panic/hung_task_warnings to
>> possibly get more output from the kernel.
>>
>> You could also run your userlevel element under valgrind, just to
>> double-check that it isn't doing anything unsafe.
>>
>> You could add click_chatter() calls _everywhere_ and see if any of them
>> correspond to the hang.
>>
>> You could simplify your element as much as possible to the smallest
>> possible failing case, and then post that to the mailing list.
>>
>> But those are just my random ideas...
>>
>> Cliff
>>
>>
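One way to apply the click_chatter() suggestion above is a small trace macro that records the file, line, and function each time execution passes a checkpoint; the last message that reaches the log narrows down where the hang begins. The sketch below is illustrative only: trace_log() is a hypothetical stand-in that prints to stderr so the example builds outside Click, and TRACE(), process_packet() and trace_count are made-up names. Inside an element the macro would presumably forward to click_chatter() instead.

```cpp
#include <cstdarg>
#include <cstdio>

// Number of checkpoints passed so far (hypothetical helper for this sketch).
int trace_count = 0;

// Hypothetical stand-in for click_chatter(): printf-style logging to stderr.
void trace_log(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    std::vfprintf(stderr, fmt, args);
    va_end(args);
    std::fputc('\n', stderr);
    ++trace_count;
}

// Drop TRACE() at function entry and around anything suspicious.
#define TRACE() trace_log("%s:%d: reached %s()", __FILE__, __LINE__, __func__)

// Made-up work function standing in for an element's packet handler.
int process_packet(int length)
{
    TRACE();                    // checkpoint at entry
    int result = length * 2;    // the code under suspicion would go here
    TRACE();                    // checkpoint after the suspicious part
    return result;
}
```

If the log shows the first checkpoint but never the second, the code between them is the place to look.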
>> On Mon, Feb 6, 2012 at 11:22 PM, Harkeerat Bedi <hsbedi at memphis.edu> wrote:
>>
>>> Thank you Cliff for your suggestions and explanations. They were very
>>> helpful.
>>>
>>> I ended up removing the static member variables which I was using from
>>> my code. I did this mainly because my package contains some external
>>> C++ files which do not contain an element. These files instead hold C++
>>> classes which had these static member variables. They were of non-primitive
>>> types. I was not sure if I could have used the static_initialize() function
>>> in this scenario. Would that have been possible?
>>>
>>> Now after removing them, my package compiles and runs in kernel level. It
>>> runs as expected for a few seconds, and then it hangs. Since I am running
>>> it in kernel level, the whole system crashes. I am currently running Click
>>> on a VM. I have Click version 2.0.1 from GitHub installed (both user
>>> level and patchless kernel level) on an Ubuntu 10.04 LTS system. My kernel
>>> is 2.6.32-38-generic (i686) and gcc version is 4.4.3 (Ubuntu
>>> 4.4.3-4ubuntu5).
>>>
>>> I would like to debug this issue. I do not get any helpful information using
>>> "dmesg".
>>>
>>> I tried to debug it with gdb using the following commands:
>>>      sudo gdb /usr/local/click/sbin/click-install
>>>      run myconfig.click
>>>
>>> And I get the following output:
>>>
>>>      Starting program: /usr/local/click/sbin/click-install myconfig.click
>>>      [Thread debugging using libthread_db enabled]
>>>
>>>      Program exited normally.
>>>      (gdb)
>>>
>>> The above terminates the gdb session, but not the Click configuration
>>> which still remains active and running. However, after a while the
>>> kernel/system crashes. This crash happens even if I don't use the gdb
>>> debugger.
>>>
>>> Can you suggest how I can try to debug this?
>>>
>>> I am able to run my package (and debug it using gdb) in user level and
>>> do not experience any issues.
>>>
>>> Thank you once again.
>>>
>>> Regards,
>>> Harkeerat Bedi
>>> University of Memphis
>>>
>>>
>>>
>>> On Mon, Feb 6, 2012 at 1:11 AM, Cliff Frey <cliff at meraki.com> wrote:
>>>
>>>> responses inline
>>>>
>>>> On Sun, Feb 5, 2012 at 10:41 PM, Harkeerat Bedi <hsbedi at memphis.edu> wrote:
>>>>
>>>>> After removing floating point arithmetic, now the following errors remain
>>>>> (dmesg output):
>>>>>
>>>>> [ 4184.551295] click: starting router thread pid 9889 (f3d7f6c0)
>>>>> [ 4184.569707] myelementpackage: Unknown symbol __dso_handle
>>>>> [ 4184.572406] myelementpackage: Unknown symbol __cxa_atexit
>>>>> [ 4197.961612] click: stopping router thread pid 9889
>>>>> [ 4197.961637] click module exiting
>>>>>
>>>>> Can you kindly suggest how I can fix these errors? If they are due to my
>>>>> use of static member variables, can we handle these errors without
>>>>> removing them?
>>>>>
>>>>
>>>> You may need to make your static/global variables be of simple type (or
>>>> pointers to classes), and then use a static_initialize() function to
>>>> allocate and assign values to those static/global variables.
>>>>
>>>>
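A rough sketch of the pointer-plus-static_initialize() pattern suggested above follows. A static object with a constructor or destructor is what makes the compiler register cleanup through __cxa_atexit and __dso_handle; a static pointer needs neither. Registry and Tracker are hypothetical classes invented for illustration, and how the two static functions get called from a real package (for instance, forwarded from an element's own static_initialize() and static_cleanup()) is an assumption rather than something this thread confirms.

```cpp
#include <cassert>

// Hypothetical non-primitive type that used to be a static member.
class Registry {
  public:
    Registry() : _count(0) {}
    void add() { ++_count; }
    int count() const { return _count; }
  private:
    int _count;
};

// Hypothetical helper class from the non-element part of a package.
class Tracker {
  public:
    static void static_initialize();   // call once when the package loads
    static void static_cleanup();      // call once when the package unloads
    static Registry &registry();
  private:
    // Before: static Registry _registry;  (needs a static constructor)
    // After: a plain pointer, which needs no startup or exit code.
    static Registry *_registry;
};

Registry *Tracker::_registry = 0;

void Tracker::static_initialize()
{
    if (!_registry)
        _registry = new Registry;      // explicit allocation at load time
}

void Tracker::static_cleanup()
{
    delete _registry;                  // explicit destruction at unload time
    _registry = 0;
}

Registry &Tracker::registry()
{
    assert(_registry);                 // catches use before static_initialize()
    return *_registry;
}
```

Because the object is now created and destroyed by hand, forgetting static_cleanup() leaks it, and touching it after cleanup dereferences a null pointer.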
>>>>> Also, in the #define written above, if I cast the values of A and B with
>>>>> int64_t, like:
>>>>> #define fixedpt_xdiv(A,B) (int32_t)(((int64_t)A << 8) / (int64_t)B)
>>>>>
>>>>> I get the following additional error:
>>>>> [ 4475.103019] myelementpackage: Unknown symbol __divdi3
>>>>>
>>>>> Can you suggest why casting with int64_t causes such an error?
>>>>>
>>>>
>>>> This is causing gcc to assume that libgcc is available, and it is not
>>>> in the kernel.  You can use the functions in click/include/click/bigint.hh
>>>> to perform 64/32 bit division if you need it.
>>>>
>>>>
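To illustrate why the 64-bit "/" is the problem, here is a userlevel sketch that performs the same fixed-point division without it, using bitwise long division. It is not the bigint.hh interface mentioned above, and udiv64_32() is a made-up helper; the 8 fractional bits are inferred from the shift in the quoted macro. Writing fixedpt_xdiv as a function rather than a macro also means each argument is cast as a whole, whereas "(int64_t)A" in the macro casts only the first token when A is an expression. Whether the 64-bit shifts used here need compiler helper routines on a given architecture is not something this sketch settles.

```cpp
#include <cassert>
#include <stdint.h>

// Bitwise long division: 64-bit dividend, 32-bit divisor, no '/' operator.
uint64_t udiv64_32(uint64_t n, uint32_t d)
{
    assert(d != 0);                       // division by zero is a caller bug
    uint64_t quotient = 0;
    uint64_t remainder = 0;
    for (int bit = 63; bit >= 0; --bit) {
        // Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((n >> bit) & 1);
        if (remainder >= d) {
            remainder -= d;
            quotient |= (uint64_t) 1 << bit;
        }
    }
    return quotient;
}

// Fixed-point divide with 8 fractional bits, truncating toward zero.
int32_t fixedpt_xdiv(int32_t a, int32_t b)
{
    bool negative = (a < 0) != (b < 0);
    // Work on magnitudes so the division itself is unsigned.
    uint32_t ua = a < 0 ? 0u - (uint32_t) a : (uint32_t) a;
    uint32_t ub = b < 0 ? 0u - (uint32_t) b : (uint32_t) b;
    uint64_t q = udiv64_32((uint64_t) ua << 8, ub);
    return negative ? -(int32_t) q : (int32_t) q;
}
```

For example, dividing 3.0 by 2.0 (768 and 512 with 8 fractional bits) gives 384, which is 1.5 in the same format.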
>>>>>
>>>>> On a side note, I was interested to see if I could get similar errors if I
>>>>> added floating point arithmetic in the sample package provided by Click in
>>>>> ...DIR/etc/samplepackage/ directory.
>>>>> I added the following code in the initialize() definition in sampleelt.cc:
>>>>>
>>>>> int
>>>>> SamplePackageElement::initialize(ErrorHandler *errh)
>>>>> {
>>>>>     errh->message("Successfully linked with package!");
>>>>>     float temp;
>>>>>     float temp1 = 121.1211;
>>>>>     float temp2 = 345234.32423;
>>>>>     temp = temp2 / temp1;
>>>>>     click_chatter("temp: %d", (int64_t) temp);
>>>>>     return 0;
>>>>> }
>>>>>
>>>>> However, the element ran successfully in kernel level, with the following
>>>>> output in dmesg:
>>>>>
>>>>> [14044.407445] click: starting router thread pid 19173 (f4026780)
>>>>> [14044.412844] test.click:3: While initializing 'test :: SamplePackageElement':
>>>>> [14044.412848]   Successfully linked with package!
>>>>> [14044.412853] chatter: temp: 2850
>>>>> [14046.287880] click: stopping router thread pid 19173
>>>>> [14046.287897] click module exiting
>>>>>
>>>>> Can you kindly suggest why we did not observe any errors for using floating
>>>>> point arithmetic in this case?
>>>>>
>>>>
>>>> That code is completely inlined/executed at compile time.  Try marking
>>>> one of the variables as "volatile" or make one of them depend on a runtime
>>>> quantity, and that code will fail to load as well.
>>>>
>>>> Cliff
>>>>
>>>>
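The difference described above can be sketched with two otherwise identical functions. In the first, both operands are known at compile time, so an optimizing compiler can typically fold the division into a constant and emit no floating-point instructions at all (this depends on the compiler and optimization level). In the second, "volatile" forces real loads and a real division at run time, which per the reply above is the variant that should fail to load in the kernel. The function names are made up for illustration, and the sketch is meant to be built at userlevel.

```cpp
// Operands are compile-time constants: the quotient can be precomputed
// by the compiler, leaving only an integer constant in the object code.
int folded_quotient()
{
    float temp1 = 121.1211f;
    float temp2 = 345234.32423f;
    return (int) (temp2 / temp1);
}

// 'volatile' forbids that shortcut: each variable must actually be read,
// so the division has to happen with floating-point instructions.
int runtime_quotient()
{
    volatile float temp1 = 121.1211f;
    volatile float temp2 = 345234.32423f;
    return (int) (temp2 / temp1);
}
```

Both functions return 2850 at userlevel, matching the "temp: 2850" line in the dmesg output; the difference only shows up in the generated instructions.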
>>>
>>
>