[Click] 2nd Try:: chatter: in arp querier: cannot make packet! -- resulting in kernel oops

Cliff Frey cliff at meraki.com
Sat Feb 4 15:40:31 EST 2012


I don't think that anyone on the mailing list can give much support to such
an old version of click.

I don't think that ARPQuerier is the problem; the problem is that you are
running out of memory.  You need to find your memory leak and fix it.
Newer versions of Click can be configured/compiled with memory allocation
debugging, and/or you can watch how /proc/slabinfo changes over time for
hints as to where you might be leaking memory.
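
The quoted log reports 183727 slab pages (roughly 718 MB, assuming 4 kB
pages) while the Normal zone has only 2992kB free, so slab growth is a
plausible place to look. As a rough sketch of comparing two
/proc/slabinfo snapshots, something like the following could rank caches
by growth. It assumes the 2.x column layout (name, active_objs, num_objs,
objsize, ...); SlabUsage, parse_slabinfo, and slab_growth are names made
up for this example, not part of Click or the kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Per-cache figures taken from one line of /proc/slabinfo.
struct SlabUsage {
    long active_objs;  // second column
    long obj_size;     // fourth column, in bytes
};

using SlabSnapshot = std::map<std::string, SlabUsage>;

// Parse the text of /proc/slabinfo. Assumes the 2.x layout:
// name, active_objs, num_objs, objsize, ... Header lines are skipped.
SlabSnapshot parse_slabinfo(std::istream& in) {
    SlabSnapshot caches;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#' ||
            line.compare(0, 8, "slabinfo") == 0)
            continue;
        std::istringstream fields(line);
        std::string name;
        long active = 0, total = 0, size = 0;
        if (fields >> name >> active >> total >> size)
            caches[name] = SlabUsage{active, size};
    }
    return caches;
}

struct Growth {
    std::string name;
    long bytes;  // increase in active_objs * objsize between snapshots
};

// Return the caches whose active footprint grew, largest growth first.
std::vector<Growth> slab_growth(const SlabSnapshot& before,
                                const SlabSnapshot& after) {
    std::vector<Growth> grown;
    for (const auto& entry : after) {
        long now = entry.second.active_objs * entry.second.obj_size;
        auto old = before.find(entry.first);
        long then = (old == before.end())
                        ? 0
                        : old->second.active_objs * old->second.obj_size;
        if (now > then)
            grown.push_back(Growth{entry.first, now - then});
    }
    std::sort(grown.begin(), grown.end(),
              [](const Growth& a, const Growth& b) {
                  return a.bytes > b.bytes;
              });
    return grown;
}
```

Feeding it two snapshots saved a few minutes apart while the router is
under load, and looking at the first few entries of the result, should
show which cache keeps growing.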

Cliff

On Fri, Feb 3, 2012 at 3:17 AM, sri <bskmohan at gmail.com> wrote:

> Hi,
>
> Did anybody face the below issue earlier?
> Appreciate your response.
>
> Thanks,
> K M
>
>
> ---------- Forwarded message ----------
> From: sri <bskmohan at gmail.com>
> Date: Tue, Jan 31, 2012 at 7:03 PM
> Subject: chatter: in arp querier: cannot make packet! -- resulting in
> kernel oops
> To: click at amsterdam.lcs.mit.edu
>
>
> Hi Experts,
>
> I am hitting an "oops, kernel could not allocate memory for skbuff" error
> while the Click router module generates an ARP query.
> As a result, the machine becomes unresponsive for some time and
> needs a reboot to bring it back up.
>
> It was working fine on the CentOS 5.3 kernel (2.6.18-128.el5); we
> recently upgraded the OS to CentOS 5.5.
> The machine has 4 GB of RAM, and I am surprised that all of the memory
> is consumed.
>
> Observed two things from the coredumps created at the time of crash:
>
> 1) failsafe_re_fo_ invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>
>  < THIS IS REPEATED FROM SOME OTHER PROCESSES>
> [<c044abd8>] out_of_memory+0x72/0x17a
>  [<c044beab>] __alloc_pages+0x237/0x2b8
>  [<c045f145>] cache_alloc_refill+0x217/0x3e4
>  [<c045ef26>] kmem_cache_alloc+0x22/0x2a
>  [<c046e9f2>] getname+0x1a/0xb0
>  [<c0460d62>] do_sys_open+0x12/0xae
>  [<c0460e2b>] sys_open+0x16/0x18
>  [<c0403c9b>] syscall_call+0x7/0xb
>
>
> 2) oops, kernel could not allocate memory for skbuff
> chatter: in arp querier: cannot make packet!
> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 00000060
>
> Tracing the message showed that control was flowing through the
> following code:
> ------------------------------------ code snippet start ------------------------------------
> src/click/click-1.6.0/lib/packet.cc
>        WritablePacket *
>        Packet::make(uint32_t headroom, const unsigned char *data,
>                     uint32_t len, uint32_t tailroom) {
>                int want = 1;
>                if (struct sk_buff *skb = skbmgr_allocate_skbs(headroom, len + tailroom, &want)) {
> ----
> src/click/click-1.6.0/linuxmodule/skbmgr.cc
>        struct sk_buff *skbmgr_allocate_skbs(unsigned headroom, unsigned size, int *want)
>        RecycledSkbPool::allocate(unsigned headroom, unsigned size, int want, int *store_got)
>                while (got < want) {
>                  struct sk_buff *skb = alloc_skb(size, GFP_ATOMIC);
>        #if DEBUG_SKBMGR
>                  _allocated++;
>        #endif
>                  if (!skb) {
>                    printk("<1>oops, kernel could not allocate memory for skbuff\n");
>                    break;
>                  }
> ------------------------------------ code snippet end ------------------------------------
>
> At the time of crash, the following logs were thrown:
> -----------------------------------LOG start---------------------
> Mem-info:
> DMA per-cpu:
> cpu 0 hot: high 0, batch 1 used:0
> cpu 0 cold: high 0, batch 1 used:0
> DMA32 per-cpu: empty
> Normal per-cpu:
> cpu 0 hot: high 186, batch 31 used:77
> cpu 0 cold: high 62, batch 15 used:51
> HighMem per-cpu:
> cpu 0 hot: high 186, batch 31 used:6
> cpu 0 cold: high 62, batch 15 used:7
> Free pages:      902160kB (895592kB HighMem)
> Active:61161 inactive:3887 dirty:64 writeback:0 unstable:0 free:225540
> slab:183727 mapped-file:4522 mapped-anon:44493 pagetables:605
> DMA free:3576kB min:144kB low:180kB high:216kB active:24kB
> inactive:0kB present:16384kB pages_scanned:312286092
> all_unreclaimable? yes
> lowmem_reserve[]: 0 0 880 4080
> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 880 4080
> Normal free:2992kB min:8044kB low:10052kB high:12064kB active:0kB
> inactive:116kB present:901120kB pages_scanned:1056265289
> all_unreclaimable? yes
> lowmem_reserve[]: 0 0 0 25600
> HighMem free:895592kB min:512kB low:7824kB high:15140kB
> active:244620kB inactive:15432kB present:3276800kB pages_scanned:0
> all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB
> 1*2048kB 0*4096kB = 3576kB
> DMA32: empty
> Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB
> 0*1024kB 1*2048kB 0*4096kB = 2992kB
> HighMem: 7126*4kB 8504*8kB 7649*16kB 6080*32kB 3817*64kB 754*128kB
> 136*256kB 90*512kB 33*1024kB 11*2048kB 1*4096kB = 895592kB
> 20555 pagecache pages
> Swap cache: add 27, delete 25, find 0/0, race 0+0
> Free swap  = 4192848kB
> Total swap = 4192956kB
> Free swap:       4192848kB
> 1048576 pages of RAM
> 819200 pages of HIGHMEM
> 569028 reserved pages
> 32658 pages shared
> 2 pages swap cached
> 64 pages dirty
> 0 pages writeback
> 4522 pages mapped
> 183727 pages slab
> 605 pages pagetables
> oops, kernel could not allocate memory for skbuff
> chatter: in arp querier: cannot make packet!
> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 00000060
>  printing eip:
> fb6e3df9
> *pde = 71b6d067
> Oops: 0000 [#1]
> last sysfs file: /devices/pci0000:00/0000:00:00.0/class
> Modules linked in: xfrm4_mode_transport(U) krng(U) ansi_cprng(U)
> chainiv(U) rng(U) authenc(U) aes_generic(U) testmgr_cipher(U)
> aes_i586(U) cbc(U) hmac(U) crypto_hash(U) testmgr(U)
> crypto_blkcipher(U) cryptomgr(U) esp4(U) xfrm4_esp(U) aead(U)
> crypto_algapi(U) xfrm_nalgo(U) crypto_api(U) softdog(U) click(U)
> proclikefs(U) deflate(U) zlib_deflate(U) af_key(U) pkp_drv(PU)
> autofs4(U) kick(U) dm_mirror(U) dm_log(U) dm_multipath(U) scsi_dh(U)
> dm_mod(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U)
> i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) ac(U)
> parport_pc(U) lp(U) parport(U) joydev(U) sg(U) i2c_i801(U) i2c_core(U)
> ide_cd(U) cdrom(U) cdc_ether(U) usbnet(U) bnx2(U) pcspkr(U) mptctl(U)
> mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) megaraid_sas(U)
> ata_piix(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U)
> ohci_hcd(U) ehci_hcd(U)
> CPU:    0
> EIP:    0060:[<fb6e3df9>]    Tainted: P      VLI
> EFLAGS: 00010292   (2.6.18-cisco.nac.3 #1)
> EIP is at _ZN11ARPQuerier14send_query_forERK9IPAddressitit+0x299/0x4c0
> [click]
> eax: cb0fedc0   ebx: 00000001   ecx: f7fff0c0   edx: c957f300
> esi: 00000000   edi: 00000060   ebp: 00000001   esp: f6bb0f14
> ds: 007b   es: 007b   ss: 0068
> Process kclick (pid: 2227, ti=f6bb0000 task=f745b000 task.ti=f6bb0000)
> Stack: fb76e178 1e31a61d ed3e10a8 0000e602 00000000 f40b5600 ed3e1000
> ed3e14c4
>       f4aa2300 f4aa2300 ed3e1000 f40b5600 f38bf440 ed3e1000 000000f5
> fb6e569c
>       00000000 00000001 0000e602 2e02c13c 00ed4f4c f746e470 f746e470
> ed3e14d8
> Call Trace:
>  [<fb6e569c>] _ZN11ARPQuerier11expire_hookEP5TimerPv+0x14c/0x230 [click]
>  [<fb6a540a>] _ZN6Master10run_timersEv+0xca/0x100 [click]
>  [<fb69c150>] _ZN12RouterThread6driverEv+0x190/0x290 [click]
>  [<fb7452d2>] _Z11click_schedPv+0x82/0x130 [click]
>  [<fb745250>] _Z11click_schedPv+0x0/0x130 [click]
>  [<c040496b>] kernel_thread_helper+0x7/0x10
>  =======================
> Code: 44 85 ed 0f 8e fa 00 00 00 31 d2 b9 2e 00 00 00 b8 1c 00 00 00
> c7 04 24 00 00 00 00 e8 31 dd f9 ff 85 c0 89 c6 0f 84 34 01 00 00 <8b>
> 56 60 31 c0 8b be a0 00 00 00 89 d1 c1 e9 02 f3 ab f6 c2 02
> EIP: [<fb6e3df9>]
> _ZN11ARPQuerier14send_query_forERK9IPAddressitit+0x299/0x4c0 [click]
> SS:ESP 0068:f6bb0f14
> ----------------------------------LOG end ----------------------
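
In the register dump above, esi is 00000000 and the faulting address is
00000060, which is consistent with a field being read through a null
pointer right after the "cannot make packet!" message. Independent of
where the memory went, a caller can avoid this kind of oops by returning
early when packet allocation fails. A minimal user-space sketch of that
pattern follows; Buffer, make_buffer, and build_query are hypothetical
names for illustration, not Click's Packet API or the ARPQuerier source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical stand-in for a packet buffer; not Click's Packet class.
struct Buffer {
    unsigned char* data;
    std::size_t length;
};

// Hypothetical allocator that may fail. The simulate_failure flag models
// an allocation returning null under memory pressure.
Buffer* make_buffer(std::size_t length, bool simulate_failure) {
    if (simulate_failure)
        return nullptr;
    Buffer* b = new (std::nothrow) Buffer;
    if (!b)
        return nullptr;
    b->data = new (std::nothrow) unsigned char[length];
    if (!b->data) {
        delete b;
        return nullptr;
    }
    b->length = length;
    return b;
}

void free_buffer(Buffer* b) {
    if (b) {
        delete[] b->data;
        delete b;
    }
}

// Build a zero-filled query buffer. On allocation failure, count the drop
// and return false instead of touching the null pointer.
bool build_query(std::size_t length, bool simulate_failure,
                 unsigned& dropped) {
    Buffer* q = make_buffer(length, simulate_failure);
    if (!q) {
        ++dropped;
        return false;
    }
    std::memset(q->data, 0, q->length);
    free_buffer(q);
    return true;
}
```

Dropping the query this way only hides the symptom; the allocation
failures themselves still point at memory exhaustion.
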
>
> My machine environment includes Centos-5.5 kernel (2.6.18-194.el5) and
> click-1.6 loaded as a kernel module.
>
> As we customized the click module, upgrading would be difficult and it
> was working very well with centos5.3 kernel.
>
> Any suggestions/pointers to resolve this or to find root cause are
> appreciated.
>
> Thanks in advance.
>
> --
> --
>  Krishna Mohan B
>

