[chord] Problem with dhashcli
Katarzyna Stefanowicz
kate_stefik at tlen.pl
Sun Feb 10 01:38:43 EST 2008
Emil Sit wrote:
> Czesc Katarzyna,
Hello Emil,
Do you know Polish?
> You should be able to mirror the code in dhashclient::insert
> below in your publish code to deal with this.
>
> Something like (untested)...
>
> void
> incognito_impl::publish (const incognito_store_arg &req)
> {
> str data = str (req.data.base (), req.data.size ());
> if (req.key_type == DHASH_NOAUTH)
> data = dhblock_noauth::marshal_block (data);
> ref<dhash_block> block = New refcounted<dhash_block> (data, req.key_type);
>
> ...
>
> Does that help?
Thank you very much for the explanation and advice. It was very helpful.
I did as you suggested, but unfortunately something is still wrong.
My current publish function looks like this:
> void incognito_impl::publish(const incognito_store_arg& req)
> {
> str data = str (req.data.base (), req.data.size ());
> str marshalled_data;
> if (req.key_type == DHASH_NOAUTH)
> marshalled_data = dhblock_noauth::marshal_block (data);
> else if (req.key_type == DHASH_CONTENTHASH)
> marshalled_data = dhblock_chash::marshal_block (data);
> else {
> warnx << "incognito_impl::publish: unknown key_type: "
> << " FIXME: " << __FILE__ << ":" << __LINE__ << "\n";
> return;
> }
>
>
> ref<dhash_block> block = New refcounted<dhash_block> (marshalled_data, req.key_type);
> block->ID = req.key_value;
>
> _dhcli->insert (block,
> wrap (mkref (this), &incognito_impl::publish_insert_cb));
> }
I start my network. Initially, the noauth block db on node
e2bb6d6101eff02a4a1eca3e578800732999a561 (referred to again below) is
empty:
> chordtest@test2:/tmp/dhash-test2-c$ /tmp/dbdump -t db/e2bb6d6101eff02a4a1eca3e578800732999a561.n
> EOF.
> total keys: 0
> total bytes: 0
Then I insert a noauth block with a random ID (here:
b444ac06613fc8d63795be9ad0beaf5500000000). It works:
> 1202621799.034331 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 2/5) -> e2bb6d6101eff02a4a1eca3e578800732999a561 in 3ms: DHASH_OK
> 1202621799.035274 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 1/5) -> bc2cd702b77a6da8b3d7b254ae38450efe760bd2 in 4ms: DHASH_OK
> 1202621799.036149 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 4/5) -> ed2b9e87da6e557d4e4c048995d7ac61eb2481bb in 5ms: DHASH_OK
> 1202621799.038140 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 3/5) -> ea182e8573df8cf54320288a8f70e26bf4cb4464 in 7ms: DHASH_OK
> 1202621799.038236 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 5/5) -> fb5fa23d77f54f3679d948430bcda824abd48a60 in 7ms: DHASH_OK
I'm able to read the contents of this block.
Then I check the same db:
> chordtest@test2:/tmp/dhash-test2-c$ /tmp/dbdump -t db/e2bb6d6101eff02a4a1eca3e578800732999a561.n
> key[1] b444ac06613fc8d63795be9ad0beaf5500000000 16 0
> EOF.
> total keys: 1
> total bytes: 16
But when I try to insert the same block again I get:
> 1202621838.928104 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 2/5) -> e2bb6d6101eff02a4a1eca3e578800732999a561 in 1ms: DHASH_STALE
> 1202621838.928321 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 1/5) -> bc2cd702b77a6da8b3d7b254ae38450efe760bd2 in 1ms: DHASH_STALE
> 1202621838.929519 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 4/5) -> ed2b9e87da6e557d4e4c048995d7ac61eb2481bb in 3ms: DHASH_STALE
> 1202621838.929631 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 3/5) -> ea182e8573df8cf54320288a8f70e26bf4cb4464 in 3ms: DHASH_STALE
> 1202621838.929683 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 5/5) -> fb5fa23d77f54f3679d948430bcda824abd48a60 in 3ms: DHASH_STALE
> 1202621838.929695 dhashcli: ad91f3d2b200dba01e82a43a4dad65a76a98570f: store (b444ac06613fc8d63795be9ad0beaf5500000000): only stored 0 of 5 encoded.
> 1202621838.929706 dhashcli: ad91f3d2b200dba01e82a43a4dad65a76a98570f: store (b444ac06613fc8d63795be9ad0beaf5500000000): failed; insufficient frags/blocks stored.
Worse, when I try to insert different content under the same key, all
target nodes crash:
> 1202621856:787429 RPC failure: RPC: Timed out destined for bc2cd702b77a6da8b3d7b254ae38450efe760bd2 at 10.14.5.6 seqno 255 out 138333412
> 1202621856.787515 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 1/5) -> bc2cd702b77a6da8b3d7b254ae38450efe760bd2 in 2903ms: DHASH_RPCERR
> 1202621856:791351 RPC failure: RPC: Timed out destined for ed2b9e87da6e557d4e4c048995d7ac61eb2481bb at 10.14.5.4 seqno 258 out 138321228
> 1202621856.791382 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 4/5) -> ed2b9e87da6e557d4e4c048995d7ac61eb2481bb in 2907ms: DHASH_RPCERR
> 1202621856:799349 RPC failure: RPC: Timed out destined for ea182e8573df8cf54320288a8f70e26bf4cb4464 at 10.14.5.2 seqno 257 out 138307420
> 1202621856.799379 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 3/5) -> ea182e8573df8cf54320288a8f70e26bf4cb4464 in 2915ms: DHASH_RPCERR
> 1202621856:799404 RPC failure: RPC: Timed out destined for e2bb6d6101eff02a4a1eca3e578800732999a561 at 10.14.5.2 seqno 256 out 138327772
> 1202621856.799433 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 2/5) -> e2bb6d6101eff02a4a1eca3e578800732999a561 in 2915ms: DHASH_RPCERR
> 1202621856:803358 RPC failure: RPC: Timed out destined for fb5fa23d77f54f3679d948430bcda824abd48a60 at 10.14.5.2 seqno 259 out 138338956
> 1202621856.803393 dhashcli: store b444ac06613fc8d63795be9ad0beaf5500000000 (frag 5/5) -> fb5fa23d77f54f3679d948430bcda824abd48a60 in 2919ms: DHASH_RPCERR
> 1202621856.803407 dhashcli: ad91f3d2b200dba01e82a43a4dad65a76a98570f: store (b444ac06613fc8d63795be9ad0beaf5500000000): only stored 0 of 5 encoded.
> 1202621856.803418 dhashcli: ad91f3d2b200dba01e82a43a4dad65a76a98570f: store (b444ac06613fc8d63795be9ad0beaf5500000000): failed; insufficient frags/blocks stored.
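To compare the three runs above, it can help to tally the final status
of each per-fragment store. The following is only a minimal sketch,
assuming log lines shaped like the dhashcli output quoted above;
tally_store_status is a hypothetical helper, not part of Chord.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Chord): count how many per-fragment
// "store" lines in dhashcli output ended with each DHASH_* status.
// Assumes lines shaped like:
//   <ts> dhashcli: store <key> (frag i/n) -> <node> in <t>ms: DHASH_OK
// Summary lines ("only stored 0 of 5 encoded.") do not match and are skipped.
std::map<std::string, int> tally_store_status(const std::string &log)
{
  static const std::regex line_re(
      R"(dhashcli: store [0-9a-f]+ \(frag \d+/\d+\) -> [0-9a-f]+ in \d+ms: (DHASH_\w+))");
  std::map<std::string, int> counts;
  std::istringstream in(log);
  std::string line;
  while (std::getline(in, line)) {
    std::smatch m;
    if (std::regex_search(line, m, line_re)) {
      assert(m.size() == 2);  // whole match plus the status capture group
      ++counts[m[1].str()];
    }
  }
  return counts;
}
```

Fed the three runs above, this should report five DHASH_OK, five
DHASH_STALE and five DHASH_RPCERR results, one per fragment per run.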
On a dead node I can see the following (the first line appears after
the first insert):
> lsd: e2bb6d6101eff02a4a1eca3e578800732999a561 db write: U b444ac06613fc8d63795be9ad0beaf5500000000 16
> lsd: ../../../chord-0.1/dhash/dhblock_noauth_srv.C:101: void dhblock_noauth_srv::after_delete(chordID, str, u_int32_t, cb_dhstat, adb_status): Assertion `err == ADB_OK' failed.
I've run lsd under gdb, and it looks like err=ADB_NOTFOUND:
> (gdb) bt
> #0 0xb7f04402 in __kernel_vsyscall ()
> #1 0xb7b759d1 in raise () from /lib/tls/i686/cmov/libc.so.6
> #2 0xb7b77219 in abort () from /lib/tls/i686/cmov/libc.so.6
> #3 0xb7b6f0df in __assert_fail () from /lib/tls/i686/cmov/libc.so.6
> #4 0x080bb53a in dhblock_noauth_srv::after_delete (this=0x83de598, key=@0xbfeae374, data=@0xbfeae388, exp=0, cb=@0xbfeae380, err=ADB_NOTFOUND)
> at ../../../chord-0.1/dhash/dhblock_noauth_srv.C:101
> #5 0x080bdc88 in callback_c_1_4<dhblock_noauth_srv*, dhblock_noauth_srv, void, adb_status, bigint, str, unsigned int, ptr<callback<void, dhash_stat, void, void> > >::operator() (this=0x83e0d88, b1=ADB_NOTFOUND) at /home/maya/incognito/src/build/chord/../sfslite/../../sfslite-0.8.16/async/callback1.h:2183
> #6 0x08148629 in adb::generic_cb (this=0x83de64c, res=0x83e04a8, cb=@0xbfeae3dc, err=RPC_SUCCESS)
> at /home/maya/incognito/src/build/chord/../sfslite/../../sfslite-0.8.16/async/callback1.h:4198
> #7 0x0814c434 in callback_c_1_2<adb*, adb, void, clnt_stat, adb_status*, ptr<callback<void, adb_status, void, void> > >::operator() (this=0x1340, b1=RPC_SUCCESS)
> at /home/maya/incognito/src/build/chord/../sfslite/../../sfslite-0.8.16/async/callback1.h:1890
> #8 0x0828bc05 in rpccb::finish (this=0x83e38f8, stat=RPC_SUCCESS) at ../../../sfslite-0.8.16/arpc/aclnt.C:139
> #9 0x0828db86 in aclnt::dispatch (xi=@0xbfeae534, msg=0xb798e00c "?\031mP", len=28, src=0x0) at ../../../sfslite-0.8.16/arpc/aclnt.C:610
> #10 0x0829fbd5 in xhinfo::dispatch (this=0x83de8a0, msg=0xb798e00c "?\031mP", len=<value optimized out>, src=0x0) at ../../../sfslite-0.8.16/arpc/xhinfo.C:88
> #11 0x082975ce in axprt_pipe::getpkt (this=0x83de6f8, cpp=0xbfeae5c8, eom=0xb798e028 "") at ../../../sfslite-0.8.16/arpc/axprt_pipe.C:302
> #12 0x08296abe in axprt_pipe::callgetpkt (this=0x83de6f8) at ../../../sfslite-0.8.16/arpc/axprt_pipe.C:361
> #13 0x08297431 in axprt_pipe::input (this=0x83de6f8) at ../../../sfslite-0.8.16/arpc/axprt_pipe.C:332
> #14 0x082a4a8a in fdcb_check () at ../../../sfslite-0.8.16/async/core.C:275
> #15 0x082a507d in amain () at ../../../sfslite-0.8.16/async/core.C:427
> #16 0x0804ff41 in main (argc=14, argv=0xbfeae8d4) at ../../../chord-0.1/lsd/lsd.C:786
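The backtrace shows the abort coming from an assertion inside a
completion callback that was reached with err=ADB_NOTFOUND. As a
minimal, self-contained sketch of that failure pattern (this is not the
Chord source; Status, Result, after_delete_strict and
after_delete_tolerant are hypothetical names), compare a callback that
asserts on the status with one that hands the error back to its caller:

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the adb_status / dhash_stat values seen above.
enum class Status { Ok, NotFound };
enum class Result { Ok, Error };

// Strict variant: follows the "assert (err == OK)" pattern in a callback.
// Any other status aborts the whole process (unless NDEBUG is defined).
void after_delete_strict(Status err, const std::function<void(Result)> &cb)
{
  assert(err == Status::Ok);
  cb(Result::Ok);
}

// Tolerant variant: an unexpected status is reported to the caller,
// so a single failed delete does not take the node down.
void after_delete_tolerant(Status err, const std::function<void(Result)> &cb)
{
  cb(err == Status::Ok ? Result::Ok : Result::Error);
}
```

This only illustrates why the process dies. It does not explain why the
delete reports ADB_NOTFOUND while dbdump still lists the key.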
The result of running dbdump is unchanged (still 1 entry). Do you have
any idea what may be wrong?
Best regards,
Katsiaryna Stsefanovich