[chord] infinite retries on KEYHASH blocks
Michael Walfish
mwalfish at lcs.mit.edu
Mon Jul 21 22:27:35 EDT 2003
Hello,
1) It appears, from the code in dhash/client.C, that the intended behavior
on a DHASH_KEYHASH retrieve() request, when the block is not found on
the correct node, is:
--look for the block on each successor
--if that fails, repeat the walk over the successors, up to 'retries'
times ('retries' is hardcoded to 5 in the code below, marked with **).
2) However, the code never checks the number of retries. I marked -------
below where it looks like this check should happen.
3) The current behavior is that 'retries' is decremented ad infinitum: the
parameter passes 0 and goes on to -2, -3, etc. (confirmed in gdb). As a
result, calling
dhashclient->retrieve(non_existent_chordid, DHASH_KEYHASH, wrap(&cb))
inside 'typetest' never returns (you can see the
retries happening every 5 seconds forever in the output from lsd). This is
all with the stock code.
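To make the countdown concrete, here is a minimal standalone model of it. It is only a sketch, not the DHash code: simulate_unchecked_retries and max_rounds are hypothetical names, and max_rounds exists only so that the model terminates (the real retry path has no such cap).

```cpp
#include <vector>

// Hypothetical stand-in for the retry path in retrieve_dl_or_walk_cb:
// each time the successor list is exhausted, the lookup is rescheduled
// with retries - 1, and nothing compares retries against zero.
// Returns the value of 'retries' seen in each round.
std::vector<int> simulate_unchecked_retries(int initial_retries, int max_rounds)
{
  std::vector<int> seen;
  int retries = initial_retries;
  for (int round = 0; round < max_rounds; ++round) {
    seen.push_back(retries);  // value the callback would receive this round
    retries = retries - 1;    // mirrors "retries - 1" in the delaycb call
  }
  return seen;
}
```

Starting from 5 and running 8 rounds, the recorded values run from 5 down past 0 into negative numbers, which matches what gdb shows.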
I was wondering:
a) whether infinite retrying is the intended behavior
b) if not, whether this will be fixed soon. I'd be happy to mail in a
patch. It's just a tiny code modification, of course:
if (options & DHASHCLIENT_NO_RETRY_ON_LOOKUP) {
-->
if (retries == 0 || options & DHASHCLIENT_NO_RETRY_ON_LOOKUP) {
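As a rough standalone illustration of the proposed check (a sketch only, not the actual patch): NO_RETRY_ON_LOOKUP, should_give_up and rounds_until_give_up are hypothetical names, and the flag value is made up. The sketch uses retries <= 0 rather than retries == 0, on the assumption that a caller could someday pass in a value that is already negative, and it parenthesizes the bit test to make the precedence explicit.

```cpp
// Hypothetical flag value; the real DHASHCLIENT_NO_RETRY_ON_LOOKUP is
// defined in the DHash sources.
const int NO_RETRY_ON_LOOKUP = 0x1;

// True when a failed lookup should stop and report "not found".
bool should_give_up(int retries, int options)
{
  return retries <= 0 || (options & NO_RETRY_ON_LOOKUP) != 0;
}

// Simplified model: counts the lookup rounds spent on a missing block
// before giving up, decrementing retries after each failed round.
int rounds_until_give_up(int initial_retries, int options)
{
  int rounds = 1;
  int retries = initial_retries;
  while (!should_give_up(retries, options)) {
    retries = retries - 1;
    ++rounds;
  }
  return rounds;
}
```

With 5 retries and no flag set, this model performs the initial round plus 5 retried rounds and then stops; with the flag set it stops after the first round.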
Thanks,
Mike
--------------------------------------------------------------------
from dhash/client.C
void
dhashcli::retrieve (blockID blockID, cb_ret cb, int options,
                    ptr<chordID> guess)
{
  . . . .
  if (blockID.ctype == DHASH_KEYHASH) {
    ci->first_hop (wrap (this, &dhashcli::retrieve_block_hop_cb, rs, ci,
                         options, **5**, guess),
                   guess);
  }
  . . . .
}
void
dhashcli::retrieve_block_hop_cb (ptr<rcv_state> rs, route_iterator *ci,
                                 int options, int retries, ptr<chordID> guess,
                                 bool done)
{
  . . . .
  chord_node s = rs->succs.pop_front ();
  dhash_download::execute (clntnode, s, rs->key, NULL, 0, 0, 0,
                           wrap (this, &dhashcli::retrieve_dl_or_walk_cb,
                                 rs, status, options, retries, guess));
}
void
dhashcli::retrieve_dl_or_walk_cb (ptr<rcv_state> rs, dhash_stat status,
                                  int options, int retries, ptr<chordID> guess,
                                  ptr<dhash_block> blk)
{
  chordID myID = clntnode->my_ID ();
  if (!blk) {
    if (options & DHASHCLIENT_NO_RETRY_ON_LOOKUP -------- ) {
      rs->complete (DHASH_NOENT, NULL);
      rs = NULL;
    } else if (rs->succs.size() == 0) {
      trace << myID << ": walk (" << rs->key << "): No luck walking successors, retrying..\n";
      route_iterator *ci = r_factory->produce_iterator_ptr (rs->key.ID);
      delaycb (5, wrap (ci, &route_iterator::first_hop,
                        wrap (this, &dhashcli::retrieve_block_hop_cb,
                              rs, ci, options, retries - 1, guess),
                        guess));
    } else {
      chord_node s = rs->succs.pop_front ();
      dhash_download::execute (clntnode, s, rs->key, NULL, 0, 0, 0,
                               wrap (this, &dhashcli::retrieve_dl_or_walk_cb,
                                     rs, status, options, retries,
                                     guess));
    }
  } else {
    rs->timemark ();
    blk->ID = rs->key.ID;
    blk->hops = rs->r.size ();
    blk->errors = rs->nextsucc - dhash::num_dfrags ();
    blk->retries = blk->errors;
    rs->complete (DHASH_OK, blk);
    rs = NULL;
  }
}