[chord] question about chord

Frank Dabek fdabek at MIT.EDU
Tue Jan 6 17:42:13 EST 2004


On Tue, 2004-01-06 at 17:34, David Nguyen wrote:
> Hi,
> 
> my name is david nguyen, i've emailed a couple times before since we are
> just trying
> to see how chord works as part of a class study.  I'm a second year graduate
> student
> in computer engineering and this work is solely for the class, not personal
> research.
> 
> i had two nodes running chord but was wondering what some of the status
> messages
> mean.  I understand the ID's and finger table and successor ideas from the
> Chord paper
> but was wondering what the end of this logfile (that i created) means:
> starting at line  ***********CARDINAL JOINS TRUNKS*************
> 

See below.

> also, for dhash, is there a paper you could point us to such that we get a
> better
> idea of what is going on?
> 

The CFS SOSP paper is probably your best bet, but it doesn't include
information about the use of fragmentation in DHash. That will be
covered in an upcoming NSDI paper.


> Thanks so much,
> -dave
> 
> *******CARDINAL JOINS TRUNKS*******
> 25002844dd6f071a90167950bce5e2e933870c86: estimating total number of nodes
> as 6
> 25002844dd6f071a90167950bce5e2e933870c86: stabilize_finger: findsucc of
> finger 1
> 25002844dd6f071a90167950bce5e2e933870c86: stabilize_finger: findsucc of
> finger 158

Because the new node joined, a couple of finger entries (1 and 158) had
to be updated. The node called 'findsucc' to determine the new entries.
To avoid chatter in the log files, nodes only print messages about
entries that have changed since the last time they were inspected.

> 1073331920:929677 25002844dd6f071a90167950bce5e2e933870c86: stabilize:
> stable! stabilize timer 1600

The system has become stable. "Stable" means that the stabilize routine
looked at every finger and found none that needed to be changed.
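
As a rough illustration of the two points above (log only the fingers
that changed, and call the node "stable" when a pass changes nothing),
here is a minimal Python sketch. It is not the actual Chord code: the
8-bit ID space, the names find_successor and stabilize_pass, and the use
of a global set of node IDs in place of the real findsucc RPC are all
assumptions made for the example.

```python
M = 8  # identifier bits (assumption for the sketch; Chord IDs are 160-bit)

def find_successor(ring, key):
    """First node ID at or after key on the ring, wrapping around.

    Stand-in for the findsucc RPC: here we simply look at a global set.
    """
    nodes = sorted(ring)
    for n in nodes:
        if n >= key:
            return n
    return nodes[0]

def stabilize_pass(me, ring, fingers, log):
    """Refresh every finger of node `me`.

    Logs only entries whose value changed since the last pass and
    returns True when no finger needed to be changed ("stable").
    """
    changed = 0
    for i in range(M):
        start = (me + 2 ** i) % 2 ** M      # start of the i-th finger interval
        succ = find_successor(ring, start)
        if fingers.get(i) != succ:
            fingers[i] = succ
            log.append(f"stabilize_finger: findsucc of finger {i}")
            changed += 1
    return changed == 0

ring, fingers, log = {10, 200}, {}, []
stabilize_pass(10, ring, fingers, log)           # first pass fills the table
stable_before = stabilize_pass(10, ring, fingers, log)

ring.add(50)                                     # a new node joins
log.clear()
stable_after_join = stabilize_pass(10, ring, fingers, log)
stable_again = stabilize_pass(10, ring, fingers, log)
```

In this toy ring, the join of node 50 changes only the fingers whose
interval start falls at or before 50; the others stay silent in the log,
and the next pass reports stable again.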

> 1073335064:979685 REXMIT 794405545 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073335064:979815 REXMIT 2371499125 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073335340:549677 REXMIT 708308122 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073335340:549805 REXMIT 889207735 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073349210:389773 REXMIT 698669136 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073355014:139681 REXMIT 3808437255 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073355014:139790 REXMIT 1974313820 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073360456:749827 REXMIT 3978217233 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073360456:749941 REXMIT 1786272656 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073363545:019683 REXMIT 313621822 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073418747:989684 REXMIT 2264650877 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 1073418747:989816 REXMIT 2023465229 rexmits 0, timeout 5 ms, destined for
> 192.168.1.3
> 

These messages mean that several RPCs had to be retransmitted because
their replies were delayed for more than 5 ms. On a local network the
retransmit timers are set very low (since the vast majority of packets
have very low RTTs). If an RPC is delayed for some reason (disk read,
page fault, heavy load) it may be retransmitted. This isn't unusual (but
probably not optimal either).
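
To make the timer behavior concrete, here is a small Python simulation
of one RPC with a low retransmit timeout. It is only a sketch under
stated assumptions: the function name simulate_rpc, the fixed (non
backed-off) timeout, and the cap on retransmissions are invented for the
example and do not describe the real RPC layer.

```python
def simulate_rpc(reply_delay_ms, timeout_ms=5.0, max_rexmits=5):
    """Simulate one RPC whose reply arrives after reply_delay_ms.

    Hypothetical model: the timer is re-armed with the same timeout
    after each retransmission, and the reply to the original request
    eventually arrives. Returns (log_lines, number_of_retransmissions).
    """
    log = []
    rexmits = 0
    elapsed = 0.0
    # Retransmit each time the timer expires before the reply shows up.
    while elapsed + timeout_ms < reply_delay_ms and rexmits < max_rexmits:
        elapsed += timeout_ms
        log.append(f"REXMIT rexmits {rexmits}, timeout {timeout_ms:g} ms")
        rexmits += 1
    return log, rexmits

fast = simulate_rpc(2.0)    # typical LAN round trip: no retransmission
slow = simulate_rpc(7.0)    # e.g. a page fault delays the reply past 5 ms
```

With a 5 ms timer, a reply that takes 7 ms triggers one retransmission
even though nothing was lost, which matches the kind of harmless REXMIT
lines seen in the log.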

--Frank

> _______________________________________________
> chord mailing list
> chord at amsterdam.lcs.mit.edu
> https://amsterdam.lcs.mit.edu/mailman/listinfo/chord


