[chord] C# Implementation of Chord (NChord)

Andrew Cencini acencini at gmail.com
Thu Aug 7 11:41:42 EDT 2008


Hi, Emil -

Thanks for adding the link.  Most of the testing and tuning was carried out
in a previous life in simulation on between one and 279 physical nodes.  I
built a set of automated integration tests (sadly, not included with the
distribution) that would build a simulated ring of some size (I have other
interesting observations about some fast and not-so-fast ways to build large
Chord rings), while also killing nodes at a fixed rate (not using depart).
While nodes were joining and being killed off, I would run FindSuccessor (on
a number of threads in parallel) against the IDs of some number of known
good nodes in the system, to ensure that routing was working correctly to
those nodes (pleasantly, it did).

In fact, the tool I used to generate the images at
http://www.cencini.com/ChordTopoMaps was in fact derived from a more
simplistic ring-continuity-verification tool I had built (I added hopcount
information to see if hopcounts would suffer terribly during churn, and
found that while there were a few "sticky" instances at times, the routing
remained correct and efficient).  Prior to that tool / those tests, I had
built a different test tool, ChordViewer, which simply followed successors
around the ring for visualization (I didn't know your vis tool existed at
the time but they appear to be pretty similar).  That approach, as I'm sure
you've found, works great in stable times, but under churn, following the
successor chain doesn't cut it - thus switching over to using FindSuccessor
to verify the correctness of the ring.

In that previous life, I worked for a hardware vendor, so I was able to get
my hands on a few incredibly nice collections of machines at various times
for testing.  One of my favorite failure conditions we came across was in a
factory setting, when one of the switches a number of Chord nodes was
attached to was misconfigured, causing the nodes periodically all to drop
their connection to each other.  I added a seed-node-cache-and-rejoin
maintenance task that helped the ring re-form itself once connectivity was
restored - this worked pretty well so I included an extremely simple version
of this with NChord.  A lot more obviously can (and has) been done in that
area.

As regards testing this in a wide area network, sadly, this work has (until
now) been relegated to internal test environments (mostly aimed at
medium-to-large internet data center deployments).  I'm hoping some
enterprising souls out there will find it interesting enough to help modify
this distribution a little bit to be better-suited to a wide-area scenario.
I've done a bunch of other micro-tweaking experiments on the Chord
implementation over the past couple of years though I haven't observed (in
most cases) anything too shocking though I am certain to keep trying.  

What's included with NChord is sort of the most straightforward and highly
functional set of functionality I could put together - mainly out of
courtesy to those who want to perform experiments using this implementation
of Chord without having to explain or factor out non-standard optimizations,
etc.  I also have written and submitted a paper on building a structured
object storage and retrieval system on top of a predecessor to NChord (core
routing, etc. should be functionally identical though the implementation may
differ somewhat).  I'd be happy to share a draft of the paper privately if
there's interest in seeing those results as well.  

Please let me know if there are any other questions or comments you may
have.  I have really enjoyed working with Chord, and it's also tremendously
enjoyable to discuss and hear what good work others are up to with it.

Thanks,
--andrew

-----Original Message-----
From: Emil Sit [mailto:sit at MIT.EDU] 
Sent: Thursday, August 07, 2008 4:35 AM
To: Andrew Cencini
Cc: chord at amsterdam.lcs.mit.edu
Subject: Re: [chord] C# Implementation of Chord (NChord)

On Wed, 06 August 2008 at 19:17 (-0700), Andrew Cencini wrote:
> implementation of Chord.  After spending a good amount of time tweaking
and
> tuning the implementation, I've put up an open-source version of this
> implementation (NChord) on SourceForge.

Cool.  I've added a link to your site at

    http://pdos.csail.mit.edu/chord/faq.html#lang

Can you write a little bit about how you tested and tuned your
implementation? e.g. what churn and network failure conditions?
It looks like you had a nice cluster of over 5000 chord nodes
running at some point.  Do you have experience running it in
the wide-area at all?

Thanks!

-- 
Emil Sit / MIT CSAIL PDOS / http://pdos.csail.mit.edu/chord/  



More information about the chord mailing list