Documentation

Before you get started

There are three basic building blocks for a full WheelFS setup:

  • Storage servers: Nodes that are responsible for storing and serving files and directories.
  • Clients: Nodes that mount WheelFS via FUSE and communicate with storage servers to access and update files and directories on behalf of some user or application.
  • Configuration servers: Nodes that keep track of all the storage servers in a WheelFS network.

Any particular computer can host any of these components at the same time, and can even run multiple copies of the same components at the same time (listening on different ports and using different mount points, of course). A functional WheelFS network must have at least one instance of each building block running somewhere.

If you'd like to use our existing PlanetLab WheelFS network, all you need to worry about is setting up clients for yourself – we already have a network of configuration and storage servers that you can connect to and use for free.

Installing WheelFS

PlanetLab nodes

Currently, we suggest that you run the WheelFS client in your own slice, and use it to connect to our network of WheelFS PlanetLab servers. To do that:

  • On your local machine, generate a unique public/private RSA key pair using ssh-keygen. Use an empty password when generating your keys.
  • Send an email to wheelfs-admin at the domain pdos.csail.mit.edu including your PlanetLab slice name and the RSA public key you just generated. Also explain how you plan to use WheelFS. You will get a response within a few days.
  • Send an email to support at the domain planet-lab.org asking them to add the fd_fusemount and umount vsys scripts to your slice, for all nodes in the “wfs” group.
  • Download the necessary PlanetLab RPMs: fuse, fuse-libs, and wfs-client-pl.
  • Distribute and install these RPMs on all the PlanetLab nodes on which you want to run. For now, you can only use nodes in the “wfs” group. To find those nodes and add them to your slice, you can run this script: getnodes_wfs.py
  • On every node, save the generated public/private key pair in a pair of local files: /etc/wfs_key_dir/user_keys/slice_name[.pub]. For example, if you upload your key pair to each node as /tmp/id_rsa[.pub], running these commands on each node will do what you want:
sudo mkdir -p /etc/wfs_key_dir/user_keys
sudo chmod -R 777 /etc/wfs_key_dir
mv /tmp/id_rsa /etc/wfs_key_dir/user_keys/`whoami`
mv /tmp/id_rsa.pub /etc/wfs_key_dir/user_keys/`whoami`.pub
  • Run /etc/init.d/fuse-client-pl start on every node. This mounts WheelFS at /mnt/wfs.
  • If you'd like WheelFS to survive a machine or vserver reboot, add /etc/init.d/fuse-client-pl start to /etc/rc.vinit on all the nodes. (This might not work 100% of the time.)
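The key-installation step above can be wrapped in a small function so that the same commands run identically on every node. This is only a sketch: the function name install_wfs_keys and its arguments are hypothetical, and it copies the keys with cp rather than moving them with mv. The destination defaults to the key directory named above, which normally requires root privileges to create.

```shell
#!/bin/sh
# Hypothetical helper (not part of WheelFS): copy an uploaded key pair into
# the WheelFS key directory.
#   $1 = path prefix of the uploaded keys (expects $1 and $1.pub)
#   $2 = slice name (defaults to the current user, as in the commands above)
#   $3 = key directory (defaults to the location named in this document)
install_wfs_keys() {
    src=$1
    slice=${2:-$(whoami)}
    key_dir=${3:-/etc/wfs_key_dir/user_keys}

    # Refuse to continue if either half of the pair is missing.
    [ -f "$src" ] && [ -f "$src.pub" ] || {
        echo "missing $src or $src.pub" >&2
        return 1
    }

    mkdir -p "$key_dir" || return 1
    cp "$src" "$key_dir/$slice" || return 1
    cp "$src.pub" "$key_dir/$slice.pub" || return 1
}
```

With sufficient privileges, install_wfs_keys /tmp/id_rsa would place the pair under the default key directory. For the key-generation step, ssh-keygen -t rsa -N "" -f /tmp/id_rsa produces an RSA pair with an empty passphrase.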

TODO: How to run a full server+client WheelFS network.

See the PlanetLab user tools page for tools that might help you run these commands on many nodes in parallel.

Non-PlanetLab nodes

We do not officially support running WheelFS on non-PlanetLab machines. You can give it a try, but we can't promise it will work (or even compile).

  • First, make sure that the machine you're compiling on meets all the compilation prerequisites.
  • Then, download the latest WheelFS source code and compile it.
  • Next, create a distribution to install on all your target WheelFS nodes.
  • Now install that distribution on all the target nodes.
  • Then follow the instructions for running WheelFS from the command line.

Using WheelFS

Introduction

WheelFS looks just like a regular part of your file system, and each client node will see the same files in its WheelFS mount as all the other nodes. You and your application can access files and directories within the WheelFS mount point just as if it were a local file system – by default, WheelFS offers the same close-to-open consistency semantics as a local file system (which means once one node closes a file, another node that opens the same file will see the changes made by the first node). However, because WheelFS is designed to run over the wide-area network, it provides controls to you and your application that can be used to change its behavior.

We call these controls semantic cues, and they are very easy to use. They can be inserted into the pathname of a file, just like a directory. For example, consider the following file (where WheelFS is mounted at /mnt/wfs/):

/mnt/wfs/strib/foo

If an application wanted WheelFS to open the file using eventual consistency semantics (see below for an explanation of eventual consistency), it would access the file using the following pathname:

/mnt/wfs/strib/.EventualConsistency/foo

Both of these pathnames refer to exactly the same file: an update made through either pathname on one node will eventually be visible to another node that reads the file through either pathname. It is the same file, accessed through a different path, with potentially different consistency behavior.
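Because a cue is just another path component, an application can build cue-qualified pathnames with ordinary string handling. The sketch below shows one way to do that in shell; the helper name with_cue is hypothetical and not part of WheelFS.

```shell
# Hypothetical helper: insert a semantic cue just before the last path
# component, so the cue applies to that file.
#   $1 = plain pathname, $2 = cue (full name or abbreviation)
with_cue() {
    path=$1
    cue=$2
    printf '%s/%s/%s\n' "$(dirname "$path")" "$cue" "$(basename "$path")"
}

with_cue /mnt/wfs/strib/foo .EventualConsistency
# prints /mnt/wfs/strib/.EventualConsistency/foo
```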

Semantic cues

Cues come in a few different categories, generally chosen to reflect the challenges presented by wide-area networks:

  • Consistency: These cues allow applications to control how hard WheelFS should try to present an up-to-date view of data across all nodes. In general, to get better consistency, applications must be willing to endure slower file system operation when there are node failures.
  • Placement: These cues allow applications to choose where data are stored in WheelFS so that, for example, data can be read quickly by nodes that are likely to access it in the future. Storage locations are specified at the granularity of a site; each node is configured by the WheelFS administrator to belong to one site. For the PlanetLab deployment, nodes are grouped together using their domain names (e.g., planetlab2.csail.mit.edu and planetlab3.csail.mit.edu would belong to the same site, “csail.mit.edu”).
  • Durability: These cues allow applications to control the amount of resources and time WheelFS spends making sure data is durable and will survive failures. In general, to make data more durable, the application must use more bandwidth and writes will be slower.
  • Large reads: These cues allow applications to enable large read optimizations. If the application knows that it will be reading all of a large file, or if a lot of nodes will be reading the same large file at the same time (e.g., a program binary at startup time), these cues might improve performance.
  • Administrative: Miscellaneous cues useful for system administration.

Some cues are permanent, while others are transient. Permanent cues should be specified when a file or directory is created, and affect the behavior of that file or directory for the rest of its existence. Transient cues, on the other hand, only affect a specific reference to a file or directory (e.g., when open returns a file handle for a pathname using a transient cue, all reads/writes using that file handle will be affected by the cue). Later references to the file that do not use the cue will not be affected.

Below is a summary of the cues that WheelFS currently supports. Cues can be referenced either by their full name or their abbreviation. Applications can override cues that appear earlier in the pathname by using the cue's anti-cue.

Category | Cue (abbrev) | Type | Description | Anti-cue (abbrev)
Consistency | .EventualConsistency (.ec) | Transient | Relax WheelFS's default close-to-open consistency semantics. This means some nodes may not read the most recent writes done by another node. Improves performance when there are failures, since WheelFS will not wait a long time for the failure to heal. | .Strict
Consistency | .MaxTime=T (.mt=T) | Transient | When WheelFS communicates over the network to other nodes, only wait T milliseconds for a response. The default network timeout is T=10000 milliseconds (10 seconds). | -
Placement | .Site=X | Permanent | Place the primary copy of data created under this cue at site X whenever possible. | -
Placement | .KeepTogether (.kt) | Permanent | When a directory is created using this cue, any later subdirectory or file created under it will be placed at the same site as its parent directory. Allows applications to ensure that an entire subtree of data is co-located at a single site for quick access. | .WriteLocal (.wl)
Placement | .RepSites=NRS (.rs=NRS) | Permanent | Replicate data at NRS distinct sites whenever possible. By default, NRS=NRL – that is, each replica of a file or directory is kept at a distinct site. | -
Durability | .RepLevel=NRL (.rl=NRL) | Permanent | Replicate a file or directory NRL times, where NRL ≥ 1. By default, NRL = 3. | -
Durability | .SyncLevel=NSL (.sl=NSL) | Transient | When updating a file or directory, ensure the update is synchronized to at least NSL replicas before returning successfully to the application, where NSL ≥ 0 and NSL ≤ NRL. By default, NSL = NRL to ensure consistency in the face of failures. | -
Large reads | .WholeFile (.wf) | Transient | Tells WheelFS that the entire file will eventually be read, so that WheelFS can start caching the data in anticipation of future requests. For directories, tells WheelFS that all of the entries in the directory will be inspected (e.g., ls -l), so that WheelFS can gather the necessary information in advance. This can waste bandwidth if the entire file or directory is not subsequently used by the application. | .NotWholeFile (.nwf)
Large reads | .HotSpot (.hs) | Transient | Enables WheelFS's peer-to-peer cooperative caching mode for a particular file. This attempts to read data for a file from nearby nodes that have already read the data, similar to BitTorrent. Can be used to efficiently distribute large files being read by many clients simultaneously. | .NotHotSpot (.nhs)
Administrative | .Sites | Transient | Acts as a virtual directory that lists the site names of all currently live WheelFS storage servers. | -
Administrative | .Nodes | Transient | Acts as a virtual directory that lists the IP address, port, and internal WheelFS ID of all currently live WheelFS storage servers. If used within a specific site directory (e.g., /wfs/.sites/csail.mit.edu/.nodes), it will only list nodes that are associated with the given site. | -
Administrative | .Ident=X | Transient | Manage permissions for a particular WheelFS user X. By default, calls on WheelFS such as chmod or ls -l manage permissions for the user that the application is running as. Using the cue, applications can set or list permissions for any WheelFS user. | -
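Since every cue with an abbreviation can be written either way, a script that generates pathnames may want to shorten them. The following sketch maps full cue names to the abbreviations listed in the table above; the function name abbrev_cue is hypothetical, and cues for which the table lists no abbreviation are passed through unchanged.

```shell
# Hypothetical helper: map a full cue name to its abbreviation, using only
# the pairs listed in the table above. Cues without a listed abbreviation
# (for example .Site=X or .Sites) are printed unchanged.
abbrev_cue() {
    cue=$1
    case $cue in
        .EventualConsistency) echo .ec ;;
        .MaxTime=*)           echo ".mt=${cue#*=}" ;;
        .KeepTogether)        echo .kt ;;
        .WriteLocal)          echo .wl ;;
        .RepSites=*)          echo ".rs=${cue#*=}" ;;
        .RepLevel=*)          echo ".rl=${cue#*=}" ;;
        .SyncLevel=*)         echo ".sl=${cue#*=}" ;;
        .WholeFile)           echo .wf ;;
        .NotWholeFile)        echo .nwf ;;
        .HotSpot)             echo .hs ;;
        .NotHotSpot)          echo .nhs ;;
        *)                    echo "$cue" ;;
    esac
}

abbrev_cue .MaxTime=1000
# prints .mt=1000
```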

Example uses

These examples assume you have WheelFS mounted at /mnt/wfs.

Cache directory

If you'd like to use WheelFS to store cached data that doesn't have to be strongly consistent (i.e., it's ok if some nodes sometimes see an out-of-date version of the data), but you'd like to limit network timeouts to one second, you can use a directory path for the top level of the cache as follows:

/mnt/wfs/app_name/.EventualConsistency/.MaxTime=1000
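A cache built on this path might be wrapped as follows. This is a sketch under stated assumptions: the helper names cache_dir and cache_get and the WFS_ROOT variable are hypothetical, and the sketch simply treats any failed read (including one that gives up after the timeout) as a cache miss.

```shell
# Assumed mount point from this document; override WFS_ROOT to point elsewhere.
WFS_ROOT=${WFS_ROOT:-/mnt/wfs}

# Hypothetical helper: print the cue-qualified cache directory for an application.
#   $1 = application name, $2 = network timeout in milliseconds (default 1000)
cache_dir() {
    printf '%s/%s/.EventualConsistency/.MaxTime=%s\n' "$WFS_ROOT" "$1" "${2:-1000}"
}

# Hypothetical helper: print a cached entry; a non-zero exit status means a miss.
#   $1 = application name, $2 = entry name
cache_get() {
    cat "$(cache_dir "$1")/$2" 2>/dev/null
}
```

A caller could then write value=$(cache_get app_name some_key) and fall back to recomputing the value when the command fails.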

Program binary

If you're using WheelFS to distribute a large binary to many nodes (e.g., in order to start a large distributed experiment), you can direct the nodes to read the file out of WheelFS using a pathname like this one:

/mnt/wfs/app_name/.WholeFile/.HotSpot/program_binary
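On each node, a startup script could copy the binary out of WheelFS through this path before running it. The sketch below is illustrative only; the helper name fetch_binary and the WFS_ROOT variable are hypothetical.

```shell
# Assumed mount point from this document; override WFS_ROOT to point elsewhere.
WFS_ROOT=${WFS_ROOT:-/mnt/wfs}

# Hypothetical helper: copy a binary out of WheelFS through the large-read
# cues and mark the local copy executable.
#   $1 = application name, $2 = binary name, $3 = local destination path
fetch_binary() {
    src="$WFS_ROOT/$1/.WholeFile/.HotSpot/$2"
    cp "$src" "$3" && chmod +x "$3"
}
```

For example, fetch_binary app_name program_binary /tmp/program_binary reads the file through both cues and leaves an executable copy in /tmp.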

Geographic data placement

Let's say you're using WheelFS to build a large Web service, and your users each have an individual set of data that they read all the time. Let's also say that you know which of your data centers each user is closest to (and will be routed to each time they access your website). When you create an account for a new user, you could create a directory like the following for them in WheelFS, where closest_site is the WheelFS site name you've assigned to the nodes at the data center closest to the new user:

/mnt/wfs/app_name/.Site=closest_site/.KeepTogether/user_name

Then, any time your Web service needs to save data for that user in the file system, it can simply write into the directory subtree starting at /mnt/wfs/app_name/user_name/ (no need to use placement cues once the directory is created), and WheelFS will guarantee that a replica will be located close to the user.
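The difference between the one-time creation path and the path used afterwards can be sketched as two small functions. The names user_create_path, user_data_path, and create_user_dir, and the WFS_ROOT variable, are hypothetical; only the path layout comes from the example above.

```shell
# Assumed mount point from this document; override WFS_ROOT to point elsewhere.
WFS_ROOT=${WFS_ROOT:-/mnt/wfs}

# Path used once, at account creation, carrying the permanent placement cues.
#   $1 = application name, $2 = site name, $3 = user name
user_create_path() {
    printf '%s/%s/.Site=%s/.KeepTogether/%s\n' "$WFS_ROOT" "$1" "$2" "$3"
}

# Path used for all later reads and writes; no cues are needed.
#   $1 = application name, $2 = user name
user_data_path() {
    printf '%s/%s/%s\n' "$WFS_ROOT" "$1" "$2"
}

# Create the user's directory through the cue-qualified path.
create_user_dir() {
    mkdir "$(user_create_path "$1" "$2" "$3")"
}
```

After create_user_dir app_name closest_site user_name succeeds on a WheelFS mount, the service would read and write under the directory printed by user_data_path app_name user_name.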

 
documentation.txt · Last modified: 2009/08/06 15:31 by strib