In this lab you will write a device driver for a network interface card (NIC) and add support for UDP network sockets to xv6. This lab has 4 weeks allocated to it. We expect you can do parts 1 and 2 each in a week, leaving another week (excluding the Thanksgiving break) for part 3.
Fetch the xv6 source for the lab and check out the net branch:
$ git fetch
$ git checkout net
$ make clean
Before writing code, you may find it helpful to review "Chapter 4: Traps and device drivers", "Section 7.13: File descriptor layer" from the xv6 book, and the lecture notes on networking.
You'll use a network device called the E1000 to handle network communication. To xv6 (and the driver you write), the E1000 looks like a real piece of hardware connected to a real Ethernet local area network (LAN). But in reality, the E1000 your driver will talk to is an emulation provided by qemu, connected to a LAN that is also emulated by qemu. On this LAN, xv6 (the "guest") has an IP address of 10.0.2.15. The only other (emulated) computer on the LAN has IP address 10.0.2.2. qemu arranges that when xv6 uses the E1000 to send a packet to 10.0.2.2, it's really delivered to the appropriate application on the (real) computer on which you're running qemu (the "host").
You will use QEMU's "user-mode network stack" to emulate a LAN. QEMU's documentation has more about the user-mode stack. We've updated the Makefile to enable QEMU's user-mode network stack and the E1000 network card.
We have also configured QEMU to record all incoming and outgoing packets to packets.pcap in your lab directory. It may be helpful to review these recordings to confirm that xv6 is sending and receiving the packets you expect. To get a hex/ASCII dump of captured packets use tcpdump like this:
tcpdump -XXnr packets.pcap
We've added some files to the xv6 repository for this lab to help get you started. kernel/net.c and kernel/net.h contain all the code you will need to create and parse packet headers for Ethernet, IP, UDP, and ARP. These files also contain code for a flexible data structure to hold packets, called an mbuf. The file kernel/e1000.c contains initialization code for the E1000 as well as empty functions for transmitting and receiving packets, which you'll fill in. kernel/e1000_dev.h contains definitions for registers and flag bits defined by the E1000 and described in the Intel E1000 Software Developer's Manual. Finally, kernel/pci.c contains code that runs when xv6 boots and searches for an E1000 card on the PCI bus.
In this part of the assignment, you will complete the implementation of the E1000 networking driver by adding code to handle received packets and to transmit packets.
Browse Intel's Software Developer's Manual for the E1000. This manual covers several closely related Ethernet controllers. QEMU emulates the 82540EM. Skim Chapter 2 now to get a feel for the device. To write your driver, you'll need to be familiar with Chapters 3 and 14, as well as 4.1 (though not 4.1's subsections). You'll also need to use Chapter 13 as a reference. The other chapters mostly cover components of the E1000 that your driver won't have to interact with. Don't worry about the details right now; just get a feel for how the document is structured so you can find things later. Keep in mind that the E1000 has many advanced features, most of which you can ignore. Only a small set of basic features is needed to complete this lab.
Your job is to implement support for sending and receiving packets. You'll need to write code for e1000_recv() and e1000_transmit(), both in kernel/e1000.c.
The e1000_init() function we provide you in e1000.c configures the E1000 to read packets to be transmitted from RAM, and to write received packets to RAM. This technique is called DMA, for direct memory access, referring to the fact that the E1000 hardware directly writes and reads packets to/from RAM.
Because bursts of packets might arrive faster than the driver can process them, e1000_init() provides the E1000 with multiple buffers into which the E1000 can write packets. The E1000 requires these buffers to be described by an array of "descriptors" in RAM; each descriptor contains an address in RAM where the E1000 can write a received packet. struct rx_desc describes the descriptor format. The array of descriptors is called the receive ring, or receive queue. It's a circular ring in the sense that when the card or driver reaches the end of the array, it wraps back to the beginning. e1000_init() allocates mbuf packet buffers for the E1000 to DMA into. There is also a transmit ring into which the driver places packets it wants the E1000 to send. e1000_init() configures the two rings to have size RX_RING_SIZE and TX_RING_SIZE.
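The wrap-around behaviour of a descriptor ring can be illustrated with a small userspace sketch. Everything here is hypothetical: DEMO_RING_SIZE, struct demo_desc, and ring_next() are names invented for illustration, and the descriptor layout is far simpler than the real struct rx_desc.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE 16  /* hypothetical; the lab uses RX_RING_SIZE and TX_RING_SIZE */

/* Hypothetical, simplified descriptor: a buffer address and a status byte. */
struct demo_desc {
  uint64_t addr;   /* where in RAM the device may write a packet */
  uint8_t status;  /* flag bits set by the device */
};

/* Advance an index around a ring of `size` slots, wrapping at the end. */
static uint32_t ring_next(uint32_t idx, uint32_t size)
{
  assert(size > 0 && idx < size);
  return (idx + 1) % size;
}
```

Stepping from the last slot with ring_next(DEMO_RING_SIZE - 1, DEMO_RING_SIZE) yields 0, which is what makes the array behave as a circle for both the card and the driver.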
When the E1000 receives new packets from the ethernet, it first DMAs them to mbufs pointed to by RX (receive) ring descriptors, and then generates an interrupt. Your e1000_recv() code must scan the RX ring and deliver each new packet's mbuf to the network stack (in net.c) by calling net_rx(). You will then need to allocate a new mbuf and place it into the descriptor, so that when the E1000 reaches that point in the RX ring again it finds a fresh buffer into which to DMA a new packet.
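The scan-deliver-replace cycle described above might look roughly like the following userspace simulation. It is a sketch under stated assumptions, not the driver itself: buffers are represented by integer ids instead of mbufs, DEMO_STAT_DD is an assumed "descriptor done" flag, and the choice to start scanning at the slot after the tail index is an assumption of this sketch. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RX_RING 4
#define DEMO_STAT_DD 0x01  /* assumed flag: the device has filled this slot */

struct demo_rx_desc {
  int buf_id;      /* stands in for a pointer to an mbuf */
  uint8_t status;
};

struct demo_rx_ring {
  struct demo_rx_desc desc[DEMO_RX_RING];
  uint32_t tail;    /* last slot handed back to the device */
  int next_buf_id;  /* source of "freshly allocated" buffers */
};

/* Deliver every completed packet, refill its slot, and return the count. */
static int demo_rx_drain(struct demo_rx_ring *r, int *delivered, int max)
{
  int n = 0;
  assert(r != 0 && delivered != 0);
  while (n < max) {
    uint32_t i = (r->tail + 1) % DEMO_RX_RING;
    struct demo_rx_desc *d = &r->desc[i];
    if (!(d->status & DEMO_STAT_DD))
      break;                       /* no more received packets */
    delivered[n++] = d->buf_id;    /* stands in for handing the mbuf to net_rx() */
    d->buf_id = r->next_buf_id++;  /* stands in for allocating a fresh mbuf */
    d->status = 0;                 /* slot is ready for the device again */
    r->tail = i;
  }
  return n;
}
```

The loop terminates because each processed slot has its status cleared, so a full lap around the ring always ends at a slot without the done flag.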
When the network stack in net.c needs to send a packet, it calls e1000_transmit() with an mbuf that holds the packet content to be sent. Your transmit code must place the mbuf in a descriptor in the TX (transmit) ring. This includes extracting the payload's location in memory and its length, and encoding this information into a descriptor in the TX ring. struct tx_desc describes the descriptor format. You will need to ensure that mbufs are eventually freed, but only after the E1000 has finished sending the packet (the E1000 sets the E1000_TXD_STAT_DD bit in the descriptor to indicate this).
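The rule that a buffer may be freed only after the device marks its descriptor done can be sketched in userspace as follows. This is an illustration with hypothetical names (struct demo_tx_ring, demo_tx_enqueue(), DEMO_TXD_STAT_DD standing in for E1000_TXD_STAT_DD); it omits the command flags and register writes a real driver needs, and it counts frees rather than releasing real mbufs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TX_RING 4
#define DEMO_TXD_STAT_DD 0x01  /* stands in for E1000_TXD_STAT_DD */

struct demo_tx_desc {
  const void *addr;  /* location of the packet data */
  uint16_t length;   /* number of bytes to send */
  uint8_t status;    /* device sets DD when the send has finished */
};

struct demo_tx_ring {
  struct demo_tx_desc desc[DEMO_TX_RING];
  const void *pending[DEMO_TX_RING];  /* buffer queued in each slot, not yet freed */
  uint32_t tail;                      /* next slot the driver will fill */
  int freed;                          /* how many buffers have been released */
};

/* Mark every slot as done so that the driver may use it. */
static void demo_tx_init(struct demo_tx_ring *r)
{
  for (int i = 0; i < DEMO_TX_RING; i++) {
    r->desc[i].addr = NULL;
    r->desc[i].length = 0;
    r->desc[i].status = DEMO_TXD_STAT_DD;
    r->pending[i] = NULL;
  }
  r->tail = 0;
  r->freed = 0;
}

/* Queue one buffer; returns -1 if the device has not finished with the slot. */
static int demo_tx_enqueue(struct demo_tx_ring *r, const void *buf, uint16_t len)
{
  uint32_t i = r->tail;
  struct demo_tx_desc *d = &r->desc[i];
  assert(buf != NULL);
  if (!(d->status & DEMO_TXD_STAT_DD))
    return -1;                /* ring is full */
  if (r->pending[i] != NULL)  /* previous packet in this slot has been sent */
    r->freed++;               /* stands in for freeing the old mbuf */
  d->addr = buf;
  d->length = len;
  d->status = 0;              /* cleared until the device sets DD again */
  r->pending[i] = buf;
  r->tail = (i + 1) % DEMO_TX_RING;
  return 0;
}
```

Deferring the free until the slot is reused is one possible design; it avoids a separate completion pass at the cost of holding each buffer until the ring wraps around.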
In addition to reading and writing the descriptor rings in RAM, your driver will need to interact with the E1000 through its memory-mapped control registers, to detect when received packets are available and to inform the E1000 that the driver has filled in some TX descriptors with packets to send. The global variable regs holds a pointer to the E1000's first control register; your driver can get at the other registers by indexing regs as an array. You'll need to use indices E1000_RDT and E1000_TDT in particular.
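Indexing a register window as an array of 32-bit words can be sketched as below. This is a userspace stand-in, not the driver: demo_regs is an ordinary array in place of the memory-mapped region, and the byte offsets used for DEMO_RDT and DEMO_TDT are illustrative values that should be checked against e1000_dev.h and the manual.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NREGS (0x4000 / 4)  /* hypothetical window size, in 32-bit words */

/* Illustrative offsets: a byte offset divided by 4 gives the word index. */
#define DEMO_RDT (0x2818 / 4)
#define DEMO_TDT (0x3818 / 4)

/* Stand-in for the mapped registers; volatile so accesses are not optimized away. */
static volatile uint32_t demo_regs[DEMO_NREGS];

static void demo_reg_write(uint32_t index, uint32_t value)
{
  assert(index < DEMO_NREGS);
  demo_regs[index] = value;
}

static uint32_t demo_reg_read(uint32_t index)
{
  assert(index < DEMO_NREGS);
  return demo_regs[index];
}
```

For example, demo_reg_write(DEMO_TDT, 3) followed by demo_reg_read(DEMO_TDT) returns 3 in this simulation; on real hardware the device may also update such registers itself.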
To test your driver, run make qemu in one window, and in another window on the same machine run make ping. This tool asks your host operating system to send one UDP packet per second to xv6. First, however, your host operating system sends an "ARP" request packet to xv6 to find out its 48-bit Ethernet address, and expects xv6 to respond with an ARP reply. Your xv6 won't be able to send these ARP replies until you've completed this part of the lab, so your host operating system will keep sending ARP requests.
Once you've finished this part, when you run make ping, xv6 will transmit just one packet (an ARP reply) and ping will send UDP/IP packets. tcpdump -XXnr packets.pcap should produce output like this:
reading from file packets.pcap, link-type EN10MB (Ethernet)
22:01:52.121066 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28
        0x0000:  ffff ffff ffff 5255 0a00 0202 0806 0001  ......RU........
        0x0010:  0800 0604 0001 5255 0a00 0202 0a00 0202  ......RU........
        0x0020:  0000 0000 0000 0a00 020f                 ..........
22:01:52.121516 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56, length 28
        0x0000:  ffff ffff ffff 5254 0012 3456 0806 0001  ......RT..4V....
        0x0010:  0800 0604 0002 5254 0012 3456 0a00 020f  ......RT..4V....
        0x0020:  5255 0a00 0202 0a00 0202                 RU........
22:01:52.121646 IP 10.0.2.2.54253 > 10.0.2.15.2000: UDP, length 15
        0x0000:  5254 0012 3456 5255 0a00 0202 0800 4500  RT..4VRU......E.
        0x0010:  002b 0000 0000 4011 62b2 0a00 0202 0a00  .+....@.b.......
        0x0020:  020f d3ed 07d0 0017 399b 7468 6973 2069  ........9.this.i
        0x0030:  7320 6120 7069 6e67 21                   s.a.ping!
22:01:53.126538 IP 10.0.2.2.54253 > 10.0.2.15.2000: UDP, length 15
        0x0000:  5254 0012 3456 5255 0a00 0202 0800 4500  RT..4VRU......E.
        0x0010:  002b 0001 0000 4011 62b1 0a00 0202 0a00  .+....@.b.......
        0x0020:  020f d3ed 07d0 0017 399b 7468 6973 2069  ........9.this.i
        0x0030:  7320 6120 7069 6e67 21                   s.a.ping!
Your output will look somewhat different, but it should contain the strings "ARP, Request", "ARP, Reply", "UDP", and "this.is.a.ping!".
Some hints for receiving:
Some hints for sending:
By the way, you'll need locks to cope with the possibility that xv6 might use the E1000 from more than one process, or might be using the E1000 in a kernel thread when an interrupt arrives.
Now that you have finished the E1000 driver, you will need to support userspace applications. To help with this, a test user program called nettests has been provided, but you will need to implement support for network sockets first so that it can interact with xv6.
Sockets are a standard system call API to allow user programs to use the network. Real operating systems support many kinds of sockets, but for this lab you will implement only sockets that allow access to the UDP network protocol. A UDP socket allows a program to exchange packets with a program on another computer. In order for the UDP protocol to know which programs are communicating, each socket is tagged with a local port number, a remote IP address, and a remote port number. You can find out more about UDP with a web search for User Datagram Protocol.
A socket appears to a user program as a file descriptor. A program creates a new socket with the connect(raddr, lport, rport) system call. raddr is the 32-bit IP address of the other (remote) computer, lport is the local port number, and rport is the port number on the other computer. Calling read() on a socket file descriptor receives a single packet. If no packets are currently available to be received, a read blocks and waits for the next packet to arrive. Calling write() sends a packet. You can see some examples in user/nettests.c.
Each network socket only receives packets for the combination of remote IP address and port numbers passed to connect(). Your implementation is required to support multiple sockets.
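Matching an incoming packet to a socket by its (remote address, local port, remote port) triple could be sketched like this. The struct and function names are hypothetical and the list is a plain userspace linked list; the lab's real struct sock has more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical socket record holding only the matching key. */
struct demo_sock {
  uint32_t raddr;  /* remote IPv4 address */
  uint16_t lport;  /* local UDP port */
  uint16_t rport;  /* remote UDP port */
  struct demo_sock *next;
};

/* Return the socket whose triple matches exactly, or NULL if none does. */
static struct demo_sock *demo_sock_find(struct demo_sock *head, uint32_t raddr,
                                        uint16_t lport, uint16_t rport)
{
  for (struct demo_sock *s = head; s != NULL; s = s->next) {
    if (s->raddr == raddr && s->lport == lport && s->rport == rport)
      return s;
  }
  return NULL;
}
```

A packet whose triple matches no socket yields NULL here; what to do with such a packet is left to the caller.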
We have provided you with an implementation of sys_connect() in kernel/sysfile.c and a partial implementation of sockets in kernel/sysnet.c. sys_connect() calls sockalloc(), which creates a struct sock object for each socket. sockets is a linked list of all active sockets. It is useful for finding which socket to deliver newly received packets to. Each socket object maintains a queue of mbufs waiting to be read. Packets stay in these queues until the read() system call dequeues them.
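A per-socket queue of received packets behaves like a first-in, first-out list. The sketch below shows one way to build such a queue in userspace; struct demo_pkt and struct demo_pktq are hypothetical stand-ins for the mbuf and its queue type, and the sketch has no locking or blocking, both of which a kernel implementation would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet record; a real mbuf carries headers and data. */
struct demo_pkt {
  int id;
  struct demo_pkt *next;
};

struct demo_pktq {
  struct demo_pkt *head;  /* oldest packet, next to be read */
  struct demo_pkt *tail;  /* newest packet */
};

/* Append a packet at the tail of the queue. */
static void demo_pktq_push(struct demo_pktq *q, struct demo_pkt *p)
{
  assert(q != NULL && p != NULL);
  p->next = NULL;
  if (q->tail != NULL)
    q->tail->next = p;
  else
    q->head = p;
  q->tail = p;
}

/* Remove and return the oldest packet, or NULL if the queue is empty. */
static struct demo_pkt *demo_pktq_pop(struct demo_pktq *q)
{
  assert(q != NULL);
  struct demo_pkt *p = q->head;
  if (p != NULL) {
    q->head = p->next;
    if (q->head == NULL)
      q->tail = NULL;
    p->next = NULL;
  }
  return p;
}
```

In this model the receive path would push and a read on the socket would pop, so packets are returned in arrival order.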
Your job is to add code to xv6 to implement read(), write(), and close() for sockets; add your code to the end of sysnet.c; you'll also have to modify kernel/file.c to hook into the system calls. You'll also need to complete the implementation of sockrecvudp(), which net.c calls each time a UDP packet is received from the E1000.
When your implementation is complete, running nettests in xv6 should produce output like this:
$ nettests
testing one ping: OK
testing single-process pings: OK
testing multi-process pings: OK
testing DNS
DNS arecord for pdos.csail.mit.edu. is [ITS_IP_ADDRESS]
DNS OK
all tests passed.
$
And, at the same time, this output from make server:
$ make server
python3 server.py 26099
listening on localhost port 26099
hello world!
...
Here are some hints:
In this part you can extend xv6 in any way you fancy. Most of the labs, including this one, have optional challenges that may provide inspiration, but you are not restricted to them. The example projects are another source of inspiration, though some of them are quite ambitious.
Your job is to extend xv6 in a way that is interesting to you. Write a paragraph about what you did and submit it in answers-net.txt.
This completes the lab. Make sure you pass all of the make
grade tests. If this lab had questions, don't forget to write up your
answers to the questions in answers-lab-name.txt. Commit your changes
(including adding answers-lab-name.txt) and type make handin in the lab
directory to hand in your lab.
Create a new file, time.txt, and put in it a single integer, the
number of hours you spent on the lab. Don't forget to git add and
git commit the file.
Submit the lab
You will turn in your assignments using the submission website. You need to request an API key from the submission website once before you can turn in any assignments or labs.
After committing your final changes to the lab, type make handin to submit your lab.
$ git commit -am "ready to submit my lab"
[util c2e3c8b] ready to submit my lab
 2 files changed, 18 insertions(+), 2 deletions(-)

$ make handin
tar: Removing leading `/' from member names
Get an API key for yourself by visiting https://6828.scripts.mit.edu/2020/handin.py/
Please enter your API key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 79258  100   239  100 79019    853   275k --:--:-- --:--:-- --:--:--  276k
$
make handin will store your API key in myapi.key. If you need to change your API key, just remove this file and let make handin generate it again (myapi.key must not include newline characters).
If you run make handin and you have either uncommitted changes or untracked files, you will see output similar to the following:
 M hello.c
?? bar.c
?? foo.pyc
Untracked files will not be handed in. Continue? [y/N]
Inspect the above lines and make sure all files that your lab solution needs are tracked, i.e., not listed in a line that begins with ??. You can cause git to track a new file that you create using git add filename.
If make handin does not work properly, try fixing the problem with the curl or Git commands. Or you can run make tarball. This will make a tar file for you, which you can then upload via our web interface.
If you pursue a challenge problem, whether it is related to networking or not, please let the course staff know!