6.097: OPERATING SYSTEM ENGINEERING
Fall 2002
Lab 5
Hand out date: Thursday November 7th
Due date: Monday November 18th
In this lab, you will write the code to exec() an
executable stored in the on-disk file system. In typical exokernel
fashion, exec() will be implemented in the user space
library operating system. The file system stored on-disk is trivial in
the extreme -- notably, it's read-only.
ossrc/lab5_answers.html (do not link to any external
files).
athena% cd ~/6.097/ossrc
athena% gmake handin
. . .
athena% ls -l handin5.tgz
A fool-proof way to accomplish this is to use the following command:
athena% mhmail 6.097-handin@pdos.lcs.mit.edu -subject http://web.mit.edu/PATH/TO/handin5.tgz -body empty
You should now download the code for the lab. Many files are absent from this tarball and must be copied from your lab 4 solutions.
Be careful not to overwrite your lab 4 solutions.
athena% add gnu 6.097 sipb
athena% cd ~/6.097
athena% mv ossrc lab4-solutions
athena% wget http://pdos.lcs.mit.edu/6.097/labs/lab5.tar.gz
athena% gtar -zvxf lab5.tar.gz
drwxr-xr-x cates/wheel 0 Sep 15 19:34 2002 ossrc/
drwxr-xr-x cates/wheel 0 Sep 15 19:34 2002 ossrc/kern/
drwxr-xr-x cates/wheel 0 Sep 15 19:34 2002 ossrc/kern/inc/
-rw-r--r-- cates/wheel 2528 Sep 14 17:00 2002 ossrc/kern/inc/asm.h
. . .
athena% cp lab4-solutions/kern/locore.S ossrc/kern
athena% cp lab4-solutions/kern/trap.c ossrc/kern
athena% cp lab4-solutions/kern/pmap.c ossrc/kern
athena% cp lab4-solutions/kern/env.c ossrc/kern
athena% cp lab4-solutions/kern/sched.c ossrc/kern
athena% cp lab4-solutions/kern/init.c ossrc/kern
athena% cp lab4-solutions/kern/syscall.c ossrc/kern
athena% cp lab4-solutions/kern/inc/syscall.h ossrc/kern/inc
athena% cp lab4-solutions/user/simple/libos.c ossrc/user/simple
Two changes have been made in the tarball which require your
attention. First, in ossrc/GNUmakefile.global the
optimization level has been reduced from -O6 to -O2 (-O6
was generating buggy code in some circumstances). This impacts
you because if you have latent bugs in your code, they might be
revealed by changing the optimization level.
Second, the __start code from
user/simple/libos.c has been moved into an assembly file
entry.S. You should comment out this code as it appears
in your libos.c. (For the sake of cleanliness, you might
wish to move your asm_pgfault_handler into
entry.S as well.)
The user/GNUmakefile has been modified so that it creates
a file system disk image. The image consists of a table of contents
followed by all the user a.out executables one after another, for each
program in ossrc/user. The table of contents is one
block in size (i.e., NBPG), and each executable is padded
out to the nearest block in size.
+--------------+
| TOC | NBPG
+--------------+
| executable 1 | K1 * NBPG
+--------------+
| executable 2 | K2 * NBPG
+--------------+
.
.
+--------------+
| executable n | Kn * NBPG
+--------------+
The tools/mkimg tool builds the disk image according to
this format. You should refer to it for the details (especially, for
the format of the table of contents).
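The layout above implies some simple arithmetic for where each executable begins. The following is only a sketch of that arithmetic: it assumes NBPG is 4096, and blocks_for() and start_block() are hypothetical helper names that do not appear in the lab sources. It does not describe the table of contents itself, whose format is defined by tools/mkimg.

```c
#include <assert.h>
#include <stddef.h>

#define NBPG 4096u	/* assumption: 4 KB blocks, matching the x86 page size */

/* Number of whole blocks needed to hold 'nbytes' (rounds up). */
unsigned int
blocks_for (unsigned int nbytes)
{
    return (nbytes + NBPG - 1) / NBPG;
}

/* Hypothetical helper: given the byte sizes of the executables in image
 * order, return the block number at which executable 'n' (0-based)
 * would start.  Block 0 holds the table of contents. */
unsigned int
start_block (const unsigned int *sizes, size_t n)
{
    unsigned int block = 1;	/* skip the TOC block */
    size_t i;

    assert (sizes != NULL || n == 0);
    for (i = 0; i < n; i++)
        block += blocks_for (sizes[i]);
    return block;
}
```

For instance, with executables of 1, 4096 and 4097 bytes, the three programs would start at blocks 1, 2 and 3, and the image would end at block 5.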
Look at ossrc/.bochsrc and you will find the line:
diskd: file="./user/bochs.img", cyl=200, heads=16, spt=63
This instructs Bochs to treat this image as the second disk drive.
In this exercise, you'll add a system call to read blocks from the disk. The disk interface is a lot simpler than v6 UNIX's interface, primarily because it only supports synchronous reads. It uses programmed I/O (i.e., inb/outb) to transfer the data from the disk.
Your task is to add a system call to your kernel which allows user
processes to read blocks from the disk. While the disk supports
reading sectors, the interface should be block based so it fits well
with the virtual memory system. The exact interface of your system
call is defined below. You should base its implementation around the
read_block() routine shown below. This helper function
should look familiar because it was used by the bootloader from lab 1.
// Allocates a page of memory and maps it at 'va' with read-only
// permissions. Then reads block number 'blockno' from disk number 'diskno'
// into that page.
// RETURNS
// 0 -- on success
// <0 -- otherwise
int
sys_disk_read (u_int diskno, u_int blockno, u_int va)
{
// your code goes here
}
void
read_block (u_int diskno, u_int blockno, char *destination)
{
#define SECTOR_SIZE 512
    unsigned int sectors_per_block = NBPG/SECTOR_SIZE;
    unsigned int sectorno = sectors_per_block * blockno;
    unsigned char status;

    assert (diskno == 0 || diskno == 1);

    // wait until the disk is no longer busy (status bit 0x80)
    do {
        status = inb (0x1f7);
    } while (status & 0x80);

    outb (0x1f2, sectors_per_block);        // sector count
    outb (0x1f3, (sectorno >> 0) & 0xff);   // sector number, bits 0-7
    outb (0x1f4, (sectorno >> 8) & 0xff);   // sector number, bits 8-15
    outb (0x1f5, (sectorno >> 16) & 0xff);  // sector number, bits 16-23
    // LBA mode, drive select in bit 4, sector number bits 24-27
    outb (0x1f6, 0xe0 | ((0x1 & diskno) << 4) | ((sectorno >> 24) & 0x0f));
    outb (0x1f7, 0x20);                     // CMD 0x20 means read sector

    // wait for the command to complete
    do {
        status = inb (0x1f7);
    } while (status & 0x80);

    // transfer one block from the data port, NLPG longwords at a time
    insl (0x1f0, destination, NLPG);
}
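To make the register writes in read_block() easier to follow, here is a sketch that computes the same values without touching any hardware. struct ata_regs and lba_regs_for_block() are hypothetical names used only for illustration, and NBPG is assumed to be 4096.

```c
#include <assert.h>

#define NBPG        4096u	/* assumption: 4 KB blocks */
#define SECTOR_SIZE 512u

/* Hypothetical container for the values read_block() writes to
 * I/O ports 0x1f2 through 0x1f6. */
struct ata_regs {
    unsigned char count;	/* 0x1f2: number of sectors to read */
    unsigned char lba0;		/* 0x1f3: sector number, bits 0-7 */
    unsigned char lba1;		/* 0x1f4: sector number, bits 8-15 */
    unsigned char lba2;		/* 0x1f5: sector number, bits 16-23 */
    unsigned char drive;	/* 0x1f6: mode, drive, bits 24-27 */
};

struct ata_regs
lba_regs_for_block (unsigned int diskno, unsigned int blockno)
{
    unsigned int sectors_per_block = NBPG / SECTOR_SIZE;
    unsigned int sectorno = sectors_per_block * blockno;
    struct ata_regs r;

    assert (diskno == 0 || diskno == 1);
    r.count = (unsigned char) sectors_per_block;
    r.lba0 = (sectorno >> 0) & 0xff;
    r.lba1 = (sectorno >> 8) & 0xff;
    r.lba2 = (sectorno >> 16) & 0xff;
    /* 0xe0 selects LBA addressing; bit 4 picks the first or second disk. */
    r.drive = 0xe0 | ((0x1 & diskno) << 4) | ((sectorno >> 24) & 0x0f);
    return r;
}
```

With these assumptions, block 1 on the second disk corresponds to sector 8, so the sector count and low sector byte are both 8 and the drive register is 0xf0.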
Implement fs_lookup() in
libos.c as described below:
// Looks up 'name' in the disk image created by mkimg.
//
//RETURNS:
// block offset of 'name' in the disk
// <0 -- on error (i.e., 'name' does not exist)
int
fs_lookup (char *name)
{
// your code goes here
}
You might also wish to add these functions to libos.c.
unsigned int
ntohl (unsigned int x)
{
unsigned char *s = (unsigned char *)&x;
return ((unsigned int)s[0] << 24 | (unsigned int)s[1] << 16 | (unsigned int)s[2] << 8 | (unsigned int)s[3]);
}
int
strcmp (const char *s1, const char *s2)
{
/* this code is from FreeBSD's libkern */
while (*s1 == *s2++)
if (*s1++ == 0)
return (0);
return (*(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1));
}
size_t
strlen(const char *str)
{
/* this code is from FreeBSD's libkern */
const char *s;
for (s = str; *s; ++s);
return (s - str);
}
char *
strcpy(char *to, const char *from)
{
/* this code is from FreeBSD's libkern */
char *save = to;
for (; (*to = *from) != 0; ++from, ++to);
return(save);
}
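As a rough illustration of how a byte-order helper and strcmp() might combine in a lookup, the sketch below scans an in-memory copy of a table of contents. The entry layout here (a 28-byte name followed by a big-endian block number, with an empty name ending the table) is purely an assumption made for the example; the real format is whatever tools/mkimg writes. toc_lookup() and be32_to_host() are hypothetical names, and the sketch uses the host's <string.h> rather than the libos.c helpers so that it can be compiled on its own.

```c
#include <assert.h>
#include <string.h>

#define NBPG 4096u	/* assumption: the TOC occupies one 4 KB block */

/* HYPOTHETICAL on-disk entry; consult tools/mkimg for the real layout. */
struct toc_entry {
    char name[28];		/* NUL-terminated program name */
    unsigned char block[4];	/* starting block number, big-endian */
};

/* Decode a big-endian 32-bit value, in the spirit of ntohl(). */
static unsigned int
be32_to_host (const unsigned char *s)
{
    return ((unsigned int)s[0] << 24 | (unsigned int)s[1] << 16
            | (unsigned int)s[2] << 8 | (unsigned int)s[3]);
}

/* Search the TOC block held in 'toc' for 'name'.
 * Returns the starting block number, or -1 if 'name' is not present. */
int
toc_lookup (const char *toc, const char *name)
{
    const struct toc_entry *e = (const struct toc_entry *) toc;
    unsigned int n = NBPG / sizeof (struct toc_entry);
    unsigned int i;

    assert (toc != NULL && name != NULL);
    for (i = 0; i < n && e[i].name[0] != '\0'; i++)
        if (strcmp (e[i].name, name) == 0)
            return (int) be32_to_host (e[i].block);
    return -1;
}
```

In the lab itself, the TOC block would presumably first be brought into memory with the disk-read system call before being scanned in this way.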
Exercise 3: Basic Exec
In this exercise you will implement exec().
//Creates process 'name'
//
//RETURNS
// <0 -- on error
// Nothing is returned on success, since
// new process starts executing from the beginning,
// and old process no longer exists.
//
int
exec (char *name)
{
// your code goes here
}
exec() replaces the current environment with a new one.
It proceeds as follows:
- Create a new environment.
- Allocate a stack at USTACKTOP - NBPG.
- Load 'name' at UTEXT.
- Start it running from the beginning (i.e., UTEXT+0x20).
- The parent exits.
You might need to add another parameter to
sys_env_alloc(). The behavior of this system call should
vary if it is called by exec() or by fork().
For fork(), the new environment inherits a number of
values from its parent, such as a trap frame, the exception stack, and
the page fault handler. exec(), on the other hand, does
not want to inherit these values from its parent. For example, its
execution begins at UTEXT+0x20 (which presumably is its
__start label), not from where the parent called
sys_env_alloc(). Furthermore, the startup code will
allocate an exception stack and register it with the kernel.
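One way to picture the difference is with a small model of the state involved. This is only a sketch under stated assumptions: struct env_model, env_model_init() and the for_exec flag are hypothetical and do not correspond to the kernel's real data structures or system call signature, and the UTEXT value is a placeholder.

```c
#include <assert.h>
#include <stddef.h>

#define UTEXT 0x800000u	/* placeholder value, for illustration only */

/* Hypothetical, simplified stand-in for the per-environment state that
 * fork() inherits and exec() does not. */
struct env_model {
    unsigned int eip;			/* where execution resumes */
    unsigned int esp;			/* user stack pointer */
    unsigned int pgfault_handler;	/* registered page fault handler */
    unsigned int xstacktop;		/* exception stack */
};

/* Initialise 'child' for either fork() (for_exec == 0) or exec()
 * (for_exec != 0).  'initial_esp' is only used in the exec() case. */
void
env_model_init (struct env_model *child, const struct env_model *parent,
                int for_exec, unsigned int initial_esp)
{
    assert (child != NULL && parent != NULL);
    if (for_exec) {
        child->eip = UTEXT + 0x20;	/* start from the beginning */
        child->esp = initial_esp;
        child->pgfault_handler = 0;	/* startup code registers its own */
        child->xstacktop = 0;
    } else {
        *child = *parent;		/* fork(): inherit everything */
    }
}
```

A single extra parameter is enough to select between the two behaviours in this model; whether that is the right shape for the real system call is a design decision left to you.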
You should test your code as you see fit. One suggestion, however, is
to make a program that exec()'s itself. This should
cause exec() to be run over and over infinitely. You'll
run out of memory eventually, unless you go back and implement
env_free(). Just to be clear: implementing
env_free() is just a suggestion -- it is not obligatory.
Exercise 4: Exec arguments
In this exercise, you'll extend exec() with the ability
to pass arguments to the new environment.
For example,
exec ("simple", "-f", "foo", "-c", "junk", NULL); // NOTICE: the trailing NULL!
should invoke simple so that it can access its arguments as:
void
umain (int argc, char *argv[])
{
int i;
for (i = 0; i < argc; i++) {
print (" argv[", i, "] = ");
sys_cputs (argv[i]);
sys_cputs ("\n");
}
}
Output:
argv[0] = "simple"
argv[1] = "-f"
argv[2] = "foo"
argv[3] = "-c"
argv[4] = "junk"
There are two components of this work: what the parent does and what
the child does.
- On the parent side (the side which invokes
exec()), add
an ellipsis to the signature of exec() so that it can
take a variable number of arguments.
int
exec (char *name,...)
{
// your code goes here
}
Then exec() must set up the stack of the new environment
so that the arguments appear. The parent should format the memory
according to the following diagram.
USTACKTOP:
+--------------+
| block of | Block of strings. In the example
| memory | "simple", "-f", "foo", "-c", and
| holding NULL | "junk" would be stored here.
| terminated |
| argv strings |
+--------------+
| &argv[n] | Next, comes the argv array--an array of
| . | pointers to the string. Each &argv[*] points
| . | into the "block of strings" above.
| . |
| &argv[1] |
| &argv[0] |<-.
+--------------+ |
| argv ptr |__/ In the body of umain(), access to argc
%esp -> | argc | and argv reference these two values.
+--------------+
If these values are on the stack when umain() is called,
then umain() will be able to access its arguments via the
int argc and char *argv[] parameters.
As indicated in the diagram above, the parent code must also create
the new environment with its %esp pointing at the
argc value. You'll probably need to modify
sys_env_alloc() to take the initial %esp as an
additional parameter.
Warning: the diagram shows the memory at USTACKTOP since
this is where it will be mapped in the child's address space.
However, be careful! When the parent formats the arguments, it will
need to do so at some temporary address, since it can't (well,
shouldn't) map over its own stack. Similarly, take care when setting the
pointers argv ptr, &argv[0] .. &argv[n]. These pointers need to
account for the fact that the data will be remapped into the child at
USTACKTOP.
- Now for the child side of exec(): examine the
entry path of the child process under the __start label.
You'll see that it is written such that __main() and
umain() can be defined taking int argc, char
*argv[]. You'll also notice that the entry path takes
care of the case when a new process is created by the kernel, in which
case no arguments are passed.
The code on the child side has been done for you, except that you
should add the (int argc, char *argv[]) parameters to the
definitions of __main() and umain().
Your task is to implement the parameter passing strategy described
above. You may assume that there are few enough (and short enough)
arguments so that only one page of stack is needed in the child
environment.
Technical Detail: Actually only the argc and the
argv ptr must be placed on the new env's stack. The
argv ptr must point to the &argv[0]
.. &argv[n] array, each element of which points to a string. As a
consequence, the &argv[0] .. &argv[n] array and the
"block of strings" can be located anywhere in the new env's address
space--not necessarily on the stack. In practice, we find it
convenient to store all of these values on the stack as has been
presented in this exercise.
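The parent-side formatting described in this exercise can be sketched as follows. The code fills a temporary page in the parent and computes every pointer as the address the child will see once that page is mapped just below USTACKTOP. It is a sketch under several assumptions: the USTACKTOP value is a placeholder, MAXARGS, collect_args() and build_arg_page() are hypothetical names, and the host is assumed to be little-endian with 32-bit child pointers, as on the x86.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <string.h>

#define NBPG      4096u
#define USTACKTOP 0x80000000u	/* placeholder value, for illustration only */
#define MAXARGS   16		/* hypothetical limit */

/* Gather 'name' and the NULL-terminated varargs into argv[].
 * Returns argc, or -1 if there are more than MAXARGS arguments. */
int
collect_args (const char *argv[MAXARGS], const char *name, ...)
{
    va_list ap;
    const char *s;
    int argc = 0;

    va_start (ap, name);
    for (s = name; s != NULL; s = va_arg (ap, const char *)) {
        if (argc == MAXARGS) {
            argc = -1;
            break;
        }
        argv[argc++] = s;
    }
    va_end (ap);
    return argc;
}

/* Format 'page' (a temporary NBPG-byte buffer in the parent) so that it
 * matches the diagram once mapped at USTACKTOP - NBPG in the child.
 * Returns the child's initial %esp, or 0 if the arguments do not fit. */
uint32_t
build_arg_page (unsigned char *page, int argc, const char *const argv[])
{
    const uint32_t base = USTACKTOP - NBPG;	/* child address of page[0] */
    uint32_t str_off = NBPG;			/* strings grow down from the top */
    uint32_t ptrs[MAXARGS];
    uint32_t off, v;
    int i;

    assert (page != NULL && argc >= 0 && argc <= MAXARGS);
    for (i = 0; i < argc; i++) {
        uint32_t len = (uint32_t) strlen (argv[i]) + 1;
        if (len > str_off)
            return 0;
        str_off -= len;
        memcpy (page + str_off, argv[i], len);
        ptrs[i] = base + str_off;		/* address as seen by the child */
    }
    str_off &= ~3u;				/* word-align the argv array */
    if (str_off < 4u * (uint32_t) argc + 8)
        return 0;
    off = str_off - 4u * (uint32_t) argc;	/* slot for argv[0] */
    memcpy (page + off, ptrs, 4u * (uint32_t) argc);
    v = base + off;				/* argv ptr */
    memcpy (page + off - 4, &v, 4);
    v = (uint32_t) argc;			/* argc, where %esp will point */
    memcpy (page + off - 8, &v, 4);
    return base + off - 8;
}
```

Because every stored pointer is computed from base rather than from the address of page, the buffer can live anywhere in the parent and still be correct after it is remapped into the child.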
Questions:
- How long approximately did it take you to do this lab?
This completes the lab.
Version: $Revision: 1.8 $. Last modified: $Date: 2002/11/08 06:40:19 $