6.097: OPERATING SYSTEM ENGINEERING
Fall 2002
Lab 2 Solutions

These solutions were derived in part from work by Oskar Bruening and Neil Sanchala.

Exercise 1: VM layout

  1. Entry   Base Virtual Address     Points to Page Table used for
     1023    0xffc00000               252-256MB of physical memory
     [...]   [...]                    [...]
     961     0xf0400000               4-8MB of physical memory
     960     0xf0000000 (KERNBASE)    0-4MB of physical memory
     959     0xefc00000               Kernel VPT
     958     0xef800000               Kernel stack
     957     0xef400000               User VPT
     956     0xef000000               Ppage array
     955     0xeec00000               Env array
     954     0xee800000               unassigned
     [...]   [...]                    unassigned
     1       0x00400000               unassigned
     0       0x00000000               usually unassigned, temporarily used to point to the kernel (see 1.2)
  2. The first entry of the page directory is used only from the point where paging is turned on until segmentation is reconfigured with a base address of 0 for all segments.

    All memory references during that time must fall within the virtual address range [KERNBASE, KERNBASE+4MB). Segmentation maps this VA range to the LA range [0, 4MB), and paging, specifically the first entry of the page directory, maps this LA range to the PA range [0, 4MB). This PA range is where the kernel resides (more precisely, the kernel is loaded at 1MB, the bottom of extended memory).

    In practice, this limits our kernel to 4MB in size, so that we can be sure that all memory references (for instructions and data) fall within this range.

    To support larger kernels, we could use more entries of the page directory.

    After segmentation is reconfigured, the mapping from KERNBASE down to 0 is carried out by the page tables alone. From then on, VA references in the first 4MB should fault, in particular references to address 0.

    If this special mapping were omitted, the kernel would crash before it reconfigured segmentation. Most likely, the next instruction fetch after paging is turned on (by setting the PG bit in CR0, once lcr3() has loaded the page directory) would fault.
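
    The entry numbers in the table and the role of entry 0 can be checked with a little arithmetic. The following is a minimal sketch, not code from the lab: the names PDSHIFT, NPDENTRIES, pdx, pd_base and boot_va_to_la are hypothetical, and the segment base of -KERNBASE (mod 2^32) is inferred from the statement that segmentation maps [KERNBASE, KERNBASE+4MB) to [0, 4MB).

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE   0xf0000000u  /* from the table above */
#define PDSHIFT    22           /* hypothetical name: one entry maps 2^22 bytes = 4MB */
#define NPDENTRIES 1024u        /* hypothetical name: entries in a page directory */

/* Index of the page directory entry that maps linear address la. */
unsigned pdx(uint32_t la)
{
    return la >> PDSHIFT;
}

/* Lowest address covered by page directory entry idx. */
uint32_t pd_base(unsigned idx)
{
    assert(idx < NPDENTRIES);
    return (uint32_t)idx << PDSHIFT;
}

/*
 * VA -> LA while the boot-time segments are active. The segment base is
 * assumed to be -KERNBASE (mod 2^32), so the addition wraps around and
 * KERNBASE lands on linear address 0.
 */
uint32_t boot_va_to_la(uint32_t va)
{
    uint32_t seg_base = 0u - KERNBASE;  /* 0x10000000 */
    return va + seg_base;
}
```

    With these definitions, pdx(KERNBASE) is 960 and pdx(0xffc00000) is 1023, matching the table, and pdx(boot_va_to_la(va)) is 0 for every va in [KERNBASE, KERNBASE+4MB), which is why entry 0 must be valid during that window.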

Exercise 2: Physical page management

  1. The maximum amount of physical memory is limited to 256MB by the way the kernel manages physical memory. The kernel maps all physical memory into every address space in the range [KERNBASE, 2^32-1], a 256MB range. Only this physical memory is put in the ppages[] array and is available for allocation. (Technically, the I/O hole reduces this amount by 384KB.)

  2. Every 1024 pages of physical memory need one page table to map them at KERNBASE. In addition, a 12-byte struct Ppage is allocated in the ppages[] array for every physical page. Per byte of physical memory, this comes to:

    4096/(1024*4096) + 12/4096 = 1/1024 + 12/4096 = 4/4096 + 12/4096
                               = 16/4096 = 1/256 = ~0.39% overhead.
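
    The two answers above can be reproduced numerically. This is a rough sketch under the assumptions stated in the text (4-byte page table entries, 4096-byte pages, a 12-byte struct Ppage); the macro and function names are hypothetical, and the page directory itself is ignored, as in the calculation above.

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE   0xf0000000u
#define PAGE_SIZE  4096u  /* hypothetical name: bytes per page */
#define PTE_SIZE   4u     /* hypothetical name: bytes per page table entry */
#define PPAGE_SIZE 12u    /* sizeof(struct Ppage), per the text */

/* Bytes of address space from KERNBASE to the top of the 32-bit space. */
uint64_t max_phys_bytes(void)
{
    return ((uint64_t)1 << 32) - KERNBASE;
}

/* Bookkeeping bytes needed to manage phys_bytes of physical memory. */
uint64_t overhead_bytes(uint64_t phys_bytes)
{
    assert(phys_bytes % PAGE_SIZE == 0);
    uint64_t npages = phys_bytes / PAGE_SIZE;
    uint64_t pte_bytes = npages * PTE_SIZE;      /* page table space per page */
    uint64_t ppage_bytes = npages * PPAGE_SIZE;  /* one struct Ppage per page */
    return pte_bytes + ppage_bytes;
}
```

    For the full 256MB returned by max_phys_bytes(), overhead_bytes() gives 1MB, which is the 1/256 ratio derived above.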
    

Exercise 3: Thinking about design decisions

  1. On the x86, the kernel and user can coexist in the same address space because the page directory and page tables can express which parts of the address space the user can access and which it can't (the latter being the kernel part). The PG_U bit marks a page as accessible by the user; without it, only the kernel has access.

    On the PDP-11, the user and kernel can't exist in the same address space, since the hardware does not have bits in the segment registers (PDRs) to differentiate user permission from kernel permission for a given segment. If a segment is writeable, anyone can write it. The kernel can't have writeable memory that is off limits to the user. Therefore, to enforce isolation, the kernel and user are placed in separate address spaces.
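
    The x86 side of this argument can be sketched as a permission check on a single page table entry. This is an illustrative simplification, not the hardware's full rule: real x86 hardware also consults the page directory entry, and kernel-mode writes may additionally depend on CR0 settings. PG_U is the name used above; PG_P, PG_W, make_pte and access_ok are assumed names, with the bit positions the x86 defines for present, writable and user.

```c
#include <assert.h>
#include <stdint.h>

#define PG_P 0x001u  /* present (assumed name) */
#define PG_W 0x002u  /* writable (assumed name) */
#define PG_U 0x004u  /* user-accessible (named in the text) */

/* Build an entry mapping page-aligned physical address pa with perm bits. */
uint32_t make_pte(uint32_t pa, uint32_t perm)
{
    assert((pa & 0xfffu) == 0);
    return pa | perm | PG_P;
}

/* Returns 1 if an access through this entry is allowed, 0 if it faults. */
int access_ok(uint32_t pte, int from_user, int is_write)
{
    if (!(pte & PG_P))
        return 0;
    if (!from_user)
        return 1;  /* simplification: kernel mode may use any present page */
    if (!(pte & PG_U))
        return 0;  /* kernel-only page: user access faults */
    if (is_write && !(pte & PG_W))
        return 0;  /* read-only page: user write faults */
    return 1;
}
```

    A page created with make_pte(pa, PG_W) is writable by the kernel but faults on any user access, which is exactly the kind of kernel-only writeable memory that the PDP-11 segment registers cannot express.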

Exercise 4: Printf Potpourri

  1. console.c exports:

    void cnputc (short int c);
    
    printf.c uses cnputc to print each character to the console.
  2. For this problem, we just need to know that CRT_COLS = 80 is the number of columns in the console. This code takes all the rows on the console except for the first one, and shifts them up by one row (this is done using bcopy). It then clears the last line.
  3. fmt points to the formatting string. That is, here fmt = "x %d, y %x, z %d\n". ap is a pointer to x on the stack. The remaining arguments sit above x on the stack.

    1. cnputc(ch = 'x')
    2. cnputc(ch = ' ')
    3. va_arg - before: points to x, after: to y.
    4. ksprintn(uq = (u_int)x, base = 10)
    5. cnputc(ch = '1')
    6. cnputc(ch = ',')
    7. cnputc(ch = ' ')
    8. cnputc(ch = 'y')
    9. cnputc(ch = ' ')
    10. va_arg - before: points to y, after: to z.
    11. ksprintn(uq = (u_int)y, base = 16)
    12. cnputc(ch = '3')
    13. cnputc(ch = ',')
    14. cnputc(ch = ' ')
    15. cnputc(ch = 'z')
    16. cnputc(ch = ' ')
    17. va_arg - before: points to z, after: points to whatever happens to be on the stack above z.
    18. ksprintn(uq = (u_int)z, base = 10)
    19. cnputc(ch = '4')
    20. cnputc(ch = '\n')
  4. The code outputs "He110 World".

    1. cnputc(ch = 'H')
    2. va_arg - before: points to 57616, after: to &i.
    3. ksprintn(uq = (u_int)57616, base = 16)
    4. cnputc(ch = 'e')
    5. cnputc(ch = '1')
    6. cnputc(ch = '1')
    7. cnputc(ch = '0')
    8. cnputc(ch = ' ')
    9. cnputc(ch = 'W')
    10. cnputc(ch = 'o')
    11. va_arg - before: points to &i, after: points to whatever happens to be on the stack above &i.
    12. cnputc(ch = 'r')
    13. cnputc(ch = 'l')
    14. cnputc(ch = 'd')
    On a big-endian machine, we'd need to set i = 0x726c6400. That is, we'd need to reverse the order of the bytes. 57616 can stay the same: the compiler converts the constant to the machine's byte order, so on a big-endian system it is stored in big-endian format and still prints as e110.
  5. It doesn't compile after the replacement because &i isn't a char *, and gcc is looking for a char * to match to the %s. gcc knows to look for the right types of arguments because we added an __attribute__ when we declared printf in printf.h. The attribute tells gcc that this function is going to take "printf"-like arguments, so gcc should check the arguments.

  6. warn() will print out whatever is on the stack before the '3'. This happens because kprintf will continue calling va_arg past the end of the va_list.

    The exact value on the stack is unknown. The compiler could have allocated stack space for local variables or pushed callee-saved registers.

  7. If printf() changed its interface so that arguments are passed in the opposite order (i.e., with the format string as the last argument), then the change to GCC's calling convention is counteracted. After both changes, printf()'s arguments will appear on the stack exactly as they did before either change (to GCC or to printf()'s interface). Thus, the behavior of the va_* functions would not need to change.

    Many people thought that the problem could be solved by changing the va_* functions. However, in this case, printf() has no way of knowing how many arguments (and of which types) were passed to it. This information is encoded in the format string, which is sitting up the stack at a distance unknown to printf().
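
    The point that only the format string tells printf() how many arguments to fetch, and of which types, can be seen in a stripped-down formatter. This is a minimal sketch using the standard <stdarg.h> macros; it is not the lab's kprintf, the names mini_format, put and put_num are hypothetical, and it writes to a buffer instead of calling cnputc.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Append c to out if there is room, always leaving space for the '\0'. */
static size_t put(char *out, size_t cap, size_t n, char c)
{
    if (n + 1 < cap)
        out[n++] = c;
    return n;
}

/* Append v in the given base (10 or 16), most significant digit first. */
static size_t put_num(char *out, size_t cap, size_t n, unsigned v, unsigned base)
{
    char digits[32];
    int i = 0;

    do {
        digits[i++] = "0123456789abcdef"[v % base];
        v /= base;
    } while (v != 0);
    while (i > 0)
        n = put(out, cap, n, digits[--i]);
    return n;
}

/* Formats %d, %x and %s into out; returns the length of the result. */
size_t mini_format(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t n = 0;

    assert(out != NULL && cap > 0);
    va_start(ap, fmt);  /* ap now refers to the first unnamed argument */
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            n = put(out, cap, n, *fmt);
            continue;
        }
        fmt++;
        if (*fmt == '\0')
            break;
        switch (*fmt) {  /* the format alone decides what va_arg fetches */
        case 'd': {
            int v = va_arg(ap, int);
            unsigned u = (unsigned)v;
            if (v < 0) {
                n = put(out, cap, n, '-');
                u = 0u - u;
            }
            n = put_num(out, cap, n, u, 10);
            break;
        }
        case 'x':
            n = put_num(out, cap, n, va_arg(ap, unsigned), 16);
            break;
        case 's': {
            const char *s = va_arg(ap, const char *);
            for (; *s != '\0'; s++)
                n = put(out, cap, n, *s);
            break;
        }
        default:
            n = put(out, cap, n, *fmt);
            break;
        }
    }
    va_end(ap);
    out[n] = '\0';
    return n;
}
```

    Calling mini_format(buf, sizeof buf, "x %d, y %x, z %d\n", 1, 3, 4) produces the same 14 characters as the trace in part 3, and mini_format(buf, sizeof buf, "H%x Wo%s", 57616, "rld") yields "He110 World" as in part 4. If the format names more conversions than there are arguments, va_arg simply keeps reading, which is the behavior described in part 6.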

This completes the lab.