| 6.097: OPERATING SYSTEM ENGINEERING |
| Fall 2002 |
| Lab 1 Solutions |
These solutions were derived in part from work by Michelle Duvall and Sean Fay.
(0) Breakpoint 1, 0x7c00 in ?? ()
Next at t=205567
(0) 0000:7c00: 90: nop
<bochs:3> s
Next at t=205568
(0) 0000:7c01: 90: nop
<bochs:4> s
Next at t=205569
(0) 0000:7c02: 90: nop
<bochs:5> s
Next at t=205570
(0) 0000:7c03: 90: nop
<bochs:6> s
Next at t=205571
(0) 0000:7c04: ebfa: jmp +#fa
<bochs:7> s
Next at t=205572
(0) 0000:7c00: 90: nop
This corresponds to the code in ex1.S, which has exactly these
operations (4 nops followed by a jmp back to
the first instruction).
Note that before any instructions are executed, t=0. After the first instruction is executed, t=1. Thus, the value of t represents the total number of instructions executed. The line:
Next at t=205567
thus indicates that 205567 BIOS instructions are executed before the 'start' label is reached.
The machine code for a NOP is 0x90.
jmp
start above? Write out the bytes in hex. Refer to the x86
instruction set manual. Explain each of the bytes.
The machine code for jmp start is 0xebfa.
EB:
JMP rel8 = Jump short, relative (8-bit), displacement relative to
next instruction, where a short jump is a jump within the current code
segment limited to -128 to +127 from the current EIP value.
FA:
signed 8-bit offset added to EIP (currently the next instruction)
to determine destination. Here, EIP + (-6) is specified, and EIP =
0x7c06.
The relative offset is 0xfa = -6. A number
of people made mistakes here; be sure to understand how to convert
between decimal and twos-complement.
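The conversion can be checked mechanically. The following is a minimal Python sketch (the helper names rel8 and short_jump_target are made up for illustration) that interprets a byte as a signed 8-bit displacement and adds it to the address of the next instruction:

```python
def rel8(byte):
    """Interpret an unsigned byte (0..255) as a signed 8-bit two's-complement value."""
    return byte - 0x100 if byte & 0x80 else byte


def short_jump_target(jmp_addr, disp_byte):
    """Target of a 2-byte short jump (EB xx) located at jmp_addr."""
    next_ip = jmp_addr + 2  # EIP already points past the 2-byte jmp
    return next_ip + rel8(disp_byte)


print(rel8(0xFA))                            # -6
print(hex(short_jump_target(0x7C04, 0xFA)))  # 0x7c00
```

With the jmp at 0x7c04, the next instruction is at 0x7c06, and 0x7c06 + (-6) lands back on the first nop at 0x7c00.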
jmp instruction
relative to the instruction counter at the beginning or the end of the
jmp? How do you know this?
Intel's x86 instruction set manual specifies clearly that
the EIP is for the next instruction (pg 3-358). This is corroborated
by the above jmp instruction in this example whose offset
accounts for 4 bytes of nop plus 2 bytes for the jump itself.
The value of the EAX register is 0x7c01.
jmp *%eax does. What is the CS:EIP before and
after this instruction?
The instruction jmp *%eax jumps within this code segment to the EIP
specified by the value in register eax.
The CS:EIP before this instruction is 0x0000:0x7c0a.
The CS:EIP after this instruction is 0x0000:0x7c01.
This code in ex2.S would not run correctly if the BIOS
did not load it at address 0x7c00 because the code jumps to an
absolute address that is calculated at link-time from a user (or
default linker) specified base address; the correctness of the code
depends on the destination address being loaded at the correct
location. If the code is not placed where it was told it would
be placed (by the linker), it will still jump to the fixed link-time
determined address which will no longer have the expected
instructions.
ex1.S would still run
correctly even if the BIOS did NOT load it at address
0x7c00.
The code in ex1.S would still run correctly even if the
BIOS did not load it at address 0x7c00 because the code does not use
any linker-resolved memory addresses. In fact, ex1.S
uses only the current state of the machine as represented by the EIP
to determine the base address of where to jump, and thus is location
independent.
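The contrast between the two examples can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and 0x8000 is an arbitrary hypothetical load address, not one used in the lab.

```python
LINK_ADDRESS = 0x7C00  # address the linker assumed (the value used in this lab)


def ex1_target(load_address):
    # ex1.S: the jmp at offset 4 is EB FA, i.e. next-EIP + (-6).
    next_ip = load_address + 4 + 2
    return next_ip - 6


def ex2_target(load_address):
    # ex2.S: "movl $here,%eax; jmp *%eax" uses a link-time constant,
    # so the load address never enters the computation.
    return LINK_ADDRESS + 1


for load in (0x7C00, 0x8000):
    print(hex(load), hex(ex1_target(load)), hex(ex2_target(load)))
# 0x7c00 0x7c00 0x7c01
# 0x8000 0x8000 0x7c01
```

The relative jump always lands on the first byte of the code wherever it was loaded, while the absolute jump goes to 0x7c01 regardless.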
BOOTLOADER_LINK_ADDRESS parameter in
the GNUmakefile. Explain how this value affects the value
of EAX loaded by "movl $here,%eax".
The value of BOOTLOADER_LINK_ADDRESS is the address at
which the linker assumes the code will be loaded. Therefore, changing
the value of BOOTLOADER_LINK_ADDRESS will change the
value loaded into EAX by 'movl $here, %eax' from 0x7c01 to the value
of BOOTLOADER_LINK_ADDRESS + 1.
ex2.S corresponds to the bytes in the
disk.img file.
instruction        machine code     byte start   length
nop                90               0            1
nop                90               1            1
nop                90               2            1
nop                90               3            1
movl $here,%eax    66b8 017c 0000   4            6
jmp *%eax          66ff e0          10           3
BOOTLOADER_LINK_ADDRESS in the
GNUmakefile and describe how the disk image
changes. Could the load address of the boot sector affect contents of
the disk image? Why or why not? Now do you see why the link address
should match the load address?
The disk image changes in bytes 6 and 7 from 01 and 7c respectively to
the little endian format of BOOTLOADER_LINK_ADDRESS+1.
Furthermore, a number of bytes of zeros will sometimes be prepended to
the code (depending on the value of
BOOTLOADER_LINK_ADDRESS); the linker pads to an 8-byte
boundary. The load address of the boot
sector, however, cannot affect the contents of the disk image,
because the code has already been compiled and linked for a
specific place in memory by the time it is loaded.
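To see which bytes move with the link address, here is a small Python sketch that rebuilds the encoding of movl $here,%eax shown in the ex2 listing (66 b8 followed by a 32-bit little-endian immediate). The function name is hypothetical, 0x8000 is just an example value, and any padding the linker may prepend is ignored.

```python
import struct


def movl_here_bytes(link_address):
    """Bytes of `movl $here,%eax` (66 b8 imm32), with `here` at link_address + 1."""
    return bytes([0x66, 0xB8]) + struct.pack("<I", link_address + 1)


def as_hex(data):
    return " ".join(f"{b:02x}" for b in data)


print(as_hex(movl_here_bytes(0x7C00)))  # 66 b8 01 7c 00 00
print(as_hex(movl_here_bytes(0x8000)))  # 66 b8 01 80 00 00
```

Only the immediate changes; the opcode bytes and the rest of the image stay the same.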
gmake ex2. Paste the output in
your answer and write a short commentary next to each command. Why is
the command necessary? What is it doing? What is the input file? What
is the output file? For certain steps of the build process you'll just
have to guess.
gmake ex2.aout OBJS=ex2.o LINK=0x7c00
gmake[1]: Entering directory `/afs/athena.mit.edu/user/s/e/seanf/6.097/lab1/bootloader/src'
**** ASSEMBLING: ex2.S ==> ex2.o
/mit/6.097/bin/i386-osclass-aout-gcc -c ex2.S
**** LINKING: ex2.o ==> ex2.aout
/mit/6.097/bin/i386-osclass-aout-ld -N -e start -Ttext 0x7c00 -o ex2.aout ex2.o
cp ex2.aout ex2.aout.dbg
/mit/6.097/bin/i386-osclass-aout-strip ex2.aout
rm ex2.o
gmake[1]: Leaving directory `/afs/athena.mit.edu/user/s/e/seanf/6.097/lab1/bootloader/src'
gmake disk.img BOOTLOADER=ex2.aout
gmake[1]: Entering directory `/afs/athena.mit.edu/user/s/e/seanf/6.097/lab1/bootloader/src'
**** UPDATING .bocshrc
/mit/6.097/bin/generate-bochsrc disk.img 10 16 63
**** CREATING Bochs disk image: disk.img
**** step 1: zero the image
dd if=/dev/zero of=disk.img count=3000
3000+0 records in
3000+0 records out
**** step 2: write boot sector
dd if=ex2.aout of=disk.img conv=notrunc bs=1 skip=32 seek=0
16+0 records in
16+0 records out
**** done: bochs disk image file = disk.img
gmake[1]: Leaving directory `/afs/athena.mit.edu/user/s/e/seanf/6.097/lab1/bootloader/src'
0x7c00 <bogus+0>:  0x31 0xc0 0x8e 0xd0 0xbc 0x00 0x7c 0xe8
0x7c08 <bogus+8>:  0x02 0x00 0xeb 0xfe 0xc3 0x8d 0x74 0x00
0x7c10 <bogus+16>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7c18 <bogus+24>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7c20 <bogus+32>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
These bytes correspond to the code segment of ex4 as follows. The
code exists from 0x7c00 to 0x7c0c. Afterwards are clean up
instructions (?) and the zeros initialized by dd.
31 c0 = xor AX, AX;
8e d0 = mov SS, AX;
bc 00 7c = mov SP, #7c00;
e8 02 00 = call 0002;
eb fe = jmp +#fe;
c3 = ret_near;
call subroutine instruction pushes its return address on the
stack. What is the value of SS:SP and the corresponding memory
location right before and right after the "call"? (perhaps use 's' to
single step and 'info registers' to view the registers)
The value of SS:SP before the 'call' is 0x0:0x7c00.
The value of SS:SP after the 'call' is 0x0:0x7bfe.
The value that is pushed onto the stack is 0x7c0a.
This value is 10 + the load address (since call pushes EIP onto the stack).
This value is not related to the link address (since it is obtained from the EIP rather than from something hard-coded by the linker; this is related to the answers for 2.3 and 2.4).
The 'ret' instruction (ret_near), pops the return address off of the stack and sets EIP to that value.
The value of SS:SP before the 'ret' is 0x0:0x7bfe.
The value of the corresponding memory location is 0x0a (the lower byte of 0x7c0a).
The value of SS:SP after the 'ret' is 0x0:0x7c00.
The value of the corresponding memory location is 0x31 (as was originally at the beginning of the
code).
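The stack behavior described above can be replayed with a toy model. This is a rough Python sketch, not a CPU emulator: the dictionary-based memory and the push16/pop16 helpers are invented for illustration, and the call address 0x7c07 is read off the byte dump (e8 02 00 starts at byte 7).

```python
memory = {}   # sparse byte-addressed memory: address -> byte
sp = 0x7C00   # set by "mov SP, #7c00"


def push16(value):
    """Push a 16-bit value: SP drops by 2, low byte goes at the lower address."""
    global sp
    sp = (sp - 2) & 0xFFFF
    memory[sp] = value & 0xFF
    memory[sp + 1] = (value >> 8) & 0xFF


def pop16():
    """Pop a 16-bit value and move SP back up by 2."""
    global sp
    value = memory[sp] | (memory[sp + 1] << 8)
    sp = (sp + 2) & 0xFFFF
    return value


call_addr, call_len = 0x7C07, 3   # "e8 02 00" in the dump
push16(call_addr + call_len)      # call pushes the address of the next instruction
sp_after_call = sp
low_byte = memory[sp]
ret_target = pop16()              # ret pops it back into the instruction pointer

print(hex(sp_after_call), hex(low_byte), hex(ret_target), hex(sp))
# 0x7bfe 0xa 0x7c0a 0x7c00
```

Note that the model only tracks the stack bytes; after the ret, SP again points at 0x7c00, which in the real machine holds the first code byte (0x31).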
The stack and the boot sector's code abut in
memory. The boot sector's code grows up from 0x7c00 while the stack grows down in memory from
just before 0x7c00.
The stack and the BIOS are completely separated; the stack is in
the location aforementioned, and the BIOS is between locations 0xf0000
and 0xfffff.
ex4.S would not run if the code was
loaded at a different address. Why is this? (hint: the
call and ret instructions aren't the
problem)
This question was worded poorly. The intended answer was that
the movw $start,%sp instruction will load
an unexpected value into the stack pointer. However, this by itself
would be harmless most of the time unless there happened to be other
important data at the address $start.
Below is a chart of the hard disk image, showing the sector number, contents of the sector, NREADS, and READER. Fill in NREADS with the number of times the particular sector is read. Fill in READER with the agent who reads it: either PC hardware, BIOS, bootloader or other.
SECTOR #   CONTENTS                            NREADS   READER
1          boot loader                         1        BIOS
2          bytes 0 - 511 of kernel.aout        2        bootloader
3          bytes 512 - 1023 of kernel.aout     1        bootloader
.                                              1        bootloader
.                                              1        bootloader
N          the last 512 bytes of kernel.aout   1        bootloader
N+1        zeros                               0        N/A
N+2        zeros                               0        N/A
.                                              0        N/A
.                                              0        N/A
M          zeros                               0        N/A
The first sector of kernel.aout is read twice: first to retrieve the header to determine where to load the kernel and how big it is, and second to actually read the first sector into the appropriate location.
The link address of the kernel is 0xF0100020.
The bootloader actually loads it at address 0x100020.
Kernel_load_address(Kernel_link_address) = (Kernel_link_address & 0xffffff)
Some students wrote that the load address was 0x100000. While the object
file (i.e. including the a.out header) is in fact loaded at 1 megabyte,
the link address is clearly 0xF0100020, so the corresponding load
address is 0x00100020. It
is also important that your formula in this question agree with your
answers to the previous two!
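As a quick check that the formula agrees with the two addresses given above, here is a minimal Python sketch (the function name is ours, not taken from the bootloader source):

```python
def kernel_load_address(link_address):
    """Keep only the low 24 bits of the link address, as the formula does."""
    return link_address & 0xFFFFFF


load = kernel_load_address(0xF0100020)
print(hex(load))             # 0x100020
print(hex(load - 0x100000))  # 0x20: distance from the 1 megabyte mark
```

The 0x20 bytes of difference are consistent with the a.out header sitting at 1 megabyte, just ahead of the code.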
If the bootloader respected the kernel's link address, approximately
3842 megabytes of memory would be needed.
xxd -a kernel.aout. In your answer show the
relevant portion of xxd's output and indicate where the following
fields are located: a_text, a_data and
a_entry. Give the values of these fields in decimal,
except for a_entry which should be given in hex.
0000000: 0701 6400 7800 0000 0000 1000 0000 0000 ..d.x...........
0000010: 0000 0000 2000 10f0 0000 0000 0000 0000 .... ..ð........
a_text is located at offset 0x4 (bytes 78 00 00 00).
In decimal, its value is 120.
a_data is located at offset 0x8 (bytes 00 00 10 00).
In decimal, its value is 1048576.
a_entry is located at offset 0x14 (bytes 20 00 10 f0);
its value is 0xf0100020.
Many people had problems here with the little-endian ordering of the bytes. Remember that two hex digits form a single byte. The four bytes should be arranged so that the first byte is last, the second byte becomes the penultimate byte, the third byte comes second and the last byte in the stream is placed as the high-order byte.
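The byte reordering can be checked with Python's struct module, where "<I" means a little-endian unsigned 32-bit word. This is a sketch for verification only; the field helper is hypothetical, and the header bytes are copied from the xxd output.

```python
import struct

# The first 32 bytes of kernel.aout, as printed by xxd.
header = bytes.fromhex(
    "0701 6400 7800 0000 0000 1000 0000 0000"
    "0000 0000 2000 10f0 0000 0000 0000 0000"
)


def field(offset):
    """Read the little-endian 32-bit word at the given byte offset."""
    return struct.unpack_from("<I", header, offset)[0]


a_text, a_data, a_entry = field(0x04), field(0x08), field(0x14)
print(a_text, a_data, hex(a_entry))  # 120 1048576 0xf0100020
```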
read_sector loaded? Set a breakpoint here and run it in
bochs. Hint use objdump:
athena$ i386-osclass-aout-objdump --adjust-vma=0x7c00 -S bootloader.aout.dbg
(For your own purposes you might want to play around with
objdump, try out its various flags, run it on
bootloader.aout, etc.)
The first instruction of the read_sector is loaded at
physical address 0x7cdc.
rep prefix.
The bootloader executed 367,000
instructions. This is determined by looking at the difference between
the number of instructions executed by the CPU after the BIOS
completes and just before the kernel starts to execute.
Most of these instructions are spent reading the kernel image from disk to
memory. 2049 sectors are read in. The instruction that reads
them is repeated 512/4 = 128 times (it reads one 4-byte word at a time).
The loop that calls read_sector plus the other instructions in read_sector
sum to about 50 instructions per sector, giving roughly 128 + 50 = 178
instructions per sector. 2049 sectors * 178 instructions/sector
= 364722 instructions.
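The estimate is plain arithmetic, spelled out below as a Python sketch. The constant names are ours, and the per-sector overhead of 50 is the approximate figure quoted above, not a measured value.

```python
SECTORS = 2049
WORDS_PER_SECTOR = 512 // 4   # one 4-byte word per repetition of the read
OVERHEAD_PER_SECTOR = 50      # approximate: loop plus the rest of read_sector

per_sector = WORDS_PER_SECTOR + OVERHEAD_PER_SECTOR
estimate = SECTORS * per_sector
print(per_sector, estimate)   # 178 364722
```

The result is close to the 367,000 instructions observed, which supports the claim that most of the bootloader's time goes to copying the kernel image.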
kernel2.c (ignore instructions corresponding to
locore.S). Look for places in the machine code where the
link address has been compiled into the binary. Hint: one of the
places is in write_char() and the other is in
i386_init(). In your answer, paste the two occurrences
of position dependent assembly from the dump, explain how they are
position dependent and to what C code they correspond.
In write_char() :
// video[0] = c;
movl _video, %edx
movb %al, (%edx)
// video[1] = 0x07; // black on white
movl _video, %eax
movb $7, 1(%eax)
// video += 2;
addl $2, _video
The _video variable accesses are position dependent instructions because the linker replaces '_video' with the address where it believes _video resides.
In i386_init() :
// write_string ("Hello World!");
pushl $LC0
call _write_string
The pushl $LC0 is a position-dependent instruction because it pushes the linker's pre-determined address for the "Hello World!" string onto the stack. Note that the call instruction actually encodes the offset relatively, even though the disassembled code appears to have an absolute address: pay close attention to the binary opcodes.
The correct value of ANSWER is (0 -
KERNBASE). This is because bootloader_c.c pays
attention to only the lower 24 bits (six hex digits) of the
linker-provided address.
The upper bits are ignored by bitwise and-ing the linker-provided
address with 0xffffff. An equally valid
way of ignoring the upper bits is to subtract a value with just the
upper bits set from the original linker-provided address. Since
KERNBASE is equivalent to the linker-determined load address with the
lower 24 bits set to zero, using (0 - KERNBASE) as the segment base
address is equivalent to subtracting KERNBASE from every
linker-provided offset address, which allows the code to run in a
different (although known) position.
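The equivalence of masking and subtracting can be checked numerically. In the Python sketch below, KERNBASE = 0xF0000000 is an assumption inferred from the link address 0xF0100020 with its low 24 bits cleared, and the function names are invented for illustration.

```python
KERNBASE = 0xF0000000                        # assumed: link address with low 24 bits cleared
SEGMENT_BASE = (0 - KERNBASE) & 0xFFFFFFFF   # 32-bit wraparound


def via_mask(link_address):
    """What the bootloader does: keep the low 24 bits."""
    return link_address & 0xFFFFFF


def via_segment(link_address):
    """What segmentation does: add the segment base to the offset, modulo 2**32."""
    return (link_address + SEGMENT_BASE) & 0xFFFFFFFF


addr = 0xF0100020
print(hex(SEGMENT_BASE), hex(via_mask(addr)), hex(via_segment(addr)))
# 0x10000000 0x100020 0x100020
```

Both routes map the link address to the same physical location, which is why the kernel runs correctly once the segment base is set.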
video in kernel2.c is set
to the value it is set to. You might want to compare the code in
kernel.S.
The address 0xb8000 represents the character portion of the top left
position on the monitor. In other words, this is the first
memory-mapped I/O address designated for communicating with video.
However, since the base of this segment is set to (0 - KERNBASE),
KERNBASE must be added to 0xb8000 to be able to communicate properly
with the system's video. Therefore video in
kernel2.c is set to (KERNBASE + 0xb8000).
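A short Python sketch of the address translation for the video pointer follows. KERNBASE = 0xF0000000 is again an inferred assumption, the helper names are hypothetical, and the two-bytes-per-cell layout is taken from write_char (character byte, then attribute byte, then video += 2).

```python
KERNBASE = 0xF0000000                        # assumed value, inferred from the link address
SEGMENT_BASE = (0 - KERNBASE) & 0xFFFFFFFF
VIDEO_PHYS = 0xB8000                         # first text-mode video address

video = KERNBASE + VIDEO_PHYS                # value stored in the C pointer


def linear(offset):
    """Address produced once the segment base is added, modulo 2**32."""
    return (offset + SEGMENT_BASE) & 0xFFFFFFFF


def cell_address(n):
    """Address of the character byte of the n-th screen cell (2 bytes per cell)."""
    return linear(video + 2 * n)


print(hex(video), hex(linear(video)), hex(cell_address(1)))
# 0xf00b8000 0xb8000 0xb8002
```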
This completes the lab.