Writing Boot Sector Code

This article discusses how to write your own 'hello, world' code into the boot sector. At the time of writing, most such code examples available on the net were meant for the NASM assembler. Very little material was available that could be tried with readily available GNU tools such as the GNU assembler (as) and the GNU linker (ld). This article is an effort to fill that gap.

License

Copyright © 2007 Susam Pal

This work is licensed under the Creative Commons Attribution 3.0 License.

Permission is granted to copy, distribute, transmit and/or adapt the work under the condition that you must attribute the work to Susam Pal with its URL, i.e. http://susam.in/articles/boot-sector-code/. A copy of the full license is available at http://susam.in/licenses/cc-by/.

This document is provided WITHOUT ANY WARRANTY; without even the implied warranties of TITLE, MERCHANTIBILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, or the ABSENCE OF LATENT or OTHER DEFECTS, ACCURACY, or the PRESENCE OF ABSENCE OF ERRORS, whether or not discoverable.

Introduction

When the computer starts, the processor begins executing instructions at the reset vector. On the original 8086 this is offset 0xfff0 in segment 0xf000, i.e. physical address 0xffff0; later x86 processors start near the top of their address space instead. This is usually a location in the BIOS ROM, so the processor executes the BIOS code. The BIOS runs the POST (power-on self test), performs other checks and then finds the boot device. It loads the code from the boot sector of that device into memory and executes it. From here, the code in the boot sector takes over control. In IBM-compatible PCs, the boot sector is the first sector of a data storage device and is 512 bytes long. The following table shows what the boot sector contains.

Address        Description                                  Size in bytes
Hex    Dec
000    0       Code                                         440
1b8    440     Optional disk signature                      4
1bc    444     0x0000                                       2
1be    446     Four 16-byte entries for primary partitions  64
1fe    510     0xaa55                                       2
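
The offsets in this table follow from the field sizes, since each field starts where the previous one ends. The following is a minimal sketch in shell arithmetic that recomputes them; the variable names are made up for this illustration and are not part of any tool.

```shell
# Field sizes in bytes, taken from the table above.
code=440
disk_signature=4
padding=2
partition_table=64
boot_signature=2

# Each field starts where the previous one ends.
off_disk_signature=$((code))
off_padding=$((off_disk_signature + disk_signature))
off_partition_table=$((off_padding + padding))
off_boot_signature=$((off_partition_table + partition_table))
sector_size=$((off_boot_signature + boot_signature))

printf 'disk signature   0x%03x (%d)\n' "$off_disk_signature" "$off_disk_signature"
printf '0x0000 padding   0x%03x (%d)\n' "$off_padding" "$off_padding"
printf 'partition table  0x%03x (%d)\n' "$off_partition_table" "$off_partition_table"
printf 'boot signature   0x%03x (%d)\n' "$off_boot_signature" "$off_boot_signature"
printf 'sector size      %d bytes\n' "$sector_size"
```

The last line confirms that the five fields add up to exactly one 512-byte sector.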

This article explains how to write code that can be placed in the boot sector. Two programs are discussed in the following sections. The first one simply prints the character 'A' on the screen. The second prints a string. You are expected to have a working knowledge of assembly programming with GNU as. The details of the assembly language are not discussed; only what is specific to boot sector code is covered. All code examples used in this article are available at http://susam.in/files/code/boot-sector/.

The code examples were verified by using the following tools while writing this article:

  1. GNU assembler (GNU Binutils for Debian) 2.18
  2. GNU ld (GNU Binutils for Debian) 2.18
  3. dd (coreutils) 5.97
  4. DOSEMU 1.4.0.0
  5. DOSBox 0.72

Printing a character

# Author: Susam Pal <http://susam.in/>
#
# To assemble and link this code, execute the following commands:-
# as -o char.o char.s
# ld --oformat binary -o char char.o
#
# To write this code into the boot sector of a device, say /dev/sdb:-
# dd if=char of=/dev/sdb
# echo -ne "\x55\xaa" | dd seek=510 bs=1 of=/dev/sdb

.code16
.section .text
.globl _start
_start:
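  mov $0xb800, %ax      # segment of the color text mode video memory
  mov %ax, %ds          # DS = 0xb800, so data offsets refer to this segment
  movb $'A', 0          # character code at offset 0
  movb $0x1e, 1         # attribute (yellow on blue) at offset 1
idle:
  jmp idle              # loop forever; there is nothing to return to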

The .code16 directive tells the assembler that this code is meant for 16-bit mode. The _start label tells the linker that this is the entry point of the program.

The video memory of the VGA is mapped to various segments between 0xa000 and 0xc000 in the main memory. The color text mode is mapped to the segment 0xb800. The first two instructions move 0xb800 into the data segment register, so that any data offset specified is an offset in this segment. Then, the code for the character 'A' (usually 0x41 or 65) is moved into the first location in this segment and the attribute (0x1e) of this character into the second location. The higher nibble (0x1) is the attribute for the background color and the lower nibble (0xe) is that of the foreground color. The highest bit of each nibble is the intensifier bit. The other three bits represent red, green and blue. This is represented in tabular form below.

Attribute
Background   Foreground
I R G B      I R G B
0 0 0 1      1 1 1 0
0x1          0xe
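
The nibble layout in this table can be checked with a little shell arithmetic. This is only an illustrative sketch; the bits helper is a made-up function for this example.

```shell
attr=$((0x1e))                        # attribute byte used in char.s

background=$(( (attr >> 4) & 0xf ))   # high nibble
foreground=$(( attr & 0xf ))          # low nibble

# Print the I, R, G and B bits of a nibble, most significant bit first.
bits() {
  printf '%d%d%d%d' $(( ($1 >> 3) & 1 )) $(( ($1 >> 2) & 1 )) \
                    $(( ($1 >> 1) & 1 )) $(( $1 & 1 ))
}

bg_bits=$(bits "$background")
fg_bits=$(bits "$foreground")
printf 'background 0x%x IRGB=%s\n' "$background" "$bg_bits"
printf 'foreground 0x%x IRGB=%s\n' "$foreground" "$fg_bits"
```

Changing attr to another value is a quick way to work out the bits for a different color combination before putting it into the assembly source.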

It can be seen from the table that the background color is dark blue and the foreground color is bright yellow. Assemble and link the code with the as and ld commands mentioned in the comments. Before writing the code into the boot sector, you might want to verify that it works with an emulator. DOSEMU is a nice emulator that does this job very well. In Debian, it is available as the dosemu package. DOSBox is also pretty good and is available as the dosbox package in Debian. Create a copy of the binary file, char, and name it char.com. This is necessary so that DOSEMU can run it as a command. DOS COM files are merely machine code with no headers, which is exactly what ld generates with the --oformat binary option.

To avoid this step of renaming the file to char.com, I often specify the output file as char.com while running the ld command. For example:-

ld --oformat binary -o char.com char.o
dosemu char.com

Once you are satisfied with the output of char.com in DOSEMU, use the two commands given below to write the binary and the boot signature into the boot sector. Be absolutely sure of what you are doing at this step. /dev/sdb is only an example here. You must change it to the correct device where you want to overwrite the boot sector with this code. If you use dd to write to the wrong device, you might lose access to the data on it. If you are in doubt, take help from a guru.

dd if=char.com of=/dev/sdb
echo -ne "\x55\xaa" | dd seek=510 bs=1 of=/dev/sdb

Now, you may boot your computer with this device.
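
If you would like to rehearse the two dd commands without touching a real device, one option is to run them against a regular file first. The following is only a sketch under some assumptions: disk.img and payload.bin are hypothetical file names, and a two-byte stand-in (eb fe, a jump to itself) is used in place of char.com so that the example runs without assembling anything.

```shell
workdir=$(mktemp -d)
img="$workdir/disk.img"          # hypothetical image file, not a real device
payload="$workdir/payload.bin"   # stand-in for char.com

# Two bytes (eb fe, a jump to itself) stand in for the real binary.
printf '\353\376' > "$payload"

# Create an empty 512-byte sector, then write the payload at offset 0.
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
dd if="$payload" of="$img" conv=notrunc 2>/dev/null

# Write the boot signature (0x55, 0xaa) into the last two bytes.
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null

size=$(( $(wc -c < "$img") ))
signature=$(od -An -tx1 -j510 -N2 "$img" | tr -d ' \n')
first_bytes=$(od -An -tx1 -N2 "$img" | tr -d ' \n')
echo "size=$size first_bytes=$first_bytes signature=$signature"
```

The conv=notrunc option matters here because dd truncates a regular output file by default, whereas a block device such as /dev/sdb keeps its size. The printf command with octal escapes is used instead of echo -ne only because it behaves the same in every POSIX shell.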

Printing a string

# Author: Susam Pal <http://susam.in/>
#
# To assemble and link this code, execute:-
# as -o string.o string.s
# ld --oformat binary -Ttext 7c00 -Tdata 7c20 -o string string.o
#
# To write this code into the boot sector of a device, say /dev/sdb:-
# dd if=string of=/dev/sdb
# echo -ne "\x55\xaa" | dd seek=510 bs=1 of=/dev/sdb

.code16

.section .data
message:
  .asciz "hello, world"

.section .text
.globl _start
_start:
  nop
  xor %di, %di
  mov $0xb800, %ax
  mov %ax, %ds
  mov $message, %si
move:
  xor %dx, %dx
  mov %cs:(%si), %dl
  cmp $0, %dl
idle:
  jz idle
  mov %dl, (%di)
  inc %di
  movb $0x1e, (%di)
  inc %di
  inc %si
  jmp move

There are two sections in this code. The data section has the null-terminated string to be displayed. The text section has the code. The code moves the first byte of the string to the location 0xb800:0x0000, its attribute to 0xb800:0x0001, the second byte of the string to 0xb800:0x0002, its attribute to 0xb800:0x0003 and so on until the string terminates, which is detected by the null byte (0x0) at the end. The instruction mov %cs:(%si), %dl moves one character from the string, indexed by the SI register in the code segment, into the DL register. The reason why the characters are read from the code segment will become clear from the linker commands discussed below.
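
The interleaving of characters and attributes described above can be traced with a short shell loop. This is a rough model of what the assembly code does, not a replacement for it; the variable names are invented for this illustration. The last line also uses the real-mode rule that a physical address is the segment multiplied by 16 plus the offset.

```shell
message='hello, world'
segment=$((0xb800))
length=${#message}

i=0
while [ "$i" -lt "$length" ]; do
  ch=$(printf '%s' "$message" | cut -c "$((i + 1))")
  char_off=$((2 * i))        # even offsets hold character codes
  attr_off=$((2 * i + 1))    # odd offsets hold attributes
  printf "'%s' -> 0xb800:0x%04x, 0x1e -> 0xb800:0x%04x\n" \
         "$ch" "$char_off" "$attr_off"
  i=$((i + 1))
done

# Value of DI after the last character, and the physical address of the
# first character cell (segment * 16 + offset 0).
final_di=$((2 * length))
first_cell=$((segment * 16))
printf 'final DI = %d, first cell at physical 0x%x\n' "$final_di" "$first_cell"
```

Each character takes two bytes of video memory, so DI advances by two for every byte read through SI.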

While booting, the BIOS reads the code from the first sector of the boot device into memory at physical address 0x7c00 and jumps to that address. However, while testing with DOSEMU, things are a little different. In DOS, the text section of a COM program is loaded at offset 0x0100 in the code segment. This must be specified to the linker so that it can correctly resolve the value of the message label. So, the object file has to be linked twice: once for testing it on DOSEMU and once more before writing it into the boot sector. For a trial, try this:-

as -o string.o string.s
ld --oformat binary -Ttext 0 -o string.com string.o
objdump -bbinary -mi8086 -D string.com
hd string.com

The -Ttext 0 option tells the linker to assume that the text section is loaded at offset 0x0 in the code segment. The objdump command is used to disassemble the file. This shows where the text section and the data section are placed. Here is the portion of the output that is of interest.

      1b:       47                      inc    %di
      1c:       46                      inc    %si
      1d:       eb ec                   jmp    0xb
        ...
    101f:       00 68 65                add    %ch,0x65(%bx,%si)
    1022:       6c                      insb   (%dx),%es:(%di)
    1023:       6c                      insb   (%dx),%es:(%di)

Also, try the hd command. Here is the output.

00000000  90 31 ff b8 00 b8 8e d8  be 20 10 31 d2 2e 8a 14  |.1....... .1....|
00000010  80 fa 00 74 fe 88 15 47  c6 05 1e 47 46 eb ec 00  |...t...G...GF...|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001020  68 65 6c 6c 6f 2c 20 77  6f 72 6c 64 00           |hello, world.|
0000102d

It can be seen that the text section occupies the first 0x1f bytes (offsets 0x00 to 0x1e) and that the linker has put the data section at offset 0x1020. However, only 440 bytes are available for our code. So, the string can be placed at offset 0x20, right after the text section, so that the whole binary fits into the boot sector. Now link the object code accordingly and test it once on DOSEMU.

ld --oformat binary -Ttext 100 -Tdata 120 -o string.com string.o
dosemu string.com
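
The -Ttext and -Tdata values used for DOSEMU (100 and 120) and those given for the boot sector in the comments of string.s (7c00 and 7c20) differ only in their base. In both cases the string sits 0x20 bytes after the start of the text section. The following sketch checks this arithmetic and the space left in the 440-byte code area; the variable names are made up for this illustration.

```shell
# -Ttext and -Tdata values (hexadecimal) for the two link layouts.
dos_text=$((0x100));   dos_data=$((0x120))
boot_text=$((0x7c00)); boot_data=$((0x7c20))

dos_gap=$((dos_data - dos_text))
boot_gap=$((boot_data - boot_text))

text_size=$((0x1f))    # size of the text section seen in the hd output
string_size=13         # "hello, world" plus the terminating null byte

binary_size=$((dos_gap + string_size))
free=$((440 - binary_size))

printf 'string offset: 0x%x (DOS), 0x%x (boot sector)\n' "$dos_gap" "$boot_gap"
printf 'binary size: %d bytes, %d of 440 bytes free\n' "$binary_size" "$free"
```

Since the text section needs 0x1f bytes, 0x20 is the lowest 16-byte aligned offset at which the string does not overlap the code.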

If everything is fine, link it once again for the boot sector and write it to the boot sector of your device. Again, be very careful with the dd commands. /dev/sdb is only an example. You must change it to the device you want to write this code to.

ld --oformat binary -Ttext 7c00 -Tdata 7c20 -o string string.o
dd if=string of=/dev/sdb
echo -ne "\x55\xaa" | dd seek=510 bs=1 of=/dev/sdb

Once written to the device successfully, you may boot your computer with it.
