Assembly and opcodes

Intro

When a program is compiled (and linked), it no longer contains readable lines of code. Instead it consists of opcodes (operation codes) and their operands. I would like to see whether I can find out what some opcodes mean. You might say “just look at the Intel or AMD manual and there you go”… but where is the fun in that?

Take care… I’m not an assembly professional! I’m an enthusiastic hobbyist. If you see errors or omissions, do not hesitate to point them out (but please be nice 🙂).

Approach

  1. Make a very small assembly program (you will see it in a second)
  2. Assemble it
  3. Link it
  4. Make a hexdump (to make it easier to read)
  5. Change one instruction in the program
  6. Assemble
  7. Link
  8. Make a hexdump
  9. Search for differences between the two hexdumps
  10. See if you can learn something about opcodes
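
The assemble, link and dump steps are the same for every variant of the program, so they can be scripted. The following is a minimal Python sketch, not part of my original workflow. It assumes nasm, ld and hexdump are on the PATH and that a file `<stem>.asm` exists in the current directory; the function names `build_commands` and `build_and_dump` are made up for this illustration. The command lines themselves are the ones used in the steps of this post.

```python
import subprocess
from pathlib import Path


def build_commands(stem: str) -> list[list[str]]:
    """Return the assemble, link and hexdump commands for `<stem>.asm`."""
    return [
        ["nasm", "-o", f"{stem}.o", "-f", "elf64", f"{stem}.asm"],  # steps 2 and 6
        ["ld", "-o", stem, f"{stem}.o"],                            # steps 3 and 7
        ["hexdump", "-C", stem],                                    # steps 4 and 8
    ]


def build_and_dump(stem: str) -> Path:
    """Run the three commands and save the dump as `hd_<stem>`."""
    assemble, link, dump = build_commands(stem)
    subprocess.run(assemble, check=True)
    subprocess.run(link, check=True)
    result = subprocess.run(dump, check=True, capture_output=True, text=True)
    out = Path(f"hd_{stem}")
    out.write_text(result.stdout)
    return out
```

Calling `build_and_dump("mov_al_1")` and then `build_and_dump("mov_bl_1")` would leave the two dump files ready for the comparison in step 9.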

 

Do the work

1. Make a very small assembly program (you will see it in a second)

I created the following small program:

[martijn@fedora asm]$ cat mov_al_1.asm
section .data    ;I have no data

section .text    ;Here we start the coding

global _start    ;Define our entry point

_start:          ;Start
mov al, 1        ;Our instruction of interest

mov rax, 60      ;Syscall for exiting the program
mov rdi, 0       ;Return code is zero
syscall          ;Make the call

[martijn@fedora asm]$

As you can see, the program does nothing of interest. The only thing it does is move the value 1 into the register al (1 byte wide). The idea is to change only this mov instruction. If everything else stays the same, we should be able to see what changed and learn something from it.

2. Assemble it

 [martijn@fedora asm]$ nasm -o mov_al_1.o -f elf64 mov_al_1.asm

3. Link it

[martijn@fedora asm]$ ld -o mov_al_1 mov_al_1.o

4. Make a hexdump (to make it easier to read)

[martijn@fedora asm]$ hexdump -C mov_al_1 > hd_mov_al_1
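
To make the three columns of such a dump concrete (file offset, the bytes in hex, and the same bytes as ASCII), here is a small Python sketch that formats bytes the same way. It is an illustration only: the names `dump_row` and `dump` are hypothetical, and it follows the single-space layout of the rows as they appear in this post, while the real `hexdump -C` output uses slightly wider spacing between the columns.

```python
def dump_row(offset: int, chunk: bytes) -> str:
    """Format up to 16 bytes like one row of the dumps shown in this post."""
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    # Printable ASCII (0x20..0x7e) is shown as-is, everything else as '.'
    text_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    return f"{offset:08x} {hex_part} |{text_part}|"


def dump(data: bytes, base: int = 0) -> list[str]:
    """Split data into rows of 16 bytes and format each row."""
    return [dump_row(base + i, data[i:i + 16]) for i in range(0, len(data), 16)]


# The 16 bytes at file offset 0x1000 of the "mov al, 1" executable.
code = bytes.fromhex("b0 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05 00 00")
print(dump_row(0x1000, code))
```

Only the byte 3c falls in the printable range (it is the character `<`), which is why the text column of that row is almost entirely dots.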

5. Change one instruction in the program

[martijn@fedora asm]$ cat mov_bl_1.asm
section .data

section .text

global _start

_start:
mov bl, 1

mov rax, 60
mov rdi, 0
syscall
[martijn@fedora asm]$

The only thing that changed is “mov al, 1” becoming “mov bl, 1”.

6. Assemble

[martijn@fedora asm]$ nasm -o mov_bl_1.o -f elf64 mov_bl_1.asm

7. Link

[martijn@fedora asm]$ ld -o mov_bl_1 mov_bl_1.o

8. Make hexdump

[martijn@fedora asm]$ hexdump -C mov_bl_1 > hd_mov_bl_1

9. Search for differences in the two hexdumps

The hexdump of the “mov al, 1” program

[martijn@fedora asm]$ cat hd_mov_al_1 
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 02 00 3e 00 01 00 00 00 00 10 40 00 00 00 00 00 |..>.......@.....|
00000020 40 00 00 00 00 00 00 00 e8 10 00 00 00 00 00 00 |@...............|
00000030 00 00 00 00 40 00 38 00 02 00 40 00 05 00 04 00 |....@.8...@.....|
00000040 01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 40 00 00 00 00 00 00 00 40 00 00 00 00 00 |..@.......@.....|
00000060 b0 00 00 00 00 00 00 00 b0 00 00 00 00 00 00 00 |................|
00000070 00 10 00 00 00 00 00 00 01 00 00 00 05 00 00 00 |................|
00000080 00 10 00 00 00 00 00 00 00 10 40 00 00 00 00 00 |..........@.....|
00000090 00 10 40 00 00 00 00 00 0e 00 00 00 00 00 00 00 |..@.............|
000000a0 0e 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 |................|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001000 b0 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05 00 00 |...<............|
00001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001020 00 00 00 00 00 00 00 00 01 00 00 00 04 00 f1 ff |................|
00001030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001040 13 00 00 00 10 00 01 00 00 10 40 00 00 00 00 00 |..........@.....|
00001050 00 00 00 00 00 00 00 00 0e 00 00 00 10 00 01 00 |................|
00001060 00 20 40 00 00 00 00 00 00 00 00 00 00 00 00 00 |. @.............|
00001070 1a 00 00 00 10 00 01 00 00 20 40 00 00 00 00 00 |......... @.....|
00001080 00 00 00 00 00 00 00 00 21 00 00 00 10 00 01 00 |........!.......|
00001090 00 20 40 00 00 00 00 00 00 00 00 00 00 00 00 00 |. @.............|
000010a0 00 6d 6f 76 5f 61 6c 5f 31 2e 61 73 6d 00 5f 5f |.mov_al_1.asm.__|
000010b0 62 73 73 5f 73 74 61 72 74 00 5f 65 64 61 74 61 |bss_start._edata|
000010c0 00 5f 65 6e 64 00 00 2e 73 79 6d 74 61 62 00 2e |._end...symtab..|
000010d0 73 74 72 74 61 62 00 2e 73 68 73 74 72 74 61 62 |strtab..shstrtab|
000010e0 00 2e 74 65 78 74 00 00 00 00 00 00 00 00 00 00 |..text..........|
000010f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001120 00 00 00 00 00 00 00 00 1b 00 00 00 01 00 00 00 |................|
00001130 06 00 00 00 00 00 00 00 00 10 40 00 00 00 00 00 |..........@.....|
00001140 00 10 00 00 00 00 00 00 0e 00 00 00 00 00 00 00 |................|
00001150 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 |................|
00001160 00 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00 |................|
00001170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001180 10 10 00 00 00 00 00 00 90 00 00 00 00 00 00 00 |................|
00001190 03 00 00 00 02 00 00 00 08 00 00 00 00 00 00 00 |................|
000011a0 18 00 00 00 00 00 00 00 09 00 00 00 03 00 00 00 |................|
000011b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000011c0 a0 10 00 00 00 00 00 00 26 00 00 00 00 00 00 00 |........&.......|
000011d0 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
000011e0 00 00 00 00 00 00 00 00 11 00 00 00 03 00 00 00 |................|
000011f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001200 c6 10 00 00 00 00 00 00 21 00 00 00 00 00 00 00 |........!.......|
00001210 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00001220 00 00 00 00 00 00 00 00 |........|
00001228
[martijn@fedora asm]$

And the hexdump of the “mov bl, 1” program

[martijn@fedora asm]$ cat hd_mov_bl_1 
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 02 00 3e 00 01 00 00 00 00 10 40 00 00 00 00 00 |..>.......@.....|
00000020 40 00 00 00 00 00 00 00 e8 10 00 00 00 00 00 00 |@...............|
00000030 00 00 00 00 40 00 38 00 02 00 40 00 05 00 04 00 |....@.8...@.....|
00000040 01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 40 00 00 00 00 00 00 00 40 00 00 00 00 00 |..@.......@.....|
00000060 b0 00 00 00 00 00 00 00 b0 00 00 00 00 00 00 00 |................|
00000070 00 10 00 00 00 00 00 00 01 00 00 00 05 00 00 00 |................|
00000080 00 10 00 00 00 00 00 00 00 10 40 00 00 00 00 00 |..........@.....|
00000090 00 10 40 00 00 00 00 00 0e 00 00 00 00 00 00 00 |..@.............|
000000a0 0e 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 |................|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001000 b3 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05 00 00 |...<............|
00001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001020 00 00 00 00 00 00 00 00 01 00 00 00 04 00 f1 ff |................|
00001030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001040 13 00 00 00 10 00 01 00 00 10 40 00 00 00 00 00 |..........@.....|
00001050 00 00 00 00 00 00 00 00 0e 00 00 00 10 00 01 00 |................|
00001060 00 20 40 00 00 00 00 00 00 00 00 00 00 00 00 00 |. @.............|
00001070 1a 00 00 00 10 00 01 00 00 20 40 00 00 00 00 00 |......... @.....|
00001080 00 00 00 00 00 00 00 00 21 00 00 00 10 00 01 00 |........!.......|
00001090 00 20 40 00 00 00 00 00 00 00 00 00 00 00 00 00 |. @.............|
000010a0 00 6d 6f 76 5f 62 6c 5f 31 2e 61 73 6d 00 5f 5f |.mov_bl_1.asm.__|
000010b0 62 73 73 5f 73 74 61 72 74 00 5f 65 64 61 74 61 |bss_start._edata|
000010c0 00 5f 65 6e 64 00 00 2e 73 79 6d 74 61 62 00 2e |._end...symtab..|
000010d0 73 74 72 74 61 62 00 2e 73 68 73 74 72 74 61 62 |strtab..shstrtab|
000010e0 00 2e 74 65 78 74 00 00 00 00 00 00 00 00 00 00 |..text..........|
000010f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001120 00 00 00 00 00 00 00 00 1b 00 00 00 01 00 00 00 |................|
00001130 06 00 00 00 00 00 00 00 00 10 40 00 00 00 00 00 |..........@.....|
00001140 00 10 00 00 00 00 00 00 0e 00 00 00 00 00 00 00 |................|
00001150 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 |................|
00001160 00 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00 |................|
00001170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001180 10 10 00 00 00 00 00 00 90 00 00 00 00 00 00 00 |................|
00001190 03 00 00 00 02 00 00 00 08 00 00 00 00 00 00 00 |................|
000011a0 18 00 00 00 00 00 00 00 09 00 00 00 03 00 00 00 |................|
000011b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000011c0 a0 10 00 00 00 00 00 00 26 00 00 00 00 00 00 00 |........&.......|
000011d0 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
000011e0 00 00 00 00 00 00 00 00 11 00 00 00 03 00 00 00 |................|
000011f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001200 c6 10 00 00 00 00 00 00 21 00 00 00 00 00 00 00 |........!.......|
00001210 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00001220 00 00 00 00 00 00 00 00 |........|
00001228
[martijn@fedora asm]$

Here in WordPress I do not know how to display the two dumps alongside each other, so to see the changes I use diff:

[martijn@fedora asm]$ diff --suppress-common-lines hd_mov_al_1 hd_mov_bl_1
14c14
< 00001000 b0 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05 00 00 |...<............|
---
> 00001000 b3 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05 00 00 |...<............|
24c24
< 000010a0 00 6d 6f 76 5f 61 6c 5f 31 2e 61 73 6d 00 5f 5f |.mov_al_1.asm.__|
---
> 000010a0 00 6d 6f 76 5f 62 6c 5f 31 2e 61 73 6d 00 5f 5f |.mov_bl_1.asm.__|
[martijn@fedora asm]$
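
diff works on whole lines of the dump, so you still have to spot the changed byte by eye. As an alternative, here is a minimal Python sketch that compares the raw bytes directly and reports the exact offsets. The function name `byte_diffs` is made up for this example, and it assumes both inputs have the same length, which is the case for these two executables.

```python
def byte_diffs(old: bytes, new: bytes, base: int = 0) -> list[tuple[int, int, int]]:
    """Return (offset, old_byte, new_byte) for every position that differs."""
    if len(old) != len(new):
        raise ValueError("this sketch only compares inputs of equal length")
    return [(base + i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


# The two rows that diff reported at 0x1000, copied from the dumps above.
row_al = bytes.fromhex("b0 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05 00 00")
row_bl = bytes.fromhex("b3 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05 00 00")

for offset, a, b in byte_diffs(row_al, row_bl, base=0x1000):
    print(f"{offset:#06x}: {a:02x} -> {b:02x}")  # prints "0x1000: b0 -> b3"
```

Feeding it the complete files, for example `byte_diffs(Path("mov_al_1").read_bytes(), Path("mov_bl_1").read_bytes())` with `Path` imported from `pathlib`, should report just two bytes if the dumps above are anything to go by: the one at 0x1000 and the one at 0x10a5 inside the filename.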

10. See if you can learn something about opcodes

The first thing I notice: there are differences on two lines, quite far apart from each other. If you look closely at the last difference, you see that the filename of the original assembly file is in the hexdump of the resulting executable (as a newbie, that surprises me a bit). Since the first name (mov_al_1.asm) differs from the second (mov_bl_1.asm), it is no surprise that they show up in the comparison.

The first difference is more exciting (to me at least):

The bytes I see in the first dump (mov al, 1):

b0 01

In the second dump I see (mov bl, 1):

b3 01

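Now I cheat a little… I do have a look at the instruction manual from Intel. See the following snippet:

[Image: snippet of the mov instruction from the Intel manual]

B0 plus a register number, followed by an immediate byte value, is the encoding for moving a value into a (byte) register. Note that mov copies the value into the register; it does not add it to what was there.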

I now say:

b0 + 0 = b0 : code for moving an immediate value into (byte) register al

b0 + 3 = b3 : code for moving an immediate value into (byte) register bl

I also tried this for mov cl,1

There I find:

b1 01

b0 + 1 = b1 : code for moving an immediate value into (byte) register cl
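
The "b0 + register number" rule can be turned into a tiny decoder. The Python sketch below is only an illustration and handles just the three encodings that occur in my test program. The register order in `REG8` and `REG32`, the B8+register form with a 4-byte immediate, and the 0F 05 syscall bytes are taken from the Intel manual rather than derived by experiment here; only al, cl and bl were actually checked in this post. The name `decode` is hypothetical.

```python
# Register numbers added to the base opcode, as listed in the Intel manual
# (no REX prefix). The post itself only confirmed al (0), cl (1) and bl (3).
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode(code: bytes) -> list[str]:
    """Decode a byte string made only of the three encodings handled here."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if 0xB0 <= op <= 0xB7:              # B0+reg, imm8: mov r8, imm8
            out.append(f"mov {REG8[op - 0xB0]}, {code[i + 1]}")
            i += 2
        elif 0xB8 <= op <= 0xBF:            # B8+reg, imm32 (little-endian)
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov {REG32[op - 0xB8]}, {imm}")
            i += 5
        elif code[i:i + 2] == b"\x0f\x05":  # 0F 05: syscall
            out.append("syscall")
            i += 2
        else:
            raise ValueError(f"unhandled opcode byte {op:#04x} at index {i}")
    return out


# The first 14 bytes at file offset 0x1000 of the "mov bl, 1" dump.
print(decode(bytes.fromhex("b3 01 b8 3c 00 00 00 bf 00 00 00 00 0f 05")))
# prints ['mov bl, 1', 'mov eax, 60', 'mov edi, 0', 'syscall']
```

One thing this shows: although the source says `mov rax, 60`, the bytes in the dump are b8 3c 00 00 00, which is the 32-bit form `mov eax, 60`. As far as I understand, that is safe because writing a 32-bit register clears the upper half of the 64-bit register, so the assembler can pick the shorter encoding.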

 

Conclusion

Remember… I’m an enthusiastic amateur. Maybe the above is very wrong. Also, I do not know too much about addressing modes etc. I do not know whether (and if so, how much) they affect the encoding. So if you feel I’m in error, please point it out and share some knowledge.

 
