tutorial - x86-64 assembly cheat sheet



rbp not allowed as SIB base? (1)

I'm quite new to x86-64 binary encoding. I'm trying to fix some old "assembler" code.

Anyways, I'm trying to do something like this (Intel syntax):

mov    [rbp+rcx], al

The assembler is currently generating this:

88 04 0D

but that doesn't seem to be a valid instruction. If I change out the base in the SIB-byte from rbp to some other register, it works fine. Another way to make it work is to add a one byte displacement of zero ( 88 44 0D 00 ). This seems to happen with other similar opcodes.

Why can't I use rbp there with mod=00 ?


The encoding that would mean rbp is an escape code for no base register (just a disp32 in SIB or RIP-relative rel32 in ModRM). Most assemblers assemble [rbp] into [rbp + disp8=0] .

Since you don't need it scaled, use [rcx + rbp] instead to avoid needing a disp8=0, because rbp can be an index.

(SS and DS are always equivalent in long mode, so it doesn't matter that base=RBP implies SS while base=RCX implies using the DS segment.)

x86 / x86-64 ModRM addressing mode encoding special cases

(from an answer I wrote on Why are rbp and rsp called general purpose registers? ). This question looks like the perfect place to copy or transplant this section.

rbp / r13 can't be a base register with no displacement : that encoding instead means: (in ModRM) rel32 (RIP-relative), or (in SIB) disp32 with no base register. ( r13 uses the same 3 bits in ModRM/SIB, so this choice simplifies decoding by not making the instruction-length decoder look at the REX.B bit to get the 4th base-register bit). [r13] assembles to [r13 + disp8=0] . [r13+rdx] assembles to [rdx+r13] (avoiding the problem by swapping base/index when that's an option).

rsp / r12 as a base register always needs a SIB byte . (The ModR/M encoding of base=RSP is escape code to signal a SIB byte, and again, more of the decoder would have to care about the REX prefix if r12 was handled differently).

rsp can't be an index register . This makes it possible to encode [rsp] , which is more useful than [rsp + rsp] . (Intel could have designed the ModRM/SIB encodings for 32-bit addressing modes (new in 386) so SIB-with-no-index was only possible with base=ESP. That would make [eax + esp*4] possible and only exclude [esp + esp*1/2/4/8] . But that's not useful, so they simplified the hardware by making index=ESP the code for no index regardless of the base. This allows two redundant ways to encode any base or base+disp addressing mode: with or without a SIB.)

r12 can be an index register . Unlike the other cases, this doesn't affect instruction-length decoding. Also, it can't be worked around with a longer encoding like the other cases. AMD wanted AMD64's register set to be as orthogonal as possible, so it makes sense they'd spend a few extra transistors to check REX.X as part of the index / no-index decoding. For example, [rsp + r12*4] requires index=r12, so having r12 not fully generally purpose would make AMD64 a worse compiler target.

   0:   41 8b 03                mov    eax,DWORD PTR [r11]
   3:   41 8b 04 24             mov    eax,DWORD PTR [r12]      # needs a SIB like RSP
   7:   41 8b 45 00             mov    eax,DWORD PTR [r13+0x0]  # needs a disp8 like RBP
   b:   41 8b 06                mov    eax,DWORD PTR [r14]
   e:   41 8b 07                mov    eax,DWORD PTR [r15]
  11:   43 8b 04 e3             mov    eax,DWORD PTR [r11+r12*8] # *can* be an index

These all apply to 32-bit addressing modes as well; the encoding is identical except there's no EIP-relative encoding, just two redundant ways to encode disp32 with no base.

See also https://wiki.osdev.org/X86-64_Instruction_Encoding#32.2F64-bit_addressing_2 for tables like the ones in Intel's vol.2 manual.

This seems to happen with other similar opcodes.

ModRM encoding of r/m operands is always the same. Some opcodes require a register operand, and some require memory, but the actual ModRM + optional SIB + optional displacement is fixed so the same hardware can decode it regardless of the instruction.

There are a few rare opcodes like mov al/ax/eax/rax, [qword absolute_address] that don't use ModRM encoding at all for their operands, but any that do use the same format.





addressing-mode