rbp not allowed as SIB base?
I'm quite new to x86-64 binary encoding. I'm trying to fix some old "assembler" code.
Anyways, I'm trying to do something like this (Intel syntax):
mov [rbp+rcx], al
The assembler is currently generating this:
88 04 0D
but that doesn't seem to be a valid instruction. If I change the base in the SIB byte from rbp to some other register, it works fine. Another way to make it work is to add a one-byte displacement of zero (88 44 0D 00). This seems to happen with other similar opcodes.
Why can't I use rbp as the SIB base here?
The encoding that would mean [rbp] (base=rbp with no displacement) is an escape code for no base register (just a disp32 in SIB, or RIP-relative rel32 in ModRM). Most assemblers assemble [rbp] into [rbp + disp8=0].
Since you don't need it scaled, use [rcx + rbp] instead to avoid needing a disp8=0, because rbp can be an index.
(SS and DS are always equivalent in long mode, so it doesn't matter that base=RBP implies SS while base=RCX implies using the DS segment.)
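To see why the three byte sequences discussed above behave differently, here is a minimal sketch of a ModRM/SIB decoder for memory operands. The helper name decode_mem and its return format are made up for illustration; it assumes 64-bit address size and takes the bytes that follow the opcode.

```python
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def decode_mem(data, rex=0):
    """Decode ModRM (+SIB, +disp) at the start of `data` as a memory operand.

    Hypothetical helper: returns (text, number_of_bytes_the_operand_needs).
    """
    rex_x = (rex >> 1) & 1
    rex_b = rex & 1
    mod, rm = data[0] >> 6, data[0] & 7
    if mod == 3:
        raise ValueError("mod=11 selects a register operand, not memory")
    pos = 1
    disp_size = {0: 0, 1: 1, 2: 4}[mod]
    parts = []
    if rm == 4:                      # escape code: a SIB byte follows
        sib = data[pos]
        pos += 1
        scale = 1 << (sib >> 6)
        index = ((sib >> 3) & 7) | (rex_x << 3)
        base = sib & 7
        if mod == 0 and base == 5:   # escape code: no base, disp32 instead
            disp_size = 4
        else:
            parts.append(REGS[base | (rex_b << 3)])
        if index != 4:               # index=rsp encoding means "no index"
            parts.append(f"{REGS[index]}*{scale}")
    elif mod == 0 and rm == 5:       # escape code: RIP-relative rel32
        parts.append("rip")
        disp_size = 4
    else:
        parts.append(REGS[rm | (rex_b << 3)])
    needed = pos + disp_size
    if disp_size:
        if len(data) < needed:
            parts.append(f"disp{8 * disp_size}=<missing>")
        else:
            disp = int.from_bytes(data[pos:needed], "little", signed=True)
            parts.append(f"disp{8 * disp_size}={disp}")
    return "[" + " + ".join(parts) + "]", needed

print(decode_mem(bytes([0x04, 0x0D])))        # ('[rcx*1 + disp32=<missing>]', 6)
print(decode_mem(bytes([0x44, 0x0D, 0x00])))  # ('[rbp + rcx*1 + disp8=0]', 3)
print(decode_mem(bytes([0x04, 0x29])))        # ('[rcx + rbp*1]', 2)
```

The first result shows the problem with 88 04 0D: the decoder expects four more displacement bytes and sees no base register at all, so the instruction is truncated rather than meaning [rbp+rcx].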
x86 / x86-64 ModRM addressing mode encoding special cases
(From an answer I wrote on "Why are rbp and rsp called general purpose registers?". This question looks like the perfect place to copy or transplant this section.)
rbp / r13 can't be a base register with no displacement: that encoding instead means (in ModRM) rel32 (RIP-relative), or (in SIB) disp32 with no base register. (r13 uses the same 3 bits in ModRM/SIB, so this choice simplifies decoding by not making the instruction-length decoder look at the REX.B bit to get the 4th base-register bit.) Assemblers encode [r13] as [r13 + disp8=0], and can encode [r13 + rdx] as [rdx + r13] (avoiding the problem by swapping base/index when that's an option).
rsp / r12 as a base register always needs a SIB byte. (The ModR/M encoding of base=RSP is the escape code that signals a SIB byte, and again, more of the decoder would have to care about the REX prefix if r12 were handled differently.)
rsp can't be an index register. This makes it possible to encode [rsp], which is more useful than [rsp + rsp]. (Intel could have designed the ModRM/SIB encodings for 32-bit addressing modes (new in 386) so SIB-with-no-index was only possible with base=ESP. That would make [eax + esp*4] possible and only exclude [esp + esp*1/2/4/8]. But that's not useful, so they simplified the hardware by making index=ESP the code for no index regardless of the base. This allows two redundant ways to encode any base or base+disp addressing mode: with or without a SIB.)
r12 can be an index register. Unlike the other cases, this doesn't affect instruction-length decoding. Also, it can't be worked around with a longer encoding like the other cases. AMD wanted AMD64's register set to be as orthogonal as possible, so it makes sense they'd spend a few extra transistors to check REX.X as part of the index / no-index decoding. For example, [rsp + r12*4] requires index=r12, so having r12 not fully general-purpose would make AMD64 a worse compiler target.
   0:  41 8b 03       mov eax,DWORD PTR [r11]
   3:  41 8b 04 24    mov eax,DWORD PTR [r12]          # needs a SIB like RSP
   7:  41 8b 45 00    mov eax,DWORD PTR [r13+0x0]      # needs a disp8 like RBP
   b:  41 8b 06       mov eax,DWORD PTR [r14]
   e:  41 8b 07       mov eax,DWORD PTR [r15]
  11:  43 8b 04 e3    mov eax,DWORD PTR [r11+r12*8]    # *can* be an index
These all apply to 32-bit addressing modes as well; the encoding is identical except there's no EIP-relative encoding, just two redundant ways to encode disp32 with no base.
See also https://wiki.osdev.org/X86-64_Instruction_Encoding#32.2F64-bit_addressing_2 for tables like the ones in Intel's vol.2 manual.
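The special cases above can be sketched as a tiny encoder. This is an illustration under stated assumptions, not how any particular assembler is implemented: encode_mem is a made-up helper, it handles only [base] and [base + index*scale] with no requested displacement, and it swaps base and index whenever that avoids a disp8=0.

```python
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}

def encode_mem(opcode, base, index=None, scale=1, reg=0):
    """Encode `opcode` with a [base (+ index*scale)] operand; reg is the /r field."""
    b = REGS.index(base)
    x = REGS.index(index) if index is not None else None
    if x == 4:
        # rsp can't be an index; with scale 1 it can become the base instead.
        if scale != 1 or b == 4:
            raise ValueError("rsp cannot be encoded as an index")
        b, x = x, b
    elif x is not None and scale == 1 and (b & 7) == 5 and (x & 7) != 5:
        # rbp/r13 as base would need disp8=0; as an index it doesn't.
        b, x = x, b
    need_disp8 = (b & 7) == 5          # rbp/r13 base: mod=00 is an escape code
    mod = 1 if need_disp8 else 0
    if x is None and (b & 7) != 4:
        body = [(mod << 6) | ((reg & 7) << 3) | (b & 7)]
        rex_x = 0
    else:
        if x is None:
            x = 4                      # rsp/r12 base: SIB with "no index"
        body = [(mod << 6) | ((reg & 7) << 3) | 4,
                (SCALE_BITS[scale] << 6) | ((x & 7) << 3) | (b & 7)]
        rex_x = x >> 3
    if need_disp8:
        body.append(0)
    rex_bits = ((reg >> 3) << 2) | (rex_x << 1) | (b >> 3)
    prefix = [0x40 | rex_bits] if rex_bits else []
    return bytes(prefix + [opcode] + body)

# 0x8B is mov r32, r/m32 (reg=0 is eax); 0x88 is mov r/m8, r8 (reg=0 is al).
print(encode_mem(0x8B, "r12").hex(" "))            # 41 8b 04 24
print(encode_mem(0x8B, "r13").hex(" "))            # 41 8b 45 00
print(encode_mem(0x8B, "r11", "r12", 8).hex(" "))  # 43 8b 04 e3
print(encode_mem(0x88, "rbp", "rcx").hex(" "))     # 88 04 29
```

The last line is the swapped [rcx + rbp] form of the store from the question; with a scale other than 1 the swap isn't possible and the sketch falls back to disp8=0.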
Re: "This seems to happen with other similar opcodes."
ModRM encoding of r/m operands is always the same. Some opcodes require a register operand, and some require memory, but the actual ModRM + optional SIB + optional displacement is fixed so the same hardware can decode it regardless of the instruction.
There are a few rare opcodes like mov al/ax/eax/rax, [qword absolute_address] that don't use ModRM encoding at all for their operands, but any that do use the same format.
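For contrast, here is a rough sketch of how that no-ModRM load form is laid out in 64-bit mode: opcode A0 (byte) or A1 (word/dword/qword), an optional 66 or REX.W prefix, then the 8-byte absolute address. The function name is hypothetical, and the sketch assumes the default 64-bit address size (no 67 prefix).

```python
import struct

def mov_acc_from_abs64(acc, address):
    """Encode mov al/ax/eax/rax, [qword address] (the moffs load form)."""
    prefix = {"al": b"", "ax": b"\x66", "eax": b"", "rax": b"\x48"}[acc]
    opcode = b"\xa0" if acc == "al" else b"\xa1"
    # No ModRM or SIB byte: the address follows the opcode directly.
    return prefix + opcode + struct.pack("<Q", address)

print(mov_acc_from_abs64("al", 0x1122334455667788).hex(" "))
# a0 88 77 66 55 44 33 22 11
print(mov_acc_from_abs64("rax", 0x1122334455667788).hex(" "))
# 48 a1 88 77 66 55 44 33 22 11
```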