assembly - What's the purpose of instructions for loading a register to itself?


These curious NOP instructions go all the way back to the original ancestor processor, the Intel 8008. In that chip, they were merely a result of how the register move instruction was implemented. Allowing MOV A,A etc. simplified the instruction decoder and saved silicon space.

From the 8080 through to the Z80 (and beyond), these became required to maintain backwards compatibility. They even survived into the x86 world in the form

MOV AL,AL etc.

So most modern desktop machines still support these odd instructions.

Note: I used Intel mnemonics when describing Intel machines. Be assured that these assemble down to the same binary code as the Zilog mnemonics.

While looking through the Gameboy's instruction set, I came across instructions such as:



Each of these instructions has its own opcode in this table, which makes me think they are of some importance, given the restrictions on the number of possible opcodes.

I first thought that it might be dereferencing a pointer in that register and storing the value at that pointer (like in this question), but in an emulator, LD A, A is implemented as:

Z80._r.a = Z80._r.a

They seem to have no effect on the state of the processor (just set registers to their own value) and take the same number of cycles as a NOP to execute.

Why are these opcodes included in the instruction set and what purpose do they serve?

They simplify the instruction decoder. Compare these opcodes:

78 LD A,B
79 LD A,C


47 LD B,A
40 LD B,B
41 LD B,C


48 LD C,B
49 LD C,C

Notice that the bottom 3 bits select the source register (values 0-7 meaning B, C, D, E, H, L, (HL), A), the next 3 bits select the target register with the same 0-7 encoding (so 0 and 0 gives LD B,B), and the top two bits 01 select LD. I only took a quick glance, so I may not have deciphered it perfectly.

By the same pattern, one would expect 76 to be LD (HL),(HL), which makes even less sense than LD A,A, so there is special logic to catch that one and execute HALT instead.
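The bit layout described above can be sketched as a small decoder. This is only an illustration of the field split, not how any particular emulator or the hardware does it; the names `REGS` and `decode_ld` are made up for this example, and it assumes an unprefixed opcode byte from the 40-7F block.

```python
# Register order used by both 3-bit fields, as listed in the answer.
REGS = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]


def decode_ld(opcode: int) -> str:
    """Decode one unprefixed opcode byte from the 0x40-0x7F block (hypothetical helper)."""
    if opcode >> 6 != 0b01:
        raise ValueError(f"0x{opcode:02X} is not in the LD r,r' block")
    dst = (opcode >> 3) & 0b111  # middle three bits: target register
    src = opcode & 0b111         # bottom three bits: source register
    if dst == 6 and src == 6:
        # The slot where LD (HL),(HL) would fall is used for HALT.
        return "HALT"
    return f"LD {REGS[dst]},{REGS[src]}"


if __name__ == "__main__":
    for op in (0x78, 0x47, 0x40, 0x49, 0x7F, 0x76):
        print(f"{op:02X} {decode_ld(op)}")
```

The same-register cases need no special handling at all: they fall out of the two independent field lookups, which is the point the answer makes about the decoder. Only the (HL),(HL) slot gets an explicit check.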

So it's about keeping the instruction decoder simple: the same bit patterns select the source and target registers, and no extra transistors are spent on catching the same,same cases. The one exception is (HL),(HL), which maybe fails internally anyway because both source and target would require memory access, so the extra "logic" may be fairly simple in the HW design.

Keep in mind that early CPUs were often designed by hand, and the total number of transistors had to be kept low, both to fit on the chip and to keep it manageable to draw the circuitry by hand and verify its correctness.

EDIT: The Z80 has about 8500 transistors, you may want to check: and ... The GameBoy has a slightly modified Z80, but its total transistor count will be fairly close to the original value. I didn't search for the exact number, and I'm not sure how far Nintendo extended the design; maybe they could already afford something like 20-50k, but I doubt it.

Addendum: lately I have read about the Russian Sinclair ZX Spectrum clones, which were heavily modified machines with extra power, memory and capabilities. Some of them use these ld same,same opcodes to control DMA transfers, so on these machines code that uses them as nop would probably fail to execute properly. This is not GameBoy related, but if you have a binary targeting one of the "Sprinter" or similar Russian ZX clones and you find one of these in the disassembly, don't automatically consider them nop; they may be part of effective code that actually does something (most probably with DMA).
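If you want a disassembler to point out these opcodes instead of silently treating them as nop, a check along the following lines might help. It is a minimal sketch: `is_self_load` and `SELF_LOADS` are hypothetical names, it only applies to unprefixed opcode bytes (after a CB, ED, DD or FD prefix the same byte means something else), and it says nothing about what a particular clone actually does with each opcode.

```python
def is_self_load(opcode: int) -> bool:
    """True for unprefixed LD r,r opcodes where target and source are the same register."""
    top = opcode >> 6
    dst = (opcode >> 3) & 0b111
    src = opcode & 0b111
    # dst == src == 6 is excluded: that slot (0x76) is HALT, not a load.
    return top == 0b01 and dst == src and dst != 6


# All seven "ld same,same" opcodes: B,B C,C D,D E,E H,H L,L A,A
SELF_LOADS = [op for op in range(0x100) if is_self_load(op)]

if __name__ == "__main__":
    print(" ".join(f"{op:02X}" for op in SELF_LOADS))
```

A disassembler would call the check only on bytes it has already identified as opcodes; scanning raw bytes would also hit operands and data that happen to have the same value.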