E Answers to Questions

E.1 Answers to Questions in Chapter 1

cmd.exe
ml64.exe
Address, data, and control
AL, AH, AX, and EAX
BL, BH, BX, and EBX
SIL, SI, and ESI
R8B, R8W, and R8D
FLAGS, EFLAGS, or RFLAGS
(a) 2, (b) 4, (c) 16, (d) 32, (e) 8
Any 8-bit register and any constant that can be represented with 8 bits
32
| Destination | Constant size (bits) |
|---|---|
| RAX | 32 |
| EAX | 32 |
| AX | 16 |
| AL | 8 |
| AH | 8 |
| mem₃₂ | 32 |
| mem₆₄ | 32 |

NOTE

64-bit add operands support only 32-bit constants.
64

NOTE

Technically the x86-64 allows 16- and 32-bit registers as lea destination operands for legacy reasons; however, such instructions are not generally useful for calculating actual memory addresses (though they might be useful for sneaky addition operations).
Any memory operand will work, regardless of its size.
call
ret
Application binary interface
(a) AL, (b) AX, (c) EAX, (d) RAX, (e) XMM0, (f) RAX
RCX for integer operands, XMM0 for floating-point/vector operands
RDX for integer operands, XMM1 for floating-point/vector operands
R8 for integer operands, XMM2 for floating-point/vector operands
R9 for integer operands, XMM3 for floating-point/vector operands
dword or sdword
qword
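
The note in this section that 64-bit add operands accept only 32-bit constants can be checked numerically. The following is a minimal Python sketch, assuming the CPU sign-extends the 32-bit immediate to 64 bits before the operation; the helper names are hypothetical and exist only for illustration.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def sign_extend_32_to_64(imm32: int) -> int:
    """Model how a 32-bit immediate is widened for a 64-bit operation."""
    imm32 &= MASK32
    if imm32 & 0x8000_0000:           # HO bit set: fill the upper 32 bits with 1s
        imm32 |= 0xFFFF_FFFF_0000_0000
    return imm32

def fits_in_imm32(value: int) -> bool:
    """True if a 64-bit pattern can be encoded as a sign-extended imm32."""
    value &= MASK64
    return sign_extend_32_to_64(value & MASK32) == value
```

Under this model, -1 and 7FFF_FFFFh are encodable, but 8000_0000h is not, because sign extension would turn it into FFFF_FFFF_8000_0000h.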

E.2 Answers to Questions in Chapter 2

9 × 10³ + 3 × 10² + 8 × 10¹ + 4 × 10⁰ + 5 × 10⁻¹ + 7 × 10⁻² + 6 × 10⁻³
(a) 10, (b) 12, (c) 7, (d) 9, (e) 3, (f) 15
(a) A, (b) E, (c) B, (d) D, (e) 2, (f) C, (g) CF, (h) 98D1
(a) 0001_0010_1010_1111, (b) 1001_1011_1110_0111, (c) 0100_1010, (d) 0001_0011_0111_1111, (e) 1111_0000_0000_1101, (f) 1011_1110_1010_1101, (g) 0100_1001_0011_1000
(a) 10, (b) 11, (c) 15, (d) 13, (e) 14, (f) 12
(a) 16, (b) 64, (c) 128, (d) 32, (e) 4, (f) 8, (g) 4
(a) 2, (b) 4, (c) 8, (d) 16
(a) 16, (b) 256, (c) 65,536, (d) 2
4
0 through 7
Bit 0
Bit 31
(a) 0, (b) 0, (c) 0, (d) 1
(a) 0, (b) 1, (c) 1, (d) 1
(a) 0, (b) 1, (c) 1, (d) 0
1
AND
OR
NOT
XOR
not
1111_1011
0000_0010
(a), (c), and (e)
neg
(a), (c), and (d)
jmp
label:
Carry, overflow, zero, and sign
JZ
JC
JA, JAE, JBE, JB, JE, JNE (and the synonyms JNA, JNAE, JNB, JNBE, plus other synonyms)
JG, JGE, JL, JLE, JE, JNE (and the synonyms JNG, JNGE, JNL, and JNLE)
ZF = 1 if the result of the shift is 0.
The HO bit shifted out of the operand goes into the carry flag.
If the next-to-HO bit is different from the HO bit before the shift, the OF will be set; otherwise, it is cleared, though only for 1-bit shifts.
The SF is set equal to the HO bit of the result.
ZF = 1 if the result of the shift is 0.
The LO bit shifted out of the operand goes into the carry flag.
The OF is set to the HO bit of the original operand (that is, it is set if the shift changes the HO bit from 1 to 0); this applies only to 1-bit shifts.
The SF is always clear after the SHR instruction because a 0 is always shifted into the HO bit of the result.
ZF = 1 if the result of the shift is 0.
The LO bit shifted out of the operand goes into the carry flag.
The OF is always clear after SAR as it is impossible for the sign to change.
The SF is set equal to the HO bit of the result, though technically it will never change.
The HO bit shifted out of the operand goes into the carry flag.
It doesn’t affect the ZF.
The LO bit shifted out of the operand goes into the carry flag.
It doesn’t affect the sign flag.
Multiplication by 2
Division by 2
Multiplication and division
Subtract them and see if their difference is less than a small error value.
A value that has a 1 bit in the HO mantissa position
7
30h through 39h
Apostrophes and quotes
UTF-8, UTF-16, and UTF-32
A scalar integer value that represents a single Unicode character
A block of 65,536 different Unicode characters
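
The flag behavior described in this section for a 1-bit shl can be modeled in a few lines. This is a minimal Python sketch for an 8-bit operand; the function name and the dictionary layout are assumptions made for illustration, not part of any assembler or library.

```python
def shl8(value: int) -> dict:
    """Model a 1-bit SHL on an 8-bit operand and report the resulting flags."""
    value &= 0xFF
    result = (value << 1) & 0xFF
    carry = (value >> 7) & 1          # HO bit shifted out of the operand
    sign = (result >> 7) & 1          # HO bit of the result
    return {
        "result": result,
        "CF": carry,
        "OF": carry ^ sign,           # set when the HO and next-to-HO bits differed
        "ZF": int(result == 0),
        "SF": sign,
    }
```

For example, shifting 40h yields 80h with the overflow and sign flags set, while shifting C0h yields 80h with the carry set and the overflow flag clear.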

E.3 Answers to Questions in Chapter 3

RIP
Operation code, the numeric encoding for a machine instruction
Static and scalar variables
±2GB
The address of the memory location to access
RAX
lea
The final address obtained after all addressing mode calculations are completed
1, 2, 4, or 8
2GB total memory
You can use the VAR[REG] addressing mode(s) to directly access elements of an array using a 64-bit register as an index into the array without first loading the address of the array into a separate base register.
The .data section can hold initialized data values; the .data? section can contain only uninitialized variables.
.code and .const
.data and .data?
An offset into a particular section (for example, .data)
Use some_ID label some_type to inform MASM that the following data is of type some_type when, in fact, it could be another type.
MASM will combine them into a single section.
Use the align 8 statement.
Memory management unit
If b is at an address that is at the last byte in an MMU page and the next page is not readable, loading a word from the memory location starting with b will produce a general protection fault.
A constant expression plus the base address of a variable in memory
To coerce the following operand type to a different type
Little-endian values appear in memory with their LO byte at the lowest address and the HO byte at the highest address. Big-endian values are the opposite: their HO byte appears at the lowest address, and their LO byte appears at the highest address in memory.
xchg al, ah or xchg ah, al
bswap eax
bswap rax
(a) Subtract 8 from RSP, (b) Store the value in RAX at the location pointed at by RSP.
(a) Load RAX from the 8 bytes pointed at by RSP, (b) Add 8 to RSP.
Reverse
Last in, first out
Move the data to and from the stack using the [RSP ± const] addressing mode.
The Windows ABI requires the stack to be aligned on a 16-byte boundary; pushing RAX might make the stack aligned on an 8-byte (but not 16-byte) boundary.
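
The little-endian and big-endian layouts described in this section, and the effect of bswap, can be illustrated with Python's integer-to-bytes conversions. This is a sketch only; `bswap32` is a hypothetical helper that models what `bswap eax` does to a 32-bit value.

```python
def bswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value, as bswap eax would."""
    return int.from_bytes((value & 0xFFFF_FFFF).to_bytes(4, "little"), "big")

value = 0x12345678
little = value.to_bytes(4, "little")   # LO byte at the lowest address
big = value.to_bytes(4, "big")         # HO byte at the lowest address
```

Here `little` holds the bytes 78h, 56h, 34h, 12h in increasing address order, and `big` holds them in the opposite order.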

E.4 Answers to Questions in Chapter 4

imul reg, constant
imul destreg, srcreg, constant
imul destreg, srcreg
A symbolic (named) constant for which MASM will substitute the literal constant for the name everywhere it appears in the source file
=, equ, textequ
Text equates substitute a textual string that can be any text; numeric equates must be assigned a numeric constant value that can be represented with a 64-bit integer.
Use the text delimiters < and > around the string literal; for example, <"a long string">.
An arithmetic expression whose value MASM can calculate during assembly
lengthof
The offset into the current section
this and $
Use the constant expression $-startingLocation.
Use a series of (numeric) equates, with each successive equate set to the value of the previous equate plus one; for example:
```
val1 = 0
val2 = val1 + 1
val3 = val2 + 1
etc.
```
Using the typedef directive
A pointer is a variable in memory that holds the address of another object in memory.
Load the pointer variable into a 64-bit register and use the register-indirect addressing mode to reference that address.
Using a qword data declaration, or another data type that is 64 bits in size
The offset operator
(a) Uninitialized pointers, (b) Using pointers to hold an illegal value, (c) Using a pointer after its storage has been freed (dangling pointers), (d) Failing to free storage after it is no longer being used (memory leak), (e) Accessing indirect data by using the wrong data type
Using a pointer after its storage has been freed
Failing to free storage after you are done using it
An aggregate type composed of smaller data objects
A sequence of characters ending with a 0 byte (or other 0 value)
A string containing a length value as its first element
A descriptor is a data type containing a pointer (to the character data), string length, and possibly other information that describes the string data.
A homogeneous collection of data elements (all with the same type)
The memory address of the first element of the array
array byte 10 dup (?) (as an example)
Simply fill in the initial values as the operand field of a byte, word, dword, or other data declaration directive. Also, you could use a sequence of one or more constant values as the dup operator operand; for example, 5 dup (2, 3).
(a) base_address + index * 4 (4 is the element size), (b) W[i,j] = base_address + (i * 8 + j) * 2 (2 is the element size), (c) R[i,j,k] = base_address + (((i * 4) + j) * 6 + k) * 8 (8 is the element size)
An organization for multidimensional arrays where you store the elements of each row in the array in contiguous memory locations and then store each row, one after the other, in memory
An organization for multidimensional arrays where you store the elements of each column in the array in contiguous memory locations and then store each column, one after the other, in memory
W word 4 dup (8 dup (?))
A heterogeneous collection of data elements (each field could have different types)
struct and ends
The dot operator
A heterogeneous collection of data elements (each field could have different types); the offset of each field in the union begins at 0.
union and ends
The fields of a record or struct appear at successive memory locations (each field has its own block of bytes); the fields of a union overlap one another, with each field beginning at offset zero in the union.
An unnamed union whose fields are treated as fields of the enclosing struct
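
The row-major address formulas in this section all follow one pattern, which can be sketched generically. The Python function below is illustrative only; its name and parameters are assumptions, and the size of the leftmost dimension of R (2 in the test values) is made up, since it does not affect the formula.

```python
def row_major_address(base, indices, dims, elem_size):
    """Address of an element in a row-major array.

    indices and dims are equal-length sequences, leftmost dimension first.
    """
    offset = 0
    for index, dim in zip(indices, dims):
        if not 0 <= index < dim:
            raise IndexError("index out of bounds")
        offset = offset * dim + index     # fold in the next dimension
    return base + offset * elem_size
```

For a 4 × 8 array of words, this reduces to base_address + (i * 8 + j) * 2; for a three-dimensional array whose inner dimensions are 4 and 6 with 8-byte elements, it reduces to base_address + (((i * 4) + j) * 6 + k) * 8.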

E.5 Answers to Questions in Chapter 5

It pushes the return address onto the stack (the address of the next instruction after the call) and then jumps to the address specified by the operand.
It pops a return address off the stack and moves the address into the RIP register, transferring control to the instruction just beyond the call to the current procedure.
After popping the return address, the CPU adds this value to RSP, removing that number of bytes of parameters from the stack.
The address of the instruction just beyond the call to the procedure
Namespace pollution occurs when so many symbols, identifiers, or names are defined in a source file that it becomes difficult to select new, unique names to use in that source file.
Put two colons after the name; for example, id::.
Use the option noscoped directive just before the procedure.
Use the push instruction to save the register values on the stack upon entry into the procedure; then use the pop instruction to restore the register values immediately before returning from the procedure.
Code is difficult to maintain. (A secondary issue, though minor, is that it takes more space.)
Performance—because you’re often preserving registers that don’t need to be preserved for the calling code
When the subroutine attempts to return, it uses the garbage you left on the stack as the return address, which usually produces undefined results (a program crash).
The subroutine uses whatever was on the stack prior to the call as the return address, with undefined results.
A collection of data, including parameters, local variables, the return address, and other items, associated with the call (activation) of a procedure
RBP
8 bytes (64 bits)

push rbp
mov  rbp, rsp
sub  rsp, sizeOfLocals ; Assuming there are local variables

```
leave
ret
```
and rsp, -16
The section of the source file (usually the body of a procedure) where the symbol is visible and usable in the program
From the moment storage is allocated for the variable to the point the system deallocates that storage
Variables whose storage is automatically allocated upon entry into a block of code (usually a procedure) and automatically deallocated upon exiting that block of code
Upon entry into a procedure (or the block of code associated with that automatic variable)
Using textequ directives or the MASM local directive
var1: –2; local2: –8 (MASM aligns variable on dword boundary); dVar: –9; qArray: –32 (base address of array is the lowest memory address); rlocal: –40 (base address of array is the lowest memory address); ptrVar: –48
option prologue:PrologueDef and option epilogue:EpilogueDef. Should also supply option prologue:none and option epilogue:none to turn this off.
Before MASM emits any code for the procedure, after all the local directives
Wherever a ret instruction appears
The actual parameter’s value
The memory address of the actual parameter’s value
RCX, RDX, R8, and R9 (or smaller subcomponents of these registers)
XMM0, XMM1, XMM2, or XMM3
On the stack, above the shadow locations (32 bytes) reserved for the arguments passed in the registers
Procedures are free to modify volatile registers without preserving their values; procedures must preserve the values of nonvolatile registers across a procedure invocation.
RAX, RCX, RDX, R8, R9, R10, R11, XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, and the HO 128 bits of all the YMM and ZMM registers
RBX, RSI, RDI, RBP, RSP, R12, R13, R14, R15, and XMM6–XMM15. Also, the direction flag must be clear upon return from a procedure.
Using positive offsets from the RBP register
Storage reserved on the stack for parameters the caller passes in the RCX, RDX, R8 and R9 registers
32 bytes
32 bytes
32 bytes

NOTE

Shadow storage is the same regardless of how many parameters you pass (including none).
parm1: RBP + 16; parm2: RBP + 24; parm3: RBP + 32; parm4: RBP + 40
```
mov rax, parm4
mov al, [rax]
```
lclVar1: RBP – 1; lclVar2: RBP – 4 (aligned to 2-byte boundary); lclVar3: RBP – 8; lclVar4: RBP – 16
By reference
Application binary interface
In the RAX register
The address of a procedure passed as a parameter
Indirectly. Typically by using a call parm instruction, where parm is the procedural parameter, a qword variable containing the address of the procedure. You could also load the parameter value into a 64-bit register and indirectly call the procedure through that register.
Allocate local storage to hold the register values to preserve and move the register data into that storage upon procedure entry, and then move the data back into the registers just before returning from the procedure.
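
The `and rsp, -16` answer in this section works because -16 has its low 4 bits clear in two's complement, so the AND rounds the stack pointer down to a multiple of 16. A minimal Python sketch of that arithmetic follows; `align_down_16` is a hypothetical name, and the values are treated as unsigned 64-bit integers.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def align_down_16(rsp: int) -> int:
    """Model `and rsp, -16`: clear the low 4 bits of a 64-bit value."""
    return rsp & (-16 & MASK64)
```

Because the stack grows toward lower addresses, rounding down never discards storage that has already been allocated.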

E.6 Answers to Questions in Chapter 6

AL for 8-bit operands, AX for 16-bit operands, EAX for 32-bit operands, and RAX for 64-bit operands
8-bit mul operation: 16 bits; 16-bit mul operation: 32 bits; 32-bit mul operation: 64 bits; 64-bit mul operation: 128 bits. The CPU puts the products in AX for 8×8 products, DX:AX for 16×16 products, EDX:EAX for 32×32 products, and RDX:RAX for 64×64 products.
The quotient in AL, AX, EAX, or RAX and the remainder in AH, DX, EDX, or RDX
Sign-extend AX into DX.
Zero-extend EAX into EDX.
A division by 0 and producing a quotient that will not fit into the accumulator register (AL, AX, EAX, or RAX)
By setting the carry and overflow flags
They scramble the flag; that is, they leave it in an undefined state.
The extended-precision imul instruction produces a 2 × n-bit result, uses implied operands (AL, AX, EAX, and RAX), and modifies the AH, DX, EDX, and RDX registers. Also, the extended-precision imul instruction does not allow constant operands, whereas the generic imul instruction does.
cbw, cwd, cdq, and cqo
They scramble all the flags, leaving them in an undefined state.
It sets the zero flag if the two operands are equal.
It sets the carry flag if the first operand is less than the second operand.
The sign and overflow flags are different if the first operand is less than the second operand; they are the same if the first operand is greater than or equal to the second operand.
An 8-bit register or memory location
They set the operand to 1 if the condition is true, or to 0 if the condition is false.
The test instruction is the same as the and instruction except it does not store the result to the destination (first) operand; it only sets the flags.
They both set the condition code flags the same way.
Supply the operand to be tested as the first (destination) operand and an immediate constant containing a single 1 bit in the bit position to test. After the test instruction, the zero flag will be clear if the tested bit is 1 and set if it is 0 (that is, ZF holds the inverse of the bit).

The following are some possible, not the only, solutions:

x = x + y

mov eax, x
add eax, y
mov x, eax

x = y – z

mov eax, y
sub eax, z
mov x, eax

x = y * z

mov  eax, y
imul eax, z
mov  x, eax

x = y + z * t

mov  eax, z
imul eax, t
add  eax, y
mov  x, eax

x = (y + z) * t

mov  eax, y
add  eax, z
imul eax, t
mov  x, eax

x = -((x*y)/z)

mov  eax, x
imul y          ; Note: Sign-extends into EDX
idiv z
neg  eax        ; Apply the leading negation
mov  x, eax

x = (y == z) && (t != 0)

mov   eax, y
cmp   eax, z
sete  bl
cmp   t, 0
setne bh
and   bl, bh
movzx eax, bl   ; Because x is a 32-bit integer
mov   x, eax

The following are some possible, not the only, solutions:

x = x * 2

shl   x, 1

x = y * 5

mov   eax, y
lea   eax, [eax][eax*4]
mov   x, eax

Here is another solution:

mov   eax, y
mov   ebx, eax
shl   eax, 2
add   eax, ebx
mov   x, eax

x = y * 8

mov   eax, y
shl   eax, 3
mov   x, eax

x = x / 2

shr   x, 1

x = y / 8

mov   ax, y
shr   ax, 3
mov   x, ax

x = z / 10

movzx eax, z
imul  eax, 6554  ; 6554 = ceil(65536 / 10); exact for all z < 16389
shr   eax, 16
mov   x, ax
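
The scaled-multiply division above (multiply by roughly 65,536 / 10, then shift right 16 bits) is an approximation, so it is worth checking where it agrees with true integer division. The following Python sketch models the instruction sequence for an unsigned 16-bit z; the function names are hypothetical.

```python
def div10_scaled(z: int, multiplier: int = 6554) -> int:
    """Model `imul eax, multiplier` followed by `shr eax, 16` on a 16-bit z."""
    return ((z & 0xFFFF) * multiplier) >> 16

def first_mismatch(multiplier: int):
    """Smallest 16-bit z where the scaled result differs from z // 10."""
    for z in range(0x10000):
        if div10_scaled(z, multiplier) != z // 10:
            return z
    return None
```

Under this model, the multiplier 6554 first disagrees with z // 10 at z = 16389, whereas 6553 already disagrees at z = 10, so the rounded-up constant is the safer choice for small operands.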

x = x + y

fld   x
fld   y
faddp
fstp  x

x = y – z

fld   y
fld   z
fsubp
fstp  x

x = y * z

fld   y
fld   z
fmulp
fstp  x

x = y + z * t

fld   y
fld   z
fld   t
fmulp
faddp
fstp  x

x = (y + z) * t

fld   y
fld   z
faddp
fld   t
fmulp
fstp  x

x = -((x * y)/z)

fld   x
fld   y
fmulp
fld   z
fdivp
fchs
fstp  x

x = x + y

movss xmm0, x
addss xmm0, y
movss x, xmm0

x = y – z

movss xmm0, y
subss xmm0, z
movss x, xmm0

x = y * z

movss xmm0, y
mulss xmm0, z
movss x, xmm0

x = y + z * t

movss xmm0, z
mulss xmm0, t
addss xmm0, y
movss x, xmm0

b = x < y

fld    y
fld    x
fcomip st(0), st(1)
setb   b
fstp   st(0)

b = x >= y && x < z

fld    y
fld    x
fcomip st(0), st(1)
setae  bl
fstp   st(0)
fld    z
fld    x
fcomip st(0), st(1)
setb   bh
fstp   st(0)
and    bl, bh
mov    b, bl

E.7 Answers to Questions in Chapter 7

Use the lea instruction or the offset operator.
option noscoped
option scoped
jmp reg64 and jmp mem64
A piece of code that maintains history information in variables or via the program counter
If the second letter of the jump mnemonic is n, remove the n; otherwise, insert an n as the second character.
jpo and jpe

NOTE

Technically, the jcxz, jecxz, and jrcxz instructions are also exceptions.
A short code sequence used to extend the range of a jump or call instruction beyond the ±2GB range
cmovcc reg, src, where cc is one of the conditional suffixes (which follow a conditional jump), reg is a 16-, 32-, or 64-bit register, and src is a source register or memory location that is the same size as reg.
You can conditionally execute a large set of different types of instructions by using a conditional jump without the time penalty of a control transfer.
The destination has to be a register, and 8-bit registers are not allowed.
Complete Boolean evaluation of an expression evaluates all components of the expression, even if it is not logically necessary to do so; short-circuit evaluation stops as soon as it determines that the expression must be true or false.
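
The mnemonic-inversion rule given in this section (toggle an n after the j, with jpo and jpe as the exceptions) can be sketched as a small function. This Python version is illustrative only; `invert_jcc` is a hypothetical name, and it rejects the jcxz family because those instructions have no opposite form.

```python
def invert_jcc(mnemonic: str) -> str:
    """Return the opposite conditional jump using the 'second letter n' rule."""
    m = mnemonic.lower()
    exceptions = {"jpo": "jpe", "jpe": "jpo"}
    if m in exceptions:
        return exceptions[m]
    if m in ("jcxz", "jecxz", "jrcxz"):
        raise ValueError(f"{m} has no opposite conditional jump")
    if m[1] == "n":
        return "j" + m[2:]            # remove the n
    return "jn" + m[1:]               # insert an n
```

This is the transformation applied when converting an if statement to assembly: the jump that skips the body tests the opposite of the high-level condition.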

if(x == y || z > t)
{
    Do something 
}
    mov  eax, x
    cmp  eax, y
    sete bl
    mov  eax, z
    cmp  eax, t
    seta bh
    or   bl, bh
    jz   skipIF
     Code for statements that "do something"
skipIF:

if(x != y && z < t)
{
     THEN statements
}
else
{
     ELSE statements
}
    mov   eax, x
    cmp   eax, y
    setne bl
    mov   eax, z
    cmp   eax, t
    setb  bh
    and   bl, bh
    jz    doElse
     Code for THEN statements
    jmp   endOfIF

doElse:
     Code for ELSE statements
endOfIF:

1st IF:
    mov  eax, x
    cmp  eax, y
    je   doBlock
    mov  eax, z
    cmp  eax, t
    jng  skipIF     ; Skip unless z > t (signed compare)
doBlock:
     Code for statements that "do something"
skipIF:

2nd IF:
    mov   eax, x
    cmp   eax, y
    je    doElse
    mov   eax, z
    cmp   eax, t
    jnl   doElse
     Code for THEN statements
    jmp   endOfIF

doElse:
     Code for ELSE statements
endOfIF:

switch(s)
{
   case 0:   case 0 code  break;
   case 1:   case 1 code  break;
   case 2:   case 2 code  break;
   case 3:   case 3 code  break;
}

    mov eax, s ; Zero-extends!
    cmp eax, 3
    ja  skipSwitch
    lea rbx, jmpTbl
    jmp [rbx][rax * 8]
jmpTbl qword case0, case1, case2, case3

case0: case 0 code
       jmp skipSwitch

case1: case 1 code
       jmp skipSwitch

case2: case 2 code
       jmp skipSwitch

case3: case 3 code

skipSwitch:

switch(t)
{
   case 2:  case 2 code break;
   case 4:  case 4 code break;
   case 5:  case 5 code break;
   case 6:  case 6 code break;
   default: default code
}
    mov eax, t ; Zero-extends!
    cmp eax, 2
    jb  swDefault
    cmp eax, 6
    ja  swDefault
    lea rbx, jmpTbl
    jmp [rbx][rax * 8 - 2 * 8]
jmpTbl qword case2, swDefault, case4, case5, case6

swDefault: default code
       jmp endSwitch

case2: case 2 code
       jmp endSwitch

case4: case 4 code
       jmp endSwitch

case5: case 5 code
       jmp endSwitch

case6: case 6 code

endSwitch:

switch(u)
{
   case 10:   case 10 code  break;
   case 11:   case 11 code  break;
   case 12:   case 12 code  break;
   case 25:   case 25 code  break;
   case 26:   case 26 code  break;
   case 27:   case 27 code  break;
   default:   default code
} 
     lea rbx, jmpTbl1  ; Assume cases 10-12
     mov eax, u        ; Zero-extends!
     cmp eax, 10
     jb  swDefault
     cmp eax, 12
     jbe sw1
     cmp eax, 25
     jb  swDefault
     cmp eax, 27
     ja  swDefault
     lea rbx, jmpTbl2
     jmp [rbx][rax * 8 - 25 * 8]
sw1: jmp [rbx][rax * 8 - 10 * 8]
jmpTbl1 qword case10, case11, case12
jmpTbl2 qword case25, case26, case27

swDefault: default code
       jmp endSwitch

case10: case 10 code
       jmp endSwitch

case11: case 11 code
       jmp endSwitch

case12: case 12 code
       jmp endSwitch

case25: case 25 code
       jmp endSwitch

case26: case 26 code
       jmp endSwitch

case27: case 27 code

endSwitch:

while(i < j)
{
     Code for loop body
}

whlLp:
     mov eax, i
     cmp eax, j
     jnl endWhl
      Code for loop body
     jmp whlLp
endWhl:

while(i < j && k != 0)
{
     Code for loop body, part a
    if(m == 5) continue;
     Code for loop body, part b
    if(n < 6) break;
     Code for loop body, part c
}

; Assume short-circuit evaluation:

whlLp:
     mov eax, i
     cmp eax, j
     jnl endWhl
     mov eax, k
     cmp eax, 0
     je  endWhl
      Code for loop body, part a
     cmp m, 5
     je  whlLp
      Code for loop body, part b
     cmp n, 6
     jl  endWhl
      Code for loop body, part c
     jmp whlLp
endWhl:

do
{
   Code for loop body
} while(i != j);

doLp:
   Code for loop body
     mov eax, i
     cmp eax, j
     jne doLp

do
{
   Code for loop body, part a
    if(m != 5) continue;
   Code for loop body, part b
    if(n == 6) break;
   Code for loop body, part c
} while(i < j && k > j);

doLp:
   Code for loop body, part a
     cmp m, 5
     jne doCont
   Code for loop body, part b
     cmp n, 6
     je  doExit
   Code for loop body, part c
doCont:     mov eax, i
     cmp eax, j
     jnl doExit
     mov eax, k
     cmp eax, j
     jg  doLp
doExit:

for(int i = 0; i < 10; ++i)
{
   Code for loop body
}

       mov i, 0
forLp: cmp i, 10
       jnl forDone
        Code for loop body
       inc i
       jmp forLp
forDone:

E.8 Answers to Questions in Chapter 8

You compute x = y + z as follows:

mov rax, qword ptr y
add rax, qword ptr z
mov qword ptr x, rax
mov rax, qword ptr y[8]
adc rax, qword ptr z[8]
mov qword ptr x[8], rax

mov rax, qword ptr y
add rax, qword ptr z
mov qword ptr x, rax
mov eax, dword ptr z[8] 
adc eax, dword ptr y[8]
mov dword ptr x[8], eax

mov eax, dword ptr y
add eax, dword ptr z
mov dword ptr x, eax
mov ax, word ptr z[4]
adc ax, word ptr y[4]
mov word ptr x[4], ax
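
The add/adc pattern used in the answers above can be modeled to see how the carry links the two halves. This is a minimal Python sketch for the 128-bit case, assuming unsigned operands; `add128` is a hypothetical helper, not a library function.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def add128(y: int, z: int) -> int:
    """Add two 128-bit values the way an add/adc pair does, limb by limb."""
    lo = (y & MASK64) + (z & MASK64)                      # add: LO qwords
    carry = lo >> 64                                      # carry flag from the LO add
    hi = ((y >> 64) & MASK64) + ((z >> 64) & MASK64) + carry   # adc: HO qwords
    return ((hi & MASK64) << 64) | (lo & MASK64)
```

The result wraps modulo 2¹²⁸, just as the final carry out of the HO adc is discarded in the assembly version.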

You compute x = y – z as follows:

mov rax, qword ptr y
sub rax, qword ptr z
mov qword ptr x, rax
mov rax, qword ptr y[8]
sbb rax, qword ptr z[8]
mov qword ptr x[8], rax
mov rax, qword ptr y[16]
sbb rax, qword ptr z[16]
mov qword ptr x[16], rax

mov rax, qword ptr y
sub rax, qword ptr z
mov qword ptr x, rax
mov eax, dword ptr y[8]
sbb eax, dword ptr z[8]
mov dword ptr x[8], eax

mov rax, qword ptr y
mul qword ptr z
mov qword ptr x, rax
mov rbx, rdx

mov rax, qword ptr y
mul qword ptr z[8]
add rax, rbx
adc rdx, 0
mov qword ptr x[8], rax
mov rbx, rdx

mov rax, qword ptr y[8]
mul qword ptr z
add qword ptr x[8], rax
adc rbx, rdx
setc cl                  ; Save the carry out of this sum
movzx ecx, cl

mov rax, qword ptr y[8]
mul qword ptr z[8]
add rax, rbx
mov qword ptr x[16], rax
adc rdx, rcx             ; Include the saved carry
mov qword ptr x[24], rdx

mov  rax, qword ptr y[8]
cqo
idiv qword ptr z
mov  qword ptr x[8], rax
mov  rax, qword ptr y
idiv qword ptr z
mov  qword ptr x, rax

The conversions are as follows:

; Note: order of comparison (HO vs. LO) is irrelevant
; for "==" comparison.

    mov rax, qword ptr x[8]
    cmp rax, qword ptr y[8]
    jne skipElse
    mov rax, qword ptr x
    cmp rax, qword ptr y
    jne skipElse
    then code
skipElse:

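    mov rax, qword ptr x[8]
    cmp rax, qword ptr y[8]
    jb  doThen      ; HO(x) < HO(y), so x < y
    ja  skipElse    ; HO(x) > HO(y), so x > y
    mov rax, qword ptr x
    cmp rax, qword ptr y
    jnb skipElse
doThen:
    then code
skipElse:

    mov rax, qword ptr x[8]
    cmp rax, qword ptr y[8]
    ja  doThen      ; HO(x) > HO(y), so x > y
    jb  skipElse    ; HO(x) < HO(y), so x < y
    mov rax, qword ptr x
    cmp rax, qword ptr y
    jna skipElse
doThen:
    then code
skipElse: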

; Note: order of comparison (HO vs. LO) is irrelevant
; for "!=" comparison.

    mov rax, qword ptr x[8]
    cmp rax, qword ptr y[8]
    jne doElse
    mov rax, qword ptr x
    cmp rax, qword ptr y
    je skipElse
doElse:
    then code
skipElse:

The conversions are as follows:

; Note: order of comparison (HO vs. LO) is irrelevant
; for "==" comparison.

    mov eax, dword ptr x[8]
    cmp eax, dword ptr y[8]
    jne skipElse
    mov rax, qword ptr x
    cmp rax, qword ptr y
    jne skipElse
    then code
skipElse:

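    mov eax, dword ptr x[8]
    cmp eax, dword ptr y[8]
    jb  doThen      ; HO(x) < HO(y), so x < y
    ja  skipElse    ; HO(x) > HO(y), so x > y
    mov rax, qword ptr x
    cmp rax, qword ptr y
    jnb skipElse
doThen:
    then code
skipElse:

    mov eax, dword ptr x[8]
    cmp eax, dword ptr y[8]
    ja  doThen      ; HO(x) > HO(y), so x > y
    jb  skipElse    ; HO(x) < HO(y), so x < y
    mov rax, qword ptr x
    cmp rax, qword ptr y
    jna skipElse
doThen:
    then code
skipElse: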

The conversions are as follows:

neg qword ptr x[8]
neg qword ptr x
sbb qword ptr x[8], 0

xor rax, rax
xor rdx, rdx
sub rax, qword ptr x
sbb rdx, qword ptr x[8]
mov qword ptr x, rax
mov qword ptr x[8], rdx

mov rax, qword ptr y
mov rdx, qword ptr y[8]
neg rdx
neg rax
sbb rdx, 0
mov qword ptr x, rax
mov qword ptr x[8], rdx

xor rdx, rdx
xor rax, rax
sub rax, qword ptr y
sbb rdx, qword ptr y[8]
mov qword ptr x, rax
mov qword ptr x[8], rdx
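
The neg/neg/sbb sequence shown first in this group relies on neg setting the carry flag whenever its operand is nonzero. The following Python sketch models that sequence on a 128-bit value split into two 64-bit limbs; `neg128` is a hypothetical name used only for illustration.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def neg128(x: int) -> int:
    """Negate a 128-bit value using the neg HO / neg LO / sbb HO, 0 sequence."""
    lo = x & MASK64
    hi = (x >> 64) & MASK64
    hi = (-hi) & MASK64               # neg qword ptr x[8]
    carry = int(lo != 0)              # neg sets CF unless its operand is 0
    lo = (-lo) & MASK64               # neg qword ptr x
    hi = (hi - carry) & MASK64        # sbb qword ptr x[8], 0
    return (hi << 64) | lo
```

The result matches a direct two's complement negation modulo 2¹²⁸, which is what the alternative sub/sbb-from-zero sequence computes.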

The conversions are as follows:

mov rax, qword ptr y
and rax, qword ptr z
mov qword ptr x, rax
mov rax, qword ptr y[8]
and rax, qword ptr z[8]
mov qword ptr x[8], rax

mov rax, qword ptr y
or  rax, qword ptr z
mov qword ptr x, rax
mov rax, qword ptr y[8]
or  rax, qword ptr z[8]
mov qword ptr x[8], rax

mov rax, qword ptr y
xor rax, qword ptr z
mov qword ptr x, rax
mov rax, qword ptr y[8]
xor rax, qword ptr z[8]
mov qword ptr x[8], rax

mov rax, qword ptr y
not rax
mov qword ptr x, rax
mov rax, qword ptr y[8]
not rax
mov qword ptr x[8], rax

mov rax, qword ptr y
shl rax, 1
mov qword ptr x, rax
mov rax, qword ptr y[8]
rcl rax, 1
mov qword ptr x[8], rax

mov rax, qword ptr y[8]
shr rax, 1
mov qword ptr x[8], rax
mov rax, qword ptr y
rcr rax, 1
mov qword ptr x, rax

mov rax, qword ptr y[8]
sar rax, 1
mov qword ptr x[8], rax
mov rax, qword ptr y
rcr rax, 1
mov qword ptr x, rax

rcl qword ptr x, 1
rcl qword ptr x[8], 1

rcr qword ptr x[8], 1
rcr qword ptr x, 1

E.9 Answers to Questions in Chapter 9

btoh        proc

            mov     ah, al      ; Do HO nibble first
            shr     ah, 4       ; Move HO nibble to LO
            or      ah, '0'     ; Convert to char
            cmp     ah, '9' + 1 ; Is it "A" to "F"?
            jb      AHisGood
            
; Convert 3Ah to 3Fh to "A" to "F".

            add     ah, 7

; Process the LO nibble here.
            
AHisGood:   and     al, 0Fh     ; Strip away HO nibble
            or      al, '0'     ; Convert to char
            cmp     al, '9' + 1 ; Is it "A" to "F"?
            jb      ALisGood
            
; Convert 3Ah to 3Fh to "A" to "F".

            add     al, 7
ALisGood:   ret
btoh        endp
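
The conversion trick in btoh (OR the nibble with '0', then add 7 if the result is past '9') can be checked against all 256 byte values. This Python sketch mirrors the logic of the procedure above; it returns the two characters as a string rather than in AH and AL, which is a simplification for illustration.

```python
def btoh(al: int) -> str:
    """Convert a byte to two hex digits using the or-'0'-then-add-7 trick."""
    def nibble_to_char(nibble: int) -> str:
        code = nibble | ord("0")      # 0..15 becomes 30h..3Fh
        if code > ord("9"):           # 3Ah..3Fh must become 'A'..'F'
            code += 7
        return chr(code)
    return nibble_to_char((al >> 4) & 0xF) + nibble_to_char(al & 0xF)
```

The constant 7 is the gap between the character after '9' (3Ah) and 'A' (41h).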

8
Call qToStr twice: once with the HO 64 bits and once with the LO 64 bits. Then concatenate the two strings.
fbstp
If the input value is negative, emit a hyphen (-) character and negate the value; then call the unsigned decimal conversion function. If the number is 0 or positive, just call the unsigned decimal conversion function.

; Inputs:
;    RAX -   Number to convert to string.
;    CL  -   minDigits (minimum print positions).
;    CH  -   Padding character.
;    RDI -   Buffer pointer for output string.

It will produce the full string required; the minDigits parameter specifies the minimum string size.

; On Entry:

   ; r10        - Real10 value to convert.
   ;              Passed in ST(0).

   ; fWidth     - Field width for the number (note that this
   ;              is an *exact* field width, not a minimum
   ;              field width).
   ;              Passed in EAX (RAX).

   ; decimalpts - # of digits to display after the decimal pt.
   ;              Passed in EDX (RDX). 

   ; fill       - Padding character if the number is smaller
   ;              than the specified field width.
   ;              Passed in CL (RCX).

   ; buffer     - r10ToStr stores the resulting characters
   ;              in this string.
   ;              Address passed in RDI.

   ; maxLength  - Maximum string length.
   ;              Passed in R8D (R8).

A string containing exactly fWidth characters.

; On Entry:

;    e10     - Real10 value to convert.
;              Passed in ST(0).

;    width   - Field width for the number (note that this
;              is an *exact* field width, not a minimum
;              field width).
;              Passed in RAX (LO 32 bits).

;    fill    - Padding character if the number is smaller
;              than the specified field width.
;              Passed in RCX.

;    buffer  - e10ToStr stores the resulting characters in
;              this buffer (passed in EDI).
;              Passed in RDI (LO 32 bits).

;    expDigs - Number of exponent digits (2 for real4,
;              3 for real8, and 4 for real10).
;              Passed in RDX (LO 8 bits).

A character that separates a sequence of characters from other such sequences, such as beginning or ending a numeric string
Illegal character on input and numeric overflow during conversion

E.10 Answers to Questions in Chapter 10

The set of all possible input (parameter) values
The set of all possible function output (return) values
Computes AL = [RBX + AL × 1]
Byte values: domain is the set of all integers in the range 0 to 255, and the range is also the set of all integers in the range 0 to 255.

The code implementing the functions is as follows:

```
lea rbx, f
mov al, input
xlat
```

lea rbx, f
movzx rax, input
mov ax, [rbx][rax * 2]

lea rbx, f
movzx rax, input
mov al, [rbx][rax * 1]

lea rbx, f
movzx rax, input
mov ax, [rbx][rax * 2]

Modifying input values that are out of a specific range so that they lie within the input domain of the function
Main memory is so slow that it might be faster to compute the value than to look it up via a table.
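
Domain conditioning and table lookup, as described in this section, combine naturally: the input is first forced into the table's domain and then used as an index. The Python sketch below assumes a hypothetical table of sine values for 0 through 359 degrees, scaled by 1000; neither the table nor the function name comes from the text.

```python
import math

# Hypothetical lookup table: sin(d) * 1000 for d = 0..359 degrees.
SINE_TABLE = [round(math.sin(math.radians(d)) * 1000) for d in range(360)]

def sine_scaled(degrees: int) -> int:
    """Look up sin(degrees) * 1000 after conditioning the input to 0..359."""
    return SINE_TABLE[degrees % 360]  # the modulo is the domain conditioning
```

Without the modulo step, inputs such as 450 or -90 would index outside the table; with it, they map to the entries for 90 and 270 degrees.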

E.11 Answers to Questions in Chapter 11

Use the cpuid instruction.
Because Intel and AMD have different feature sets
EAX = 1
ECX bit 20
(a) _TEXT, (b) _DATA, (c) _BSS, (d) CONST
PARA or 16 bytes

data  segment align(64) 'DATA'
           .
           .
           .
data  ends

AVX/AVX2/AVX-256/AVX-512
A data type within a SIMD register; typically, 1, 2, 4, or 8 bytes wide
Scalar instructions operate on a single piece of data; vector instructions operate, simultaneously, on two or more pieces of data.
16 bytes
32 bytes
64 bytes
movd
movq
movaps, movapd, and movdqa
movups, movupd, and movdqu

NOTE

lddqu also works.
movhps or movhpd
movddup
pshufb
pshufd, though pshufb could also work
(v)pextrb, (v)pextrw, (v)pextrd, or (v)pextrq
(v)pinsrb, (v)pinsrw, (v)pinsrd, or (v)pinsrq
It takes the bits in the second operand, inverts them, and then logically ANDs these inverted bits with the first (destination) operand.
pslldq
psrldq
psllq
psrlq
The carry out of the HO bit is lost.
In a vertical addition, the CPU sums values found in the same lane of two separate XMM registers; in a horizontal addition, the CPU sums values found in adjacent lanes of the same XMM register.
In the destination XMM register: the instruction stores 0FFh in the corresponding lane for true and 0 for false.
Swap the operands of the pcmpgtq instruction.
It copies the HO bit of each byte in an XMM register into the corresponding bit position in the LO 16 bits of a general-purpose register; for example, bit 7 of lane 0 goes into bit 0.
(a) 4 on SSE, 8 on AVX2, (b) 2 on SSE, 4 on AVX2
and rax, -16
pxor xmm0, xmm0
pcmpeqb xmm1, xmm1
include
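Several of the answers above (pxor to zero a register, pcmpeqb for lane-by-lane comparison, and pmovmskb to collect the comparison results) can be combined. The following is a minimal sketch, not code from the chapter: the procedure name `firstZeroByte` and its return convention are assumptions made for illustration.

```asm
        option  casemap:none
        .code

; firstZeroByte (hypothetical name):
; RCX = address of a 16-byte block (no alignment needed; movdqu is used).
; Returns in EAX the index (0..15) of the first zero byte, or 16 if none.

firstZeroByte proc
        pxor     xmm0, xmm0             ; XMM0 = all zeros
        movdqu   xmm1, xmmword ptr [rcx]
        pcmpeqb  xmm1, xmm0             ; Lanes equal to 0 become 0FFh, others 0
        pmovmskb eax, xmm1              ; HO bit of each lane -> bits 0..15 of EAX
        bsf      eax, eax               ; Index of lowest set bit; ZF = 1 if none
        mov      edx, 16                ; mov does not change the flags
        cmovz    eax, edx               ; No zero byte in the block
        ret
firstZeroByte endp
        end
```

The sketch assumes the Microsoft x64 calling convention and a CPU with SSE2, which every x86-64 processor provides.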

E.12 Answers to Questions in Chapter 12

and/andn
btr
or
bts
xor
btc
test/and
bt
pext
pdep
bextr
bsf
bsr
Invert the register and use bsf.
Invert the register and use bsr.
popcnt
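The "invert the register and use bsf" answer above can be sketched as follows. The procedure name `firstClearBit` and the choice of -1 as the "no clear bit" result are assumptions for illustration only.

```asm
        option  casemap:none
        .code

; firstClearBit (hypothetical name):
; RCX = value to scan. Returns in RAX the index of the lowest 0 bit,
; or -1 if every bit in RCX is set.

firstClearBit proc
        mov     rax, rcx
        not     rax                     ; 0 bits become 1 bits
        bsf     rax, rax                ; ZF = 1 if the inverted value is 0
        mov     rdx, -1                 ; mov does not change the flags
        cmovz   rax, rdx                ; No clear bit was found
        ret
firstClearBit endp
        end
```

Replacing bsf with bsr in the same sketch would find the highest clear bit instead.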

E.13 Answers to Questions in Chapter 13

Compile-time language
During the assembly and compilation process
echo (or %out)
.err
The = directive
!
It replaces an expression with text representing the value of that compile-time expression.
It replaces a text symbol with the expansion of its text.
It concatenates two or more textual strings at assembly time and stores the result into a text symbol.
It searches for a substring within a larger string in a MASM text object and returns the index of the substring within that object, or 0 if the substring does not appear in the larger string.
It returns the length of a MASM text string.
It returns a substring from a larger MASM text string.
if, elseif, else, and endif
while, for, forc, and endm
forc
macro, endm
Specify the macro’s name where you want the text expansion to occur.
As operands to the macro directive
Specify :req after the parameter name in the macro operand field.
Macro parameters are optional, by default, if they don’t have the :req suffix.
Use the :vararg suffix after the last macro parameter declaration.
Use conditional assembly directives such as ifb or ifnb to see if the actual macro argument is blank.
Use the local directive.
exitm
Use exitm <text>.
opattr
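To tie together the answers above on :req, :vararg, and local, here is a minimal sketch of two macros and their invocation. The macro names `zeroRegs` and `absReg`, the label `isPos`, and the procedure `demo` are hypothetical and do not come from the chapter.

```asm
        option  casemap:none

; zeroRegs (hypothetical macro): emits "xor reg, reg" for each argument.
; The first argument is required; any further arguments are optional.

zeroRegs macro  first:req, rest:vararg
        xor     first, first
        for     r, <rest>
        xor     r, r
        endm
        endm

; absReg (hypothetical macro): replaces reg with its absolute value.
; The local directive gives each expansion its own copy of isPos.

absReg  macro   reg:req
        local   isPos
        test    reg, reg
        jns     isPos
        neg     reg
isPos:
        endm

        .code
demo    proc
        zeroRegs rax, rcx, rdx          ; Expands to three xor instructions
        mov     rax, -5
        absReg  rax                     ; RAX = 5
        absReg  rax                     ; Second expansion gets a fresh label
        ret
demo    endp
        end
```

Without the local directive, the second `absReg` invocation would define `isPos` twice in the same procedure, which MASM reports as a symbol redefinition.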

E.14 Answers to Questions in Chapter 14

Bytes, words, dwords, and qwords
movs, cmps, scas, stos, and lods
Bytes and words
RSI, RDI, and RCX
RSI and RDI
RCX, RDI, and AL
RDI and EAX
Dir = 0
Dir = 1
Clear the direction flag; alternatively, preserve its value.
Clear
movs and stos
When the source and destination blocks overlap and the source address starts at a lower memory address than the destination block
This is the default condition; you would also clear the direction flag when the source and destination blocks overlap and the source address starts at a higher memory address than the destination block.
Portions of the source block can be replicated in the destination block.
repe
Direction flag should be clear.
No, string instructions test RCX prior to the string operation when using a repeat prefix.
scasb
stos
lods and stos
lods
Verify that the CPU supports SSE 4.2 instructions.
pcmpistri and pcmpistrm
pcmpestri and pcmpestrm
RAX holds the src1 length, and RDX holds the src2 length.
Equal any, or possibly, equal range
Equal each
Equal ordered
The pcmpXstrY instructions always read 16 bytes of memory, even if the string is shorter than this, and there is the possibility of an MMU page fault when it reads data beyond the end of the string.
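The answers above on the direction flag and overlapping blocks can be illustrated with a small block-move routine. This is a minimal sketch under the Microsoft x64 calling convention; the procedure name `moveBytes` and its argument order are assumptions, and it omits the unwind information a production procedure would declare.

```asm
        option  casemap:none
        .code

; moveBytes (hypothetical name):
; RCX = destination, RDX = source, R8 = byte count.
; Copies correctly even when the two blocks overlap.

moveBytes proc
        push    rsi                     ; RSI and RDI are nonvolatile in the
        push    rdi                     ; Microsoft x64 ABI
        mov     rdi, rcx
        mov     rsi, rdx
        mov     rcx, r8
        cmp     rsi, rdi
        jae     forward                 ; Source at or above dest: low to high

        lea     rsi, [rsi + rcx - 1]    ; Source below dest: start at last byte
        lea     rdi, [rdi + rcx - 1]
        std                             ; Dir = 1, pointers decrement
        rep movsb
        cld                             ; Restore Dir = 0 before returning
        jmp     done

forward:
        cld                             ; Dir = 0, pointers increment
        rep movsb
done:
        pop     rdi
        pop     rsi
        ret
moveBytes endp
        end
```

Because rep tests RCX before each iteration, a count of 0 copies nothing on either path.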

E.15 Answers to Questions in Chapter 15

ifndef and endif
The assembly of a source file plus any files it includes or indirectly includes
public
extern and externdef
externdef
abs
proc
nmake.exe
Multiple blocks of the following form:
```
target: dependencies
    commands
```
A dependent file is one that the current file depends on for its proper operation; the dependent file must be updated and built prior to the compilation and linking of the current file.
Delete old object and executable files, and delete other cruft.
A collection of object files

E.16 Answers to Questions in Chapter 16

/subsystem:console
https://www.masm32.com/
It slows the assembly process.
/entry:procedure_name
MessageBox
Code that surrounds a call to a function and that changes the way you call the function (for example, parameter order and location)
__imp_CreateFileA
__imp_GetLastError
