E
Answers to Questions

E.1 Answers to Questions in Chapter 1

  1. cmd.exe
  2. ml64.exe
  3. Address, data, and control
  4. AL, AH, AX, and EAX
  5. BL, BH, BX, and EBX
  6. SIL, SI, and ESI
  7. R8B, R8W, and R8D
  8. FLAGS, EFLAGS, or RFLAGS
  9. (a) 2, (b) 4, (c) 16, (d) 32, (e) 8
  10. Any 8-bit register and any constant that can be represented with 8 bits
  11. 32
  12. Destination   Constant size (bits)
      RAX           32
      EAX           32
      AX            16
      AL            8
      AH            8
      mem32         32
      mem64         32
  13. 64
  14. Any memory operand will work, regardless of its size.
  15. call
  16. ret
  17. Application binary interface
  18. (a) AL, (b) AX, (c) EAX, (d) RAX, (e) XMM0, (f) RAX
  19. RCX for integer operands, XMM0 for floating-point/vector operands
  20. RDX for integer operands, XMM1 for floating-point/vector operands
  21. R8 for integer operands, XMM2 for floating-point/vector operands
  22. R9 for integer operands, XMM3 for floating-point/vector operands
  23. dword or sdword
  24. qword

E.2 Answers to Questions in Chapter 2

  1. 9 × 10^3 + 3 × 10^2 + 8 × 10^1 + 4 × 10^0 + 5 × 10^-1 + 7 × 10^-2 + 6 × 10^-3
  2. (a) 10, (b) 12, (c) 7, (d) 9, (e) 3, (f) 15
  3. (a) A, (b) E, (c) B, (d) D, (e) 2, (f) C, (g) CF, (h) 98D1
  4. (a) 0001_0010_1010_1111, (b) 1001_1011_1110_0111, (c) 0100_1010, (d) 0001_0011_0111_1111, (e) 1111_0000_0000_1101, (f) 1011_1110_1010_1101, (g) 0100_1001_0011_1000
  5. (a) 10, (b) 11, (c) 15, (d) 13, (e) 14, (f) 12
  6. (a) 16, (b) 64, (c) 128, (d) 32, (e) 4, (f) 8, (g) 4
  7. (a) 2, (b) 4, (c) 8, (d) 16
  8. (a) 16, (b) 256, (c) 65,536, (d) 2
  9. 4
  10. 0 through 7
  11. Bit 0
  12. Bit 31
  13. (a) 0, (b) 0, (c) 0, (d) 1
  14. (a) 0, (b) 1, (c) 1, (d) 1
  15. (a) 0, (b) 1, (c) 1, (d) 0
  16. 1
  17. AND
  18. OR
  19. NOT
  20. XOR
  21. not
  22. 1111_1011
  23. 0000_0010
  24. (a) and (c) and (e)
  25. neg
  26. (a) and (c) and (d)
  27. jmp
  28. label:
  29. Carry, overflow, zero, and sign
  30. JZ
  31. JC
  32. JA, JAE, JBE, JB, JE, JNE (and the synonyms JNA, JNAE, JNB, JNBE, plus other synonyms)
  33. JG, JGE, JL, JLE, JE, JNE (and the synonyms JNG, JNGE, JNL, and JNLE)
  34. ZF = 1 if the result of the shift is 0.
  35. The HO bit shifted out of the operand goes into the carry flag.
  36. If the next-to-HO bit is different from the HO bit before the shift, the OF will be set; otherwise, it is cleared, though only for 1-bit shifts.
  37. The SF is set equal to the HO bit of the result.
  38. ZF = 1 if the result of the shift is 0.
  39. The LO bit shifted out of the operand goes into the carry flag.
  40. The OF is set to the HO bit of the original operand (that is, it is set if the shift changes the HO bit); this holds only for 1-bit shifts.
  41. The SF is always clear after the SHR instruction because a 0 is always shifted into the HO bit of the result.
  42. ZF = 1 if the result of the shift is 0.
  43. The LO bit shifted out of the operand goes into the carry flag.
  44. The OF is always clear after SAR as it is impossible for the sign to change.
  45. The SF is set equal to the HO bit of the result, though technically it will never change.
  46. The HO bit shifted out of the operand goes into the carry flag.
  47. It doesn’t affect the ZF.
  48. The LO bit shifted out of the operand goes into the carry flag.
  49. It doesn’t affect the sign flag.
  50. Multiplication by 2
  51. Division by 2
  52. Multiplication and division
  53. Subtract them and see if the absolute value of their difference is less than a small error value.
  54. A value that has a 1 bit in the HO mantissa position
  55. 7
  56. 30h through 39h
  57. Apostrophes and quotes
  58. UTF-8, UTF-16, and UTF-32
  59. A scalar integer value that represents a single Unicode character
  60. A block of 65,536 different Unicode characters
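
Answer 53 (comparing floating-point values with a tolerance) can be sketched in C. This is only a minimal illustration: the function name nearly_equal and the parameter epsilon are hypothetical, and choosing a suitable tolerance is left to the caller.

```c
#include <stdbool.h>

/* Hypothetical helper: true when a and b differ by less than epsilon.
   The tolerance is an assumption that the caller has to choose. */
bool nearly_equal(double a, double b, double epsilon)
{
    double difference = a - b;

    if (difference < 0.0)
        difference = -difference;   /* absolute value of the difference */

    return difference < epsilon;
}
```

A call such as nearly_equal(0.1 + 0.2, 0.3, 1e-9) treats the two values as equal even if their bit patterns differ in the last few mantissa bits.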

E.3 Answers to Questions in Chapter 3

  1. RIP
  2. Operation code, the numeric encoding for a machine instruction
  3. Static and scalar variables
  4. ±2GB
  5. The address of the memory location to access
  6. RAX
  7. lea
  8. The final address obtained after all addressing mode calculations are completed
  9. 1, 2, 4, or 8
  10. 2GB total memory
  11. You can use the VAR[REG] addressing mode(s) to directly access elements of an array using a 64-bit register as an index into the array without first loading the address of the array into a separate base register.
  12. The .data section can hold initialized data values; the .data? section can contain only uninitialized variables.
  13. .code and .const
  14. .data and .data?
  15. An offset into a particular section (for example, .data)
  16. Use some_ID label some_type to inform MASM that the following data is of type some_type when, in fact, it could be another type.
  17. MASM will combine them into a single section.
  18. Use the align 8 statement.
  19. Memory management unit
  20. If b is at an address that is at the last byte in an MMU page and the next page is not readable, loading a word from the memory location starting with b will produce a general protection fault.
  21. A constant expression plus the base address of a variable in memory
  22. To coerce the following operand type to a different type
  23. Little-endian values appear in memory with their LO byte at the lowest address and the HO byte at the highest address. Big-endian values are the opposite: their HO byte appears at the lowest address, and their LO byte appears at the highest address in memory.
  24. xchg al, ah or xchg ah, al
  25. bswap eax
  26. bswap rax
  27. (a) Subtract 8 from RSP, (b) Store the value in RAX at the location pointed at by RSP.
  28. (a) Load RAX from the 8 bytes pointed at by RSP, (b) Add 8 to RSP.
  29. Reverse
  30. Last in, first out
  31. Move the data to and from the stack using the [RSP ± const] addressing mode.
  32. The Windows ABI requires the stack to be aligned on a 16-byte boundary; pushing RAX might make the stack aligned on an 8-byte (but not 16-byte) boundary.
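
The byte-order answers (23 through 26) can be illustrated with a short C sketch. The function names below are hypothetical; swap32 models what bswap eax computes, and le_byte and be_byte report which byte of a 32-bit value would sit at a given offset under each layout.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (the effect of bswap eax). */
uint32_t swap32(uint32_t value)
{
    return  (value >> 24)
          | ((value >> 8) & 0x0000FF00u)
          | ((value << 8) & 0x00FF0000u)
          |  (value << 24);
}

/* Byte found at offset index (0..3) when value is stored little-endian:
   the LO byte is at the lowest address. */
uint8_t le_byte(uint32_t value, unsigned index)
{
    return (uint8_t)(value >> (8u * index));
}

/* Byte found at offset index (0..3) when value is stored big-endian:
   the HO byte is at the lowest address. */
uint8_t be_byte(uint32_t value, unsigned index)
{
    return (uint8_t)(value >> (8u * (3u - index)));
}
```

For the value 12345678h, le_byte returns 78h at offset 0 and be_byte returns 12h at offset 0; swapping the bytes converts one layout into the other.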

E.4 Answers to Questions in Chapter 4

  1. imul reg, constant
  2. imul destreg, srcreg, constant
  3. imul destreg, srcreg
  4. A symbolic (named) constant for which MASM will substitute the literal constant for the name everywhere it appears in the source file
  5. =, equ, textequ
  6. Text equates substitute a textual string that can be any text; numeric equates must be assigned a numeric constant value that can be represented with a 64-bit integer.
  7. Use the text delimiters < and > around the string literal; for example, <"a long string">.
  8. An arithmetic expression whose value MASM can calculate during assembly
  9. lengthof
  10. The offset into the current section
  11. this and $
  12. Use the constant expression $-startingLocation.
  13. Use a series of (numeric) equates, with each successive equate set to the value of the previous equate plus one; for example:
    val1 = 0
    val2 = val1 + 1
    val3 = val2 + 1
    etc.
  14. Using the typedef directive
  15. A pointer is a variable in memory that holds the address of another object in memory.
  16. Load the pointer variable into a 64-bit register and use the register-indirect addressing mode to reference that address.
  17. Using a qword data declaration, or another data type that is 64 bits in size
  18. The offset operator
  19. (a) Uninitialized pointers, (b) Using pointers to hold an illegal value, (c) Using a pointer after its storage has been freed (dangling pointers), (d) Failing to free storage after it is no longer being used (memory leak), (e) Accessing indirect data by using the wrong data type
  20. Using a pointer after its storage has been freed
  21. Failing to free storage after you are done using it
  22. An aggregate type composed of smaller data objects
  23. A sequence of characters ending with a 0 byte (or other 0 value)
  24. A string containing a length value as its first element
  25. A descriptor is a data type containing a pointer (to the character data), string length, and possibly other information that describes the string data.
  26. A homogeneous collection of data elements (all with the same type)
  27. The memory address of the first element of the array
  28. array byte 10 dup (?) (as an example)
  29. Simply fill in the initial values as the operand field of a byte, word, dword, or other data declaration directive. Also, you could use a sequence of one or more constant values as the dup operator operand; for example, 5 dup (2, 3).
  30. (a) base_address + index * 4 (4 is the element size), (b) W[i,j] = base_address + (i * 8 + j) * 2 (2 is the element size), (c) R[i,j,k] = base_address + (((i * 4) + j) * 6 + k) * 8 (8 is the element size)
  31. An organization for multidimensional arrays where you store the elements of each row in the array in contiguous memory locations and then store each row, one after the other, in memory
  32. An organization for multidimensional arrays where you store the elements of each column in the array in contiguous memory locations and then store each column, one after the other, in memory
  33. W word 4 dup (8 dup (?))
  34. A heterogeneous collection of data elements (each field could have different types)
  35. struct and ends
  36. The dot operator
  37. A heterogeneous collection of data elements (each field could have different types); the offset of each field in the union begins at 0.
  38. union and ends
  39. The fields of a record or struct appear at successive memory locations within the struct (each field has its own block of bytes); the fields of a union overlap one another, with each field beginning at offset zero in the union.
  40. An unnamed union whose fields are treated as fields of the enclosing struct
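
The row-major formulas in answer 30 can be checked with a small C sketch. The function and parameter names are hypothetical; the functions return only the byte offset, which is added to the array's base address to form the element address.

```c
#include <stddef.h>

/* Byte offset of element [i][j] in a row-major array that has
   `columns` elements per row and `element_size` bytes per element. */
size_t offset_2d(size_t i, size_t j, size_t columns, size_t element_size)
{
    return (i * columns + j) * element_size;
}

/* Byte offset of element [i][j][k] in a row-major array whose second
   and third dimensions have dim2 and dim3 elements, respectively. */
size_t offset_3d(size_t i, size_t j, size_t k,
                 size_t dim2, size_t dim3, size_t element_size)
{
    return ((i * dim2 + j) * dim3 + k) * element_size;
}
```

With 8 columns and 2-byte elements, offset_2d(i, j, 8, 2) matches formula (b); with dimensions 4 and 6 and 8-byte elements, offset_3d(i, j, k, 4, 6, 8) matches formula (c).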

E.5 Answers to Questions in Chapter 5

  1. It pushes the return address onto the stack (the address of the next instruction after the call) and then jumps to the address specified by the operand.
  2. It pops a return address off the stack and moves the address into the RIP register, transferring control to the instruction just beyond the call to the current procedure.
  3. After popping the return address, the CPU adds this value to RSP, removing that number of bytes of parameters from the stack.
  4. The address of the instruction just beyond the call to the procedure
  5. Namespace pollution occurs when so many symbols, identifiers, or names are defined in a source file that it becomes difficult to select new, unique names to use in that source file.
  6. Put two colons after the name; for example, id::.
  7. Use the option noscoped directive just before the procedure.
  8. Use the push instruction to save the register values on the stack upon entry into the procedure; then use the pop instruction to restore the register values immediately before returning from the procedure.
  9. Code is difficult to maintain. (A secondary issue, though minor, is that it takes more space.)
  10. Performance—because you’re often preserving registers that don’t need to be preserved for the calling code
  11. When the subroutine attempts to return, it uses the garbage you left on the stack as the return address, which usually produces undefined results (a program crash).
  12. The subroutine uses whatever was on the stack prior to the call as the return address, with undefined results.
  13. A collection of data, including parameters, local variables, the return address, and other items, associated with the call (activation) of a procedure
  14. RBP
  15. 8 bytes (64 bits)
  16. push rbp
    mov  rbp, rsp
    sub  rsp, sizeOfLocals ; Assuming there are local variables
  17. leave
    ret
  18. and rsp, -16
  19. The section of the source file (usually the body of a procedure) where the symbol is visible and usable in the program
  20. From the moment storage is allocated for the variable to the point the system deallocates that storage
  21. Variables whose storage is automatically allocated upon entry into a block of code (usually a procedure) and automatically deallocated upon exiting that block of code
  22. Upon entry into a procedure (or the block of code associated with that automatic variable)
  23. Using textequ directives or the MASM local directive
  24. var1: –2; local2: –8 (MASM aligns variable on dword boundary); dVar: –9; qArray: –32 (base address of array is the lowest memory address); rlocal: –40 (base address of array is the lowest memory address); ptrVar: –48
  25. option prologue:PrologueDef and option epilogue:EpilogueDef. You should also supply option prologue:none and option epilogue:none to turn them off.
  26. Before MASM emits any code for the procedure, after all the local directives
  27. Wherever a ret instruction appears
  28. The actual parameter’s value
  29. The memory address of the actual parameter’s value
  30. RCX, RDX, R8, and R9 (or smaller subcomponents of these registers)
  31. XMM0, XMM1, XMM2, or XMM3
  32. On the stack, above the shadow locations (32 bytes) reserved for the arguments passed in the registers
  33. Procedures are free to modify volatile registers without preserving their values; procedures must preserve the values of nonvolatile registers across a procedure invocation.
  34. RAX, RCX, RDX, R8, R9, R10, R11, XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, and the HO 128 bits of all the YMM and ZMM registers
  35. RBX, RSI, RDI, RBP, RSP, R12, R13, R14, R15, and XMM6–XMM15. Also, the direction flag must be clear upon return from a procedure.
  36. Using positive offsets from the RBP register
  37. Storage reserved on the stack for parameters the caller passes in the RCX, RDX, R8 and R9 registers
  38. 32 bytes
  39. 32 bytes
  40. 32 bytes
  41. parm1: RBP + 16; parm2: RBP + 24; parm3: RBP + 32; parm4: RBP + 40
  42. mov rax, parm4
    mov al, [rax]
  43. lclVar1: RBP – 1; lclVar2: RBP – 4 (aligned to 2-byte boundary); lclVar3: RBP – 8; lclVar4: RBP – 16
  44. By reference
  45. Application binary interface
  46. In the RAX register
  47. The address of a procedure passed as a parameter
  48. Indirectly. Typically by using a call parm instruction, where parm is the procedural parameter, a qword variable containing the address of the procedure. You could also load the parameter value into a 64-bit register and indirectly call the procedure through that register.
  49. Allocate local storage to hold the register values to preserve and move the register data into that storage upon procedure entry, and then move the data back into the registers just before returning from the procedure.
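
Answer 18 relies on the fact that ANDing with -16 clears the LO 4 bits, which rounds an address down to a multiple of 16. A minimal C sketch of that arithmetic follows; the function name is hypothetical, and the value passed in merely stands in for RSP.

```c
#include <stdint.h>

/* Round address down to a 16-byte boundary.
   (uint64_t)-16 is FFFF_FFFF_FFFF_FFF0h, the same mask that
   "and rsp, -16" applies. */
uint64_t align_down_16(uint64_t address)
{
    return address & (uint64_t)-16;
}
```

Because the stack grows toward lower addresses, rounding down can only reserve extra bytes; it never releases storage that is already in use.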

E.6 Answers to Questions in Chapter 6

  1. AL for 8-bit operands, AX for 16-bit operands, EAX for 32-bit operands, and RAX for 64-bit operands
  2. 8-bit mul operation: 16 bits; 16-bit mul operation: 32 bits; 32-bit mul operation: 64 bits; 64-bit mul operation: 128 bits. The CPU puts the products in AX for 8×8 products, DX:AX for 16×16 products, EDX:EAX for 32×32 products, and RDX:RAX for 64×64 products.
  3. The quotient in AL, AX, EAX, or RAX and the remainder in AH, DX, EDX, or RDX
  4. Sign-extend AX into DX.
  5. Zero-extend EAX into EDX.
  6. A division by 0 and producing a quotient that will not fit into the accumulator register (AL, AX, EAX, or RAX)
  7. By setting the carry and overflow flags
  8. They scramble the flag; that is, they leave it in an undefined state.
  9. The extended-precision imul instruction produces a 2 × n-bit result, uses implied operands (AL, AX, EAX, and RAX), and modifies the AH, DX, EDX, and RDX registers. Also, the extended-precision imul instruction does not allow constant operands, whereas the generic imul instruction does.
  10. cbw, cwd, cdq, and cqo
  11. They scramble all the flags, leaving them in an undefined state.
  12. It sets the zero flag if the two operands are equal.
  13. It sets the carry flag if the first operand is less than the second operand.
  14. The sign and overflow flags are different if the first operand is less than the second operand; they are the same if the first operand is greater than or equal to the second operand.
  15. An 8-bit register or memory location
  16. They set the operand to 1 if the condition is true, or to 0 if the condition is false.
  17. The test instruction is the same as the and instruction except it does not store the result to the destination (first) operand; it only sets the flags.
  18. They both set the condition code flags the same way.
  19. Supply the operand to be tested as the first (destination) operand and an immediate constant containing a single 1 bit in the bit position to test. After the test instruction, the zero flag will contain the inverse of the desired bit: ZF = 1 if the bit is 0, and ZF = 0 if the bit is 1.
  20. The following are some possible, not the only, solutions:

    x = x + y

    mov eax, x
    add eax, y
    mov x, eax

    x = y – z

    mov eax, y
    sub eax, z
    mov x, eax

    x = y * z

    mov  eax, y
    imul eax, z
    mov  x, eax

    x = y + z * t

    mov  eax, z
    imul eax, t
    add  eax, y
    mov  x, eax

    x = (y + z) * t

    mov  eax, y
    add  eax, z
    imul eax, t
    mov  x, eax

    x = -((x*y)/z)

    mov  eax, x
    imul y          ; Note: Leaves the 64-bit product in EDX:EAX
    idiv z
    neg  eax        ; Apply the outer negation
    mov  x, eax

    x = (y == z) && (t != 0)

    mov   eax, y
    cmp   eax, z
    sete  bl
    cmp   t, 0
    setne bh
    and   bl, bh
    movzx eax, bl   ; Because x is a 32-bit integer
    mov   x, eax
  21. The following are some possible, not the only, solutions:

    x = x * 2

    shl   x, 1

    x = y * 5

    mov   eax, y
    lea   eax, [eax][eax*4]
    mov   x, eax

    Here is another solution:

    mov   eax, y
    mov   ebx, eax
    shl   eax, 2
    add   eax, ebx
    mov   x, eax

    x = y * 8

    mov   eax, y
    shl   eax, 3
    mov   x, eax
  22. x = x / 2
    shr   x, 1

    x = y / 8

    mov   ax, y
    shr   ax, 3
    mov   x, ax

    x = z / 10

    movzx eax, z
    imul  eax, 6554  ; Or 6553
    shr   eax, 16
    mov   x, ax
  23. x = x + y
    fld   x
    fld   y
    faddp
    fstp  x

    x = y – z

    fld   y
    fld   z
    fsubp
    fstp  x

    x = y * z

    fld   y
    fld   z
    fmulp
    fstp  x

    x = y + z * t

    fld   y
    fld   z
    fld   t
    fmulp
    faddp
    fstp  x

    x = (y + z) * t

    fld   y
    fld   z
    faddp
    fld   t
    fmulp
    fstp  x

    x = -((x * y)/z)

    fld   x
    fld   y
    fmulp
    fld   z
    fdivp
    fchs
    fstp  x
  24. x = x + y
    movss xmm0, x
    addss xmm0, y
    movss x, xmm0

    x = y – z

    movss xmm0, y
    subss xmm0, z
    movss x, xmm0

    x = y * z

    movss xmm0, y
    mulss xmm0, z
    movss x, xmm0

    x = y + z * t

    movss xmm0, z
    mulss xmm0, t
    addss xmm0, y
    movss x, xmm0
  25. b = x < y
    fld    y
    fld    x
    fcomip st(0), st(1)
    setb   b
    fstp   st(0)

    b = x >= y && x < z

    fld    y
    fld    x
    fcomip st(0), st(1)
    setae  bl
    fstp   st(0)
    fld    z
    fld    x
    fcomip st(0), st(1)
    setb   bh
    fstp   st(0)
    and    bl, bh
    mov    b, bl
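
The x = z / 10 sequence in answer 22 multiplies by an approximation of 65,536/10 and keeps the HO 16 bits of the product. The C sketch below models that arithmetic so the approximation can be compared against true division. All names are hypothetical, and the second constant (52429 with a 19-bit shift) is an assumed alternative, not something taken from the answer.

```c
#include <stdint.h>

/* Models the answer's sequence: multiply by 6554 (about 65536/10),
   then shift right 16 bits. */
uint16_t div10_approx(uint16_t z)
{
    return (uint16_t)(((uint32_t)z * 6554u) >> 16);
}

/* Assumed alternative: 52429 is 2^19/10 rounded up. The product
   still fits in 32 bits for every 16-bit input. */
uint16_t div10_wide(uint16_t z)
{
    return (uint16_t)(((uint32_t)z * 52429u) >> 19);
}

/* Smallest z in 0..65535 for which f(z) differs from z / 10,
   or -1 if f matches true division for every 16-bit input. */
long first_mismatch(uint16_t (*f)(uint16_t))
{
    for (long z = 0; z <= 65535; ++z)
    {
        if (f((uint16_t)z) != (uint16_t)(z / 10))
            return z;
    }
    return -1;
}
```

Under these definitions, first_mismatch(div10_approx) reports 16389, so the 6554 constant is exact only for smaller inputs, while the wider constant matches true division across the whole 16-bit range.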

E.7 Answers to Questions in Chapter 7

  1. Use the lea instruction or the offset operator.
  2. option noscoped
  3. option scoped
  4. jmp reg64 and jmp mem64
  5. A piece of code that maintains history information in variables or via the program counter
  6. If the second letter of the jump mnemonic is n, remove the n; otherwise, insert an n as the second character.
  7. jpo and jpe
  8. A short code sequence used to extend the range of a jump or call instruction beyond the ±2GB range
  9. cmovcc reg, src, where cc is one of the conditional suffixes (which follow a conditional jump), reg is a 16-, 32-, or 64-bit register, and src is a source register or memory location that is the same size as reg.
  10. You can conditionally execute a large set of different types of instructions by using a conditional jump without the time penalty of a control transfer.
  11. The destination has to be a register, and 8-bit registers are not allowed.
  12. Complete Boolean evaluation of an expression evaluates all components of the expression, even if it is not logically necessary to do so; short-circuit evaluation stops as soon as it determines that the expression must be true or false.
  13. if(x == y || z > t)
    {
        Do something 
    }
        mov  eax, x
        cmp  eax, y
        sete bl
        mov  eax, z
        cmp  eax, t
        seta bh
        or   bl, bh
        jz   skipIF
         Code for statements that "do something"
    skipIF:
    
    if(x != y && z < t)
    {
         THEN statements
    }
    else
    {
         ELSE statements
    }
        mov   eax, x
        cmp   eax, y
        setne bl
        mov   eax, z
        cmp   eax, t
        setb  bh
        and   bl, bh
        jz    doElse
         Code for THEN statements
        jmp   endOfIF
    
    doElse:
         Code for ELSE statements
    endOfIF:
  14. 1st IF:
        mov  eax, x
        cmp  eax, y
        je   doBlock
        mov  eax, z
        cmp  eax, t
        jng  skipIF
    doBlock:
         Code for statements that "do something"
    skipIF:
    
    2nd IF:
        mov   eax, x
        cmp   eax, y
        je    doElse
        mov   eax, z
        cmp   eax, t
        jnl   doElse
         Code for THEN statements
        jmp   endOfIF
    
    doElse:
         Code for ELSE statements
    endOfIF:
  15. switch(s)
    {
       case 0:   case 0 code  break;
       case 1:   case 1 code  break;
       case 2:   case 2 code  break;
       case 3:   case 3 code  break;
    }
    
        mov eax, s ; Zero-extends!
        cmp eax, 3
        ja  skipSwitch
        lea rbx, jmpTbl
        jmp [rbx][rax * 8]
    jmpTbl qword case0, case1, case2, case3
    
    case0: case 0 code
           jmp skipSwitch
    
    case1: case 1 code
           jmp skipSwitch
    
    case2: case 2 code
           jmp skipSwitch
    
    case3: case 3 code
    
    skipSwitch:
    
    switch(t)
    {
       case 2:  case 2 code break;
       case 4:  case 4 code break;
       case 5:  case 5 code break;
       case 6:  case 6 code break;
       default: default code
    }
        mov eax, t ; Zero-extends!
        cmp eax, 2
        jb  swDefault
        cmp eax, 6
        ja  swDefault
        lea rbx, jmpTbl
        jmp [rbx][rax * 8 - 2 * 8]
    jmpTbl qword case2, swDefault, case4, case5, case6
    
    swDefault: default code
           jmp endSwitch
    
    case2: case 2 code
           jmp endSwitch
    
    case4: case 4 code
           jmp endSwitch
    
    case5: case 5 code
           jmp endSwitch
    
    case6: case 6 code
    
    endSwitch:
    
    switch(u)
    {
       case 10:   case 10 code  break;
       case 11:   case 11 code  break;
       case 12:   case 12 code  break;
       case 25:   case 25 code  break;
       case 26:   case 26 code  break;
       case 27:   case 27 code  break;
       default:   default code
    } 
         lea rbx, jmpTbl1  ; Assume cases 10-12
         mov eax, u        ; Zero-extends!
         cmp eax, 10
         jb  swDefault
         cmp eax, 12
         jbe sw1
         cmp eax, 25
         jb  swDefault
         cmp eax, 27
         ja  swDefault
         lea rbx, jmpTbl2
         jmp [rbx][rax * 8 - 25 * 8]
    sw1: jmp [rbx][rax * 8 - 10 * 8]
    jmpTbl1 qword case10, case11, case12
    jmpTbl2 qword case25, case26, case27
    
    swDefault: default code
           jmp endSwitch
    
    case10: case 10 code
           jmp endSwitch
    
    case11: case 11 code
           jmp endSwitch
    
    case12: case 12 code
           jmp endSwitch
    
    case25: case 25 code
           jmp endSwitch
    
    case26: case 26 code
           jmp endSwitch
    
    case27: case 27 code
    
    endSwitch:
  16. while(i < j)
    {
         Code for loop body
    }
    
    whlLp:
         mov eax, i
         cmp eax, j
         jnl endWhl
          Code for loop body
         jmp whlLp
    endWhl:
    
    while(i < j && k != 0)
    {
         Code for loop body, part a
        if(m == 5) continue;
         Code for loop body, part b
        if(n < 6) break;
         Code for loop body, part c
    }
    
    ; Assume short-circuit evaluation:
    
    whlLp:
         mov eax, i
         cmp eax, j
         jnl endWhl
         mov eax, k
         cmp eax, 0
         je  endWhl
          Code for loop body, part a
         cmp m, 5
         je  whlLp
          Code for loop body, part b
         cmp n, 6
         jl  endWhl
          Code for loop body, part c
         jmp whlLp
    endWhl:
    
    do
    {
       Code for loop body
    } while(i != j);
    
    doLp:
       Code for loop body
         mov eax, i
         cmp eax, j
         jne doLp
    
    do
    {
       Code for loop body, part a
        if(m != 5) continue;
       Code for loop body, part b
        if(n == 6) break;
       Code for loop body, part c
    } while(i < j && k > j);
    
    doLp:
       Code for loop body, part a
         cmp m, 5
         jne doCont
       Code for loop body, part b
         cmp n, 6
         je  doExit
       Code for loop body, part c
    doCont:     mov eax, i
         cmp eax, j
         jnl doExit
         mov eax, k
         cmp eax, j
         jg  doLp
    doExit:
    
    for(int i = 0; i < 10; ++i)
    {
       Code for loop body
    }
    
           mov i, 0
    forLp: cmp i, 10
           jnl forDone
            Code for loop body
           inc i
           jmp forLp
    forDone:
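
The jump-table technique from answer 15 (the switch(t) example with cases 2, 4, 5, and 6) can be sketched in C with an array of function pointers. The handler names and return values are hypothetical stand-ins for the case bodies; the table layout, range check, and index adjustment follow the assembly version.

```c
typedef int (*case_handler)(void);

/* Hypothetical case bodies: each returns a value identifying itself. */
static int on_case2(void)   { return 2; }
static int on_case4(void)   { return 4; }
static int on_case5(void)   { return 5; }
static int on_case6(void)   { return 6; }
static int on_default(void) { return -1; }

int dispatch(unsigned t)
{
    /* One entry per value from 2 through 6; the unused value 3
       points at the default handler, just as in jmpTbl. */
    static const case_handler table[] = {
        on_case2, on_default, on_case4, on_case5, on_case6
    };

    if (t < 2u || t > 6u)       /* cmp/jb and cmp/ja range checks */
        return on_default();

    return table[t - 2u]();     /* index is adjusted by the lowest case */
}
```

Subtracting the lowest case value plays the same role as the "- 2 * 8" displacement in the indexed jmp instruction.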

E.8 Answers to Questions in Chapter 8

  1. You compute x = y + z as follows:
    1. mov rax, qword ptr y
      add rax, qword ptr z
      mov qword ptr x, rax
      mov rax, qword ptr y[8]
      adc rax, qword ptr z[8]
      mov qword ptr x[8], rax
    2. mov rax, qword ptr y
      add rax, qword ptr z
      mov qword ptr x, rax
      mov eax, dword ptr z[8] 
      adc eax, dword ptr y[8]
      mov dword ptr x[8], eax
    3. mov eax, dword ptr y
      add eax, dword ptr z
      mov dword ptr x, eax
      mov ax, word ptr z[4]
      adc ax, word ptr y[4]
      mov word ptr x[4], ax
  2. You compute x = y – z as follows:
    1. mov rax, qword ptr y
      sub rax, qword ptr z
      mov qword ptr x, rax
      mov rax, qword ptr y[8]
      sbb rax, qword ptr z[8]
      mov qword ptr x[8], rax
      mov rax, qword ptr y[16]
      sbb rax, qword ptr z[16]
      mov qword ptr x[16], rax
    2. mov rax, qword ptr y
      sub rax, qword ptr z
      mov qword ptr x, rax
      mov eax, dword ptr y[8]
      sbb eax, dword ptr z[8]
      mov dword ptr x[8], eax
  3. mov rax, qword ptr y
    mul qword ptr z
    mov qword ptr x, rax
    mov rbx, rdx
    
    mov rax, qword ptr y
    mul qword ptr z[8]
    add rax, rbx
    adc rdx, 0
    mov qword ptr x[8], rax
    mov rbx, rdx
    
    mov rax, qword ptr y[8]
    mul qword ptr z
    add qword ptr x[8], rax
    adc rbx, rdx
    mov ecx, 0          ; mov leaves the flags alone; RCX = 0
    adc ecx, 0          ; RCX = carry out of the sum in RBX
    
    mov rax, qword ptr y[8]
    mul qword ptr z[8]
    add rax, rbx
    mov qword ptr x[16], rax
    adc rdx, rcx        ; Add both pending carries
    mov qword ptr x[24], rdx
  4. mov  rax, qword ptr y[8]
    cqo
    idiv qword ptr z
    mov  qword ptr x[8], rax
    mov  rax, qword ptr y
    idiv qword ptr z
    mov  qword ptr x, rax
  5. The conversions are as follows:
    1. ; Note: order of comparison (HO vs. LO) is irrelevant
      ; for "==" comparison.
      
          mov rax, qword ptr x[8]
          cmp rax, qword ptr y[8]
          jne skipElse
          mov rax, qword ptr x
          cmp rax, qword ptr y
          jne skipElse
          then code
      skipElse:
    2.     mov rax, qword ptr x[8]
          cmp rax, qword ptr y[8]
          ja  skipElse      ; HO qwords differ and x > y
          jb  doThen        ; HO qwords differ and x < y
          mov rax, qword ptr x
          cmp rax, qword ptr y
          jnb skipElse
      doThen:
          then code
      skipElse:
    3.     mov rax, qword ptr x[8]
          cmp rax, qword ptr y[8]
          jb  skipElse      ; HO qwords differ and x < y
          ja  doThen        ; HO qwords differ and x > y
          mov rax, qword ptr x
          cmp rax, qword ptr y
          jna skipElse
      doThen:
          then code
      skipElse:
    4. ; Note: order of comparison (HO vs. LO) is irrelevant
      ; for "!=" comparison.
      
          mov rax, qword ptr x[8]
          cmp rax, qword ptr y[8]
          jne doElse
          mov rax, qword ptr x
          cmp rax, qword ptr y
          je skipElse
      doElse:
          then code
      skipElse:
  6. The conversions are as follows:
    1. ; Note: order of comparison (HO vs. LO) is irrelevant
      ; for "==" comparison.
      
          mov eax, dword ptr x[8]
          cmp eax, dword ptr y[8]
          jne skipElse
          mov rax, qword ptr x
          cmp rax, qword ptr y
          jne skipElse
          then code
      skipElse:
    2.     mov eax, dword ptr x[8]
          cmp eax, dword ptr y[8]
          ja  skipElse      ; HO dwords differ and x > y
          jb  doThen        ; HO dwords differ and x < y
          mov rax, qword ptr x
          cmp rax, qword ptr y
          jnb skipElse
      doThen:
          then code
      skipElse:
    3.     mov eax, dword ptr x[8]
          cmp eax, dword ptr y[8]
          jb  skipElse      ; HO dwords differ and x < y
          ja  doThen        ; HO dwords differ and x > y
          mov rax, qword ptr x
          cmp rax, qword ptr y
          jna skipElse
      doThen:
          then code
      skipElse:
  7. The conversions are as follows:
    1. neg qword ptr x[8]
      neg qword ptr x
      sbb qword ptr x[8], 0
      
      xor rax, rax
      xor rdx, rdx
      sub rax, qword ptr x
      sbb rdx, qword ptr x[8]
      mov qword ptr x, rax
      mov qword ptr x[8], rdx
    2. mov rax, qword ptr y
      mov rdx, qword ptr y[8]
      neg rdx
      neg rax
      sbb rdx, 0
      mov qword ptr x, rax
      mov qword ptr x[8], rdx
      
      xor rdx, rdx
      xor rax, rax
      sub rax, qword ptr y
      sbb rdx, qword ptr y[8]
      mov qword ptr x, rax
      mov qword ptr x[8], rdx
  8. The conversions are as follows:
    1. mov rax, qword ptr y
      and rax, qword ptr z
      mov qword ptr x, rax
      mov rax, qword ptr y[8]
      and rax, qword ptr z[8]
      mov qword ptr x[8], rax
    2. mov rax, qword ptr y
      or  rax, qword ptr z
      mov qword ptr x, rax
      mov rax, qword ptr y[8]
      or  rax, qword ptr z[8]
      mov qword ptr x[8], rax
    3. mov rax, qword ptr y
      xor rax, qword ptr z
      mov qword ptr x, rax
      mov rax, qword ptr y[8]
      xor rax, qword ptr z[8]
      mov qword ptr x[8], rax
    4. mov rax, qword ptr y
      not rax
      mov qword ptr x, rax
      mov rax, qword ptr y[8]
      not rax
      mov qword ptr x[8], rax
    5. mov rax, qword ptr y
      shl rax, 1
      mov qword ptr x, rax
      mov rax, qword ptr y[8]
      rcl rax, 1
      mov qword ptr x[8], rax
    6. mov rax, qword ptr y[8]
      shr rax, 1
      mov qword ptr x[8], rax
      mov rax, qword ptr y
      rcr rax, 1
      mov qword ptr x, rax
  9. mov rax, qword ptr y[8]
    sar rax, 1
    mov qword ptr x[8], rax
    mov rax, qword ptr y
    rcr rax, 1
    mov qword ptr x, rax
  10. rcl qword ptr x, 1
    rcl qword ptr x[8], 1
  11. rcr qword ptr x[8], 1
    rcr qword ptr x, 1
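
The add/adc pattern from answer 1 and the HO-first comparison used for unsigned extended-precision tests can be modeled in C with two 64-bit halves. This is a sketch under stated assumptions: the u128 type and all function names are hypothetical, and the values are treated as unsigned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit unsigned value held as two qwords. */
typedef struct
{
    uint64_t lo;    /* bits 0..63   */
    uint64_t hi;    /* bits 64..127 */
} u128;

u128 make_u128(uint64_t hi, uint64_t lo)
{
    u128 value = { lo, hi };
    return value;
}

/* x = y + z: add the LO halves, then add the HO halves plus the carry. */
u128 add_u128(u128 y, u128 z)
{
    u128 x;
    uint64_t carry;

    x.lo = y.lo + z.lo;                 /* add */
    carry = (x.lo < y.lo) ? 1u : 0u;    /* the sum wrapped, so CF = 1 */
    x.hi = y.hi + z.hi + carry;         /* adc */
    return x;
}

/* Unsigned x < y: the HO halves decide unless they are equal. */
bool less_u128(u128 x, u128 y)
{
    if (x.hi != y.hi)
        return x.hi < y.hi;
    return x.lo < y.lo;
}
```

The comparison consults the LO halves only when the HO halves are equal, which is why the assembly versions need separate branches for "above" and "below" after the HO compare.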

E.9 Answers to Questions in Chapter 9

  1. btoh        proc
    
                mov     ah, al      ; Do HO nibble first
                shr     ah, 4       ; Move HO nibble to LO
                or      ah, '0'     ; Convert to char
                cmp     ah, '9' + 1 ; Is it "A" to "F"?
                jb      AHisGood
                
    ; Convert 3Ah to 3Fh to "A" to "F".
    
                add     ah, 7
    
    ; Process the LO nibble here.
                
    AHisGood:   and     al, 0Fh     ; Strip away HO nibble
                or      al, '0'     ; Convert to char
                cmp     al, '9' + 1 ; Is it "A" to "F"?
                jb      ALisGood
                
    ; Convert 3Ah to 3Fh to "A" to "F".
    
                add     al, 7
    ALisGood:   ret
    btoh        endp
  2. 8
  3. Call qToStr twice: once with the HO 64 bits and once with the LO 64 bits. Then concatenate the two strings.
  4. fbstp
  5. If the input value is negative, emit a hyphen (-) character and negate the value; then call the unsigned decimal conversion function. If the number is 0 or positive, just call the unsigned decimal conversion function.
  6. ; Inputs:
    ;    RAX -   Number to convert to string.
    ;    CL  -   minDigits (minimum print positions).
    ;    CH  -   Padding character.
    ;    RDI -   Buffer pointer for output string.
  7. It will produce the full string required; the minDigits parameter specifies the minimum string size.
  8. ; On Entry:
    
       ; r10        - Real10 value to convert.
       ;              Passed in ST(0).
    
       ; fWidth     - Field width for the number (note that this
       ;              is an *exact* field width, not a minimum
       ;              field width).
       ;              Passed in EAX (RAX).
    
       ; decimalpts - # of digits to display after the decimal pt.
       ;              Passed in EDX (RDX). 
    
       ; fill       - Padding character if the number is smaller
       ;              than the specified field width.
       ;              Passed in CL (RCX).
    
       ; buffer     - r10ToStr stores the resulting characters
       ;              in this string.
       ;              Address passed in RDI.
    
       ; maxLength  - Maximum string length.
       ;              Passed in R8D (R8).
  9. A string consisting of fWidth # characters (that is, the whole field is filled with the # character)
  10. ; On Entry:
    
    ;    e10     - Real10 value to convert.
    ;              Passed in ST(0).
    
    ;    width   - Field width for the number (note that this
    ;              is an *exact* field width, not a minimum
    ;              field width).
    ;              Passed in RAX (LO 32 bits).
    
    ;    fill    - Padding character if the number is smaller
    ;              than the specified field width.
    ;              Passed in RCX.
    
    ;    buffer  - e10ToStr stores the resulting characters in
    ;              this buffer (passed in EDI).
    ;              Passed in RDI (LO 32 bits).
    
    ;    expDigs - Number of exponent digits (2 for real4,
    ;              3 for real8, and 4 for real10).
    ;              Passed in RDX (LO 8 bits).
  11. A character that separates one sequence of characters from other such sequences; for example, a character that marks the beginning or end of a numeric string
  12. Illegal character on input and numeric overflow during conversion
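
The logic of answers 1 and 5 above can be modeled in a few lines of Python. This is a sketch for checking the reasoning, not the book's code: btoh mirrors the assembly answer nibble by nibble, while u_to_str and i_to_str are hypothetical stand-ins for the unsigned and signed decimal conversion routines.

```python
def btoh(al):
    """Return the ASCII codes (ah, al) of the two hex digits of a byte."""
    def nibble_to_char(n):
        c = n | ord('0')          # or reg, '0'
        if c >= ord('9') + 1:     # cmp reg, '9' + 1; jb skips the fix-up
            c += 7                # 3Ah..3Fh become 'A'..'F'
        return c
    return nibble_to_char((al >> 4) & 0xF), nibble_to_char(al & 0xF)

def u_to_str(value):
    """Hypothetical unsigned decimal conversion (value >= 0)."""
    digits = ""
    while True:
        value, d = divmod(value, 10)
        digits = chr(ord('0') + d) + digits
        if value == 0:
            return digits

def i_to_str(value):
    """Signed conversion built on the unsigned one, as in answer 5."""
    if value < 0:
        return "-" + u_to_str(-value)   # emit '-' and negate
    return u_to_str(value)
```

The add of 7 works because the character codes 3Ah through 3Fh sit immediately after "9", and "A" is 41h, which is 3Ah + 7.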

E.10 Answers to Questions in Chapter 10

  1. The set of all possible input (parameter) values
  2. The set of all possible function output (return) values
  3. Computes AL = [RBX + AL × 1]
  4. Byte values: domain is the set of all integers in the range 0 to 255, and the range is also the set of all integers in the range 0 to 255.
  5. The code implementing the functions is as follows:
    1. lea rbx, f
      mov al, input
      xlat
    2. lea rbx, f
      movzx rax, input
      mov ax, [rbx][rax * 2]
    3. lea rbx, f
      movzx rax, input
      mov al, [rbx][rax * 1]
    4. lea rbx, f
      movzx rax, input
      mov ax, [rbx][rax * 2]
  6. Modifying input values that are out of a specific range so that they lie within the input domain of the function
  7. Main memory is so slow that it might be faster to compute the value than to look it up via a table.
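
To make the table-lookup answers above concrete, here is a small Python model of a byte lookup (xlat), a scaled word lookup, and domain conditioning. The tables and function names are hypothetical examples, and reducing an angle modulo 360 is only one possible way to condition an input.

```python
import math
import struct

# Hypothetical byte table: f(x) = (3 * x + 1) mod 256.
BYTE_TABLE = bytes((3 * i + 1) & 0xFF for i in range(256))

# Hypothetical word table: 1000 * sin(angle) for whole angles 0..359,
# stored as little-endian signed 16-bit entries.
SIN_TABLE = b"".join(
    struct.pack("<h", round(1000 * math.sin(math.radians(a))))
    for a in range(360)
)

def xlat(table, al):
    """Model of xlat: AL = [RBX + AL]."""
    return table[al & 0xFF]

def word_lookup(table, index):
    """Model of: mov ax, [rbx][rax * 2] (result read as a signed word)."""
    return struct.unpack_from("<h", table, index * 2)[0]

def sin1000(angle):
    """Condition the input into 0..359, then look it up."""
    return word_lookup(SIN_TABLE, angle % 360)
```

Here sin1000(450) and sin1000(90) return the same entry, because the conditioning step maps both inputs to index 90 before the table is touched.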

E.11 Answers to Questions in Chapter 11

  1. Use the cpuid instruction.
  2. Because Intel and AMD have different feature sets
  3. EAX = 1
  4. ECX bit 20
  5. (a) _TEXT, (b) _DATA, (c) _BSS, (d) CONST
  6. PARA or 16 bytes
  7. data  segment align(64) 'DATA'
               .
               .
               .
    data  ends
  8. AVX/AVX2/AVX-256/AVX-512
  9. A data type within a SIMD register; typically, 1, 2, 4, or 8 bytes wide
  10. Scalar instructions operate on a single piece of data; vector instructions operate, simultaneously, on two or more pieces of data.
  11. 16 bytes
  12. 32 bytes
  13. 64 bytes
  14. movd
  15. movq
  16. movaps, movapd, and movdqa
  17. movups, movupd, and movdqu
  18. movhps or movhpd
  19. movddup
  20. pshufb
  21. pshufd, though pshufb could also work
  22. (v)pextrb, (v)pextrw, (v)pextrd, or (v)pextrq
  23. (v)pinsrb, (v)pinsrw, (v)pinsrd, or (v)pinsrq
  24. It inverts the bits in the first (destination) operand and then logically ANDs these inverted bits with the second (source) operand, storing the result in the destination.
  25. pslldq
  26. psrldq
  27. psllq
  28. psrlq
  29. The carry out of the HO bit is lost.
  30. In a vertical addition, the CPU sums values found in the same lane of two separate XMM registers; in a horizontal addition, the CPU sums values found in adjacent lanes of the same XMM register.
  31. In the destination XMM register: it stores 0FFh in the corresponding lane for true and 0 for false
  32. Swap the operands of the pcmpgtq instruction.
  33. It copies the HO bit of each byte in an XMM register into the corresponding bit position of the LO 16 bits of a general-purpose register; for example, bit 7 of lane 0 goes into bit 0.
  34. (a) 4 on SSE, 8 on AVX2, (b) 2 on SSE, 4 on AVX2
  35. and rax, -16
  36. pxor xmm0, xmm0
  37. pcmpeqb xmm1, xmm1
  38. include
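
The lane-compare, mask-extraction, and alignment answers above (31, 33, and 35) can be modeled as follows. This is a Python sketch with hypothetical function names; a 16-byte bytes object stands in for an XMM register, with index 0 as lane 0.

```python
def pcmpeqb(a, b):
    """Byte-lane compare: 0FFh where the lanes match, 0 where they differ."""
    return bytes(0xFF if x == y else 0 for x, y in zip(a, b))

def pmovmskb(v):
    """Collect the HO bit of each byte lane; lane n lands in bit n."""
    mask = 0
    for lane, byte in enumerate(v):
        mask |= (byte >> 7) << lane
    return mask

def align_down_16(address):
    """Model of: and rax, -16 (clears the LO 4 bits)."""
    return address & -16
```

Chaining the first two gives one bit per lane: a result of 0FFFFh means all 16 bytes matched, and any clear bit identifies a lane that differed.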

E.12 Answers to Questions in Chapter 12

  1. and/andn
  2. btr
  3. or
  4. bts
  5. xor
  6. btc
  7. test/and
  8. bt
  9. pext
  10. pdep
  11. bextr
  12. bsf
  13. bsr
  14. Invert the register and use bsf.
  15. Invert the register and use bsr.
  16. popcnt
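
A short Python model may help when checking the bit-scan and bit-gather answers above. The function names below simply reuse the instruction mnemonics; they are illustrative models of 64-bit operands, not library calls, and returning None for a zero input is an assumption standing in for the hardware's "zero flag set, destination undefined" result.

```python
MASK64 = (1 << 64) - 1

def bsf(x):
    """Index of the lowest set bit, or None if x is 0."""
    return (x & -x).bit_length() - 1 if x else None

def bsr(x):
    """Index of the highest set bit, or None if x is 0."""
    return x.bit_length() - 1 if x else None

def first_zero_bit(x):
    """Answer 14: invert the value, then scan forward."""
    return bsf(~x & MASK64)

def last_zero_bit(x):
    """Answer 15: invert the value, then scan in reverse."""
    return bsr(~x & MASK64)

def popcnt(x):
    """Number of set bits."""
    return bin(x).count("1")

def pext(src, mask):
    """Gather the src bits selected by mask into the LO bits of the result."""
    result, out = 0, 0
    while mask:
        low = mask & -mask            # lowest remaining mask bit
        if src & low:
            result |= 1 << out
        out += 1
        mask &= mask - 1              # clear that mask bit
    return result

def pdep(src, mask):
    """Scatter the LO bits of src to the positions selected by mask."""
    result, k = 0, 0
    while mask:
        low = mask & -mask
        if (src >> k) & 1:
            result |= low
        k += 1
        mask &= mask - 1
    return result
```

With these definitions, pdep(pext(x, m), m) equals x & m for any x and m, which is a handy way to remember that the two operations are inverses over the masked bits.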

E.13 Answers to Questions in Chapter 13

  1. Compile-time language
  2. During the assembly and compilation process
  3. echo (or %out)
  4. .err
  5. The = directive
  6. !
  7. It replaces an expression with text representing the value of that compile-time expression.
  8. It replaces a text symbol with the expansion of its text.
  9. It concatenates two or more textual strings at assembly time and stores the result into a text symbol.
  10. It searches for a substring within a larger string in a MASM text object and returns the index of the substring within that object, or 0 if the substring does not appear in the larger string.
  11. It returns the length of a MASM text string.
  12. It returns a substring from a larger MASM text string.
  13. if, elseif, else, and endif
  14. while, for, forc, and endm
  15. forc
  16. macro, endm
  17. Specify the macro’s name where you want the text expansion to occur.
  18. As operands to the macro directive
  19. Specify :req after the parameter name in the macro operand field.
  20. Macro parameters are optional, by default, if they don’t have the :req suffix.
  21. Use the :vararg suffix after the last macro parameter declaration.
  22. Use conditional assembly directives such as ifb or ifnb to see if the actual macro argument is blank.
  23. Use the local directive.
  24. exitm
  25. Use exitm <text>.
  26. opattr

E.14 Answers to Questions in Chapter 14

  1. Bytes, words, dwords, and qwords
  2. movs, cmps, scas, stos, and lods
  3. Bytes and words
  4. RSI, RDI, and RCX
  5. RSI and RDI
  6. RCX, RSI, and AL
  7. RDI and EAX
  8. Dir = 0
  9. Dir = 1
  10. Clear the direction flag; alternatively, preserve its value.
  11. Clear
  12. movs and stos
  13. When the source and destination blocks overlap and the source address starts at a lower memory address than the destination block
  14. This is the default condition; you would also clear the direction flag when the source and destination blocks overlap and the source address starts at a higher memory address than the destination block.
  15. Portions of the source block can be replicated in the destination block.
  16. repe
  17. Direction flag should be clear.
  18. No, string instructions test RCX prior to the string operation when using a repeat prefix.
  19. scasb
  20. stos
  21. lods and stos
  22. lods
  23. Verify that the CPU supports SSE 4.2 instructions.
  24. pcmpistri and pcmpistrm
  25. pcmpestri and pcmpestrm
  26. RAX holds the src1 length, and RDX holds the src2 length.
  27. Equal any, or possibly, equal range
  28. Equal each
  29. Equal ordered
  30. The pcmpXstrY instructions always read 16 bytes of memory, even if the string is shorter than this, so there is the possibility of an MMU page fault when they read data beyond the end of the string.
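
The direction-flag answers above (13 through 15 and 18) are easier to see with a tiny simulation. The sketch below is a Python model of rep movsb, with a bytearray standing in for memory; the function name and its parameters are hypothetical.

```python
def rep_movsb(mem, rsi, rdi, rcx, direction=0):
    """Copy rcx bytes from mem[rsi] to mem[rdi], one byte at a time.

    direction=0 models a clear direction flag (addresses increment);
    direction=1 models a set direction flag (addresses decrement).
    """
    step = -1 if direction else 1
    while rcx != 0:                   # RCX is tested before each move
        mem[rdi] = mem[rsi]
        rsi += step
        rdi += step
        rcx -= 1
    return rsi, rdi, rcx

# Overlapping blocks, source below destination, direction flag clear:
# the first two bytes are replicated through the destination.
forward = bytearray(b"ABCDEF")
rep_movsb(forward, 0, 2, 4)

# Same blocks with the direction flag set, starting at the last byte
# of each block: the source block arrives intact.
backward = bytearray(b"ABCDEF")
rep_movsb(backward, 3, 5, 4, direction=1)
```

After these calls, forward holds ABABAB (replication) and backward holds ABABCD (a correct move). Calling rep_movsb with rcx equal to 0 leaves memory untouched, matching answer 18.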

E.15 Answers to Questions in Chapter 15

  1. ifndef and endif
  2. The assembly of a source file plus any files it includes or indirectly includes
  3. public
  4. extern and externdef
  5. externdef
  6. abs
  7. proc
  8. nmake.exe
  9. Multiple blocks of the following form:
    target: dependencies
        commands
  10. A dependent file is one that the current file depends on for its proper operation; the dependent file must be updated and built prior to the compilation and linking of the current file.
  11. Delete old object and executable files, and delete other cruft.
  12. A collection of object files

E.16 Answers to Questions in Chapter 16

  1. /subsystem:console
  2. https://www.masm32.com/
  3. It slows the assembly process.
  4. /entry:procedure_name
  5. MessageBox
  6. Code that surrounds a call to a function and that changes the way you call the function (for example, parameter order and location)
  7. __imp_CreateFileA
  8. __imp_GetLastError