Most assembly language source files aren’t stand-alone programs. They’re components of a large set of source files, in different languages, compiled and linked together to form complex applications. Programming in the large is the term software engineers have coined to describe the processes, methodologies, and tools for handling the development of large software projects.
While everyone has their own idea of what large is, separate compilation is one of the more popular techniques that support programming in the large. Using separate compilation, you first break your large source files into manageable chunks. Then you compile the separate files into object code modules. Finally, you link the object modules together to form a complete program. If you need to make a small change to one of the modules, you need to reassemble only that one module; you do not need to reassemble the entire program. Once you’ve debugged and tested a large section of your code, continuing to assemble that same code when you make a small change to another part of your program is a waste of time. Imagine having to wait 20 or 30 minutes on a fast PC to assemble a program to which you’ve made a one-line change!
The following sections describe the tools MASM provides for separate compilation and how to effectively employ these tools in your programs for modularity and reduced development time.
The include directive, when encountered in a source file, merges a specified file into the compilation at the point of the include directive. The syntax for the include directive is
include filename
where filename is a valid filename. By convention, MASM include files have an .inc (include) suffix, but the name of any file containing MASM assembly language source will work fine. A file being included into another file during assembly may itself include files.
Using the include directive by itself does not provide separate compilation. You could use the include directive to break a large source file into separate modules and join these modules together when you compile your file. The following example would include the print.inc and getTitle.inc files during the compilation of your program:
include print.inc
include getTitle.inc
Now your program will benefit from modularity. Alas, you will not save any development time. The include directive inserts the source file at the point of the include during compilation, exactly as though you had typed that code yourself. MASM still has to compile the code, and that takes time. If you are including a large number of source files (such as a huge library) into your assembly, the compilation process could take forever.
In general, you should not use the include directive to include source code as shown in the preceding example. Instead, you should use the include directive to insert a common set of constants, types, external procedure declarations, and other such items into a program. Typically, an assembly language include file does not contain any machine code (outside of a macro; see Chapter 13 for details). The purpose of using include files in this manner will become clearer after you see how the external declarations work.
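As a rough sketch of such a declarations-only include file, consider the following. The filename consts.inc and the names maxLen and point are hypothetical, chosen only for illustration; the file contains equates and a type definition but emits no machine code or data:

```masm
; consts.inc (hypothetical header): declarations only,
; nothing here generates machine code or data bytes.

nl          =       10              ; Newline character code
maxLen      =       256             ; Hypothetical buffer-size constant

point       struct                  ; Type declaration; reserves no storage
x           dword   ?
y           dword   ?
point       ends
```

Any source file that needs these names would simply place include consts.inc near its beginning.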
As you begin to develop sophisticated modules and libraries, you will eventually discover a big problem: some header files need to include other header files. Well, this isn’t actually a big problem, but a problem will occur when one header file includes another, and that second header file includes another, and that third header file includes another, and . . . that last header file includes the first header file. Now this is a big problem, because it creates an infinite loop in the compiler and makes MASM complain about duplicate symbol definitions. After all, the first time it reads the header file, it processes all the declarations in that file; the second time around, it views all those symbols as duplicate symbols.
The standard technique for ignoring duplicate includes, well-known to C/C++ programmers, is to use conditional assembly to have MASM ignore the content of an include file. (See “Conditional Assembly (Compile-Time Decisions)” in Chapter 13.) The trick is to place an ifndef (if not defined) statement around all statements in the include file. You specify the include file’s filename as the ifndef operand, substituting underlines for periods (any other symbol that is otherwise undefined would work just as well). Then, immediately after the ifndef statement, you define that symbol (using a numeric equate and assigning the symbol the constant 0 is typical). Here’s an example of this ifndef usage in action:
ifndef myinclude_inc ; Filename: myinclude.inc
myinclude_inc = 0
; Put all the source code lines for the include file here
; The following statement should be the last non-blank line
; in the source file:
endif ; myinclude_inc
On the second inclusion, MASM simply skips over the contents of the include file (including any include directives), which prevents the infinite loop and all the duplicate symbol definitions.
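To make the circular case concrete, here is a minimal sketch of two headers that include each other, each protected by its own guard symbol. The filenames a.inc and b.inc and the constants widgetMax and gadgetMax are hypothetical; the comments trace what happens when a source file includes a.inc first:

```masm
; ---- a.inc (hypothetical header) ----
ifndef a_inc
a_inc       =       0               ; Guard symbol is now defined

            include b.inc           ; b.inc includes a.inc in turn

widgetMax   =       100

endif       ; a_inc

; ---- b.inc (hypothetical header) ----
ifndef b_inc
b_inc       =       0

            include a.inc           ; a_inc already defined, so MASM skips
                                    ; the body of a.inc: the cycle ends here
gadgetMax   =       200

endif       ; b_inc
```

Note that the guard symbol is defined before the nested include statement; if it were defined afterward, the cycle would not be broken.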
An assembly unit is the assembly of a source file plus any files it includes or indirectly includes. An assembly unit produces a single .obj file after assembly. The Microsoft linker takes multiple object files (produced by MASM or other compilers, such as MSVC) and combines them into a single executable unit (an .exe file). The main purpose of this section (and, indeed, this whole chapter) is to describe how these assembly units (.obj files) communicate linkage information to one another during the linking process. Assembly units are the basis for creating modular programs in assembly language.
To use MASM’s assembly unit facilities, you must create at least two source files. One file contains a set of variables and procedures used by the second. The second file uses those variables and procedures without knowing how they’re implemented.
Instead of using the include directive to create modular programs, which wastes time because MASM must recompile bug-free code every time you assemble the main program, a much better solution would be to preassemble the debugged modules and link the object code modules together. This is what the public, extern, and externdef directives allow you to do.
Technically, all of the programs appearing in this book up to this point have been separately assembled modules (which happen to link with a C/C++ main program rather than another assembly language module). The assembly language main program named asmMain is nothing but a function compatible with C++ that the generic c.cpp program has called from its main program. Consider the body of asmMain from Listing 2-1 in Chapter 2:
; Here is the "asmMain" function.
public asmMain
asmMain proc
.
.
.
asmMain endp
The public asmMain statement has been included in every program that has had an asmMain function without any definition or explanation. Well, now it’s time to deal with that oversight.
Normal symbols in a MASM source file are private to that particular source file and are inaccessible from other source files (which don’t directly include the file containing those private symbols, of course). That is, the scope of most symbols in a source file is limited to those lines of code within that particular source file (and any files it includes). The public directive tells MASM to make the specified symbol global, that is, accessible by other assembly units during the link phase. Through the public asmMain statement, the sample programs appearing throughout this book have made the asmMain symbol global so that the c.cpp program can call the asmMain function.
Simply making a symbol public is insufficient to use that symbol in another source file. The source file that wants to use the symbol must also declare that symbol as an external symbol. This notifies the linker that it will have to patch in the address of a public symbol whenever the file with the external declaration uses that symbol. For example, the c.cpp source file defines the asmMain symbol as external in the following lines of code (for what it’s worth, this declaration also defines the external symbols getTitle and readLine):
// extern "C" namespace prevents
// "name mangling" by the C++
// compiler.
extern "C"
{
// asmMain is the assembly language
// code's "main program":
void asmMain(void);
// getTitle returns a pointer to a
// string of characters from the
// assembly code that specifies the
// title of that program (which makes
// this program generic and usable
// with a large number of sample
// programs in "The Art of 64-Bit
// Assembly").
char *getTitle(void);
// C++ function that the assembly
// language program can call:
int readLine(char *dest, int maxLen);
};
Note, in this example, that readLine is a C++ function defined in the c.cpp source file. C/C++ does not have an explicit public declaration. Instead, if you supply the source code for a function in a source file that declares that function to be external, C/C++ will automatically make that symbol public by virtue of the external declaration.
MASM actually has two external symbol declaration directives: extern and externdef. These two directives use the syntax
extern symbol:type {optional_list_of_symbol:type_pairs}
externdef symbol:type {optional_list_of_symbol:type_pairs}
where symbol is the identifier you want to use from another assembly unit, and type is the data type of that symbol. The data type can be any of the following:
proc, which indicates that the symbol is a procedure (function) name or a statement label
Any MASM built-in data type (such as byte, word, dword, qword, oword, and so on)
abs, which indicates a constant value
The abs type isn’t for declaring generic external constants such as someConst = 0. Pure constant declarations, such as this one, would normally appear in a header file (an include file), which this section will describe shortly. Instead, the abs type is generally reserved for constants that are based on code offsets within an object module. For example, if you have the following code in an assembly unit,
public someLen
someStr byte "abcdefg"
someLen = $-someStr
someLen’s type, in an extern declaration, would be abs.
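As an illustration of the consuming side, the following sketch shows how a second assembly unit might refer to someLen. The filename consumer.asm and the procedure name getLen are hypothetical, and the example assumes that MASM accepts an abs external wherever an immediate constant operand is allowed, leaving the actual value for the linker to fill in:

```masm
; consumer.asm (hypothetical file name)

            extern  someLen:abs     ; Constant defined in another unit

            .code
            public  getLen
getLen      proc
            mov     eax, someLen    ; Immediate value supplied at link time
            ret
getLen      endp
            end
```

Because the assembler does not know the value of someLen while assembling this file, expressions that must be fully resolved at assembly time (such as a dup count) cannot use it.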
Both directives use a comma-delimited list to allow multiple symbol declarations; for example:
extern p:proc, b:byte, d:dword, a:abs
I’d argue, however, that your programs will be more readable if you limit your external declarations to one symbol per statement.
When you place an extern directive in your program, MASM treats that declaration the same as any other symbol declaration. If the symbol already exists, MASM will generate a symbol-redefinition error. Generally, you should place all external declarations near the beginning of the source file to avoid any scoping or forward reference issues. Because the public directive does not actually define the symbol, the placement of the public directive is not as critical. Some programmers put all the public declarations at the beginning of a source file; others put the public declaration right before the definition of the symbol (as I’ve done with the asmMain symbol in most of the sample programs). Either position is fine.
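The following sketch ties these pieces together from the assembly side: it declares the C++ readLine function (from the c.cpp declarations shown earlier) as external near the top of the file and then calls it. The names buffer, bufLen, and readInput are hypothetical, and the code assumes the usual Microsoft x64 calling convention (arguments in RCX and RDX, 32 bytes of shadow storage, 16-byte stack alignment at the call):

```masm
; readinput.asm (hypothetical file name)

            extern  readLine:proc   ; int readLine(char *dest, int maxLen)

bufLen      =       256             ; Hypothetical buffer size

            .data
buffer      byte    bufLen dup (?)

            .code
            public  readInput
readInput   proc
            sub     rsp, 40         ; 32 bytes shadow storage + 8 to align
            lea     rcx, buffer     ; dest
            mov     edx, bufLen     ; maxLen
            call    readLine        ; Linker patches in readLine's address
            add     rsp, 40
            ret                     ; readLine's result is still in EAX
readInput   endp
            end
```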
Because a public symbol from one source file can be used by many assembly units, a small problem develops: you have to replicate the extern directive in all the files that use that symbol. For a small number of symbols, this is not much of a problem. However, as the number of external symbols increases, maintaining all these external symbols across multiple source files becomes burdensome. The MASM solution is the same as the C/C++ solution: header files.
Header files are include files that contain external (and other) declarations that are common among multiple assembly units. They are called header files because the include statement that injects their code into a source file normally appears at the beginning (at the head) of the source file that uses them. This turns out to be the primary use of include files in MASM: to include external (and other) common declarations.
When you start using header files with large sets of library modules (assembly units), you’ll quickly discover a huge problem with the extern directive. Typically, you will create a single header file for a large set of library functions, with each function possibly appearing in its own assembly unit. Some library functions might use other functions in the same library module (a collection of object files); therefore, that particular library function’s source file might want to include the header file for the library in order to reference the external name of the other library function.
Unfortunately, if the header file contains the external definition for the function in the current source file, a symbol redefinition error occurs:
; header.inc
ifndef header_inc
header_inc = 0
extern func1:proc
extern func2:proc
endif ; header_inc
Assembly of the following source file produces an error because func1 is already defined in the header.inc include file:
; func1.asm
include header.inc
.code
func1 proc
.
.
.
call func2
.
.
.
func1 endp
end
C/C++ doesn’t suffer from this problem because the extern keyword doubles as both a public and an external declaration.
To overcome this problem, MASM introduced the externdef directive. This directive is similar to C/C++’s extern keyword: it behaves like an extern directive when the symbol is not present in a source file, and it behaves like a public directive when the symbol is defined in a source file. In addition, multiple externdef declarations for the same symbol may appear in a source file (though they should specify the same type for the symbol if multiple declarations do appear). Consider the previous header.inc header file modified to use externdef definitions:
; header.inc
ifndef header_inc
header_inc = 0
externdef func1:proc
externdef func2:proc
endif ; header_inc
Using this header file, the func1.asm assembly unit will compile correctly.
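For completeness, here is a sketch of what a companion func2.asm might look like when it includes the same externdef-based header. The body of func2 is a hypothetical placeholder; the point is only that the externdef func2:proc line acts like a public declaration in this file (because func2 is defined here), while externdef func1:proc acts like an extern declaration:

```masm
; func2.asm (hypothetical companion to func1.asm)

            include header.inc      ; externdef func1:proc, func2:proc

            .code
func2       proc                    ; Defined here, so externdef makes
            xor     eax, eax        ; it public; placeholder body
            ret                     ; simply returns 0 in EAX
func2       endp
            end
```

No separate public func2 statement is needed in this file.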
Way back in “The MASM Include Directive” in Chapter 11, I started putting the print and getTitle functions in include files so that I could simply include them in every source file that needed to use these functions rather than manually cutting and pasting these functions into every program. Clearly, these are good examples of programs that should be made into assembly units and linked with other programs rather than being included during assembly.
Listing 15-1 is a header file that incorporates the necessary print and getTitle declarations:
; aoalib.inc - Header file containing external function
; definitions, constants, and other items used
; by code in "The Art of 64-Bit Assembly."
ifndef aoalib_inc
aoalib_inc equ 0
; Constant definitions:
; nl (newline constant):
nl = 10
; SSE4.2 feature flags (in ECX):
SSE42 = 00180000h ; Bits 19 and 20
AVXSupport = 10000000h ; Bit 28
; CPUID bits (EAX = 7, EBX register):
AVX2Support = 20h ; Bit 5 = AVX2
; **********************************************************
; External data declarations:
externdef ttlStr:byte
; **********************************************************
; External function declarations:
externdef print:qword
externdef getTitle:proc
; Definition of C/C++ printf function that
; the print function will call (and some
; AoA sample programs call this directly,
; as well).
externdef printf:proc
endif ; aoalib_inc
Listing 15-1: aoalib.inc header file
Listing 15-2 contains the print function used in “The MASM Include Directive” in Chapter 11 converted to an assembly unit.
; print.asm - Assembly unit containing the SSE/AVX dynamically
; selectable print procedures.
include aoalib.inc
.data
align qword
print qword choosePrint ; Pointer to print function
.code
; print - "Quick" form of printf that allows the format string to
; follow the call in the code stream. Supports up to five
; additional parameters in RDX, R8, R9, R10, and R11.
; This function saves all the Microsoft ABI–volatile,
; parameter, and return result registers so that code
; can call it without worrying about any registers being
; modified (this code assumes that Windows ABI treats
; YMM6 to YMM15 as nonvolatile).
; Of course, this code assumes that AVX instructions are
; available on the CPU.
; Allows up to 5 arguments in:
; RDX - Arg #1
; R8 - Arg #2
; R9 - Arg #3
; R10 - Arg #4
; R11 - Arg #5
; Note that you must pass floating-point values in
; these registers as well. The printf function
; expects real values in the integer registers.
; There are two versions of this program, one that
; will run on CPUs without AVX capabilities (no YMM
; registers) and one that will run on CPUs that
; have AVX capabilities (YMM registers). The difference
; between the two is which registers they preserve
; (print_SSE preserves only XMM registers and will
; run properly on CPUs that don't have YMM register
; support; print_AVX will preserve the volatile YMM
; registers on CPUs with AVX support).
; On first call, determine if we support AVX instructions
; and set the "print" pointer to point at print_AVX or
; print_SSE:
choosePrint proc
push rax ; Preserve registers that get
push rbx ; tweaked by CPUID
push rcx
push rdx
mov eax, 1
cpuid
test ecx, AVXSupport ; Test bit 28 for AVX
jnz doAVXPrint
lea rax, print_SSE ; From now on, call
mov print, rax ; print_SSE directly
; Return address must point at the format string
; following the call to this function! So we have
; to clean up the stack and JMP to print_SSE.
pop rdx
pop rcx
pop rbx
pop rax
jmp print_SSE
doAVXPrint: lea rax, print_AVX ; From now on, call
mov print, rax ; print_AVX directly
; Return address must point at the format string
; following the call to this function! So we have
; to clean up the stack and JMP to print_AVX.
pop rdx
pop rcx
pop rbx
pop rax
jmp print_AVX
choosePrint endp
; Version of print that will preserve volatile
; AVX registers (YMM0 to YMM7):
thestr byte "YMM4:%I64x", nl, 0
print_AVX proc
; Preserve all the volatile registers
; (be nice to the assembly code that
; calls this procedure):
push rax
push rbx
push rcx
push rdx
push r8
push r9
push r10
push r11
; YMM0 to YMM7 are considered volatile, so preserve them:
sub rsp, 256
vmovdqu ymmword ptr [rsp + 000], ymm0
vmovdqu ymmword ptr [rsp + 032], ymm1
vmovdqu ymmword ptr [rsp + 064], ymm2
vmovdqu ymmword ptr [rsp + 096], ymm3
vmovdqu ymmword ptr [rsp + 128], ymm4
vmovdqu ymmword ptr [rsp + 160], ymm5
vmovdqu ymmword ptr [rsp + 192], ymm6
vmovdqu ymmword ptr [rsp + 224], ymm7
push rbp
returnAdrs textequ <[rbp + 328]>
mov rbp, rsp
sub rsp, 256
and rsp, -16
; Format string (passed in RCX) is sitting at
; the location pointed at by the return address;
; load that into RCX:
mov rcx, returnAdrs
; To handle more than three arguments (four counting
; RCX), you must pass data on stack. However, to the
; print caller, the stack is unavailable, so use
; R10 and R11 as extra parameters (could be just
; junk in these registers, but pass them just
; in case).
mov [rsp + 32], r10
mov [rsp + 40], r11
call printf
; Need to modify the return address so
; that it points beyond the zero-terminating byte.
; Could use a fast strlen function for this, but
; printf is so slow it won't really save us anything.
mov rcx, returnAdrs
dec rcx
skipTo0: inc rcx
cmp byte ptr [rcx], 0
jne skipTo0
inc rcx
mov returnAdrs, rcx
leave
vmovdqu ymm0, ymmword ptr [rsp + 000]
vmovdqu ymm1, ymmword ptr [rsp + 032]
vmovdqu ymm2, ymmword ptr [rsp + 064]
vmovdqu ymm3, ymmword ptr [rsp + 096]
vmovdqu ymm4, ymmword ptr [rsp + 128]
vmovdqu ymm5, ymmword ptr [rsp + 160]
vmovdqu ymm6, ymmword ptr [rsp + 192]
vmovdqu ymm7, ymmword ptr [rsp + 224]
add rsp, 256
pop r11
pop r10
pop r9
pop r8
pop rdx
pop rcx
pop rbx
pop rax
ret
print_AVX endp
; Version that will run on CPUs without
; AVX support and will preserve the
; volatile SSE registers (XMM0 to XMM7):
print_SSE proc
; Preserve all the volatile registers
; (be nice to the assembly code that
; calls this procedure):
push rax
push rbx
push rcx
push rdx
push r8
push r9
push r10
push r11
; XMM0 to XMM7 are considered volatile, so preserve them:
sub rsp, 128
movdqu xmmword ptr [rsp + 00], xmm0
movdqu xmmword ptr [rsp + 16], xmm1
movdqu xmmword ptr [rsp + 32], xmm2
movdqu xmmword ptr [rsp + 48], xmm3
movdqu xmmword ptr [rsp + 64], xmm4
movdqu xmmword ptr [rsp + 80], xmm5
movdqu xmmword ptr [rsp + 96], xmm6
movdqu xmmword ptr [rsp + 112], xmm7
push rbp
returnAdrs textequ <[rbp + 200]>
mov rbp, rsp
sub rsp, 128
and rsp, -16
; Format string (passed in RCX) is sitting at
; the location pointed at by the return address;
; load that into RCX:
mov rcx, returnAdrs
; To handle more than three arguments (four counting
; RCX), you must pass data on stack. However, to the
; print caller, the stack is unavailable, so use
; R10 and R11 as extra parameters (could be just
; junk in these registers, but pass them just
; in case):
mov [rsp + 32], r10
mov [rsp + 40], r11
call printf
; Need to modify the return address so
; that it points beyond the zero-terminating byte.
; Could use a fast strlen function for this, but
; printf is so slow it won't really save us anything.
mov rcx, returnAdrs
dec rcx
skipTo0: inc rcx
cmp byte ptr [rcx], 0
jne skipTo0
inc rcx
mov returnAdrs, rcx
leave
movdqu xmm0, xmmword ptr [rsp + 00]
movdqu xmm1, xmmword ptr [rsp + 16]
movdqu xmm2, xmmword ptr [rsp + 32]
movdqu xmm3, xmmword ptr [rsp + 48]
movdqu xmm4, xmmword ptr [rsp + 64]
movdqu xmm5, xmmword ptr [rsp + 80]
movdqu xmm6, xmmword ptr [rsp + 96]
movdqu xmm7, xmmword ptr [rsp + 112]
add rsp, 128
pop r11
pop r10
pop r9
pop r8
pop rdx
pop rcx
pop rbx
pop rax
ret
print_SSE endp
end
Listing 15-2: The print function appearing in an assembly unit
To complete all the common aoalib functions used thus far, here is Listing 15-3.
; getTitle.asm - The getTitle function converted to
; an assembly unit.
; Return program title to C++ program:
include aoalib.inc
.code
getTitle proc
lea rax, ttlStr
ret
getTitle endp
end
Listing 15-3: The getTitle function as an assembly unit
Listing 15-4 is a program that uses the assembly units in Listings 15-2 and 15-3.
; Listing 15-4
; Demonstration of linking.
include aoalib.inc
.data
ttlStr byte "Listing 15-4", 0
; ***************************************************************
; Here is the "asmMain" function.
.code
public asmMain
asmMain proc
push rbx
push rsi
push rdi
push rbp
mov rbp, rsp
sub rsp, 56 ; Shadow storage
call print
byte "Assembly units linked", nl, 0
leave
pop rdi
pop rsi
pop rbx
ret ; Returns to caller
asmMain endp
end
Listing 15-4: A main program that uses the print and getTitle assembly modules
So how do you build and run this program? Unfortunately, the build.bat batch file this book has been using up to this point will not do the job. Here are the commands that will assemble all the units and link them together:
ml64 /c print.asm getTitle.asm listing15-4.asm
cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
These commands will properly compile all source files and link together their object code to produce the executable file c.exe.
Unfortunately, the preceding commands defeat one of the major benefits of separate compilation. When you issue the ml64 /c print.asm getTitle.asm listing15-4.asm command, it will compile all the assembly source files. Remember, a major reason for separate compilation is to reduce compilation time on large projects. While the preceding commands work, they don’t achieve this goal.
To compile the three source files separately, you must run MASM separately on each of them; that is, break the ml64 invocation into three separate commands:
ml64 /c print.asm
ml64 /c getTitle.asm
ml64 /c listing15-4.asm
cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
Of course, this sequence still compiles all three assembly source files. However, after the first time you execute these commands, you’ve built the print.obj and getTitle.obj files. From this point forward, as long as you don’t change the print.asm or getTitle.asm source files (and don’t delete the print.obj or getTitle.obj files), you can build and run the program in Listing 15-4 by using these commands:
ml64 /c listing15-4.asm
cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
Now, you’ve saved the time needed to compile the print.asm and getTitle.asm files.
The build.bat file used throughout this book has been far more convenient than typing the individual build commands. Unfortunately, the build mechanism that build.bat supports is really good for only a few fixed source files. While you could easily construct a batch file to compile all the files in a large assembly project, running the batch file would reassemble every source file in the project. Although you can use complex command line functions to avoid some of this, there is an easier way: makefiles.
A makefile is a script in a special language (designed in early releases of Unix) that specifies how to execute a series of commands based on certain conditions, executed by the program make. If you’ve installed MSVC and MASM as part of Visual Studio, you’ve probably also installed (as part of that same process) Microsoft’s variant of make: nmake.exe. To use nmake.exe, you execute it from a Windows command line as follows:
nmake optional_arguments
If you execute nmake on a command line by itself (without any arguments), nmake.exe will search for a file named makefile and attempt to process the commands in that file. For many projects, this is very convenient. You will have all your project’s source files in a single directory (or in subdirectories hanging off that directory), and you will place a single makefile (named makefile) in that directory. By changing into that directory and executing nmake (or make), you can build the project with minimal fuss.
If you want to use a different filename than makefile, you must preface the filename with the /f option, as follows:
nmake /f mymake.mak
The filename doesn’t have to have the extension .mak. However, this is a popular convention when using makefiles that are not named makefile.
The nmake program does provide many command line options, and /help will list them. Look up nmake documentation online for a description of the other command line options (most of them are advanced and are unnecessary for most tasks).
A makefile is a standard ASCII text file containing a sequence of lines (or a set of multiple occurrences of this sequence) as follows:
target: dependencies
commands
The target: dependencies line is optional. The commands item is a list of one or more command line commands, also optional. The target item, if present, must begin in column 1 of the source line it is on. The commands items must have at least one whitespace character (space or tab) in front of them (that is, they must not begin in column 1 of the source line). Consider the following valid makefile:
c.exe:
    ml64 /c print.asm
    ml64 /c getTitle.asm
    ml64 /c listing15-4.asm
    cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
If these commands appear in a file named makefile and you execute nmake, then nmake will execute these commands exactly like the command line interpreter would have executed them had they appeared in a batch file.
A target item is an identifier or a filename of some sort. Consider the following makefile:
executable:
    ml64 /c listing15-4.asm
    cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
library:
    ml64 /c print.asm
    ml64 /c getTitle.asm
This separates the build commands into two groups: one group specified by the executable label and one group specified by the library label.
If you run nmake without any command line options, nmake will execute only those commands associated with the very first target in the makefile. In this example, if you run nmake by itself, it will assemble listing15-4.asm, compile c.cpp, and attempt to link the resulting c.obj with print.obj, getTitle.obj, and listing15-4.obj. This successfully produces the c.exe executable only if print.obj and getTitle.obj already exist (for example, because you have previously built the library target).
To process the commands after the library target, specify the target name as an nmake command line argument:
nmake library
This nmake command compiles print.asm and getTitle.asm. So if you execute this command once (and never change print.asm or getTitle.asm thereafter), you need only execute the nmake command by itself to generate the executable file (or use nmake executable if you want to explicitly state that you are building the executable).
Although the ability to specify which targets you want to build on the command line is very useful, as your projects get larger (with many source files and library modules), keeping track of which source files you need to recompile all the time can be burdensome and error prone; if you’re not careful, you’ll forget to compile an obscure library module after you’ve made changes to it and wonder why the application is still failing. The make dependencies option allows you to automate the build process to help avoid these problems.
A list of one or more (whitespace-separated) dependencies can follow a target in a makefile:
target: dependency1 dependency2 dependency3 ...
Dependencies are either target names (of targets appearing in that makefile) or filenames. If a dependency is a target name (that is not also a filename), nmake will go execute the commands associated with that target. Consider the following makefile:
executable:
    ml64 /c listing15-4.asm
    cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
library:
    ml64 /c print.asm
    ml64 /c getTitle.asm
all: library executable
The all target depends on the library and executable targets, so it will go execute the commands associated with those targets (and in the order library, executable, which is important because the library object files must be built before the associated object modules can be linked into the executable program). The all identifier is a common target in makefiles. Indeed, it is often the first or second target to appear in a makefile.
If a target: dependencies line becomes too long to be readable (nmake doesn’t really care too much about line length), you can break the line into multiple lines by putting a backslash character (\) as the last character on a line. The nmake program will combine source lines that end with a backslash with the next line in the makefile.
Target names and dependencies can also be filenames. Specifying a filename as a target name is generally done to tell the make system how to build that particular file. For example, we could rewrite the current example as follows:
executable:
    ml64 /c listing15-4.asm
    cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
library: print.obj getTitle.obj
print.obj:
    ml64 /c print.asm
getTitle.obj:
    ml64 /c getTitle.asm
all: library executable
When dependencies are associated with a target that is a filename, you can read the target: dependencies statement as “target depends on dependencies.” When processing a make command, nmake compares the modification date and time stamp of the files specified as target filenames and dependency filenames.
If the date and time of the target are older than any of the dependencies (or the target file doesn’t exist), nmake will execute the commands after the target. If the target file’s modification date and time are later (newer) than all of the dependent files, nmake will not execute the commands. If one of the dependencies after a target is itself a target elsewhere, nmake will first execute that command (to see if it modifies the target object, changing its modification date and time, and possibly causing nmake to execute the current target’s commands). If a target or dependency is just a label (it is not a filename), nmake will treat its modification date and time as older than any file.
Consider the following modification to the running makefile example:
c.exe: print.obj getTitle.obj listing15-4.obj
    cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj
listing15-4.obj: listing15-4.asm
    ml64 /c listing15-4.asm
print.obj: print.asm
    ml64 /c print.asm
getTitle.obj: getTitle.asm
    ml64 /c getTitle.asm
Note that the all and library targets were removed (they turn out to be unnecessary) and that executable was changed to c.exe (the final target executable file).
Consider the c.exe target. Because print.obj, getTitle.obj, and listing15-4.obj are all targets (as well as filenames), nmake will first go execute those targets. After executing those targets, nmake will compare the modification date and time of c.exe against that of the three object files. If c.exe is older than any of those object files, nmake will execute the command following the c.exe target line (to compile c.cpp and link it with the object files). If c.exe is newer than its dependent object files, nmake will not execute the command.
The same process happens, recursively, for each of the dependent object files following the c.exe target. While processing the c.exe target, nmake will go off and process the print.obj, getTitle.obj, and listing15-4.obj targets (in that order). In each case, nmake will compare the modification date and time of the .obj file with the corresponding .asm file. If the .obj file is newer than the .asm file, nmake returns to processing the c.exe target without doing anything; if the .obj file is older than the .asm file (or doesn’t exist), nmake executes the corresponding ml64 command to generate a new .obj file.
If c.exe is newer than all the .obj files (and they are all newer than the .asm files), executing nmake does nothing (well, it will report that c.exe is up to date, but it will not process any of the commands in the makefile). If any of the files are out of date (because they’ve been modified), this makefile will compile and link only the files necessary to bring c.exe up to date.
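To make the recursive walk concrete, here is a small Python model of it. It is a sketch under stated assumptions: the RULES table simply transcribes the makefile above, timestamps are plain integers kept in a dictionary instead of real file times, and the build function is a hypothetical name; it is not how nmake is actually implemented.

```python
# The c.exe makefile, as target -> (dependencies, command).
RULES = {
    "c.exe": (["print.obj", "getTitle.obj", "listing15-4.obj"],
              "cl /EHa c.cpp print.obj getTitle.obj listing15-4.obj"),
    "listing15-4.obj": (["listing15-4.asm"], "ml64 /c listing15-4.asm"),
    "print.obj": (["print.asm"], "ml64 /c print.asm"),
    "getTitle.obj": (["getTitle.asm"], "ml64 /c getTitle.asm"),
}

def build(target, rules, mtimes, log):
    """Bring `target` up to date; return its timestamp (None if it is missing)."""
    if target not in rules:
        # A plain source file: nothing to do, just report its timestamp.
        return mtimes.get(target)
    deps, command = rules[target]
    # Process every dependency first, in the order listed.
    dep_times = [build(dep, rules, mtimes, log) for dep in deps]
    target_time = mtimes.get(target)
    stale = target_time is None or any(
        t is not None and t > target_time for t in dep_times
    )
    if stale:
        log.append(command)  # record the command that would run
        # Running the command makes the target newer than everything else.
        mtimes[target] = max(mtimes.values(), default=0) + 1
    return mtimes[target]

if __name__ == "__main__":
    # Only the .asm files exist, so everything has to be built.
    times = {"print.asm": 1, "getTitle.asm": 1, "listing15-4.asm": 1}
    commands = []
    build("c.exe", RULES, times, commands)
    for line in commands:
        print(line)
```

Calling build a second time with the same timestamps records no commands, and raising only the timestamp of print.asm records just the ml64 command for print.asm followed by the cl command.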
The makefiles thus far are missing an important dependency: all of the .asm files include the aoalib.inc file. A change to aoalib.inc could possibly require a recompilation of these .asm files. This dependency has been added to Listing 15-5. This listing also demonstrates how to include comments in a makefile by using the # character at the beginning of a line.
# listing15-5.mak
#
# makefile for Listing 15-4.

listing15-4.exe: print.obj getTitle.obj listing15-4.obj
    cl /nologo /O2 /Zi /utf-8 /EHa /Felisting15-4.exe c.cpp \
        print.obj getTitle.obj listing15-4.obj

listing15-4.obj: listing15-4.asm aoalib.inc
    ml64 /nologo /c listing15-4.asm

print.obj: print.asm aoalib.inc
    ml64 /nologo /c print.asm

getTitle.obj: getTitle.asm aoalib.inc
    ml64 /nologo /c getTitle.asm
Listing 15-5: Makefile to build Listing 15-4
Here’s the nmake command to build the program in Listing 15-4 by using the makefile (listing15-5.mak) in Listing 15-5:
C:\>nmake /f listing15-5.mak
Microsoft (R) Program Maintenance Utility Version 14.15.26730.0
Copyright (C) Microsoft Corporation. All rights reserved.
ml64 /nologo /c print.asm
Assembling: print.asm
ml64 /nologo /c getTitle.asm
Assembling: getTitle.asm
ml64 /nologo /c listing15-4.asm
Assembling: listing15-4.asm
cl /nologo /O2 /Zi /utf-8 /EHa /Felisting15-4.exe c.cpp print.obj getTitle.obj listing15-4.obj
c.cpp
C:\>listing15-4
Calling Listing 15-4:
Assembly units linked
Listing 15-4 terminated
One common target you will find in most professionally made makefiles is clean. The clean target will delete an appropriate set of files to force the entire system to be remade the next time you execute the makefile. This command typically deletes all the .obj and .exe files associated with the project. Listing 15-6 provides a clean target for the makefile in Listing 15-5.
# listing15-6.mak
#
# makefile for Listing 15-4.

listing15-4.exe: print.obj getTitle.obj listing15-4.obj
    cl /nologo /O2 /Zi /utf-8 /EHa /Felisting15-4.exe c.cpp \
        print.obj getTitle.obj listing15-4.obj

listing15-4.obj: listing15-4.asm aoalib.inc
    ml64 /nologo /c listing15-4.asm

print.obj: print.asm aoalib.inc
    ml64 /nologo /c print.asm

getTitle.obj: getTitle.asm aoalib.inc
    ml64 /nologo /c getTitle.asm

clean:
    del getTitle.obj
    del print.obj
    del listing15-4.obj
    del c.obj
    del listing15-4.ilk
    del listing15-4.pdb
    del vc140.pdb
    del listing15-4.exe

# Alternative clean (if you like living dangerously):
#
# clean:
#     del *.obj
#     del *.ilk
#     del *.pdb
#     del *.exe
Listing 15-6: A clean target example
Here is a sample clean and remake operation:
C:\>nmake /f listing15-6.mak clean
Microsoft (R) Program Maintenance Utility Version 14.15.26730.0
Copyright (C) Microsoft Corporation. All rights reserved.
del getTitle.obj
del print.obj
del listing15-4.obj
del c.obj
del listing15-4.ilk
del listing15-4.pdb
del listing15-4.exe
C:\>nmake /f listing15-6.mak
Microsoft (R) Program Maintenance Utility Version 14.15.26730.0
Copyright (C) Microsoft Corporation. All rights reserved.
ml64 /nologo /c print.asm
Assembling: print.asm
ml64 /nologo /c getTitle.asm
Assembling: getTitle.asm
ml64 /nologo /c listing15-4.asm
Assembling: listing15-4.asm
cl /nologo /O2 /Zi /utf-8 /EHa /Felisting15-4.exe c.cpp print.obj getTitle.obj listing15-4.obj
c.cpp
If you want to force the recompilation of a single file (without manually editing and modifying it), a Unix utility comes in handy: touch. The touch program accepts a filename as its argument and goes in and updates the modification date and time of the file (without otherwise modifying the file). For example, after building Listing 15-4 by using the makefile in Listing 15-6, were you to execute the command
touch listing15-4.asm
and then execute the makefile in Listing 15-6 again, it would reassemble the code in Listing 15-4, recompile c.cpp, and produce a new executable.
Unfortunately, while touch is a standard Unix application and comes with every Unix and Linux distribution, it is not a standard Windows application. Fortunately, you can easily find a version of touch for Windows on the internet. It’s also a relatively simple program to write.
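As a rough illustration of how little such a program needs to do, here is a minimal touch sketch in Python. It assumes Python is installed on the Windows machine; the script name touch.py and the function name touch are hypothetical choices, and a real touch utility offers options this sketch omits.

```python
import os
import sys

def touch(path):
    """Set the file's access and modification times to now, creating it if needed."""
    # Opening in append mode creates a missing file without altering existing contents.
    with open(path, "a"):
        pass
    # Passing None as the times tells os.utime to use the current time.
    os.utime(path, None)

if __name__ == "__main__":
    # Touch every filename given on the command line.
    for name in sys.argv[1:]:
        touch(name)
```

Saved as touch.py, it could be run as python touch.py listing15-4.asm before executing the makefile again.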
Many common projects reuse code that developers created long ago (or they use code that came from a source outside the developer’s organization). These libraries of code are relatively static: they rarely change during the development of a project that uses the library code. In particular, you would not normally incorporate the building of the libraries into a given project’s makefile. A specific project might list the library files as dependencies in the makefile, but the assumption is that the library files are built elsewhere and supplied as a whole to the project. Beyond that, one major difference exists between a library and a set of object code files: packaging.
Dealing with a myriad of separate object files can become troublesome when you’re working with true sets of library object files. A library may contain tens, hundreds, or even thousands of object files. Listing all of these object files (or even just the ones a project uses) is a lot of work and can lead to consistency errors. A common way to deal with this problem is to combine various object files into a separate package (file) known as a library file. Under Windows, library files typically have a .lib suffix.
For many projects, you will be given a library (.lib) file that packages together a specific library module. You supply this file to the linker when building your program, and the linker automatically picks out the object modules it needs from the library. This is an important point: including a library while building an executable does not automatically insert all of the code from that library into the executable. The linker is smart enough to extract only the object files it needs and to ignore the object files it doesn’t use (remember, a library is just a package containing a bunch of object files).
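The selection process can be pictured with a toy model in Python. Everything here is an assumption made for illustration: the symbol lists attached to print.obj and getTitle.obj are hypothetical, the function name select_modules is invented, and a real linker deals with many details (symbol types, search order, multiple libraries) that this sketch ignores.

```python
# A toy library: module name -> (symbols it defines, symbols it references).
# The symbol lists are hypothetical; they are not taken from the real aoalib.
AOALIB = {
    "print.obj": ({"print"}, {"printf"}),
    "getTitle.obj": ({"getTitle"}, set()),
}

def select_modules(library, undefined):
    """Return the library modules needed to resolve the `undefined` symbols."""
    # Map each public symbol to the module that defines it.
    providers = {
        symbol: name
        for name, (defines, _) in library.items()
        for symbol in defines
    }
    selected = []
    pending = list(undefined)
    while pending:
        symbol = pending.pop()
        module = providers.get(symbol)
        if module is None or module in selected:
            continue  # defined elsewhere, or its module is already included
        selected.append(module)
        # The new module's own external references must be resolved as well.
        pending.extend(library[module][1])
    return selected

if __name__ == "__main__":
    # A program that calls only getTitle pulls in only getTitle.obj.
    print(select_modules(AOALIB, ["getTitle"]))
```

A program that references only getTitle ends up with getTitle.obj alone; print.obj stays in the library and contributes nothing to the executable.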
So the question is, “How do you create a library file?” The short answer is, “By using the Microsoft Library Manager program (lib.exe).” The basic syntax for the lib program is
lib /out:libname.lib list_of_.obj_files
where libname.lib is the name of the library file you want to produce, and list_of_.obj_files is a (space-separated) list of object filenames you want to collect into the library. For example, if you want to combine the print.obj and getTitle.obj files into a library module (aoalib.lib), here’s the command to do it:
lib /out:aoalib.lib getTitle.obj print.obj
Once you have a library module, you can specify it on a linker (or ml64 or cl) command line just as you would an object file. For example, to link in the aoalib.lib module with the program in Listing 15-4, you could use the following command:
cl /EHa /Felisting15-4.exe c.cpp listing15-4.obj aoalib.lib
The lib program supports several command line options. You can get a list of those options by using this command:
lib /?
See the online Microsoft documentation for a description of the various options. Perhaps the most useful of the options is
lib /list lib_filename.lib
where lib_filename.lib represents a library filename. This will print a list of the object files contained within that library module. For example, lib /list aoalib.lib produces the following output:
C:\>lib /list aoalib.lib
Microsoft (R) Library Manager Version 14.15.26730.0
Copyright (C) Microsoft Corporation. All rights reserved.
getTitle.obj
print.obj
MASM provides a special directive, includelib, that lets you specify libraries to include. This directive has the syntax
includelib lib_filename.lib
where lib_filename.lib is the name of the library file you want to include. This directive embeds a command in the object file that MASM produces that passes this library filename along to the linker. The linker will then automatically load the library file when processing the object module containing the includelib directive.
This activity is identical to manually specifying the library filename to the linker (from the command line). Whether you prefer to put the includelib directive in a MASM source file, or include the library name on the linker (or ml64/cl) command line, is up to you. In my experience, most assembly language programmers (especially when writing stand-alone assembly language programs) prefer the includelib directive.
The basic unit of linkage in a program is the object file. When combining object files to form an executable, the Microsoft linker will take all of the data from a single object file and merge it into the final executable. This is true even if the main program doesn’t call all the functions (directly or indirectly) in the object module or use all the data in that object file. So, if you put 100 routines in a single assembly language source file and compile them into an object module, the linker will include the code for all 100 routines in your final executable even if you use only one of them.
If you want to avoid this situation, you should break those 100 routines into 100 separate object modules and combine the resulting 100 object files into a single library. When the Microsoft linker processes that library file, it will pick out the single object file containing the function the program uses and incorporate only that object file into the final executable. Generally, this is far more efficient than linking in a single object file with 100 functions buried in it.
The key word in that last sentence is generally. In fact, there are some good reasons for combining multiple functions into a single object file. First of all, consider what happens when the linker merges an object file into an executable. To ensure proper alignment, whenever the linker takes a section or segment (for example, the .code section) from an object file, it adds sufficient padding so that the data in that section is aligned on that section’s specified alignment boundary. Most sections have a default 16-byte section alignment, so the linker will align each section from the object file it links in on a 16-byte boundary. Normally, this isn’t too bad, especially if your procedures are large. However, suppose those 100 procedures you’ve created are all really short (a few bytes each). Then you wind up wasting a lot of space.
Granted, on modern machines, a few hundred bytes of wasted space won’t amount to much. However, it might be more practical to combine several of these procedures into a single object module (even if you don’t call them all) to fill in some of the wasted space. Don’t go overboard, though; once you’ve gone beyond the alignment, whether you’re wasting space because of padding or wasting space because you’re including code that never gets called, you’re still wasting space.
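A little arithmetic shows the trade-off. The Python sketch below assumes, as a simplification, that every code section is padded up to the next multiple of its 16-byte alignment and that each of the 100 procedures is 10 bytes long; both the procedure size and the function names are hypothetical.

```python
def padded_size(size, alignment=16):
    """Round `size` up to the next multiple of `alignment`."""
    return (size + alignment - 1) // alignment * alignment

def separate_modules_size(proc_sizes, alignment=16):
    # One procedure per object module: every procedure is padded on its own.
    return sum(padded_size(size, alignment) for size in proc_sizes)

def combined_module_size(proc_sizes, alignment=16):
    # All procedures in one object module: only the section as a whole is padded.
    return padded_size(sum(proc_sizes), alignment)

if __name__ == "__main__":
    procs = [10] * 100  # 100 hypothetical procedures of 10 bytes each
    print(separate_modules_size(procs))  # 1600 bytes, 600 of them padding
    print(combined_module_size(procs))   # 1008 bytes, 8 of them padding
```

Under these assumptions, linking all 100 tiny modules separately spends 600 bytes on padding, while one combined module spends 8. The combined module, of course, always brings in all 100 procedures, so the padding advantage only matters while the unused code stays smaller than the padding it replaces.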
Although it is an older book, covering MASM version 6, The Waite Group’s Microsoft Macro Assembler Bible by Nabajyoti Barkakati and this author (Sams, 1992) goes into much greater detail about MASM’s external directives (extern, externdef, and public) and include files.
You can also find the MASM 6 manual (the last published edition) online.
For more information about makefiles, check out these resources:
What does a clean command typically do?