THE ASSEMBLY LANGUAGE LEVEL

Tuning
Macros
The Assembly Process
Linking and Loading
Binding Time and Dynamic Relocation

Every program must ultimately be converted into machine language before it can be executed.  Translators fall into two classes: if the source language is a symbolic representation of a numerical machine language (an assembly language), the translator is called an assembler; if the source is a high-level language, it is called a compiler.

In a pure assembly language, each statement produces exactly one machine instruction; there is a one-to-one correspondence between assembly language statements and machine instructions, so an n-line assembly language program produces an n-word machine language program.  Assembly language, with its mnemonic opcodes, is easier to use than machine language written in binary or hexadecimal: it is easier to remember ADD, SUB, MUL, and DIV than their corresponding numerical values in machine language.  Assembly language also uses symbolic names for memory locations, whereas machine language requires numerical addresses.  An assembly language programmer can directly test, for example, the overflow bit, which a high-level language cannot access.  On the other hand, an assembly language program can run on only one family of machines (each machine has its own assembly language), while a high-level language program can run on many machines.

Why use Assembly Language?

Assembly language programming is difficult.  There are two reasons to study it nonetheless: performance and access to the machine.  An expert assembly language programmer can often produce smaller and faster code than a compiler generates from a high-level language.  Where speed and size are critical, such as code on a smart card, code in a cellular telephone, device drivers, BIOS routines, and the inner loops of performance-critical applications, this difference matters.  Also, some procedures need complete access to the hardware (for example, low-level interrupt and trap handlers), which is impossible in high-level languages.

In many programs, a small percentage of the total code is responsible for a large percentage of the execution time.  It is common for 1% of the program to account for 50% of the execution time, and for 10% to account for 90%.  A hybrid approach therefore writes that small critical percentage in assembly language while the rest of the program is written in a high-level language.  Since computers are efficient at running assembly language and programmers are efficient at writing high-level language, this approach gets the best of both worlds.

Assume that it takes 10 programmer-years to write a given program in a high-level language, and that the resulting program takes 100 seconds to execute.  The same program written entirely in assembly language would require 50 programmer-years and execute in 33 seconds.  If 10% of the program accounts for 90% of the execution time, that 10% can be rewritten in assembly language, a process called tuning.  The resulting figures are shown below.
 
 

                                  Programmer-years to     Program execution
                                  produce the program     time in seconds
  Assembly language                        50                    33
  High-level language                      10                   100
  Mixed approach before tuning
      Critical 10%                          1                    90
      Other 90%                             9                    10
      Total                                10                   100
  Mixed approach after tuning
      Critical 10%                          6                    30
      Other 90%                             9                    10
      Total                                15                    40
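The arithmetic behind these figures can be checked with a short calculation (a sketch in Python; the effort and speed figures come from the example above, and the roughly 3x speed ratio is 100/33):

```python
# Effort and speed figures from the example above.
hll_years, hll_secs = 10, 100    # whole program in a high-level language
asm_years, asm_secs = 50, 33     # whole program in assembly language

# Before tuning: everything in the high-level language.
critical_secs = 0.9 * hll_secs   # the critical 10% of the code uses 90% of the time
other_secs = 0.1 * hll_secs      # the other 90% uses 10% of the time

# After tuning: rewrite the critical 10% in assembly.
speedup = hll_secs / asm_secs                      # assembly is about 3x faster
tuned_total_years = hll_years + 0.1 * asm_years    # 10 already spent + 10% of 50
tuned_total_secs = critical_secs / speedup + other_secs

print(tuned_total_years)         # 15.0
print(round(tuned_total_secs))   # 40
```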

MACROS

If a sequence of instructions is repeated frequently, it is tedious to write it out each time.  One solution is to convert it into a procedure and call it when required, but this strategy requires a procedure call instruction and a return instruction to be executed each time, so if the sequence is short, the overhead may outweigh the benefit.  Macros provide an easy and efficient solution to this problem.  A macro definition is a way to give a name to a piece of text; it is an abbreviation for a portion of the program.  After a macro has been defined, the programmer can write the macro name instead of the piece of program it stands for.  For example, without a macro, a program to interchange the two values P and Q twice would be:

    MOV    EAX, P
    MOV    EBX, Q
    MOV    Q, EAX
    MOV    P, EBX

    MOV    EAX, P
    MOV    EBX, Q
    MOV    Q, EAX
    MOV    P, EBX

The four repeated lines may be replaced by the macro name SWAP.  The same program using a SWAP macro would be:

    SWAP    MACRO
                   MOV    EAX, P
                   MOV    EBX, Q
                   MOV    Q, EAX
                   MOV    P, EBX
                   ENDM

                   SWAP

                   SWAP

SWAP is an abbreviation for the four statements above.

The basic parts of a macro are:
1.    A macro header with the name of the macro being defined.
2.    The text comprising the body of the macro.
3.    A pseudoinstruction marking the end of the definition (e.g., ENDM, for END of Macro).

Whenever the name of a macro appears as an opcode, the assembler replaces it with the macro body.  Macro expansion occurs during assembly, not during execution, so both versions of the program produce exactly the same machine language code.  Once assembly is complete, the macro definitions are discarded by the assembler.

There is a difference between macro calls and procedure calls.  A macro call is an instruction to the assembler to replace the macro name with the macro body, while a procedure call is a machine instruction, inserted into the object program, that will later be executed to call the procedure.  A summary of the differences is given below.
 

Item                                                        Macro call           Procedure call
When is the call made?                                      During assembly      During program execution
Is the body inserted into the object program
  every place the call is made?                             Yes                  No
Is a procedure call instruction inserted into the
  object program and later executed?                        No                   Yes
Must a return instruction be used after the call is done?   No                   Yes
How many copies of the body appear in the object program?   One per macro call   1

To handle macros, the assembly process takes place in two passes.  On the first pass, macro definitions are saved and macro calls are expanded; on the second pass, the resulting text is processed as though it had been part of the original program.
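The save-then-expand behavior can be illustrated with a small sketch (Python, not a real assembler; the NAME MACRO ... ENDM syntax follows the SWAP example above, and parameterless macros are assumed for simplicity):

```python
def expand_macros(lines):
    """One pass over assembly source: save macro definitions, expand macro calls."""
    macros = {}            # macro name -> list of body lines
    output = []
    name, body = None, []
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and fields[1] == "MACRO":   # header line: NAME MACRO
            name, body = fields[0], []
        elif fields and fields[0] == "ENDM":            # end of the definition
            macros[name] = body
            name = None
        elif name is not None:                          # inside a definition: collect
            body.append(line)
        elif fields and fields[0] in macros:            # macro call: substitute body
            output.extend(macros[fields[0]])
        else:
            output.append(line)
    return output                                       # definitions are discarded

source = [
    "SWAP MACRO",
    "    MOV EAX, P",
    "    MOV EBX, Q",
    "    MOV Q, EAX",
    "    MOV P, EBX",
    "ENDM",
    "    SWAP",
    "    SWAP",
]
print(expand_macros(source))   # eight MOV lines; the definition itself is gone
```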

THE ASSEMBLY PROCESS

Two-Pass Assemblers

The assembler cannot simply read each statement, translate it into machine language, and output the generated machine language to a file (with the corresponding piece of the listing to another file) until the whole program has been translated.  Suppose the first statement is a branch to L.  The assembler cannot assemble this statement until it knows the address of statement L, and if L is near the end of the program, the assembler must read almost the entire program before finding it.  This is called the forward reference problem, because a symbol, L, has been used before it has been defined.  There are two ways to solve it.  First, the assembler can read the source program twice.  Each reading is called a pass, and any translator that reads its input program twice is called a two-pass translator.  On pass one, the definitions of symbols, including statement labels, are collected and stored in a table.  By the start of pass two, the values of all symbols are known, so each statement can be read, assembled, and output.  The second approach is to read the assembly program once, converting it to an intermediate form stored in a table in memory, and then to make the second pass over the table instead of over the source program.  If there is enough memory, this approach saves I/O time.
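The two-pass solution to the forward reference problem can be sketched as follows (a toy Python model with made-up one-word instructions, not any real instruction set):

```python
def pass_one(lines):
    """Collect label definitions and their addresses (ILC = instruction location counter)."""
    symbols, ilc = {}, 0
    for line in lines:
        if line.endswith(":"):     # a label definition such as "L:"
            symbols[line[:-1]] = ilc
        else:
            ilc += 1               # every instruction occupies one word in this toy model
    return symbols

def pass_two(lines, symbols):
    """Assemble: replace symbolic branch targets with numeric addresses."""
    code = []
    for line in lines:
        if line.endswith(":"):     # labels generate no code
            continue
        op, *args = line.split()
        args = [str(symbols.get(a, a)) for a in args]   # resolve known symbols
        code.append(" ".join([op] + args))
    return code

prog = ["BR L", "NOP", "NOP", "L:", "HALT"]   # forward reference to L
syms = pass_one(prog)         # {'L': 3}
print(pass_two(prog, syms))   # ['BR 3', 'NOP', 'NOP', 'HALT']
```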

Pass One

Pass one of most assemblers uses at least three tables: the symbol table, the pseudoinstruction table, and the opcode table.  If needed, a literal table is also kept.  The symbol table has one entry for each symbol; symbols are defined either by using them as labels or by explicit definition (EQU, for example).  Each symbol table entry contains the symbol itself, its numerical value, and possibly other information, which may include the length of the data field associated with the symbol, the relocation bits, and whether the symbol is to be accessible outside the procedure.  The opcode table contains at least one entry for each symbolic opcode; an entry typically holds the mnemonic, the kinds of operands it takes, the opcode's numerical value, the instruction length, and a type number that separates the opcodes into groups depending on the number and kind of operands.

Pass Two

The function of pass two is to generate the object program, print the assembly listing, and output information needed by the linker for linking up procedures assembled at different times into a single executable file.

The Symbol Table

During pass one of the assembly process, the assembler accumulates information about symbols and their values that must be stored in the symbol table for lookup during pass two.  The symbol table simulates an associative memory, which conceptually is a set of (symbol, value) pairs: given a symbol, the associative memory produces its value.  The simplest implementation is an array of pairs, the first element of each pair being (a pointer to) the symbol and the second its value.  The table is searched linearly until a matching symbol is found.  This method is slow, since on average half the table must be searched.  Another method is to sort the symbols and use binary search: the symbol sought is compared with the middle entry, and half the table is discarded; applying this recursively, a table of n entries needs about log2 n comparisons.  Once the table is sorted, lookups are much faster.
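The linear and binary search variants can be compared in a few lines (a Python sketch; the sorted-array lookup uses the standard bisect module, and the symbol names are made up):

```python
import bisect

# Symbol table as a sorted array of (symbol, value) pairs.
table = sorted([("LOOP", 100), ("DONE", 200), ("START", 0), ("TEMP", 300)])

def lookup_linear(table, symbol):
    for sym, value in table:            # scan until a matching pair is found: O(n)
        if sym == symbol:
            return value
    raise KeyError(symbol)

def lookup_binary(table, symbol):
    i = bisect.bisect_left(table, (symbol,))    # about log2 n comparisons
    if i < len(table) and table[i][0] == symbol:
        return table[i][1]
    raise KeyError(symbol)

print(lookup_linear(table, "DONE"), lookup_binary(table, "DONE"))   # 200 200
```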

Another method is hash coding.  A hash function maps symbols onto integers in the range 0 to k - 1.  One possible function is to multiply the ASCII codes of the characters in the symbol together, ignoring overflow, and take the result modulo k (or divide it by a prime number and use the remainder).  Symbols are stored in a table of k buckets numbered 0 to k - 1.  All the (symbol, value) pairs whose symbol hashes to i are stored on a linked list pointed to by slot i of the hash table.  With n symbols and k slots in the hash table, the average list has length n/k.  By choosing k approximately equal to n, only about one lookup is needed on average.  By adjusting k, the table size can be reduced at the expense of slower lookups.
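A bucket hash of this kind might look like the following (a Python sketch; the multiplicative hash follows the text, though a real assembler would likely use a cheaper function, and the bucket count k is an arbitrary choice here):

```python
K = 8   # number of buckets; in practice chosen near the expected number of symbols

def hash_symbol(symbol, k=K):
    """Multiply the character codes together, ignoring overflow, then take modulo k."""
    h = 1
    for ch in symbol:
        h = (h * ord(ch)) & 0xFFFFFFFF   # keep 32 bits, discarding overflow
    return h % k

buckets = [[] for _ in range(K)]         # slot i holds a list of (symbol, value) pairs

def insert(symbol, value):
    buckets[hash_symbol(symbol)].append((symbol, value))

def lookup(symbol):
    for sym, value in buckets[hash_symbol(symbol)]:   # average chain length is n/k
        if sym == symbol:
            return value
    raise KeyError(symbol)

insert("LOOP", 100)
insert("DONE", 200)
print(lookup("LOOP"))   # 100
```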

LINKING AND LOADING

Most programs have more than one procedure.  Compilers and assemblers translate one procedure at a time and place the translated output on a disk.  Before the program can run, all the translated procedures must be found and linked together.  The linked program must be loaded into virtual or main memory.  The compiler or assembler compiles or assembles the source procedures, and the linker links the object modules.

The translation from source procedure to object module represents a change of level as the source language and target language have different instructions and notations.  However, the linker's input and output are programs for the same virtual machine.  The linker's function is to collect procedures translated separately and link them together to be run as a unit called an executable binary program.  On MS-DOS, Windows 95/98, and NT the object modules have extension .obj and the executable binary programs have extension .exe.  On UNIX, the object modules are .o, while the executable binary programs have no extension.

Compilers and assemblers translate each source procedure as a separate entity; otherwise, changing one statement in one source procedure would require retranslating all the source procedures.  All the object modules must still be relinked, but linking is much faster than translating.  The linker brings the object modules into main memory to form an image of the executable binary program.  A section of memory starting at address zero is typically reserved for interrupt vectors, communication with the operating system, catching uninitialized pointers, etc., so programs start above 0.  This causes a relocation problem, because each object module is translated as though it occupies its own separate address space.  In addition, the assembler has no way of knowing what address to insert into a CALL instruction that refers to a procedure in another module; that address is not known until linking time.  This is known as the external reference problem.

The linker merges the separate address spaces of the object modules into a single linear address space in four steps:
1.    It constructs a table of all the object modules and their lengths.
2.    Based on this table, it assigns a starting address to each object module.
3.    It finds all the instructions that reference memory and adds to each a relocation constant equal to the starting address of its module.
4.    It finds all the instructions that reference other procedures and inserts the addresses of those procedures in place.
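These four steps can be sketched in a few lines (a toy Python model: each module is a list of one-word instructions whose operands are flagged as module-relative memory references or external names; none of this corresponds to any real object file format):

```python
# Toy object modules: (name, code); each instruction is (opcode, operand, kind),
# where kind "mem" = module-relative address, "ext" = external procedure name.
modules = [
    ("A", [("CALL", "B", "ext"), ("MOV", 4, "mem")]),
    ("B", [("RET", None, None)]),
]

BASE = 100                        # programs start above address 0

# Steps 1 and 2: build the table of lengths and assign starting addresses.
start, addr = {}, BASE
for name, code in modules:
    start[name] = addr
    addr += len(code)             # one word per instruction in this toy model

# Steps 3 and 4: relocate memory references, resolve external references.
image = []
for name, code in modules:
    for op, operand, kind in code:
        if kind == "mem":
            operand += start[name]      # add the module's relocation constant
        elif kind == "ext":
            operand = start[operand]    # insert the callee's actual address
        image.append((op, operand))

print(start)   # {'A': 100, 'B': 102}
print(image)   # [('CALL', 102), ('MOV', 104), ('RET', None)]
```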

Binding Time and Dynamic Relocation

In a multiprogramming system, a program can be read into main memory, run for a while, written out to disk, and then read back into main memory to be run again.  In a large system with many programs, it is difficult to ensure that a program is read back into the same location every time, yet if it is loaded somewhere else, all the memory addresses bound into it will be incorrect.  The relocation information has been discarded by then, and even if it were still available, the cost of relocating all the addresses every time the program is swapped in would be high.

When a program is written, it contains symbolic names for memory addresses.  The time at which the actual main memory address corresponding to a symbolic name is determined is called the binding time.  The binding may occur when the program is written, when it is translated, before it is loaded, when it is loaded, when a base register used for addressing is loaded, or when the instruction containing the address is executed.

If an instruction containing a memory address is moved after binding, the address will be incorrect, which is why moving programs after linking fails.  The relevant questions are when symbolic names are bound to virtual addresses, and when virtual addresses are bound to physical addresses.  Binding is complete only when both operations have occurred.  When the linker merges the separate address spaces of the object modules into a single linear address space, it is in fact creating a virtual address space.

On a computer with virtual memory, completing all linking before beginning execution does not take full advantage of the virtual memory's capabilities.  A more flexible approach is dynamic linking, in which each procedure is linked at the time it is first called.  Here each procedure call in the source language is translated into an instruction that indirectly addresses the first word of the corresponding linkage block.  The compiler fills this word with either an invalid address or a special bit pattern that forces a trap.

When a procedure in a different segment is called, the attempt to address the invalid word indirectly causes a trap to the dynamic linker.  The linker finds the character string in the word following the invalid address and searches the user's file directory for a compiled procedure with that name.  The procedure is then assigned a virtual address, which overwrites the invalid address in the linkage segment.  Next, the instruction that caused the linkage fault is re-executed, allowing the program to continue from the place it was before the trap.  All subsequent references to that procedure are executed without causing a linkage fault, because the indirect word now contains a valid virtual address.  Thus the dynamic linker is invoked only the first time a procedure is called, and not thereafter.
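This first-call-only behavior can be mimicked with an indirect word that initially points at the linker (a Python analogy, not how a real dynamic linker is implemented; the "trap" here is just a function call, and sqrt_proc stands in for a separately compiled procedure):

```python
calls_to_linker = 0

def sqrt_proc(x):                 # the separately "compiled" procedure
    return x ** 0.5

def dynamic_linker(name):
    """Find the procedure by name and overwrite the linkage word with its address."""
    global calls_to_linker
    calls_to_linker += 1
    proc = {"sqrt_proc": sqrt_proc}[name]   # stands in for the file directory search
    linkage["sqrt_proc"] = proc             # indirect word is now a valid address
    return proc

def trap(x):
    """Invalid word: trap to the dynamic linker, then re-execute the faulting call."""
    return dynamic_linker("sqrt_proc")(x)

linkage = {"sqrt_proc": trap}     # the indirect linkage word, initially invalid

print(linkage["sqrt_proc"](9.0))  # first call traps to the linker: 3.0
print(linkage["sqrt_proc"](4.0))  # second call goes straight through: 2.0
print(calls_to_linker)            # 1
```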
 