Week 5 - 073857
Assembly Language Programming
CONTENTS
• Overview of assembly language programming
• Data representation
• Syntax and structure of assembly language instructions
• Arithmetic and Logic Instructions
• Control transfer instructions
• Data transfer instructions
• Programming techniques
• Debugging
• Assembly language is a low-level programming language that is used
to write programs for microprocessors and other embedded systems.
• Assembly language programs are written using mnemonics, which are
short, easy-to-remember codes that represent machine language
instructions.
• Assembly language programming is typically used when high
performance or direct hardware control is required, as it provides direct
access to the machine's resources.
• Assembly language is also used when writing device drivers, operating
systems, and other low-level software.
Some key features of assembly language programming
include:
Mnemonics: instructions are written as short symbolic names. For example,
the mnemonic MOV represents the instruction "move data from one location
to another".
Registers: programs use the processor's registers to store data and perform
calculations.
Direct access to memory: programs can read and write memory locations
directly, which allows for efficient data processing.
Interrupts: programs can handle interrupts, which are signals that are sent to
the processor to indicate that an event has occurred.
Debugging: programs can be debugged using specialized tools such as
debuggers, which allow developers to step through the program and examine
the state of the processor and memory.
Data representation
• In computing, data is represented in binary form.
• Binary can be difficult to read and write, especially when dealing with
large numbers or data sets.
• To make it easier to work with, hexadecimal (or "hex") notation is
often used.
• Hexadecimal is commonly used in computing for representing
memory addresses, colors in graphics programming, and in many
other applications where it is necessary to represent binary data in a
more readable and compact format.
• In assembly language programming, binary and hexadecimal are
commonly used to represent data and instructions.
• An assembler translates assembly language instructions into machine
language instructions, each of which is stored in binary form as a
sequence of 0s and 1s.
• These instructions are executed by the CPU to perform various
operations, such as moving data between registers, performing
arithmetic operations, or branching to another part of the program.
• Assembly language programs also use binary and hexadecimal to
represent data, such as memory addresses, register values, and
constants.
• For example, in x86 assembly language, the instruction "mov eax,
0x1234" would move the hexadecimal value 0x1234 into the EAX
register.
• Similarly, memory addresses may be represented in hexadecimal
notation, such as "mov [0x12345678], eax", which moves the
contents of the EAX register to the memory location at address
0x12345678.
• In addition to representing data and instructions, hexadecimal is often
used in assembly language programming for debugging and analysis.
• For example, hexadecimal memory dumps can be used to examine
the contents of memory at a specific point in the program's execution,
allowing developers to identify bugs or other issues.
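The relationship between decimal, binary and hexadecimal can be checked with a short Python sketch. It is only an illustration: the helper names `to_binary` and `hex_dump` are hypothetical, and the dump assumes the little-endian byte order that x86 uses when it stores a 32-bit value.

```python
def to_binary(value, bits=16):
    """Return value as a zero-padded binary string, grouped in 4-bit nibbles."""
    digits = format(value & ((1 << bits) - 1), f"0{bits}b")
    return " ".join(digits[i:i + 4] for i in range(0, bits, 4))


def hex_dump(data, base_address=0):
    """Format bytes as rows of 'address: hex bytes', 8 bytes per row."""
    rows = []
    for offset in range(0, len(data), 8):
        chunk = data[offset:offset + 8]
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        rows.append(f"{base_address + offset:08X}: {hex_bytes}")
    return rows


value = 0x1234
print(value)              # 4660
print(to_binary(value))   # 0001 0010 0011 0100
print(hex(value))         # 0x1234

# Assumption: little-endian storage, so the low byte is written first.
stored = value.to_bytes(4, "little")
print(hex_dump(stored, 0x12345678))   # ['12345678: 34 12 00 00']
```

Each hexadecimal digit corresponds to exactly one group of four bits, which is why hex is so much easier to read than the raw binary.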
Syntax and Structure
• The syntax and structure of assembly language instructions can vary
depending on the specific architecture and assembler being used, but
there are some common elements that are present in most assembly
languages.
• E.g. MOV AX, BX
• This instruction moves the contents of the BX register into the AX
register.
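• MOV is the mnemonic for the opcode (or operation code) that specifies
the operation to be performed.
• AX and BX are the operands, which specify the registers involved in
the operation. In the Intel syntax used here, the destination (AX) is
written first and the source (BX) second.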
• The comma separates the two operands.
• Some instructions may take immediate values (such as constants or
addresses) as operands, which are represented in various formats
depending on the architecture and assembler being used.
• Labels are used to mark a specific location in the program, and can be
used to refer to that location in other parts of the program.
• Comments can be added to assembly language programs to provide
additional information or clarification for human readers. In many
assemblers, including MASM and NASM, comments start with a semicolon
(;); other assemblers use a different marker such as # or //. A comment
continues until the end of the line.
• Directives are special instructions that are used to provide
information to the assembler, such as defining constants, reserving
memory space, or including external libraries.
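The parts of a source line described above (label, mnemonic, operands, comment) can be made concrete with a small Python sketch. It is a toy, not how a real assembler parses: `parse_line` is a hypothetical helper, and it assumes semicolon comments, a colon after labels, and no colons inside operands.

```python
def parse_line(line):
    """Split one source line into (label, mnemonic, operands, comment)."""
    code, _, comment = line.partition(";")
    code = code.strip()
    label = None
    if ":" in code:
        # Assumption: a colon only ever terminates a label.
        label, _, code = code.partition(":")
        label = label.strip()
        code = code.strip()
    parts = code.split(None, 1)
    mnemonic = parts[0].upper() if parts else None
    rest = parts[1] if len(parts) > 1 else ""
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    return label, mnemonic, operands, comment.strip() or None


print(parse_line("start: MOV AX, BX ; copy BX into AX"))
# ('start', 'MOV', ['AX', 'BX'], 'copy BX into AX')
print(parse_line("END_IF:"))
# ('END_IF', None, [], None)
```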
Arithmetic and Logic Instructions
Subtraction
MOV EAX, 456 ; move the value 456 into the EAX register
SUB EAX, 78 ; subtract 78 from EAX and store the result (378) in EAX
Multiplication
MOV AX, 5 ; move the value 5 into the AX register
MOV BX, 3 ; move the value 3 into the BX register
MUL BX ; unsigned multiply AX by BX; the 32-bit result (15) goes to the DX:AX register pair
MOV CX, AX ; move the low 16 bits of the result (AX) into the CX register
MOV EAX, 123 ; move the value 123 into the EAX register
MOV EBX, 456 ; move the value 456 into the EBX register
IMUL EBX ; signed multiply EAX by EBX; the 64-bit result (56088) goes to the EDX:EAX register pair
MOV ECX, EAX ; move the low 32 bits of the result (EAX) into the ECX register
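How the product is split across a register pair can be modelled in Python. This is a sketch of the one-operand MUL and IMUL forms only; the function names `mul16`, `imul32` and `to_signed` are hypothetical, and the flags these instructions also set are ignored.

```python
def mul16(ax, operand):
    """Model 16-bit unsigned MUL: return (DX, AX) holding the 32-bit product."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    return (product >> 16) & 0xFFFF, product & 0xFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def imul32(eax, operand):
    """Model one-operand 32-bit IMUL: return (EDX, EAX) for the signed product."""
    product = to_signed(eax, 32) * to_signed(operand, 32)
    product &= 0xFFFFFFFFFFFFFFFF      # wrap to 64 bits, two's complement
    return product >> 32, product & 0xFFFFFFFF


print(mul16(5, 3))              # (0, 15)
print(mul16(0xFFFF, 0xFFFF))    # (65534, 1)
print(imul32(123, 456))         # (0, 56088)
```

The second call shows why the high half matters: the product of two 16-bit values can need all 32 bits, so copying only AX loses information unless DX is known to be zero.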
Division
MOV AX, 10 ; move the value 10 into the AX register
MOV BX, 3 ; move the value 3 into the BX register
MOV DX, 0 ; clear DX, because DIV BX divides the 32-bit value in DX:AX
DIV BX ; unsigned divide DX:AX by BX; quotient (3) goes to AX, remainder (1) to DX
MOV CX, AX ; move the contents of the AX register (the quotient) into the CX register
MOV EAX, 456 ; move the value 456 into the EAX register
MOV EBX, 78 ; move the value 78 into the EBX register
CDQ ; sign-extend EAX into EDX, because IDIV EBX divides the 64-bit value in EDX:EAX
IDIV EBX ; signed divide EDX:EAX by EBX; quotient (5) goes to EAX, remainder (66) to EDX
MOV ECX, EAX ; move the contents of the EAX register (the quotient) into the ECX register
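The division rules can be checked with a Python model. It is a sketch under stated assumptions: `div16` and `signed_divide` are hypothetical names, `div16` models the 16-bit form of DIV, which divides the combined DX:AX value, and `signed_divide` only reproduces the rounding rule of IDIV, not its register handling.

```python
def div16(dx, ax, divisor):
    """Model 16-bit unsigned DIV: divide DX:AX, return (quotient, remainder)."""
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    divisor &= 0xFFFF
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFF:
        raise OverflowError("divide error: quotient does not fit in AX")
    return quotient, remainder


def signed_divide(dividend, divisor):
    """Signed division as IDIV rounds it: the quotient truncates toward zero."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return quotient, dividend - quotient * divisor


print(div16(0, 10, 3))          # (3, 1)
print(signed_divide(456, 78))   # (5, 66)
print(signed_divide(-7, 2))     # (-3, -1)
```

Calling `div16(5, 10, 3)` raises `OverflowError`, which mirrors the divide error the processor raises when stale data in DX makes the quotient too large for AX. That is the reason for clearing or sign-extending the high register before dividing.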
AND
MOV AX, 5 ; move the value 5 (0101b) into the AX register
MOV BX, 3 ; move the value 3 (0011b) into the BX register
AND AX, BX ; bitwise AND of AX and BX; the result, 1 (0001b), is stored in AX
MOV CX, AX ; move the contents of the AX register into the CX register
MOV EAX, 0F0h ; move the value 0F0h into the EAX register
MOV EBX, 00Fh ; move the value 00Fh into the EBX register
AND EAX, EBX ; bitwise AND of EAX and EBX; the result, 0, is stored in EAX
OR
MOV AX, 5 ; move the value 5 (0101b) into the AX register
MOV BX, 3 ; move the value 3 (0011b) into the BX register
OR AX, BX ; bitwise OR of AX and BX; the result, 7 (0111b), is stored in AX
MOV CX, AX ; move the contents of the AX register into the CX register
MOV EAX, 0F0h ; move the value 0F0h into the EAX register
MOV EBX, 00Fh ; move the value 00Fh into the EBX register
OR EAX, EBX ; bitwise OR of EAX and EBX; the result, 0FFh, is stored in EAX
XOR
MOV AX, 5 ; move the value 5 (0101b) into the AX register
MOV BX, 3 ; move the value 3 (0011b) into the BX register
XOR AX, BX ; bitwise XOR of AX and BX; the result, 6 (0110b), is stored in AX
MOV CX, AX ; move the contents of the AX register into the CX register
MOV EAX, 0F0h ; move the value 0F0h into the EAX register
MOV EBX, 00Fh ; move the value 00Fh into the EBX register
XOR EAX, EBX ; bitwise XOR of EAX and EBX; the result, 0FFh, is stored in EAX
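The results of the AND, OR and XOR examples can be verified with Python's bitwise operators, which act on each bit position in the same way. The helper `logic_results` is a hypothetical name used only for this sketch.

```python
def logic_results(a, b):
    """Return the bitwise AND, OR and XOR of two non-negative integers."""
    return {"AND": a & b, "OR": a | b, "XOR": a ^ b}


small = logic_results(5, 3)          # 0101b and 0011b
for name, result in small.items():
    print(name, format(result, "04b"))
# AND 0001
# OR 0111
# XOR 0110

masks = logic_results(0xF0, 0x0F)    # the two masks share no set bits
for name, result in masks.items():
    print(name, format(result, "02X"))
# AND 00
# OR FF
# XOR FF
```

Because 0F0h and 00Fh have no set bits in common, AND gives zero while OR and XOR give the same value.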
Control transfer instructions
• Control transfer instructions allow the program to change the order in
which instructions are executed by transferring control to another part
of the program.
• There are two main categories of control transfer instructions:
unconditional and conditional.
• Unconditional control transfer instructions always transfer control to
a specific location in the program.
• JMP (jump): The JMP instruction transfers control to a specified
memory address, which can be either absolute or relative to the
current instruction pointer.
• CALL (call subroutine): The CALL instruction transfers control to a
subroutine located at a specified memory address. It saves the return
address on the stack so that the program can return to the calling
code after the subroutine completes.
• RET (return from subroutine): The RET instruction returns control to
the calling code after a subroutine completes. It pops the return
address from the stack and transfers control to that address.
• Conditional control transfer instructions transfer control to a specific
location in the program only if a certain condition is met.
• JZ/JNZ (jump if zero/not zero): The JZ and JNZ instructions transfer
control to a specified memory address if the zero flag shows that the
previous arithmetic or logical operation resulted in zero or not zero,
respectively.
• JE/JNE (jump if equal/not equal): The JE and JNE instructions transfer
control to a specified memory address if the previous comparison
found its operands equal or not equal, respectively. They are alternative
names for JZ and JNZ, because CMP subtracts its operands and equal
operands give a zero result.
• JA/JAE/JB/JBE (jump if above/above or equal/below/below or equal):
These instructions are used for unsigned integer comparisons.
• After CMP a, b, JA and JAE transfer control if a is greater than, or
greater than or equal to, b, while JB and JBE transfer control if a is less
than, or less than or equal to, b, with both values treated as unsigned.
• JG/JGE/JL/JLE (jump if greater/greater or equal/less/less or equal):
These instructions are used for signed integer comparisons.
• After CMP a, b, JG and JGE transfer control if a is greater than, or
greater than or equal to, b, while JL and JLE transfer control if a is less
than, or less than or equal to, b, with both values treated as signed.
• Control transfer instructions can also be used for implementing
interrupt handlers, error handling routines, and other advanced
program control structures.
• Control transfer instructions are an essential component of
programming that enable a program to make decisions and respond
to input in a dynamic and flexible manner.
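The difference between the unsigned jumps (JA, JB and so on) and the signed jumps (JG, JL and so on) comes down to which flags they test after a comparison. The Python sketch below models this for 16-bit operands. The names `cmp_flags`, `CONDITIONS` and `taken` are hypothetical, and only the four flags needed here are modelled.

```python
def cmp_flags(a, b, bits=16):
    """Model CMP a, b: compute a - b and return the ZF, CF, SF and OF flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,                 # operands were equal
        "CF": a < b,                       # unsigned borrow
        "SF": bool(result & sign),         # sign bit of the result
        # Signed overflow: a and b differ in sign and the result's sign differs from a's.
        "OF": bool((a ^ b) & (a ^ result) & sign),
    }


CONDITIONS = {
    "JE": lambda f: f["ZF"],
    "JNE": lambda f: not f["ZF"],
    "JA": lambda f: not f["CF"] and not f["ZF"],
    "JAE": lambda f: not f["CF"],
    "JB": lambda f: f["CF"],
    "JBE": lambda f: f["CF"] or f["ZF"],
    "JG": lambda f: not f["ZF"] and f["SF"] == f["OF"],
    "JGE": lambda f: f["SF"] == f["OF"],
    "JL": lambda f: f["SF"] != f["OF"],
    "JLE": lambda f: f["ZF"] or f["SF"] != f["OF"],
}


def taken(jump, a, b, bits=16):
    """Return True if `jump` would be taken right after CMP a, b."""
    return CONDITIONS[jump](cmp_flags(a, b, bits))


# 0xFFFF is 65535 as an unsigned value but -1 as a signed 16-bit value.
print(taken("JA", 0xFFFF, 1))   # True
print(taken("JG", 0xFFFF, 1))   # False
print(taken("JL", 10, 20))      # True
```

The same bit pattern gives opposite answers for JA and JG, which is why choosing the jump that matches the data's signedness matters.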
Data transfer instructions
• Data transfer instructions move data between memory locations and
registers.
• These instructions are critical for manipulating data and are used
extensively in assembly language programs.
• MOV (move):
• The MOV instruction copies the contents of one location to another. It
can move data between registers, between a register and memory, or
from an immediate value into a register or memory location. On x86, a
single MOV cannot copy directly from memory to memory; the value has
to pass through a register. For example, to move the contents of the
EAX register to the EBX register, you would use the following code:
MOV EBX, EAX
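• PUSH (push):
• The PUSH instruction decrements the stack pointer and stores a
register, memory, or immediate value on the stack. This is often used
to save register contents. For example:
PUSH EAX ; push the contents of the EAX register onto the stack
PUSH 1234 ; push the value 1234 onto the stack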
• POP (pop):
• The POP instruction pops a value from the stack and stores it in a
register or memory location. This is often used to retrieve register
contents that were saved with the PUSH instruction. For example, to
pop a value from the stack and store it in the EAX register, you would
use the following code:
POP EAX
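The last-in, first-out behaviour of PUSH and POP can be modelled in a few lines of Python. This is a toy under stated assumptions: the `Stack` class and its starting address 0x1000 are hypothetical, values are 32 bits wide, and the stack grows toward lower addresses as it does on x86.

```python
class Stack:
    """Toy model of a 32-bit x86 stack that grows toward lower addresses."""

    def __init__(self, top=0x1000):
        self.esp = top          # hypothetical initial stack pointer
        self.memory = {}        # address -> 32-bit value

    def push(self, value):
        """PUSH: decrement ESP by 4, then store the value at the new ESP."""
        self.esp -= 4
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        """POP: read the value at ESP, then increment ESP by 4."""
        # The model discards the slot; real hardware leaves the bytes in place.
        value = self.memory.pop(self.esp)
        self.esp += 4
        return value


stack = Stack()
stack.push(0xDEADBEEF)     # like PUSH EAX with EAX = 0DEADBEEFh
stack.push(1234)           # like PUSH 1234
print(hex(stack.esp))      # 0xff8
print(stack.pop())         # 1234
print(hex(stack.pop()))    # 0xdeadbeef
```

Values come back in the reverse of the order in which they were pushed, so saved registers have to be popped in the opposite order.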
• LEA (load effective address):
• The LEA instruction calculates the effective address of a memory
location and stores it in a register. This is often used for computing the
address of an array or data structure. For example, to compute the
address of the array "my_array" and store it in the EAX register, you
would use the following code:
LEA EAX, my_array
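• MOVS (move string):
• The MOVS instruction copies a byte, word, or doubleword from the
memory location addressed by ESI to the memory location addressed
by EDI, then adjusts both registers by the element size (upward or
downward, depending on the direction flag). This is often used for
copying data between memory locations. For example, to move a byte
from the address in the ESI register to the address in the EDI register,
you would use the following code:
MOVSB ; copy one byte from [ESI] to [EDI]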
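Programming techniques
• Looping:
• Looping repeats a block of instructions. A common x86 pattern loads a
count into the ECX register and ends the loop body with the LOOP
instruction, which decrements ECX and jumps back to the start of the
body while ECX is not zero.
• With a count of 10, for example, the loop body will execute 10 times
before the loop terminates.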
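The counting behaviour of such a loop can be sketched in Python. The model assumes the common x86 pattern of loading a count into ECX and closing the body with the LOOP instruction; `run_loop` and `body` are hypothetical names.

```python
def run_loop(count, body):
    """Model `MOV ECX, count` followed by a loop body that ends in `LOOP`."""
    ecx = count & 0xFFFFFFFF
    iterations = 0
    while True:
        body(iterations)                # the loop body runs before the test
        iterations += 1
        ecx = (ecx - 1) & 0xFFFFFFFF    # LOOP decrements ECX first...
        if ecx == 0:                    # ...and falls through once it reaches zero
            break
    return iterations


print(run_loop(10, lambda i: None))     # 10
```

Because the decrement and test come after the body, the body always runs at least once. In this model, as with the real LOOP instruction, a starting count of 0 wraps around to the largest 32-bit value rather than skipping the loop, so it should be guarded against.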
• Branching:
• Branching is the process of changing the normal flow of execution by
jumping to a different part of the code.
• Branching is used to implement conditional statements such as IF-
THEN and SWITCH statements.
• In assembly language, branching is implemented using conditional
jumps and compare instructions, such as CMP (compare) and TEST
(test), that set the flags register based on the results of the
comparison.
MOV AX, 10
CMP AX, 20 ; compare AX to 20
JL LESS_THAN ; jump to LESS_THAN if AX is less than 20
; code to execute if AX is greater than or equal to 20
JMP END_IF
LESS_THAN:
; code to execute if AX is less than 20
END_IF:
• In this example, the code will jump to the LESS_THAN label if AX is
less than 20 (JL treats the compared values as signed).
• Otherwise, it will execute the code following the JL instruction.
• The JMP END_IF instruction at the end of that code skips over the
LESS_THAN block, so only one of the two branches runs before
execution continues after the END_IF label.
Debugging
• Debugging is an important part of any programming process,
including assembly language programming.
• It involves identifying and fixing errors or bugs in the code.
• Several common techniques can be used to debug assembly language
programs, including:
• Using a debugger:
• A debugger is a tool that allows you to step through the code one
instruction at a time and inspect the values of the registers and
memory locations.
• This can be useful for identifying where errors occur and
understanding how the program is executing.
• Single-stepping:
• Single-stepping is the process of executing the program one
instruction at a time.
• After each step the debugger shows the current values of the registers
and memory locations, so the first instruction that produces an
unexpected result can be pinpointed.
• Breakpoints:
• A breakpoint is a point in the code where the debugger will pause
execution.
• You can set breakpoints at specific points in the code to inspect the
values of the registers and memory locations and identify where
errors occur.
• Breakpoints can be especially useful in complex programs where
single-stepping may be impractical.
• Debugging output:
• Debugging output involves adding code that prints the values of
registers and memory locations at specific points in the program, for
example by calling an operating system or library output routine.
• This can be helpful in identifying where errors occur and how the
program is executing.
• Stack tracing:
• Stack tracing involves examining the contents of the stack to
determine how the program got to its current state.
• This can be helpful in identifying where errors occur and
understanding the flow of the program.
• Using a memory debugger:
• A memory debugger is a tool that can help you identify memory-
related errors, such as buffer overflows or memory leaks.
• Memory debuggers can be especially useful in large programs or
programs with complex memory usage.
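Single-stepping and breakpoints can be illustrated with a toy interpreter written in Python. Everything here is hypothetical scaffolding: `ToyCPU` understands only MOV, ADD, CMP, JL, JMP and labels, instructions are written as tuples rather than parsed from text, and breakpoints are instruction indexes rather than memory addresses.

```python
class ToyCPU:
    """Tiny interpreter for MOV/ADD/CMP/JL/JMP, used to illustrate stepping."""

    def __init__(self, program):
        self.program = program       # list of (mnemonic, *operands) tuples
        self.labels = {ins[1]: i for i, ins in enumerate(program)
                       if ins[0] == "LABEL"}
        self.regs = {"AX": 0, "BX": 0}
        self.less = False            # outcome of the last signed CMP
        self.ip = 0                  # index of the next instruction

    def value(self, operand):
        """Return a register's contents, or the operand itself if immediate."""
        return self.regs[operand] if operand in self.regs else operand

    def step(self):
        """Execute one instruction; return False once the program has ended."""
        if self.ip >= len(self.program):
            return False
        op, *args = self.program[self.ip]
        self.ip += 1
        if op == "MOV":
            self.regs[args[0]] = self.value(args[1])
        elif op == "ADD":
            self.regs[args[0]] += self.value(args[1])
        elif op == "CMP":
            self.less = self.value(args[0]) < self.value(args[1])
        elif op == "JL" and self.less:
            self.ip = self.labels[args[0]]
        elif op == "JMP":
            self.ip = self.labels[args[0]]
        return True                  # LABEL and untaken JL do nothing

    def run(self, breakpoints=()):
        """Step until the program ends or the next index is a breakpoint."""
        while self.step():
            if self.ip in breakpoints:
                return self.ip
        return None


PROGRAM = [
    ("MOV", "AX", 10),         # 0
    ("CMP", "AX", 20),         # 1
    ("JL", "LESS_THAN"),       # 2
    ("MOV", "BX", 1),          # 3: runs only if AX >= 20
    ("JMP", "END_IF"),         # 4
    ("LABEL", "LESS_THAN"),    # 5
    ("MOV", "BX", 2),          # 6: runs only if AX < 20
    ("LABEL", "END_IF"),       # 7
]

cpu = ToyCPU(PROGRAM)
print(cpu.run(breakpoints={6}))   # 6
print(cpu.regs)                   # {'AX': 10, 'BX': 0}
cpu.step()                        # single-step the instruction at index 6
print(cpu.regs)                   # {'AX': 10, 'BX': 2}
```

Execution pauses just before the instruction at the breakpoint, so the registers can be inspected both before and after it runs. Real debuggers offer the same view, with the addition of memory and flags.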
Debuggers
• OllyDbg is a popular and widely used assembly-level debugger for
Windows. It offers a user-friendly interface with features like code and
memory breakpoints, step-by-step execution, registers and memory
view, disassembly, and more. The released versions of OllyDbg debug
32-bit programs only.
• WinDbg is a powerful debugger provided by Microsoft for Windows. It
supports assembly code debugging as well as kernel mode debugging.
WinDbg offers advanced features like source-level debugging,
breakpoints, watchpoints, memory analysis, and more. It is commonly
used for debugging drivers and low-level software.
• GDB (GNU Debugger) is a versatile debugger that supports assembly
language debugging for various platforms, including Windows, Linux,
and macOS. It offers a command-line interface and provides features
like breakpoints, stepping, watchpoints, disassembly, and memory
examination. GDB is commonly used for debugging C/C++ programs,
but it can also handle assembly language debugging.
• IDA Pro is a professional disassembler and debugger that supports
analyzing and debugging assembly code for various platforms. It offers
an interactive and graphical interface with features like disassembly,
debugging, cross-references, function call graphs, and more. IDA Pro
is widely used for reverse engineering and malware analysis.
• Radare2 is an open-source reverse engineering framework that
includes a powerful command-line debugger. It supports assembly
code debugging for multiple architectures and offers features like
breakpoints, stepping, tracing, disassembly, and more. Radare2 is
highly extensible and can be customized to fit specific requirements.