How to use Strings in Assembly Language?

A string in assembly language is a sequence of characters that is stored in contiguous memory locations. Strings are used to represent text data, such as names, addresses, and messages.

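To declare a string in assembly language, the programmer uses a data directive that lists the characters of the string. In NASM, which the examples here use, that directive is db (define byte), and any terminator has to be written out explicitly. In the GNU assembler (GAS), the .ascii directive stores the characters without a terminator, while the .asciz directive appends a null character ('\0').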

    section .data
        myString db 'Hello, Assembly!', 0   ; Null-terminated string

In this example, myString is a null-terminated string that contains the text "Hello, Assembly!".
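As a minimal sketch of an alternative declaration (assuming NASM on 32-bit Linux; the labels msg and msg_len are illustrative and do not appear in the original example), the length can be computed at assembly time with equ and $, so neither a terminator nor a run-time scan is needed:

```nasm
section .data
    msg     db 'Hello, Assembly!', 10   ; the text followed by a newline (10), no null terminator
    msg_len equ $ - msg                 ; $ is the current address, so this is the byte count (17)

section .text
    global _start
_start:
    mov eax, 4          ; syscall: write
    mov ebx, 1          ; file descriptor: STDOUT
    mov ecx, msg        ; address of the string
    mov edx, msg_len    ; length known at assembly time
    int 0x80            ; call kernel

    mov eax, 1          ; syscall: exit
    xor ebx, ebx        ; status: 0
    int 0x80            ; call kernel
```

One way to build and run it, assuming the file is saved as hello.asm on an x86 Linux system with nasm and ld installed, is nasm -f elf32 hello.asm -o hello.o followed by ld -m elf_i386 hello.o -o hello.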

Access a character of a string

Once a string has been declared, the programmer can access the characters of the string using an index. The index is a number that identifies the position of the character in the string. The first character of the string has index 0, the second character has index 1, and so on.

To access a character of a string in NASM, the programmer writes a memory operand in square brackets ([]): the label of the string plus the index. For example, the following instruction loads the character at index 0 of the string myString into the register al:

    mov al, [myString + 0]
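The index can also come from a register. The following is a small sketch in NASM syntax for 32-bit Linux (the choice of esi as the index register and of index 7 is only for illustration) that reads one character and returns its code as the exit status so it can be inspected from the shell:

```nasm
section .data
    myString db 'Hello, Assembly!', 0   ; Null-terminated string

section .text
    global _start
_start:
    mov esi, 7                          ; index of the character to read
    movzx ebx, byte [myString + esi]    ; zero-extend the byte at index 7 into ebx
    mov eax, 1                          ; syscall: exit
    int 0x80                            ; exit status = character code
```

Running the program and then echo $? should print 65, the ASCII code of 'A', which is the character at index 7.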

Calculating String Length

To calculate the length of a string in assembly, you typically iterate through its characters until the null terminator is encountered. Here's an example using the ecx register to store the length:

    section .text
        global _start

    _start:
        mov ecx, 0                      ; Initialize loop counter

    find_length:
        cmp byte [myString + ecx], 0    ; Check for null terminator
        je found_length                 ; Jump if null terminator found
        inc ecx                         ; Increment loop counter
        jmp find_length                 ; Continue loop

    found_length:
        ; The length of the string is now in the ecx register

In this code, the find_length loop increments the ecx register until it finds the null terminator, indicating the end of the string. The length of the string is then stored in the ecx register.
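The same length can also be obtained with the x86 string instruction scasb. This is a different technique from the compare-and-increment loop above, sketched here under the assumption of NASM on 32-bit Linux; the length is returned as the exit status only to make it observable:

```nasm
section .data
    myString db 'Hello, Assembly!', 0   ; Null-terminated string

section .text
    global _start
_start:
    mov edi, myString   ; scasb compares al with the byte at [edi]
    xor eax, eax        ; al = 0, the byte to search for
    mov ecx, -1         ; maximum count (0xFFFFFFFF)
    cld                 ; scan forward (edi increases)
    repne scasb         ; decrement ecx once per byte until the null is found
    not ecx             ; ecx = bytes scanned, including the null
    dec ecx             ; ecx = length without the null (16)

    mov ebx, ecx        ; exit status = length
    mov eax, 1          ; syscall: exit
    int 0x80            ; call kernel
```

After the scan, ecx has been decremented once for each of the 16 characters and once for the terminator, so not ecx yields 17 and dec ecx yields 16.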

Printing a String

Printing a string in assembly usually means asking the operating system to do it through a system call. Here's an example for 32-bit Linux that uses the write system call, invoked with int 0x80, to print the string:

    ; Assuming the length of the string is in the ecx register
    ; (calculated in the previous step)

    ; Print the string
    mov eax, 4              ; syscall: write
    mov ebx, 1              ; file descriptor: STDOUT
    mov edx, ecx            ; length of the string
    lea ecx, [myString]     ; address of the string
    int 0x80                ; call kernel

In this code, the write system call is used to print the string to the standard output (STDOUT). The eax, ebx, ecx, and edx registers hold the syscall number, the file descriptor, the address of the string, and the length of the string, respectively. The length is copied from ecx into edx first, because ecx is then overwritten with the address.
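The register roles described above can be wrapped in a small reusable routine. The following is only a sketch, assuming NASM on 32-bit Linux; print_cstr, part1, and part2 are hypothetical names introduced for illustration, and the convention of passing the address in esi is an arbitrary choice:

```nasm
section .data
    part1 db 'Hello, ', 0           ; first null-terminated string
    part2 db 'Assembly!', 10, 0     ; second string, ending in a newline

section .text
    global _start

; print_cstr: writes the null-terminated string at esi to STDOUT.
; Clobbers eax, ebx, ecx, edx.
print_cstr:
    xor edx, edx                ; edx counts the characters
.count:
    cmp byte [esi + edx], 0     ; null terminator reached?
    je .write
    inc edx
    jmp .count
.write:
    mov eax, 4                  ; syscall: write
    mov ebx, 1                  ; file descriptor: STDOUT
    mov ecx, esi                ; address of the string
    int 0x80                    ; edx already holds the length
    ret

_start:
    mov esi, part1
    call print_cstr
    mov esi, part2
    call print_cstr

    mov eax, 1                  ; syscall: exit
    xor ebx, ebx                ; status: 0
    int 0x80                    ; call kernel
```

Counting directly in edx avoids the extra copy, since edx is where write expects the length.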

Iterate over the characters of a string

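Assembly language programmers can also use loops to iterate over the characters of a string. For example, the following loop prints each character of the string myString to the console. It assumes that _putchar is a routine supplied elsewhere that prints the character in al and preserves ecx:

    ; Loop over the characters of the string 'myString'.
        mov ecx, 0                  ; index of the current character
    loop_string:
        mov al, [myString + ecx]    ; load the current character
        cmp al, 0                   ; null terminator?
        je loop_done                ; yes: stop before printing it
        call _putchar               ; assumed routine: prints al, preserves ecx
        inc ecx
        jmp loop_string
    loop_done:
        ; The loop is finished.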
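For a version that does not depend on an external character-printing routine, each character can be written with its own one-byte write system call. This is a minimal sketch assuming NASM on 32-bit Linux; using esi as the character pointer is an illustrative choice, and one system call per character is slow compared with a single write of the whole string:

```nasm
section .data
    myString db 'Hello, Assembly!', 0   ; Null-terminated string

section .text
    global _start
_start:
    mov esi, myString       ; esi points at the current character
next_char:
    cmp byte [esi], 0       ; null terminator reached?
    je done
    mov eax, 4              ; syscall: write
    mov ebx, 1              ; file descriptor: STDOUT
    mov ecx, esi            ; address of the current character
    mov edx, 1              ; write exactly one byte
    int 0x80                ; call kernel (only eax is overwritten, with the return value)
    inc esi                 ; advance to the next character
    jmp next_char
done:
    mov eax, 1              ; syscall: exit
    xor ebx, ebx            ; status: 0
    int 0x80                ; call kernel
```

The pointer lives in esi rather than ecx because ecx must hold the buffer address for every write call.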

Exiting the Program

Finally, after performing the necessary operations, you may want to exit the program. Here's an example using the exit system call:

; Exit the program mov eax, 1 ; syscall: exit xor ebx, ebx ; status: 0 int 0x80 ; call kernel

The exit system call is used to terminate the program. The eax register contains the syscall number, and xor ebx, ebx sets the exit status to 0.

Full Source:
    section .data
        myString db 'Hello, Assembly!', 0   ; Null-terminated string

    section .text
        global _start

    _start:
        ; Calculate string length
        mov ecx, 0                      ; Initialize loop counter

    find_length:
        cmp byte [myString + ecx], 0    ; Check for null terminator
        je found_length                 ; Jump if null terminator found
        inc ecx                         ; Increment loop counter
        jmp find_length                 ; Continue loop

    found_length:
        ; The length of the string is now in the ecx register

        ; Print the string
        mov eax, 4                      ; syscall: write
        mov ebx, 1                      ; file descriptor: STDOUT
        mov edx, ecx                    ; length of the string
        lea ecx, [myString]             ; address of the string
        int 0x80                        ; call kernel

        ; Exit the program
        mov eax, 1                      ; syscall: exit
        xor ebx, ebx                    ; status: 0
        int 0x80                        ; call kernel
Output: Hello, Assembly!

Conclusion

Strings are represented as sequences of characters stored in contiguous memory locations, often terminated with a null character. String operations involve iterating through characters, calculating lengths, and using system calls for printing or other manipulations.