In the previous article, we examined how to dynamically link functions in Mach-O libraries. Now let’s move on to practice.
We have a macOS program that uses a number of third-party dynamically linked libraries, which, in turn, call each other’s functions.
The task is as follows: we need to use a handler to intercept a function call made by one library to another and then call the original function. This article will be useful for Mac software developers who need to redirect imported functions in Mach-O libraries.
Test example
Let’s suppose we have a program called test that’s written in C (test.c) and a shared library (libtest.c) that’s compiled beforehand into libtest.dylib. This library implements a single function, libtest.
Both the program and the library use the puts function from the standard C library that’s provided with macOS and is contained in libSystem.B.dylib.
The task is the following:
- Replace the call to the puts function in the libtest.dylib library with a call to the hooked_puts function that’s implemented in the main program (test.c). The hooked_puts function will then call the original puts function.
- Cancel the previously made changes so that the next call to libtest leads to a call to the original puts function.
While doing this, we cannot change the code of the libraries or recompile them. We can only change and recompile the main program. The call redirection itself should be performed only for a specific library and on the fly, without restarting the program.
Redirection algorithm
Let’s describe the redirection steps in words, as the code can be hard to follow despite the many comments:
- Find the symbol table and the string table using data from the LC_SYMTAB load command.
- From the LC_DYSYMTAB load command, find out at which element of the symbol table the subset of undefined symbols begins (the iundefsym field).
- Find the target symbol by name among the subset of undefined symbols in the symbol table.
- Save the index of the target symbol counted from the beginning of the symbol table.
- Find the indirect symbol table (the indirectsymoff field) using data from the LC_DYSYMTAB load command.
- Find the index in the indirect symbol table at which the mapping of the import table (the contents of the __DATA, __la_symbol_ptr section or the __IMPORT, __jump_table section) begins (the reserved1 field).
- Starting from this index, look through the indirect symbol table and search for the value that equals the index of the target symbol in the symbol table.
- Save the position of the matching element counted from the start of that mapping. This saved value is the index of the required element in the import table.
- Find the import table (the offset field) using data from the __la_symbol_ptr section or __jump_table.
- Using the saved index of the target element, overwrite the address stored in __la_symbol_ptr with the required value, or, for __jump_table, replace the CALL/JMP instruction with a JMP to the required function.
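The two lookups in the middle of this list (finding the symbol by name, then finding its slot through the indirect symbol table) can be sketched as follows. This is a minimal illustration, not the code from the downloadable project: struct sym_entry is a simplified stand-in for the 32-bit struct nlist, and the function names, the NOT_FOUND marker, and the slot_count parameter are hypothetical. The sketch also assumes that the caller passes the plain C name and that the string table stores it with a leading underscore (puts is stored as _puts).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the 32-bit struct nlist from <mach-o/nlist.h>. */
struct sym_entry {
    uint32_t n_strx;   /* offset of the symbol name in the string table */
    uint8_t  n_type;
    uint8_t  n_sect;
    int16_t  n_desc;
    uint32_t n_value;
};

#define NOT_FOUND UINT32_MAX   /* hypothetical "no match" marker */

/* Search the undefined-symbol subset [iundefsym, iundefsym + nundefsym)
   for `name` and return the symbol's index counted from the start of
   the whole symbol table. */
uint32_t find_undefined_symbol(const struct sym_entry *symbols,
                               const char *strings,
                               uint32_t iundefsym, uint32_t nundefsym,
                               const char *name)
{
    assert(symbols && strings && name);
    for (uint32_t i = iundefsym; i < iundefsym + nundefsym; ++i) {
        const char *candidate = strings + symbols[i].n_strx;
        /* Assumption: C symbols are stored with a leading underscore. */
        if (candidate[0] == '_' && strcmp(candidate + 1, name) == 0)
            return i;
    }
    return NOT_FOUND;
}

/* Scan the part of the indirect symbol table that maps the import table
   (it starts at index reserved1 and covers slot_count entries) and
   return the import-table slot that refers to symbol_index. */
uint32_t find_import_slot(const uint32_t *indirect, uint32_t reserved1,
                          uint32_t slot_count, uint32_t symbol_index)
{
    assert(indirect);
    for (uint32_t slot = 0; slot < slot_count; ++slot) {
        if (indirect[reserved1 + slot] == symbol_index)
            return slot;
    }
    return NOT_FOUND;
}
```

The value returned by find_import_slot plays the role of the saved import table index from the list above. In a real module, slot_count would be derived from the size of the import section.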
Note that you should work with the symbol table, the string table, and the indirect symbol table only after loading them from the Mach-O file on disk. The contents of the sections that describe import tables, on the other hand, should be read in memory, and the redirection itself is also performed in memory. The reason is that in the loaded image the symbol and string tables may be absent or may not reflect the real state: the dynamic loader has already saved all the symbol data it needs without allocating the tables themselves.
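Loading a table from the file comes down to seeking to its offset and reading a known number of bytes. Below is a minimal sketch of such a helper; read_table is a hypothetical name rather than part of the project, and the offset and size are assumed to come from load command fields such as symoff, stroff, or indirectsymoff.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read `size` bytes that start `offset` bytes into `file` and return
   them in a freshly allocated buffer. Returns NULL on any failure.
   The caller owns the buffer and releases it with free(). */
void *read_table(FILE *file, long offset, size_t size)
{
    assert(file);
    if (size == 0 || fseek(file, offset, SEEK_SET) != 0)
        return NULL;

    void *buffer = malloc(size);
    if (!buffer)
        return NULL;

    if (fread(buffer, 1, size, file) != size) {
        free(buffer);   /* short read: the table is truncated */
        return NULL;
    }
    return buffer;
}
```

For a universal (fat) binary, the offsets stored in load commands are relative to the start of the architecture slice, so the slice offset has to be added before calling a helper like this.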
Implementing redirection
Now it’s time to turn our thoughts to the code. Let’s divide all operations into three stages to optimize the search for required Mach-O elements:
void *mach_hook_init(char const *library_filename, void const *library_address);
Based on the Mach-O file and its image in memory, this function returns an opaque descriptor. Behind this descriptor are the offset of the import table, the symbol table, the string table, and the indirect symbol table from the dynamic symbol table, as well as a number of useful indexes for this module. The descriptor looks like this:
struct mach_hook_handle
{
    void const *library_address; //base address of the library in memory
    char const *string_table; //buffer holding the string table read from the file
    struct nlist const *symbol_table; //buffer holding the symbol table read from the file
    uint32_t const *indirect_table; //buffer holding the indirect symbol table (referenced by DYSYMTAB) read from the file
    uint32_t undefined_symbols_count; //number of undefined symbols in the symbol table
    uint32_t undefined_symbols_index; //position of the first undefined symbol in the symbol table
    uint32_t indirect_symbols_count; //number of indirect symbols in the indirect symbol table of DYSYMTAB
    uint32_t indirect_symbols_index; //index of the first imported symbol in the indirect symbol table of DYSYMTAB
    uint32_t import_table_offset; //the offset of (__DATA, __la_symbol_ptr) or (__IMPORT, __jump_table)
    uint32_t jump_table_present; //flag that shows whether we work with (__IMPORT, __jump_table)
};
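Several of these fields are filled from the LC_SYMTAB and LC_DYSYMTAB load commands, so the initialization stage first has to locate those commands. The sketch below shows one way to walk a list of load commands. It is an illustration under assumptions: struct load_cmd_header is a stand-in for struct load_command, find_load_command and total_size are hypothetical names, and the two constants repeat the values from <mach-o/loader.h> so that the sketch does not depend on macOS headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CMD_SYMTAB   0x2u   /* value of LC_SYMTAB */
#define CMD_DYSYMTAB 0xbu   /* value of LC_DYSYMTAB */

/* Every load command starts with these two fields. */
struct load_cmd_header {
    uint32_t cmd;       /* command type */
    uint32_t cmdsize;   /* size of the whole command in bytes */
};

/* Walk `ncmds` load commands stored back to back in `commands`
   (`total_size` bytes in all) and return the first command whose type
   is `wanted`, or NULL if there is none or the list is malformed. */
const void *find_load_command(const void *commands, uint32_t ncmds,
                              size_t total_size, uint32_t wanted)
{
    assert(commands);
    const uint8_t *cursor = commands;
    const uint8_t *end = cursor + total_size;

    for (uint32_t i = 0; i < ncmds; ++i) {
        struct load_cmd_header header;
        if ((size_t)(end - cursor) < sizeof header)
            return NULL;   /* truncated command list */
        memcpy(&header, cursor, sizeof header);
        if (header.cmdsize < sizeof header ||
            header.cmdsize > (size_t)(end - cursor))
            return NULL;   /* malformed command size */
        if (header.cmd == wanted)
            return cursor;
        cursor += header.cmdsize;
    }
    return NULL;
}
```

In a real image, the command list starts right after the Mach-O header, and ncmds and the total size come from the header's ncmds and sizeofcmds fields.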
mach_substitution mach_hook(void const *handle, char const *function_name, mach_substitution substitution);
This function performs the redirection according to the algorithm described above. It takes the library descriptor, the name of the target symbol, and the address of the interceptor, and returns the original value so that it can be restored later.
void mach_hook_free(void *handle);
This function cleans up any descriptor returned by mach_hook_init.
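At the lowest level, the second stage ends with a single write into the import table. A minimal sketch of both variants follows. The function names are hypothetical, the table is assumed to be writable, and the buffers stand in for the real sections in memory. For __jump_table the sketch relies on the x86 encoding of a near jump: opcode 0xE9 followed by a 32-bit little-endian displacement measured from the end of the 5-byte instruction.

```c
#include <assert.h>
#include <stdint.h>

/* __la_symbol_ptr case: each slot holds the address that the stub
   jumps to. Store the new address and return the previous one so the
   caller can restore it later. */
uintptr_t swap_pointer_slot(uintptr_t *slots, uint32_t index,
                            uintptr_t replacement)
{
    assert(slots);
    uintptr_t previous = slots[index];
    slots[index] = replacement;
    return previous;
}

#define JUMP_ENTRY_SIZE  5u      /* one opcode byte + 32-bit displacement */
#define OPCODE_JMP_REL32 0xE9u

/* __jump_table case (i386): overwrite one 5-byte entry with
   "JMP rel32" that lands on `target`. `entry_address` is the address
   the entry occupies in the process, which may differ from the buffer
   pointer when the entry is prepared in a scratch copy. */
void write_jump_entry(uint8_t *entry, uint32_t entry_address,
                      uint32_t target)
{
    assert(entry);
    /* The displacement is relative to the next instruction. */
    uint32_t displacement = target - (entry_address + JUMP_ENTRY_SIZE);
    entry[0] = OPCODE_JMP_REL32;
    entry[1] = (uint8_t)(displacement & 0xFFu);
    entry[2] = (uint8_t)((displacement >> 8) & 0xFFu);
    entry[3] = (uint8_t)((displacement >> 16) & 0xFFu);
    entry[4] = (uint8_t)((displacement >> 24) & 0xFFu);
}
```

Calling swap_pointer_slot a second time with the returned value undoes the change, which mirrors how the test program below restores the original puts. The jump table variant does not return the previous entry in this sketch, so a caller would have to save the original five bytes before overwriting them.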
Taking into account these prototypes, we need to rewrite the test program:
#include <stdio.h>
#include <dlfcn.h>
#include "mach_hook.h"

#define LIBTEST_PATH "libtest.dylib"

void libtest(void); //from libtest.dylib

int hooked_puts(char const *s)
{
    puts(s); //calls the original puts() from libSystem.B.dylib because our main executable module called "test" remains intact
    return puts("HOOKED!");
}

int main(void)
{
    void *handle = 0; //handle to store hook-related info
    mach_substitution original; //original data for restoration
    Dl_info info;

    if (!dladdr((void const *)libtest, &info)) //gets the base address of the library that contains the libtest() function
    {
        fprintf(stderr, "Failed to get the base address of library %s!\n", LIBTEST_PATH);
        goto end;
    }

    handle = mach_hook_init(LIBTEST_PATH, info.dli_fbase);
    if (!handle)
    {
        fprintf(stderr, "Redirection init failed!\n");
        goto end;
    }

    libtest(); //calls puts() from libSystem.B.dylib
    puts("-----------------------------");

    original = mach_hook(handle, "puts", (mach_substitution)hooked_puts);
    if (!original)
    {
        fprintf(stderr, "Redirection failed!\n");
        goto end;
    }

    libtest(); //calls hooked_puts()
    puts("-----------------------------");

    original = mach_hook(handle, "puts", original); //restores the original relocation
    if (!original)
    {
        fprintf(stderr, "Restoration failed!\n");
        goto end;
    }

    libtest(); //again calls puts() from libSystem.B.dylib

end:
    mach_hook_free(handle);
    handle = 0; //no effect here but advisable to prevent double freeing
    return 0;
}
Testing our solution
Let’s run the test for both architectures:
user@mac$ arch -i386 ./test
libtest: calls the original puts()
-----------------------------
libtest: calls the original puts()
HOOKED!
-----------------------------
libtest: calls the original puts()
user@mac$ arch -x86_64 ./test
libtest: calls the original puts()
-----------------------------
libtest: calls the original puts()
HOOKED!
-----------------------------
libtest: calls the original puts()
The program output shows that the task formulated at the beginning has been fully accomplished: the call from libtest is redirected to hooked_puts and then restored to the original puts.
Conclusion
In this article, we provided a practical example of how to redirect an imported function in a Mach-O library used by a macOS program. You can download the test example together with the redirection algorithm and the project file at the link below.