beta.blog

Writing a Demangler for C++ symbols in std C

by on Apr.02, 2023, under News

Name mangling (also known as symbol mangling) is the process by which a C++ compiler encodes the names of functions, variables, and other symbols into a form that is unique for each symbol and that includes information about its type signature. Mangling is necessary because C++ allows function overloading: multiple functions can share the same name as long as they differ in the types or number of their parameters. To tell these functions apart, the compiler generates a unique symbol name for each function that incorporates its signature.
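To make the idea concrete, here is a minimal sketch of how overloads end up with different symbol names. It assumes the Itanium-style scheme used by GCC and Clang and only covers free functions with hand-written parameter codes. The helper toy_mangle is hypothetical and written for this post; it is not part of any compiler or library.

```c
#include <stdio.h>
#include <string.h>

/* Toy encoder (hypothetical helper, for illustration only).
   Builds an Itanium-style mangled name for a free, non-template function.
   param_codes is written by hand, e.g. "i" for int, "d" for double,
   "PKc" for const char *. Real manglers handle far more than this. */
static int toy_mangle(const char *name, const char *param_codes,
                      char *out, size_t out_size)
{
    /* "_Z" marks a mangled name; the identifier is prefixed by its length,
       and the parameter type codes follow directly after it. */
    int n = snprintf(out, out_size, "_Z%zu%s%s",
                     strlen(name), name, param_codes);

    /* Report failure if the result did not fit into the buffer. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With this sketch, toy_mangle("print", "i", buf, sizeof buf) yields _Z5printi for void print(int), while toy_mangle("print", "PKc", buf, sizeof buf) yields _Z5printPKc for void print(const char *). The two overloads share a source name but get distinct symbols.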

The exact details of the mangling process are compiler-specific and not part of the C++ standard. GCC and Clang follow the mangling scheme of the Itanium C++ ABI, which was originally developed for the Itanium architecture and is now used on most Unix-like platforms; Microsoft's compiler uses its own scheme. Under the Itanium scheme, a mangled name starts with the prefix _Z, followed by the length-prefixed name of the symbol and short codes for its parameter types. The resulting mangled names are often long and complex, and are not intended to be human-readable.
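As a rough illustration of these rules, the following sketch decodes only the name part of a very small subset of Itanium mangled names: a nested name of the form _ZN ... E made of length-prefixed identifiers, optionally ending in the constructor codes C1 or C2. The function toy_demangle_name is a hypothetical helper written for this post. It ignores the parameter codes after the E and rejects everything else, so it is no substitute for a real demangler.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Toy decoder (hypothetical, for illustration only) for the subset
   _Z N <len><identifier>... [C1|C2] E <parameter codes, ignored>
   Writes the qualified name, e.g. "Person::Person", into out.
   Returns 0 on success and -1 for anything outside this subset. */
static int toy_demangle_name(const char *m, char *out, size_t out_size)
{
    char last[128] = "";   /* most recent identifier, reused for constructors */
    size_t used = 0;

    if (out_size == 0) return -1;
    out[0] = '\0';
    if (strncmp(m, "_ZN", 3) != 0) return -1;
    m += 3;

    while (*m != 'E') {
        char part[128];
        if (isdigit((unsigned char)*m)) {
            /* Decimal length followed by that many identifier characters. */
            size_t len = 0;
            while (isdigit((unsigned char)*m)) {
                len = len * 10 + (size_t)(*m++ - '0');
                if (len >= sizeof part) return -1;
            }
            if (len == 0 || strlen(m) < len) return -1;
            memcpy(part, m, len);
            part[len] = '\0';
            m += len;
            strcpy(last, part);
        } else if (m[0] == 'C' && (m[1] == '1' || m[1] == '2') && last[0]) {
            /* C1/C2 mark a constructor, spelled like the enclosing class. */
            strcpy(part, last);
            m += 2;
        } else {
            return -1;     /* end of string or unsupported construct */
        }
        int n = snprintf(out + used, out_size - used, "%s%s",
                         used ? "::" : "", part);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return used ? 0 : -1;
}
```

For the symbol _ZN6PersonC1EPcii, the sketch reads the identifier Person (length 6), then the constructor code C1, and stops at E, producing Person::Person.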

We will now write a demangler in C, based on the libiberty library from the GNU GCC project. First, we fetch and build the library:

git clone https://gcc.gnu.org/git/gcc.git
cd gcc/libiberty
./configure
make

Mangled names are used by the compiler to generate object files and executable files, and are also used by linkers to resolve symbols when linking multiple object files together. Because mangled names are not intended to be human-readable, it can be difficult to determine the original name of a function or variable from its mangled name. However, there are tools and libraries available (such as the GNU libiberty library) that can be used to demangle mangled names and convert them back into their original human-readable form.

The build steps above give us the compiled library libiberty.a. We will now write a C program, demangle.c, that uses this library.

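#include <stdlib.h>
#include <stdio.h>
#include <demangle.h>

int main(void) {
    /* Mangled name of a constructor of the class Person. */
    const char *symbol_name = "_ZN6PersonC1EPcii";
    /* Options 0 (DMGL_NO_OPTS) leaves the parameter list out of the result.
       The returned string is heap-allocated and must be freed by the caller. */
    char *demangled_name = cplus_demangle(symbol_name, 0);

    if (demangled_name != NULL) {
        printf("Demangled name: %s\n", demangled_name);
        free(demangled_name);
    } else {
        /* cplus_demangle returns NULL if the name cannot be demangled. */
        printf("Could not demangle name: %s\n", symbol_name);
    }

    return 0;
}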

Assuming that the previously cloned gcc folder is in the same directory, we can compile the program as follows:

clang -I gcc/include/ -L gcc/libiberty/ demangle.c -liberty -o demangle
  • The flag -I points to the folder containing demangle.h, which we include in our program.
  • The flag -L points to the folder where the compiled library libiberty.a is located.
  • The flag -l (lowercase L) gives the name of the library libiberty.a with the lib prefix and the file extension removed.
  • The source file is listed before -liberty because some linkers resolve static libraries in command-line order.

When we run this program, we get the human-readable name for the mangled symbol _ZN6PersonC1EPcii:

Demangled name: Person::Person
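The output above leaves out the parameter list, because we passed 0 as the options argument; libiberty's demangle.h defines the option DMGL_PARAMS for including function arguments. The parameters are encoded in the trailing Pcii of the symbol. As a rough sketch of how such a suffix can be read, the following code decodes a handful of builtin type codes plus the P (pointer), R (reference) and K (const) prefixes. The names builtin_name, append, decode_type and toy_decode_params are hypothetical helpers written for this post, and the set of supported codes is deliberately incomplete.

```c
#include <stdio.h>
#include <string.h>

/* Spelling of a few builtin type codes; NULL for anything not covered. */
static const char *builtin_name(char code)
{
    switch (code) {
    case 'v': return "void";
    case 'b': return "bool";
    case 'c': return "char";
    case 'i': return "int";
    case 'j': return "unsigned int";
    case 'l': return "long";
    case 'f': return "float";
    case 'd': return "double";
    default:  return NULL;
    }
}

/* Appends s to the string in out, failing if it would not fit. */
static int append(char *out, size_t out_size, const char *s)
{
    size_t used = strlen(out);
    if (used + strlen(s) + 1 > out_size) return -1;
    strcpy(out + used, s);
    return 0;
}

/* Decodes one type starting at *p, appends it to out, and advances *p. */
static int decode_type(const char **p, char *out, size_t out_size)
{
    char code = **p;
    if (code == '\0') return -1;
    (*p)++;
    if (code == 'P' || code == 'R' || code == 'K') {
        /* Qualifiers wrap the type that follows, so decode that first. */
        if (decode_type(p, out, out_size) != 0) return -1;
        return append(out, out_size,
                      code == 'P' ? "*" : code == 'R' ? "&" : " const");
    }
    const char *name = builtin_name(code);
    return name ? append(out, out_size, name) : -1;
}

/* Decodes a parameter suffix such as "Pcii" into "char*, int, int".
   Returns 0 on success and -1 on unsupported codes or a full buffer. */
static int toy_decode_params(const char *codes, char *out, size_t out_size)
{
    if (out_size == 0) return -1;
    out[0] = '\0';
    while (*codes) {
        if (out[0] && append(out, out_size, ", ") != 0) return -1;
        if (decode_type(&codes, out, out_size) != 0) return -1;
    }
    return 0;
}
```

Calling toy_decode_params("Pcii", buf, sizeof buf) fills buf with char*, int, int, so our symbol names a constructor of Person that takes a char pointer and two ints.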
