Visual C++ name mangling

Visual C++ name mangling is a mangling (decoration) scheme used in Microsoft's Visual C++ series of compilers. It provides a way of encoding the name and additional information about a function, structure, class or another datatype in order to pass more semantic information from the Microsoft Visual C++ compiler to its linker. Visual Studio and the Windows SDK (which includes the command line compilers) come with the program undname, which may be invoked to obtain the C-style function prototype encoded in a mangled name. The information below has been mostly reverse-engineered; there is no official documentation for the actual algorithm used.

Overview

Any object code produced by the compiler is usually linked with other pieces of object code by the linker. The linker relies on unique object names for identification but C++ (and many modern programming languages) allows different entities to be named with the same identifier as long as they occupy a different namespace. Names need to be mangled by the compiler to make them distinct before reaching the linker. The linker also needs information on each program entity. For example, to correctly link a function it needs its name, the number of arguments and their types. C++ decoration can become complex (storing information about classes, templates, namespaces, operator overloading, etc.).

The C++ language does not define a standard decoration scheme, so each C++ compiler uses its own.

Basic Structure

All mangled C++ names start with ? (question mark). Because all mangled C names start with an alphanumeric character, @ (at-sign) or _ (underscore), C++ names can be distinguished from C names.

The structure of mangled names looks like this:

Prefix ?
Optional: Prefix @?
Qualified name
Type information (see below)

Function

Type information in function names generally looks like this:

Access level and function type
Conditional: CV-class modifier of function, if non-static member function
Function property

Data

Type information in data names looks like this:

Access level and storage class
Data type
CV-class modifier

Elements

A mangled name is built from many elements, which are described below.

Name

Qualified name consists of the following fragments:

Basic name: one of: name fragment or special name
Qualification #1: one of: name fragment, name with template arguments, numbered namespace or back reference
Qualification #2
...
Terminator @

Qualification is written in reversed order. For example myclass::nested::something becomes something@nested@myclass@@.
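The reversal rule can be illustrated with a short Python sketch. It is a minimal illustration that assumes every part of the name is a plain name fragment (no templates, special names or back references); the function name mangle_qualified_name is made up for this example.

```python
def mangle_qualified_name(cpp_name: str) -> str:
    """Turn 'a::b::c' into 'c@b@a@@' (plain name fragments only)."""
    fragments = cpp_name.split("::")
    # The innermost name comes first; each fragment gets a trailing '@'.
    body = "".join(part + "@" for part in reversed(fragments))
    # One more '@' terminates the qualified name.
    return body + "@"


print(mangle_qualified_name("myclass::nested::something"))
```

Run on the example above, the sketch prints something@nested@myclass@@.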

Name Fragment

A fragment of a name is simply represented as the name with trailing @.

Special Name

Special names are represented as a code with a preceding ?. Most special names denote constructors, destructors, operators or internal symbols. The table below lists the known codes.

Code	Meaning
`0`	Constructor
`1`	Destructor
`2`	`operator new`
`3`	`operator delete`
`4`	`operator =`
`5`	`operator >>`
`6`	`operator <<`
`7`	`operator !`
`8`	`operator ==`
`9`	`operator !=`
`A`	`operator[]`
`B`	`operator returntype`
`C`	`operator ->`
`D`	`operator *`
`E`	`operator ++`
`F`	`operator --`
`G`	`operator -`
`H`	`operator +`
`I`	`operator &`
`J`	`operator ->*`
`K`	`operator /`
`L`	`operator %`
`M`	`operator <`
`N`	`operator <=`
`O`	`operator >`
`P`	`operator >=`
`Q`	`operator,`
`R`	`operator ()`
`S`	`operator ~`
`T`	`operator ^`
`U`	`operator \|`
`V`	`operator &&`
`W`	`operator \|\|`
`X`	`operator *=`
`Y`	`operator +=`
`Z`	`operator -=`
`_0`	`operator /=`
`_1`	`operator %=`
`_2`	`operator >>=`
`_3`	`operator <<=`
`_4`	`operator &=`
`_5`	`operator \|=`
`_6`	`operator ^=`
`_7`	'vftable'
`_8`	'vbtable'
`_9`	'vcall'
`_A`	'typeof'
`_B`	'local static guard'
`_C`	String constant (see below)
`_D`	'vbase destructor'
`_E`	'vector deleting destructor'
`_F`	'default constructor closure'
`_G`	'scalar deleting destructor'
`_H`	'vector constructor iterator'
`_I`	'vector destructor iterator'
`_J`	'vector vbase constructor iterator'
`_K`	'virtual displacement map'
`_L`	'eh vector constructor iterator'
`_M`	'eh vector destructor iterator'
`_N`	'eh vector vbase constructor iterator'
`_O`	'copy constructor closure'
`_P`	'udt returning' (prefix)
`_Q`	Unknown
`_R`	RTTI-related code (see below)
`_S`	'local vftable'
`_T`	'local vftable constructor closure'
`_U`	`operator new[]`
`_V`	`operator delete[]`
`_W`	'omni callsig'
`_X`	'placement delete closure'
`_Y`	'placement delete[] closure'
`__A`	'managed vector constructor iterator'
`__B`	'managed vector destructor iterator'
`__C`	'eh vector copy constructor iterator'
`__D`	'eh vector vbase copy constructor iterator'
`__E`	'dynamic initializer' (Used by CRT entry point to construct non-trivial? global objects)
`__F`	'dynamic atexit destructor' (Used by CRT to destroy non-trivial? global objects on program exit)
`__G`	'vector copy constructor iterator'
`__H`	'vector vbase copy constructor iterator'
`__I`	'managed vector copy constructor iterator'
`__J`	'local static thread guard'
`__K`	user-defined literal operator
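As a rough illustration of how these codes are read, the following Python sketch extracts the code that follows the leading ?? of a symbol and looks it up in a small subset of the table. The helper names and the sample symbols are assumptions made for this example; RTTI codes (_R plus a digit) and names starting with ??$ are deliberately not handled.

```python
# A small subset of the table above; extend as needed.
SPECIAL_NAMES = {
    "0": "constructor",
    "1": "destructor",
    "2": "operator new",
    "3": "operator delete",
    "4": "operator =",
    "_7": "'vftable'",
    "_G": "'scalar deleting destructor'",
    "_U": "operator new[]",
    "__E": "'dynamic initializer'",
}


def special_name_code(mangled: str) -> str:
    """Return the special-name code that follows the leading '??'."""
    if not mangled.startswith("??"):
        raise ValueError("symbol does not start with a special name")
    rest = mangled[2:]
    if rest.startswith("__"):
        return rest[:3]   # codes such as __E
    if rest.startswith("_"):
        return rest[:2]   # codes such as _7
    return rest[:1]       # codes such as 0


def describe_special_name(mangled: str) -> str:
    """Look the code up; codes outside the subset are reported as unknown."""
    return SPECIAL_NAMES.get(special_name_code(mangled), "unknown")


print(describe_special_name("??0Foo@@QAE@XZ"))
```

The one-, two- and three-character codes can be told apart by counting leading underscores, which is why the lookup needs no further context.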

Below are the RTTI-related codes (all starting with _R). Some codes have trailing parameters.

Code	Meaning	Trailing Parameters
`_R0`	type 'RTTI Type Descriptor'	Data type type.
`_R1`	'RTTI Base Class Descriptor at (a,b,c,d)'	Four encoded numbers: a, b, c and d.
`_R2`	'RTTI Base Class Array'	None.
`_R3`	'RTTI Class Hierarchy Descriptor'	None.
`_R4`	'RTTI Complete Object Locator'	None.

String constants (all starting with _C@_):

The name corresponds to the value stored in a read-only COMDAT section, in order to avoid duplicate storage of the same string. These sections are generated only if the /GF switch is given to the Microsoft compiler.

The entire name consists of:

_C@_0 or _C@_1. Indicates single- or double-byte characters, respectively.
Length of the string in bytes (encoded number). Includes null terminating character, if any.
A 32-bit value (encoded number). Meaning unknown, presumably a hash of the string.
The bytes of the string (up to the first 32 characters only). For double-byte strings, the bytes are in big-endian order. They can be interpreted as Unicode text using the UTF-16BE encoding. Each byte is encoded as:

Code	meaning
?$xx	2 hex digits encoded as A (which means 0) to P (15).
?0-9	corresponding char in string `",/\:. {ctrl-K}{ctrl-J}'-"`.
?a-p or ?A-P	corresponding ASCII char + hex 80.
(other)	the actual character

Possibly another encoded number, meaning unknown.
Terminating @ character.

For example, the complete name _C@_1CK@EOPGIILJ@?$AAi?$AAn?$AAv?$AAa?$AAl?$AAi?$AAd?$AA?5?$AAn?$AAu?$AAl?$AAl?$AA?5?$AAp?$AAo?$AAi?$AAn?$AAt?$AAe?$AAr?$AA?$AA@ represents the 21-character double-byte string "invalid null pointer\0". All characters have 0 for their high order byte.
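The byte encoding can be checked against this example with a short Python sketch. It decodes only the byte portion of the name (the part after the length and the 32-bit value); the function name decode_string_bytes is an assumption for this example, and the table for ?0 to ?9 is copied from the table above rather than verified independently.

```python
# Characters for ?0 .. ?9, copied from the table above.
SPECIAL_CHARS = ",/\\:. \x0b\n'-"


def decode_string_bytes(encoded: str) -> bytes:
    """Decode the byte portion of a string-constant name."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch != "?":
            out.append(ord(ch))            # the actual character
            i += 1
        elif encoded[i + 1] == "$":
            high = ord(encoded[i + 2]) - ord("A")   # A..P means 0..15
            low = ord(encoded[i + 3]) - ord("A")
            out.append(high * 16 + low)
            i += 4
        elif encoded[i + 1].isdigit():
            out.append(ord(SPECIAL_CHARS[int(encoded[i + 1])]))
            i += 2
        else:
            out.append(ord(encoded[i + 1]) + 0x80)  # letter + hex 80
            i += 2
    return bytes(out)


EXAMPLE = ("?$AAi?$AAn?$AAv?$AAa?$AAl?$AAi?$AAd?$AA?5?$AAn?$AAu?$AAl?$AAl"
           "?$AA?5?$AAp?$AAo?$AAi?$AAn?$AAt?$AAe?$AAr?$AA?$AA")
raw = decode_string_bytes(EXAMPLE)
print(len(raw), repr(raw.decode("utf-16-be")))
```

The decoded data is 42 bytes long, which agrees with the encoded length CK (hexadecimal 2A) in the example name.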

It is possible, but very unlikely, for two different strings to be given the same symbol name. The strings would have to have the same first 32 characters, the same length, and the same hash value. The MSVC compiler generates COMDAT sections which tell the linker to "pick any" section with the same symbol name, ignoring the contents. Therefore, the linker will not catch the discrepancy.

Name with Template Arguments

Name fragments starting with ?$ have template arguments. This kind of name looks like this:

Prefix ?$
Name terminated by @
Template argument list

For example, consider the following prototype:

void __cdecl abc<def<int>,void*>::xyz(void);

The name of this function can be obtained by the following process:

abc<def<int>,void*>::xyz
xyz@ abc<def<int>,void*> @
xyz@ ?$abc@ def<int> void* @ @
xyz@ ?$abc@ V def<int> @ PAX @ @
xyz@ ?$abc@ V ?$def@H@ @ PAX @ @
xyz@?$abc@V?$def@H@@PAX@@

So the mangled name for this function is ?xyz@?$abc@V?$def@H@@PAX@@YAXXZ.
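The same steps can be reproduced mechanically. The Python sketch below assembles the example from its parts; the helper names are invented for this illustration, and the type codes H, PAX and YAXXZ are taken as given from the example rather than derived.

```python
def name_fragment(name: str) -> str:
    """A plain name fragment: the name with a trailing '@'."""
    return name + "@"


def template_name(name: str, arg_codes: list) -> str:
    """'?$' + name + '@' + template arguments + '@' ending the argument list."""
    return "?$" + name + "@" + "".join(arg_codes) + "@"


def class_type(qualification: str) -> str:
    """A class used as a type: 'V' + qualification + terminating '@'."""
    return "V" + qualification + "@"


def_int = template_name("def", ["H"])                      # def<int>
abc = template_name("abc", [class_type(def_int), "PAX"])   # abc<def<int>,void*>
qualified = name_fragment("xyz") + abc + "@"               # reversed order + terminator
symbol = "?" + qualified + "YAXXZ"                         # prefix + type information

print(qualified)
print(symbol)
```

The two printed lines match the last step of the derivation and the final mangled name given above.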

Nested Name

A name fragment starting with ?? denotes a nested name. This is a name inside a local scope which must be exported. Its structure looks like the following:

Optional sequence number for multiple occurrences of same name in the same local scope. This can only happen if the scope is a function, with the name being declared in multiple blocks. It consists of:
- ?
- encoded number.
Prefix ?
C++ Mangled name (so starting with ? again), which names the local scope.

For example, ?nested@??func@@YAXXZ@4HA means variable ?nested@@4HA (int nested) inside ?func@@YAXXZ (void __cdecl func(void)). The UnDecorateSymbolName function returns int `void __cdecl func(void)'::nested for this input.

Similarly, ?CONST@?1??main@@9@4HB means constant ?CONST@@4HB (const int CONST) inside ?main@@9 (main), where the compiler chose the number 2 to associate with it. The UnDecorateSymbolName function returns int const `main'::`2'::CONST for this input.

Numbered Namespace

In qualification, a numbered namespace is represented as a preceding ? and an unsigned number. The UnDecorateSymbolName function returns something like '42' for this kind of input.

Exceptionally, if a numbered namespace starts with ?A, it becomes an anonymous namespace ('anonymous namespace').

Back Reference

Decimal digits 0 to 9 refer to the first through 10th shown name fragments. Referred name fragments can be normal name fragments or name fragments with template arguments. For example, in alpha@?1beta@@ (beta::'2'::alpha), 0 refers to alpha@, and 1 (not 2) refers to beta@.

Generally, the back reference table is kept during the entire mangling process. This means you can use a back reference to the function name in the function arguments (which appear after the function name). However, in the template argument list, the back reference table is separately created.

For example, assume ?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@ (std::basic_string<unsigned short, std::char_traits<unsigned short>, std::allocator<unsigned short> >). In std::basic_string<...>, 0 refers to basic_string@, 1 refers to ?$char_traits@G@, and 2 refers to std@. This relation doesn't change wherever it appears.

Encoded Number

In name mangling, sometimes numbers must be represented (e.g. array indices). There are simple rules for this:

0 to 9 represent the numbers 1 to 10.
num@ represents a hexadecimal number, where num consists of hexadecimal digits A (which means 0) to P (15). For example BCD@ means number 0x123, or 291 in decimal notation.
@ represents the number 0.
If allowed, the prefix ? represents a minus sign. Note that both ?@ and @ represent number 0.
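These rules translate directly into a small decoder. The Python sketch below is one possible reading of them; the function name decode_number is an assumption for this example. It returns both the value and the number of characters consumed, so that it could be called from the middle of a longer mangled name.

```python
def decode_number(text: str):
    """Decode an encoded number at the start of text.

    Returns (value, number of characters consumed).
    """
    i = 0
    negative = text.startswith("?")
    if negative:
        i = 1
    if text[i].isdigit():
        value = int(text[i]) + 1          # '0'..'9' mean 1..10
        i += 1
    else:
        end = text.index("@", i)
        value = 0
        for ch in text[i:end]:            # 'A'..'P' are hex digits 0..15
            value = value * 16 + (ord(ch) - ord("A"))
        i = end + 1                       # a bare '@' leaves value at 0
    return (-value if negative else value), i


print(decode_number("BCD@"))
print(decode_number("?1"))
```

For BCD@ the sketch yields 291 (hexadecimal 123), matching the example in the list above.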

Data Type

The table below shows the various data types and modifiers.

Code	Meaning with no underscore	Meaning with preceding underscore	Meaning with preceding `$$`
?	Type modifier, Template parameter
$	Type modifier, Template parameter	__w64 (prefix)
0-9	Back reference
A	Type modifier (reference)		Type modifier (function)^[1]
B	Type modifier (volatile reference)		Array type in template
C	signed char		Type modifier
D	char	__int8
E	unsigned char	unsigned __int8
F	short	__int16	Function modifier (managed function [Managed C++ or C++/CLI])^[2]
G	unsigned short	unsigned __int16
H	int	__int32
I	unsigned int	unsigned __int32
J	long	__int64
K	unsigned long	unsigned __int64
L		__int128
M	float	unsigned __int128
N	double	bool
O	long double	Array
P	Type modifier (pointer)
Q	Type modifier (const pointer)	char8_t	Type modifier (rvalue reference)
R	Type modifier (volatile pointer)		Type modifier (volatile rvalue reference)
S	Type modifier (const volatile pointer)	char16_t
T	Complex Type (union)		std::nullptr_t
U	Complex Type (struct)	char32_t
V	Complex Type (class)		Empty type parameter pack
W	Enumerate Type (enum)	wchar_t
X	void, Complex Type (coclass)	Complex Type (coclass)
Y	Complex Type (cointerface)	Complex Type (cointerface)
Z	... (ellipsis)		End template parameter pack

[1] Visible when a function is passed to the typeid operator. Uses pointer type syntax.

[2] See the Function section.

The code X represents void when it appears as a return type or pointer type; otherwise it indicates a cointerface. The code Z (meaning ellipsis) appears only at the end of an argument list.

Primitive & Extended Type

Primitive types are represented as one character, and extended types are represented as one character with a preceding _.

Back Reference

Decimal digits 0 to 9 refer to the first through 10th shown type in the argument list. (This means the return type cannot be a referent.) Back references can refer to any non-primitive type, including an extended type. Back references can refer to prefixed types such as PAVblah@@ (class blah *), but cannot refer to prefixless types, such as Vblah@@ in PAVblah@@.

As with back references for names, in a template argument list the back reference table is separately created. The function argument list has no such scoping rule, though it can be confusing sometimes. For example, assume P6AXValpha@@Vbeta@@@Z (void (__cdecl*)(class alpha, class beta)) is the first shown non-primitive type. Then 0 refers to Valpha@@, 1 refers to Vbeta@@, and finally 2 refers to 'function pointer'.

Type Modifier

A type modifier is used to make a pointer or reference. Type modifiers look like this:

Modifier type
Optional: Managed C++ property ($A for __gc, $B for __pin)
CV-class modifier
Optional: Array property (not for functions)
- Prefix Y
- Encoded unsigned number of dimensions
- Array indices as encoded unsigned numbers, one for each dimension
Referred type info (see below)

There are ten types of type modifier:

	none	const	volatile	const volatile
Pointer	`P`	`Q`	`R`	`S`
Reference	`A`		`B`
Rvalue Reference	`$$Q`		`$$R`
none	`?`, `$$C`

For normal types, the referred type info is a data type. For functions, it looks like the following (it depends on the CV-class modifier):

Conditional: CV-class modifier, if member function
Function property
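For the simplest case, a pointer or reference to a primitive type, the layout above reduces to three characters: modifier type, CV-class modifier and data type. The Python sketch below decodes only that case. It assumes no managed property, no array property, no CV-class prefixes (such as E for __ptr64) and only the CV-class codes A to D; the function name is invented for this example.

```python
PRIMITIVES = {
    "C": "signed char", "D": "char", "E": "unsigned char",
    "F": "short", "G": "unsigned short", "H": "int", "I": "unsigned int",
    "J": "long", "K": "unsigned long", "M": "float", "N": "double",
    "O": "long double", "X": "void",
}
MODIFIERS = {
    "P": "*", "Q": "* const", "R": "* volatile", "S": "* const volatile",
    "A": "&", "B": "& volatile",
}
CV_CLASS = {"A": "", "B": "const", "C": "volatile", "D": "const volatile"}


def decode_simple_pointer(code: str) -> str:
    """Decode a three-character type such as 'PAX' or 'PBD'."""
    if len(code) != 3:
        raise ValueError("this sketch handles exactly three characters")
    modifier, cv, base = code
    target = PRIMITIVES[base]
    if CV_CLASS[cv]:
        target += " " + CV_CLASS[cv]      # qualifier of the referred type
    return target + " " + MODIFIERS[modifier]


for sample in ("PAX", "PBD", "QAH", "AAH"):
    print(sample, "->", decode_simple_pointer(sample))
```

The CV-class modifier describes the referred type, while const or volatile on the pointer itself is carried by the choice of P, Q, R or S.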

Complex Type (union, struct, class, coclass, cointerface)

Complex types look like this:

Kind of complex type (T, U, V, ...)
Qualification without a basic name

Enumerated Type (enum)

An enumerated type starts with the prefix W. It looks like this:

Prefix W
Real type for enum
Qualification without basic name

The real type for an enum is represented as follows:

Code	Corresponding Real Type
`0`	char
`1`	unsigned char
`2`	short
`3`	unsigned short
`4`	int (generally normal "enum")
`5`	unsigned int
`6`	long
`7`	unsigned long

Modern versions of Visual Studio usually (if not always) generate enum symbols with the type code W4, regardless of the real underlying type. This does not affect the underlying type in any way; it appears to be for the sake of compiler simplicity.
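A small Python sketch shows how an enumerated type code might be split into its parts. It assumes the qualification consists of plain name fragments only; the function name and the sample input W4Color@gfx@@ are hypothetical.

```python
ENUM_REAL_TYPES = {
    "0": "char", "1": "unsigned char", "2": "short", "3": "unsigned short",
    "4": "int", "5": "unsigned int", "6": "long", "7": "unsigned long",
}


def decode_enum(code: str):
    """Split an enum type code into (real type, qualified C++ name)."""
    if not code.startswith("W"):
        raise ValueError("not an enumerated type")
    real_type = ENUM_REAL_TYPES[code[1]]
    qualification = code[2:]
    if not qualification.endswith("@@"):
        raise ValueError("expected plain fragments ending in '@@'")
    # Drop the last fragment's '@' and the terminator, then undo the reversal.
    names = qualification[:-2].split("@")
    return real_type, "::".join(reversed(names))


print(decode_enum("W4Color@gfx@@"))
```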

Array

An array (not pointer to array) starts with the prefix _O. It looks like this:

Prefix _O
CV-class modifier
Data type within array

Multi-dimensional arrays can be written like _OC_OBH, but only the outermost CV-class modifier takes effect. (In this case _OC_OBH means int volatile [][], not int const [][].)

Template Parameter

Template parameters are used to represent type and non-type template arguments. They can be used only in a template argument list.

The table below is a list of known template parameters. a, b, c represent encoded signed numbers, and x, y, z represent encoded unsigned numbers.

Code	Meaning
`?x`	anonymous type template parameter x (`'template-parameter-x'`)
`$0a`	integer value a ^[3]
`$1s`	constant pointer to mangled symbol s ^[4]
`$2ab`	real value a × 10^(b-k+1), where k is the number of decimal digits of a
`$Da`	anonymous type template parameter a (`'template-parametera'`)
`$Fab`	2-tuple {a,b} (unknown)
`$Gabc`	3-tuple {a,b,c} (unknown)
`$Hsa`	constant pointer to method s (base offset? a, numeric)
`$Isab`	constant pointer to method s (offsets? a and b, numeric)
`$J123`	(unknown)
`$Qa`	anonymous non-type template parameter a (`'non-type-template-parametera'`)
`$S`	empty non-type parameter pack

[3] A pointer to member variable v in X is represented as the integer offsetof(X, v).

[4] The pointer syntax is also used for lvalue references and pointers to member functions.

Argument List

An argument list is a sequence of data types. The list can be one of the following:

X (means void, also terminating list)
arg1 arg2 ... argN @ (meaning a normal list of data types. Note that N can be zero)
arg1 arg2 ... argN Z (meaning a list with trailing ellipsis)
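The three forms can be told apart by looking at the first and the terminating character. The Python sketch below decodes an argument list that contains only primitive and a few extended types; anything else (pointers, classes, back references) raises a KeyError. The function name decode_argument_list is an assumption for this example.

```python
PRIMITIVES = {
    "C": "signed char", "D": "char", "E": "unsigned char",
    "F": "short", "G": "unsigned short", "H": "int", "I": "unsigned int",
    "J": "long", "K": "unsigned long", "M": "float", "N": "double",
    "O": "long double",
}
EXTENDED = {"_J": "__int64", "_K": "unsigned __int64", "_N": "bool", "_W": "wchar_t"}


def decode_argument_list(code: str):
    """Return (list of argument types, number of characters consumed)."""
    if code.startswith("X"):
        return ["void"], 1                 # X both means void and ends the list
    args = []
    i = 0
    while code[i] not in "@Z":
        if code[i] == "_":
            args.append(EXTENDED[code[i:i + 2]])   # extended type: '_' + letter
            i += 2
        else:
            args.append(PRIMITIVES[code[i]])
            i += 1
    if code[i] == "Z":
        args.append("...")                 # list with trailing ellipsis
    return args, i + 1


print(decode_argument_list("X"))
print(decode_argument_list("HN@"))
print(decode_argument_list("H_NZ"))
```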

Template Argument List

A template argument list is the same as an argument list, except that template parameters can be used.

CV-class Modifier

The following table shows CV-class modifiers.

	Variable				Function
	none	const	volatile	const volatile	Function
none	`A`	`B`, `J`	`C`, `G`, `K`	`D`, `H`, `L`	`6`, `7`
__based()	`M`	`N`	`O`	`P`	`_A`, `_B`
Member	`Q`, `U`, `Y`	`R`, `V`, `Z`	`S`, `W`, `0`	`T`, `X`, `1`	`8`, `9`
__based() Member	`2`	`3`	`4`	`5`	`_C`, `_D`

CV-class modifier can have zero or more prefixes:

Prefix	Meaning
`E`	type __ptr64
`F`	__unaligned type
`G`	type &
`H`	type &&
`I`	type __restrict

Modifiers have trailing parameters as follows:

Conditional: Qualification without basic name, if member
Conditional: CV-class modifier of function, if member function
Conditional: __based() property, if used

A CV-class modifier is usually used in reference/pointer types, but it is also used in other places with some restrictions:

Modifier of function: can only have const, volatile attribute, optionally with prefixes.
Modifier of data: cannot have function property.

__based() Property

__based() property represents Microsoft's __based() attribute extension to C++. This property can be one of the following:

0 (means __based(void))
2name (means __based(name), where name is a qualification without a basic name)
5 (means no __based())

Function Property

A function property represents the prototype of a function. It looks like this:

Calling convention of function
Data type of returned value (void is written as X), or @ if there is no return type, as for constructors and destructors
Argument list
throw() attribute

The following table shows calling conventions of functions:

Code	Exported?	Calling Convention
`A`	No	__cdecl
`B`	Yes	__cdecl
`C`	No	__pascal
`D`	Yes	__pascal
`E`	No	__thiscall
`F`	Yes	__thiscall
`G`	No	__stdcall
`H`	Yes	__stdcall
`I`	No	__fastcall
`J`	Yes	__fastcall
`K`	No	none
`L`	Yes	none
`M`	No	__clrcall

The argument list for the throw() attribute is the same as any other argument list, but if this list is Z, it means there is no throw() attribute. If you want to represent throw() you have to use @ to terminate the list.

Function

Typical type information in a function name looks like this:

Optional: Prefix $$F (means function is managed, either as Managed C++ or C++/CLI)
Optional: Prefix _ (means __based() property is used)
Access level and function type
Conditional: __based() property, if used
Conditional: adjustor property (as encoded unsigned number), if thunk function
Conditional: CV-class modifier of function, if non-static member function
Function property

The table below shows codes for access level and function type:

	none	static	virtual	thunk
private:	`A`, `B`	`C`, `D`	`E`, `F`	`G`, `H`
protected:	`I`, `J`	`K`, `L`	`M`, `N`	`O`, `P`
public:	`Q`, `R`	`S`, `T`	`U`, `V`	`W`, `X`
none	`Y`, `Z`

This kind of thunk function is always virtual, and used to represent the logical this adjustor property, which means an offset to the true this value in some multiple inheritance situations.
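Putting the pieces together, the following Python sketch demangles the simplest kind of function name: a non-member function with plain name fragments, primitive types only and no throw() attribute. It is an illustration of the layout described above, not a replacement for undname; the function name and the sample symbol ?add@math@@YAHHH@Z are assumptions for this example, and only three non-exported calling conventions are included.

```python
PRIMITIVES = {
    "C": "signed char", "D": "char", "E": "unsigned char",
    "F": "short", "G": "unsigned short", "H": "int", "I": "unsigned int",
    "J": "long", "K": "unsigned long", "M": "float", "N": "double",
    "O": "long double", "X": "void",
}
CALLING_CONVENTIONS = {"A": "__cdecl", "G": "__stdcall", "I": "__fastcall"}


def demangle_free_function(mangled: str) -> str:
    """Demangle names such as '?func@@YAXXZ' (very limited sketch)."""
    if not mangled.startswith("?"):
        raise ValueError("not a mangled C++ name")
    name_part, separator, type_part = mangled[1:].partition("@@")
    if not separator:
        raise ValueError("qualified name terminator not found")
    qualified = "::".join(reversed(name_part.split("@")))
    if type_part[0] not in "YZ":
        raise ValueError("only non-member functions are handled")
    convention = CALLING_CONVENTIONS[type_part[1]]
    return_type = PRIMITIVES[type_part[2]]
    rest = type_part[3:]
    if rest[0] == "X":                     # (void)
        args, rest = ["void"], rest[1:]
    else:
        args, i = [], 0
        while rest[i] not in "@Z":
            args.append(PRIMITIVES[rest[i]])
            i += 1
        if rest[i] == "Z":
            args.append("...")
        rest = rest[i + 1:]
    if rest != "Z":                        # Z means: no throw() attribute
        raise ValueError("throw() lists are not handled")
    return f"{return_type} {convention} {qualified}({','.join(args)})"


print(demangle_free_function("?func@@YAXXZ"))
print(demangle_free_function("?add@math@@YAHHH@Z"))
```

For ?func@@YAXXZ the sketch produces void __cdecl func(void), the same prototype quoted in the Nested Name section.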

Data

Type information in a data name looks like this:

Access level
Data type
CV-class modifier

The table below shows codes for access level:

Code	Meaning
0	Private static member
1	Protected static member
2	Public static member
3	Normal variable
4	Normal variable

The CV-class modifier should be appropriate for data (not a 'function' modifier).
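A data name of the simplest form can be decoded in the same spirit. The Python sketch below assumes plain name fragments, a primitive data type and one of the CV-class codes A to D; the function name and the symbol ?count@Widget@@2HA are hypothetical, while the other two inputs come from the Nested Name section.

```python
PRIMITIVES = {
    "C": "signed char", "D": "char", "E": "unsigned char",
    "F": "short", "G": "unsigned short", "H": "int", "I": "unsigned int",
    "J": "long", "K": "unsigned long", "M": "float", "N": "double",
    "O": "long double",
}
ACCESS_LEVELS = {
    "0": "private: static", "1": "protected: static", "2": "public: static",
    "3": "", "4": "",                       # normal variables carry no prefix
}
CV_CLASS = {"A": "", "B": "const", "C": "volatile", "D": "const volatile"}


def demangle_simple_data(mangled: str) -> str:
    """Demangle names such as '?CONST@@4HB' (very limited sketch)."""
    if not mangled.startswith("?"):
        raise ValueError("not a mangled C++ name")
    name_part, separator, type_part = mangled[1:].partition("@@")
    if not separator or len(type_part) != 3:
        raise ValueError("expected access level, data type and CV-class modifier")
    access, data_type, cv = type_part
    qualified = "::".join(reversed(name_part.split("@")))
    pieces = [ACCESS_LEVELS[access], PRIMITIVES[data_type], CV_CLASS[cv], qualified]
    return " ".join(piece for piece in pieces if piece)


print(demangle_simple_data("?nested@@4HA"))
print(demangle_simple_data("?CONST@@4HB"))
print(demangle_simple_data("?count@Widget@@2HA"))
```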

Thunk Function

There are several kinds of thunk function.

References

Kang Seonghoon. "Microsoft C++ Name Mangling Scheme". Retrieved 2008-10-05.

External links

Calling conventions for different C++ compilers by Agner Fog contains a description of the name mangling schemes for Visual C++ x86 and x64 (pp. 28–33 in the 2011-06-08 version)
Microsoft C++ Name Mangling Scheme
C++ Name Mangling/Demangling
Geoff Chappell's results
__unDname Wine's __unDname function implementation
PHP UnDecorateSymbolName A PHP Script that demangles Microsoft C++ Names by Timo Stripf
Undname Convert a decorated name to its undecorated form
visual_studio_mangling.py A Python script that demangles Microsoft C++ Names
undname.c A C function that demangles Microsoft C++ Names, found in the zip file downloaded from the Python pdbparse-1.2 package. Package also contains Python code to examine a PDB file.
