Talk:Visual C++ name mangling

From Wikiversity

Data Type

It appears that $$B may be a type modifier, but I'm unsure of the specifics, or whether it's actually used by the compiler.

Test Code         UNDNAME 0x2000 result           Notes
$$BA$CY@@         $CY
$$BB$CY@@         $CY
$$BL$CY@@         $CY
$$BLHello@@       Hello
$$BPAH            int *                           Identical to PAH
$$BP$CGH          int volatile %                  Identical to P$CGH
$$BPCY3HI@I@3H@H  int (volatile *)[120][8][4][7]  Identical to PCY3HI@I@3H@H
$$BY1HI@I@H       int [120][8]                    Interestingly, Y1HI@I@H by itself isn't a valid type symbol.

When run through UNDNAME (without flags), ?a@@3$$BY1HI@I@HB expands to int (const a)[120][8].

$$B_OA$CY@@       CY[]                            CY appears to be treated as the type.
$$B_OB$CY@@       CY const[]

I'm not sure, but I think $$B might be a modifier that indicates that the following is to be interpreted as a raw type regardless of anything else, or it might tell the undecorator not to try to expand numbers as abbreviated names. Or it might do something else, I honestly don't know.


[Tested with Visual Studio 2010 UNDNAME. I would test with Visual Studio 2015, but I'm unable to do so directly at the moment, and I can't use the online compiler since I don't know how to use __unDName() or __unDNameHelper() in Visual Studio 2015.]


It also appears that the code $$F is associated with Managed Extensions for C++ (and possibly C++/CLI), although I'm unsure of its exact meaning.

// cli_ptr_clrold.cpp
// Managed Extensions for C++
// Compile with "/c /clr:oldSyntax" on VS 2013 or earlier.

__gc class GC {
    int i;
  public:
    int getI() { return i; }
};

/* GC member functions:
 *
 * Constructor: ??0GC@@$$FQ$AAM@XZ (public: __clrcall GC::GC(void))
 * getI():      ?getI@GC@@$$FQ$AAMHXZ (public: int __clrcall GC::getI(void))
 */

GC* func() { return new GC; }
/* Two symbols generated for func():
 *
 *   ?func@@$$FYMP$AAVGC@@XZ (class GC ^ __clrcall func(void))
 *   ?func@@YMP$AAVGC@@XZ (class GC ^ __clrcall func(void))
 *
 * One additional MEP symbol generated:
 *   __mep@?func@@$$FYMP$AAVGC@@XZ ([MEP] class GC ^ __clrcall func(void))
 * It binds to the $$F version of func()'s symbol.
 */

GC& cnuf() { return *(new GC); }
/* Two symbols generated for cnuf():
 *
 *   ?cnuf@@$$FYMA$AAVGC@@XZ (class GC % __clrcall cnuf(void))
 *   ?cnuf@@YMA$AAVGC@@XZ (class GC % __clrcall cnuf(void))
 *
 * One additional MEP symbol generated:
 *   __mep@?cnuf@@$$FYMA$AAVGC@@XZ ([MEP] class GC % __clrcall cnuf(void))
 * It, too, binds to the $$F version of the actual function.
 */

Similarly, C++/CLI uses $$F, generating one signature with it and one without.

// cli_ptr_clr.cpp
// C++/CLI
// Compile with "/c /clr" on VS 2005 or later.

ref class REF {
        int i;
    public:
        int getI() { return i; }
};

/* REF member functions:
 *
 * Constructor: ??0REF@@$$FQ$AAM@XZ (public: __clrcall REF::REF(void))
 * getI():      ?getI@REF@@$$FQ$AAMHXZ (public: int __clrcall REF::getI(void))
 */

REF^ funcMP() { return gcnew REF; }
/* Two symbols generated for funcMP():
 *
 *   ?funcMP@@$$FYMP$AAVREF@@XZ (class REF ^ __clrcall funcMP(void))
 *   ?funcMP@@YMP$AAVREF@@XZ (class REF ^ __clrcall funcMP(void))
 *
 * One additional MEP symbol generated:
 *   __mep@?funcMP@@$$FYMP$AAVREF@@XZ ([MEP] class REF ^ __clrcall funcMP(void))
 * It binds to the $$F version of funcMP()'s symbol.
 */

REF% funcMR() { return *(gcnew REF); }
/* Two symbols generated for funcMR():
 *
 *   ?funcMR@@$$FYMA$AAVREF@@XZ (class REF % __clrcall funcMR(void))
 *   ?funcMR@@YMA$AAVREF@@XZ (class REF % __clrcall funcMR(void))
 *
 * One additional MEP symbol generated:
 *   __mep@?funcMR@@$$FYMA$AAVREF@@XZ ([MEP] class REF % __clrcall funcMR(void))
 * It binds to the $$F version of funcMR()'s symbol.
 */

I'm unsure of what exactly this means. I suspect it's part of the "IJW" mechanism, and allows native C++ code to link to the C++/CLI functions, but that's just a guess. 24.222.178.254 (discuss) 16:31, 1 May 2016 (UTC)


Okay, I've found what $$F is for. It indicates that a function symbol is for the managed entry point; the one without $$F is the native entry point (for mixed native C++ and C++/CLI code), and appears to simply pass the call to the __mep version.

To test this, I made a simplified version of the above C++/CLI code that only exports funcMP():

// cli_ptr_funcmp.cpp

ref class REF {
    int i;
  public:
    int getI();
    REF();
};

REF^ funcMP() { return gcnew REF; }

I then compiled it with cl /c /clr cli_ptr_funcmp.cpp /FAs, and obtained the following ASM file:

; Listing generated by Microsoft (R) Optimizing Compiler Version 16.00.40219.01 

; Generated by VC++ for Common Language Runtime
.file "E:\Programs\mangler\cli_ptr_funcmp.cpp"
	.bss
.local	$T6307,0
; Function compile flags: /Odtp
; File e:\programs\mangler\cli_ptr_funcmp.cpp
	.text
.global	?funcMP@@$$FYMP$AAVREF@@XZ			; funcMP
?funcMP@@$$FYMP$AAVREF@@XZ:				; funcMP
;	.proc.def	D:P()

; Function Header:
; max stack depth = 1
; function size = 8 bytes
; local varsig tk = 0x11000001 
; Exception Information:
; 0 handlers, each consisting of filtered handlers

;	.local.i4 0,"$T6306" SIG: class (token:0x34DB71)

;	.proc.beg

; 10   : REF^ funcMP() { return gcnew REF; }

	newobj		??0REF@@$$FQ$AAM@XZ
	stloc.0				; $T6306
	ldloc.0				; $T6306
	ret		
 .end ?funcMP@@$$FYMP$AAVREF@@XZ			; funcMP
;	.proc.end.mptr
_TEXT	ENDS
PUBLIC	__mep@?funcMP@@$$FYMP$AAVREF@@XZ
PUBLIC	?funcMP@@YMP$AAVREF@@XZ				; funcMP
;	COMDAT __mep@?funcMP@@$$FYMP$AAVREF@@XZ
data	SEGMENT
__mep@?funcMP@@$$FYMP$AAVREF@@XZ TOKEN 06000003
; Function compile flags: /Odtp
data	ENDS
;	COMDAT ?funcMP@@YMP$AAVREF@@XZ
_TEXT	SEGMENT
?funcMP@@YMP$AAVREF@@XZ PROC				; funcMP, COMDAT
	jmp	DWORD PTR __mep@?funcMP@@$$FYMP$AAVREF@@XZ
?funcMP@@YMP$AAVREF@@XZ ENDP				; funcMP
_TEXT	ENDS
END

Taking this listing together with three other sources, it appears that $$F indicates that a function is managed. Those sources are: compiling cli_ptr_clr.cpp above with /clr:pure, which generated only $$F symbols and no __mep symbols; compiling a native function (bool natBool() { return false; }) with /clr, which generated the symbols ?natBool@@$$FYA_NXZ, ?natBool@@YA_NXZ, and __mep@?natBool@@$$FYA_NXZ; and the MSDN "Double Thunking" article. It also appears that when both managed and native versions of a function are generated, the native one is a stub that uses the __mep symbol to call the managed one.
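As a rough illustration of the pattern observed here, the sketch below sorts a decorated function symbol into one of three kinds and derives the two companion names from a managed symbol. The names EntryKind, classify, mepName, and nativeName are made up for this example, and the rule that the first "@@" ends the qualified name is an assumption that holds for the symbols shown on this page but may not hold for templates or nested scopes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical categories, named for this sketch only.
enum class EntryKind { MepToken, Managed, Native };

// True if "$$F" directly follows the first "@@" in the symbol.
// Assumption: the first "@@" terminates the qualified name.
static bool hasManagedMarker(const std::string& symbol, std::size_t& markerPos) {
    const std::size_t nameEnd = symbol.find("@@");
    if (nameEnd == std::string::npos) return false;
    markerPos = nameEnd + 2;
    return symbol.compare(markerPos, 3, "$$F") == 0;
}

// Classifies a symbol using only the patterns seen in the listings above.
EntryKind classify(const std::string& symbol) {
    const std::string mepPrefix = "__mep@";
    if (symbol.compare(0, mepPrefix.size(), mepPrefix) == 0)
        return EntryKind::MepToken;
    std::size_t pos = 0;
    return hasManagedMarker(symbol, pos) ? EntryKind::Managed : EntryKind::Native;
}

// Name of the __mep symbol that accompanies a managed symbol.
std::string mepName(const std::string& managed) {
    return "__mep@" + managed;
}

// Name of the native stub: the managed symbol with "$$F" dropped.
// Returns the input unchanged if no "$$F" marker is found.
std::string nativeName(const std::string& managed) {
    std::string out = managed;
    std::size_t pos = 0;
    if (hasManagedMarker(out, pos)) out.erase(pos, 3);
    return out;
}
```

For example, passing ?natBool@@$$FYA_NXZ to nativeName() and mepName() gives the other two symbols that /clr generated for natBool().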

That's one symbol down, one to go! 24.222.178.254 (discuss) 18:47, 1 May 2016 (UTC)


Did more testing with UNDNAME, and it appears that the "managed" prefix comes before the "based" prefix.

"?a@@$$F_Y2X@@AXXZ" == "void __cdecl __based(X) a(void)"
"?a@@$$F_Y0AXXZ" == "void __cdecl __based(void) a(void)"
"?a@@_$$FY0AXXZ" == " ?? const volatile ?? ::XZ::& ?? a( ?? ) throw( ?? )"

Updated the main page.
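The ordering seen in the three UNDNAME results above can be sketched as a small prefix parser: after the name's terminating "@@", it accepts an optional "$$F", then an optional "_", then an upper-case function access code. FunctionPrefix and parsePrefix are hypothetical names, the parser stops at the access code and does not decode the __based argument or anything after it, and it assumes the first "@@" ends the name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Result of reading the start of a function's type information.
struct FunctionPrefix {
    bool ok = false;        // false if the expected order was not found
    bool managed = false;   // "$$F" was present
    bool based = false;     // "_" was present
    char accessCode = '\0'; // e.g. 'Y' for a global near function
};

// Parses [$$F][_]<access code> after the first "@@" of a decorated name.
// Only function (letter) access codes are handled in this sketch.
FunctionPrefix parsePrefix(const std::string& symbol) {
    FunctionPrefix p;
    std::size_t pos = symbol.find("@@");
    if (pos == std::string::npos) return p;
    pos += 2;

    // The managed marker must come first.
    if (symbol.compare(pos, 3, "$$F") == 0) {
        p.managed = true;
        pos += 3;
    }
    // The based marker, if any, comes next.
    if (pos < symbol.size() && symbol[pos] == '_') {
        p.based = true;
        pos += 1;
    }
    // Anything other than an upper-case letter here is treated as malformed.
    if (pos < symbol.size() &&
        std::isupper(static_cast<unsigned char>(symbol[pos]))) {
        p.accessCode = symbol[pos];
        p.ok = true;
    }
    return p;
}
```

With this ordering, ?a@@$$F_Y2X@@AXXZ parses as managed, based, access code Y, while ?a@@_$$FY0AXXZ is rejected, which is consistent with the garbage UNDNAME produced for it.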

24.222.178.254 (discuss) 19:54, 1 May 2016 (UTC)

Access types

Here's a more exhaustive list of the access types taken from Clang's MicrosoftMangle.cpp. Would be great to incorporate these and the other information found in Clang's mangler into the wiki, but that would almost require a total rewrite of the page.
I also think these should be called Declaration Classes instead of Access Types.

0  # (1) private static member
1  # (1) protected static member
2  # (1) public static member
3  # (1) global variable
4  # (1) static local variable
5  # (1) static guard variable
6  # (2) ??_7 / 'vftable'
7  # (2) ??_8 / 'vbtable'
8  # (3) ??_R0 / 'RTTI Type Descriptor'
9  # (3) non-mangled extern "C" functions
A  # (4) private: near function
B  # (4) private: far function
C  # (5) private: static near function
D  # (5) private: static far function
E  # (4) private: virtual near function
F  # (4) private: virtual far function
I  # (4) protected: near function
J  # (4) protected: far function
K  # (5) protected: static near function
L  # (5) protected: static far function
M  # (4) protected: virtual near function
N  # (4) protected: virtual far function
Q  # (4) public: near function
R  # (4) public: far function
S  # (5) public: static near function
T  # (5) public: static far function
U  # (4) public: virtual near function
V  # (4) public: virtual far function
Y  # (5) global near function
Z  # (5) global far function


Trailing information for the different access types:
(1) <Data type><CV-class modifier>
(2) <CV-class modifier><optional: Qualified name without basic name>@
    Note: According to Clang, the CV-class modifier should always be "B" for vftables and vbtables.
(3) None.
(4) See the wiki's resource page for information. These are instance types and therefore have a CV-class modifier.
(5) See the wiki's resource page for information. These are not instance types and therefore do not have a CV-class modifier.


--PhilipNO (discuss • contribs) 03:17, 6 June 2021 (UTC)