mirror of
https://github.com/Noratrieb/womangling.git
synced 2026-01-14 08:55:03 +01:00
233 lines
8.5 KiB
HTML
233 lines
8.5 KiB
HTML
<!DOCTYPE html>
|
|
<html lang="en">
|
|
<head>
|
|
<meta charset="UTF-8" />
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
|
<link rel="stylesheet" href="index.css" />
|
|
<title>Womangling</title>
|
|
</head>
|
|
<body>
|
|
<main class="main">
|
|
<div class="content" id="content-area">
|
|
<h1>
|
|
<a class="root-link" href=".">Learn C++ Itanium Symbol Mangling</a>
|
|
</h1>
|
|
<h2>Lesson 1: Basics</h2>
|
|
<noscript>
|
|
<p>
|
|
Warning: You have JavaScript disabled. While the content is still
|
|
viewable, interactive exercises will not work. Consider enabling
|
|
JavaScript for this website.
|
|
</p>
|
|
</noscript>
|
|
|
|
<section data-step="0" class="step">
|
|
<p>
|
|
After getting an understanding of how this guide works and learning
|
|
about the not-mangling of C identifiers, we are ready to dive into
|
|
C++.
|
|
</p>
|
|
<p>
|
|
Every C++ mangled symbol is prefixed with the string
|
|
<code>_Z</code>. This signifies that this is a mangled C++ symbol.
|
|
<code>_Z</code> starts with an underscore followed by an uppercase
|
|
letter. All symbols of that structures are reserved by the C
|
|
standard and cannot be used by programs. This ensures that there are
|
|
no name collisions with normal C functions and mangled C++
|
|
functions.
|
|
</p>
|
|
<p>
|
|
After that, the name of the entity is stored. For now, we will only
|
|
look at functions. For functions, the function type is appended to
|
|
the name to get the full symbol.
|
|
</p>
|
|
<pre class="code">
|
|
void f() {}
|
|
</pre>
|
|
<p>
|
|
This empty function will be mangled to <code>_Z1fv</code>. The
|
|
<code>1f</code> signifies the name (we will look at this in more
|
|
detail later in this lesson) and the <code>v</code> signifies the
|
|
function type.
|
|
</p>
|
|
<p>
|
|
We will see the <code>v</code> function type a lot in the rest of
|
|
this guide. It stands for a function that takes no arguments.
|
|
</p>
|
|
<div class="quiz-section">
|
|
<p>
|
|
Which of these symbols cannot possibly be a mangled C++ symbol?
|
|
Answer with the name of the symbol.
|
|
</p>
|
|
<ul>
|
|
<li><code>_ZN3FooIA4_iE3barE</code></li>
|
|
<li><code>_ZN6System5Sound4beepEv</code></li>
|
|
<li><code>_RN3FooIA4_iE3barE</code></li>
|
|
</ul>
|
|
<form
|
|
data-challenge="1"
|
|
data-answer="_RN3FooIA4_iE3barE"
|
|
data-hint="Look at the prefix"
|
|
>
|
|
<input class="quiz-input" />
|
|
<button
|
|
data-challenge-submit="1"
|
|
class="submit-challenge"
|
|
type="submit"
|
|
>
|
|
Answer
|
|
</button>
|
|
<div class="error"></div>
|
|
</form>
|
|
</div>
|
|
</section>
|
|
<section data-step="1" class="step">
|
|
<p>
|
|
For names, there are two cases to consider for now. Either the name
|
|
is in the global scope, or it is in a namespace.
|
|
</p>
|
|
<p>For global names, we just prefix the name with its length.</p>
|
|
<pre class="code">
|
|
void hello_world() {}
|
|
</pre>
|
|
<p>
|
|
This will therefore get mangled as <code>_Z11hello_worldv</code>.
|
|
The length of <code>hello_world</code> is 11, so we concatenate
|
|
<code>11</code> and <code>hello_world</code>. This entire thing is
|
|
then appended to the previously mentioned prefix <code>_Z</code> and
|
|
then we add the type, which is just <code>v</code> here, at the end.
|
|
</p>
|
|
<div class="quiz-section">
|
|
<p>What is the mangling of the following identifier?</p>
|
|
<pre class="code">
|
|
void meow() {}
|
|
</pre>
|
|
<form
|
|
data-challenge="2"
|
|
data-answer="_Z4meowv"
|
|
data-hint="Remember the prefix and function type"
|
|
>
|
|
<input class="quiz-input" />
|
|
<button
|
|
data-challenge-submit="2"
|
|
class="submit-challenge"
|
|
type="submit"
|
|
>
|
|
Answer
|
|
</button>
|
|
<div class="error"></div>
|
|
</form>
|
|
</div>
|
|
</section>
|
|
<section data-step="2" class="step">
|
|
<p>
|
|
Functions that are declared in a namespace get a bit more
|
|
complicated. They are referred to as <i>nested names</i>, because
|
|
they are <i>nested</i> in a namespace. They can also be nested in
|
|
multiple namespaces, the encoding is the same.
|
|
</p>
|
|
<p>
|
|
Nested names start with an <code>N</code> and end with an
|
|
<code>E</code> (the <code>E</code> stands for "end" and is commonly
|
|
used to end sequences). Between those two letters, the hierarchy of
|
|
the namespace is represented by putting on namespace name after
|
|
another, with the function name last. Every name has the leading
|
|
length and then the name itself, just like with global names.
|
|
</p>
|
|
<pre class="code">
|
|
namespace outer {
|
|
void inner() {}
|
|
}
|
|
</pre>
|
|
<p>
|
|
That means that this function will be mangled as
|
|
<code>_ZN5outer5innerEv</code>. We can decode this into the
|
|
following structure
|
|
</p>
|
|
<ul>
|
|
<li><code>_Z</code>: Prefix</li>
|
|
<li><code>N</code>: Start of nested name</li>
|
|
<li>
|
|
<code>5outer</code>: Outer namespace, name prefixed by length
|
|
</li>
|
|
<li>
|
|
<code>5inner</code>: Inner function, name prefixed by length
|
|
</li>
|
|
<li><code>E</code>: End of nested name</li>
|
|
<li><code>v</code>: Function type</li>
|
|
</ul>
|
|
<p>Nested namespaces follow the same structure.</p>
|
|
<pre class="code">
|
|
namespace a {
|
|
namespace b {
|
|
namespace c {
|
|
void inner() {}
|
|
}
|
|
}
|
|
}
|
|
</pre>
|
|
<p>
|
|
This function will mangle as <code>_ZN1a1b1c5innerEv</code>. We get
|
|
all the concatenated names as <code>1a1b1c5inner</code>, with the
|
|
previously mentioned characters around them.
|
|
</p>
|
|
|
|
<div class="quiz-section">
|
|
<p>What is the mangling of the following identifier?</p>
|
|
<pre class="code">
|
|
namespace cats {
|
|
namespace like {
|
|
void meow() {}
|
|
}
|
|
}
|
|
</pre>
|
|
<form
|
|
data-challenge="3"
|
|
data-answer="_ZN4cats4like4meowEv"
|
|
data-hint="Remember the prefix and function type, and don't forget to wrap it in the nested start and end"
|
|
>
|
|
<input class="quiz-input" />
|
|
<button
|
|
data-challenge-submit="3"
|
|
class="submit-challenge"
|
|
type="submit"
|
|
>
|
|
Answer
|
|
</button>
|
|
<div class="error"></div>
|
|
</form>
|
|
</div>
|
|
</section>
|
|
<section data-step="3" class="step">
|
|
<p>
|
|
Good job! You have successfully answered all the question and now
|
|
know the basic makeup of an Itanium-mangled C++ symbol.
|
|
</p>
|
|
<p>
|
|
In the next lesson, we will use this knowledge to look at basic
|
|
function types beyond <code>v</code>. Mangling function types is
|
|
important for function overloading, but I don't want to overload you
|
|
with information, so feel free to take a break and let the previous
|
|
knowledge sink in.
|
|
</p>
|
|
<p class="lesson-last-paragraph">
|
|
If you want to try out more code and look at its mangling, I
|
|
recommend using Compiler Explorer on
|
|
<a href="https://godbolt.org">godbolt.org</a>. Under "Output", you
|
|
can uncheck the box to demangle identifiers to see the mangled
|
|
identifiers for any C++ code you enter on the left.
|
|
</p>
|
|
<div class="center">
|
|
<a href="lesson-2.html" class="action-button">
|
|
Lesson 2: Arguments
|
|
</a>
|
|
</div>
|
|
</section>
|
|
</div>
|
|
</main>
|
|
<script>
|
|
window.LESSON = 1;
|
|
</script>
|
|
<script type="module" src="lessons.js"></script>
|
|
</body>
|
|
</html>
|