Compare commits

2 commits

Author SHA1 Message Date
a54248f099 im gaing 2025-10-01 21:09:24 +02:00 (checks failed: test / test (push) has been cancelled)
14053a50b9 improve 2025-10-01 08:16:41 +02:00

@@ -54,35 +54,22 @@
</div> </div>
</div> </div>
</section> </section>
<section data-markdown>
<textarea data-template>
# speed 🚀
</textarea>
</section>
<section> <section>
<h2>behind cargo build</h2> <h2>behind cargo build</h2>
<p>cargo vs rustc</p> <p>cargo vs rustc</p>
<div style="display: grid; grid-template-columns: 2fr 1fr 2fr"> <div>
<pre><code data-trim> asciinema of cargo build -v
$ cargo build -v
</code></pre>
<div style="display: flex; align-items: center">cargo</div>
<div style="display: flex; flex-direction: column; align-items: flex-start">
<div>rustc (clap)</div>
<div>rustc (tokio)</div>
<div>rustc (tracing)</div>
<div>rustc (yourcrate)</div>
</div>
</div> </div>
</section> </section>
<section> <section>
<h2>what does rustc like, do?</h2> <h2>what does rustc like, do?</h2>
<h4>a quick overview of the compilation phases</h4> <h4>a quick overview of the compilation phases</h4>
</section> </section>
<section>
<h2>it all starts at the source</h2>
<pre><code data-trim>
#[no_mangle]
pub fn add(a: u8, b: u8) -> u8 {
a.wrapping_add(b)
}
</code></pre>
</section>
<section> <section>
<h2>the frontend and the backend</h2> <h2>the frontend and the backend</h2>
<div class="mermaid"> <div class="mermaid">
@@ -98,36 +85,35 @@
</div> </div>
</section> </section>
<section> <section>
<h2>lex, parse, resolve, typecheck, all these fancy things</h2> <h2>it all starts at the source</h2>
<div class="mermaid"> <pre><code data-trim>
<pre> pub fn add(a: u8, b: u8) -> u8 {
%%{init: {'theme': 'dark', 'themeVariables': { 'darkMode': true, 'fontSize': '25px' }}}%% a.wrapping_add(b)
flowchart TB }
function --> return </code></pre>
function --> params
params --> a_def[a]
params --> b_def[b]
function --> body
body --> cl["method call"]
cl --> a_use[a]
cl --> wrapping_add
cl --> b_use[b]
</pre>
</div>
</section> </section>
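<section data-markdown>
<textarea data-template>
## what `wrapping_add` means here
A minimal sketch of the function from the slide, to show why the later LLVM IR can be a plain `add i8`: `u8` arithmetic with `wrapping_add` is taken modulo 256 instead of panicking on overflow. The `main` function and the sample inputs are illustrative additions, not part of the original deck.

```rust
// Same `add` as on the slide; the result wraps around modulo 256.
pub fn add(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    // 250 + 10 = 260, which wraps to 260 - 256 = 4.
    println!("{}", add(250, 10));
    // No overflow here, so this is ordinary addition.
    println!("{}", add(1, 2));
}
```
</textarea>
</section>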
<section> <section>
<h2>so you want to compile a crate</h2> <h2>it gets processed</h2>
<pre><code data-trim>
#[attr = MacroUse {arguments: UseAll}]
extern crate std;
#[prelude_import]
use std::prelude::rust_2024::*;
fn add(a: u8, b: u8) -> u8 { a.wrapping_add(b) }
</code></pre>
</section>
<section>
<h2>until it doesn't even look like Rust anymore</h2>
<p>MIR</p> <p>MIR</p>
<img src="add-runtime-mir.svg" /> <img src="add-runtime-mir.svg" />
</section> </section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## so you want to compile a crate ## going further, to LLVM IR
LLVM IR
``` ```
; meow::add ; meow::add
define noundef i8 @add(i8 noundef %a, i8 noundef %b) unnamed_addr #0 { define noundef i8 @add(i8 noundef %a, i8 noundef %b) #0 {
start: start:
%_0 = add i8 %b, %a %_0 = add i8 %b, %a
ret i8 %_0 ret i8 %_0
@@ -137,8 +123,7 @@
</section> </section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## so you want to compile a crate ## and then you're done
Assembly
``` ```
&lt;add&gt;: &lt;add&gt;:
@@ -149,9 +134,7 @@
</section> </section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## but compiling a ton of crates can't be that simple! ## ok but why does my program compile so slowly now?
- yes
</textarea> </textarea>
</section> </section>
<section data-markdown> <section data-markdown>
@@ -161,6 +144,29 @@
- the how, what, and when of invoking LLVM - the how, what, and when of invoking LLVM
</textarea> </textarea>
</section> </section>
<section>
<h2>a crate - the compilation unit</h2>
<p>quite big</p>
<p>in C, the compilation unit is just a single file</p>
</section>
<section>
<h2>a codegen unit</h2>
<p>LLVM is single-threaded within a module</p>
<p>rustc: hi LLVM, look we are like a C file, now be fast</p>
<p>~1-256 depending on size and configuration (⚙️)</p>
<div class="mermaid">
<pre>
%%{init: {'theme': 'dark', 'themeVariables': { 'darkMode': true, 'fontSize': '25px' }}}%%
flowchart LR
crate
crate --> cgu1["Codegen-Unit 1"]
crate --> cgu2["Codegen-Unit 2"]
crate --> cgu3["Codegen-Unit 3"]
crate --> cgu4["Codegen-Unit 4"]
</pre>
</div>
</section>
<section> <section>
<h2>codegen units</h2> <h2>codegen units</h2>
<pre><code> <pre><code>
@@ -267,6 +273,14 @@ fn main() { math::add() }
</pre> </pre>
</div> </div>
</section> </section>
<section data-markdown>
<textarea data-template>
# so compile times just depend on the number of functions?
- yes...
- but not source functions!
</textarea>
</section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## generics ## generics
@@ -358,6 +372,19 @@ fn main() { math::add() }
</pre> </pre>
</div> </div>
</section> </section>
<section data-markdown>
<textarea data-template>
# generics are slow to compile
- N instantiations means optimizing the function N times
- and there's duplicate instances!
</textarea>
</section>
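<section data-markdown>
<textarea data-template>
## one generic, many instances
A minimal sketch of the point above: one generic function in the source becomes one instance per concrete type it is used with, and the backend has to optimize each of them. `byte_width` is a hypothetical helper made up for this illustration.

```rust
use std::mem::size_of;

// One generic function in the source...
fn byte_width<T>(_value: T) -> usize {
    size_of::<T>()
}

fn main() {
    // ...but each distinct `T` below is a separate instantiation,
    // so the backend sees three functions, not one.
    println!("u8:  {}", byte_width(1u8));
    println!("u32: {}", byte_width(1u32));
    println!("u64: {}", byte_width(1u64));
}
```
</textarea>
</section>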
<section data-markdown>
<textarea data-template>
# and the duplicates get worse
</textarea>
</section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## inlining ## inlining
@@ -473,17 +500,29 @@ fn main() { math::add() }
</section> </section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## `#[inline]` ## `#[inline]` (⚙️)
- `#[inline]` enables cross-crate inlining of non-generic functions - matters for non-generic functions (generics are already instantiated in the calling crate)
- for very small functions, this happens automatically - for very small functions, this happens automatically
- for other functions, it doesn't, because it would be slow (try with `-Zcross-crate-inline-threshold=always`) - for other functions, it doesn't, because it would be slow
- don't over-apply it in a library, but also don't forget about it - don't over-apply it in a library, but also don't forget about it
- benchmark! - benchmark!
</textarea> </textarea>
</section> </section>
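<section data-markdown>
<textarea data-template>
## `#[inline]`, sketched
A minimal sketch of where the attribute goes. It assumes the `math` module stands in for a separate library crate; inside a single file everything is one crate, so this only illustrates placement, not a measurable speedup. `checksum` is a hypothetical function added for the example.

```rust
// Stand-in for a library crate.
pub mod math {
    // Tiny and called in hot loops: mark it so downstream crates
    // can inline it (very small functions may get this automatically).
    #[inline]
    pub fn add(a: u8, b: u8) -> u8 {
        a.wrapping_add(b)
    }

    // No attribute: compiled once in the defining crate,
    // downstream crates call it as a regular function.
    pub fn checksum(bytes: &[u8]) -> u8 {
        bytes.iter().fold(0, |acc, &b| add(acc, b))
    }
}

fn main() {
    println!("{}", math::checksum(&[250, 10, 1]));
}
```

Whether the attribute pays off depends on the call sites, so measure before and after.
</textarea>
</section>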
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## but i want maximal performance... ## being lazy has advantages
- that's why i wrote most of this talk last week
- `#[inline]` means that the function is *never* instantiated if it's never used!
https://blog.rust-lang.org/inside-rust/2025/07/15/call-for-testing-hint-mostly-unused/
</textarea>
</section>
<section data-markdown>
<textarea data-template>
## but performance is great, i love performance
- it's ok i can wait forever
</textarea> </textarea>
</section> </section>
<section data-markdown> <section data-markdown>
@@ -586,20 +625,40 @@ fn main() { math::add() }
</section> </section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## inlining across codegen units in the same crate ## there was some LTO all along
- ThinLTO across different codegen units by default - in release mode, automatic ThinLTO across codegen units in the same crate
</textarea> </textarea>
</section> </section>
<section data-markdown> <section data-markdown>
<textarea data-template> <textarea data-template>
## `Cargo.toml` config ## how do i make my program run quickly?
```toml - let the compiler inline functions
[profile.release] - libraries: remember `#[inline]`
lto = "thin" - binaries: you want LTO
codegen-units = 1 - really, it really needs to inline your functions
``` - without it, it's so over
- and read this: https://nnethercote.github.io/perf-book
</textarea>
</section>
<section data-markdown>
<textarea data-template>
## how do i make my program compile quickly?
- reduce the number and size of functions
- importantly: in LLVM IR, not necessarily source!
- duplicate and frequent instantiations are bad
- and read this:
- https://corrode.dev/blog/tips-for-faster-rust-compile-times
- https://doc.rust-lang.org/nightly/cargo/guide/build-performance.html
</textarea>
</section>
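<section data-markdown>
<textarea data-template>
## shrinking what gets instantiated
One common way to reduce the amount of instantiated code, sketched under the assumption that most of a generic function's body does not depend on the type parameter: keep a thin generic wrapper and move the real work into a non-generic inner function. `shout` is a hypothetical example, not code from this talk.

```rust
// Thin generic wrapper: instantiated once per `S`, but it is tiny.
pub fn shout<S: AsRef<str>>(text: S) -> String {
    // Non-generic inner function: compiled once,
    // no matter how many different `S` types call `shout`.
    fn inner(text: &str) -> String {
        text.to_uppercase()
    }
    inner(text.as_ref())
}

fn main() {
    // Two instantiations of the wrapper, one shared `inner`.
    println!("{}", shout("meow"));
    println!("{}", shout(String::from("meow")));
}
```
</textarea>
</section>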
<section data-markdown>
<textarea data-template>
## and both? 🥺👉👈
- no
</textarea> </textarea>
</section> </section>
<section data-markdown> <section data-markdown>