<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Posts on nilstriebs blog</title><link>/posts/</link><description>Recent content in Posts on nilstriebs blog</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Fri, 22 Jul 2022 00:00:00 +0000</lastBuildDate><atom:link href="/posts/index.xml" rel="self" type="application/rss+xml"/><item><title>Box Is a Unique Type</title><link>/posts/box-is-a-unique-type/</link><pubDate>Fri, 22 Jul 2022 00:00:00 +0000</pubDate><guid>/posts/box-is-a-unique-type/</guid><description>We have all used Box&amp;lt;T&amp;gt; before in our Rust code. It&amp;rsquo;s a glorious type, with great ergonomics and flexibility. We can use it to put our values on the heap, but it can do even more than that!
struct Fields { a: String, b: String, } let fields = Box::new(Fields { a: &amp;#34;a&amp;#34;.to_string(), b: &amp;#34;b&amp;#34;.to_string() }); let a = fields.a; let b = fields.b; This kind of partial deref move is just one of the spectacular magic tricks box has up its sleeve, and they exist for good reason: They are very useful.</description><content>&lt;p>We have all used &lt;code>Box&amp;lt;T&amp;gt;&lt;/code> before in our Rust code. It&amp;rsquo;s a glorious type, with great ergonomics
and flexibility. We can use it to put our values on the heap, but it can do even more
than that!&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-rust" data-lang="rust">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">struct&lt;/span> &lt;span style="color:#a6e22e">Fields&lt;/span> {
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> a: String,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> b: String,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>}
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">let&lt;/span> fields &lt;span style="color:#f92672">=&lt;/span> Box::new(Fields {
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> a: &lt;span style="color:#e6db74">&amp;#34;a&amp;#34;&lt;/span>.to_string(),
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> b: &lt;span style="color:#e6db74">&amp;#34;b&amp;#34;&lt;/span>.to_string()
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>});
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">let&lt;/span> a &lt;span style="color:#f92672">=&lt;/span> fields.a;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">let&lt;/span> b &lt;span style="color:#f92672">=&lt;/span> fields.b;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This kind of partial deref move is just one of the spectacular magic tricks box has up its sleeve,
and they exist for good reason: They are very useful. Sadly we have not yet found a way to generalize all
of these to user types as well. Too bad!&lt;/p>
&lt;p>Anyways, this post is about one particularly subtle magic aspect of box. For this, we need to dive
deep into unsafe code, so let&amp;rsquo;s get our hazmat suits on and jump in!&lt;/p>
&lt;h1 id="an-interesting-optimization">An interesting optimization&lt;/h1>
&lt;p>We have this code here:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-rust" data-lang="rust">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">fn&lt;/span> &lt;span style="color:#a6e22e">takes_box_and_ptr_to_it&lt;/span>(&lt;span style="color:#66d9ef">mut&lt;/span> b: Box&lt;span style="color:#f92672">&amp;lt;&lt;/span>&lt;span style="color:#66d9ef">u8&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span>, ptr: &lt;span style="color:#f92672">*&lt;/span>&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">u8&lt;/span>) {
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">let&lt;/span> value &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">unsafe&lt;/span> { &lt;span style="color:#f92672">*&lt;/span>ptr };
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#f92672">*&lt;/span>b &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span>;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#66d9ef">let&lt;/span> value2 &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#66d9ef">unsafe&lt;/span> { &lt;span style="color:#f92672">*&lt;/span>ptr };
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> assert_ne!(value, value2);
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>}
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">let&lt;/span> b &lt;span style="color:#f92672">=&lt;/span> Box::new(&lt;span style="color:#ae81ff">0&lt;/span>);
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#66d9ef">let&lt;/span> ptr: &lt;span style="color:#f92672">*&lt;/span>&lt;span style="color:#66d9ef">const&lt;/span> &lt;span style="color:#66d9ef">u8&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#f92672">&amp;amp;*&lt;/span>b;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>takes_box_and_ptr_to_it(b, ptr);
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>There&amp;rsquo;s a function, &lt;code>takes_box_and_ptr_to_it&lt;/code>, that takes a box and a pointer as parameters. Then,
it reads a value from the pointer, writes to the box, and reads a value again. It then asserts that
the two values aren&amp;rsquo;t equal. Why would they differ? Our box and pointer point to the same
location in memory, so writing to the box causes the pointer to read the new value.&lt;/p>
&lt;p>Now construct a box, get a pointer to it, and pass the two to the function. Run the program&amp;hellip;&lt;/p>
&lt;p>&amp;hellip; and everything is fine. Let&amp;rsquo;s run it in release mode. This should work as well, since the optimizer
isn&amp;rsquo;t allowed to change observable behaviour, and an assert is very observable. Run the program&amp;hellip;&lt;/p>
&lt;pre tabindex="0">&lt;code>thread &amp;#39;main&amp;#39; panicked at &amp;#39;assertion failed: `(left != right)`
left: `0`,
right: `0`&amp;#39;, src/main.rs:5:5
&lt;/code>&lt;/pre>&lt;p>Hmm. That&amp;rsquo;s not what I told you would happen. Is the compiler broken? Is this a miscompilation?
I&amp;rsquo;ve heard that those do sometimes happen, right?&lt;/p>
&lt;p>Trusting our instincts that &amp;ldquo;it&amp;rsquo;s never a miscompilation until it is one&amp;rdquo;, we assume that LLVM behaved
well here. But what allows it to make this optimization? Taking a look at the generated LLVM-IR (by using
&lt;code>--emit llvm-ir -O&lt;/code>, the &lt;code>-O&lt;/code> is important since rustc only emits these attributes with optimizations on)
reveals the solution: (severely shortened to only show the relevant parts)&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-llvmir" data-lang="llvmir">define void @takes_box_and_ptr_to_it(i8* noalias %0, i8* %ptr) {
&lt;/code>&lt;/pre>&lt;p>See the little attribute on the first parameter called &lt;code>noalias&lt;/code>? That&amp;rsquo;s what&amp;rsquo;s doing the magic here.
&lt;code>noalias&lt;/code> is an LLVM attribute on pointers that allows for various optimizations. If there are two pointers,
and at least one of them is &lt;code>noalias&lt;/code>, there are some restrictions around the two. Approximately:&lt;/p>
&lt;ul>
&lt;li>If one of them writes, they must not point to the same value (alias each other)&lt;/li>
&lt;li>If neither of them writes, they can alias just fine.&lt;/li>
&lt;/ul>
&lt;p>Therefore, we also apply &lt;code>noalias&lt;/code> to &lt;code>&amp;amp;mut T&lt;/code> and &lt;code>&amp;amp;T&lt;/code> (if it doesn&amp;rsquo;t contain interior mutability through
&lt;code>UnsafeCell&amp;lt;T&amp;gt;&lt;/code>), since they uphold these rules.&lt;/p>
&lt;p>This might sound familiar to you if you&amp;rsquo;re a viewer of &lt;a href="https://twitter.com/jonhoo">Jon Gjengset&lt;/a>&amp;rsquo;s content (which I can highly recommend). Jon has made an entire video about this before, since his crate &lt;code>left-right&lt;/code>
was affected by this (&lt;a href="https://youtu.be/EY7Wi9fV5bk">https://youtu.be/EY7Wi9fV5bk&lt;/a>).&lt;/p>
&lt;p>If you&amp;rsquo;re looking for &lt;em>any&lt;/em> hint that using box emits &lt;code>noalias&lt;/code>, you have to look no further than the documentation
for &lt;a href="https://doc.rust-lang.org/nightly/std/boxed/index.html#considerations-for-unsafe-code">&lt;code>std::boxed&lt;/code>&lt;/a>. Well, the nightly or beta docs, because I only added this section very recently. For years, this behaviour was not really documented, and you had to
belong to the arcane circles of the select few who were aware of it. So lots of code was written thinking that box was &amp;ldquo;just an
RAII pointer&amp;rdquo; (a pointer that allocates the value in the constructor, and deallocates it in the destructor on drop) as far as
pointers are concerned.&lt;/p>
&lt;h1 id="stacked-borrows-and-miri">Stacked Borrows and Miri&lt;/h1>
&lt;p>&lt;a href="https://github.com/rust-lang/miri">Miri&lt;/a> is an interpreter for Rust code with the goal of finding undefined behaviour.
Undefined behaviour, UB for short, is behaviour of a program upon which no restrictions are imposed. If UB is executed,
&lt;em>anything&lt;/em> can happen, including segmentation faults, silent memory corruption, leakage of private keys or exactly
what you intended to happen. Examples of UB include use-after-free, out of bounds reads or data races.&lt;/p>
&lt;p>I cannot recommend Miri highly enough for all unsafe code you&amp;rsquo;re writing (sadly support for some IO functions
and FFI is still lacking, and it&amp;rsquo;s still very slow).&lt;/p>
&lt;p>So, let&amp;rsquo;s see whether our code contains UB. It has to, since otherwise the optimizer wouldn&amp;rsquo;t be allowed to change
observable behaviour (since the assert doesn&amp;rsquo;t fail in debug mode). &lt;code>$ cargo miri run&lt;/code>&amp;hellip;&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-rust,ignore" data-lang="rust,ignore">error: Undefined Behavior: attempting a read access using &amp;lt;3314&amp;gt; at alloc1722[0x0], but that tag does not exist in the borrow stack for this location
--&amp;gt; src/main.rs:2:26
|
2 | let value = unsafe { *ptr };
| ^^^^
| |
| attempting a read access using &amp;lt;3314&amp;gt; at alloc1722[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of an access at alloc1722[0x0..0x1]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: &amp;lt;3314&amp;gt; was created by a retag at offsets [0x0..0x1]
--&amp;gt; src/main.rs:10:26
|
10 | let ptr: *const u8 = &amp;amp;*b;
| ^^^
help: &amp;lt;3314&amp;gt; was later invalidated at offsets [0x0..0x1]
--&amp;gt; src/main.rs:12:29
|
12 | takes_box_and_ptr_to_it(b, ptr);
| ^
= note: backtrace:
= note: inside `takes_box_and_ptr_to_it` at src/main.rs:2:26
note: inside `main` at src/main.rs:12:5
--&amp;gt; src/main.rs:12:5
|
12 | takes_box_and_ptr_to_it(b, ptr);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
&lt;/code>&lt;/pre>&lt;p>This behaviour does indeed not look very defined at all. But what went wrong? There&amp;rsquo;s a lot of information here.&lt;/p>
&lt;p>First of all, it says that we attempted a read access, and that this access failed because the tag does not exist in the
borrow stack of the byte that was accessed. This refers to Stacked Borrows, the experimental memory model for Rust
that is implemented in Miri. For an excellent introduction, see this part of the great book &lt;a href="https://rust-unofficial.github.io/too-many-lists/fifth-stacked-borrows.html">Learning Rust With Entirely Too Many Linked Lists&lt;/a>.&lt;/p>
&lt;p>In short: each pointer has a unique tag attached to it. Each byte in memory has its own &amp;lsquo;borrow stack&amp;rsquo; of these tags,
and only the pointers that have their tag in the stack are allowed to access it. Tags can be pushed and popped from the stack through various operations, for example borrowing.&lt;/p>
&lt;p>In the Miri output above, we get a nice little hint where the tag was created. When we created a reference (that was then
coerced into a raw pointer) from our box, it got a new tag called &lt;code>&amp;lt;3314&amp;gt;&lt;/code>. Then, when we moved the box into the function,
something happened: the tag was popped off the borrow stack and therefore invalidated. That&amp;rsquo;s because moving a box invalidates
the tags of all pointers derived from it. We then tried to read through the invalidated tag anyway, and undefined behaviour happened!&lt;/p>
&lt;p>And that&amp;rsquo;s how our code wasn&amp;rsquo;t a miscompilation, but undefined behaviour. Quite surprising, isn&amp;rsquo;t it?&lt;/p>
&lt;h1 id="noalias-nothanks">noalias, nothanks&lt;/h1>
&lt;p>Many people, myself included, don&amp;rsquo;t think that this is a good thing.&lt;/p>
&lt;p>First of all, it introduces more UB that could have been defined behaviour instead. This is true for almost all UB, but usually,
there is something gained from the UB that justifies it. We will look at this later. But allowing such behaviour is fairly easy:
If box didn&amp;rsquo;t invalidate pointers on move and instead behaved like a normal raw pointer, the code above would be sound.&lt;/p>
&lt;p>But more importantly, this is not behaviour generally expected by users. While it can be argued that box is like a &lt;code>T&lt;/code>, but on
the heap, and therefore moving it should invalidate pointers, since moving &lt;code>T&lt;/code> definitely has to invalidate pointers to it,
this comparison doesn&amp;rsquo;t make sense to me. While &lt;code>Box&amp;lt;T&amp;gt;&lt;/code> usually behaves like a &lt;code>T&lt;/code>, it&amp;rsquo;s just a pointer. Writers of unsafe
code &lt;em>know&lt;/em> that box is just a pointer, and will abuse that knowledge, accidentally causing UB with it. While this can be
mitigated with better docs and teaching, like how no one questions the uniqueness of &lt;code>&amp;amp;mut T&lt;/code> (maybe that&amp;rsquo;s also because that
one makes intuitive sense, &amp;ldquo;shared xor mutable&amp;rdquo; is a simple concept), I think it will always be a problem,
because in my opinion, box being unique and invalidating pointers on move is simply not intuitive.&lt;/p>
&lt;p>When a box is moved, the pointer bytes change their location in memory. But the bytes the box points to stay the same. They don&amp;rsquo;t
move in memory. This is the fundamental missing intuition about the box behaviour.&lt;/p>
&lt;p>There are also other reasons why the box behaviour is not desirable. Even people who know about the behaviour of box will want
to write code that goes directly against this behaviour at some point. But usually, fixing it is pretty simple: Storing a raw
pointer (or &lt;code>NonNull&amp;lt;T&amp;gt;&lt;/code>) instead of a box, and using the constructor and drop to allocate and deallocate the backing box.
This is fairly inconvenient, but totally acceptable. There are bigger problems though. There are crates like &lt;code>owning_ref&lt;/code>
that want to expose a generic interface over any type. Users like to choose box, and sometimes &lt;em>have&lt;/em> to choose box because of
other box-exclusive features it offers. Even worse is &lt;code>string_cache&lt;/code>, which is extremely hard to fix.&lt;/p>
&lt;p>Then last but not least, there&amp;rsquo;s the opinion that &lt;code>Box&amp;lt;T&amp;gt;&lt;/code> should be implementable entirely in user code. While we are
many missing language features away from this being the case, the &lt;code>noalias&lt;/code> behaviour is yet more magic bestowed upon box itself, with no
user code ever having access to it.&lt;/p>
&lt;h1 id="noalias-noslow">noalias, noslow&lt;/h1>
&lt;p>There are also several arguments in favour of box being unique and special cased here. To negate the last argument above, it can
be said that &lt;code>Box&amp;lt;T&amp;gt;&lt;/code> &lt;em>is&lt;/em> a very special type. It&amp;rsquo;s just like a &lt;code>T&lt;/code>, but on the heap. Using this mental model, it&amp;rsquo;s very easy to
justify all the box magic and its unique behaviour.&lt;/p>
&lt;p>This mental model is one that many people have, but what does this bring us? This is just one mental model of box, and
there are other mental models of it (like &amp;ldquo;a reference that manages its lifetime itself&amp;rdquo; or &amp;ldquo;a safe RAII pointer&amp;rdquo;).&lt;/p>
&lt;p>There is one clear potential benefit from this box behaviour. ✨Optimizations✨. &lt;code>noalias&lt;/code> doesn&amp;rsquo;t exist for fun, it&amp;rsquo;s something
that can bring clear performance wins (for &lt;code>noalias&lt;/code> on &lt;code>&amp;amp;mut T&lt;/code>, those were measurable). So only one question remains:
How much performance does &lt;code>noalias&lt;/code> on &lt;code>Box&amp;lt;T&amp;gt;&lt;/code> give us now, and how much potential performance improvement could we get in the
future? For the latter, there is no simple answer. For the former, there is. &lt;code>rustc&lt;/code> has &lt;a href="https://github.com/rust-lang/rust/pull/99527">&lt;em>no&lt;/em> performance improvements&lt;/a> from being compiled with &lt;code>noalias&lt;/code> on &lt;code>Box&amp;lt;T&amp;gt;&lt;/code>.&lt;/p>
&lt;p>I have not yet benchmarked ecosystem crates without box noalias and don&amp;rsquo;t have the capacity to do so right now, so I would be very
grateful if anyone wanted to pick that up and report the results.&lt;/p>
&lt;h1 id="a-way-forward">A way forward&lt;/h1>
&lt;p>Based on all of this, I do have a solution that, in my opinion, will fix all of this, even potential performance regressions with
box. First of all, I think that even if there are some performance regressions in ecosystem crates, the overall tradeoff goes
against the current box behaviour. Unsafe code wants to use box, and it is reasonable to do so. Therefore I propose to completely
remove all uniqueness from &lt;code>Box&amp;lt;T&amp;gt;&lt;/code>, and treat it just like a &lt;code>*const T&lt;/code> for the purposes of aliasing. This will make it more
predictable for unsafe code, and comes at no or only a minor performance cost.&lt;/p>
&lt;p>But this performance cost may be real, and especially the future optimization value can&amp;rsquo;t be certain. I do think that there
should be a way to get the uniqueness guarantees in some other way than through box. One possibility would be to use a &lt;code>&amp;amp;'static mut T&lt;/code> that is unleaked for drop, but the semantics of this are still &lt;a href="https://github.com/rust-lang/unsafe-code-guidelines/issues/316">unclear&lt;/a>. If that is not possible, maybe exposing &lt;code>std::ptr::Unique&lt;/code> (with it getting box&amp;rsquo;s aliasing semantics) could be desirable. For this, all existing usages of &lt;code>Unique&lt;/code> inside the standard library would have to be removed though.&lt;/p>
&lt;p>I guess what I am wishing for are some good and flexible raw pointer types. That&amp;rsquo;s still in the stars&amp;hellip;&lt;/p>
&lt;p>For more information about this topic, see &lt;a href="https://github.com/rust-lang/unsafe-code-guidelines/issues/326">https://github.com/rust-lang/unsafe-code-guidelines/issues/326&lt;/a>&lt;/p></content></item></channel></rss>