Let's create a new Linux distro whose binaries are built with
the -fsanitize=address flag by default, so that:
For background, please read AddressSanitizer: A Fast Address Sanity
Checker, the paper that introduced ASAN, a compiler feature (in both GCC
and Clang) that makes existing software written in C / C++ / FORTRAN / etc.
nearly memory safe. When ASAN was first introduced by Google, it
identified 300 previously unknown bugs in Chromium, e.g. buffer overruns
and stack overflows. It has since been brought to kernelspace by a
project called KASAN. It's even used by memory safe languages like Rust
to add memory safety checks to the code that uses the unsafe keyword.
Therefore ASAN is in some respects the root of memory safety for modern
software.
The tradeoff is an average slowdown of 73% for CPU-bound workloads. Our intended audience is the operators of backend systems running I/O-bound workloads on untrusted network data. For them, we feel the benefit of not getting hacked is worth the negligible cost we see the ASAN runtime imposing on network latencies, and that it's worthwhile to have the freedom to apply the benefits of ASAN systemically.
We aim to build Sanitized Linux by forking Alpine Linux and
reconfiguring it to build the kernel and entire userspace using
the -fsanitize=address flag. We will then make installer
ISOs and prebuilt binaries available via a website and an APK-based
package service.
We like the idea of doing ASAN as a Linux distro because it gives developers a community mandate to engage more broadly, from the bottom up, in finding security bugs and ensuring they get fixed. Right now developers mostly use ASAN only to spot bugs in their own projects. We need a way to keep ASAN plugged in to the broader ecosystem. As Linus's Law (named for Linus Torvalds and formulated by Eric S. Raymond) puts it, given enough eyeballs, all bugs are shallow. Google gave us the microscope by inventing ASAN. Sanitized Linux will bring us the eyeballs.
We like Alpine because (1) it's popular for production containers and (2) the Alpine authors successfully replaced glibc with an alternative, more permissively licensed libc (musl), which is an encouraging sign that the further tuning we'd need to do at the build system level is feasible. Alternatively, a Debian- or RPM-based distro could be used, since the method of packaging is orthogonal to the runtime hardening benefits that ASAN offers.
One criticism we've encountered while floating this idea is that the libsanitizer runtime (which ships with GCC) is 54,000 lines of code that are highly tunable via environment variables intended for developers, and those knobs could be abused if productionized for things like setuid binaries. We intend to address such concerns by using a trimmed-down runtime without the bells and whistles for release builds. For an example of a freestanding ASAN runtime needing fewer than a thousand lines of code, see //libc/intrin:asan.c from the Cosmopolitan Libc codebase.
Lastly, to safeguard against scenarios where binaries are crashing so
often that the system isn't functional, we intend to add a feature to
htop that lets the system administrator flip a bit in a process that
causes it to go into log mode rather than crash mode. This bit could
be inheritable to child processes and perhaps also require superuser
privileges to set, depending on the user's choices. It could also be
set on processes spawned from the command line in a manner similar to
the nice command. This would enable systems to fall back
to a less secure but more functional state should the need arise.
We believe that we can ship a polished Linux distro for backend serving that's fully hardened by ASAN in less than six months.
Please note that the intended audience for Sanitized Linux is the people who operate production services on the backend. We haven't investigated what it would take to have a fully memory safe Linux Desktop, but we potentially could, if the interest is there.
The way ASAN works for x86 userspace in PIE mode is that each time the
compiler generates an instruction that accesses a byte of memory at
address x, a few additional instructions are generated too, in order to
consult the "shadow" memory at the corresponding address x>>3, which
records whether the access is permitted.
+---------------------------------------+------------------------------+-------+
| pointer address range                 | description                  | size  |
+=======================================+==============================+=======+
| 0x0000000000000000-0x00000000001fffff | traditional NULL guard pages | 2mb   |
| 0x0000000000200000-0x000001ffffffffff | unused in -fpie pml4t model  | 2tb   |
| 0x0000020000000000-0x00000fffffffffff | shadow virtual memory bits   | 14tb  |
| 0x0000100000000000-0x00007fffffffffff | program virtual memory bytes | 112tb |
+---------------------------------------+------------------------------+-------+
This incurs a cost of 1/8th additional memory required. Please note that the terabyte sizes above are for the entire x86 user virtual memory space, and that programs usually only map a small portion of that at 4096-byte granularity.
For example, where the compiler would normally generate:
mov (%rdi),%rax
GCC and Clang w/ -fsanitize=address will generate code that looks like this instead:
mov %rdi,%rsi
shr $3,%rsi
cmpb $0,(%rsi)
jnz abort
mov (%rdi),%rax
By having a Linux distro we're also able to choose which compiler
flags the gcc and g++ commands use by
default. This way we can ensure that ASAN not only applies to the
binaries we distribute, but also to the binaries that the
conventional ./configure && make && make install workflow
generates too!
This proposal is written by Justine Alexandra Roberts Tunney on February 19th, 2021.
She's the author of Cosmopolitan Libc, which makes C a build-once run-anywhere language. Before that she spent six years working at Google on prominent open source projects like TensorFlow and Nomulus. She also spearheaded one of Google's notable open source security improvement initiatives, Operation Rosehub. Before Google, she created the OccupyWallSt.org website and Twitter handle, which gave a loudspeaker to an international grassroots movement.
Justine can bottom-line Sanitized Linux if tech industry leaders are open to funding its development. She resides in the Bay Area and can be contacted via email:
You can follow her at: