gccrs: An alternative compiler for Rust

⚓ Rust    📅 2024-11-07    👤 freedit    👁️ 72      

freedit

gccrs is a work-in-progress alternative compiler for Rust being developed as part of the GCC project. GCC is a collection of compilers for various programming languages that all share a common compilation framework. You may have heard about gccgo, gfortran, or g++, which are all binaries within that project, the GNU Compiler Collection. The aim of gccrs is to add support for the Rust programming language to that collection, with the goal of having the exact same behavior as rustc.

First and foremost, gccrs was started as a project because it is fun. Compilers are incredibly rewarding pieces of software, and are great fun to put together. The project was started back in 2014, before Rust 1.0 was released, but was quickly put aside due to the shifting nature of the language back then. Around 2019, work on the compiler started again, led by Philip Herron and funded by Open Source Security and Embecosm. Since then, we have kept steadily progressing towards support for the Rust language as a whole, and our team has kept growing with around a dozen contributors working regularly on the project. We have participated in the Google Summer of Code program for the past four years, and multiple students have joined the effort.

The main goal of gccrs is to provide an alternative option for compiling Rust. GCC is an old project, as it was first released in 1987. Over the years, it has accumulated numerous contributions and support for multiple targets, including some not supported by LLVM, the main backend used by rustc. A practical example of that reach is the homebrew Dreamcast scene, where passionate engineers develop games for the Dreamcast console. Its processor architecture, SuperH, is supported by GCC but not by LLVM. This means that Rust is not able to be used on those platforms, except through efforts like gccrs or the rustc-codegen-gcc backend - whose main differences will be explained later.

GCC also benefits from the decades of software written in unsafe languages. As such, a high amount of safety features have been developed for the project as external plugins, or even within the project as static analyzers. These analyzers and plugins are executed on GCC's internal representations, meaning that they are language-agnostic, and can thus be used on all the programming languages supported by GCC. Likewise, many GCC plugins are used for increasing the safety of critical projects such as the Linux kernel, which has recently gained support for the Rust programming language. This makes gccrs a useful tool for analyzing unsafe Rust code, and more generally Rust code which has to interact with existing C code. We also want gccrs to be a useful tool for rustc itself by helping pan out the Rust specification effort with a unique viewpoint - that of a tool trying to replicate another's functionality, oftentimes through careful experimentation and source reading where the existing documentation did not go into enough detail. We are also in the process of developing various tools around gccrs and rustc, for the sole purpose of ensuring gccrs is as correct as rustc - which could help in discovering surprising behavior, unexpected functionality, or unspoken assumptions.

We would like to point out that our goal in aiding the Rust specification effort is not to turn it into a document for certifying alternative compilers as "Rust compilers" - while we believe that the specification will be useful to gccrs, our main goal is to contribute to it, by reviewing and adding to it as much as possible.

Furthermore, the project is still "young", and still requires a huge amount of work. There are a lot of places to make your mark, and a lot of easy things to work on for contributors interested in compilers. We have strived to create a safe, fun, and interesting space for all of our team and our GSoC students. We encourage anyone interested to come chat with us on our various communication platforms, and offer mentorship for you to learn how to contribute to the project and to compilers in general.

Maybe more importantly however, there is a number of things that gccrs is NOT for. The project has multiple explicit non-goals, which we value just as highly as our goals.

The most crucial of these non-goals is for gccrs not to become a gateway for an alternative or extended Rust-like programming language. We do not wish to create a GNU-specific version of Rust, with different semantics or slightly different functionality. gccrs is not a way to introduce new Rust features, and will not be used to circumvent the RFC process - which we will be using, should we want to see something introduced to Rust. Rust is not C, and we do not intend to introduce subtle differences in standard by making some features available only to gccrs users. We know about the pain caused by compiler-specific standards, and have learned from the history of older programming languages.

We do not want gccrs to be a competitor to the rustc_codegen_gcc backend. While both projects will effectively achieve the same goal, which is to compile Rust code using the GCC compiler framework, there are subtle differences in what each of these projects will unlock for the language. For example, rustc_codegen_gcc makes it easy to benefit from all of rustc's amazing diagnostics and helpful error messages, and makes Rust easily usable on GCC-specific platforms. On the other hand, it requires rustc to be available in the first place, whereas gccrs is part of a separate project entirely. This is important for some users and core Linux developers for example, who believe that having the ability to compile the entire kernel (C and Rust parts) using a single compiler is essential. gccrs can also offer more plugin entrypoints by virtue of it being its own separate GCC frontend. It also allows Rust to be used on GCC-specific platforms with an older GCC where libgccjit is not available. Nonetheless, we are very good friends with the folks working on rustc_codegen_gcc, and have helped each other multiple times, especially in dealing with the patch-based contribution process that GCC uses.

All of this ties into a much more global goal, which we could summarize as the following: We do not want to split the Rust ecosystem. We want gccrs to help the language reach even more people, and even more platforms.

To ensure that, we have taken multiple measures to make sure the values of the Rust project are respected and exposed properly. One of the features we feel most strongly about is the addition of a very annoying command line flag to the compiler, -frust-incomplete-and-experimental-compiler-do-not-use. Without it, you are not able to compile any code with gccrs, and the compiler will output the following error message:

crab1: fatal error: gccrs is not yet able to compile Rust code properly. Most of the errors produced will be the fault of gccrs and not the crate you are trying to compile. Because of this, please report errors directly to us instead of opening issues on said crate's repository.

Our github repository: https://github.com/rust-gcc/gccrs

Our bugzilla tracker: https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=__open__&component=rust&product=gcc

If you understand this, and understand that the binaries produced might not behave accordingly, you may attempt to use gccrs in an experimental manner by passing the following flag:

-frust-incomplete-and-experimental-compiler-do-not-use

or by defining the following environment variable (any value will do)

GCCRS_INCOMPLETE_AND_EXPERIMENTAL_COMPILER_DO_NOT_USE

For cargo-gccrs, this means passing

GCCRS_EXTRA_ARGS="-frust-incomplete-and-experimental-compiler-do-not-use"

as an environment variable.

Until the compiler can compile correct Rust and, most importantly, reject incorrect Rust, we will be keeping this command line option in the compiler. The hope is that it will prevent users from potentially annoying existing Rust crate maintainers with issues about code not compiling, when it is most likely our fault for not having implemented part of the language yet. Our goal of creating an alternative compiler for the Rust language must not have a negative effect on any member of the Rust community. Of course, this command line flag is not to the taste of everyone, and there has been significant pushback to its presence... but we believe it to be a good representation of our main values.

In a similar vein, gccrs separates itself from the rest of the GCC project by not using a mailing list as its main mode of communication. The compiler we are building will be used by the Rust community, and we believe we should make it easy for that community to get in touch with us and report the problems they encounter. Since Rustaceans are used to GitHub, this is also the development platform we have been using for the past five years. Similarly, we use a Zulip instance as our main communication platform, and encourage anyone wanting to chat with us to join it. Note that we still have a mailing list, as well as an IRC channel (gcc-rust@gcc.gnu.org and #gccrust on oftc.net), where all are welcome.

To further ensure that gccrs does not create friction in the ecosystem, we want to be extremely careful about the finer details of the compiler, which to us means reusing rustc components where possible, sharing effort on those components, and communicating extensively with Rust experts in the community. Two Rust components are already in use by gccrs: a slightly older version of polonius, the next-generation Rust borrow-checker, and the rustc_parse_format crate of the compiler. There are multiple reasons for reusing these crates, with the main one being correctness. Borrow checking is a complex topic and a pillar of the Rust programming language. Having subtle differences between rustc and gccrs regarding the borrow rules would be annoying and unproductive to users - but by making an effort to start integrating polonius into our compilation pipeline, we help ensure that the results we produce will be equivalent to rustc. You can read more about the various components we use, and we plan to reuse even more here. We would also like to contribute to the polonius project itself and help make it better if possible. This cross-pollination of components will obviously benefit us, but we believe it will also be useful for the Rust project and ecosystem as a whole, and will help strengthen these implementations.

Reusing rustc components could also be extended to other areas of the compiler: Various components of the type system, such as the trait solver, an essential and complex piece of software, could be integrated into gccrs. Simpler things such as parsing, as we have done for the format string parser and inline assembly parser, also make sense to us. They will help ensure that the internal representation we deal with will correspond to the one expected by the Rust standard library.

On a final note, we believe that one of the most important steps we could take to prevent breakage within the Rust ecosystem is to further improve our relationship with the Rust community. The amount of help we have received from Rust folks is great, and we think gccrs can be an interesting project for a wide range of users. We would love to hear about your hopes for the project and your ideas for reducing ecosystem breakage or lowering friction with the crates you have published. We had a great time chatting about gccrs at RustConf 2024, and everyone's interest in the project was heartwarming. Please get in touch with us if you have any ideas on how we could further contribute to Rust.

🏷️ Rust_feed