| author | Miguel Ojeda <ojeda@kernel.org> | 2025-11-24 17:16:04 +0100 |
|---|---|---|
| committer | Miguel Ojeda <ojeda@kernel.org> | 2025-11-24 17:48:16 +0100 |
| commit | 54e3eae855629702c566bd2e130d9f40e7f35bde (patch) | |
| tree | 66ba144f3a8fca061a2144b79863b4cd718844e0 /rust/proc-macro2/extra.rs | |
| parent | 7a0eae4d43d265c56e9d0b136ec08e35b83525b8 (diff) | |
| parent | 52ba807f1aa6ac16289e9dc9e381475305afd685 (diff) | |
Merge patch series "`syn` support"
This patch series introduces support for `syn` (and its dependencies):
Syn is a parsing library for parsing a stream of Rust tokens into a
syntax tree of Rust source code.
Currently this library is geared toward use in Rust procedural
macros, but contains some APIs that may be useful more generally.
It is the most downloaded Rust crate (according to crates.io), and it
is also used by the Rust compiler itself. Having such support greatly
simplifies writing complex macros such as `pin-init`. We will use
it in the `macros` crate too.
Benno has already prepared the `pin-init` version based on this, and on
top of that, we will be able to simplify the `macros` crate too. I think
Jesung is working on updating the `TryFrom` and `Into` upcoming derive
macros to use `syn` too.
The series starts with a few preparation commits (two fixes were already
merged in mainline that were discovered by this series), then each crate
is added. Finally, support for using the new crates from our `macros`
crate is introduced.
This has been a long time coming: even before Rust for Linux was
merged into the Linux kernel, Gary and Benno wanted to use `syn`.
The first iterations of this, from 2022 and 2023 (with `serde` too,
another popular crate), are at:
https://github.com/Rust-for-Linux/linux/pull/910
https://github.com/Rust-for-Linux/linux/pull/1007
After those, we considered picking these from the distributions where
possible. However, after discussing it, it is not really worth the
complexity: vendoring makes things less complex and is less fragile.
In particular, we avoid having to support and test several versions,
we avoid having to introduce Cargo just to properly fetch the right
versions from the registry, we can easily customize the crates if needed
(e.g. dropping the `unicode_idents` dependency like it is done in this
series) and we simplify the configuration of the build for users for
whom the "default" paths/registries would not have worked.
Moreover, nowadays, the ~57k lines introduced are not that much compared
to years ago, when they would have dwarfed the actual Rust kernel code.
Back then, it also wasn't clear the Rust experiment would be a success, so
it would have been a bit pointless/risky to add many lines for nothing. Our
macro needs were also smaller in the early days.
So, finally, in Kangrejos 2025 we discussed going with the original,
simpler approach. Thus, here is the result.
There should not be many updates needed for these, and even if there
are, they should not be too big, e.g. +7k -3k lines across the 3 crates
in the last year.
Note that `syn` does not have all the features enabled, since we do not
need them so far, but they can easily be enabled by adding them to the
list.
Link: https://patch.msgid.link/20251124151837.2184382-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Diffstat (limited to 'rust/proc-macro2/extra.rs')
| -rw-r--r-- | rust/proc-macro2/extra.rs | 153 |
1 files changed, 153 insertions, 0 deletions
diff --git a/rust/proc-macro2/extra.rs b/rust/proc-macro2/extra.rs
new file mode 100644
index 000000000000..55feb5ec7526
--- /dev/null
+++ b/rust/proc-macro2/extra.rs
@@ -0,0 +1,153 @@
+// SPDX-License-Identifier: Apache-2.0 OR MIT
+
+//! Items which do not have a correspondence to any API in the proc_macro crate,
+//! but are necessary to include in proc-macro2.
+
+use crate::fallback;
+use crate::imp;
+use crate::marker::{ProcMacroAutoTraits, MARKER};
+use crate::Span;
+use core::fmt::{self, Debug};
+
+/// Invalidate any `proc_macro2::Span` that exist on the current thread.
+///
+/// The implementation of `Span` uses thread-local data structures and this
+/// function clears them. Calling any method on a `Span` on the current thread
+/// created prior to the invalidation will return incorrect values or crash.
+///
+/// This function is useful for programs that process more than 2<sup>32</sup>
+/// bytes of Rust source code on the same thread. Just like rustc, proc-macro2
+/// uses 32-bit source locations, and these wrap around when the total source
+/// code processed by the same thread exceeds 2<sup>32</sup> bytes (4
+/// gigabytes). After a wraparound, `Span` methods such as `source_text()` can
+/// return wrong data.
+///
+/// # Example
+///
+/// As of late 2023, there is 200 GB of Rust code published on crates.io.
+/// Looking at just the newest version of every crate, it is 16 GB of code. So a
+/// workload that involves parsing it all would overflow a 32-bit source
+/// location unless spans are being invalidated.
+///
+/// ```
+/// use flate2::read::GzDecoder;
+/// use std::ffi::OsStr;
+/// use std::io::{BufReader, Read};
+/// use std::str::FromStr;
+/// use tar::Archive;
+///
+/// rayon::scope(|s| {
+///     for krate in every_version_of_every_crate() {
+///         s.spawn(move |_| {
+///             proc_macro2::extra::invalidate_current_thread_spans();
+///
+///             let reader = BufReader::new(krate);
+///             let tar = GzDecoder::new(reader);
+///             let mut archive = Archive::new(tar);
+///             for entry in archive.entries().unwrap() {
+///                 let mut entry = entry.unwrap();
+///                 let path = entry.path().unwrap();
+///                 if path.extension() != Some(OsStr::new("rs")) {
+///                     continue;
+///                 }
+///                 let mut content = String::new();
+///                 entry.read_to_string(&mut content).unwrap();
+///                 match proc_macro2::TokenStream::from_str(&content) {
+///                     Ok(tokens) => {/* ... */},
+///                     Err(_) => continue,
+///                 }
+///             }
+///         });
+///     }
+/// });
+/// #
+/// # fn every_version_of_every_crate() -> Vec<std::fs::File> {
+/// #     Vec::new()
+/// # }
+/// ```
+///
+/// # Panics
+///
+/// This function is not applicable to and will panic if called from a
+/// procedural macro.
+#[cfg(span_locations)]
+#[cfg_attr(docsrs, doc(cfg(feature = "span-locations")))]
+pub fn invalidate_current_thread_spans() {
+    crate::imp::invalidate_current_thread_spans();
+}
+
+/// An object that holds a [`Group`]'s `span_open()` and `span_close()` together
+/// in a more compact representation than holding those 2 spans individually.
+///
+/// [`Group`]: crate::Group
+#[derive(Copy, Clone)]
+pub struct DelimSpan {
+    inner: DelimSpanEnum,
+    _marker: ProcMacroAutoTraits,
+}
+
+#[derive(Copy, Clone)]
+enum DelimSpanEnum {
+    #[cfg(wrap_proc_macro)]
+    Compiler {
+        join: proc_macro::Span,
+        open: proc_macro::Span,
+        close: proc_macro::Span,
+    },
+    Fallback(fallback::Span),
+}
+
+impl DelimSpan {
+    pub(crate) fn new(group: &imp::Group) -> Self {
+        #[cfg(wrap_proc_macro)]
+        let inner = match group {
+            imp::Group::Compiler(group) => DelimSpanEnum::Compiler {
+                join: group.span(),
+                open: group.span_open(),
+                close: group.span_close(),
+            },
+            imp::Group::Fallback(group) => DelimSpanEnum::Fallback(group.span()),
+        };
+
+        #[cfg(not(wrap_proc_macro))]
+        let inner = DelimSpanEnum::Fallback(group.span());
+
+        DelimSpan {
+            inner,
+            _marker: MARKER,
+        }
+    }
+
+    /// Returns a span covering the entire delimited group.
+    pub fn join(&self) -> Span {
+        match &self.inner {
+            #[cfg(wrap_proc_macro)]
+            DelimSpanEnum::Compiler { join, .. } => Span::_new(imp::Span::Compiler(*join)),
+            DelimSpanEnum::Fallback(span) => Span::_new_fallback(*span),
+        }
+    }
+
+    /// Returns a span for the opening punctuation of the group only.
+    pub fn open(&self) -> Span {
+        match &self.inner {
+            #[cfg(wrap_proc_macro)]
+            DelimSpanEnum::Compiler { open, .. } => Span::_new(imp::Span::Compiler(*open)),
+            DelimSpanEnum::Fallback(span) => Span::_new_fallback(span.first_byte()),
+        }
+    }
+
+    /// Returns a span for the closing punctuation of the group only.
+    pub fn close(&self) -> Span {
+        match &self.inner {
+            #[cfg(wrap_proc_macro)]
+            DelimSpanEnum::Compiler { close, .. } => Span::_new(imp::Span::Compiler(*close)),
+            DelimSpanEnum::Fallback(span) => Span::_new_fallback(span.last_byte()),
+        }
+    }
+}
+
+impl Debug for DelimSpan {
+    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
+        Debug::fmt(&self.join(), f)
+    }
+}
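
The added file leans on two ideas: the fallback variant of `DelimSpan` stores a single span and derives the opening and closing delimiter spans from its first and last byte, and 32-bit source offsets wrap around past 2^32 bytes. The following is a rough, std-only sketch of both ideas. `ToySpan` and `next_offset` are hypothetical names invented for illustration; they are not part of proc-macro2, and the real `fallback::Span` may differ in its details.

```rust
// Toy model only: `ToySpan` is a hypothetical stand-in, not proc-macro2's type.
#[derive(Copy, Clone, Debug, PartialEq)]
struct ToySpan {
    lo: u32, // offset of the first byte covered
    hi: u32, // offset one past the last byte covered
}

impl ToySpan {
    /// Span covering only the first byte (where the opening delimiter sits).
    fn first_byte(self) -> ToySpan {
        ToySpan {
            lo: self.lo,
            hi: self.hi.min(self.lo.saturating_add(1)),
        }
    }

    /// Span covering only the last byte (where the closing delimiter sits).
    fn last_byte(self) -> ToySpan {
        ToySpan {
            lo: self.lo.max(self.hi.saturating_sub(1)),
            hi: self.hi,
        }
    }
}

/// Next free offset after appending `len` bytes of source.
/// The addition wraps at 2^32, which models the wraparound described
/// in the `invalidate_current_thread_spans` documentation.
fn next_offset(current: u32, len: u32) -> u32 {
    current.wrapping_add(len)
}

fn main() {
    // A group such as `(a + b)` occupying bytes 10..17.
    let group = ToySpan { lo: 10, hi: 17 };
    println!("open  = {:?}", group.first_byte());
    println!("close = {:?}", group.last_byte());

    // Close to the top of the 32-bit range, the next offset wraps around
    // and collides with offsets handed out much earlier.
    let near_limit = u32::MAX - 5;
    println!("wrapped offset = {}", next_offset(near_limit, 10));
}
```

Storing one span and computing the delimiter spans on demand is what makes the fallback representation more compact than keeping three spans; the compiler-backed variant in the diff keeps all three because it cannot derive them this way.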
