Comments (3)
Rustc enforces a file size limit of 4 GB, so a token cannot be bigger than that.
use std::fs::File;
use std::io::Write as _;

fn main() {
    // 1 MiB of spaces, written 4100 times to push the file past 4 GiB.
    let buf = vec![b' '; 1024 * 1024];
    let mut file = File::create("spanoverflow.rs").unwrap();
    file.write_all(b"fn main() {\n").unwrap();
    for _ in 0..4100 {
        file.write_all(&buf).unwrap();
    }
    file.write_all(b"}\n").unwrap();
}
$ ls -lh spanoverflow.rs
-rw-r--r-- 1 dtolnay users 4.1G Oct 9 18:41 spanoverflow.rs
$ rustc spanoverflow.rs
fatal error: rustc does not support files larger than 4GB
from proc-macro2.
There appears to be no limit on the total amount of text parsed by rustc, even though its internal representation for BytePos is 32 bits.
https://github.com/rust-lang/rust/blob/1.73.0/compiler/rustc_span/src/lib.rs#L2010-L2014
If you parse more than 2^32 bytes, it overflows and you get bogus spans referring to the wrong files.
use std::fs::File;
use std::io::Write as _;

fn main() {
    // Each file stays under 4 GiB, but together they exceed 2^32 bytes.
    let buf = vec![b' '; 1024 * 1024];
    let mut file = File::create("spanoverflow.rs").unwrap();
    file.write_all(b"mod module;\n").unwrap();
    for _ in 0..2050 {
        file.write_all(&buf).unwrap();
    }
    file.write_all(b"fn main() {}\n").unwrap();

    let mut file = File::create("module.rs").unwrap();
    for _ in 0..2050 {
        file.write_all(&buf).unwrap();
    }
    file.write_all(b"pub fn f() {}\n").unwrap();
}
According to `rustc -Zunpretty=ast-tree,expanded spanoverflow.rs`, this is the location of the `f` function (wrong):
Item {
attrs: [],
id: NodeId(10),
span: spanoverflow.rs:2:4194319: 2:4194332 (#0),
ident: f#0,
kind: Fn(
and this is the location of `main` (wrong):
Item {
attrs: [],
id: NodeId(12),
span: /home/dtolnay/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/string.rs:2952:2144473580: 2952:2144473592 (#0),
ident: main#0,
kind: Fn(
The correct locations would be module.rs and spanoverflow.rs respectively, which is what you get if the total size of the files does not exceed 2^32 bytes.
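To illustrate why positions end up pointing into the wrong file, here is a minimal sketch of 32-bit position wraparound. It assumes a simplified model in which every file gets a start offset in one shared 32-bit address space; `Pos`, `wrapping_starts`, and `checked_starts` are hypothetical names for this example and are not rustc's actual types or functions.

```rust
/// Hypothetical stand-in for a 32-bit byte position (not rustc's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pos(u32);

/// Assigns each file a start offset in one shared 32-bit address space,
/// wrapping silently when the running total exceeds u32::MAX.
fn wrapping_starts(file_sizes: &[u64]) -> Vec<Pos> {
    let mut next: u32 = 0;
    let mut starts = Vec::new();
    for &size in file_sizes {
        starts.push(Pos(next));
        // Truncating cast plus wrapping add models the silent overflow.
        next = next.wrapping_add(size as u32);
    }
    starts
}

/// Same bookkeeping, but returns None as soon as the total no longer fits.
fn checked_starts(file_sizes: &[u64]) -> Option<Vec<Pos>> {
    let mut next: u32 = 0;
    let mut starts = Vec::new();
    for &size in file_sizes {
        starts.push(Pos(next));
        let size = u32::try_from(size).ok()?;
        next = next.checked_add(size)?;
    }
    Some(starts)
}

fn main() {
    let mib: u64 = 1024 * 1024;
    // Two files of roughly 2050 MiB each, followed by a small third file.
    let sizes = [2050 * mib, 2050 * mib, 1000];

    // The third file's start wraps around and lands inside the first file's range.
    println!("wrapping: {:?}", wrapping_starts(&sizes));

    // A checked variant would detect the problem instead of producing bogus offsets.
    println!("checked:  {:?}", checked_starts(&sizes));
}
```

In this model the third file starts at offset 4194304, which falls inside the range already assigned to the first file, so a position meant for the third file would resolve to the first one.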
For scale, currently there is 200 GB of Rust code published on crates.io. Looking at just the newest version of every crate, it is 16 GB of code. So a workload that involves parsing this, even on multiple threads, would currently hit overflow.
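As a rough check of these numbers, the sketch below computes how many times each workload would wrap a 32-bit position space. It assumes "GB" means 10^9 bytes; `full_wraps` is a hypothetical helper for this example.

```rust
/// Size of a 32-bit byte position space: 2^32 bytes.
const ADDRESS_SPACE: u64 = 1 << 32;

/// Number of complete times `total_bytes` wraps the 32-bit space.
fn full_wraps(total_bytes: u64) -> u64 {
    total_bytes / ADDRESS_SPACE
}

fn main() {
    // Assumption: decimal gigabytes (10^9 bytes).
    let gb: u64 = 1_000_000_000;
    println!("newest versions (16 GB): {} full wraps", full_wraps(16 * gb));
    println!("all versions (200 GB): {} full wraps", full_wraps(200 * gb));
}
```

Under that assumption, even the 16 GB workload is several times larger than what a 32-bit position can address.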