
b3sum should catch truncation and return an error instead of succumbing to SIGBUS #488

@nabijaczleweli

truncate -s100G 100G; (sleep 1; > 100G) & target/debug/b3sum 100G succumbs to SIGBUS if b3sum decides to mmap the file 100G. In principle there's no reason for this to happen: the truncation is detectable and recoverable, either by just giving bad data (which is what would happen if the file weren't mmapped) or by logging an error.
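For comparison, here is a minimal sketch of how the non-mmapped path could notice the truncation and return an error. It is not b3sum's code: read_detecting_truncation and its sink parameter are hypothetical names, and the sink stands in for whatever consumes the bytes (for example a hasher's update method). It only covers plain read() calls and does nothing about SIGBUS on a mapping.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper: reads `path` with plain read() calls (no mmap),
/// passes each chunk to `sink`, and returns an error if fewer bytes arrive
/// than the file held when it was opened.
fn read_detecting_truncation(path: &Path, mut sink: impl FnMut(&[u8])) -> io::Result<u64> {
    let mut file = File::open(path)?;
    // Size at open time; a later truncation shows up as a short read total.
    let expected = file.metadata()?.len();
    let mut buf = vec![0u8; 64 * 1024];
    let mut total: u64 = 0;
    loop {
        let n = match file.read(&mut buf) {
            Ok(0) => break, // end of file, possibly earlier than expected
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        sink(&buf[..n]);
        total += n as u64;
    }
    if total < expected {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            format!("file truncated while reading: got {total} of {expected} bytes"),
        ));
    }
    Ok(total)
}
```

A file that grows during the read is accepted as-is; only a shortfall against the size seen at open is reported.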

Below is a draft modelled after the implementation in https://git.sr.ht/~nabijaczleweli/voreutils/commit/e4621ad93ae69d28c032932ad617c026790c0a40 (which works). The draft itself doesn't work because of the plethora of threads, and it tends to crash gdb, though it does actually manage to longjmp in most threads; systems programming in everyone's favourite systems programming language sucks as usual:

diff --git a/b3sum/Cargo.toml b/b3sum/Cargo.toml
index f8c9023..7c85e1c 100644
--- a/b3sum/Cargo.toml
+++ b/b3sum/Cargo.toml
@@ -20,6 +20,8 @@ clap = { version = "4.0.8", features = ["derive", "wrap_help"] }
 hex = "0.4.0"
 rayon-core = "1.12.1"
 wild = "2.0.3"
+libc = "0.2.172"
+cee-scape = "0.2.0"
 
 [dev-dependencies]
 duct = "1.0.0"
diff --git a/b3sum/src/main.rs b/b3sum/src/main.rs
index 69a10c8..d24cea0 100644
--- a/b3sum/src/main.rs
+++ b/b3sum/src/main.rs
@@ -173,6 +173,11 @@ impl Args {
     }
 }
 
+static mut sigbussy: cee_scape::JmpBufFields = unsafe { std::mem::zeroed() };
+unsafe extern "C" fn sigbus(_: i32) {
+    cee_scape::longjmp(&raw const sigbussy, 1);
+}
+
 fn hash_path(args: &Args, path: &Path) -> anyhow::Result<blake3::OutputReader> {
     let mut hasher = args.base_hasher.clone();
     if path == Path::new("-") {
@@ -183,8 +188,28 @@ fn hash_path(args: &Args, path: &Path) -> anyhow::Result<blake3::OutputReader> {
     } else if args.no_mmap() {
         hasher.update_reader(File::open(path)?)?;
     } else {
+        let spawner = std::thread::current().id();
         // The fast path: Try to mmap the file and hash it with multiple threads.
-        hasher.update_mmap_rayon(path)?;
+        unsafe {
+            libc::sigaction(
+                libc::SIGBUS,
+                &libc::sigaction {
+                    sa_sigaction: sigbus as usize,
+                    ..std::mem::zeroed()
+                } as *const _,
+                std::ptr::null_mut(),
+            )
+        };
+        let mut res: std::io::Result<()> = Ok(());
+        if cee_scape::call_with_setjmp(|buf| {
+            unsafe { std::ptr::copy_nonoverlapping(buf as *const _, &raw mut sigbussy, 1) };
+            res = hasher.update_mmap_rayon(path).map(drop);
+            0
+        }) != 0
+        {
+            dbg!(spawner, std::thread::current().id());
+        }
+        res?;
     }
     let mut output_reader = hasher.finalize_xof();
     output_reader.set_position(args.seek());
