jacereda / fsatrace
Filesystem access tracer
License: ISC License
At the moment if the process returns a non-zero exit code then fsatrace does nothing. I find that surprising - I'd expect it to always produce the output, and then bubble the exit code back as well. Assuming you want to keep the default behaviour, an option would be useful.
Changes on Ubuntu 20.04 seem to break several pretty basic things.
First, make test fails because resolv of __xlstat fails. Changing line 502 of fsatraceso.c to always use __xstat causes things to work.
Second, the unlinkat function (also in fsatraceso.c) contains an unconditional assert(0) on line 321. I think there should be an else before that; adding it makes things work.
If I create a file twice.bat with contents:
cat foo.txt
cat bar.txt
Then it only records the first call to cat, not the second. Similarly, if I do gcc -c main.c then it traces the call to cc1 (which loads main.c and writes the .s file), but not the call to as (which writes main.o).
The problem (as best I can tell...) is that patchInstalled asks if this thread has previously installed a patch. But, if this thread previously installed a patch for a different process, it returns True even though the patch isn't valid. I "fixed" this by changing the value stored in the Tls to be the process of the thread passed to NtResumeThread. I have no real idea what I'm doing, but it fixes the problem, and tracing it seems to approximately work. Any advice?
Would it be possible to embed the extra files into fsatrace itself?
It would seem lovely to be able to enable at least the possibility of static linking (when there is, e.g. a C API). Not needed, but it does seem like it could enhance the user experience.
On Windows, if I create a binary Main.exe (in Haskell, but not sure that matters) and then do:
fsatrace rwmdq output.txt -- Main.exe
I don't get Main.exe as a file that is read. Furthermore, doing cmd /c Main.exe as the command line doesn't get either cmd or Main as the detected binaries. It seems that the binary being run doesn't get a "Read" entry?
I sometimes get:
Fatal: fsatrace.c:39: CreateProcessW(0, cmd, 0, 0, 0, CREATE_SUSPENDED, 0, 0, &si, &pi), err: 2
It would be really useful if in those circumstances it could also print the command line used, since that can help with debugging various escaping issues (which I think is the problem I'm having here!)
On Linux, fsatrace doesn't appear to record the executable it is running as a "read".
Tagging @ndmitchell; since we are using fsatrace in Rattle.
Currently, running gcc -c main.c, I get 139 lines output from fsatrace. If I remove all lines which are identical to the previous line, I'm left with 15. Reducing the number of lines by a factor of 10 results in less storage, and less requirement for processing downstream. This will probably help with ndmitchell/shake#334
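The filtering described above can be sketched in a few lines of Python. This is only an illustration of the post-processing step, not fsatrace code; the function name and the sample paths are made up for the example.

```python
from itertools import groupby


def drop_adjacent_duplicates(lines):
    """Collapse each run of identical consecutive lines into a single line."""
    # groupby only groups *adjacent* equal items, so the original order is
    # kept and a line that reappears later (after a different one) survives.
    return [line for line, _run in groupby(lines)]


# Illustrative trace lines (not real fsatrace output).
trace = [
    "r|/usr/include/stdio.h",
    "r|/usr/include/stdio.h",
    "w|/tmp/main.s",
    "r|/usr/include/stdio.h",
]
print(drop_adjacent_duplicates(trace))
```

Only consecutive repeats are removed, so the relative order of reads and writes is untouched.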
For a paper we're writing about Rattle (which uses fsatrace) we'd like to quantify in some way how much of the filesystem API fsatrace covers. Right now, it looks like 25 functions are overridden on Linux, which is my estimate just by grepping for R(. Is that accurate? Is there a way to tell how many relevant functions glibc provides? Or anything else in this neighborhood?
I had a look through the test suite. A couple of questions:
RR FilePath
f ('R':'|':xs) = Just $ R xs. It would be ideal if you guaranteed only to produce one form.
os == "mingw32" - this always scares me that Haskell programs use a value called OS, matching against a string which is clearly not an OS, and is a toolkit not even installed probably at the wrong bit size. I prefer the isWindows from the extra library - but what you are doing is the Haskell expected pattern, I just don't like it, so go with whatever you prefer.
Even after copying the gcc binary into $TMP doing gcc -c main.c doesn't trace the read of main.c or the write of main.o. My guess is because gcc spawns a binary that is itself in system which doesn't get the copy treatment? Anything that can be done about that, short of turning off system protection?
I changed the script to work with GHC 8.0.2. A few minor issues which I'll pull request later, but the real issue was that two files define InterlockedAdd, namely:
I worked around that by adding:
#define __INTRINSIC_DEFINED__InterlockedAdd
#define __INTRINSIC_DEFINED__InterlockedAdd64
In hooks, which is vile. But is it acceptable? Or any other ideas how to avoid it?
First of all, thank you for the great tool!
It seems that fsatrace doesn't track reads from files that do not exist. For example, if the file 1.c does not exist then the command
fsatrace verwmdq 1.out -- gcc 1.c
does not list 1.c in the result 1.out, which is a problem for my use case. (Here is a blog post about my use case in case you are curious.)
Tested both on Windows and Linux.
How difficult would it be to add support for this?
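To make the request concrete, here is a small Python sketch of the behaviour being asked for: the attempt is logged whether or not the file exists. It says nothing about how fsatrace's hooks are implemented; `read_and_record` and the `r|` log line are just stand-ins for the example.

```python
import os
import tempfile


def read_and_record(path, log):
    """Try to read `path`, recording the attempt whether or not it succeeds."""
    # Log before opening, so a failed open still leaves a dependency entry.
    log.append("r|" + path)
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except FileNotFoundError:
        return None


with tempfile.TemporaryDirectory() as d:
    log = []
    missing = os.path.join(d, "1.c")  # never created
    print(read_and_record(missing, log) is None, len(log))
```

A build system that sees the entry can rerun the command once 1.c appears, which is the case that is currently lost.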
Given a 32bit binary, I tried with both cat and sleep from http://unxutils.sourceforge.net/, if I create foo.bat:
sleep 0s
Then do fsatrace rwm - -- cmd /c foo.bat and it fails with:
Fatal: src/win/inject.c:44: CreateProcessA(0, helper, 0, 0, 0, 0, 0, 0, &si, &pi), err: 8
Going in to the code and somewhat randomly changing things, if I change https://github.com/jacereda/fsatrace/blob/master/src/win/inject.c#L32 to be if (is32 && 0) then it works and seemingly traces correctly.
Looking at the code, perhaps you should be using the 64bit technique if either of yourself or the child is 64bit? Or perhaps you should try the else branch of GetProcAddress and only if that fails try using fsatracehelper?
For fac/bigbro, readdir is an important operation to trace. It's needed if you have a build rule such as:
echo *.c > file.dat
in which case the rule needs to be rebuilt if a new file is created in that directory.
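As a rough illustration of why the directory listing itself is a dependency, the following Python sketch fingerprints the names matching a pattern and shows the fingerprint changing when a new file appears. The function name is hypothetical, and `fnmatch` only approximates shell globbing (for instance it also matches dotfiles).

```python
import fnmatch
import hashlib
import os
import tempfile


def listing_fingerprint(directory, pattern):
    """Hash the sorted names in `directory` that match a shell-style pattern."""
    names = sorted(n for n in os.listdir(directory) if fnmatch.fnmatch(n, pattern))
    return hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "a.c"), "w").close()
    before = listing_fingerprint(d, "*.c")
    open(os.path.join(d, "b.c"), "w").close()  # a new source file appears
    after = listing_fingerprint(d, "*.c")
    print(before != after)  # prints True
```

No existing file was read or modified between the two calls, so only a traced readdir would tell the build system that the rule is out of date.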
I tried what I thought would be the simplest possible example trace on Mac (with SIP turned off; see below), but I only saw a read of the binary I used, not any read/write events associated with the arguments.
$ csrutil status
System Integrity Protection status: disabled.
$ cd $(mktemp -d)
$ touch test_file
$ fsatrace vrwmd - -- cp test_file test_file.copy
argv[0]=cp
argv[1]=test_file
argv[2]=test_file.copy
r|/bin/cp
$
Not sure if this is possible, but it would be good if fsatrace could write some information whenever a file had its modification time queried, something like q|filename. If this worked, then it would be very easy to speed up slow rebuild checkers, using Shake. For example, on my system cabal build takes 0.625s, but in certain circumstances, ghc --make can take > 1 min. If you could run fsatrace - -- cabal build, and then capture everything it reads and queries, and only rerun if any of that changes, you could reduce the rebuild time to 0.01s. The tool ghc-make already does that using custom logic for ghc --make, but with fsatrace it could be totally generic and I could kill ghc-make entirely.
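A minimal sketch of the generic rebuild check described above, in Python rather than Shake, assuming the list of read and queried paths has already been extracted from a trace. `snapshot` and `needs_rerun` are hypothetical names and the file names are invented.

```python
import os
import tempfile


def snapshot(paths):
    """Map each path to its modification time in ns, or None if it is missing."""
    result = {}
    for path in paths:
        try:
            result[path] = os.stat(path).st_mtime_ns
        except (FileNotFoundError, NotADirectoryError):
            result[path] = None
    return result


def needs_rerun(previous):
    """True if any recorded path changed, appeared, or disappeared."""
    return snapshot(previous.keys()) != previous


with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "Main.hs")
    open(src, "w").close()
    recorded = snapshot([src, os.path.join(d, "missing.h")])
    print(needs_rerun(recorded))  # False: nothing changed since the snapshot
    os.utime(src, ns=(1, 1))      # simulate an edit by changing the mtime
    print(needs_rerun(recorded))  # True: the command should be rerun
```

The check only stats the recorded paths, which is why it can be so much cheaper than rerunning the build tool.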
On a GitHub Mac runner I do:
git clone https://github.com/jacereda/fsatrace.git .fsatrace
(cd .fsatrace && make)
fsatrace v - -- echo fsatrace works
But that fails with:
dyld[79178]: terminating because inserted dylib '/Users/runner/work/neil/neil/.fsatrace/fsatrace.so' could not be loaded: tried: '/Users/runner/work/neil/neil/.fsatrace/fsatrace.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'arm64e')), '/System/Volumes/Preboot/Cryptexes/OS/Users/runner/work/neil/neil/.fsatrace/fsatrace.so' (no such file), '/Users/runner/work/neil/neil/.fsatrace/fsatrace.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'arm64e'))
See https://github.com/ndmitchell/neil/actions/runs/9043421759/job/24851038411 for a complete trace.
Given a work Windows 7 64bit machine, with 32bit cygwin and a nasty anti-virus, I tried running some commands from my user directory (C:\User\myusername), using a local file list.txt. Observations:
fsatrace foo.txt -- cmd /c "type list.txt"
This segfaults trying to write to a null pointer.
fsatrace foo.txt -- cat list.txt
This never completes. It spawns an infinite number of fsatracehelper.exe processes. As it spawns each one, they become suspended, and a new one is spawned. I had to kill them with taskkill /FI "IMAGENAME eq fsatracehelper.exe" /f, but if I didn't know taskkill it would have required a reboot.
I have VS2008 on my machine. Rebuilding got further if I removed the stdint.h headers, which aren't available on older versions and don't actually seem to be required. After that I got the errors:
hooks.c(59) : error C2065: 'FILE_DIRECTORY_FILE' : undeclared identifier
hooks.c(61) : error C2065: 'FILE_DELETE_ON_CLOSE' : undeclared identifier
hooks.c(105) : warning C4013: 'NT_SUCCESS' undefined; assuming extern returning
LINK : fatal error LNK1181: cannot open input file 'ntdll.lib'
Not sure if they are solvable or not - VS2008 is quite old now.
Hello, I'm seeing a segfault in the traced app, when it's trying to write to the shared memory buffer back to fsatrace:
<segv>
#4 emitOp (oc=oc@entry=114, op1=<optimized out>, p2=p2@entry=0x0) at src/emit.c:118
#5 0x00007f6fa63525f3 in fdemit (c=c@entry=114, fd=fd@entry=16) at src/unix/fsatraceso.c:118
#6 0x00007f6fa6352937 in openat64 (fd=-100, p=<optimized out>, f=<optimized out>, m=<optimized out>) at src/unix/fsatraceso.c:269
I don't have much more info at this time, but from looking at the source, is this likely to be running past the end of the buffer?
From my reading of main(), all accesses are buffered in the shared memory buffer until the process is complete, and then written, correct? (no concurrent access)
https://github.com/jacereda/fsatrace/blob/master/src/fsatrace.c#L193-L203
And the default logsize is 1MB of text?
https://github.com/jacereda/fsatrace/blob/master/src/fsatrace.h#L4
Which can be overridden by setting the env var FSAT_BUF_SIZE?
Removing a symbolic link looks like removing the link's destination (but perhaps should not?).
Demonstration:
$ touch foo
$ ln -s foo bar
$ fsatrace erwdtmq /dev/stdout -- rm -f bar
r|/usr/bin/rm
q|/home/fangism/foo
d|/home/fangism/foo
Destination foo is unaffected.
I expected something more like:
r|/usr/bin/rm
q|/home/fangism/bar
d|/home/fangism/bar
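One possible explanation (an assumption on my part, not checked against the fsatrace source) is that the path is resolved through the symlink before it is logged. The Python sketch below, for a POSIX system, shows the difference: resolving bar yields foo, while the unlink itself only removes bar.

```python
import os
import tempfile


def remove_link_demo():
    """Create foo and a symlink bar -> foo, delete bar, report what happened."""
    with tempfile.TemporaryDirectory() as d:
        foo = os.path.join(d, "foo")
        bar = os.path.join(d, "bar")
        open(foo, "w").close()
        os.symlink("foo", bar)
        # realpath follows the link, so it names the destination, not the link.
        resolved = os.path.basename(os.path.realpath(bar))
        os.unlink(bar)  # removes the link only
        return resolved, os.path.lexists(bar), os.path.exists(foo)


print(remove_link_demo())  # ('foo', False, True)
```

If that is the cause, logging the unresolved final path component for delete operations would give the expected d|.../bar entry.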
fsatrace does not appear to trace the mkdir calls, which can create unexpected traces when working with temporary directories. Consider the Rust library tempfile. Here's a simple case where we create a temporary directory, which in turn creates a subdirectory, with a single file in it:
fn main() {
    let tmp = tempfile::TempDir::new().unwrap();
    let dir = tmp.path().join("dir");
    std::fs::create_dir(&dir).unwrap();
    std::fs::write(dir.join("hello"), b"hello").unwrap();
}
This produces the following trace:
r|/the-binary
w|/tmp/.tmp5JZCc9/dir/hello
r|/tmp/.tmp5JZCc9
r|/tmp/.tmp5JZCc9/dir
d|/tmp/.tmp5JZCc9/dir/hello
d|/tmp/.tmp5JZCc9/dir
d|/tmp/.tmp5JZCc9
We're not observing the creation of tmp/.tmp5JZCc9/dir, so when tempfile starts recursively deleting the temporary directory, the read call of tmp/.tmp5JZCc9/dir appears to be a unique access, even though the program fully created and cleaned up all these accesses.
To handle this, users of fsatrace could try to infer that these traces correspond to a directory tree by looking for successful accesses to subdirectories, but that won't work if we just create a directory and don't try to use it. For example, if we modify the previous code to remove the file write:
fn main() {
    let tmp = tempfile::TempDir::new().unwrap();
    let dir = tmp.path().join("dir");
    std::fs::create_dir(&dir).unwrap();
}
We will end up with this stream, that appears to access a file that wasn't created by the program:
r|/the-binary
r|/tmp/.tmpCSPGtB
r|/tmp/.tmpCSPGtB/dir
d|/tmp/.tmpCSPGtB/dir
d|/tmp/.tmpCSPGtB
I'd imagine that tracing the mkdir syscalls would add a w|/tmp/.tmpCSPGtB/dir event, which would allow us to infer that the program fully handled all these directory or file accesses.
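To show what a consumer could do with such events, here is a small Python sketch that flags paths read or deleted without an earlier write. The `w|` lines for the directories are hypothetical (fsatrace does not emit them today, per the report above), and `unexplained_paths` is a made-up helper.

```python
def unexplained_paths(trace_lines):
    """Return paths that are read or deleted without an earlier write event."""
    created = set()
    unexplained = []
    for line in trace_lines:
        op, _, path = line.partition("|")
        if op == "w":
            created.add(path)
        elif op in ("r", "d") and path not in created and path not in unexplained:
            unexplained.append(path)
    return unexplained


# The trace reported above.
observed = [
    "r|/the-binary",
    "r|/tmp/.tmpCSPGtB",
    "r|/tmp/.tmpCSPGtB/dir",
    "d|/tmp/.tmpCSPGtB/dir",
    "d|/tmp/.tmpCSPGtB",
]
# Hypothetical: the same run if mkdir calls were reported as "w" events.
with_mkdir = [observed[0], "w|/tmp/.tmpCSPGtB", "w|/tmp/.tmpCSPGtB/dir"] + observed[1:]

print(unexplained_paths(observed))
print(unexplained_paths(with_mkdir))
```

With the extra events only the binary itself is left as an outside input, which matches what the program actually does.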
It would be really useful to have a --verbose flag to say exactly what the argv arguments to spawn, or the single command line to CreateProcess, is. At the moment I'm having to guess it.
I'm trying to support the Shell command in Shake (see ndmitchell/shake#308). Part of the problem is that on Windows the built-in process stuff in Haskell does lots of quote mangling, without giving me the chance to opt out. If @foo.txt as a command line just read foo.txt and took the command line from there (as many programs already support) then I could skip the Haskell mangling and get exactly what I was after from fsatrace.
Some programs treat @foo.txt files as having one argument per line, others as one single line. I'd probably be tempted for Windows to join all lines with a space, and for Linux pass each line as a separate argument - that gives full flexibility and is simple.
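The two interpretations proposed above could look roughly like this in Python. This is a sketch of the proposal only; fsatrace has no such option as far as this thread shows, both function names are invented, and the handling of blank lines is left unspecified (they are passed through unchanged).

```python
def windows_command_line(text):
    """Windows: join all lines of the @file into one command line string."""
    return " ".join(text.splitlines())


def posix_argv(text):
    """Linux: every line of the @file becomes exactly one argument."""
    # Spaces inside a line are kept, so no quoting is needed for "my file.c".
    return text.splitlines()


print(windows_command_line("gcc -c\nmain.c\n"))
print(posix_argv("gcc\n-c\nmy file.c\n"))
```

Neither variant does any quote processing, which is the point: the caller decides the exact command line or argv.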
Not sure if this is a good idea, but make on Windows could run stack setup if it couldn't find the necessary compilers but can find stack. Would allow simplifying the instructions.
While taking a look around I saw in win.mk:
SRCS32=src/win/fsatracedll.c src/win/inject.c src/win/patch.c src/win/hooks.c src/emit.c src/win/shm.c src/win/handle.c src/win/utf8.c src/win/dbg.c src/win/inject.c
SRCS64=$(SRCS32) src/win/inject.c
So SRCS64 = SRCS32 + inject.c. But inject.c is already in SRCS32? Unlikely to be harmful, but I guess it's a mistake?
I'm seeing the following error in Linux/x86_64 using the latest git version of fsatrace:
$ fsatrace rwm /tmp/foo -- cabal unpack -v0 base-orphans-0.5.4
base-orphans-0.5.4/: setModificationTime: invalid argument (Bad file
descriptor)
fsatrace?�~�(1072): error: command failed with code 1
argv[0]=cabal
argv[1]=unpack
argv[2]=-v0
argv[3]=base-orphans-0.5.4
Is that a known problem? Am I doing something wrong?
Since 4f9f599 the first entry on $PATH is corrupted, which causes the Shake test suites to fail, e.g. https://travis-ci.org/ndmitchell/shake/jobs/574506269. The actual test adds shake_helper to the start of the $PATH. Since the above fsatrace changes that to stuff_fsatrace_needs;$PATH, and the path separator on Linux is : not ;, that corrupts the first entry in the $PATH. I guess if you want to take that route, then you should use : vs ; in a platform-specific way? Although encoding information in the $PATH freaks me out a lot (but I'm guessing you determined that nothing else would do before trying it...).
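The corruption and the platform-specific fix can be illustrated in Python, where `os.pathsep` is ":" on Linux and ";" on Windows. The helper name and the example directories are made up; this is not the fsatrace implementation.

```python
import os


def prepend_to_path(entry, path, sep=os.pathsep):
    """Put `entry` in front of an existing PATH-style string."""
    return entry + sep + path if path else entry


linux_path = "shake_helper:/usr/bin"

# Using ";" on Linux fuses the new entry with the old first entry.
broken = prepend_to_path("stuff_fsatrace_needs", linux_path, sep=";")
print(broken.split(":")[0])

# Using ":" keeps the original first entry intact.
fixed = prepend_to_path("stuff_fsatrace_needs", linux_path, sep=":")
print(fixed.split(":")[1])
```

The first print shows the fused entry stuff_fsatrace_needs;shake_helper, which is why shake_helper is no longer found.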
For Windows users, being able to download a release of fsatrace would be very handy, as compiling requires a bunch of things most Windows users don't have. I'm happy to generate the binary and share, but the GitHub releases page of this repo is the most natural place to host them.
https://docs.microsoft.com/en-us/windows/win32/etw/about-event-tracing - not sure if that would be faster or slower than Kernel hooking. There's a chance it might be simpler though. See https://github.com/lowleveldesign/wtrace for an example of building it up to a full tracing app. I measured 21% overhead using fsatrace on Windows (see https://ndmitchell.com/downloads/paper-build_scripts_with_perfect_dependencies-18_nov_2020.pdf S5.2), although some of that will have been spawning the fsatrace binary.
If Fsatrace could tell me about a file access before it occurred, Shake could do a need before it still built, which would make auto deps much more powerful. Is this feasible? Would require some kind of pipes based protocol probably - you write a line of stdout or to a file, Shake writes something back to say continue.
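A toy version of such a protocol, written in Python with a socket pair standing in for the pipes: the traced side announces each access as a line and blocks until it gets a reply. The `r|` announcement format, the "continue" reply and all function names are assumptions made for the sketch; nothing like this exists in fsatrace today.

```python
import socket
import threading


def traced_side(conn, paths, accessed):
    """Announce each access, then wait for one reply line before doing it."""
    with conn.makefile("r", encoding="utf-8") as replies:
        for path in paths:
            conn.sendall(("r|" + path + "\n").encode("utf-8"))
            if replies.readline().strip() != "continue":
                break
            accessed.append(path)  # stands in for the real file access
    conn.shutdown(socket.SHUT_WR)  # no more announcements


def build_side(conn, announced):
    """Handle each announcement (e.g. build the file first), then reply."""
    with conn.makefile("r", encoding="utf-8") as requests:
        for line in requests:
            announced.append(line.strip())  # where a build system could `need`
            conn.sendall(b"continue\n")


def run(paths):
    tracer_end, build_end = socket.socketpair()
    accessed, announced = [], []
    worker = threading.Thread(target=build_side, args=(build_end, announced))
    worker.start()
    traced_side(tracer_end, paths, accessed)
    worker.join()
    tracer_end.close()
    build_end.close()
    return accessed, announced


print(run(["a.h", "b.h"]))
```

Every access costs a round trip here, so a real design would probably want to make the blocking mode opt-in.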
As an example, given the go code:
package main

import (
    "fmt"
    "io/ioutil"
    "os"
)

func main() {
    b, err := ioutil.ReadFile(os.Args[1])
    if err != nil {
        fmt.Print(err)
    }
    fmt.Print(string(b))
}
Save that as main.go and compile it with go build -o main main.go. fsatrace does not detect the read. I believe the cause will be that go does not use dynamic libraries but jumps straight to syscalls.
I think there is a desire to add a C API, and I'd like to unify bigbro with fsatrace (under either one name or the other). How does the bigbro API look?
https://github.com/droundy/bigbro/blob/master/bigbro.h#L3
I could easily see creating a more fine-grained set of output (e.g. separating stat into a separate array), and also allowing null pointers for output that is not desired.
I don't know how portable to windows the file descriptor approach is for redirecting stdout and stderr. Also, this API doesn't support setting the environment for the child, so if that is important, we'd need another argument. Finally, returning the child PID seems important in terms of fac's usage, but I'm not sure how that will work on windows. Maybe we create a second helper function kill_children? Sounds racy.
Another question is how to support the "blocking" mode that Neil wants.