cshatag's People

Contributors

bjornfor, duncanbarth, es80, nicoulaj, oriansj, p-, rfjakob, teeed

cshatag's Issues

Installation problem when target directory doesn't already exist

I noticed that if the /usr/local/bin directory doesn't already exist and you install cshatag with make install, the directory is not created and the executable is copied in its place. So instead of /usr/local/bin/cshatag, you end up with the executable at /usr/local/bin (a file, not a directory).

thanks for this

I couldn't get shatag working after about an hour of trying, with extended attributes etc. and verbose output that made no sense, like <missing>.

Anyway this worked out of the box and I can see them via getfattr

How do I compile for ARM using the makefile? I ran this and it worked fine (and I am using it already, just wondering about the makefile):
env GOOS=linux GOARCH=arm GOARM=7 go build .

but I read the makefile and have no idea what's going on in there.

also another question:
https://ostechnix.com/how-to-edit-a-file-without-changing-its-timestamps-in-linux/
Doing this makes cshatag think the file is corrupt. Other things also sometimes change a file's content but not its mtime. Is there any way of avoiding this, or not really?
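
For illustration, the situation in this question can be reproduced with a short Python sketch. It is not cshatag's code; `snapshot`, `edit_preserving_mtime` and `demo` are hypothetical helper names. It only shows that an edit which restores the old mtime leaves a file whose content hash differs while its timestamp is identical, which is the combination other reports on this page show as <corrupt>.

```python
import hashlib
import os
import tempfile


def snapshot(path):
    """Return the file's (SHA-256 hex digest, mtime in nanoseconds)."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return digest, os.stat(path).st_mtime_ns


def edit_preserving_mtime(path, new_text):
    """Rewrite the file, then restore its previous atime and mtime."""
    before = os.stat(path)
    with open(path, "w") as f:
        f.write(new_text)
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))


def demo():
    """Tag a temporary file, edit it with its mtime preserved, re-read it."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "note.txt")
        with open(path, "w") as f:
            f.write("original\n")
        stored = snapshot(path)            # what a tagger would have recorded
        edit_preserving_mtime(path, "edited\n")
        return stored, snapshot(path)      # same mtime, different hash
```

From the stored tag alone, such an edit cannot be told apart from silent corruption, because both leave the timestamp untouched.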

Lots of errors are thrown when run on ExFAT drive

Background: I'm a first-time user of cshatag, so please let me know if this is just user error. I am on an M1 Mac and have an external hard drive that's ExFAT formatted.

Description of Error:

  1. cshatag seems to run fine when creating checksums using cshatag -recursive, and I can see that the user.shatag.ts and user.shatag.sha256 attributes are written to files when I check using xattr -l.
  2. However, when I run cshatag -recursive -qq then I see lots of errors because of files that start with ._, because it says "operation not permitted". I tried to remove these dot files to see if cshatag can still check for errors, but that also removes the extended attributes.
  3. cshatag -qq runs fine if I check just one file.

Question:

  1. It looks like the extended attributes are written to dot files, and these dot files are throwing errors when cshatag runs a check. Is this a problem because I'm using an ExFAT drive on a Mac?
  2. Maybe there's a setting that I've overlooked?

Thanks for helping/explaining how I should be using cshatag
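
On filesystems without native extended-attribute support, macOS keeps the attributes in AppleDouble sidecar files named `._<filename>`, which matches the description above. A possible workaround, sketched in Python under the assumption that skipping those sidecar files is acceptable, is to filter them out before checking. `files_to_check` is a hypothetical helper, not a cshatag option.

```python
import os


def files_to_check(root):
    """Yield file paths under root, skipping AppleDouble '._' sidecar files."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            if name.startswith("._"):
                continue  # sidecar holding another file's attributes
            yield os.path.join(dirpath, name)
```

The resulting paths could then be passed to cshatag one by one, since checking a single file is reported above to work.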

Strange behavior on MacOS + SMB share

While working on a MacOS port, I just found a weird bug when updating tags on a Samba share. I am wondering if this bug is introduced by some MacOS + SMB interaction, or if it is preexisting. Here is the behavior I am seeing:

When updating an outdated tag on the Mac's main filesystem, the update works, and the next run of cshatag reports <ok>:

yemartin at iMac in ~/src/cshatag-master
$ cshatag test.bin
<outdated> test.bin
 stored: d8c2284963814db0cc2e9d49d96af3139bbc1fee0c43f6f45004863c8e10bdc5 1560059961.626701323
 actual: c2bd3e9a8195e2fe3ed99864752c5edb449b8c43beb7a51dcd3b6e28258b955b 1560060120.445768395

yemartin at iMac in ~/src/cshatag-master
$ cshatag test.bin
<ok> test.bin

When doing the same thing on an SMB-mounted Samba share, the update does not work: instead, the xattrs are removed. The next run of cshatag reports a missing tag ("stored: 00000..."), and this time, it sets the xattrs successfully. Another (third) run of cshatag reports <ok>:

yemartin at iMac in /Volumes/Organizer
$ cshatag test.bin
<outdated> test.bin
 stored: d8c2284963814db0cc2e9d49d96af3139bbc1fee0c43f6f45004863c8e10bdc5 1560059979.000000000
 actual: c2bd3e9a8195e2fe3ed99864752c5edb449b8c43beb7a51dcd3b6e28258b955b 1560060125.000000000

yemartin at iMac in /Volumes/Organizer
$ cshatag test.bin
<outdated> test.bin
 stored: 0000000000000000000000000000000000000000000000000000000000000000 0000000000.000000000
 actual: c2bd3e9a8195e2fe3ed99864752c5edb449b8c43beb7a51dcd3b6e28258b955b 1560060125.000000000

yemartin at iMac in /Volumes/Organizer
$ cshatag test.bin
<ok> test.bin

Is this a known issue with SMB or Samba? If not, can someone confirm whether this can be reproduced with Linux + Samba share?

(In case it matters: the protocol negotiated between the Mac and Samba server was SMB_3.02)

Thank you.

Distinguish between outdated and new files

Hello! cshatag is a really useful program. One thing that would make it more useful for automated monitoring is to have the reporting differentiate between new files and outdated ones.

Right now, the event 'outdated' covers two situations: when the checksum in the attribute changes, and if there is no checksum at all (since it assumes a zeroSha256 by default).

I propose a new event 'new' that gets returned when the attributes don't exist and cshatag is calculating them for the first time.
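
The proposal could look roughly like the following Python sketch. It is only an illustration of the idea, not cshatag's implementation; the `classify` function and the use of `None` for "no attributes present" are assumptions made for the example.

```python
def classify(stored, actual):
    """Classify a file from its stored and actual (sha256, timestamp) pairs.

    `stored` is None when the file carries no checksum attributes at all,
    instead of being silently replaced by an all-zero checksum.
    """
    if stored is None:
        return "new"
    if stored == actual:
        return "ok"
    stored_sha, stored_ts = stored
    actual_sha, actual_ts = actual
    if stored_sha != actual_sha and stored_ts == actual_ts:
        return "corrupt"   # content changed, timestamp did not
    return "outdated"      # everything else keeps the existing event
```

Keeping "missing" separate from "all zeros" is what lets a monitoring script tell a first run apart from a real change.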

I'm not really a developer, but I could try to throw together an implementation and submit a pull request if you'd like.

Feature request: Dry run cshatag

To my understanding, if either the mtime or the checksum of a file (or both) have changed, "the status of the file is printed to stdout and the stored checksum is updated" (this last part is taken directly from the readme). This means that if something happened to your file that you weren't aware of (for example, if you edited it by mistake), you'll only find out when running cshatag the first time. The second time, the checksum will already have been updated and everything looks normal. I can foresee many situations where this might be a problem, for example if your computer loses power before you have a chance to inspect the output or even if you forget that you've run the program and run it again.

In rsync, you can dry run the program using the -n flag. This means that you get to see all error messages, warnings, et cetera, without actually executing the core function of the program, namely to copy files. Would it be desirable to have something similar implemented in cshatag?
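
A minimal Python sketch of what a dry run would change, assuming a plain dict stands in for the extended attributes. The `check` function and its `dry_run` parameter are hypothetical names used for illustration, not existing cshatag flags.

```python
def check(store, path, actual, dry_run=False):
    """Compare `actual` against the tag in `store` and report a status.

    `store` maps a path to its stored (sha256, mtime) pair and stands in
    for the extended attributes. With dry_run=True it is never modified.
    """
    stored = store.get(path)
    if stored == actual:
        return "ok"
    if stored is not None and stored[0] != actual[0] and stored[1] == actual[1]:
        status = "corrupt"
    else:
        status = "outdated"
    if not dry_run:
        store[path] = actual  # "the stored checksum is updated"
    return status
```

With `dry_run=True`, repeated runs keep reporting the same status, so the report is not lost if the first run's output goes unread.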

Updating after corrupt file detected

Hi,

Firstly excellent tool. I have used a few of these/similar tools, the last one saving to a file per directory but I wanted something that moved with the files. Anyway I know this is me not understanding, so what I have done:

  1. Create a test file with a known timestamp and run cshatag
echo test_1 > test
touch -t 202301300000 test
cshatag test

I get the result as I would expect:

<new> test
 stored: 0000000000000000000000000000000000000000000000000000000000000000 0000000000.000000000
 actual: 5a18f75b3ce3ed6550c33f23bb21f833bd63a159cb592a272fd1c61f98de5111 1675036800.000000000
  2. I update the file:
echo test_2 > test
touch -t 202301300001 test 
cshatag test

And as expected get the output:

<outdated> test
 stored: 5a18f75b3ce3ed6550c33f23bb21f833bd63a159cb592a272fd1c61f98de5111 1675036800.000000000
 actual: 8f1d878efe7586c55c8f0d7578ac59efda6831778eb5fba5f68b2f21a3519609 1675036860.000000000
  3. Simulate some corruption:
 echo test_3 > test
 touch -t 202301300001 test
 cshatag test

And as expected it is detected.

 Error: corrupt file "test"
<corrupt> test
 stored: 8f1d878efe7586c55c8f0d7578ac59efda6831778eb5fba5f68b2f21a3519609 1675036860.000000000
 actual: 8f89c43b0cd072e7127bcf26635d4e2febdacbb737bdb44f797e4e96b2408d73 1675036860.000000000
  4. Now I don't touch anything and rerun the command 'cshatag test'. I expect to see the same error as above (3), but instead I get:
 <ok> test

I know this is the expected result according to the 'run_tests.sh' script you have. However I am failing to see why. If a file is corrupt then surely the attribute should not get updated, wouldn't you want it to keep showing as corrupt?

Feature request: add --untag flag

Currently one has to read the man page to know how to undo / untag files previously tagged with cshatag. (And the command example given feels like it leaks cshatag implementation details.) Perhaps adding a --untag flag (or --undo?) would help?

At the same time, we might want to add a --help flag too, to make the new flag easier to find.

Reduce time stamp precision, take two

This is a follow-up to #12 that was closed with the introduction of <timechange>. But unfortunately, <timechange> does not solve the original problem:

cshatag still cannot detect bit corruption that happens during move or copy operations between two filesystems with different timestamp precisions.

With the same example as in #12, with one added command to simulate corruption during transfer, and with:

  • /tmp on my root filesystem (APFS)
  • /Volumes/Organizer from my NAS, mounted through SMB (SMB_3.1.1)
$ rm /Volumes/Organizer/test.bin \
; touch /tmp/test.bin \
&& cshatag -qq /tmp/test.bin \
&& mv /tmp/test.bin /Volumes/Organizer/ \
&& echo 'CORRUPTION' >> /Volumes/Organizer/test.bin \
&& cshatag /Volumes/Organizer/test.bin

Current behavior

<outdated> /Volumes/Organizer/test.bin
 stored: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1641197558.029917810
 actual: 4ef8ee0f9aaecb1597f22dfd7667af4a9b537e11e3aba08729647a882f9aff6e 1641197558.000000000

Expected behavior

<corrupt> /Volumes/Organizer/test.bin
 stored: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1641197558.029917810
 actual: 4ef8ee0f9aaecb1597f22dfd7667af4a9b537e11e3aba08729647a882f9aff6e 1641197558.000000000

<timechange> was a nice introduction for when the data has not changed. But when the data did change, we still need to ignore small time differences below a certain threshold, to differentiate between a legitimate <outdated>, and a <corrupt> file.

Suggested implementation

As per the discussion in #12, I suggest:

  • ignoring time differences of 2 seconds or less (*1)
  • for this to be the default behavior (*2)
  • and if necessary, to add a command line option to use the original exact timestamp comparison, or even specify a custom threshold.

*1: FAT has a 2-second precision on last-modified time
*2: With this new behavior as default, users may get a harmless false positive, but the file content is still good. If the behavior is opt-in, users would get false negatives, meaning corruption would go undetected.

Note: to get the false positive, the user would need to make a legitimate edit within 2 seconds of running cshatag against a given file, which is quite unlikely. And if it does happen, the file content is good anyway, so no harm done.

New status or not?

To keep things simple, I suggest we just do, when data has changed:

  • if time_delta <= threshold: corrupt
  • else (i.e. time_delta > threshold): outdated

but we can also consider introducing a new status, something like:

  • if time_delta == 0: corrupt
  • else if time_delta <= threshold: suspicious
  • else (i.e. time_delta > threshold): outdated
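
Both variants above can be written down as a small Python sketch. The function name `classify_changed` and the `use_suspicious` switch are made up for the example; the 2-second default comes from the FAT precision mentioned in this issue. The function covers only the case where the data has changed.

```python
def classify_changed(time_delta, threshold=2.0, use_suspicious=False):
    """Classify a file whose data changed, given the timestamp difference.

    `time_delta` is stored time minus actual time, in seconds.
    """
    delta = abs(time_delta)
    if not use_suspicious:
        # Simple variant: small differences count as "same time".
        return "corrupt" if delta <= threshold else "outdated"
    # Three-way variant with the extra status.
    if delta == 0:
        return "corrupt"
    if delta <= threshold:
        return "suspicious"
    return "outdated"
```

For the example in this issue the difference is about 0.03 seconds, so the simple variant reports corrupt and the three-way variant reports suspicious.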

What do you think?

Memory problems

I noticed the return values from malloc are not checked in case the system runs out of memory (which really can happen).
And the allocated memory isn't freed. Now this is not a big problem, since the program can only process one file at a time. But it is not clean.
Also, when the argument is a file which already has a checksum, valgrind reports a "Conditional jump or move depends on uninitialised value".

If this project isn't actively maintained, I could fix those problems, provided I find the time to do that.

Test file is always corrupt

cshatag always reports corruption when I create an empty foo file for test purposes:

$ touch foo
fturco@desktop ~ 21:27:22 0 $ cshatag foo
<outdated> foo
 stored: 0000000000000000000000000000000000000000000000000000000000000000 0000000000.000000000
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1550262442.004366404
fturco@desktop ~ 21:27:25 0 $ cshatag foo
Error: corrupt file "foo"
<corrupt> foo
 stored: 0000000000000000000000000000000000000000000000000000000000000000 1550262442.004366404
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1550262442.004366404

Some more details:

$ getfattr foo 
# file: foo
user.shatag.sha256
user.shatag.ts

I'm using the latest cshatag version from this git repository on a Gentoo Linux system.
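
As a quick sanity check on the output above: the "actual" value is the SHA-256 of zero bytes, so the hash of the empty file is computed correctly. What stands out in the second run is that the stored timestamp was updated while the stored checksum is still all zeros. A short Python check of the first point (the variable name is only for illustration):

```python
import hashlib

# SHA-256 of an empty input, as produced for a freshly touched file.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()
print(EMPTY_SHA256)
```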

Reduce time stamp precision

The really nice thing about cshatag, compared to other tags file solutions like chkbit, is that the tag follows the file along when the file is moved or copied, as long as the destination filesystem supports extended attributes.

But this unfortunately breaks when the time resolution of the target filesystem is lower than that of the original filesystem. This prevents detecting bit corruption that happened during move or copy operations.

For example, using the Go rewrite, and with:

  • /tmp on my root filesystem (APFS)
  • /Volumes/Organizer from my NAS, mounted through SMB (SMB_3.02)
$ rm /Volumes/Organizer/test.bin \
; touch /tmp/test.bin \
&& cshatag /tmp/test.bin \
&& mv /tmp/test.bin /Volumes/Organizer/ \
&& cshatag /Volumes/Organizer/test.bin

remove /Volumes/Organizer/test.bin? y
<outdated> /tmp/test.bin
 stored: 0000000000000000000000000000000000000000000000000000000000000000 0000000000.000000000
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.563117837
<outdated> /Volumes/Organizer/test.bin
 stored: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.563117837
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.000000000

The second cshatag call, on the SMB share, considers the tag outdated. If corruption had happened during the move operation, cshatag would have missed it.

Suggestion: if I remember correctly, FAT is probably the lowest common denominator, with a 2-second timestamp resolution. So to ensure maximum compatibility, cshatag should consider the file unchanged if the file timestamp is within +/- 2 seconds of the tag timestamp.

So, to sum it up bug-report style:

Current behavior

<outdated> /Volumes/Organizer/test.bin
 stored: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.563117837
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.000000000

Expected behavior

<ok> /Volumes/Organizer/test.bin

Do you think this makes sense, and that this is possible to add to the Go rewrite?
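
To make the suggestion concrete, here is a Python sketch of a tolerant timestamp comparison. It is an illustration only, not the Go code; `parse_ts` and `same_time` are hypothetical names, and the timestamp format "seconds.nanoseconds" is taken from the output above.

```python
NS_PER_SECOND = 1_000_000_000


def parse_ts(text):
    """Convert a 'seconds.nanoseconds' string to integer nanoseconds."""
    sec, _, frac = text.partition(".")
    frac = frac.ljust(9, "0")[:9]  # pad or cut the fraction to 9 digits
    return int(sec) * NS_PER_SECOND + int(frac)


def same_time(stored, actual, tolerance_ns=2 * NS_PER_SECOND):
    """True if the two timestamps differ by at most the tolerance."""
    return abs(parse_ts(stored) - parse_ts(actual)) <= tolerance_ns
```

Working in integer nanoseconds avoids the rounding that floating-point seconds would introduce at this magnitude. With the values from the report, `same_time("1561415148.563117837", "1561415148.000000000")` holds, so the moved file would count as unchanged.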

Storing attributes to special file that's immutable, with a timestamp

Hello.

Thanks for your work, works great.

We run Linux LX zones in illumos VMs (OmniOS), and the base filesystem there is ZFS. ZFS on Linux (ZoL) already has a fix for the use of xattrs, but this fails on the LX VMs because xattrs are not accessible in the VM, due to the way ZFS stores file attributes. I'd like to propose a workaround: kindly allow, as an option, storing the attributes in a hidden/immutable file in the same directory as the file.

The attached screenshot is from the OmniOS gitter discussion group.

Thanks.

Screenshot 2021-04-12 at 23 31 23

Ambiguous licensing

The readme says

COPYRIGHT
Copyright 2012 Jakob Unterwurzacher. License GPLv2+.

but the LICENSE file contains the MIT license. Under which license is this project now?
