asdfjkl / yahb

Deduplicating File-Copy/Backup Tool (Commandline)

Home Page: https://github.com/asdfjkl/yahb

License: GNU General Public License v3.0

C# 95.30% Inno Setup 4.70%
ntfs hardlink backup commandline volume-shadow-copy rsync-backup rsyncbackup windows

yahb's People

Contributors

asdfjkl, creedflan738, thomasschroeder

yahb's Issues

Check for double-call of yahb

If, by design or erroneously, yahb is called twice in the same minute, it treats the backup folder as both the source for the hardlink-similarity check and the target folder. I guess it would be easy to check for that.
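
A minimal sketch of such a check, not yahb's actual code. It assumes the per-run folder is named after the start time with minute resolution (yyyyMMddHHmm, as folder names like 202004031108 in other reports suggest); `DoubleCallGuard`, `RunFolderName` and `WouldCollide` are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.IO;

static class DoubleCallGuard
{
    // Assumption: the run folder is named yyyyMMddHHmm, so two runs started
    // within the same minute map to the same folder name.
    public static string RunFolderName(DateTime now)
    {
        return now.ToString("yyyyMMddHHmm", CultureInfo.InvariantCulture);
    }

    // True when a run started at 'now' would write into a folder that
    // already exists under the backup root.
    public static bool WouldCollide(string backupRoot, DateTime now)
    {
        string target = Path.Combine(backupRoot, RunFolderName(now));
        return Directory.Exists(target);
    }
}
```

A caller could abort (or wait for the next minute) when `WouldCollide` returns true, before the hardlink-similarity check picks a previous backup.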

Error: Division by zero

Hello,

I wanted to try yahb, but it did not really work.

Scenario: In Windows 10 a share from my Synology NAS is mounted as J:, it contains a directory named Backups. (Maybe it's important: Windows 10 runs in a virtual box).

I tried
yahb c:\Users\joerg\Documents j:\Backups /s

Output:
creating list of directories ...
ERR:c:\Users\joerg\Documents\Eigene Videos:Access to the path "c:\Users\joerg\Documents\Eigene Videos" is denied.
ERR:c:\Users\joerg\Documents\Eigene Musik:Access to the path "c:\Users\joerg\Documents\Eigene Musik" is denied.
ERR:c:\Users\joerg\Documents\Eigene Bilder:Access to the path "c:\Users\joerg\Documents\Eigene Bilder" is denied.
creating list of directories ... DONE
creating list of files ...
creating list of files ... DONE
unable to identify a previous backup location, copying all
copying files: [ ] 0%
Unhandled Exception: System.DivideByZeroException: Attempted to divide by zero.
at yahb.CopyModule.doCopy()
at yahb.Program.Main(String[] args)

Result: The backup directory is created as expected:
j:\Backups\202004031108\c__\Users\joerg\Documents
But this contains only the directories of the source, which are all empty. No files were copied.

Am I making a mistake here? Should I change or check something? Or is there a bug?

I hope my comment helps to improve your project,
regards and good health,
Jörg
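
The trace only shows that the division happens somewhere in doCopy(), right after the first progress line. One plausible place is the progress calculation when the total is zero. The following is a sketch under that assumption; `Progress.Percent` is a hypothetical helper, not part of yahb.

```csharp
using System;

static class Progress
{
    // Percentage of work done, clamped to 0..100.
    // A total of zero (nothing to copy) is treated as complete instead of dividing by it.
    public static int Percent(long done, long total)
    {
        if (total <= 0)
        {
            return 100;
        }
        int percent = (int)(100.0 * done / total);
        return Math.Min(100, Math.Max(0, percent));
    }
}
```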

Backup failed, SystemIO error

The backup fails after about 68% has completed successfully; the full error output is below. There is enough free space on the drive.

unable to create hardlink, copying instead

Unhandled Exception: System.IO.IOException: Insufficient system resources exist to complete the requested service.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.__ConsoleStream.Write(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
at System.IO.StreamWriter.Write(Char[] buffer, Int32 index, Int32 count)
at System.IO.TextWriter.SyncTextWriter.WriteLine(String value)
at System.Console.WriteLine(String value)
at yahb.Config.addToLog(String message)
at yahb.CopyModule.doCopy()
at yahb.Program.Main(String[] args)

report errors

Currently the old setting is still in effect (i.e. errors are reported only in verbose mode); reverse this behaviour.

yahb dies with "UnauthorizedAccessException"

I guess the file should be skipped and the error logged.

If you fix this error, I would be really thankful for a Win7 release. (It would be a pity if I couldn't use this very nice solution...)

yahb C:\ K:\YAHB\C /vss /+log:K:\YAHB\C.log /s /xf:*.tmp;tmp;temp
copying files: [####      ]  40%  ETR: 03:18:53
Unhandled Exception: System.UnauthorizedAccessException: Access to the path "C:\Users\All Users\Microsoft\Diagnosis\events00.rbs" is denied.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.File.InternalCopy(String sourceFileName, String destFileName, Boolean overwrite, Boolean checkHost)
   at yahb.CopyModule.doCopy()
   at yahb.Program.Main(String[] args)
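
The "skip and log" behaviour suggested above could look roughly like this. It is only a sketch: `SafeCopy.TryCopy` and the `log` callback are hypothetical, and the ERR: prefix merely imitates the format seen in yahb's output.

```csharp
using System;
using System.IO;

static class SafeCopy
{
    // Copies one file. On an access or I/O error the file is skipped:
    // the error is passed to 'log' and false is returned instead of throwing.
    public static bool TryCopy(string source, string dest, Action<string> log)
    {
        try
        {
            File.Copy(source, dest, true);
            return true;
        }
        catch (UnauthorizedAccessException ex)
        {
            log("ERR:" + source + ":" + ex.Message);
            return false;
        }
        catch (IOException ex)
        {
            log("ERR:" + source + ":" + ex.Message);
            return false;
        }
    }
}
```

The copy loop would then continue with the next file and could count the failures for a final summary.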

Problem with R/O Files

There is a problem with (incremental?) backups if the source file is flagged read-only.
Even with admin rights an error is generated. If one removes the read-only flag from the source file, everything is fine:

Unhandled Exception: System.UnauthorizedAccessException: Access to the path "w:\BACKUP\202002182353\d__\test.pdf" is denied.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize)
at System.IO.File.OpenFile(String path, FileAccess access, SafeFileHandle& handle)
at System.IO.File.SetCreationTimeUtc(String path, DateTime creationTimeUtc)
at System.IO.FileSystemInfo.set_CreationTimeUtc(DateTime value)
at System.IO.FileSystemInfo.set_CreationTime(DateTime value)
at yahb.CopyModule.doCopy()
at yahb.Program.Main(String[] args)
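
The trace shows the failure while setting the creation time on the copied file in the backup folder, which suggests that the read-only flag was carried over to the copy. A possible workaround, sketched here under that assumption, is to clear the flag on the copy temporarily. `Timestamps.CopyTimes` is a hypothetical helper.

```csharp
using System;
using System.IO;

static class Timestamps
{
    // Copies creation and last-write time from source to dest.
    // If dest is read-only, the flag is cleared for the update and restored afterwards.
    public static void CopyTimes(string source, string dest)
    {
        FileAttributes attrs = File.GetAttributes(dest);
        bool readOnly = (attrs & FileAttributes.ReadOnly) != 0;
        if (readOnly)
        {
            File.SetAttributes(dest, attrs & ~FileAttributes.ReadOnly);
        }
        try
        {
            File.SetCreationTimeUtc(dest, File.GetCreationTimeUtc(source));
            File.SetLastWriteTimeUtc(dest, File.GetLastWriteTimeUtc(source));
        }
        finally
        {
            if (readOnly)
            {
                File.SetAttributes(dest, attrs);
            }
        }
    }
}
```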

return an errorlevel

If something goes wrong it would be helpful to get back an errorlevel <> 0.

greetings gmlltg
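
One simple way to do this is to map the number of errors recorded during a run to the process exit code. The sketch below is illustrative only; the `ExitCodes` class and its values are assumptions, not yahb's behaviour.

```csharp
static class ExitCodes
{
    public const int Ok = 0;
    public const int CompletedWithErrors = 1;

    // Maps the number of errors recorded during the run to an exit code.
    public static int FromErrorCount(int errorCount)
    {
        return errorCount == 0 ? Ok : CompletedWithErrors;
    }
}
```

Main would then be declared as `static int Main(string[] args)` and end with `return ExitCodes.FromErrorCount(errors);`, so that a batch file can test `%ERRORLEVEL%`.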

System.ArgumentOutOfRangeException: Not a valid Win32 FileTime

Thank you for this nice piece of code!
Unfortunately, on the second run of a backup, I receive this error message after a few percent:


copying files: [## ] 21% ETR: 00:03:43
Unhandled Exception: System.ArgumentOutOfRangeException: Not a valid Win32 FileTime.
Parameter name: fileTime
at System.DateTime.FromFileTimeUtc(Int64 fileTime)
at yahb.CopyModule.doCopy()
at yahb.Program.Main(String[] args)

Unfortunately, the log file contains no information about which file is affected.

Can you help me with this?
Thank you very much in advance!
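
DateTime.FromFileTimeUtc throws ArgumentOutOfRangeException when the raw value is outside the representable range, which can happen with corrupt file timestamps. A defensive wrapper might look like the sketch below; `FileTimes.TryFromFileTimeUtc` is a hypothetical name, and how yahb obtains the raw value is not shown in the trace.

```csharp
using System;

static class FileTimes
{
    // Converts a raw Win32 FILETIME value to a UTC DateTime.
    // Returns false (and DateTime.MinValue) instead of throwing on an invalid value.
    public static bool TryFromFileTimeUtc(long fileTime, out DateTime result)
    {
        try
        {
            result = DateTime.FromFileTimeUtc(fileTime);
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            result = DateTime.MinValue;
            return false;
        }
    }
}
```

On failure, the caller could log the affected file name and fall back to copying the file, which would also answer the question of which file is involved.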

YAHB doesn't create Hardlinks

My drive is NTFS and I'm using Windows 10. Everything looks normal, but the space used shows that no hardlinks have been created. I used DU by Sysinternals to check. The -u flag gives the same count and size as running without it.

Control over memory usage

I use yahb to back up the results of an indefinitely long-running algorithm. Right now it has 7 GB of results in a folder, which are refined again and again and keep growing. As loading and saving take additional time, I designed my program to carefully keep ~3 GB of the currently used results in memory.

YAHB takes ~800 MB to back up that folder. Most often this results in >4 GB total memory usage, so Windows starts using the page file. Of course I could tell my program to use less memory, but that would slow it down even more.

Best would be an option to tell YAHB to use a maximum of X MB for operation, like 500 MB in this case.

Copying too slow

I accidentally backed up to a new folder, so all files were copied. This took exactly 5 hours. Copying all files with robocopy takes ~30 minutes, so copying is much slower than strictly necessary.

Idea: If it's too difficult to make your algorithm more efficient, you could let it create all the hardlinks and then copy the remaining files with robocopy.

System.IO.IOException on yahb.CopyModule.createDirectoryList()

There is a problem with recent Adobe Acrobat Licensing Service that makes yahb and other backup software break when trying to access the log folders of the software for backup purposes. This has been reported to Adobe and will hopefully be worked on (https://community.adobe.com/t5/illustrator-discussions/com-adobe-dunamis-folder-cannot-be-backed-up-by-backup-solutions/td-p/14559757)

However, it would probably be possible to fix this (or work around it) in yahb as well in case this happens in the future with this or other software.

Here is the output from yahb just before the crash

Unhandled Exception: System.IO.IOException: The process cannot access the file "C:\Users\flo\AppData\Roaming\com.adobe.dunamis\d225650b-9c1f-4738-97e3-94e805951049\v1\0" because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileSystemEnumerableIterator`1.CommonInit()
   at System.IO.FileSystemEnumerableIterator`1..ctor(String path, String originalUserPath, String searchPattern, SearchOption searchOption, SearchResultHandler`1 resultHandler, Boolean checkHost)
   at System.IO.Directory.EnumerateDirectories(String path)
   at yahb.CopyModule.createDirectoryList()
   at yahb.Program.Main(String[] args)

Whatever Acrobat Licensing Service does here, there is currently no catch for IOException in createDirectoryList(). So maybe add this and keep working on the remaining directories as is already done for some other exceptions.
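
As a rough sketch of that idea (not yahb's actual createDirectoryList()), a directory walk could catch IOException per directory and carry on. `DirectoryWalker.ListDirectories` and the `log` callback are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DirectoryWalker
{
    // Collects all subdirectories below 'root'. A directory that cannot be
    // enumerated is logged and skipped; the walk continues with the rest.
    public static List<string> ListDirectories(string root, Action<string> log)
    {
        var result = new List<string>();
        var pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            string currentDir = pending.Pop();
            List<string> subdirs;
            try
            {
                // Copying into a list forces the enumeration to finish inside the try block.
                subdirs = new List<string>(Directory.EnumerateDirectories(currentDir));
            }
            catch (UnauthorizedAccessException ex)
            {
                log("ERR:" + currentDir + ":" + ex.Message);
                continue;
            }
            catch (IOException ex)
            {
                log("ERR:" + currentDir + ":" + ex.Message);
                continue;
            }
            foreach (string dir in subdirs)
            {
                result.Add(dir);
                pending.Push(dir);
            }
        }
        return result;
    }
}
```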

Maybe if I get this right we could also try a fix with different EnumerationOptions.
Update: Turns out after getting to the point in the .NET code where the exception is thrown we do not have an option to prevent this with different EnumerationOptions. I'll leave the analysis here anyway in case someone wants to follow the thought process.

yahb calls EnumerateDirectories() with only one argument, the directory to start from

subdirs = new List<string>(Directory.EnumerateDirectories(currentDir));

This means that the unused parameters are filled with default options, the EnumerationOptions being set to EnumerationOptions.Compatible
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Directory.cs#L216

Notably EnumerationOptions.Compatible means that IgnoreInaccessible is set to false
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/EnumerationOptions.cs#L20-L21

We thus end up calling EnumerateDirectories() with all available parameters internally
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Directory.cs#L223-L224

Which then calls InternalEnumeratePaths() defined just above the different EnumerateDirectories() definitions
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Directory.cs#L196-L214

This leads to another internal call to FileSystemEnumerableFactory.UserDirectories() defined here
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerableFactory.cs#L128-L140

Creating a new FileSystemEnumerable instance defined here
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerable.cs#L14-L38

And here is the interesting part, at the end of the constructor we create a DelegateEnumerator which according to the source code comment ensures that we get possible IO exceptions for the target directory right at the beginning
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerable.cs#L35-L37

This DelegateEnumerator creates a FileSystemEnumerator
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerable.cs#L60-L68

At the end of the FileSystemEnumerator constructor it calls its method Init()
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerator.cs#L31-L43

Which is implemented in the Windows specific file and creates a directory handle to check for any IO exceptions
https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerator.Windows.cs#L48-L50

So unfortunately there is no try/catch here and changing the EnumerationOptions won't help; we just have to catch the IOException in yahb.

Maybe custom EnumerationOptions would help at locations where IgnoreInaccessible is actually used?
https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEnumerator.Windows.cs#L115

I think with IgnoreInaccessible and RecurseSubdirectories set to true in EnumerationOptions we may get the full directory list with much less going back and forth between yahb's createDirectoryList() and the .NET functions.

But certainly not related to this bug then.
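
For illustration, a single recursive call with custom EnumerationOptions could look like the sketch below. It assumes a runtime where EnumerationOptions exists (.NET Core 2.1 or later); the stack traces in this report look like .NET Framework, where this overload is not available. `RecursiveListing.ListDirectories` is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.IO;

static class RecursiveListing
{
    // Lists all subdirectories below 'root' in one call.
    public static List<string> ListDirectories(string root)
    {
        var options = new EnumerationOptions
        {
            IgnoreInaccessible = true,    // skip subdirectories that deny access
            RecurseSubdirectories = true, // let .NET descend instead of looping ourselves
            AttributesToSkip = 0          // the default skips Hidden and System entries
        };
        return new List<string>(Directory.EnumerateDirectories(root, "*", options));
    }
}
```

Skipped directories are dropped silently, so nothing could be logged for them, and an error on the root directory itself would still surface as an exception, in line with the analysis above.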

/verbose:level

A verbosity level would help to reduce the log file size. Something like:
/verbose -> all operations (as now implemented)
/verbose:1 -> only new
/verbose:2 -> only new and non existent

greetings
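
Parsing such a switch could be sketched as follows. The levels 0 to 2 follow the proposal above (plain /verbose mapped to 0 meaning "all operations"); `VerboseOption.TryParse` is a hypothetical helper, not an existing yahb option parser.

```csharp
using System;
using System.Globalization;

static class VerboseOption
{
    // Parses "/verbose" (level 0) or "/verbose:N" with N in 0..2.
    // Returns false for any other argument; 'level' is then 0.
    public static bool TryParse(string arg, out int level)
    {
        level = 0;
        const string flag = "/verbose";
        if (arg == null || !arg.StartsWith(flag, StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }
        string rest = arg.Substring(flag.Length);
        if (rest.Length == 0)
        {
            return true;
        }
        if (rest[0] != ':')
        {
            return false;
        }
        int parsed;
        bool ok = int.TryParse(rest.Substring(1), NumberStyles.None,
                               CultureInfo.InvariantCulture, out parsed);
        if (!ok || parsed > 2)
        {
            return false;
        }
        level = parsed;
        return true;
    }
}
```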

Check for old Backup

This is a wish for one additional feature: I'd like a flag to determine a maximum age x. If the last backup file is older than x, then copy instead of hardlink.

The reason: I feel unsafe if an important file was written to memory only once years ago and has only been hardlinked since then. I'd feel better if it were written anew once in a while. Of course this relies on the original file aging better than the old backup copy. So additionally it would make sense to verify the versions against each other, but I guess that is a lot more work than my proposal above.
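
The decision itself is small; a sketch under the assumption that the caller knows when the existing backup copy was physically written (for example from the name of the backup folder that first contained it). `RefreshPolicy` and its parameters are hypothetical.

```csharp
using System;

static class RefreshPolicy
{
    // True when the existing backup copy is older than maxAge and should be
    // written anew instead of being hardlinked again.
    public static bool ShouldCopyInsteadOfLink(DateTime lastPhysicalCopyUtc,
                                               DateTime nowUtc,
                                               TimeSpan maxAge)
    {
        return nowUtc - lastPhysicalCopyUtc > maxAge;
    }
}
```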
