Giter Site home page Giter Site logo

golang-github-saracen-walker's Introduction

walker

walker is a faster, parallel version, of filepath.Walk.

// walk function called for every path found
walkFn := func(pathname string, fi os.FileInfo) error {
    fmt.Printf("%s: %d bytes\n", pathname, fi.Size())
    return nil
}

// error function called for every error encountered
errorCallbackOption := walker.WithErrorCallback(func(pathname string, err error) error {
    // ignore permissione errors
    if os.IsPermission(err) {
        return nil
    }
    // halt traversal on any other error
    return err
})

walker.Walk("/tmp", walkFn, errorCallbackOption)

Benchmarks

  • Standard library (filepath.Walk) is FilepathWalk.
  • This library is WalkerWalk
  • FastwalkWalk is fastwalk.
  • GodirwalkWalk is godirwalk.

This library and filepath.Walk both perform os.Lstat calls and provide a full os.FileInfo structure to the callback. BenchmarkFastwalkWalkLstat and BenchmarkGodirwalkWalkLstat include this stat call for better comparison with BenchmarkFilepathWalk and BenchmarkWalkerWalk.

This library and fastwalk both require the callback to be safe for concurrent use. BenchmarkFilepathWalkAppend, BenchmarkWalkerWalkAppend, BenchmarkFastwalkWalkAppend and BenchmarkGodirwalkWalkAppend append the paths found to a string slice. The callback, for the libraries that require it, use a mutex, for better comparison with the libraries that require no locking.

This library will not always be the best/fastest option. In general, if you're on Windows, or performing lstat calls, it does a pretty decent job. If you're not, I've found fastwalk to perform better on machines with fewer cores.

These benchmarks were performed with a warm cache.

goos: linux
goarch: amd64
pkg: github.com/saracen/walker
BenchmarkFilepathWalk-16                       1        1437919955 ns/op        340100304 B/op    775525 allocs/op
BenchmarkFilepathWalkAppend-16                 1        1226169600 ns/op        351722832 B/op    775556 allocs/op
BenchmarkWalkerWalk-16                         8         133364860 ns/op        92611308 B/op     734674 allocs/op
BenchmarkWalkerWalkAppend-16                   7         166917499 ns/op        104231474 B/op    734693 allocs/op
BenchmarkFastwalkWalk-16                       6         241763690 ns/op        25257176 B/op     309423 allocs/op
BenchmarkFastwalkWalkAppend-16                 4         285673715 ns/op        36898800 B/op     309456 allocs/op
BenchmarkFastwalkWalkLstat-16                  6         176641625 ns/op        73769765 B/op     592980 allocs/op
BenchmarkGodirwalkWalk-16                      2         714625929 ns/op        145340576 B/op    723225 allocs/op
BenchmarkGodirwalkWalkAppend-16                2         597653802 ns/op        156963288 B/op    723256 allocs/op
BenchmarkGodirwalkWalkLstat-16                 1        1186956102 ns/op        193724464 B/op   1006727 allocs/op
goos: windows
goarch: amd64
pkg: github.com/saracen/walker
BenchmarkFilepathWalk-16                       1        1268606000 ns/op        101248040 B/op    650718 allocs/op
BenchmarkFilepathWalkAppend-16                 1        1276617400 ns/op        107079288 B/op    650744 allocs/op
BenchmarkWalkerWalk-16                        12          98901983 ns/op        52393125 B/op     382836 allocs/op
BenchmarkWalkerWalkAppend-16                  12          99733117 ns/op        58220869 B/op     382853 allocs/op
BenchmarkFastwalkWalk-16                      10         109107980 ns/op        53032702 B/op     401320 allocs/op
BenchmarkFastwalkWalkAppend-16                10         107512330 ns/op        58853827 B/op     401336 allocs/op
BenchmarkFastwalkWalkLstat-16                  3         379318333 ns/op        100606232 B/op    653931 allocs/op
BenchmarkGodirwalkWalk-16                      3         466418533 ns/op        42955197 B/op     579974 allocs/op
BenchmarkGodirwalkWalkAppend-16                3         476391833 ns/op        48786530 B/op     580002 allocs/op
BenchmarkGodirwalkWalkLstat-16                 1        1250652800 ns/op        90536184 B/op     832562 allocs/op

Performing benchmarks without having the OS cache the directory information isn't straight forward, but to get a sense of the performance, we can flush the cache and roughly time how long it took to walk a directory:

filepath.Walk

$ sudo su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'; go test -run TestFilepathWalkDir -benchdir $GOPATH
ok      github.com/saracen/walker       3.846s

walker

$ sudo su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'; go test -run TestWalkerWalkDir -benchdir $GOPATH
ok      github.com/saracen/walker       0.353s

fastwalk

$ sudo su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'; go test -run TestFastwalkWalkDir -benchdir $GOPATH
ok      github.com/saracen/walker       0.306s

fastwalk (lstat)

$ sudo su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'; go test -run TestFastwalkWalkLstatDir -benchdir $GOPATH
ok      github.com/saracen/walker       0.339s

godirwalk

$ sudo su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'; go test -run TestGodirwalkWalkDir -benchdir $GOPATH
ok      github.com/saracen/walker       3.208s

golang-github-saracen-walker's People

Contributors

saracen avatar neolynx avatar shogo82148 avatar lebauce avatar weebney avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.