thecodrr / fdir
⚡ The fastest directory crawler & globbing library for NodeJS. Crawls 1m files in < 1s
Home Page: https://thecodrr.github.io/fdir/
License: MIT License
Hi. It's "apanzzon" from Reddit ;-)
A fdir-cli version would be interesting.
Due to corona chaos, I currently have very little spare time.
Would avoid using a framework, to avoid dependencies, or split it up to a secondary repo
System: Windows Server 2019 Version 1809 (Build 17763.737)
Node.js Version: 12.18.2
fdir Version: 3.4.3
In my project, I have a OneDrive sync, mapped as a network drive.
When I run fdir on this, I only get the files in the root directory of the network drive.
But when I map another folder as a network drive, with the same code,
fdir outputs all dirs and files, as it should.
My Code:
const fdir = require("fdir")
var paths = "R:\\"
var files = new fdir().withDirs().withMaxDepth(3).crawlWithOptions(paths,{includeBasePath:true, includeDirs:true, suppressErrors:false}).sync();
files.forEach(file_path => {
console.log(file_path)
})
Screenshot of the execution in Drive R:\ :
Screenshot of the execution in Drive S:\ :
Do you have an idea for a fix for the issue?
How can I debug fdir? How can I see errors?
I've had an issue opened over on a repo - glenn2223/vscode-live-sass-compiler#145
I can't for the life of me figure out why fdir has returned a file count of 0 when the filters applied all return true for one file
They have an SSH connection in VS Code but I don't see why that would be a problem
I have copied out a section of code that produces the below output (copied from this comment)
Code
const isMatch = picomatch(fileList, { ignore: excludeItems, dot: true, nocase: true });
OutputWindow.Show(OutputLevel.Trace, "Searching folder", null, false);
const searchLogs: Map<string, string[]> = new Map<string, string[]>();
const searchFileCount = (
(await new fdir()
.crawlWithOptions(basePath, {
filters: [
(filePath) =>
filePath.toLowerCase().endsWith(".scss") ||
filePath.toLowerCase().endsWith(".sass"),
(filePath) => {
const result = isMatch(path.relative(basePath, filePath));
searchLogs.set(`Path: ${filePath}`, [
` isMatch: ${result}`,
` - Base path: ${basePath}`,
` - Rela path: ${path.relative(basePath, filePath)}`,
]);
return result;
},
(filePath) => {
const result =
path
.toNamespacedPath(filePath)
.localeCompare(path.toNamespacedPath(sassPath), undefined, {
sensitivity: "accent",
}) === 0;
searchLogs
.get(`Path: ${filePath}`)
?.push(
` compare: ${result}`,
` - Orig file path: ${filePath}`,
` - Orig sass path: ${sassPath}`
);
return result;
},
],
includeBasePath: true,
onlyCounts: true,
resolvePaths: true,
suppressErrors: true,
})
.withPromise()) as OnlyCountsOutput
).files;
const x = await new fdir()
.crawlWithOptions(basePath, {
includeBasePath: true,
group: true,
resolvePaths: true,
suppressErrors: true,
})
.withPromise();
OutputWindow.Show(OutputLevel.Trace, "FDIR OUTPUT", [JSON.stringify(x)]);
OutputWindow.Show(OutputLevel.Trace, "Search results", undefined, false);
searchLogs.forEach((logs, key) => {
OutputWindow.Show(OutputLevel.Trace, key, logs, false);
});
And here is the section from the output
Searching folder
FDIR OUTPUT
[
{
"dir":"/home/dave/MagentoAPI",
"files":[ ]
},
{
"dir":"/home/dave/MagentoAPI/dev",
"files":[
"/home/dave/MagentoAPI/dev/docker-compose.yml",
"/home/dave/MagentoAPI/dev/dockerfile"
]
},
{
"dir":"/home/dave/MagentoAPI/dev/.app",
"files":[
"/home/dave/MagentoAPI/dev/.app/composer.json",
"/home/dave/MagentoAPI/dev/.app/composer.lock"
]
},
{
"dir":"/home/dave/MagentoAPI/dev/.vscode",
"files":[
"/home/dave/MagentoAPI/dev/.vscode/settings.json"
]
},
{
"dir":"/home/dave/MagentoAPI/dev/mysql-data",
"files":[
/// Bunch of SQL files removed for brevity
]
},
{
"dir":"/home/dave/MagentoAPI/dev/php-confs",
"files":[
"/home/dave/MagentoAPI/dev/php-confs/override.ini"
]
},
{
"dir":"/home/dave/MagentoAPI/dev/src",
"files":[
/// Bunch of PHP files removed for brevity
"/home/dave/MagentoAPI/dev/src/AttributesSamplePayload.json",
"/home/dave/MagentoAPI/dev/src/MageAPIdev.code-workspace",
"/home/dave/MagentoAPI/dev/src/newfile.txt",
"/home/dave/MagentoAPI/dev/src/style.css",
"/home/dave/MagentoAPI/dev/src/workspace.code-workspace"
]
},
{
"dir":"/home/dave/MagentoAPI/dev/.app/vendor",
"files":[
"/home/dave/MagentoAPI/dev/.app/vendor/autoload.php"
]
}
]
--------------------
Search results
Path: /home/dave/MagentoAPI/dev/src/styles/main.scss
isMatch: true
- Base path: /home/dave/MagentoAPI
- Rela path: dev/src/styles/main.scss
compare: true
- Orig file path: /home/dave/MagentoAPI/dev/src/styles/main.scss
- Orig sass path: /home/dave/MagentoAPI/dev/src/styles/main.scss
Hello! This issue is just to easily track what we discussed in a comment:
As sync now returns Output, we need a way to tell the compiler which output it is.
One thing we could do is just cast x = x as PathsOutput; but that needs the types combined to create Output to be exported in the .d.ts file.
Hi! 👋 I'm shopping around for a fast npm globber to use in a visual studio code extension. The other npm libraries are still too slow for our very large project.
This extension needs only directories, not files. I see the withDirs function, but capturing files seems deep in the shared code:
https://github.com/thecodrr/fdir/blob/master/src/api/shared.js#L43-L44
How hard would it be to make an "onlyDirs()" builder or something? Is this something you think would be a good addition? I didn't see anything in filter either that would allow this to work.
Thanks!
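For reference, the directories-only behaviour requested above can be sketched with Node's built-in fs module alone. This is not fdir's implementation, and the helper name listDirs is hypothetical; it only illustrates skipping files while still descending into every directory.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: returns every directory under `root`; files are skipped.
function listDirs(root) {
  const dirs = [];
  const pending = [root];
  while (pending.length > 0) {
    const current = pending.pop();
    // withFileTypes returns Dirent objects, so no extra stat() call is needed.
    for (const entry of fs.readdirSync(current, { withFileTypes: true })) {
      if (entry.isDirectory()) {
        const full = path.join(current, entry.name);
        dirs.push(full);
        pending.push(full);
      }
    }
  }
  return dirs;
}
```

Calling listDirs("node_modules") would return only directory paths; the order depends on the traversal and is not sorted.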
fdir has reached maximum performance. I am not saying that to discourage others from trying to make a faster directory crawler. Not at all. But I have stripped fdir down to its bones and haven't noticed any significant increase in speed. The only space left for improvement is directly in NodeJS internals. So where does that leave us?
I had initially planned to freeze the API of fdir but fdir has no proper "API". What I would be freezing, most probably, would be more features. I can't do that though. Not yet, anyway. Why?
fdir is not feature complete. I don't want to make just the fastest directory crawler but also the best one.
[...] fdir to ever feel like it can't do X or Z. An impossible feat but...
So what's the plan for v6.0?
[...] fdir with a free pass. The result is that there is a lot going on at one time. It is time to simplify things...and that brings me to:
[...] fdir pluggable: after some initial thought, this is the best way forward. The idea is to allow anyone to directly tap into the crawling process and control it.
Does this mean that the Builder API is dying? Nope. I like the Builder API and it'll stay. It will act as a kind of a proxy for the plugins underneath. That means the Builder API will also need to be extensible.
The end goal is to make the following features possible:
[...] fdir to fs is a bad idea)
In the end, the idea is to reduce complexity, increase flexibility, improve readability, and maintain the performance (and become the de-facto directory crawler in the Nodeverse).
In any case, that's the plan. I would love to hear what you guys think about this. Ideas, suggestions, possible ways to implement a plugin system, ideas for plugins, etc, etc, etc. are all welcome!
I have a need to list the path to a symlink itself rather than the resolved path of the symlink.
Currently symlinks are ignored unless withSymLinks() is used; however, if you use withSymLinks(), then the resolved path of the symlink is returned instead.
From what I can see it may be possible to list the symlink path in Walker, perhaps with a new option/chain.
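To make the distinction concrete, here is a minimal sketch using only Node's fs module (not fdir's Walker). It reports each symlink by its own path and never by the path it resolves to. The helper name listSymlinks is hypothetical.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: lists symlinks by their own path, never the target's.
function listSymlinks(root) {
  const links = [];
  const pending = [root];
  while (pending.length > 0) {
    const current = pending.pop();
    for (const entry of fs.readdirSync(current, { withFileTypes: true })) {
      const full = path.join(current, entry.name);
      if (entry.isSymbolicLink()) {
        // Keep the link's path; fs.realpathSync(full) would give the target instead.
        links.push(full);
      } else if (entry.isDirectory()) {
        pending.push(full);
      }
    }
  }
  return links;
}
```

Because the Dirent describes the link itself, a symlink that points at a directory is recorded but not followed, which also avoids cycles.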
Thank you for this fast and useful library.
I have a problem, as VSCode is not happy that you declare that you will return String[] from the fdir.async method. If it is string[] then the issue is no longer there.
fdir is without a doubt the fastest globbing library as well (I benchmarked) but evidence needs to be published before it can be advertised.
The crawlWithOptions API in combination with withPromise no longer outputs a GroupOutput as per the typings.
The test script:
const { fdir } = require('../');
const test = async (dir) => {
const files = await new fdir()
.crawlWithOptions(dir, {
includeBasePath: true,
group: true,
})
.withPromise();
console.log(files);
};
test(`${__dirname}/dir`);
$ tree fdir-api-change/
fdir-api-change/
├── dir
│ ├── a
│ │ ├── a.txt
│ │ └── b
│ │ ├── b.txt
│ │ └── x.txt
│ └── dir.txt
└── test.js
$ git checkout 8f1c4b9 && node fdir-api-change/test.js
HEAD is now at 8f1c4b9 fix: make all tests pass
[
'home/source/fdir/fdir-api-change/dir/': [ 'home/source/fdir/fdir-api-change/dir/dir.txt' ],
'home/source/fdir/fdir-api-change/dir/a/': [ 'home/source/fdir/fdir-api-change/dir/a/a.txt' ],
'home/source/fdir/fdir-api-change/dir/a/b/': [
'home/source/fdir/fdir-api-change/dir/a/b/b.txt',
'home/source/fdir/fdir-api-change/dir/a/b/x.txt'
]
]
$ git checkout 8f1c4b9^ && node fdir-api-change/test.js
Previous HEAD position was 8f1c4b9 fix: make all tests pass
HEAD is now at 0cafa69 feat: refactor & minor performance improvements
[]
$ git checkout 8f1c4b9^^ && node fdir-api-change/test.js
Previous HEAD position was 0cafa69 feat: refactor & minor performance improvements
HEAD is now at 16d0790 feat: add withRelativePaths option (fix #51)
[
{
dir: 'home/source/fdir/fdir-api-change/dir',
files: [ 'home/source/fdir/fdir-api-change/dir/dir.txt' ]
},
{
dir: 'home/source/fdir/fdir-api-change/dir/a',
files: [ 'home/source/fdir/fdir-api-change/dir/a/a.txt' ]
},
{
dir: 'home/source/fdir/fdir-api-change/dir/a/b',
files: [
'home/source/fdir/fdir-api-change/dir/a/b/b.txt',
'home/source/fdir/fdir-api-change/dir/a/b/x.txt'
]
}
]
Unless I'm missing something very obvious (entirely possible), I have to lock at 5.1.0 and miss out on performance boosts or change my api to call Object.entries, which I assume is much slower than the original version.
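One possible stopgap is a small adapter that accepts either of the two shapes shown in the logs above and always returns the older array of { dir, files } records. This is only a sketch: the name toGroupArray is hypothetical, it does use Object.entries, and its cost relative to the original output has not been measured here.

```javascript
// Hypothetical adapter: accepts either grouped shape and returns an array
// of { dir, files } records.
function toGroupArray(output) {
  // Older shape: a non-empty array of { dir, files } objects; return it as-is.
  if (Array.isArray(output) && output.length > 0) return output;
  // Newer shape: directory paths used as string keys (on an array or object).
  // Note that these keys keep their trailing slash.
  return Object.entries(output).map(([dir, files]) => ({ dir, files }));
}
```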
https://github.com/thecodrr/fdir/blob/master/documentation.md#excludefunction
The documentation seems to suggest that the exclude function only matches on the name of the deepest folder, but it doesn't; it seems to hand you the complete path to match on. What it receives also seems to change depending on other parameters.
const elmJsonGlob = `${globUri}/**/elm.json`;
let x = new fdir()
.glob(elmJsonGlob)
.exclude(
(dir) =>
dir.startsWith(".") ||
dir === "node_modules" ||
dir === "elm-stuff",
)
.crawl(".")
.sync();
This will lead to dir in the exclude function being something like ./path/.git/file
const elmJsonGlob = `${globUri}/**/elm.json`;
let x = new fdir()
.glob(elmJsonGlob)
.exclude(
(dir) =>
dir.startsWith(".") ||
dir === "node_modules" ||
dir === "elm-stuff",
)
.withFullPaths()
.crawl(".")
.sync();
This will lead to dir in the exclude function being something like home/razze/Development/elm-pages-starter/node_modules/chalk on Linux.
I feel like that would at least need to be pointed out.
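Until the behaviour is documented, one defensive option is a predicate that only looks at the last path segment, so it answers the same way whether it is handed a bare folder name, a relative path, or a full path. A sketch, with the names SKIPPED and shouldExclude being hypothetical:

```javascript
const path = require("path");

const SKIPPED = new Set(["node_modules", "elm-stuff"]);

// Hypothetical predicate: decides on the last path segment only.
function shouldExclude(dirPath) {
  // path.basename ignores trailing separators, so "./src/.git/" yields ".git".
  const name = path.basename(dirPath);
  const isHidden = name.startsWith(".") && name !== "." && name !== "..";
  return SKIPPED.has(name) || isHidden;
}
```

It could then be passed as .exclude(shouldExclude) in either of the two snippets above.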
Something like this:
fdir.withDirs()
.withCounts()
.withSearch()
.withBasePath()
.crawl("node_modules");
Currently, fdir is the fastest directory crawler in the Node.js world even with filtering/globbing. However, the filtering performance is not up to par with non-filtering performance i.e., the gap is too big.
Current:
Running "Synchronous (2642 files, 330 folders)" suite...
fdir simple sync:
283 ops/s, ±0.63% | fastest
fdir filter sync:
278 ops/s, ±0.35% | 1.77% slower
fdir glob sync:
259 ops/s, ±0.33% | slowest, 8.48% slower
Running "Asynchronous (2642 files, 330 folders)" suite...
fdir simple async:
468 ops/s, ±2.32% | fastest
fdir filter async:
428 ops/s, ±2.55% | 8.55% slower
fdir glob async:
378 ops/s, ±2.45% | slowest, 19.23% slower
Okay, filter performance is ~2-10% slower while glob performance is ~10-20% slower. That is quite slow relatively.
So the question is: How do we reduce this performance gap?
Sample code to reproduce:
const { fdir } = require("fdir");
const api = new fdir()
.withFullPaths()
.filter((filePath) => {
if (Math.random() > 0.5) {
throw new Error("Oh crap!");
}
return true;
})
.crawl("./client");
console.log(api.sync().length);
You can run this over and over and it will just console log out a different number each time.
Suppose it wasn't if (Math.random() > 0.5) { but something like return doubleCheck(fiePath): you wouldn't get any feedback about the accidental typo of fiePath.
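A possible workaround on the caller's side is to wrap the predicate so that anything it throws is recorded and can be inspected once the crawl has finished. This is a sketch only; trackErrors is a hypothetical helper and does not change how fdir itself treats exceptions.

```javascript
// Hypothetical wrapper: remembers anything the predicate throws so the
// caller can surface it after the crawl.
function trackErrors(predicate) {
  const errors = [];
  const wrapped = (...args) => {
    try {
      return predicate(...args);
    } catch (err) {
      errors.push(err);
      return false; // treat a failing check as "no match"
    }
  };
  wrapped.errors = errors;
  return wrapped;
}
```

The wrapped function would be passed to .filter(), and after sync() the caller can check wrapped.errors.length and rethrow the first entry if it is non-zero.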
How to search for a folder by its name and return the folder path instead of returning all the parent directories?
const files = new fdir()
.withBasePath()
// .withDirs()
.withFullPaths()
// .glob("F:/CGLibrary/**/*Video*")
.filter((path) => path.indexOf("Video") != -1)
.crawl("F:/")
.sync();
console.log(files);
Ideally, symlinks would be followable or this would be an option. At the moment, they are completely ignored:
https://github.com/thecodrr/fdir/blob/master/src/api/shared.js#L48-L54
This appears to be causing this downstream issue in snowpack: FredKSchott/snowpack#2969
I'm having trouble getting fdir to work like globby does, as I'm trying to replace it. In fact it seems to fail pretty randomly.
[Info - 8:40:11 PM] Glob /home/razze/Development/elm-spa-example/src/**/*.elm
[Info - 8:40:11 PM] Globby 33 - /home/razze/Development/elm-spa-example/src/Api.elm,/home/razze/Development/elm-spa-example/src/Article.elm,/home/razze/Development/elm-spa-example/src/Asset.elm,/home/razze/Development/elm-spa-example/src/Author.elm,/home/razze/Development/elm-spa-example/src/Avatar.elm,/home/razze/Development/elm-spa-example/src/CommentId.elm,/home/razze/Development/elm-spa-example/src/Email.elm,/home/razze/Development/elm-spa-example/src/Loading.elm,/home/razze/Development/elm-spa-example/src/Log.elm,/home/razze/Development/elm-spa-example/src/Main.elm,/home/razze/Development/elm-spa-example/src/Page.elm,/home/razze/Development/elm-spa-example/src/PaginatedList.elm,/home/razze/Development/elm-spa-example/src/Profile.elm,/home/razze/Development/elm-spa-example/src/Route.elm,/home/razze/Development/elm-spa-example/src/Session.elm,/home/razze/Development/elm-spa-example/src/Timestamp.elm,/home/razze/Development/elm-spa-example/src/Username.elm,/home/razze/Development/elm-spa-example/src/Viewer.elm,/home/razze/Development/elm-spa-example/src/Api/Endpoint.elm,/home/razze/Development/elm-spa-example/src/Article/Body.elm,/home/razze/Development/elm-spa-example/src/Article/Comment.elm,/home/razze/Development/elm-spa-example/src/Article/Feed.elm,/home/razze/Development/elm-spa-example/src/Article/Slug.elm,/home/razze/Development/elm-spa-example/src/Article/Tag.elm,/home/razze/Development/elm-spa-example/src/Page/Article.elm,/home/razze/Development/elm-spa-example/src/Page/Blank.elm,/home/razze/Development/elm-spa-example/src/Page/Home.elm,/home/razze/Development/elm-spa-example/src/Page/Login.elm,/home/razze/Development/elm-spa-example/src/Page/NotFound.elm,/home/razze/Development/elm-spa-example/src/Page/Profile.elm,/home/razze/Development/elm-spa-example/src/Page/Register.elm,/home/razze/Development/elm-spa-example/src/Page/Settings.elm,/home/razze/Development/elm-spa-example/src/Page/Article/Editor.elm
[Info - 8:40:11 PM] Fdir 33 - /home/razze/Development/elm-spa-example/src/Api/Endpoint.elm,/home/razze/Development/elm-spa-example/src/Api.elm,/home/razze/Development/elm-spa-example/src/Article/Body.elm,/home/razze/Development/elm-spa-example/src/Article/Comment.elm,/home/razze/Development/elm-spa-example/src/Article/Feed.elm,/home/razze/Development/elm-spa-example/src/Article/Slug.elm,/home/razze/Development/elm-spa-example/src/Article/Tag.elm,/home/razze/Development/elm-spa-example/src/Article.elm,/home/razze/Development/elm-spa-example/src/Asset.elm,/home/razze/Development/elm-spa-example/src/Author.elm,/home/razze/Development/elm-spa-example/src/Avatar.elm,/home/razze/Development/elm-spa-example/src/CommentId.elm,/home/razze/Development/elm-spa-example/src/Email.elm,/home/razze/Development/elm-spa-example/src/Loading.elm,/home/razze/Development/elm-spa-example/src/Log.elm,/home/razze/Development/elm-spa-example/src/Main.elm,/home/razze/Development/elm-spa-example/src/Page/Article/Editor.elm,/home/razze/Development/elm-spa-example/src/Page/Article.elm,/home/razze/Development/elm-spa-example/src/Page/Blank.elm,/home/razze/Development/elm-spa-example/src/Page/Home.elm,/home/razze/Development/elm-spa-example/src/Page/Login.elm,/home/razze/Development/elm-spa-example/src/Page/NotFound.elm,/home/razze/Development/elm-spa-example/src/Page/Profile.elm,/home/razze/Development/elm-spa-example/src/Page/Register.elm,/home/razze/Development/elm-spa-example/src/Page/Settings.elm,/home/razze/Development/elm-spa-example/src/Page.elm,/home/razze/Development/elm-spa-example/src/PaginatedList.elm,/home/razze/Development/elm-spa-example/src/Profile.elm,/home/razze/Development/elm-spa-example/src/Route.elm,/home/razze/Development/elm-spa-example/src/Session.elm,/home/razze/Development/elm-spa-example/src/Timestamp.elm,/home/razze/Development/elm-spa-example/src/Username.elm,/home/razze/Development/elm-spa-example/src/Viewer.elm
[Info - 8:40:11 PM] Glob /home/razze/Development/elm-spa-example/tests/**/*.elm
[Info - 8:40:11 PM] Globby 1 - /home/razze/Development/elm-spa-example/tests/RoutingTests.elm
[Info - 8:40:11 PM] Fdir 1 - /home/razze/Development/elm-spa-example/tests/RoutingTests.elm
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/NoRedInk/elm-json-decode-pipeline/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/NoRedInk/elm-json-decode-pipeline/1.0.0/src/Json/Decode/Pipeline.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 11 - /home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/AnimationManager.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/Dom.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/Events.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/Navigation.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Expando.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/History.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Main.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Metadata.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Overlay.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Report.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 18 - /home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Array.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Basics.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Bitwise.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Char.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Debug.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Dict.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/List.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Maybe.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Platform.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Process.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Result.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Set.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/String.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Task.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Tuple.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Elm/JsArray.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Platform/Cmd.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Platform/Sub.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 5 - /home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Attributes.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Events.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Keyed.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Lazy.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 3 - /home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/Http.elm,/home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/Http/Internal.elm,/home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/Http/Progress.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/json/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 2 - /home/razze/.elm/0.19.1/packages/elm/json/1.0.0/src/Json/Decode.elm,/home/razze/.elm/0.19.1/packages/elm/json/1.0.0/src/Json/Encode.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/time/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/elm/time/1.0.0/src/Time.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 5 - /home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Builder.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Parser.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Parser/Internal.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Parser/Query.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm-explorations/markdown/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/elm-explorations/markdown/1.0.0/src/Markdown.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/rtfeldman/elm-iso8601-date-strings/1.1.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/rtfeldman/elm-iso8601-date-strings/1.1.0/src/Iso8601.elm
[Info - 8:40:11 PM] Fdir 0 -
[Info - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/**/*.elm
[Info - 8:40:11 PM] Globby 15 - /home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Expect.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Float.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Fuzz.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Lazy.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/MicroRandomExtra.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/RoseTree.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Shrink.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Fuzz/Internal.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Lazy/List.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Expectation.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Fuzz.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Internal.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Runner.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Runner/Failure.elm
[Info - 8:40:11 PM] Fdir 0 -
const x = new fdir().glob(`**/*.elm`).withFullPaths().crawl(globUri).sync();
const y = globby
.sync(`${globUri}/**/*.elm`, { suppressErrors: true });
I also tried const x = new fdir().glob(`${globUri}/**/*.elm`).withFullPaths().crawl(".").sync(); which also seemed not to find the same things.
I tried to extend the depths, but that did not help either.
Notice the double slash in the filename.
import * as fdir from 'fdir';
const go = async () => {
const files = await fdir.async('/Users/xo/Downloads/', {});
console.log(files); // ['/Users/xo/Downloads//Download.PDF']
};
go().catch(error => {
console.error(error);
});
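A double slash like this usually comes from concatenating a directory that already ends in a separator with another separator. Whether that is what happens inside fdir is an assumption, not something confirmed from its source. The sketch below only contrasts naive concatenation with path.posix.join, which collapses the duplicate:

```javascript
const path = require("path");

const dir = "/Users/xo/Downloads/";

// Naive concatenation keeps both separators.
const naive = dir + "/" + "Download.PDF";

// path.posix.join normalizes the result and collapses repeated separators.
const joined = path.posix.join(dir, "Download.PDF");

console.log(naive);  // /Users/xo/Downloads//Download.PDF
console.log(joined); // /Users/xo/Downloads/Download.PDF
```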
Code could be simplified by removing the callback API. There are native promise apis for most functions now.
Not sure how performance is affected though; this would be the primary concern.
import { readdir } from 'fs/promises';
try {
const files = await readdir(path);
for (const file of files)
console.log(file);
} catch (err) {
console.error(err);
}
This would pave the way for supporting modern API interfaces such as Node Streams, Web Streams, and AsyncIterators. AbortController could also be used for early-exit scenarios.
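As an illustration of the direction described here, the following sketch builds an async-iterator walker on top of the promise-based readdir, with an optional AbortSignal for early exit. It is not fdir code, the function name walk is hypothetical, and its performance relative to the callback API is untested.

```javascript
const { readdir } = require("fs/promises");
const path = require("path");

// Hypothetical async-iterator walker; `signal` is an optional AbortSignal.
async function* walk(root, signal) {
  const pending = [root];
  while (pending.length > 0) {
    // Stop before reading the next directory if the caller has aborted.
    if (signal && signal.aborted) return;
    const current = pending.pop();
    for (const entry of await readdir(current, { withFileTypes: true })) {
      const full = path.join(current, entry.name);
      if (entry.isDirectory()) pending.push(full);
      else yield full;
    }
  }
}
```

A consumer would use for await (const file of walk(dir, controller.signal)) and call controller.abort() to stop further directory reads.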
Hi, I'm attempting to convert from globby to fdir, and I'm curious if you have any ideas for a good way to mock fdir returns. Previously, we had:
(globby.sync as jest.Mock).mockReturnValueOnce([]);
But, since fdir uses a fluent API, I can't just mock fdir.sync. Is this something you've run into before? And if so, how did you handle it?
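One way to approach this, sketched here without any fdir or Jest specifics, is a test double in which every chained call returns the same stub and only the terminal methods return the canned result. The helper name makeFluentStub is hypothetical, and the terminal method names are taken from the snippets on this page (sync and withPromise).

```javascript
// Hypothetical test double for a fluent builder.
function makeFluentStub(result) {
  const stub = new Proxy({}, {
    get(_target, name) {
      // Avoid looking like a thenable if the stub is ever awaited.
      if (name === "then") return undefined;
      if (name === "sync") return () => result;
      if (name === "withPromise") return () => Promise.resolve(result);
      // Any other builder method, such as crawl(), keeps the chain going.
      return () => stub;
    },
  });
  return stub;
}
```

A mocked fdir constructor could then return makeFluentStub([]) so that the code under test receives an empty result regardless of which builder methods it chains.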
Hello, I was wondering if there will be a way to sort based on date, name and size in a future version.
Thank you
Hello, first of all thanks for this awesome package!!
The documentation says that
Stream API will be added soon.
Is it already a work in progress? I would really like to see this.
Also, I want to suggest providing an Async Iterator API instead of a stream API. The reason is that an Async Iterator can be easily converted into a readable stream using into-stream without loss of performance (into-stream automatically handles backpressure and all stream quirks), while the opposite conversion is very nontrivial (actually I think it's impossible, since readable streams start filling their internal buffers as soon as they begin flowing and therefore can't be converted to a one-step-at-a-time async iterator).
I have a pretty thick skin, and I like that you're pushing the envelope with performance here. You seem like a really smart guy, but the use of the word "shit" comes off a bit crass. Just my opinion, take it or leave it 😄.
Hi there!
I'm getting this error while trying to create a new builder. Any ideas what's going on?
Thanks
I would like to use a chain of multiple .filter() operations, but have them successively refine the results. For example, I would like to find files where:
the path contains .html
the path contains /src/
the path contains /trunk/
the path does not contain test
I would like to implement that via this code:
const crawler = new fdir()
  .withBasePath()
  .filter(path => path.match(/\.html/))
  .filter(path => path.includes('/src/'))
  .filter(path => path.includes('/trunk/'))
  .filter(path => !path.includes('test'));
The current implementation of multi-filters treats this like an OR, which gives me many undesired results.
Thank you for considering my request!
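While multiple filters are combined with OR, the same AND behaviour can be had by composing the predicates into a single function and passing that to one .filter() call. A sketch, where allOf and wanted are hypothetical names:

```javascript
// Hypothetical combinator: the combined predicate passes only when every
// individual predicate passes (logical AND).
const allOf = (...predicates) => (filePath, isDirectory) =>
  predicates.every((predicate) => predicate(filePath, isDirectory));

const wanted = allOf(
  (p) => /\.html/.test(p),
  (p) => p.includes("/src/"),
  (p) => p.includes("/trunk/"),
  (p) => !p.includes("test")
);
```

The crawler would then use a single .filter(wanted) in place of the four chained calls.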
What is the correct way to utilize the globbing functionality in fdir? When I run the following test, it does not properly filter the results.
const fs = require('fs');
const { fdir } = require('fdir');
//make some temp files to scan
if (!fs.existsSync('./temp')) {
fs.mkdirSync('./temp');
fs.writeFileSync('./temp/alpha.json', '');
fs.writeFileSync('./temp/beta.txt', '');
}
//find only the files that match the glob
const results = new fdir().glob('*.txt').crawlWithOptions('./temp', {}).sync();
console.log(results);
The glob seems to have no effect, and I'm instead given all of the files in the directory.
Am I using it wrong? I've installed picomatch, and tried this on fdir 5.2.0, 5.1.0, 5.0.0, 4.1.0, and 4.0.0, all with the same results.
Running the same test but fast-glob (which also uses picomatch) returns the expected results.
const results = require('fast-glob').sync('*.txt', { cwd: './temp' });
Hello, can you please provide an option for a max file limit that once X files have been found, stop further scanning operations?
I understand an operation may be in progress that returns file lists greater than the limit, so if it can just discard the excess.
The use case is scanning directories with a large number of files when you only want to pull a few of them out at a time, for example a "drop/pickup" directory where external sources drop huge numbers of files and the node process scans it and picks up work out of it. We may have millions of files but only want to process a few thousand at a time.
One primary reason is that memory usage is really high when the folder is large due to generating strings for all of the file names and holding them in memory during the operations, and would like to constrain that.
I tried to implement count limits by filtering like so, but this doesn't stop the library from doing further work; it just discards more of the results.
export async function walkDir(dir: string, filter: (file: string, isDirectory: boolean) => boolean | void, limit?: number): Promise<Array<string>> {
let apiBuilder = (new fdir()).withFullPaths();
if (limit) {
let count = 0;
apiBuilder = apiBuilder.filter(function (file, isDirectory) {
return (isDirectory ? count < limit : count++ < limit) && (!filter || filter(file, isDirectory));
});
} else if (filter) {
apiBuilder = apiBuilder.filter(filter);
}
return apiBuilder.crawl(dir).withPromise();
}
So a way to improve this memory wise would be nice.
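To show what an early exit could look like, here is a sketch built on Node's fs.opendirSync, which reads directory entries one at a time instead of materializing the whole listing. It is not fdir code, and the names walkFiles and takeFiles are hypothetical.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical lazy walker: yields file paths one by one.
function* walkFiles(root) {
  const pending = [root];
  while (pending.length > 0) {
    const dir = pending.pop();
    const handle = fs.opendirSync(dir);
    try {
      let entry;
      // readSync() returns null once the directory is exhausted.
      while ((entry = handle.readSync()) !== null) {
        const full = path.join(dir, entry.name);
        if (entry.isDirectory()) pending.push(full);
        else yield full;
      }
    } finally {
      // Also runs when the consumer stops iterating early.
      handle.closeSync();
    }
  }
}

// Hypothetical helper: collects at most `limit` files, then stops scanning.
function takeFiles(root, limit) {
  const files = [];
  if (limit <= 0) return files;
  for (const file of walkFiles(root)) {
    files.push(file);
    if (files.length >= limit) break;
  }
  return files;
}
```

Because the generator is abandoned as soon as the limit is reached, at most `limit` path strings are held by the result at any time.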
Hey there,
do you have any plan for when you would like to do the next release?
Cheers
like glob, can not found.
I am interested in how much faster it could be using native code, via child_process or N-API.
Someone already mentioned Rust's https://github.com/jessegrosjean/jwalk. Some other Rust discussion here. Would be cool to have a napi-rs wrapper on it and see the speed diff.
What’s the fastest way to read a lot of files?
crossbeam for parallelism and ignore for recursive dir walker.
ignore and a lot of ripgrep util crates.
ignore and walkdir.
gitstatusd
the_silver_searcher
globrex: ~350% faster than node-glob and ~230% faster than fast-glob
micromatch, can also use picomatch
No implementation exists yet.
Would be good to try with AssemblyScript.
Can it return the size of each file?
Hi,
I really appreciate the work you did. Maybe a Promise support could be good for a more flexible code?
Thanks for your work, Olyno.
Cheekily, just wanted to abuse an issue for a good cause.
MDN Web Docs (now) uses fdir and it's looking extremely promising.
We used to use glob.sync() and a whole mixed bag of other path.join() and tricky stuff.
Today I was able to replace all of it with fdir and I did it in such a way that I kept the old function so I could compare.
All the numbers are in mdn/yari#3537 and it might be a bit hard to read, but "So basically, the new function is 10x faster." is easy to understand :)
Awesome work!
Also, how I wish Node could get a low-level C++ native implementation, in the standard library, of path.join() and those guys to make this problem go away.
Hi. I'm an active user of glob and 92% faster seems neat.
In order to switch, I need to know that my existing patterns will be maintained. Will you please create a near-top readme section telling me how the two libraries' pattern outlay lines up?
Some of the fuss between glob and other competitors was when glob chose to remove features to come into line with other standards. I'd like to know where those topics stand, and if they line up, I will probably switch.
Thanks for listening
Currently, the excludeFn function I pass to crawler.exclude(excludeFn) receives only the basename of the directory as a parameter. I would like to decide whether to exclude or not based on the full path. I propose also passing path as a second parameter to it, so it is not a breaking change.
See this example file:
const { fdir } = require("fdir");
async function main() {
const files = await new fdir()
.withBasePath()
.crawl("non-existent") // directory does not exist!
.withPromise();
console.log("this will never be written");
}
if (require.main === module) {
main().catch((e) => {
process.exitCode = 1;
console.error(e);
});
}
(save as index.js, for example, and run node ./index.js)
The Node.js process just exits (exit code 0) and no error is shown. The message "this will never be written" is also not shown.
As the docs say that all errors are suppressed, I would expect files to be an empty array and program flow to continue normally (printing the "this will never be written" message).
I prefer not to say.
$ node -v
v15.4.0
$ cat node_modules/fdir/package.json | grep version
"version": "4.1.0",
When adding .withErrors(), the error is reported as expected ("ENOENT: no such file or directory, scandir 'non-existent'") so using that option, together with a try-catch or .catch() could be a possible workaround.
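The expected behaviour can be sketched with Node's standard library alone. The `listFiles` helper and its `suppressErrors` option below are hypothetical stand-ins, not fdir's API; the point is only that the promise always settles: it resolves to an empty array when errors are suppressed and rejects otherwise, so the process never exits silently in the middle of an await.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper (NOT fdir's code): lists the files directly inside dir.
async function listFiles(dir, { suppressErrors = true } = {}) {
  try {
    const entries = await fs.promises.readdir(dir, { withFileTypes: true });
    return entries
      .filter((entry) => entry.isFile())
      .map((entry) => path.join(dir, entry.name));
  } catch (err) {
    // Either way the promise settles: resolve with no results, or reject.
    if (suppressErrors) return [];
    throw err;
  }
}

async function main() {
  // Errors suppressed: program flow continues with an empty result.
  const files = await listFiles("./non-existent-demo-dir");
  console.log(files); // []

  // Errors enabled: the failure surfaces and can be caught.
  try {
    await listFiles("./non-existent-demo-dir", { suppressErrors: false });
  } catch (err) {
    console.error(err.code); // ENOENT
  }
}

main();
```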
I know you're actively developing this; however, the documentation in the readme doesn't work out of the box: it requires new fdir(), not fdir.new().
I also noticed you went through and tried to add an error if they use glob; however, these checks don't work and it tries to load picomatch regardless, which means a simple npm i fdir followed by running const fdir = require("fdir").default; results in Error: Cannot find module 'picomatch'
Also good work on this, I would like to add an enhancement which is case insensitivity :)
@thecodrr any chance of the latest commit being pushed as a new version on npm?
Hi,
Assuming current cwd is /Users/user/Development/project, crawl("src/target") returns:
withFullPaths(): /Users/user/Development/project/src/target/a/b/c
withBasePath(): src/target/a/b/c
It would be very nice to have one of the following:
withRelativePath(): a/b/c
withRelativePath("src"): target/a/b/c
Thanks,
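Until something like this exists, the requested output can be derived from full paths with path.relative. The sketch below is only an illustration: `toRelativePaths` is a hypothetical helper, and withRelativePath is the option being requested here, not an existing fdir method. It uses path.posix so the result is the same on every platform.

```javascript
const path = require("path");

// Hypothetical helper: rewrites absolute paths relative to a base directory.
function toRelativePaths(fullPaths, cwd, relativeTo) {
  const base = path.posix.resolve(cwd, relativeTo);
  return fullPaths.map((fullPath) => path.posix.relative(base, fullPath));
}

const cwd = "/Users/user/Development/project";
// What withFullPaths() returns in the example above.
const fullPaths = ["/Users/user/Development/project/src/target/a/b/c"];

// Equivalent of the proposed withRelativePath(): relative to the crawl root.
console.log(toRelativePaths(fullPaths, cwd, "src/target")); // [ 'a/b/c' ]

// Equivalent of the proposed withRelativePath("src").
console.log(toRelativePaths(fullPaths, cwd, "src")); // [ 'target/a/b/c' ]
```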
As noted in #23 (comment), currently globs match against resolved symlink paths, not the symlink paths themselves. This is a bit counter-intuitive, and I think that the symlink paths should be matched and returned instead, perhaps with an option to return real paths instead.
I like to use options like:
import {fdir, Options} from 'fdir'
const opts: Options = {
  includeBasePath: true,
  exclude: p => {
    return p.indexOf('node_modules') > -1
  },
  filters: [p => p.endsWith('package.json')],
}
const files = new fdir().crawlWithOptions(dir, opts).sync()
return files
TS2459: Module '"fdir"' declares 'Options' locally, but it is not exported.
Also, thoughts on runtime validation for this? With zod or something.
I think a lot of my early testing of this package was using the sync API, which made me think it was not that fast.
It's common for a Node dev to reach for sync APIs for file access in CLI tools, as you wouldn't normally think async would give speed gains unless you are running in a web server where you don't want to block the event loop.
But for fdir, async offers huge benefits in speed when the threadpool is increased. On macOS at least.
UV_THREADPOOL_SIZE=8
776457
fdir: 2.336s
776457
fdir.sync: 8.799s
UV_THREADPOOL_SIZE=2
776457
fdir: 6.870s
776457
fdir.sync: 7.804s
Crazy it's set so low. https://www.sebastienvercammen.be/your-libuv-thread-pool-size-is-too-small/
Also, make a note of increasing UV_THREADPOOL_SIZE.
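For anyone who wants to try this, one dependable way to set the pool size is through the environment of the process that does the crawling, since libuv reads UV_THREADPOOL_SIZE when it creates its thread pool (the default is 4 threads). The sketch below launches a child Node process with the variable set. The child script is a placeholder that only echoes the value, and `runWithThreadpool` is a hypothetical helper; a real benchmark would run the crawl in the child instead.

```javascript
const { spawnSync } = require("child_process");

// Placeholder workload: a real test would run the crawl here instead.
const childScript = "console.log(process.env.UV_THREADPOOL_SIZE)";

// Hypothetical helper: runs the child script with the given pool size.
function runWithThreadpool(size) {
  const result = spawnSync(process.execPath, ["-e", childScript], {
    // The variable is in place before the child process starts,
    // so it is visible when libuv initializes its thread pool.
    env: { ...process.env, UV_THREADPOOL_SIZE: String(size) },
    encoding: "utf8",
  });
  return result.stdout.trim();
}

console.log(runWithThreadpool(8)); // prints "8"
```

From a shell, the equivalent is to prefix the command with the variable, as in the UV_THREADPOOL_SIZE=8 runs quoted above.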
Will this package ever have typescript support?
depends on 'fdir'. CommonJS or AMD dependencies can cause optimization bailouts.
Currently the API allows a .glob(...patterns) method in the Builder alongside the crawl one. All the globbing libraries (fast-glob & glob) that I have used do not allow the user to specify the source directory.
Maybe glob and crawl should be at an equal API level.
Permission errors don't seem to be handled.
import * as fdir from 'fdir';
fdir.async('/', {}).then(files => {
  console.log(files);
});
TypeError: Cannot read property 'length' of undefined
at /Users/xo/code/r/node_modules/fdir/index.js:40:39
at fs.js:153:23
at FSReqCallback.req.oncomplete (fs.js:778:9)
Changing the affected line to this.
fs.readdir(dir, readdirOpts, function(_, dirents) {
console.log({_,dirents});
for (var j = 0; j < dirents.length; ++j) {
I get the following.
{
_: [Error: EACCES: permission denied, scandir '//.fseventsd'] {
errno: -13,
code: 'EACCES',
syscall: 'scandir',
path: '//.fseventsd'
},
dirents: undefined
}
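The log shows that dirents is undefined whenever the error argument is set, so the loop needs a guard before it touches dirents.length. A minimal sketch of that guard follows; `readDirSafe` is a hypothetical helper, not the code in fdir's index.js, and a missing directory (ENOENT) stands in for the EACCES case because it is easier to reproduce.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: reads one directory and never touches dirents
// when readdir reports an error.
function readDirSafe(dir, onDone) {
  fs.readdir(dir, { withFileTypes: true }, (err, dirents) => {
    if (err) {
      // dirents is undefined here (EACCES, ENOENT, ...), so skip the loop.
      onDone([], err);
      return;
    }
    const paths = [];
    for (let j = 0; j < dirents.length; ++j) {
      paths.push(path.join(dir, dirents[j].name));
    }
    onDone(paths, null);
  });
}

// A missing directory stands in for the permission-denied case.
readDirSafe("./definitely-missing-demo-dir", (paths, err) => {
  console.log(paths, err && err.code);
});
```

Whether the error is then dropped or reported is a separate design decision; the guard only prevents the TypeError.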
The shared.js file contains some "dummy variables" that are later filled according to the options; however, if I make two (or more) calls concurrently, the second call may overwrite those variables while the first call did not finish yet, and then we get garbage results for the first call.
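The failure mode can be reproduced without any file system access. The sketch below is a simplified model, not the contents of shared.js: `startCrawlShared` keeps its results in a module-level variable, while `startCrawlIsolated` creates the state inside each call. All names are hypothetical.

```javascript
// Anti-pattern: one module-level array shared by every call.
let sharedResults = [];

function startCrawlShared(label) {
  sharedResults = []; // resetting here discards an in-flight call's results
  return {
    push: (item) => sharedResults.push(`${label}:${item}`),
    finish: () => sharedResults,
  };
}

// Fix: each call owns its state through a closure.
function startCrawlIsolated(label) {
  const results = [];
  return {
    push: (item) => results.push(`${label}:${item}`),
    finish: () => results,
  };
}

// Simulates two overlapping crawls: "b" starts before "a" has finished.
function interleave(start) {
  const a = start("a");
  a.push(1);
  const b = start("b");
  b.push(1);
  a.push(2);
  return { a: a.finish(), b: b.finish() };
}

console.log(interleave(startCrawlShared));
// { a: [ 'b:1', 'a:2' ], b: [ 'b:1', 'a:2' ] }  (garbage for both calls)
console.log(interleave(startCrawlIsolated));
// { a: [ 'a:1', 'a:2' ], b: [ 'b:1' ] }
```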
const fdir = require('fdir')
fdir.async('.', { maxDepth: 99 }).then((res) => {
  console.log(res, res.length)
})
In a rather large folder structure, the async & sync method with maxDepth doesn't work properly:
[
'/Users/johan/test/test-repos/.DS_Store',
'/Users/johan/test/test-repos/.gitignore',
'/Users/johan/test/test-repos/.jscpd.json',
'/Users/johan/test/test-repos/.mrconfig',
'/Users/johan/test/test-repos/Jakefile',
'/Users/johan/test/test-repos/Makefile',
'/Users/johan/test/test-repos/README.md',
'/Users/johan/test/test-repos/async.js',
'/Users/johan/test/test-repos/distribution.txt',
'/Users/johan/test/test-repos/gamerules.txt',
'/Users/johan/test/test-repos/games.txt',
'/Users/johan/test/test-repos/gfw.txt',
'/Users/johan/test/test-repos/langlib.txt',
'/Users/johan/test/test-repos/package.json',
'/Users/johan/test/test-repos/project.txt',
'/Users/johan/test/test-repos/stats.txt',
'/Users/johan/test/test-repos/yarn.lock',
'/Users/johan/test/test-repos/.git/COMMIT_EDITMSG',
'/Users/johan/test/test-repos/.git/FETCH_HEAD',
'/Users/johan/test/test-repos/.git/HEAD',
'/Users/johan/test/test-repos/.git/MERGE_RR',
'/Users/johan/test/test-repos/.git/ORIG_HEAD',
'/Users/johan/test/test-repos/.git/config',
'/Users/johan/test/test-repos/.git/index',
'/Users/johan/test/test-repos/.git/packed-refs',
'/Users/johan/test/test-repos/.git/pre-commit',
'/Users/johan/test/test-repos/.git/smartgit.config',
'/Users/johan/test/test-repos/node_modules/.yarn-integrity',
'/Users/johan/test/test-repos/games/.DS_Store',
'/Users/johan/test/test-repos/games/README.md'
] 30
Without the maxDepth option, fdir properly scans through and finds 204516 files
OSX Catalina / APFS
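For reference, this is the behaviour I would expect from a depth limit, sketched with the standard library. The `crawl` function is hypothetical and is not fdir's implementation; it assumes depth 0 means "files directly in the root" and that each nested directory adds one level, which may differ from fdir's own definition. With that reading, a limit of 99 on a shallow tree should return the same files as no limit at all.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical depth-limited crawler (NOT fdir's code).
// Assumption: the root is depth 0 and each nested directory adds 1.
function crawl(root, maxDepth = Infinity) {
  const files = [];
  const walk = (dir, depth) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        // Descend only while the limit has not been reached.
        if (depth < maxDepth) walk(fullPath, depth + 1);
      } else {
        files.push(fullPath);
      }
    }
  };
  walk(root, 0);
  return files;
}

// Demo tree: top.txt, a/one.txt, a/b/two.txt
const root = fs.mkdtempSync(path.join(os.tmpdir(), "depth-demo-"));
fs.mkdirSync(path.join(root, "a", "b"), { recursive: true });
fs.writeFileSync(path.join(root, "top.txt"), "");
fs.writeFileSync(path.join(root, "a", "one.txt"), "");
fs.writeFileSync(path.join(root, "a", "b", "two.txt"), "");

console.log(crawl(root, 0).length); // 1
console.log(crawl(root, 1).length); // 2
console.log(crawl(root, 99).length); // 3
console.log(crawl(root).length); // 3
```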