Comments (7)
No, the internal paths handled by s3fs (and almost all other backends) do not include the protocol; they have the form "bucket/path/file". Most methods also strip the protocol prefix on their first line.
Noting that
>>> "s3://".join(("bucket", "key"))
'buckets3://key'
I am puzzled how this might have worked for you.
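For illustration, a minimal sketch of why that join call cannot produce a URL: str.join uses the string it is called on as the separator between the items. The strip_protocol helper below is a hypothetical stand-in written for this example, not the s3fs implementation.

```python
def strip_protocol(path, protocol="s3"):
    """Hypothetical stand-in: drop a leading '<protocol>://' if present."""
    prefix = protocol + "://"
    return path[len(prefix):] if path.startswith(prefix) else path


# str.join places the separator *between* the items, so this is not a URL.
joined = "s3://".join(("bucket", "key"))

# The internal form is "bucket/key"; a URL is that form with the prefix added.
internal = "/".join(("bucket", "key"))
url = "s3://" + internal

print(joined)
print(url)
print(strip_protocol(url))
```

Stripping an already-stripped path leaves it unchanged, which is why a backend can apply such a step at the top of every method.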
from s3fs.
> I am puzzled how this might have worked for you.
lol! yeah that was a silly "fix" on my part...
It looks like the reason it worked is that it causes line 1355 in a28863f to raise KeyErrors, which are caught and ignored by https://github.com/fsspec/filesystem_spec/blob/59a2d648b43c12a5db7b7eac834d7af33df8a1fe/fsspec/spec.py#L376, which then allows execution to reach lines 1372 to 1391 in a28863f.
In the current state (i.e., without breaking line 1355 in a28863f), execution reaches https://github.com/fsspec/filesystem_spec/blob/59a2d648b43c12a5db7b7eac834d7af33df8a1fe/fsspec/spec.py#L372-L374, which raises the FileNotFoundError.
Note that the two paths required by rsync must be directories, but do not include any terminating "/" character.
Can you please introspect the state of fs.dircache at the time of your call? _ls_from_cache raising implies that the parent directory was already listed, but the requested file does not exist in it.
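One way to do that introspection is sketched below. describe_cache_entry is a hypothetical helper written for this example, and it assumes only that the cache behaves like a mapping from a parent directory to a list of info dicts that each have a "name" entry. A plain dict with made-up names stands in for the real cache here.

```python
def describe_cache_entry(dircache, path):
    """Hypothetical helper: report what a dircache-like mapping holds for path's parent."""
    # Simplified parent computation: everything before the last "/".
    parent = path.rstrip("/").rsplit("/", 1)[0]
    if parent not in dircache:
        return {"parent": parent, "listed": False, "names": [], "found": False}
    names = [entry["name"] for entry in dircache[parent]]
    return {"parent": parent, "listed": True, "names": names, "found": path in names}


# Stand-in cache: the parent was listed, and it contains one file.
fake_cache = {
    "bucket/dir": [{"name": "bucket/dir/a.txt", "type": "file"}],
}

print(describe_cache_entry(fake_cache, "bucket/dir/a.txt"))
print(describe_cache_entry(fake_cache, "bucket/dir/missing.txt"))
print(describe_cache_entry(fake_cache, "bucket/other/a.txt"))
```

With a live filesystem the same helper could be called as describe_cache_entry(fs.dircache, path) just before the failing call, assuming fs.dircache supports "in" and item lookup.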
@martindurant Looks like path is testbucket/rsynctest/testfile1.txt and dircache[parent] is:
[
    {'Key': 'testbucket/rsynctest/testfile1.txt',
     'LastModified': datetime.datetime(2024, 6, 6, 15, 18, 4, tzinfo=tzutc()),
     'ETag': '"54b0c58c7ce9f2a8b551351102ee0938"',
     'Size': 14, 'StorageClass': 'STANDARD', 'type': 'file', 'size': 14,
     'name': 's3://testbucket/rsynctest/testfile1.txt'},
    {'Key': 'testbucket/rsynctest/testfile2.txt',
     'LastModified': datetime.datetime(2024, 6, 6, 15, 18, 10, tzinfo=tzutc()),
     'ETag': '"d158a5494cf76bf2cbbe40a7aa674543"',
     'Size': 20, 'StorageClass': 'STANDARD', 'type': 'file', 'size': 20,
     'name': 's3://testbucket/rsynctest/testfile2.txt'},
    {'Key': 'testbucket/rsynctest/testfile3.txt',
     'LastModified': datetime.datetime(2024, 6, 6, 15, 18, 15, tzinfo=tzutc()),
     'ETag': '"d285a2d347411ad4726fc350077138be"',
     'Size': 24, 'StorageClass': 'STANDARD', 'type': 'file', 'size': 24,
     'name': 's3://testbucket/rsynctest/testfile3.txt'},
]
So it looks like
    files = [
        f
        for f in self.dircache[parent]
        if f["name"] == path
        or (f["name"] == path.rstrip("/") and f["type"] == "directory")
    ]
is checking f["name"], which includes the s3:// prefix, against path, which does not include the prefix. It needs either to check against f["Key"] or to add the prefix to path if it continues to use f["name"].
PS. I can confirm that my paths passed to rsync do not have terminating / characters.
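The mismatch can be reproduced in isolation. The sketch below is a simplified imitation of the lookup quoted above, not the fsspec source. The key parameter is a hypothetical addition, there only so the "name" and "Key" comparisons can be tried side by side, and the cache entry is trimmed to the fields that matter.

```python
def ls_from_cache(dircache, path, key="name"):
    """Simplified imitation of the cached lookup; `key` selects the field compared."""
    parent = path.rstrip("/").rsplit("/", 1)[0]
    try:
        files = [
            f
            for f in dircache[parent]
            if f[key] == path
            or (f[key] == path.rstrip("/") and f["type"] == "directory")
        ]
    except KeyError:
        # Parent was never listed: a caller would fall back to a real listing.
        return None
    if not files:
        # Parent was listed but no entry matched the requested path.
        raise FileNotFoundError(path)
    return files


dircache = {
    "testbucket/rsynctest": [
        {
            "Key": "testbucket/rsynctest/testfile1.txt",
            "name": "s3://testbucket/rsynctest/testfile1.txt",
            "type": "file",
            "size": 14,
        }
    ]
}
path = "testbucket/rsynctest/testfile1.txt"

try:
    ls_from_cache(dircache, path)
    outcome = "found"
except FileNotFoundError:
    outcome = "FileNotFoundError"
print(outcome)

by_key = ls_from_cache(dircache, path, key="Key")
print(len(by_key))
```

Comparing on "Key" finds the entry, while comparing on the prefixed "name" raises. A parent that is absent from the cache takes the KeyError branch instead and returns None.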
@martindurant thoughts on this? Should I PR a check on "Key" or is something else wrong here?
I think this is it:
--- a/fsspec/generic.py
+++ b/fsspec/generic.py
@@ -197,6 +197,7 @@ class GenericFileSystem(AsyncFileSystem):
         )
         result = {}
         for k, v in out.items():
+            v = v.copy()  # don't corrupt target FS dircache
             name = fs.unstrip_protocol(k)
             v["name"] = name
             result[name] = v
(actually, I see there are a couple of other instances of this in the class; the one I picked is probably the one you were bitten by)
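To see why the copy matters, here is a minimal sketch of the aliasing problem using plain dicts. unstrip and relabel are hypothetical stand-ins for this example (relabel mirrors the loop in the diff, with a copy flag added for comparison); neither is the fsspec code.

```python
def unstrip(name, protocol="s3"):
    """Hypothetical stand-in for fs.unstrip_protocol: add '<protocol>://' if missing."""
    prefix = protocol + "://"
    return name if name.startswith(prefix) else prefix + name


def relabel(listing, copy):
    """Return {url: info} for a {path: info} listing, optionally copying each info dict."""
    result = {}
    for k, v in listing.items():
        if copy:
            v = v.copy()  # shallow copy: the caller's dict keeps its own "name"
        name = unstrip(k)
        v["name"] = name
        result[name] = v
    return result


# `cached` plays the role of an info dict that also lives in the target FS dircache.
cached = {"name": "bucket/file.txt", "type": "file"}

relabel({"bucket/file.txt": cached}, copy=True)
after_copy = cached["name"]

relabel({"bucket/file.txt": cached}, copy=False)
after_no_copy = cached["name"]

print(after_copy)
print(after_no_copy)
```

Without the copy, the dict held by the cache is mutated in place and its "name" gains the protocol prefix, so later comparisons against an unprefixed path stop matching.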