Comments (10)
Hello Andrea,
I see you are using the -vvv option. Can you send the debug output into a file and send it to us? (e-mail it to [email protected] so as not to upload the file here, since it may contain secrets)
$ gfal-copy -vvv --log-file=gfal2.log <src> <dst>
from gfal2.
Hello,
I've managed to reproduce this problem and here are my findings:
After the SRM PUT, the HTTPS TURL that Gfal2 receives shows that the destination file already exists (with size 0) on the HTTP storage. This causes problems for Gfal2 (and the underlying Davix library), which expects the file not to exist before it performs an upload. This is also the cause of the cryptic error message:
Transfer failed with: Impossible to write, no buffer. (file was opened only for reading?)
Example:
- Checking manually that the file doesn't exist
$ davix-http -X HEAD --trace header --cert /tmp/x509up_u0 --insecure https://transfer-test.cr.cnaf.infn.it:8443/disk/test-mipatras
> HEAD /disk/test-mipatras HTTP/1.1
< HTTP/1.1 404 Not Found
- The same stat requests, but happening after the SRM PUT
$ gfal-copy -vv --force gsiftp://transfer-test.cr.cnaf.infn.it:2811//storage/gemss_test1/dteam/folder/test-andre srm://storm-test.cr.cnaf.infn.it:8444/disk/test-mipatras
...
INFO Davix: > HEAD /disk/test-mipatras HTTP/1.1
INFO Davix: < HTTP/1.1 200 OK
...
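The two traces above differ only in the status of the HEAD response. As a minimal sketch (not part of Gfal2 or Davix), the helper below pulls the response status codes out of trace text in the format shown above and interprets them; the function names are hypothetical, and only the 200 and 404 cases seen in this thread are handled.

```python
import re

# Matches response status lines as they appear in the traces above, e.g.
# "< HTTP/1.1 404 Not Found" or "INFO Davix: < HTTP/1.1 200 OK".
# Request lines start with ">" and are therefore ignored.
STATUS_LINE = re.compile(r"<\s*HTTP/\d(?:\.\d)?\s+(\d{3})")


def response_statuses(log_text):
    """Return the HTTP status codes of all response lines, in order."""
    codes = []
    for line in log_text.splitlines():
        match = STATUS_LINE.search(line)
        if match:
            codes.append(int(match.group(1)))
    return codes


def destination_exists(status):
    """Interpret a HEAD status: 200 means present, 404 means absent."""
    if status == 200:
        return True
    if status == 404:
        return False
    raise ValueError(f"unexpected HEAD status: {status}")


before_put = """> HEAD /disk/test-mipatras HTTP/1.1
< HTTP/1.1 404 Not Found"""
after_put = """INFO Davix: > HEAD /disk/test-mipatras HTTP/1.1
INFO Davix: < HTTP/1.1 200 OK"""

print(destination_exists(response_statuses(before_put)[0]))  # False
print(destination_exists(response_statuses(after_put)[0]))   # True
```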
It looks to me that StoRM creates a 0-size file when the SRM PUT is done.
Can you confirm?
Cheers,
Mihai
Hello Mihai,
I confirm that StoRM creates a 0-size file before performing the real transfer.
Actually, I guess a transfer from srm+https to srm+https succeeds in overwriting the 0-size file because a PUT command is issued by gfal. Is this correct?
Cheers,
Andrea
Hi all,
I understood that:
- with an http|https TURL (returned after an srmPtP with http|https specified as the transfer protocol) gfal (using the davix library underneath) cannot assume it's a WebDAV endpoint and tries a HEAD (which finds a file) + POST (which fails because POST is not allowed by WebDAV)
- using --force lets gfal do a PUT instead of a POST (?)
- with a dav|davs TURL (not returned by our StoRM srmPtP) gfal issues a HEAD (which finds a file) + PUT instead of a POST because it "recognises" that it is a WebDAV TURL (no need to --force the request on the gfal side)
Is this correct?
If yes, I think that on our side (the StoRM side) we could work on returning a dav TURL if it's explicitly requested as the transfer protocol. I wrote only "dav" because I cannot find "davs" among the official IANA-registered schemes: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml.
But this could be enough I think.
Cheers,
Enrico
Hi,
unfortunately, even with the --force option the transfer fails anyway:
gfal-copy -f srm://storm-test.cr.cnaf.infn.it:8444/folder/test-andre srm://storm-test.cr.cnaf.infn.it:8444/disk
Copying srm://storm-test.cr.cnaf.infn.it:8444/folder/test-andre [FAILED] after 6s
gfal-copy error: 5 (Input/output error) - Impossible to write, no buffer. (file was opened only for reading?)
Cheers,
Andrea
Hello both,
@enricovianello I would view the "dav://" and "davs://" protocol schemes used in the Grid world more as "HTTP Grid Storage" than as actual "WebDAV" (as in the IANA document).
For the problem at hand, I have to explain a bit on how Gfal2 works with SRM:
SRM is not a data/transfer protocol. It is used for metadata operations, but when it comes to reading/writing data, one has to ask the SRM server for a TransferURL (TURL) or Transfer URL 3rd Party (TURL_3RD_PARTY).
When you instruct Gfal2 to copy involving SRM, Gfal2 will ask the SRM server for the TURL_3RD_PARTY, then perform the copy again with the resolved URL substituted in. Example:
gfal2.copy(<SRM_src>, <SRM_dst>)
-- gfal2.resolve_SRM_TURL(<SRM_src>) --> GridFTP_src
-- gfal2.resolve_SRM_TURL(<SRM_dst>) --> HTTPS_dst
gfal2.copy(<GridFTP_src>, <HTTPS_dst>)
So far, I see 3 scenarios here:
1. gfal2.copy(<SRM_src>, <SRM_dst>)
2. gfal2.copy(<GridFTP_src>, <SRM_dst>)
3. gfal2.copy(<HTTPS_src>, <SRM_dst>)
Also important to note:
- <SRM_src> will resolve to <GridFTP_src> (so scenarios 1. and 2. can be treated the same)
- <SRM_dst> will resolve to an HTTPS destination URL
In scenarios 1. / 2., we are dealing with what we call a protocol translation transfer: GridFTP --> HTTPS.
In scenario 3., we handle a same-protocol transfer: HTTPS --> HTTPS.
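To make the three scenarios concrete, here is a minimal sketch that applies the resolution rules stated above (SRM source resolves to GridFTP, SRM destination resolves to HTTPS) and labels each scenario. It does not call Gfal2 or any SRM server; the host names, the lookup table and the function names are hypothetical, and comparing schemes by simple equality is a simplification.

```python
from urllib.parse import urlsplit

# Resolution rules taken from the notes above: an SRM source resolves to a
# GridFTP URL and an SRM destination to an HTTPS URL. A real SRM server
# returns the TURL at run time; this table is only a stand-in.
SRM_TURL_SCHEME = {"source": "gsiftp", "destination": "https"}


def resolved_scheme(url, role):
    """Scheme actually used for the data transfer for the given role."""
    scheme = urlsplit(url).scheme
    return SRM_TURL_SCHEME[role] if scheme == "srm" else scheme


def classify(src, dst):
    """Return (src_scheme, dst_scheme, kind) after SRM resolution."""
    s = resolved_scheme(src, "source")
    d = resolved_scheme(dst, "destination")
    kind = "same-protocol" if s == d else "protocol translation"
    return s, d, kind


# Hypothetical host names, one entry per scenario (1., 2., 3.).
scenarios = [
    ("srm://src.example.org/f", "srm://dst.example.org/f"),
    ("gsiftp://src.example.org/f", "srm://dst.example.org/f"),
    ("https://src.example.org/f", "srm://dst.example.org/f"),
]
for number, (src, dst) in enumerate(scenarios, start=1):
    print(number, *classify(src, dst))
```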
Gfal2 offers much better support for same protocol transfers. It allows us to do TPC (ThirdPartyCopy) and even if we have to stream the data, there are optimizations that can be done. For the protocol translation case, we are forced to read segments from source (via one protocol) and write them to the destination (via the other protocol).
Scenario 3. works, as it boils down to an HTTPS --> HTTPS transfer. I've tested both TPC and streaming and they both work.
Scenarios 1. / 2. don't work as Gfal2 ends up doing a GridFTP --> HTTPS copy. Here, Gfal2 has to do protocol translation and it will do the following steps:
- Check that the GridFTP source exists
- Check that the HTTPS destination doesn't exist
  a. Gfal2 expects the destination file to not exist so it can start the upload
  b. Gfal2 encounters an existing file at the destination (0-size file)
  c. Gfal2 is forced to open the destination file with read permissions only
- When attempting to write, we hit the following error:
  Impossible to write, no buffer. (file was opened only for reading?)
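The steps above can be sketched as a toy model. This is not Gfal2 or Davix code: the function and exception names are hypothetical, the two booleans stand in for the real existence checks, and the only thing taken from the thread is the order of the steps and the error text.

```python
class CopyError(Exception):
    """Raised when the modelled copy cannot proceed."""


def translated_copy(source_exists, destination_exists):
    """Walk through the steps listed above for a GridFTP --> HTTPS copy."""
    # Step 1: the GridFTP source must exist.
    if not source_exists:
        raise CopyError("source does not exist")
    # Step 2: the destination is expected to be absent. If a file is
    # already there (for example a 0-size one), it is opened read-only.
    mode = "read-only" if destination_exists else "write"
    # Step 3: writing through a read-only handle fails with the message
    # quoted in this thread.
    if mode == "read-only":
        raise CopyError(
            "Impossible to write, no buffer. (file was opened only for reading?)"
        )
    return "copied"


print(translated_copy(source_exists=True, destination_exists=False))
try:
    translated_copy(source_exists=True, destination_exists=True)
except CopyError as error:
    print("failed:", error)
```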
What's the reason for SRM creating the 0-size file on the SRM PrepareToPut call?
I've tested this behavior with dCache SRM as well, where the 0-size file does not get created.
Cheers,
Mihai
Ah, something else to add. The HTTP POST calls are to request macaroon tokens:
INFO Davix: > POST /disk/test-mipatras HTTP/1.1
> Content-Type: application/macaroon-request
They don't interfere with why the copy operation fails.
If you want, you can disable them via the RETRIEVE_BEARER_TOKEN=false configuration, either via the Gfal2 config file (/etc/gfal2.d/http_plugin.conf) or the command line:
$ gfal-copy -D"HTTP PLUGIN:RETRIEVE_BEARER_TOKEN=false" <src> <dst>
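For scripts that use the Python bindings instead of the command line, the same option could presumably be set on a context. This is a sketch under the assumption that the installed gfal2 Python module provides `creat_context()` and `set_opt_boolean(group, key, value)`; please check it against your version. The helper name and the recording stand-in class are hypothetical and only there so the example runs without gfal2 installed.

```python
def disable_bearer_token_retrieval(context):
    """Set RETRIEVE_BEARER_TOKEN=false in the "HTTP PLUGIN" group."""
    context.set_opt_boolean("HTTP PLUGIN", "RETRIEVE_BEARER_TOKEN", False)
    return context


# Intended use with the real bindings (assumes the gfal2 module is installed):
#   import gfal2
#   ctx = disable_bearer_token_retrieval(gfal2.creat_context())


class RecordingContext:
    """Stand-in context that only records the options it is given."""

    def __init__(self):
        self.options = {}

    def set_opt_boolean(self, group, key, value):
        self.options[(group, key)] = value


ctx = disable_bearer_token_retrieval(RecordingContext())
print(ctx.options)
```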
Hi Mihai,
thank you very much for the clear explanation.
StoRM creates a 0-size file with the proper ACL in order to let GridFTP overwrite it with the right permissions.
However, we noticed that in third-party copies from srm+https to srm+https, or obviously from srm+gsiftp to srm+gsiftp, StoRM/Gfal2 check the (non-)existence of the destination file by issuing an srmLs command, without involving WebDAV.
On the other hand, if we try to perform a TPC from srm+gsiftp to srm+https, we see both the srmLs check and the HEAD request, in the StoRM backend and StoRM WebDAV logs respectively:
11:23:15.170 - INFO [xmlrpc-27] - srmLs: user </DC=org/DC=terena/DC=tcs/C=IT/O=Istituto Nazionale di Fisica Nucleare/CN=Andrea Rendina [email protected]> Request for [SURL: [srm://storm-test.cr.cnaf.infn.it/disk/test-andre-dst]] failed with: [status: SRM_FAILURE: All requests failed]
131.154.161.121 8443 "DC=org,DC=terena,DC=tcs,C=IT,O=Istituto Nazionale di Fisica Nucleare,CN=Andrea Rendina [email protected]" 2023-07-18T09:23:19.733Z "2befed3e-2456-4720-999c-3bb734806e8a" "HEAD /disk/test-andre-dst HTTP/1.1" 200 0 69
So it seems that in a mixed TPC like this one Gfal2 checks twice that the HTTPS destination doesn't exist.
Is this expected?
Cheers,
Andrea
Hello Andrea,
Yes, it checks twice, initially at the SRM level. At this point, the destination file does not exist. Then Gfal2 resolves the SRM TURL into an https URL. Gfal2 will initiate the copy involving the https destination URL. In the copy, the destination file is checked again (using the https URL) and the 0-size file is found, which is why the operation stops.
- Is there a reason why the 0-size file is created?
- If you move directly to srm+https --> srm+https, you won't have this problem. It's only the protocol translation transfers that face this (e.g. srm+gsiftp --> srm+https). Any chance of moving directly to srm+https --> srm+https?
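The order of the two checks can be illustrated with a toy storage model. This is only a sketch of the sequence described in this thread (srmLs, then TURL resolution, then HEAD); the class and its methods are hypothetical and do not correspond to StoRM, dCache or Gfal2 code.

```python
class ToyStorage:
    """Toy model of one destination path as seen through SRM and HTTP."""

    def __init__(self, creates_placeholder):
        self.creates_placeholder = creates_placeholder
        self.size = None  # None means the file does not exist

    def srm_ls(self):
        """Existence check at the SRM level."""
        return self.size is not None

    def prepare_to_put(self):
        """TURL resolution; optionally creates a 0-size file, as StoRM does."""
        if self.creates_placeholder and self.size is None:
            self.size = 0

    def head(self):
        """Existence check on the resolved https URL."""
        return 200 if self.size is not None else 404


def destination_checks(storage):
    """Return what the two existence checks see, in order."""
    first = storage.srm_ls()   # check at the SRM level
    storage.prepare_to_put()   # SRM TURL resolved into an https URL
    second = storage.head()    # check again during the copy
    return first, second


print(destination_checks(ToyStorage(creates_placeholder=True)))   # (False, 200)
print(destination_checks(ToyStorage(creates_placeholder=False)))  # (False, 404)
```

With the placeholder enabled, the first check reports no file and the second one finds the 0-size file, which matches the srmLs and HEAD log lines quoted earlier.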
Hello Mihai,
sorry for the late reply. As I explained in a previous comment, StoRM creates a 0-size file with the proper ACL in order to let GridFTP overwrite it with the right permissions. In fact, each user is mapped, via their own DN, to an account which must be able to write.
Unfortunately, we cannot move directly from srm+gsiftp --> srm+https to srm+https --> srm+https (other sites could still use srm+gsiftp).
So, if the gfal double check (srmLs + HEAD) cannot be disabled, a possible solution is for StoRM to skip the 0-size file creation via SRM only when GridFTP is not enabled on the destination, because otherwise the srm+gsiftp --> srm+gsiftp TPCs would fail.
Cheers,
Andrea