cern-fts / gfal2

Multi-protocol data management library

Home Page: https://dmc-docs.web.cern.ch/dmc-docs/

License: Other

CMake 8.62% Shell 0.47% Dockerfile 0.18% Makefile 0.18% C 49.00% C++ 40.87% Python 0.67%
http xrootd client srm gridftp ftp dcap rfio lfc fts

gfal2's People

Contributors

aangelog, adevress, andrea-manzi, ashleylst, ayllon, bbockelm, chrisburr, dynamic-entropy, eddambik, edwardkaravakis, ellert, gbitzes, glpatcern, joaopblopes, mpatrascoiu, musicinmybrain, okeeble, shamrocklee, simonmichal


gfal2's Issues

gfal2 behaves differently between "https" and "davs"

Dear all,

at INFN-T1 we are experiencing the following behaviour with gfal2 2.21.4.
When managing a StoRM WebDAV storage element in write-only mode, we noticed that gfal2 behaves differently with the https protocol than with the davs one.

With the davs protocol, local file uploads work as expected, for instance:
gfal-copy check.sh davs://xs-606.cr.cnaf.infn.it:8443/muone-tape/test-andrea-2607

131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:12:41.874Z "b23d9aaf-0fa6-43ea-b9de-6e16feb6276d" "PROPFIND /muone-tape/test-andrea-2607 HTTP/1.1" 404 79 12
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:12:41.896Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 4
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:12:41.936Z "e362adbe-ef78-47c9-9d3b-a605407a18b6" "PROPFIND /muone-tape/test-andrea-2607 HTTP/1.1" 404 79 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:12:41.954Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 3
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:12:41.989Z "6ce0a01c-51e8-44e0-b6c5-48ebdb904942" "PROPFIND /muone-tape/test-andrea-2607 HTTP/1.1" 404 79 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:12:42.017Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 3
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:12:42.061Z "a2d092c7-617b-4f46-9f66-dea838921366" "PUT /muone-tape/test-andrea-2607 HTTP/1.1" 201 0 4 

Whereas with the https protocol the uploads fail:

131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.044Z "77b293b9-9570-4bbf-8b70-d2ef6984c332" "PROPFIND /muone-tape/test-andrea-2607 HTTP/1.1" 404 79 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.061Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 3
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.085Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 3
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.107Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.144Z "fad89005-86ba-4b49-8e90-a2b3677c8f9e" "PROPFIND /muone-tape/test-andrea-2607 HTTP/1.1" 404 79 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.156Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.178Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.200Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 3
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.226Z "0e464398-bf91-4df7-b75a-f3d4076ad3b5" "PROPFIND /muone-tape/test-andrea-2607 HTTP/1.1" 404 79 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.239Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 3
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.267Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 2
131.154.193.196 8443 "bddf1bbc-474f-48d5-936e-6e18729c4427@https://iam-t1-computing.cloud.cnaf.infn.it/" 2023-07-26T11:13:58.292Z "-" "HEAD /muone-tape/test-andrea-2607 HTTP/1.1" 403 0 2 

As you can see from the logs above, it is correct that the HEAD requests fail; however, with the davs protocol gfal2 performs a PUT request anyway and the copy succeeds.
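To compare the two traces at a glance, they can be reduced to their (method, status) sequences with a small stdlib sketch; the regex assumes the '"METHOD /path HTTP/x.y" status' field layout shown in the log samples above:

```python
import re

# Assumes the access-log shape shown above: the request is the only quoted
# field that starts with an uppercase HTTP method, followed by the status.
REQUEST_RE = re.compile(r'"([A-Z]+) [^"]+ HTTP/[\d.]+" (\d{3})')

def method_status(log_lines):
    """Extract (HTTP method, status code) pairs from access-log lines."""
    pairs = []
    for line in log_lines:
        m = REQUEST_RE.search(line)
        if m:
            pairs.append((m.group(1), int(m.group(2))))
    return pairs
```

Run over both traces, this makes the difference obvious: the davs trace ends in ("PUT", 201), while the https trace contains only PROPFIND 404 and HEAD 403 entries.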

Is this expected?
How does the gfal workflow change according to the protocol?

Thank you,
cheers,
Andrea

IPv6 flag does not have an effect with xrootd as protocol

Hi,

it seems that with this gfal-copy version [1] the IP-version switch has no effect.

We are currently debugging a routing problem and have noticed that, even when explicitly requiring IPv4, the IPv6 route is used.
E.g.,

> gfal-copy -f -4 -vv root://dcache-door-cms20.desy.de:1094/store/foo/baz  /tmp/hartmath.1 

Resolving the A/AAAA records explicitly and passing the literal address uses the expected protocol, i.e.,

> gfal-copy -f  -vv root://131.169.191.216:1094/store/foo/baz  /tmp/hartmath.1
> gfal-copy -f  -vv root://[2001:638:700:10bf::1:d8]:1094/store/foo/baz  /tmp/hartmath.1

The problem appears to be similar with the xrd-tools (v4.11.3), e.g.,
> xrdcp --debug 3 root://[2001:638:700:10bf::1:d8]:1094///store/foo/baz /tmp/hartmath.1
(where the xrd-tools seem to have no IP-version switch at all)
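For reference, what a working "-4" switch is expected to do underneath is restrict name resolution to A records; a minimal stdlib sketch (host and port here are illustrative):

```python
import socket

def resolve(host, port=1094, force_ipv4=False):
    """Return the unique addresses a host resolves to, optionally IPv4-only.

    Passing AF_INET to getaddrinfo is the standard way to consider only
    A records, i.e. what '-4' should enforce for the transfer socket.
    """
    family = socket.AF_INET if force_ipv4 else socket.AF_UNSPEC
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})
```

With force_ipv4=True, no IPv6 address can appear in the result, which is the behaviour the "-4" flag fails to propagate to the xrootd plugin.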

Cheers,
Thomas

[1]

gfal-copy --version
gfal2-util version 1.5.3 (gfal2 2.17.2)
dcap-2.17.2
file-2.17.2
gridftp-2.17.2
http-2.17.2
lfc-2.17.2
rfio-2.17.2
sftp-2.17.2
srm-2.17.2
xrootd-2.17.2
from /cvmfs/grid.cern.ch/centos7-umd4-ui-4_200423/usr/bin/gfal-copy

gfal2 availability in Homebrew

On macOS, gfal2 and the other related packages should be installable via the procedure shown here: https://github.com/cern-fts/homebrew-dmc

Unfortunately, the whole install process fails due to a missing formula for globus-toolkit in Homebrew.
The legacy formula for the Globus Toolkit still points to the old Globus website and repository, which now return 404.

Would it be possible to fix this install process to allow gfal2 usage on macOS?
This is badly needed for tools like Rucio, which are becoming widely used.
I am available to handle this task.

incomplete data displayed for root protocol by gfal-ls

When querying a cluster via gfal-ls, the expected output is the combined content of all servers, as produced by xrdfs:

mebert@heplw65:~$ xrdfs   root://rdc-redirector.belle.uvic.ca:1093 ls /TMP/belle/test   
/TMP/belle/test/100MBfile
/TMP/belle/test/1MBfile
/TMP/belle/test/250kBfile
/TMP/belle/test/2MBfile
/TMP/belle/test/312kBfile
/TMP/belle/test/328kBfile
/TMP/belle/test/3MBfile
/TMP/belle/test/4MBfile
/TMP/belle/test/5MBfile
/TMP/belle/test/DC
/TMP/belle/test/DC24
/TMP/belle/test/TP
/TMP/belle/test/TPC
/TMP/belle/test/aaa
/TMP/belle/test/bbb
/TMP/belle/test/hRaw
/TMP/belle/test/kekcc
/TMP/belle/test/test001
/TMP/belle/test/test002

However, gfal-ls does not combine the output of the multiple servers, and from the xrootd logs it looks like the query it sends is also different from what xrdfs sends. There are currently two servers behind the redirector, and gfal-ls only displays the content of a single server each time it is issued:

mebert@heplw65:~$ gfal-ls  root://rdc-redirector.belle.uvic.ca:1093//TMP/belle/test
kekcc
250kBfile
TPC
TP
5MBfile
328kBfile
100MBfile
4MBfile
3MBfile
2MBfile
1MBfile

mebert@heplw65:~$ gfal-ls  root://rdc-redirector.belle.uvic.ca:1093//TMP/belle/test
TPC
aaa
hRaw
test002
kekcc
DC24
test001
312kBfile
DC
bbb

Could this behaviour please be changed to be consistent with xrdfs?
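Until then, a client-side workaround is to list each data server directly (which, per the report above, does return complete results) and merge the listings; a trivial sketch, with the per-server lists standing in for gfal-ls output:

```python
def merged_listing(per_server_listings):
    """Deduplicated, sorted union of per-server directory listings.

    This reproduces what xrdfs shows for the redirector: the union of
    what every data server behind it holds.
    """
    merged = set()
    for listing in per_server_listings:
        merged.update(listing)
    return sorted(merged)
```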

dependencies not fully resolved for EL9/EPEL packages

Dear gfal2 developers,

on EL9 (Alma 9.3 flavoured) a dependency on the Python packages seems to be missing.

Installing from EPEL

  • gfal2.x86_64 2.22.1-1.el9
  • gfal2-all-2.22.1-1.el9.x86_64
  • ...

does not pull in the Python packages, resulting in the tools failing with:

Traceback (most recent call last):
  File "/usr/bin/gfal-ls", line 30, in <module>
    from gfal2_util.shell import Gfal2Shell
ModuleNotFoundError: No module named 'gfal2_util'

After resolving the dependency manually by installing python3-gfal2 and python3-gfal2-util, the tools work as expected.

Is the dependency on the Python packages perhaps missing from the gfal2 packaging?
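A quick stdlib diagnostic to confirm the situation on an affected host (module names taken from the traceback above; everything else here is an illustrative sketch):

```python
import importlib.util

def missing_modules(names):
    """Return the module names the current interpreter cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# e.g. missing_modules(["gfal2", "gfal2_util"]) lists whichever of the
# Python bindings / CLI helpers are absent, without raising ImportError.
```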

Cheers,
Thomas

Segmentation faults in gfal-copy's GridFTP plugin on RHEL8

(moved from cern-fts/gfal2-util)

Hello GFAL developers,

we've installed a new VM with RHEL 8 to run our regular transfer tests using the gfal2 tools. They used to run just fine on SL 7, but now our gfal-copy commands randomly fail with segmentation faults, exclusively with the gsiftp transfer protocol.

For some reason that I'm unable to figure out, no core dump files are produced, even though that should be possible (I can generate one myself).

$ grep systemd-coredump /var/log/messages | tail
Dec 19 23:16:02 gm-1-kit-e systemd-coredump[3646763]: Process 3646718 (python3) of user 1000 dumped core.
Dec 19 23:16:02 gm-1-kit-e systemd[1]: [email protected]: Succeeded.
Dec 20 01:16:02 gm-1-kit-e systemd-coredump[3658820]: Resource limits disable core dumping for process 3658734 (python3).
Dec 20 01:16:02 gm-1-kit-e systemd-coredump[3658820]: Process 3658734 (python3) of user 1000 dumped core.
Dec 20 01:16:02 gm-1-kit-e systemd[1]: [email protected]: Succeeded.
Dec 20 08:31:02 gm-1-kit-e systemd-coredump[3702089]: Resource limits disable core dumping for process 3702066 (python3).
Dec 20 08:31:02 gm-1-kit-e systemd-coredump[3702089]: Process 3702066 (python3) of user 1000 dumped core.
Dec 20 08:31:02 gm-1-kit-e systemd[1]: [email protected]: Succeeded.
Dec 20 08:39:36 gm-1-kit-e systemd-coredump[3703091]: Process 3703089 (sleep) of user 1000 dumped core.
Stack trace of thread 3703089:
#0  0x00007fe5fe35d8e8 __nanosleep (libc.so.6)
#1  0x0000564612d25b47 rpl_nanosleep (sleep)
#2  0x0000564612d25920 xnanosleep (sleep)
#3  0x0000564612d22a88 main (sleep)
#4  0x00007fe5fe29ed85 __libc_start_main (libc.so.6)
#5  0x0000564612d22b5e _start (sleep)
Dec 20 08:39:36 gm-1-kit-e systemd[1]: [email protected]: Succeeded.

Other information you might find useful...

$ python3 --version
Python 3.6.8

$ rpm -qa gfal2*
gfal2-plugin-file-2.21.5-1.el8.x86_64
gfal2-plugin-gridftp-2.21.5-1.el8.x86_64
gfal2-2.21.5-1.el8.x86_64
gfal2-plugin-http-2.21.5-1.el8.x86_64
gfal2-plugin-srm-2.21.5-1.el8.x86_64
gfal2-plugin-xrootd-2.21.5-1.el8.x86_64
gfal2-util-scripts-1.8.0-1.el8.noarch

$ uname -a
Linux gm-1-kit-e 4.18.0-513.5.1.el8_9.x86_64 #1 SMP Fri Sep 29 05:21:10 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

Do you have hints on what else we should check to gather more information for you? If you know how to ensure core dumps are actually produced, that would be worth sharing.
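The "Resource limits disable core dumping" lines in the journal above point at RLIMIT_CORE; a sketch of a workaround is to lift that limit in the shell that launches the transfer tests:

```shell
# Lift the core-size limit so systemd-coredump can capture the crash;
# child processes (python3, gfal-copy) inherit it.
ulimit -c unlimited
ulimit -c             # verify: prints "unlimited"
# The captured dumps can then be inspected with:
#   coredumpctl list
#   coredumpctl info <PID>
```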

Best regards,
Xavier.

gfal-copy and xrootd: fchmod error

Dear all,

I am running

gfal-copy --version
gfal2-util version 1.7.1 (gfal2 2.21.0)
	dcap-2.21.0
	file-2.21.0
	gridftp-2.21.0
	http-2.21.0
	lfc-2.21.0
	mock-2.21.0
	rfio-2.21.0
	sftp-2.21.0
	srm-2.21.0
	xrootd-2.21.0

And trying to copy a file to a storage endpoint via root://.

  • gfal-copy --just-copy file:///usr/bin/bash davs://xrootd.phy.bris.ac.uk:1094/xrootd/ops/test_file_xrootd_ops2.out  WORKS (HTTP)
  • xrdcp file:///usr/bin/bash root://xrootd.phy.bris.ac.uk:1094//xrootd/ops/test_file_xrootd_ops2.out  WORKS (root:// via xrdcp)
  • gfal-copy --just-copy -v file:///usr/bin/bash root://xrootd.phy.bris.ac.uk:1094//xrootd/ops/test_file_xrootd_ops2.out  FAILS (root:// via gfal-copy):
Copying 0 bytes file:///usr/bin/bash => root://xrootd.phy.bris.ac.uk:1094//xrootd/ops/test_file_xrootd_ops2.out
event: [1662129567321] BOTH   GFAL2:CORE:COPY	LIST:ENTER	
event: [1662129567322] BOTH   GFAL2:CORE:COPY	LIST:ITEM	file:///usr/bin/bash => root://xrootd.phy.bris.ac.uk:1094//xrootd/ops/test_file_xrootd_ops2.out
event: [1662129567322] BOTH   GFAL2:CORE:COPY	LIST:EXIT	
event: [1662129567324] BOTH   xroot	TRANSFER:ENTER	file://localhost///usr/bin/bash?xrd.gsiusrpxy=/tmp/x509up_u31423&xrdcl.intent=tpc => root://xrootd.phy.bris.ac.uk:1094///xrootd/ops/test_file_xrootd_ops2.out?xrd.gsiusrpxy=/tmp/x509up_u31423&xrdcl.intent=tpc
event: [1662129567324] BOTH   xroot	TRANSFER:TYPE	streamed
event: [1662129569751] BOTH   xroot	TRANSFER:EXIT	Job finished, [ERROR] Server responded with an error: [3016] Unable to fchmod /xrootd/ops/test_file_xrootd_ops2.out; is a directory (destination)

gfal-copy error: 21 (Is a directory) - Error on XrdCl::CopyProcess::Run(): [ERROR] Server responded with an error: [3016] Unable to fchmod /xrootd/ops/test_file_xrootd_ops2.out; is a directory (destination)

Since the gfal2 http plugin does not do any chmod but the xrootd one does, could the issue be there?

A nix derivation to build gfal2

I noticed that a nix derivation was not yet available, so I created one (see below).

  • It is quite minimal, but most things seem to work for me
  • plugins for which no nix package is available are disabled
  • I have issues with gfal2 + xrootd + proxy certificate, which gives the following error:
    gfal-ls error: 52 (Invalid exchange) - [gfal2_stat][gfal_plugin_statG][gfal_xrootd_statG] Failed to stat file (Invalid exchange)
    

I hope this derivation can be of use to others as well; if anybody wants to try and merge it into nixpkgs, feel free to do so.

{ lib
, stdenv
, fetchFromGitHub
, cmake
, pkg-config
, glib
, dcap
, pcre2
, json_c
, doxygen
, pugixml
, davix-copy
, cryptopp
, libssh2
, xrootd
, gtest
, openldap
, libuuid
}:


stdenv.mkDerivation rec {
  pname = "gfal2";
  version = "2.22.0";

  src = fetchFromGitHub {
    owner = "cern-fts";
    repo = pname;
    rev = "v${version}";
    sha256 = "198ql2qa9f8x77x6mhx0g004kyimvf45gy5ikwcx1v1p20avfs33";
  };

  nativeBuildInputs = [
    cmake
    pkg-config
    doxygen
  ];

  buildInputs = [
    glib
    pcre2.dev
    json_c.dev
    pugixml
    cryptopp.dev
    gtest.dev
    openldap.dev
    libuuid.dev
    xrootd.dev
    davix-copy
    libssh2.dev
    dcap
  ];


  cmakeFlags = [
    "-DPLUGIN_LFC=FALSE"
    "-DPLUGIN_SRM=FALSE"
    "-DPLUGIN_RFIO=FALSE"
    "-DPLUGIN_GRIDFTP=FALSE"
    "-DCRYPTOPP_LOCATION=${cryptopp.dev}/include/cryptopp"
    "-DXROOTD_LOCATION=${xrootd.dev}/include/xrootd"
    "-DGTEST_LOCATION=${gtest.dev}/include/gtest"
  ];


  meta = with lib; {
    description = "Multi-protocol data management library";
    homepage = "https://github.com/cern-fts/gfal2";
    maintainers = [ ];
    license = licenses.asl20;
  };
}

Logging color not resetting after INFO line

When running gfal2 with the flag -vvv, the INFO messages leave the shell output color stuck.
I guess adding "\033[0m" at the end of every logging line should do the trick, but I am afraid this is more of a glib2 bug...

As a reference: [screenshot of terminal output with the stuck color]
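The proposed fix can be sketched in a few lines: terminate every colored log line with the ANSI reset sequence so the color cannot leak into subsequent shell output.

```python
RESET = "\033[0m"  # ANSI "reset all attributes" escape sequence

def with_reset(line):
    """Append the ANSI reset code unless the line already ends with it."""
    return line if line.endswith(RESET) else line + RESET

# A colored INFO line then always restores the terminal's default color,
# whatever color code the logger prefixed it with.
```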

gfal-ls won't display directory contents for a distributed xrootd system

We have a distributed xrootd system with one data server and one redirector. When we list the contents of a directory with xrdfs, we can see the contents; but if we use gfal-ls, the command returns 0 without displaying the directory contents, so this seems to be an issue with gfal-ls. If, instead of querying the redirector, I query the data server directly with gfal-ls, I can see the contents.

Looking into the xrootd logs, it seems that in the cases where I do see the contents (xrdfs, or gfal-ls directly against the data server) the stat request gets interpreted as a dirlist request, and so the contents are listed.
Command issued: xrdfs xroot://myredirector:1094//; ls rdc/

220406 14:38:58 9413 XrootdXeq: mfens98.7444:[email protected] pub IPv4 login as 830f7767.0
220406 14:38:58 9413 mfens98.7444:[email protected] Xrootd_Protocol: 0100 req=stat dlen=4
220406 14:38:58 9413 mfens98.7444:[email protected] Xrootd_Protocol: 0100 rc=0 stat /rdc
220406 14:38:58 9444 mfens98.7444:[email protected] Xrootd_Protocol: 0000 more auth requested; sz=3162
220406 14:38:58 9444 mfens98.7444:[email protected] Xrootd_Response: 0000 sending OK
...
220406 14:38:58 9444 XrootdXeq: mfens98.7444:[email protected] pub IPv4 login as 830f7767.0
220406 14:38:58 9444 mfens98.7444:[email protected] Xrootd_Protocol: 0100 req=dirlist dlen=4
220406 14:38:58 9444 mfens98.7444:[email protected] Xrootd_Response: 0100 sending 15 data bytes
220406 14:38:58 9444 mfens98.7444:[email protected] Xrootd_Protocol: 0100 dirlist entries=2 path=/rdc

The log looks the same when I run gfal-ls against the redirector, except that the request stays req=stat and never becomes req=dirlist.

This seems like a gfal2 issue, since xrdfs works fine, but it could also be an issue with my xrootd setup. What is going on when I run gfal-ls? How is it different when I query the redirector vs. when I query the data server directly?

gfal-copy transfers fail from srm+gsiftp to srm+https

Dear all,

at INFN-T1 we are experiencing the following behaviour with gfal2 2.21.4.
When issuing a third-party gfal-copy from srm+gsiftp to srm+https, a strange error occurs and the transfer fails, for example:

gfal-copy -vvv srm://storm-test.cr.cnaf.infn.it:8444/folder/test-andre srm://storm-test.cr.cnaf.infn.it:8444/disk/test-andre
[...]
WARNING  Transfer failed with: [srm_do_transfer][gfalt_copy_file][perform_copy][perform_local_copy][streamed_copy][gfal_plugin_writeG][davix2gliberr] Impossible to write, no buffer. (file was opened only for reading?)

After a deep investigation, we have figured out that the problem is probably related to the creation of an https TURL triggered by the srm request. Apparently, with an https TURL like this one, gfal2 tries to perform the transfer by issuing only HEAD and POST requests.

131.154.161.121 8443 "DC=org,DC=terena,DC=tcs,C=IT,O=Istituto Nazionale di Fisica Nucleare,CN=Andrea Rendina [email protected]" 2023-07-12T12:44:35.915Z "d914e441-b1eb-4675-a196-996a8f9243b6" "POST /disk/test-andre HTTP/1.1" 200 959 29
131.154.161.121 8443 "DC=org,DC=terena,DC=tcs,C=IT,O=Istituto Nazionale di Fisica Nucleare,CN=Andrea Rendina [email protected]" 2023-07-12T12:44:36.142Z "996d9883-0fb6-4c26-8036-8aa81ed89592" "HEAD /disk/test-andre HTTP/1.1" 200 0 133

Indeed, when we try to perform the same transfer using the davs/https protocol as destination,

gfal-copy -vvv srm://storm-test.cr.cnaf.infn.it:8444/folder/test-andre davs://transfer-test.cr.cnaf.infn.it:8443/disk/test-andre

gfal2 also issues a PUT request and the transfer succeeds.

We have noticed the same thing directly using the protocols gsiftp -> davs/https without involving srm:

gfal-copy -v gsiftp://transfer-test.cr.cnaf.infn.it:2811//storage/gemss_test1/dteam/folder/test-andre davs://transfer-test.cr.cnaf.infn.it:8443/disk/test-andre

Why is the PUT command not performed by gfal2 in an srm+gsiftp -> srm+https transfer?
Is this expected?

For reference, we are investigating this issue also with the StoRM developers:
https://issues.infn.it/jira/browse/STOR-1569

Please correct me if I am wrong or have made any mistake.
Thank you very much for your help!

Andrea

Filesize mismatch for root transfers from XrootD to dCache site

Hi,
I had opened an issue in XRootD (xrootd/xrootd#1454) on observed failures in TPC root transfers between an XRootD site (running 5.3.0) and dCache sites. The issue was first seen in the DOMA TPC tests (on Kibana), but can be reproduced on lxplus, e.g.:

gfal-copy -vvv --copy-mode=pull root://ceph-gw8.gridpp.rl.ac.uk:1094/dteam:test1/domatest/jwalder/HTTP_1GB root://dcache-se-doma.desy.de:1094/dteam/tpctest/jwtest1GB

In the DOMA tests, at least, it seemed specific to xrootd-to-dcache transfers, but I could not guess why.

monitor: root://ceph-gw8.gridpp.rl.ac.uk:1094///dteam:test1/domatest/jwalder/HTTP_1GB?xrd.gsiusrpxy=/tmp/x509up_u28239&xrdcl.intent=tpc root://dcache-se-doma.desy.de:1094///dteam/tpctest/jwtest1GB?xrd.gsiusrpxy=/tmp/x509up_u28239&xrdcl.intent=tpc 8353361 8353361 994050048 119
monitor: root://ceph-gw8.gridpp.rl.ac.uk:1094///dteam:test1/domatest/jwalder/HTTP_1GB?xrd.gsiusrpxy=/tmp/x509up_u28239&xrdcl.intent=tpc root://dcache-se-doma.desy.de:1094///dteam/tpctest/jwtest1GB?xrd.gsiusrpxy=/tmp/x509up_u28239&xrdcl.intent=tpc 8216710 8216710 1002438656 122
event: [1627637292603] BOTH   xroot	TRANSFER:EXIT	Job finished, [ERROR] Server responded with an error: [3019] File size mismatch (expected=1000000000, actual=1002438656) (destination)

INFO     Event triggered: BOTH xroot TRANSFER:EXIT Job finished, [ERROR] Server responded with an error: [3019] File size mismatch (expected=1000000000, actual=1002438656) (destination)

DEBUG    Xrootd Query URI: xrd.gsiusrpxy=/tmp/x509up_u28239
INFO     Destination file removed
DEBUG     <- Gfal::Transfer::FileCopy
gfal-copy error: 33 (Numerical argument out of domain) - [gfalt_copy_file][perform_copy][gfal_xrootd_3rd_copy][gfal_xrootd_3rd_copy_bulk] Error on XrdCl::CopyProcess::Run(): [ERROR] Server responded with an error: [3019] File size mismatch (expected=1000000000, actual=1002438656) (destination)

From the XRootD devs and the logs, it is suggested that something is perhaps happening in gfal2 that misreports (or misreads?) the final size.
More information is available in the GitHub issue mentioned above.
On lxplus the gfal2 client is:

gfal-copy -V
gfal2-util version 1.6.0 (gfal2 2.19.2)
	dcap-2.19.2
	file-2.19.2
	gridftp-2.19.2
	http-2.19.2
	lfc-2.19.2
	rfio-2.19.2
	sftp-2.19.2
	srm-2.19.2
	xrootd-2.19.2

Please let me know if more info is helpful.
Thanks,
James
