dCache Nearline Storage Driver for CTA

This is a nearline storage plugin for dCache.

To compile the plugin, run:

mvn package

This produces a tarball in the target directory containing the plugin.

Using the plugin with dCache

To use this plugin with dCache, place the directory containing this file in /usr/local/share/dcache/plugins/ on a dCache pool. Restart the pool to load the plugin.

To verify that the plugin is loaded, navigate to the pool in the dCache admin shell and issue the command:

hsm show providers

The plugin should be listed as dcache-cta.

To activate the plugin, create an HSM instance using:

hsm create osm name dcache-cta [-key=value]...

Example:

hsm create osm cta dcache-cta -cta-user=userA \
     -cta-group=groupA -cta-instance-name=instanceA \
     -cta-frontend-addr=cta-frontend-host:17017 \
     -cta-use-tls=true \
     -io-endpoint=a.b.c.d -io-port=1094

dCache files stored in CTA have an HSM URI of the form:

<hsmType>://<hsmName>/<pnfsid>?archiveid=<id>

where <id> is the CTA-internal archive ID.

For example:

osm://cta/00001D43C0C086CA459298C634D67F68AB6B?archiveid=8402
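
As an illustration of the URI layout, the following minimal Python sketch splits such a URI into its parts using only the standard library. The function name `parse_hsm_uri` and the returned dictionary keys are hypothetical and are not part of the plugin.

```python
from urllib.parse import urlsplit, parse_qs


def parse_hsm_uri(uri):
    """Split an HSM URI of the form <hsmType>://<hsmName>/<pnfsid>?archiveid=<id>."""
    parts = urlsplit(uri)
    query = parse_qs(parts.query)
    return {
        "hsm_type": parts.scheme,            # e.g. "osm"
        "hsm_name": parts.netloc,            # e.g. "cta"
        "pnfsid": parts.path.lstrip("/"),    # dCache pnfsid
        "archive_id": int(query["archiveid"][0]),  # CTA-internal archive ID
    }


info = parse_hsm_uri("osm://cta/00001D43C0C086CA459298C634D67F68AB6B?archiveid=8402")
print(info["pnfsid"], info["archive_id"])
```

The extracted pnfsid and archive ID are the two values needed to look the file up on the dCache side and on the CTA side, respectively.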

In the CTA catalog, dCache pnfsids are stored in the disk_file_id field:

db => select * from archive_file where disk_file_id = '00001D43C0C086CA459298C634D67F68AB6B';
-[ RECORD 1 ]-------+-------------------------------------
archive_file_id     | 8402
disk_instance_name  | instanceA
disk_file_id        | 00001D43C0C086CA459298C634D67F68AB6B
disk_file_uid       | 1
disk_file_gid       | 1
size_in_bytes       | 10482
checksum_blob       | \x0a08080112043309f498
checksum_adler32    | 2566129971
storage_class_id    | 81
creation_time       | 1635150624
reconciliation_time | 1635150624
is_deleted          | 0
collocation_hint    |

NOTE: The dCache-CTA driver does not preserve file ownership, so the disk_file_uid and disk_file_gid fields contain arbitrary values.
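
When comparing the catalog record with checksums reported elsewhere, it can help to convert the decimal checksum_adler32 value to hexadecimal. The short Python sketch below does this for the sample record above and also checks it against the last four bytes of checksum_blob. That the blob ends with the Adler-32 value in little-endian byte order is an observation from this single sample record, not a documented guarantee; the helper names are hypothetical.

```python
def adler32_hex(decimal_value):
    """Render a decimal Adler-32 value as an 8-digit hexadecimal string."""
    return format(decimal_value, "08x")


def blob_tail_as_int(blob_hex):
    """Interpret the last four bytes of a hex-encoded blob as a little-endian integer.

    Assumption: the checksum value sits at the end of the blob, as in the sample record.
    """
    raw = bytes.fromhex(blob_hex)
    return int.from_bytes(raw[-4:], "little")


checksum_hex = adler32_hex(2566129971)
tail = blob_tail_as_int("0a08080112043309f498")  # checksum_blob without the leading \x

print(checksum_hex)        # 98f40933
print(tail == 2566129971)  # True
```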

dCache pool configuration

As CTA has its own scheduler and flush/restore queue, dCache pools should be configured to submit as many requests as possible. Grouping and collecting requests on the pool side should therefore be disabled:

queue define class -expire=0 -pending=0 -total=0 -open <hsmType> *

The available configuration options:

Name                     | Required | Default  | Description
cta-instance-name        | yes      | -        | The dCache instance name configured in CTA
cta-frontend-addr        | yes      | -        | The CTA cta-dcache endpoint
cta-user                 | yes      | -        | The user associated with the dCache instance in CTA
cta-group                | yes      | -        | The group associated with the dCache instance in CTA
cta-ca-chain             | no       | -        | The path to the CA root chain for use with TLS
cta-use-tls              | no       | false    | A switch (true/false) to enable TLS for the CTA control connection
cta-frontend-timeout     | no       | 30       | How long, in seconds, dCache waits for the CTA frontend to reply
io-endpoint              | no       | hostname | The hostname or IP offered by dCache for IO by CTA
io-port                  | no       | -        | The TCP port offered by dCache for IO by CTA
restore-success-on-close | no       | true     | A switch to enable/disable success on close for restores (backward compatibility with CTA pre-5.11)
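
To show how these options map onto the admin command, here is a minimal Python sketch that assembles an `hsm create` line and rejects a configuration missing any of the four required options. The helper `build_hsm_create` is hypothetical and exists only for illustration; the plugin does not ship such a tool, and the host name and values are the placeholders used earlier in this document.

```python
# Options marked "yes" in the Required column of the table above.
REQUIRED = ("cta-instance-name", "cta-frontend-addr", "cta-user", "cta-group")


def build_hsm_create(hsm_type, hsm_name, options):
    """Return an 'hsm create' admin command for the dcache-cta provider."""
    missing = [key for key in REQUIRED if key not in options]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))
    args = " ".join(f"-{key}={value}" for key, value in options.items())
    return f"hsm create {hsm_type} {hsm_name} dcache-cta {args}"


command = build_hsm_create("osm", "cta", {
    "cta-instance-name": "instanceA",
    "cta-frontend-addr": "cta-frontend-host:17017",
    "cta-user": "userA",
    "cta-group": "groupA",
    "io-port": 1094,
})
print(command)
```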

Acknowledgements

This driver is based on and inspired by cta-communicator by Lea Morschel and sapphire by Svenja Meyer.

dcache-cta's People

Contributors

kofemann, svemeyer

dcache-cta's Issues

Provide performance metrics

To identify bottlenecks the following performance metrics are required:

  • Requests per second
  • Disk IO
  • Network performance
  • CTA gRPC request processing time

Plugin needs a persistent storage for pending requests

As CTA has its own job queue, if dCache is restarted after a request has been submitted we might receive an IO request and/or a status update for a request that is no longer in the dCache pool's flush/stage queue. Thus we need a persistent store for all in-flight requests.
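
The idea can be sketched as follows. This is an illustrative Python example using SQLite, not the plugin's (Java) implementation; the class name `PendingRequestStore`, the table layout, and the "flush"/"stage" labels are all assumptions made for the sketch.

```python
import os
import sqlite3
import tempfile


class PendingRequestStore:
    """Keeps in-flight requests on disk so they survive a restart."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS pending "
            "(pnfsid TEXT PRIMARY KEY, kind TEXT NOT NULL)"
        )
        self._db.commit()

    def add(self, pnfsid, kind):
        # Record the request before it is submitted to CTA.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO pending VALUES (?, ?)", (pnfsid, kind)
            )

    def remove(self, pnfsid):
        # Forget the request once CTA reports completion or failure.
        with self._db:
            self._db.execute("DELETE FROM pending WHERE pnfsid = ?", (pnfsid,))

    def all(self):
        return dict(self._db.execute("SELECT pnfsid, kind FROM pending"))

    def close(self):
        self._db.close()


# Simulate a restart: write, close, then reopen the same file.
path = os.path.join(tempfile.mkdtemp(), "pending.db")
store = PendingRequestStore(path)
store.add("00001D43C0C086CA459298C634D67F68AB6B", "flush")
store.close()

reopened = PendingRequestStore(path)
recovered = reopened.all()
print(recovered)
```

After a restart, the recovered entries would let the driver recognise IO requests and status updates that refer to requests submitted before the restart.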

Update gRPC channel to support lazy connection

The currently used ManagedChannel is connected at the configure stage, and the plugin fails to start if the connection can't be established. Moreover, reconnection on failure and error handling are missing.
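
The requested behaviour, connecting on first use and reconnecting after a failure, can be sketched generically. The Python example below does not use gRPC and is not the plugin's code; `LazyChannel`, `connect`, and the retry parameters are hypothetical names chosen for this illustration.

```python
import time


class LazyChannel:
    """Creates the underlying connection on first use and re-creates it after a failure."""

    def __init__(self, connect, max_attempts=3, retry_delay=0.0):
        self._connect = connect          # factory that returns a connected channel
        self._max_attempts = max_attempts
        self._retry_delay = retry_delay
        self._channel = None             # nothing is connected at construction time

    def call(self, request_fn):
        last_error = None
        for _ in range(self._max_attempts):
            try:
                if self._channel is None:
                    self._channel = self._connect()
                return request_fn(self._channel)
            except ConnectionError as error:
                last_error = error
                self._channel = None     # drop the broken channel and retry
                time.sleep(self._retry_delay)
        raise last_error


attempts = []


def flaky_connect():
    # Fails on the first attempt to imitate a frontend that is not yet reachable.
    attempts.append(1)
    if len(attempts) == 1:
        raise ConnectionError("frontend not reachable yet")
    return "channel"


channel = LazyChannel(flaky_connect)  # construction succeeds without a connection
result = channel.call(lambda ch: f"sent via {ch}")
print(result, len(attempts))
```

With this pattern, start-up no longer depends on the remote side being available; connection errors surface only when a request is actually made and are retried up to the configured limit.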

Handle is closed exception during flush

04 Jan 2022 11:36:58 [nioEventLoopGroup-3-8] [] Opening /osmcache/dcache/dcache-tpm103/data/0000F3B008DCC88C45028E0CDDA9A5ED004B for reading.
04 Jan 2022 11:37:26 [nioEventLoopGroup-3-8] [] xrootd server error while processing org.dcache.xrootd.protocol.messages.CloseRequest@362106c3 (please report this to [email protected])java.lang.IllegalStateException: Handle is closed
	at org.dcache.pool.repository.v5.ReadHandleImpl.getReplicaFile(ReadHandleImpl.java:102)
	at org.dcache.pool.nearline.NearlineStorageHandler$FlushRequestImpl.getReplicaUri(NearlineStorageHandler.java:969)
	at org.dcache.nearline.cta.ForwardingFlushRequest.getReplicaUri(ForwardingFlushRequest.java:26)
	at org.dcache.nearline.cta.xrootd.DataServerHandler.getFile(DataServerHandler.java:490)
	at org.dcache.nearline.cta.xrootd.DataServerHandler.doOnClose(DataServerHandler.java:333)
	at org.dcache.nearline.cta.xrootd.DataServerHandler.doOnClose(DataServerHandler.java:84)
	at org.dcache.xrootd.core.XrootdRequestHandler.requestReceived(XrootdRequestHandler.java:186)
	at org.dcache.xrootd.core.XrootdRequestHandler.channelRead(XrootdRequestHandler.java:99)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
	at org.dcache.xrootd.core.XrootdAuthenticationHandler.channelRead(XrootdAuthenticationHandler.java:216)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:829)
