neicnordic / endit
Efficient Northern dCache Interface to TSM
License: GNU General Public License v3.0
There are install instructions in both README and INSTALL, not necessarily synchronized.
Best way forward is likely to scrap INSTALL and collect all info in README.
It would be useful to have these two metrics, maybe something like:
# HELP endit_archiver_stored_bytes The number of bytes stored to tape by this ENDIT process.
# TYPE endit_archiver_stored_bytes counter
# HELP endit_retriever_restored_bytes The number of bytes restored from tape by this ENDIT process.
# TYPE endit_retriever_restored_bytes counter
There are corner cases where a site admin is stuck waiting for tsmarchiver/tsmretriever to perform actions (usually tests or error recovery), but the operation gets delayed by ENDIT applying the various configured timeouts/delays. The only options today are either to wait, or to edit endit.conf to lower the relevant timeouts/delays, restart the ENDIT daemons, let the operation complete, and then revert the settings and restart again.
Forcing a flush/recall in tsmarchiver/tsmretriever should be relatively simple to implement by adding a signal handler for a suitable signal, for example USR1, that sets a variable for the state machines to act on. This would let the admin simply run kill -USR1 daemonpid
to bypass the waiting and force immediate action.
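A minimal sketch of such a handler, assuming a main loop that sleeps between passes (variable and function names are illustrative, not taken from the actual ENDIT source):

# Hedged sketch: SIGUSR1 sets a flag that the wait loop checks.
my $force_run = 0;
$SIG{USR1} = sub { $force_run = 1 };

while (1) {
    # Sleep in one-second slices so the flag is noticed quickly.
    for (1 .. $conf{archiver_timeout}) {
        last if $force_run;
        sleep 1;
    }
    $force_run = 0;
    do_flush();    # hypothetical: kick off the dsmc archive run
}

With something like this in place, kill -USR1 daemonpid takes effect within a second instead of after the full configured delay.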
The in/ directory can have stray files from occasions when crashes/restarts have caused things not to clean up normally.
Implement a rudimentary cleanup in tsmretriever that simply stats all files and unlinks those more than a month old according to ctime.
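A rudimentary version could look like the following sketch (the in/ location and the $conf{dir} key are assumptions; adjust to the real config):

use File::Spec;

# Hedged sketch: unlink files in in/ whose ctime is older than ~30 days.
my $indir  = File::Spec->catdir($conf{dir}, 'in');    # assumed pool base dir key
my $cutoff = time() - 30 * 24 * 3600;

opendir(my $dh, $indir) or die "opendir $indir: $!";
while (my $f = readdir($dh)) {
    next if $f =~ /^\.\.?$/;
    my $path = File::Spec->catfile($indir, $f);
    my @st = stat($path) or next;    # file may vanish under normal operation
    next unless -f _;                # plain files only
    if ($st[10] < $cutoff) {         # $st[10] is ctime
        unlink($path) or warn "unlink $path: $!";
    }
}
closedir($dh);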
Currently the archiver spawns a single dsmc process, but it has the capability of varying the arguments used for that process, which is used to specify a different resourceutilization, which in turn translates to using a different number of drives concurrently.
This works OK in the trivial use case, but there are a number of corner cases where this is suboptimal:
- Having to wait for the running dsmc to finish before we are able to start using more drives.
- dsmc allocating a subset of the needed drives and then just sitting and waiting until enough drives are free before continuing. Not using the idle drive not only hurts us, but also other TSM server operations that could make use of that drive.
We need to refactor the archiver to spawn multiple dsmc processes, each with a unique subset of files to be archived; see the sketch below. Doing this would also enable us to better handle datasets, if support for that eventually shows up in dCache.
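A hedged sketch of the spawning side, partitioning the backlog across several dsmc processes (archiver_maxprocs and write_filelist are hypothetical names, and real code would track the children asynchronously rather than block):

# Split the pending files into N sets and run one dsmc per set.
sub spawn_archivers {
    my (@files) = @_;
    my $nproc = $conf{archiver_maxprocs} // 2;    # hypothetical setting
    my @sets;
    push @{ $sets[$_ % $nproc] }, $files[$_] for 0 .. $#files;

    my @pids;
    for my $set (@sets) {
        next unless $set && @$set;
        my $listfile = write_filelist($set);      # hypothetical helper
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            exec('dsmc', 'archive', "-filelist=$listfile", '-description=endit')
                or die "exec dsmc: $!";
        }
        push @pids, $pid;
    }
    waitpid($_, 0) for @pids;    # simplification: wait for all to finish
}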
Things to remember to cater for:
When server access is disabled the error message is:
ANS1355E Session rejected: Server disabled
The tsm* processes need to be aware of this error.
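For example, the retry logic could special-case that message instead of treating it as a generic failure (a sketch; the config key is hypothetical):

# Back off for longer when the TSM server is administratively disabled.
if ($dsmc_output =~ /ANS1355E/) {
    warn "TSM server disabled, delaying retry\n";
    sleep($conf{server_disabled_delay} // 600);
}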
Fairly recently, dsmc query backup/archive -detail gained the ability to list file locations.
Typical output:
Size Archive Date - Time File - Expires on - Description
---- ------------------- -------------------------------
72 191 B 2016-04-01 04:26:47 /grid/pool/out/00000000045ECF57483D99E8FFB9FEC78EF3 Never endit RetInit:STARTED ObjHeld:NO
Modified: 2016-04-01 01:53:38 Accessed: 2016-04-01 01:13:43 Inode changed: 2016-04-01 02:24:42
Compression Type: None Encryption Type: None Client-deduplicated: NO
Media Class: Library Volume ID: 724403 Restore Order: 00000000-00000002-00000000-02ABF65E
We should be able to use this as an alternative method for producing tape hint files; a parsing sketch follows.
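A hedged sketch of pulling out the interesting fields (the regexes are written against the sample above and would need verifying against real dsmc output):

# Emit "file volume restore-order" lines from dsmc query archive -detail.
open(my $q, '-|', 'dsmc', 'query', 'archive', '-detail', "$dir/*")
    or die "dsmc: $!";
my $file;
while (my $line = <$q>) {
    if ($line =~ m{^\s*[\d\s]+B\s+\S+\s+\S+\s+(\S+)\s}) {
        $file = $1;    # pathname column of the summary line
    }
    elsif ($line =~ /Volume ID:\s*(\d+).*Restore Order:\s*(\S+)/) {
        print "$file $1 $2\n" if defined $file;
    }
}
close($q);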
dsmc has a few corner case bugs where it can get stuck in a loop consuming 100% CPU and never recover.
Work around this by adding a CPU time limit to spawned dsmc processes; 48 CPU-hours or so should be enough to not trigger in normal use cases.
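One way is to set RLIMIT_CPU in the child between fork() and exec(), for example via BSD::Resource (a sketch; @dsmc_args stands in for the real argument handling):

use BSD::Resource qw(setrlimit RLIMIT_CPU);

my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    my $limit = 48 * 3600;    # 48 CPU-hours, in seconds
    setrlimit(RLIMIT_CPU, $limit, $limit) or die "setrlimit: $!";
    exec('dsmc', @dsmc_args) or die "exec dsmc: $!";
}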
Currently, config changes require a restart of the ENDIT daemons. In order to implement backoff mechanisms for sites where lots of queued reads result in starving writes, it would be beneficial to be able to dynamically reload parts of, or the entire, configuration.
Points to consider: which settings make sense to reload dynamically (for example retriever_maxworkers)? Perhaps only allow defining an override config file (typically somewhere in /run/) and just dynamically reload that? A minimal sketch follows.
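A minimal sketch of such an override reload, triggered by SIGHUP (the path and the key: value syntax are assumptions):

my $override = '/run/endit/override.conf';    # assumed location
my $reload   = 0;
$SIG{HUP} = sub { $reload = 1 };

# Merge override settings on top of the main config.
sub apply_override {
    return unless -r $override;
    open(my $fh, '<', $override) or return;
    while (<$fh>) {
        next if /^\s*(#|$)/;    # skip comments and blank lines
        $conf{$1} = $2 if /^\s*(\S+)\s*[:=]\s*(.*\S)/;
    }
    close($fh);
}

# In the main loop: if ($reload) { apply_override(); $reload = 0; }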
It would be nice if endit could be packaged, since that would make endit updates easier at a local site. The RPM installation path may differ, but if an RPM build script (or template) were provided, a local site admin could alter the script (e.g. changing the installation path) to build a customised RPM for the local site.
7a3d50a and 5c5596c add generation of Prometheus style *.prom files.
README needs to be updated with this, and the tmpfiles.d example from the runtime config change issue modified to match. It's likely counter-productive to document the metrics while they are a somewhat moving target, until we narrow down exactly what's useful.
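For reference, a tmpfiles.d entry for such a runtime directory could look like this (the path and ownership are assumptions, not taken from the repository):

# /etc/tmpfiles.d/endit.conf - runtime dir for *.prom files and overrides
d /run/endit 0755 endit endit -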
When running multiple read/write pools on the same machine it would be nice to be able to have just one installed version of endit with multiple config files. With the current version you need to install endit several times in different locations.
Prepare for centralized logging by adding short/long descriptions to the config; see the example below.
Short: printed in every log message, typical value the same as the dCache hsminstance.
Long: printed on startup (or daily?), might be used for descriptive text or for passing metadata to central logs (i.e. hsminstance=ops_tape_read tapesize=15T tapesallocated=700 tapespeedmbps=400 or similar).
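In endit.conf this could look something like the following (key names are hypothetical):

# Hypothetical endit.conf additions
desc-short: ops_tape_read
desc-long: hsminstance=ops_tape_read tapesize=15T tapesallocated=700 tapespeedmbps=400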
Older ENDIT versions seem to have been prone to archiving files multiple times when certain error conditions were triggered.
This should be due to old bugs, but we need to document how to detect whether this has happened and, most importantly, how to clean up afterwards.
Right now, if dCache flushes a file multiple times it will get multiple copies in the TSM archive, which leads to wasted space and slower retrieves.
Procedure is something like: detect the duplicate archive copies, then delete all but one copy of each affected file.
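As a starting point, the detection step could be sketched like this (names are illustrative, and the path regex is deliberately crude):

# Count how many times each pathname appears in dsmc query archive output.
my %seen;
open(my $q, '-|', 'dsmc', 'query', 'archive', "$dir/*") or die "dsmc: $!";
while (<$q>) {
    $seen{$1}++ if m{\s(/\S+)\s};    # crude: first absolute path on the line
}
close($q);
for my $f (sort keys %seen) {
    print "$f archived $seen{$f} times\n" if $seen{$f} > 1;
}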
If the ENDIT daemons are started in a directory that is subsequently deleted, store/retrieve operations will fail with errors similar to:
dsmc retrieve failure volume default file list /tapecache/requestlists/default.OQ_knf: child exited with value 8
STDERR: shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
The obvious workaround is to do chdir / on startup like proper daemons do.
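For example, early in daemon startup (standard daemon practice, not anything ENDIT-specific):

# Never hold a reference to a working directory that may be deleted.
chdir('/') or die "chdir /: $!";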
At a quick glance, the use of IPC::Run3 is motivated by answering A to a replace-question in tsmretriever.pl.
Investigate whether we should use -replace=All and/or -ifnewer instead.
Currently all retries of dsmc operations are done after a fixed retry period.
In cases where big operations fail (e.g. due to server malfunction, broken tape, etc.) this can result in logs filling up when using the default retry delay of 60 seconds.
Investigate whether we can do exponential backoff on "large" failures; these should be identifiable either from dsmc return codes/messages or simply by noticing that we are retrying the same files over and over. See the sketch below.
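A sketch of the backoff itself, doubling the delay on consecutive failures up to a cap (run_dsmc_operation is a hypothetical stand-in for the real retry site):

my $delay     = 60;      # default retry delay, seconds
my $max_delay = 3600;    # cap the backoff at one hour

while (1) {
    if (run_dsmc_operation()) {    # hypothetical: true on success
        $delay = 60;               # reset after a successful run
    } else {
        warn "operation failed, retrying in $delay seconds\n";
        sleep $delay;
        $delay = $delay * 2 > $max_delay ? $max_delay : $delay * 2;
    }
}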
Tape usage statistics will need some centralized logging of data to generate https://twiki.cern.ch/twiki/bin/view/HEPTape/TapeMetricsJSON
Things to include: hsminstance, total number of files, number of tapes
Each tape mount with: time, number of files, number of bytes
Errors (i.e. all local logging should also be sent remotely), for OoD enjoyment.
Idea: see if we can use dCache's log4j central logging, so we only have to implement a sender, not a collector.
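As an illustration only, a record built from the fields listed above might be structured like this before serialization (this is not the actual TapeMetricsJSON schema; values are placeholders):

my %report = (
    hsminstance => 'ops_tape_read',
    total_files => 12_345_678,    # placeholder
    tapes       => 700,           # placeholder
    mounts      => [
        # one entry per tape mount: time, number of files, number of bytes
        { time => '2024-01-01T00:00:00Z', files => 152, bytes => 73_014_444_032 },
    ],
);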
In the beginning of time, ENDIT archived files with a description of the form 'endit YYYY-MM'. This was, however, changed to just 'endit' due to performance limitations of the old TSM (v5 and earlier) server database.
Since v6, TSM uses DB2, so the old issue is moot and we should revert to the way we were doing this previously.
Having a somewhat unique description helps when needing to delete duplicates, and in general helps to quickly assess file age when viewing dsmc q archive output.
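Generating the description is a one-liner (POSIX::strftime is core Perl):

use POSIX qw(strftime);
my $description = strftime('endit %Y-%m', localtime);
# e.g. passed to dsmc as -description="$description"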
Currently we honor archiver_timeout also for retries when the amount is below archiver_threshold1_usage.
However, if set to a large value this might cause stores to fail, due to the retry happening so late that the store times out on the dCache level.
Suggestion: introduce a retry timeout (say 1 hour by default) and use the lowest of archiver_timeout and the retry timeout when retrying stores.
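The selection itself is trivial; a sketch with retrytimeout as a hypothetical new setting defaulting to one hour:

use List::Util qw(min);
my $effective_timeout = min($conf{archiver_timeout},
                            $conf{retrytimeout} // 3600);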