13amp's People

Contributors

jrandall, xophmeister

Forkers

violethaze74

13amp's Issues

Segfault when accessing FUSE context from threads

Accessing members of the FUSE context structure from within a thread raises a segmentation fault. This is most likely due to me doing something stupid, given my basic knowledge of pthreads!
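One possible cause, not confirmed in this issue: the libfuse headers describe the context as valid only for the duration of a filesystem operation, so a pointer obtained in a handler may not be usable from a thread the handler spawns. A minimal sketch of a workaround is to copy the needed fields by value in the handler and pass the copy to the thread. The `request_info` struct below is a hypothetical stand-in for the fields of interest, and `worker` and `run_worker` are illustrative names, not part of 13amp or libfuse.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the context fields the thread needs.
 * In a real handler these would be copied out of the FUSE context. */
struct request_info {
    uid_t uid;
    gid_t gid;
    void *private_data;
};

struct worker_args {
    struct request_info info; /* a copy, owned by the caller's stack frame */
    int result;
};

/* The thread touches only its own copy; it never asks FUSE for the context. */
static void *worker(void *arg)
{
    struct worker_args *a = arg;
    a->result = (a->info.private_data != NULL) ? (int)a->info.uid : -1;
    return NULL;
}

/* Called from the handler: snapshot the fields, run the thread, wait for it.
 * Joining before returning keeps `args` alive for the thread's lifetime. */
int run_worker(const struct request_info *from_handler)
{
    struct worker_args args = { .info = *from_handler, .result = 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, &args) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return args.result;
}
```

If the thread must outlive the handler, the copy would need to be heap-allocated and freed by the thread instead.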

Recursive mounting

If the source directory contains the mount directory, then bad things happen if you try to access the mount point from within the mount. Presumably FUSE gets stuck in an infinite loop; it bails out and crashes after a timeout.

Ideally, one would check to see if the source directories within the mount resolve to the mount directory. However, for this to work, you would need to know the mount point (i.e., to compare against). Unfortunately, there is no way to ascertain this from within FUSE -- as it's not necessarily static or unique (see [1]) -- and presumably no portable way to get it from within the host OS.

Unless I'm missing a trick, this is in wontfix territory!

[1] http://fuse.996288.n3.nabble.com/Getting-the-mountpoint-from-within-a-fusemod-td11620.html
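For illustration only, the comparison itself could look like the sketch below. It assumes the mount point is available as a path string, for example taken from the command line at startup, which is exactly the part this issue says cannot be relied on in general. `path_is_within` and `source_contains_mount` are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* True if canonical path `child` equals `parent` or lies beneath it.
 * Both arguments are expected to be absolute and already canonical. */
bool path_is_within(const char *parent, const char *child)
{
    size_t n = strlen(parent);

    if (n > 1 && parent[n - 1] == '/')
        n--; /* tolerate a trailing slash on the parent */
    if (strncmp(parent, child, n) != 0)
        return false;
    if (n == 1 && parent[0] == '/')
        return true; /* every absolute path is within the root */
    /* Require a component boundary so "/a/bc" is not within "/a/b". */
    return child[n] == '\0' || child[n] == '/';
}

/* Canonicalise both paths, then test whether the mount point sits
 * inside the source directory. Returns false if either cannot be resolved. */
bool source_contains_mount(const char *source, const char *mountpoint)
{
    char src[PATH_MAX];
    char mnt[PATH_MAX];

    if (realpath(source, src) == NULL || realpath(mountpoint, mnt) == NULL)
        return false;
    return path_is_within(src, mnt);
}
```

A check like this would only catch the simple case at startup; it would not see bind mounts or a mount point that changes afterwards.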

README-hacking missing / autogen.sh fails

Running ./autogen.sh in a fresh git checkout fails because bootstrap looks for the presence of README-hacking as a sign that it is a git checkout rather than a tarball distribution.

The easiest solution is to check in a README-hacking file. Alternatively, you could find another file that won't be present in the tar dist and modify the bootstrap script to check for that instead.

Listing a directory with a single large CRAM file takes 'forever'

$ 13amp /bam -S /cram
$ ls -l /bam

So far the 13amp process has consumed 82 minutes of CPU time (at 100%+ utilisation) but it has not returned anything for the directory listing yet.

13amp is running inside a docker container where '/cram' is a bind-mounted directory containing a single ~24GiB cram:

$ ls -l /lustre/scratch115/teams/hgi/users/jr17/pomak_merged_cram_50/*.cram
-rw-r--r-- 1 mercury hgi 25027902374 Aug 24 17:47 /lustre/scratch115/teams/hgi/users/jr17/pomak_merged_cram_50/14106455.CCXX.paired310.1898143677.cram

It appears this is simply an issue of HTSlib taking 3+ hours to fully stream through a 24GiB CRAM file. I suggest we have a "fast" directory listing mode that lies about the size of the virtual BAM files.

Break the read loop when chunk fetched

The read loop will consume the whole converted file, even if it has fetched the requested chunk. Break the loop (which will close the pipe, etc. and return) once the fetched size is equal to the requested size.
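A minimal sketch of such a loop, assuming the converted data arrives on a plain file descriptor (such as the read end of a pipe) and that the requested chunk is described by a size and a stream offset. `read_chunk` is a hypothetical helper, not the existing 13amp read function.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy up to `size` bytes, starting at stream position `offset`, from `fd`
 * into `buf`. Returns the number of bytes copied, or -1 on a read error.
 * The loop ends as soon as the chunk is complete, leaving the rest of the
 * stream unread. */
ssize_t read_chunk(int fd, char *buf, size_t size, off_t offset)
{
    char scratch[4096];
    off_t pos = 0;      /* bytes consumed from the stream so far */
    size_t fetched = 0; /* bytes copied into buf so far */

    while (fetched < size) {
        ssize_t n;

        if (pos < offset) {
            /* Still before the chunk: discard, but never read past `offset`. */
            off_t gap = offset - pos;
            size_t want = gap < (off_t)sizeof scratch ? (size_t)gap
                                                      : sizeof scratch;
            n = read(fd, scratch, want);
        } else {
            /* Inside the chunk: read straight into the caller's buffer. */
            n = read(fd, buf + fetched, size - fetched);
        }

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break; /* end of stream before the chunk was complete */

        if (pos >= offset)
            fetched += (size_t)n;
        pos += n;
    }
    return (ssize_t)fetched;
}
```

The caller would close the pipe after `read_chunk` returns, as described above. Skipping to the offset still costs a pass over the preceding data; the saving is only in not consuming what follows the chunk.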

Virtual BAM file size

The file size of the injected virtual BAM file must be accurate, so the OS can send the EOF signal when reading. Other FUSE conversion filesystems do this by either:

  1. Converting the file and saving it somewhere temporary, reporting the file size from there. This is not an option here, as it would negate the main point of saving disk space.
  2. Calculating (and caching) the file size on-the-fly, with the option of precalculating everything at start up. This has the disadvantage of being very slow on initial access, not to mention wasteful: given the scale, it probably wouldn't be workable to cache the converted BAM in memory as well.

We will have to implement option 2 in cramp_getattr. Precalculating may be far too expensive to be viable; even in getattr, it might just make everything too slow...
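The caching half of option 2 might be sketched as a small table keyed by source path and modification time, so that a changed CRAM file invalidates its entry. Everything here is an assumption for illustration: the names `size_cache_get` and `size_cache_put`, the fixed slot count and the round-robin eviction are not taken from 13amp, and the expensive size calculation itself is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

#define SIZE_CACHE_SLOTS 64
#define SIZE_CACHE_PATH_MAX 1024

struct size_entry {
    bool   used;
    char   path[SIZE_CACHE_PATH_MAX];
    time_t mtime; /* mtime of the source file when the size was computed */
    off_t  size;  /* computed size of the virtual file */
};

static struct size_entry size_cache[SIZE_CACHE_SLOTS];
static size_t size_cache_next; /* next slot to overwrite (round-robin) */

/* Look up a cached size. A hit requires both path and mtime to match. */
bool size_cache_get(const char *path, time_t mtime, off_t *size_out)
{
    for (size_t i = 0; i < SIZE_CACHE_SLOTS; i++) {
        const struct size_entry *e = &size_cache[i];
        if (e->used && e->mtime == mtime && strcmp(e->path, path) == 0) {
            *size_out = e->size;
            return true;
        }
    }
    return false;
}

/* Store a computed size, reusing the path's existing slot if it has one. */
void size_cache_put(const char *path, time_t mtime, off_t size)
{
    size_t slot = 0;
    bool found = false;

    if (strlen(path) >= SIZE_CACHE_PATH_MAX)
        return; /* too long to cache; the caller simply recomputes next time */

    for (size_t i = 0; i < SIZE_CACHE_SLOTS; i++) {
        if (size_cache[i].used && strcmp(size_cache[i].path, path) == 0) {
            slot = i;
            found = true;
            break;
        }
    }
    if (!found) {
        slot = size_cache_next;
        size_cache_next = (size_cache_next + 1) % SIZE_CACHE_SLOTS;
    }

    size_cache[slot].used = true;
    strcpy(size_cache[slot].path, path);
    size_cache[slot].mtime = mtime;
    size_cache[slot].size = size;
}
```

In cramp_getattr, the handler would call `size_cache_get` first and only run the full conversion on a miss, storing the result with `size_cache_put`. The table is not thread-safe as written; a multithreaded FUSE session would need a mutex around both functions.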
