mavetools's Introduction

mavetools

Useful functions for manipulating Multiplex Assay of Variant Effect datasets.

mavetools's People

Contributors

harmatt, afrubin, joemin

mavetools's Issues

Local MaveDB validation

MaveTools should implement the same MAVE-HGVS validation that MaveDB does, so that users can run it locally.

Generate static versions of the example notebooks for docs pages

Being able to serve documentation from the notebooks is really handy, but for the examples that expect to contact a development server this doesn't work as well. It's also less suitable to serve on the MaveDB website.

It would be good to run the example notebooks, capture the resulting output from a successful execution, and save these as .rst files from Jupyter directly to serve as static docs. The notebooks can be included elsewhere in the docs for users to have as a starting point.

Squelch validation error messages when running tests

When running the tests to make sure that expected validation failures are caught, the associated messages still get dumped to the console unnecessarily. We should adjust the way that the tests are run to prevent this.
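One possible approach, sketched here under the assumption that the validation messages are emitted through Python's logging module (the issue does not say how they are produced), is to wrap the failing call in unittest's assertLogs. That context manager swaps in a capturing handler for the duration of the block, so the message is checked rather than printed. The logger name and the validate_variant function are hypothetical stand-ins, not MaveTools code.

```python
import logging
import unittest

# Hypothetical logger name; the real project may use a different one.
logger = logging.getLogger("validation_demo")


def validate_variant(variant):
    """Hypothetical stand-in for a validator that logs before raising."""
    if not variant.startswith(("c.", "g.", "n.", "p.")):
        logger.error("invalid variant string: %s", variant)
        raise ValueError(variant)


class TestValidation(unittest.TestCase):
    def test_invalid_variant_is_rejected(self):
        # assertLogs captures records from this logger, so the expected
        # error message is not written to the console during the test.
        with self.assertLogs(logger, level="ERROR") as captured:
            with self.assertRaises(ValueError):
                validate_variant("not-a-variant")
        self.assertIn("invalid variant string", captured.output[0])
```

If the messages turn out to be written with print or directly to stderr, contextlib.redirect_stdout and contextlib.redirect_stderr would be the analogous tools.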

Utility function suggestions

Here are two suggestions for utility functions in MaveTools:

Infer target sequence from variant data

The function would work for both protein and nucleotide data. Input is a list of mavehgvs Variant objects.

The function would output an appropriate error or warning message if the variant data has a gap (e.g. if there are variants for positions 1-10 and 12-15 but not 11).

The function would also throw an error if there are conflicting target residues for the same position.

The function needs to handle target-identical and indel variants correctly, likely ignoring them in most cases (a possible exception being protein data with two adjacent residues).
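The behavior described above could look roughly like the following sketch. It is deliberately simplified: it takes plain strings rather than mavehgvs Variant objects, handles only single-nucleotide substitutions, and skips everything else (indels, target-identical variants, protein variants). The function name, the regular expression, the gap_char parameter, and the choice to warn rather than raise on a gap are all assumptions made for illustration.

```python
import re
import warnings

# Simplified, hypothetical pattern for substitutions such as "c.12A>G".
_SUB = re.compile(r"^[cgn]\.(\d+)([ACGT])>([ACGT])$")


def infer_target_sequence(variants, gap_char="N"):
    """Return (first_position, sequence) inferred from substitution strings."""
    observed = {}  # position -> target base
    for variant in variants:
        match = _SUB.match(variant)
        if match is None:
            continue  # ignore anything that is not a simple substitution
        pos, ref = int(match.group(1)), match.group(2)
        if observed.setdefault(pos, ref) != ref:
            raise ValueError(
                f"conflicting target residues at position {pos}: "
                f"{observed[pos]} and {ref}"
            )
    if not observed:
        raise ValueError("no substitution variants to infer a target from")
    first, last = min(observed), max(observed)
    missing = [p for p in range(first, last + 1) if p not in observed]
    if missing:
        # Positions with no variants are reported and filled with gap_char.
        warnings.warn(f"no variants at positions {missing}")
    sequence = "".join(observed.get(p, gap_char) for p in range(first, last + 1))
    return first, sequence
```

For example, infer_target_sequence(["c.1A>G", "c.2C>T", "c.3G>A"]) returns (1, "ACG"), while passing both "c.1A>G" and "c.1C>G" raises a ValueError because the two variants disagree about the target base at position 1.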

Split "matched" delins into substitutions

The code that converts codon changes to mavehgvs variants prefers to define single events, e.g. a deletion-insertion of two bases rather than two substitutions. Many users may prefer to look at these data as multiple substitutions instead.

This function would take delins variants that have matching deletion and insertion length and output the corresponding multi-variant.

For many variants this will require the user to provide the appropriate target sequence. To improve usability, the user should be able to provide a longer target sequence and an offset.

The function should throw an informative error if the delins is not matched or if the target sequence doesn't match (e.g. outside of length bounds, or would result in a target-identical change that suggests the wrong target sequence was provided).
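A minimal sketch of this idea for nucleotide variants follows. It works on plain strings rather than mavehgvs objects, and the function name, the regular expression, and the meaning of offset (the number of bases in the supplied target that precede variant position 1) are assumptions. It also assumes that only an unchanged first or last base signals a wrong target, since an unchanged interior base can occur legitimately when the first and third bases of a codon both change.

```python
import re

# Simplified, hypothetical pattern for strings such as "c.2_3delinsGT".
_DELINS = re.compile(r"^([cgn])\.(\d+)_(\d+)delins([ACGT]+)$")


def split_matched_delins(variant, target, offset=0):
    """Rewrite a length-matched delins as a multi-variant of substitutions."""
    match = _DELINS.match(variant)
    if match is None:
        raise ValueError(f"not a multi-base nucleotide delins: {variant!r}")
    prefix = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    inserted = match.group(4)
    if len(inserted) < 2 or end - start + 1 != len(inserted):
        raise ValueError(f"delins is not matched: {variant!r}")
    # Convert 1-based variant positions to a 0-based slice of the target.
    lo, hi = start - 1 + offset, end + offset
    if lo < 0 or hi > len(target):
        raise ValueError("variant positions fall outside the target sequence")
    deleted = target[lo:hi]
    if deleted[0] == inserted[0] or deleted[-1] == inserted[-1]:
        raise ValueError("target-identical change; wrong target sequence or offset?")
    subs = [
        f"{start + i}{ref}>{alt}"
        for i, (ref, alt) in enumerate(zip(deleted, inserted))
        if ref != alt  # unchanged interior bases are dropped
    ]
    return f"{prefix}.[{';'.join(subs)}]"
```

With the target "AACC", split_matched_delins("c.2_3delinsGT", "AACC") returns "c.[2A>G;3C>T]". Supplying a longer target such as "TTAACC" with offset=2 gives the same result.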

Remove redundant validation code

Lots of the validation code in the enrich2 module and elsewhere is better handled by the data frame validation provided by mavedb and the mavetools client.

The code in the enrich2 module should be slimmed down to ensure that the variants are correct and that the data integrity is preserved, and the full-dataframe validation should be removed or called from mavedb.

Intimidating errors when score set upload files are not found

When uploading a score set via the API, if the files are not found the client returns an intimidating, difficult-to-interpret error rather than telling the user that the file could not be opened.

The file should be checked explicitly and a sensible error returned before contacting the server.
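The explicit check could be as small as the sketch below, which uses only the standard library. The function name check_upload_files and the wording of the message are hypothetical; how such a check would be wired into the upload client is not specified in the issue.

```python
from pathlib import Path


def check_upload_files(*paths):
    """Return the paths as Path objects, or raise if any cannot be found."""
    resolved = [Path(p) for p in paths]
    missing = [str(p) for p in resolved if not p.is_file()]
    if missing:
        # Fail before any network request, naming every missing file.
        raise FileNotFoundError(
            "could not open score set file(s): " + ", ".join(missing)
        )
    return resolved
```

Calling this before building the request means the user sees a single message naming every missing file instead of an error raised deep inside the upload.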
