cebel / pyhgnc

A Python package to access and query data provided by HGNC-approved gene nomenclature, gene families and associated resources including links to genomic, proteomic and phenotypic information.

License: Apache License 2.0

pyhgnc's Introduction

PyHGNC is a Python package to access and query data provided by HGNC-approved gene nomenclature, gene families and associated resources including links to genomic, proteomic and phenotypic information.

Data are installed in a (local or remote) RDBMS, which gives bioinformatic algorithms very fast response times to sophisticated queries and high flexibility through the SQLAlchemy database layer.

PyHGNC is developed by the Department of Bioinformatics at the Fraunhofer Institute for Algorithms and Scientific Computing SCAI. For more information about PyHGNC, go to the documentation.

Entity relationship model

This development is supported by the following IMI projects: AETIONOMY and PHAGO.

Supported databases

PyHGNC uses SQLAlchemy to cover a wide spectrum of RDBMSs (relational database management systems). For best performance, MySQL or MariaDB is recommended. If you cannot install software on your system, SQLite, which needs no further installation, also works. The following RDBMSs are supported (by SQLAlchemy):

  1. Firebird
  2. Microsoft SQL Server
  3. MySQL / MariaDB
  4. Oracle
  5. PostgreSQL
  6. SQLite
  7. Sybase

Getting Started

This is a quick start tutorial for the impatient.

Installation

PyHGNC can be installed with pip.

pip install pyhgnc

If the installation fails because you lack the rights to install, use superuser (sudo on Linux before the command) or ...

pip install --user pyhgnc

If you want to make sure you are installing this under Python 3, use ...

python3 -m pip install pyhgnc

SQLite

Note

If you want to use SQLite as your database system, because you ...

  • cannot use an RDBMS like MySQL/MariaDB
  • just want to test PyHGNC without spending time on setting up a database

skip the next section, MySQL/MariaDB setup. In general, however, we strongly recommend MySQL or MariaDB as your relational database management system.

If you don't know what all that means, skip the MySQL/MariaDB setup section as well.

Don't worry! You can always change the configuration later. For more information about changing the database system later, go to the section Changing database configuration in the documentation on Read the Docs.
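To check quickly that SQLite is usable without any extra installation, the following minimal sketch asks Python's built-in sqlite3 module for the version of the bundled SQLite library. The function name sqlite_version is only illustrative and is not part of PyHGNC.

```python
import sqlite3


def sqlite_version():
    """Return the version string of the SQLite library bundled with Python."""
    # An in-memory database needs neither a file nor a server process.
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute("SELECT sqlite_version()").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite_version())
```

If this prints a version number, the SQLite route described in this note should work on your system.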

MySQL/MariaDB setup

Log in to MySQL as the root user, create a new database, create a user, assign the rights, and flush privileges.

CREATE DATABASE pyhgnc CHARACTER SET utf8 COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON pyhgnc.* TO 'pyhgnc_user'@'%' IDENTIFIED BY 'pyhgnc_passwd';
FLUSH PRIVILEGES;

There are two options to set the MySQL/MariaDB connection.

  1. The simplest is to start the command line tool
pyhgnc mysql

You will be guided with input prompts. Accept the default value in square brackets with RETURN. You will see something like this:

server name/ IP address database is hosted [localhost]:
MySQL/MariaDB user [pyhgnc_user]:
MySQL/MariaDB password [pyhgnc_passwd]:
database name [pyhgnc]:
character set [utf8]:

The connection will be tested and, on success, "Connection was successful" is returned. Otherwise you will see the following hint:

Test was NOT successful

Please use one of the following connection schemas
MySQL/MariaDB (strongly recommended):
        mysql+pymysql://user:passwd@localhost/database?charset=utf8

PostgreSQL:
        postgresql://user:passwd@localhost/database

MsSQL (pyodbc needed):
        mssql+pyodbc://user:passwd@database

SQLite (always works):

- Linux:
        sqlite:////absolute/path/to/database.db

- Windows:
        sqlite:///C:\absolute\path\to\database.db

Oracle:
        oracle://user:passwd@localhost:1521/database

  2. The second option is to start a Python shell and set the MySQL configuration. If you have not changed anything in the SQL statements above ...

import pyhgnc
pyhgnc.set_mysql_connection()

If you have used your own settings, please adapt the following command to your requirements.

import pyhgnc
pyhgnc.set_mysql_connection(host='localhost', user='pyhgnc_user', passwd='pyhgnc_passwd', db='pyhgnc')
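The connection schemas listed in the hint above follow the SQLAlchemy URL format, so they can also be assembled programmatically. The following is a minimal sketch using only the standard library. The helper names mysql_url and sqlite_url are hypothetical and not part of PyHGNC, and the default values are simply the ones shown in the prompts above.

```python
from urllib.parse import quote_plus


def mysql_url(user="pyhgnc_user", passwd="pyhgnc_passwd",
              host="localhost", db="pyhgnc", charset="utf8"):
    """Build a MySQL/MariaDB connection string for the pymysql driver."""
    # Escape user and password so that characters such as '@' or '/'
    # cannot be mistaken for URL delimiters.
    return "mysql+pymysql://{}:{}@{}/{}?charset={}".format(
        quote_plus(user), quote_plus(passwd), host, db, charset)


def sqlite_url(path):
    """Build an SQLite connection string from an absolute file path."""
    # Three slashes introduce the path; an absolute Linux path adds a fourth.
    return "sqlite:///" + path


if __name__ == "__main__":
    print(mysql_url())
    print(sqlite_url("/absolute/path/to/database.db"))
```

Escaping the credentials matters mainly when a password contains special characters; with the default values the result is identical to the schema shown in the hint.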

Updating

The updating process will download the complete HGNC JSON file and the HCOP file.

import pyhgnc
pyhgnc.manager.database.update()

This will use either the default connection settings of PyHGNC or the settings defined by the user. It is also possible to run the update process from the shell.

pyhgnc update

Quick start with query functions

Initialize the query object

import pyhgnc
query = pyhgnc.query()

Get all HGNC entries:

all_entries = query.hgnc()

Hint

Check out the documentation: Query functions section for more examples and check out the Query section for all possible parameters for the different models.

More information

See the installation documentation for more advanced instructions. Also, check the change log at CHANGELOG.rst.

HGNC tools

HGNC also provides online tools.

Links

HUGO Gene Nomenclature Committee (HGNC)

PyHGNC

pyhgnc's People

Contributors

cebel, christianebeling, cthoyt, lekono

pyhgnc's Issues

Can't run as module

There's no __main__.py module so this error happens:

$ python3 -m pyhgnc update
/usr/local/opt/python3/bin/python3.6: No module named pyhgnc.__main__; 'pyhgnc' is a package and cannot be directly executed
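The behaviour can be reproduced with a throwaway package, which also shows the usual remedy: a small __main__.py that forwards to the CLI entry function. This is a minimal sketch and not PyHGNC's actual code. The package name demo_pkg and its cli.main function are hypothetical stand-ins, and it is only an assumption that PyHGNC would forward to its own CLI function in the same way.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Contents of the hypothetical demo_pkg/__main__.py.
MAIN_PY = (
    "from demo_pkg.cli import main\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    main()\n"
)


def run_as_module(with_main):
    """Create a temporary package and run it with `python -m demo_pkg`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "cli.py").write_text("def main():\n    print('cli called')\n")
        if with_main:
            (pkg / "__main__.py").write_text(MAIN_PY)
        # Make the temporary directory importable for the child process.
        env = dict(os.environ, PYTHONPATH=tmp)
        return subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=tmp, env=env,
            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
            universal_newlines=True,
        )


if __name__ == "__main__":
    print(run_as_module(with_main=False).stderr.strip())
    print(run_as_module(with_main=True).stdout.strip())
```

Without __main__.py the child process fails with the same kind of "cannot be directly executed" message as above; with it, the CLI function runs.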

Update crashes

$ pyhgnc update
Traceback (most recent call last):
  File "/Users/cthoyt/.local/bin/pyhgnc", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pyhgnc/cli.py", line 83, in update
    low_memory=low_memory)
  File "/usr/local/lib/python3.6/site-packages/pyhgnc/manager/database.py", line 419, in update
    database.db_import(silent=silent, hgnc_file_path=hgnc_file_path, hcop_file_path=hcop_file_path, low_memory=low_memory)
  File "/usr/local/lib/python3.6/site-packages/pyhgnc/manager/database.py", line 120, in db_import
    json_data = DbManager.load_hgnc_json(hgnc_file_path=hgnc_file_path)
  File "/usr/local/lib/python3.6/site-packages/pyhgnc/manager/database.py", line 402, in load_hgnc_json
    response = request.urlopen(HGNC_JSON)
NameError: name 'request' is not defined

It's hard to keep those request references straight, especially when writing code that's supposed to be 2/3 compatible. All that needs to be done is to remove the namespace.
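One common way to keep such imports straight in code that targets both Python 2 and 3 is to import urlopen itself under a single name, so that no request. prefix is needed at the call site. The following is a minimal sketch of that pattern, not the actual patch applied to PyHGNC. The function name load_json is hypothetical.

```python
import json

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2


def load_json(url):
    """Download `url` and decode the response body as JSON."""
    # urlopen is called directly, so no `request` name has to be in scope.
    response = urlopen(url)
    try:
        return json.loads(response.read().decode("utf-8"))
    finally:
        response.close()
```

With this layout a NameError like the one in the traceback cannot occur, because the only name the call depends on is bound by whichever import succeeds.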

Error in update method (get_mgds)

There appears to be an issue with the update method, specifically in the get_mgds method. At least one of the mgd ids appears to be simply "M" which causes the code to fall over.

Line 187 in database.py. I solved this by:

def get_mgds(self, hgnc):
    mgds = []

    if 'mgd_id' in hgnc:

        for mgd in hgnc['mgd_id']:

            if mgd not in self.mgds:
                try:
                    mgdid = int(mgd.split(':')[-1])
                except ValueError:
                    mgdid = None
                if mgdid:
                    self.mgds[mgd] = models.MGD(mgdid=mgdid)
                    mgds.append(self.mgds[mgd])

    return mgds

However, I'm not sure if the real issue is in the download or not.
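One possible explanation, which the report does not confirm, is that mgd_id sometimes arrives as a single string instead of a list. Iterating over a string such as 'MGI:87853' yields single characters, and the first of them is "M". The following standalone sketch normalises both shapes before parsing. The function name parse_mgd_ids and the example id are illustrative only and are not taken from PyHGNC.

```python
def parse_mgd_ids(raw):
    """Return the integer parts of MGD ids such as 'MGI:87853'."""
    if raw is None:
        return []
    if isinstance(raw, str):
        # Assumption: a lone string is one id, not a sequence of characters.
        raw = [raw]
    ids = []
    for item in raw:
        try:
            ids.append(int(item.split(":")[-1]))
        except ValueError:
            # Skip malformed entries such as 'M' instead of crashing.
            continue
    return ids


if __name__ == "__main__":
    print(parse_mgd_ids(["MGI:87853"]))
    print(parse_mgd_ids("MGI:87853"))
    print(parse_mgd_ids(["M"]))
```

Whether the normalisation step is needed depends on what the downloaded file actually contains, so inspecting the offending record would be the first thing to check.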
