tanghaibao / goatools
Python library to handle Gene Ontology (GO) terms
License: BSD 2-Clause "Simplified" License
Hi there,
I am using goatools to test for over- and/or under-represented GOSlim categories in a plant lineage that is lacking a closely-related annotated genome. Thus, for the 2091 loci in my "population" only 819 have known GO annotations. Similarly, my "study" data sets have many loci that are also lacking GO annotations.
My question is: Should I include loci in my "population" and "study" files that do not have GO annotations? I am concerned that, if I only use loci for which there are annotations, I may bias the results.
Any suggestions/input would be much appreciated.
Thanks!
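One way to see why the choice matters is to compute the same one-sided hypergeometric (Fisher) tail probability against both population definitions. This is only an illustrative sketch: the population sizes 2091 and 819 come from the question above, but the study sizes and term counts are hypothetical, and enrichment_pvalue is not a goatools function.

```python
from math import comb

def enrichment_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    k: study loci with the term, n: study size,
    K: population loci with the term, N: population size.
    """
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical counts: 30 study loci and 120 population loci carry the term.
# Annotated loci only: 100 annotated study loci out of 819 annotated loci.
p_annotated_only = enrichment_pvalue(k=30, n=100, K=120, N=819)
# All loci: the same 30 hits, but the study has 250 loci out of 2091.
p_all_loci = enrichment_pvalue(k=30, n=250, K=120, N=2091)

print(p_annotated_only, p_all_loci)
```

The two p-values differ because the denominators differ, so the choice should be made deliberately and applied consistently to both the study and the population files.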
$ pip install goatools
Collecting goatools
Using cached goatools-0.7.10.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/jw/23sjzsn97lz8qj3kp4bg4pw00000gn/T/pip-build-OpOQPw/goatools/setup.py", line 8, in <module>
from setup_helper import SetupHelper
ImportError: No module named setup_helper
Hi there,
Thanks for this code, it's exactly what I needed. I'm switching over from Enrichr (which doesn't allow background gene sets), and they use GO BP 2015 annotations. I notice that the annotations in the file they use are quite different from the annotations taken from NCBI, as was done in the tutorial.
Specifically, I plotted the number of terms for each GO pathway in the Enrichr annotations file vs. the NCBI file recommended in the tutorial (log2 scale), and found that there were many terms missing from the latter:
It appears that the NCBI annotations file is up to date. Could you please explain the discrepancy here?
Thanks!
See: http://geneontology.org/page/download-ontology
We appreciate that changing is a hassle, and we will continue to support the URLs that you currently use, but we'd encourage people to move to the new, more informative naming scheme that's harmonized across ontologies.
I'll send a PR
Hi,
I am also working on a completely new organism, so I ran gene prediction and BLASTed the predicted genes against UniRef90 in order to get gene IDs.
The only page I found that maps a gene ID ( Q9LZR0 ) to GO terms is this one: http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot-goa/features?segment=Q9LZR0
Could I use that page to create the association file?
Thank you in advance
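If the mapping can be collected as (gene ID, GO ID) pairs, writing it out is straightforward. The sketch below assumes the association file layout is one gene per line, followed by a tab and a semicolon-separated list of GO IDs; that layout should be checked against the sample data/association file before relying on it. The helper name format_associations is hypothetical, and the GO IDs paired with Q9LZR0 are placeholders, not its real annotations.

```python
from collections import defaultdict

def format_associations(pairs):
    """Group (gene_id, go_id) pairs into 'gene<TAB>GO;GO;...' lines."""
    gene2gos = defaultdict(set)
    for gene_id, go_id in pairs:
        gene2gos[gene_id].add(go_id)
    lines = []
    for gene_id in sorted(gene2gos):
        go_ids = ";".join(sorted(gene2gos[gene_id]))
        lines.append("{}\t{}".format(gene_id, go_ids))
    return "\n".join(lines) + "\n"

# Placeholder pairs for illustration only.
pairs = [
    ("Q9LZR0", "GO:0016740"),
    ("Q9LZR0", "GO:0003824"),
    ("geneB", "GO:0003824"),
]
text = format_associations(pairs)
print(text, end="")
```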
http://github.com/tanghaibao/goatools/blob/master/goatools/multiple_testing.py#L108
calls count_terms() with 2 args, but it takes 3, so FDR testing can't be run.
It seems that goatools automatically ignores entries with evidence code "ND" in the GAF file, but in my case I actually need them. In associations.py, line 132, I found:
if 'NOT' not in ntgaf.Qualifier and evidence_code != 'ND':
This appears to be hard-coded. An option to include these entries in a future version would be very helpful.
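A minimal sketch of what such an option could look like, written against raw GAF lines rather than goatools internals. The function name gaf_associations and the include_nd flag are hypothetical, the column positions follow the GAF 2.x layout (object ID, qualifier, GO ID and evidence code in columns 2, 4, 5 and 7), and the sample records are made up.

```python
def gaf_associations(lines, include_nd=False):
    """Collect gene -> set of GO IDs, optionally keeping ND evidence."""
    assoc = {}
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        gene_id, qualifier, go_id, evidence = cols[1], cols[3], cols[4], cols[6]
        if "NOT" in qualifier:
            continue  # negated annotations are always dropped
        if evidence == "ND" and not include_nd:
            continue  # ND is dropped unless the caller asks for it
        assoc.setdefault(gene_id, set()).add(go_id)
    return assoc

# Made-up records, truncated to the first seven GAF columns.
lines = [
    "!gaf-version: 2.1",
    "DB\tG1\tsym1\t\tGO:0003824\tREF:1\tIDA",
    "DB\tG2\tsym2\t\tGO:0005575\tREF:2\tND",
    "DB\tG3\tsym3\tNOT\tGO:0016740\tREF:3\tIDA",
]
default_assoc = gaf_associations(lines)
with_nd_assoc = gaf_associations(lines, include_nd=True)
```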
at least a readme.rst with some tests/examples.
have a helper method to plot the parents or children of a given GO term.
this means merging back after parents and diverged and then converged.
Something seems weird with the bioconda installation. Trying to install goatools through bioconda into a new Python 3 environment leads to an UnsatisfiableError. However, it installs just fine into the same environment using pip.
Hello,
What do the dots before the GO terms mean?
python scripts/find_enrichment.py --alpha=0.05 --fdr --indent data/study data/population data/association
id enrichment description ratio_in_study ratio_in_pop p_uncorrected p_bonferroni p_holm p_sidak p_fdr
.GO:0003824 e catalytic activity 106/276 7773/33239 2.6e-08 2.01e-05 2.01e-05 1.96e-05 0
..GO:0016740 e transferase activity 45/276 2713/33239 7.46e-06 0.00575 0.00574 0.0056 0.008
.....GO:0006464 e cellular protein modification process 33/276 1723/33239 7.9e-06 0.00608 0.00606 0.00593 0.008
....GO:0036211 e protein modification process 33/276 1723/33239 7.9e-06 0.00608 0.00606 0.00593 0.008
...GO:0019748 e secondary metabolic process 11/276 252/33239 9.5e-06 0.00731 0.00728 0.00713 0.009
......GO:0006468 e protein phosphorylation 22/276 918/33239 1.01e-05 0.00776 0.00771 0.00757 0.009
.....GO:0016310 e phosphorylation 22/276 941/33239 1.47e-05 0.0114 0.0113 0.0111 0.009
.....GO:0008474 e palmitoyl-(protein) hydrolase activity 3/276 7/33239 1.93e-05 0.0149 0.0148 0.0145 0.01
...GO:0005839 e proteasome core complex 4/276 23/33239 3.64e-05 0.028 0.0277 0.0273 0.016
....GO:0043412 e macromolecule modification 33/276 1870/33239 5.53e-05 0.0426 0.0421 0.0415 0.026
....GO:0044550 e secondary metabolite biosynthetic process 8/276 158/33239 5.61e-05 0.0432 0.0427 0.0421 0.027
Thank you in advance.
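Since the dots only appear with --indent, a plausible reading is that the number of dots encodes how far a term sits below the root of the ontology. The sketch below illustrates that reading on a toy hierarchy; it is an assumption, not a statement of how goatools computes the prefix. The term IDs are invented, and using the shortest path to the root (rather than the longest) is also an assumption.

```python
def level(term, parents):
    """Shortest number of steps from a term up to a root (root = 0)."""
    if not parents.get(term):
        return 0
    return 1 + min(level(p, parents) for p in parents[term])

def indented(term, parents):
    """Prefix the term ID with one dot per level."""
    return "." * level(term, parents) + term

# Toy hierarchy: T:c has two parents, one of which is the root.
parents = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:a"],
    "T:c": ["T:b", "T:root"],
}
rows = [indented(t, parents) for t in ["T:root", "T:a", "T:b", "T:c"]]
print("\n".join(rows))
```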
? discuss how to do this.
should be able to take lists of accn names.
from goatools.obo_parser import GODag
obo_dag = GODag(obo_file="../GO/go-basicNS.obo")
leads to an infinite loop on:
'2.7.9 |Anaconda 2.2.0 (64-bit)| (default, Dec 18 2014, 16:57:52) [MSC v.1500 64 bit (AMD64)]'
the .obo file being:
format-version: 1.2
data-version: releases/2015-06-20
date: 19:06:2015 14:25
saved-by: dph
auto-generated-by: OBO-Edit 2.3.1
I temporarily fixed it in CFretter@a7aee05
Traceback (most recent call last):
File "scripts/find_enrichment.py", line 25, in <module>
from goatools.go_enrichment import GOEnrichmentStudy
File "scripts/../goatools/__init__.py", line 5, in <module>
from goatools.go_enrichment import *
File "scripts/../goatools/go_enrichment.py", line 18, in <module>
import fisher
ImportError: No module named fisher
Hi,
Could you please add a method that writes the summary to a file?
Thank you in advance.
Mic
Hi,
Are there any plans for plot_go_term.py to support export to Cytoscape or VisANT?
Hi,
I tried the --compare option with the following command:
$ python scripts/find_enrichment.py --compare --indent data/study data/population data/association
and got the following error message:
removed 276 overlapping items
load obo file gene_ontology.1_2.obo
41085 nodes imported
id enrichment description ratio_in_study ratio_in_pop p_uncorrected p_bonferroni p_holm p_sidak p_fdr
Traceback (most recent call last):
File "scripts/find_enrichment.py", line 116, in <module>
g.print_summary(min_ratio=min_ratio, indent=opts.indent, pval=opts.pval)
File "scripts/../goatools/go_enrichment.py", line 154, in print_summary
for rec in self.results:
AttributeError: 'GOEnrichmentStudy' object has no attribute 'results'
Here is the code in obo_parser.py:
if hasattr(rec, name):
    if name not in self.attrs_scalar:
        if name not in self.attrs_nested:
            getattr(rec, name).add(value)
        else:
            self._add_nested(rec, name, value)
    else:
        raise Exception("ATTR({NAME}) ALREADY SET({VAL})".format(
            NAME=name, VAL=getattr(rec, name)))
else: # Initialize new GOTerm attr
    if name in self.attrs_scalar:
        setattr(rec, name, value)
    elif name not in self.attrs_nested:
        setattr(rec, name, set([value]))
    else:
        name = '_{:s}'.format(name)
        setattr(rec, name, defaultdict(list))
        self._add_nested(rec, name, value)
Here, the check "if hasattr(rec, name)" is never true for relationships, because name is "relationship" but the attribute is stored as "_relationship" (via name = '_{:s}'.format(name)). As a result, only the last relationship of each GO term is saved.
I changed the code like this:
_name = '_{:s}'.format(name) if name in self.attrs_nested else name
if hasattr(rec, _name):
    if name not in self.attrs_scalar:
        if name not in self.attrs_nested:
            getattr(rec, name).add(value)
        else:
            self._add_nested(rec, _name, value)
    else:
        raise Exception("ATTR({NAME}) ALREADY SET({VAL})".format(
            NAME=name, VAL=getattr(rec, name)))
else: # Initialize new GOTerm attr
    if name in self.attrs_scalar:
        setattr(rec, name, value)
    elif name not in self.attrs_nested:
        setattr(rec, name, set([value]))
    else:
        # name = '_{:s}'.format(name)
        setattr(rec, _name, defaultdict(list))
        self._add_nested(rec, _name, value)
Thanks a lot.
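The mismatch can be reproduced without goatools. The sketch below is a simplified stand-in for the nested-attribute branch only: Rec, add_buggy and add_fixed are hypothetical names, and the relationship values are placeholders.

```python
from collections import defaultdict

class Rec:
    """Bare record object standing in for a GO term."""

def add_buggy(rec, name, value):
    rel, target = value
    if hasattr(rec, name):                 # looks for "relationship" ...
        getattr(rec, name)[rel].append(target)
    else:
        name = "_{:s}".format(name)        # ... but stores "_relationship"
        setattr(rec, name, defaultdict(list))  # wipes earlier entries
        getattr(rec, name)[rel].append(target)

def add_fixed(rec, name, value):
    rel, target = value
    _name = "_{:s}".format(name)           # check and store the same name
    if not hasattr(rec, _name):
        setattr(rec, _name, defaultdict(list))
    getattr(rec, _name)[rel].append(target)

values = [("part_of", "T:1"), ("regulates", "T:2")]
buggy, fixed = Rec(), Rec()
for value in values:
    add_buggy(buggy, "relationship", value)
    add_fixed(fixed, "relationship", value)

print(dict(buggy._relationship))  # only the last relationship survives
print(dict(fixed._relationship))  # both relationships are kept
```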
Hi there,
I am trying to run the "find_enrichment.py" script with the --fdr option. My population of genes is only 2151 and my study sample file(s) is(/are) less than 1000 genes. The Bonferroni numbers seemed very high, which I thought might be due to my sample size. However, when I run "find_enrichment.py" I receive the following errors:
File "/Users/Grusz/bin/goatools-0.5.2/scripts/find_enrichment.py", line 124, in <module>
study=study, methods=methods)
File "/Users/Grusz/bin/goatools-0.5.2/scripts/../goatools/go_enrichment.py", line 95, in __init__
self.run_study(study)
File "/Users/Grusz/bin/goatools-0.5.2/scripts/../goatools/go_enrichment.py", line 134, in run_study
self.term_pop, self.obo_dag)
File "/Users/Grusz/bin/goatools-0.5.2/scripts/../goatools/multiple_testing.py", line 109, in calc_qval
new_term_study = go_enrichment.count_terms(new_study, assoc, obo_dag)
NameError: global name 'go_enrichment' is not defined
I've scanned through the "go_enrichment.py" and "multiple_testing.py" files to see where things are going wrong, but can't seem to find the problem. Any suggestions would be much appreciated.
Thanks,
Amanda
Python 3.4
>>> import fisher
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.4/dist-packages/fisher-0.1.4-py3.4-linux-x86_64.egg/fisher/__init__.py", line 1, in <module>
from cfisher import *
ImportError: No module named 'cfisher'
Fix:
Looks like a Python3 compatibility issue
__init__.py
from __future__ import absolute_import
from .cfisher import *
Printing records in go-basic.obo like:
reader = OBOReader()
for rec in reader: print(rec)
gives:
TypeError: unsupported format string passed to NoneType.__format__
Python isn't happy formatting None into strings. Changing the level and depth attributes in GOTerm.__init__() from None to "" allows the records to be printed.
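The failure is easy to reproduce in isolation. In the sketch below the format spec ">2" is an assumption (the actual spec used when printing a GO term is not shown here), and fmt_level is a hypothetical helper that applies the same None-to-"" substitution at formatting time instead of in the constructor.

```python
def fmt_level(level):
    """Format a level that may be None by substituting an empty string."""
    return "{L:>2}".format(L="" if level is None else level)

# Passing None together with a format spec raises TypeError in Python 3.
try:
    "{L:>2}".format(L=None)
    raised = False
except TypeError:
    raised = True

print(raised, repr(fmt_level(None)), repr(fmt_level(3)))
```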
I was trying to compute the similarity between two given entities. semantic_similarity and resnik_sim work for a few entities, but for others they raise an error:
return max(common_parent_go_ids(terms, go), key=lambda t: go[t].depth)
ValueError: max() arg is an empty sequence
The error occurs when there is no common parent for the two provided entities/genes. Here is one example that produces it:
semantic_similarity("GO:0003676", "GO:0007516", go)
Is there an alternative way to compute a similarity measure between entities that don't share common parents? I'm sorry if I have missed something.
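One defensive option is to let the lookup return None when the set of shared ancestors is empty, which can happen, for example, when the two terms belong to different GO namespaces. This is a sketch on toy data, not a patch to goatools: deepest_common_ancestor, the depth table and the term IDs are hypothetical.

```python
def deepest_common_ancestor(common_ids, depth):
    """Return the deepest shared ancestor, or None if nothing is shared."""
    return max(common_ids, key=lambda t: depth[t], default=None)

depth = {"T:root": 0, "T:a": 1, "T:b": 2}

shared = deepest_common_ancestor({"T:root", "T:a"}, depth)
nothing_shared = deepest_common_ancestor(set(), depth)
print(shared, nothing_shared)
```

A caller can then decide what a missing ancestor means for the similarity score (for instance, reporting it as undefined) instead of crashing.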
Hi,
I'm just trying to use your tool (thanks for developing it!), but there seems to be an error in my environment. Is it a Python 3 compatibility issue?
Thanks so much!
python ~/anaconda/bin/find_enrichment.py --pval=0.01 --indent /Users/alomana/scratch/temporalFile.txt /Users/alomana/gDrive2/projects/TREES-C/PfuEGRIN/data/go/populationFile.txt /Users/alomana/gDrive2/projects/TREES-C/PfuEGRIN/data/go/associationFile.txt
Study: 4 vs. Population 1462
load obo file go-basic.obo
go-basic.obo: format-version(1.2) data-version(releases/2016-09-10)
47199 nodes imported
Traceback (most recent call last):
File "/Users/alomana/anaconda/bin/find_enrichment.py", line 4, in <module>
__import__('pkg_resources').run_script('goatools==0.6.5', 'find_enrichment.py')
File "/Users/alomana/anaconda/lib/python3.5/site-packages/setuptools-23.0.0-py3.5.egg/pkg_resources/__init__.py", line 719, in run_script
File "/Users/alomana/anaconda/lib/python3.5/site-packages/setuptools-23.0.0-py3.5.egg/pkg_resources/__init__.py", line 1511, in run_script
File "/Users/alomana/anaconda/lib/python3.5/site-packages/goatools-0.6.5-py3.5.egg/EGG-INFO/scripts/find_enrichment.py", line 137, in <module>
File "/Users/alomana/anaconda/lib/python3.5/site-packages/goatools-0.6.5-py3.5.egg/goatools/go_enrichment.py", line 207, in __init__
TypeError: unsupported operand type(s) for >>: 'builtin_function_or_method' and '_io.TextIOWrapper'
Python 3.5.2 |Anaconda 4.1.1 (x86_64)| (default, Jul 2 2016, 17:52:12)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
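This TypeError is what Python 3 reports when it meets the Python 2 statement form "print >> stream, text": print is now a function, so ">>" is evaluated as an operator between that function and the file object. The sketch below reproduces the error and shows the Python 3 spelling; write_line_py3 is a hypothetical helper, and the snippet is meant to be run under Python 3.

```python
import io

def write_line_py3(stream, text):
    """Python 3 replacement for the Python 2 form 'print >> stream, text'."""
    print(text, file=stream)

buf = io.StringIO()
try:
    print >> buf, "hello"   # Python 2 syntax, evaluated as an expression here
    py2_style_failed = False
except TypeError:
    py2_style_failed = True

write_line_py3(buf, "hello")
print(py2_style_failed, repr(buf.getvalue()))
```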
Could you please specify which method is used for the False Discovery Rate calculation? Is it Benjamini-Hochberg or Benjamini-Yekutieli?
Thank You
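For reference, the two procedures named in the question differ only by a constant factor. The sketch below computes adjusted p-values for both; it says nothing about which of them, if either, goatools implements. The function name fdr_adjust and the example p-values are hypothetical.

```python
def fdr_adjust(pvals, yekutieli=False):
    """Benjamini-Hochberg adjusted p-values (Benjamini-Yekutieli if requested)."""
    m = len(pvals)
    # BY multiplies the BH correction by the harmonic number c(m).
    c_m = sum(1.0 / i for i in range(1, m + 1)) if yekutieli else 1.0
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted

pvals = [0.01, 0.04, 0.03, 0.005]
bh = fdr_adjust(pvals)
by = fdr_adjust(pvals, yekutieli=True)
print(bh)
print(by)
```

Benjamini-Yekutieli is never less conservative than Benjamini-Hochberg; it is designed to remain valid under arbitrary dependence between tests.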
Hi,
First of all, thank you very much for the scripts. They are very useful and easy to use.
I was wondering if there is any way to get all GO terms analysed, regardless of their alpha values.
That is, I would like to know:
"id enrichment description ratio_in_study ratio_in_pop p_uncorrected"
for all terms in my study, even if the values are not statistically significant.
I tried with --alpha=1.0, but this option does not give me all terms.
I have some experience with Python, so I can try to modify the script if necessary, but I can't find where to do it.
Thank you very much.
Gabriel
Not really an issue with goatools, but I thought I'd put this here to draw attention to this issue:
... which was preventing me from installing fisher (and as a result, goatools) in Windows.
Although I believe I have identified the problem, I'm not versed enough in C/Cython to submit a fix.
For anyone else having this issue, here is a temporary fix:
I haven't verified that the fisher package at commit e550127 is stable, though, so you might run into issues doing enrichment tests. However, I just wanted to download a parser for the obo files, so this works for me.
It seems that goatools is only considering children that have an "Is a" relationship with their parent, but not a "Part of". It seems to me, after reading the Ontology relations page, that "Part of" should be included.
B is necessarily part of A: wherever B exists, it is as part of A, and the presence of the B implies the presence of A
Is there any reason it is not?
What about the other relationships (regulate, has_part)? I don't think they should be included by default, but maybe an optional argument to get_all_children to include them?
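The optional argument proposed above could look roughly like this. The sketch works on a toy edge table rather than on goatools objects: all_descendants, the relationships parameter and the term IDs are hypothetical, and the default of following only "is_a" mirrors the behaviour described in this issue.

```python
def all_descendants(term, children, relationships=("is_a",)):
    """Collect descendants, following only the listed relationship types.

    children maps a term to a list of (child, relationship_type) pairs.
    """
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child, rel in children.get(current, []):
            if rel in relationships and child not in found:
                found.add(child)
                stack.append(child)
    return found

children = {
    "T:A": [("T:B", "is_a"), ("T:C", "part_of")],
    "T:C": [("T:D", "is_a")],
}
is_a_only = all_descendants("T:A", children)
with_part_of = all_descendants("T:A", children, relationships=("is_a", "part_of"))
print(sorted(is_a_only), sorted(with_part_of))
```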
python /Users/Genesis/Downloads/goatools-0.6.4/goatools/go_enrichment.py /Users/Genesis/Desktop/data/study.txt /Users/Genesis/Desktop/data/population.txt /Users/Genesis/Desktop/data/association.txt
Traceback (most recent call last):
File "/Users/Genesis/Downloads/goatools-0.6.4/goatools/go_enrichment.py", line 22, in <module>
from multiple_testing import Methods, Bonferroni, Sidak, HolmBonferroni, FDR, calc_qval
File "/Users/Genesis/Downloads/goatools-0.6.4/goatools/multiple_testing.py", line 12, in <module>
from .ratio import count_terms
ValueError: Attempted relative import in non-package
Hi all!
If I understand correctly, if a GO term is absent from the first sample in a comparison and abundant in the other, goatools does not count it as purified and does not print it in the results at all. I think it should be counted as strongly purified. Is this intentional, or is it a bug?
Is it possible to customize the columns of the Excel tables generated by goeaobj.wr_xlsx?
for example I want to export columns:
"ratio_in_study", "ratio_in_pop"
Note that we now have a proposed JSON representation of OBO that would obviate the need for special purpose parsers. Your comments as a developer would be most welcome:
See also this post describing motivation
Are there any plans on upgrading to Python3 ?
I ran everything through 2to3 and then manually fixed a couple things and got the obo_parser and mapslim to work correctly. I haven't tested everything else though. Is there any interest in migrating everything to py3?
This fixes the download_ncbi_associations function under Python 3.
Currently the following error is produced:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-2afea7ecb634> in <module>()
1 from goatools.base import download_ncbi_associations
----> 2 gene2go = download_ncbi_associations()
/home/lucas/Stack/programming/python/goatools/goatools/base.py in download_ncbi_associations(gene2go, prt)
129 if not os.path.isfile(gene2go):
130 wget.download("ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/{GZ}".format(GZ=gz))
--> 131 assert gunzip(gz) == gene2go
132 if prt is not None:
133 prt.write("\n DOWNLOADED: {FILE}\n".format(FILE=gene2go))
/home/lucas/Stack/programming/python/goatools/goatools/base.py in gunzip(gz, file_gunzip)
143 with gzip.open(gz, 'rb') as zstrm:
144 with open(file_gunzip, 'w') as ostrm:
--> 145 ostrm.write(zstrm.read())
146 os.remove(gz)
147 return file_gunzip
TypeError: write() argument must be str, not bytes
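The traceback shows bytes read from a gzip stream opened with 'rb' being written to a file opened in text mode. One possible way to avoid that, sketched below, is to open the output in binary mode as well. This is not necessarily the change made in the fix; the file name and content are placeholders, and the demo writes to a temporary directory.

```python
import gzip
import os
import shutil
import tempfile

def gunzip(gz_path, out_path=None):
    """Decompress gz_path next to itself, copying bytes to bytes."""
    if out_path is None:
        out_path = os.path.splitext(gz_path)[0]
    with gzip.open(gz_path, "rb") as zstrm, open(out_path, "wb") as ostrm:
        shutil.copyfileobj(zstrm, ostrm)  # no str/bytes mixing
    os.remove(gz_path)
    return out_path

# Demo with a throwaway file.
tmpdir = tempfile.mkdtemp()
gz = os.path.join(tmpdir, "example.txt.gz")
with gzip.open(gz, "wb") as fh:
    fh.write(b"example line\n")
plain = gunzip(gz)
with open(plain, "rb") as fh:
    content = fh.read()
gz_still_exists = os.path.exists(gz)
shutil.rmtree(tmpdir)
```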
Hi,
I am looking for a tool that can be used to look for enrichment based on GAF and obo of any ontology. E.g. plant ontology PO, TO, EO...
Is this possible with goatools?
Best,
Daniel
It works perfectly when I use Arabidopsis thaliana, but when I try to use Homo sapiens I get an error while computing TermCounts.
Here is the traceback:
Traceback (most recent call last):
File "/Users/zeeshannawaz/wp-tomoe-angiogenesis/src/angen/embed/preglove/go/gene_ontology_wrapper.py", line 88, in <module>
termcounts = TermCounts(go, associations)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/goatools/semantic.py", line 31, in __init__
self._count_terms(godag, annots)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/goatools/semantic.py", line 42, in _count_terms
allterms |= godag[go_id].get_all_parents()
KeyError: 'GO:0102756'
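A KeyError like this means an annotated GO ID is not present in the loaded ontology. One possible cause, which is an assumption here, is that the annotation file and the OBO file come from different releases. The sketch below shows a tolerant lookup that records missing IDs instead of failing; collect_with_ancestors and the toy ancestor table are hypothetical.

```python
def collect_with_ancestors(go_ids, ancestors):
    """Return (terms plus their ancestors, IDs missing from the ontology).

    ancestors maps each known term to the set of its ancestor IDs.
    """
    found, missing = set(), set()
    for go_id in go_ids:
        if go_id not in ancestors:
            missing.add(go_id)  # report it rather than raising KeyError
            continue
        found.add(go_id)
        found |= ancestors[go_id]
    return found, missing

ancestors = {"T:b": {"T:a"}, "T:a": set()}
found, missing = collect_with_ancestors(["T:b", "GO:0102756"], ancestors)
print(sorted(found), sorted(missing))
```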
Can't pip install goatools on version 0.6.9 in a new virtualenv. pip install goatools==0.6.5 works fine.
(venv) jeff$ pip install goatools
Collecting goatools
Using cached goatools-0.6.9.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/jw/23sjzsn97lz8qj3kp4bg4pw00000gn/T/pip-build-tHhzLf/goatools/setup.py", line 19, in <module>
open('requirements.txt').readlines()]
IOError: [Errno 2] No such file or directory: 'requirements.txt'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/jw/23sjzsn97lz8qj3kp4bg4pw00000gn/T/pip-build-tHhzLf/goatools/
(venv) jeff$ pip install goatools==0.6.5
Collecting goatools==0.6.5
Collecting fisher (from goatools==0.6.5)
Using cached fisher-0.1.4.tar.gz
Collecting xlsxwriter (from goatools==0.6.5)
Using cached XlsxWriter-0.9.3-py2.py3-none-any.whl
Collecting wget (from goatools==0.6.5)
Collecting statsmodels (from goatools==0.6.5)
Using cached statsmodels-0.6.1-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Building wheels for collected packages: fisher
Running setup.py bdist_wheel for fisher ... done
Stored in directory: /Users/jeff/Library/Caches/pip/wheels/94/70/27/f3e07047ba539a9c8a24c5738dad745c6fe3f1d76aa714ed83
Successfully built fisher
Installing collected packages: fisher, xlsxwriter, wget, statsmodels, goatools
Successfully installed fisher-0.1.4 goatools-0.6.5 statsmodels-0.6.1 wget-3.2 xlsxwriter-0.9.3
Hello,
Could you please provide some examples of how the files in goatools / data / were generated?
Thank you in advance.
Hi All,
I met a problem when I tried to install goatools on Windows 7 system. I am looking forward to your suggestion!
The python version is 3.4 (32bit). I have installed VS2010.
The error shows:
Searching for goatools
Best match: goatools 0.5.5
Processing goatools-0.5.5-py3.4.egg
goatools 0.5.5 is already the active version in easy-install.pth
Installing map_to_slim.py script to C:\Python34\Scripts
Installing plot_go_term.py script to C:\Python34\Scripts
Installing write_hierarchy.py script to C:\Python34\Scripts
Installing find_enrichment.py script to C:\Python34\Scripts
Using c:\python34\lib\site-packages\goatools-0.5.5-py3.4.egg
Processing dependencies for goatools
Searching for fisher
Reading https://pypi.python.org/simple/fisher/
Best match: fisher 0.1.4
Downloading https://pypi.python.org/packages/source/f/fisher/fisher-0.1.4.tar.gz#md5=bfc763b7333a1f428e4c447dd8a85968
Processing fisher-0.1.4.tar.gz
Writing C:\Users\Duan\AppData\Local\Temp\easy_install-khjvz9br\fisher-0.1.4\setup.cfg
Running fisher-0.1.4\setup.py -q bdist_egg --dist-dir C:\Users\Duan\AppData\Local\Temp\easy_install-khjvz9br\fisher-0.1.4\egg-dist-tmp-xgn4zhdy
cfisher.c
c:\python34\lib\site-packages\numpy\core\include\numpy\npy_1_7_deprecated_api.h(12) : Warning Msg: Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
LINK : error LNK2001: unresolved external symbol PyInit_fisher/cfisher
build\temp.win32-3.4\Release\src\cfisher.lib : fatal error LNK1120: 1 unresolved externals
Thanks!
Thanks for writing this handy module!
I ran into a bit of trouble when I tried to generate a GML file. Line 306 in obo_parser.py throws an AttributeError when the "gml" option in the draw_lineage() function is set to True. It appears that the name must be set when the graph is initialized. Deleting line 306 and changing line 269 to:
G = pgv.AGraph(name="GO tree")
seemed to fix the problem
If I try to read an obo file with OBOReader, printing GO terms is generating an error.
import goatools
reader = goatools.obo_parser.OBOReader('go-basic.obo')
go_terms = list(reader)
go_terms[0]
TypeError Traceback (most recent call last)
C:\Anaconda\lib\site-packages\IPython\core\formatters.pyc in __call__(self, obj)
697 type_pprinters=self.type_printers,
698 deferred_pprinters=self.deferred_printers)
--> 699 printer.pretty(obj)
700 printer.flush()
701 return stream.getvalue()
C:\Anaconda\lib\site-packages\IPython\lib\pretty.pyc in pretty(self, obj)
381 if callable(meth):
382 return meth(obj, self, cycle)
--> 383 return _default_pprint(obj, self, cycle)
384 finally:
385 self.end_group()
C:\Anaconda\lib\site-packages\IPython\lib\pretty.pyc in _default_pprint(obj, p, cycle)
501 if _safe_getattr(klass, '__repr__', None) not in _baseclass_reprs:
502 # A user-provided repr. Find newlines and replace them with p.break_()
--> 503 _repr_pprint(obj, p, cycle)
504 return
505 p.begin_group(1, '<')
C:\Anaconda\lib\site-packages\IPython\lib\pretty.pyc in _repr_pprint(obj, p, cycle)
692 """A pprint that just redirects to the normal repr function."""
693 # Find newlines and replace them with p.break_()
--> 694 output = repr(obj)
695 for idx,output_line in enumerate(output.splitlines()):
696 if idx:
C:\Anaconda\lib\site-packages\goatools\obo_parser.pyc in __repr__(self)
207 ret.append("{K}:{V}".format(K=key, V=val))
208 else:
--> 209 ret.append("{K}: {V} items".format(K=key, V=len(val)))
210 if len(val) < 10:
211 for elem in val:
TypeError: object of type 'NoneType' has no len()
Hi,
I am trying to run GO term enrichment analysis with my own background set for humans, mice, and yeast. I am assuming that I could use "find_enrichment.py" and include a text file for my query genes (or proteins) and a background set, correct? However, I am curious as to how I could generate an association file. Have you used any UniProt database to generate the association file?
Any help is appreciated.
Thanks
Rav
Hi,
Can I discover "part_of" relationships between different terms, or does your program build the children field only from "is_a" relationships?
fisher.pvalue_population is defined like this:
# k, n = study_true, study_tot,
# C, G = population_true, population_tot
def pvalue_population(int k, int n, int C, int G):
    #print "k=%i, n=%i, C=%i, G=%i" % (k, n, C, G)
    return pvalue(k, n - k, C - k, G - C - n + k)
This suggests that population_true and population_tot must include the respective counts from study_true and study_tot, since those counts will be subtracted.
However, in find_enrichment.read_geneset(), you explicitly make sure the population does not include the common terms when "comparing":
if compare:
    ...
    pop -= common
Should the study set actually be added to the comparison set instead? Alternatively, should another function besides fisher.pvalue_population be used?
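The concern can be checked with plain arithmetic on the four cells that pvalue_population passes on. In the sketch below, contingency is a hypothetical helper that returns those cells, and all counts are invented for illustration.

```python
def contingency(k, n, C, G):
    """Cells passed on by pvalue_population: k, n - k, C - k, G - C - n + k."""
    return (k, n - k, C - k, G - C - n + k)

# Study: 10 genes, 6 with the term. Comparison set: 90 genes, 14 with the term.
# Population includes the study (G = 100, C = 20): cells partition all 100 genes.
included = contingency(k=6, n=10, C=20, G=100)

# Population excludes the study (G = 90, C = 14): the study is subtracted twice.
excluded = contingency(k=6, n=10, C=14, G=90)

# With few hits in the comparison set, a cell can even go negative.
negative = contingency(k=6, n=10, C=4, G=90)

print(included, excluded, negative)
```

With the study included, the second row of the table is exactly the comparison set (14 with the term, 76 without); with it excluded, the same formula yields 8 and 72, which no longer describes either group.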
Dear all,
Is there a way to get the infos about all the GO terms associated with the study items of the enrichment analysis, and not only the significantly enriched/purified GO terms?
Thank you very much,
and best wishes,
Isabel
Since there's no Fisher's exact test in SciPy or on PyPI, we should put one on PyPI.
Before doing that we should have:
a function/method that works with numpy (so it can accept numpy arrays and return an array of p-values). This same function can also work with plain Python integers, with an import at the top like:
try:
    from numpy import log
    ffloat = lambda a: a.astype('f')
except ImportError:
    from math import log
    ffloat = float
a cython function (or just keep the swig) for speed.
These will be available as fisher.NumpyFisher and fisher.CFisher,
and fisher.Fisher will default to one of them ???
Hi,
After running find_enrichment.py I just wonder whether it is possible to know which Gene ID has been used for each line of the below results?
id enrichment description ratio_in_study ratio_in_pop p_uncorrected p_bonferroni p_holm p_sidak p_fdr
.GO:0003824 e catalytic activity 106/276 7781/33239 2.64e-08 2.05e-05 2.05e-05 2e-05 0
..GO:0016740 e transferase activity 45/276 2713/33239 7.46e-06 0.00578 0.00578 0.00564 0.004
Thank you in advance
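Until the script reports this itself, the gene IDs behind each line can be recovered from the study and association data. The sketch below only uses direct annotations; enrichment counts may also include genes annotated to more specific descendant terms, which this does not propagate. The helper name study_genes_per_term and the gene IDs are hypothetical.

```python
def study_genes_per_term(study, assoc):
    """Map each GO ID to the study genes directly annotated with it."""
    term2genes = {}
    for gene in study:
        for go_id in assoc.get(gene, ()):
            term2genes.setdefault(go_id, set()).add(gene)
    return term2genes

study = {"g1", "g2"}
assoc = {
    "g1": {"GO:0003824"},
    "g2": {"GO:0003824", "GO:0016740"},
    "g3": {"GO:0016740"},  # in the population but not in the study
}
term2genes = study_genes_per_term(study, assoc)
for go_id in sorted(term2genes):
    print(go_id, sorted(term2genes[go_id]))
```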
...
Hi,
@dvklopfenstein and I had a conversation in issue #79 about discrepancies in GAF files between goa_human.gaf.gz found on the Gene Ontology site, and the one used for Enrichr.
I looked into using GOrilla instead, and they seem to have a much more complete set of GO enrichments than the one taken from Gene Ontology site. For instance, none of my top 3 GO terms found on GOrilla exist: "GO:0010604", "GO:0048522", "GO:0009893".
I'm using your suggested code to load the GAF file:
from goatools.associations import read_gaf
# Import GO annotations from http://geneontology.org/gene-associations/goa_human.gaf.gz
goatools_annotations = read_gaf("../data/goa_human.gaf")
Any ideas where I can find the most complete annotation file that is human specific?
Thanks,
Johnny
Hello,
I recently developed a conda package for goatools.
More info.
If you want, you can add the following badge:
Bérénice
Excessive output from print statements in GODag crashes my Jupyter/IPython notebooks. I've attached a screenshot of my screen filled with bright red output to stderr. My workaround is the Jupyter magic %%capture, but this also suppresses bona fide warnings.
This issue extends to command-line usage and calls from Python scripts.
A logger would solve these problems and allow flexible logging to a file or stream, which may be useful in some applications.
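A minimal sketch of the logger idea using the standard logging module. The logger name "godag_example" and the load_terms function are hypothetical stand-ins for the loading code; the message text mirrors the "nodes imported" line seen in the outputs above.

```python
import io
import logging

logger = logging.getLogger("godag_example")  # hypothetical logger name
logger.propagate = False                      # keep the demo self-contained

def load_terms(n_terms):
    """Stand-in for a loader that used to print its progress."""
    logger.info("%d nodes imported", n_terms)
    return n_terms

# The caller decides where messages go and how verbose they are.
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))

logger.setLevel(logging.WARNING)
load_terms(47199)   # informational message is suppressed

logger.setLevel(logging.INFO)
load_terms(47199)   # informational message is emitted

print(repr(stream.getvalue()))
```

Unlike %%capture, raising the level to WARNING hides progress messages while still letting genuine warnings through.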