dvklopfenstein / pmidcite
Turbocharge a PubMed literature search rather than clicking and clicking and clicking on Google Scholar
License: GNU Affero General Public License v3.0
I installed pmidcite via pip3. icite works normally; however, when I run this as a test:
summarize_papers goatools_cites.txt -p TOP CIT CLI
I get
Traceback (most recent call last):
File "/cluster/path/to/my/directory/env/bin/summarize_papers", line 5, in <module>
from pmidcite.scripts.icite import summarize_papers
ImportError: cannot import name 'summarize_papers' from 'pmidcite.scripts.icite' (/cluster/path/to/my/directory/env/lib/pypy3.9/site-packages/pmidcite/scripts/icite.py)
I installed the GitHub repo and ran the make command, but the icite command was not working. Can someone please help me with this?
Hello,
Thank you for creating such a lovely literature search tool!
When using icite -k
or icite --print_keys
the result is:
YEAR/citations/references section:
----------------------------------
YEAR: The year the article was published
x: Total of all unique articles that have cited the paper, including clinical articles
y: Number of unique clinical articles that have cited the paper
z: Number of references
But the header while using icite -H
is as follows:
YEAR cit cli ref
I would like to kindly request the following changes in the --print_keys
display:
YEAR/citations/references section:
----------------------------------
YEAR: The year the article was published
cit: Total of all unique articles that have cited the paper, including clinical articles
cli: Number of unique clinical articles that have cited the paper
ref: Number of references
Thank you for your time.
If NIH citation data is not available for one or more requested PMIDs in a list of PMIDs, this error appears:
**WARNING: 1 NIH CITATION DATA NOT DOWNLOADED FOR PMIDs: 32809475
Traceback (most recent call last):
File "src/bin/dnld_pmids.py", line 40, in <module>
main()
File "src/bin/dnld_pmids.py", line 36, in main
obj.run(queries, dnld_idx)
File "/cygdrive/c/Users/note2/Data/git/pmidcite/src/pmidcite/pubmedqueryicite.py", line 40, in run
self.querypubmed_runicite(ntd.filename, ntd.pubmed_query)
File "/cygdrive/c/Users/note2/Data/git/pmidcite/src/pmidcite/pubmedqueryicite.py", line 61, in querypubmed_runicite
self.wr_icite(fout_icite, pmids)
File "/cygdrive/c/Users/note2/Data/git/pmidcite/src/pmidcite/pubmedqueryicite.py", line 78, in wr_icite
pmid2paper = dnldr.get_pmid2paper(pmids, self.pmid2note)
File "/cygdrive/c/Users/note2/Data/git/pmidcite/src/pmidcite/icite/pmid_dnlder.py", line 144, in get_pmid2paper
pmid2icite = {o.pmid:o for o in self.get_icites(pmids_top)}
File "/cygdrive/c/Users/note2/Data/git/pmidcite/src/pmidcite/icite/pmid_dnlder.py", line 144, in <dictcomp>
pmid2icite = {o.pmid:o for o in self.get_icites(pmids_top)}
AttributeError: 'NoneType' object has no attribute 'pmid'
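The traceback suggests that the sequence being iterated contains a None entry for the PMID whose citation data was not downloaded. As a minimal sketch of one way to guard against that, assuming missing records show up as None, the comprehension can skip them. The Entry class and build_pmid2icite function below are hypothetical stand-ins for illustration, not pmidcite's actual code:

```python
from typing import Dict, Iterable, Optional


class Entry:
    """Hypothetical stand-in for one downloaded NIH citation record."""

    def __init__(self, pmid: int):
        self.pmid = pmid


def build_pmid2icite(entries: Iterable[Optional[Entry]]) -> Dict[int, Entry]:
    """Map PMID to record, skipping None placeholders for missing data."""
    return {o.pmid: o for o in entries if o is not None}


# One of the three requested records is missing (None)
entries = [Entry(1), None, Entry(3)]
pmid2icite = build_pmid2icite(entries)
print(sorted(pmid2icite))
```

With this guard, the PMIDs named in the warning are simply absent from the resulting dict instead of raising an AttributeError.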
Add an option to always download citations from the NIH; this will now be the default setting.
The previous default mode allows researchers to combine working online with long periods of working offline by downloading citation data from the NIH (JSON format), converting the JSON data to a Python dict, and writing the Python data into a temporary working file (p.py). Under the former default setting, citation data previously downloaded from the NIH would be loaded from the temporary Python working files rather than re-downloaded from the NIH unless the force_download argument is True.
With the new option of always downloading citations from the NIH, no temporary working files are written to disk, so the researcher always sees the latest citation data, at the cost of not being able to use previously downloaded data when working offline. One advantage of always downloading citations is an increase in speed from not writing citation working data to disk. Another advantage is a simpler user interface for researchers using the Python library in their code. The ability to save NIH citation data to disk as Python dicts remains, but is now an option. No code changes are necessary for researchers using this library in their code.
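The two modes described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: get_citation_data, fetch, and cache_dir are hypothetical names, and the sketch caches JSON files rather than the p.py working files mentioned above. Only force_download comes from the description.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Optional


def get_citation_data(
    pmid: int,
    fetch: Callable[[int], Dict],
    cache_dir: Optional[str] = None,
    force_download: bool = False,
) -> Dict:
    """Return citation data for one PMID.

    cache_dir=None    : always download; nothing is written to disk.
    cache_dir=<path>  : load a previously saved copy if one exists,
                        unless force_download is True.
    """
    if cache_dir is None:
        return fetch(pmid)
    path = os.path.join(cache_dir, f"p{pmid}.json")
    if os.path.exists(path) and not force_download:
        with open(path) as fin:
            return json.load(fin)
    data = fetch(pmid)
    with open(path, "w") as fout:
        json.dump(data, fout)
    return data


def fake_fetch(pmid: int) -> Dict:
    """Hypothetical downloader used here so the sketch runs offline."""
    return {"pmid": pmid}


with tempfile.TemporaryDirectory() as tmpdir:
    get_citation_data(32976797, fake_fetch, cache_dir=tmpdir)  # downloads and saves
    get_citation_data(32976797, fake_fetch, cache_dir=tmpdir)  # loads from disk
get_citation_data(32976797, fake_fetch)  # always downloads, writes nothing
```

The trade-off is visible in the first branch: with no cache directory there is no disk I/O and the data is always current, but nothing is available offline.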
Hello,
Thank you so much for creating this literature search tool! It is very helpful for my work.
Can you add an annotation to the icite results that identifies whether a paper is a review?
Thank you!
@dvklopfenstein pmidcite uses a huge amount of memory while accessing PubMed IDs. I have a txt file that contains 90,000 PMIDs. When I run pmidcite, the cluster job is aborted because of its high memory footprint (about 16 GB). Can you please help me with this? I only require the headers and do not need the citation information.
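One general way to keep memory bounded with a list this large is to read the PMIDs lazily and process them in fixed-size batches, writing out each batch before starting the next. The sketch below shows only that batching technique; read_pmids and batched are hypothetical helper names, the batch size is arbitrary, and whether the memory use reported here comes from holding all records at once has not been established.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def read_pmids(lines: Iterable[str]) -> Iterator[int]:
    """Yield PMIDs one at a time, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            yield int(line)


def batched(items: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for the lines of a PMID text file
lines = ["32976797\n", "\n", "# a comment\n", "32809475\n", "12345678\n"]
for batch in batched(read_pmids(lines), size=2):
    # Process and write out one batch here, then let it go out of scope
    print(batch)
```

In practice, `lines` would be an open file handle, so only one batch of PMIDs and its results need to be in memory at a time.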
Hi,
I'm trying to set up your tool manually by cloning this repo and running your Makefile, but the Makefile doesn't seem to create the necessary files (it stops after executing just two find commands). Could you please add a few lines explaining how to set up your tool manually?
Thank you so much for your efforts! Looking forward to testing your tool :)
Here is my Makefile printout:
make -f makefile
find src -regextype posix-extended -regex ".*[a-z]+.py"
src/bin/dnld_pmids.py
src/bin/rpt_dates_top.py
src/bin/plot_pubmed_contents.py
src/bin/plt_guassian_nihperc.py
src/bin/scatter.py
src/bin/query_pubmed.py
src/bin/dnld_pubmed.py
src/bin/icite.py
src/bin/read_pmids.py
src/tests/args_dflt.py
src/tests/pmids_many.py
src/tests/test_cfg_icite.py
src/tests/test_nb_print_paper_sort_cites.py
src/tests/test_paper_sorts.py
src/tests/test_nb_nihocc_data_download_always.py
src/tests/test_speed_api_dnld.py
src/tests/dnld_pmids_100k.py
src/tests/test_cli_icite.py
src/tests/test_nb_print_paper_all_refs_cites.py
src/tests/prt_hms.py
src/tests/test_nb_query_pubmed.py
src/tests/test_nb_nihocc_data_download_or_import.py
src/tests/icite.py
src/tests/pmids.py
src/tests/test_dnld_cites_refs.py
src/tests/test_speed_dnld_load.py
src/tests/test_database_list.py
src/tests/test_icite_longreq.py
src/tests/test_print_paper.py
src/tests/test_dnld_pmids.py
src/pmidcite/eutils/cmds/pubmed.py
src/pmidcite/eutils/cmds/efetch.py
src/pmidcite/eutils/cmds/elink.py
src/pmidcite/eutils/cmds/cmdbase.py
src/pmidcite/eutils/cmds/esearch.py
src/pmidcite/eutils/cmds/base.py
src/pmidcite/eutils/cmds/query_ids.py
src/pmidcite/eutils/pubmed/terms.py
src/pmidcite/eutils/pubmed/query.py
src/pmidcite/eutils/pubmed/author.py
src/pmidcite/eutils/pubmed/qualifiers.py
src/pmidcite/eutils/pubmed/descriptors.py
src/pmidcite/eutils/pubmed/rdwr.py
src/pmidcite/eutils/pubmed/record.py
src/pmidcite/eutils/pubmed/authors.py
src/pmidcite/eutils/pubmed/counts/dnlded_data.py
src/pmidcite/eutils/pubmed/counts/dnld.py
src/pmidcite/eutils/pubmed/counts/plt.py
src/pmidcite/eutils/pubmed/counts/data.py
src/pmidcite/cfg.py
src/pmidcite/pubmedqueryicite.py
src/pmidcite/plot/nih_perc.py
src/pmidcite/plot/scatter.py
src/pmidcite/utils_module.py
src/pmidcite/_version.py
src/pmidcite/icite/pmid_dnlder.py
src/pmidcite/icite/downloader.py
src/pmidcite/icite/papers.py
src/pmidcite/icite/paper.py
src/pmidcite/icite/api.py
src/pmidcite/icite/utils.py
src/pmidcite/icite/entry.py
src/pmidcite/icite/dnldr/pmid_dnlder.py
src/pmidcite/icite/dnldr/pmid_loader.py
src/pmidcite/icite/dnldr/pmid_dnlder_base.py
src/pmidcite/icite/dnldr/pmid_dnlder_only.py
src/pmidcite/icite/nih_grouper.py
src/pmidcite/cli/readpmids.py
src/pmidcite/cli/rptdatestop.py
src/pmidcite/cli/querypubmed.py
src/pmidcite/cli/entry_keyset.py
src/pmidcite/cli/utils.py
src/pmidcite/cli/icite.py
src/pmidcite/cli/dnldpubmed.py
src/pmidcite/cfgini.py
find src -regextype posix-extended -regex "[a-z./]*" -type d
src
src/bin
src/tests
src/tests/data
src/pmidcite
src/pmidcite/eutils
src/pmidcite/eutils/cmds
src/pmidcite/eutils/pubmed
src/pmidcite/eutils/pubmed/counts
src/pmidcite/plot
src/pmidcite/icite
src/pmidcite/icite/dnldr
src/pmidcite/cli
Tested on macOS.
Thank you so much for writing this project and showing me how to use it!
I am seeing the following warning. I know it's not from your package, but I thought I would let you know.
% icite 32976797
/Users/user1/Library/Python/3.9/lib/python/site-packages/urllib3/__init__.py:34: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
warnings.warn(
Thank you very much!
I love your tool! It is so amazing. Thank you so much for this.
I see you can do PubMed searches from the script but I need to do it from the command line. Would it be possible to do this from the command line?
icite -s 'HIV AND methylation AND (2017:2023[pdat])' -o HIV_meth_gt2017.txt
Hey & thx for your tool!
My issue:
After installing pmidcite via pip install pmidcite
calling icite
does not work.
Error:
-bash: icite: command not found
fl: only return publications with the given fields. Separate multiple fields with commas (no space). Field names are very specific and listed in Response example below. No fl param will return all fields.
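As a sketch of how the fl parameter could be used to request only a few fields, the helper below builds a request URL with comma-separated PMIDs and field names. The endpoint URL, the pmids parameter name, and the example field names are assumptions to check against the NIH iCite API documentation; build_icite_url is a hypothetical helper, and no request is actually sent.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Assumed endpoint; confirm against the NIH iCite API documentation
ICITE_PUBS_URL = "https://icite.od.nih.gov/api/pubs"


def build_icite_url(pmids: Iterable[int], fields: Optional[Iterable[str]] = None) -> str:
    """Build a request URL, adding fl only when fields are given."""
    params = {"pmids": ",".join(str(p) for p in pmids)}
    if fields:
        # Multiple fields are separated with commas and no spaces
        params["fl"] = ",".join(fields)
    # safe="," keeps the commas readable instead of encoding them as %2C
    return f"{ICITE_PUBS_URL}?{urlencode(params, safe=',')}"


# Field names here are assumed examples
print(build_icite_url([32976797, 32809475], fields=["pmid", "year", "citation_count"]))
```

Leaving fields as None omits fl entirely, which matches the description above that no fl parameter returns all fields.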
Hey all,
I made a fresh Python 3.8 environment and ran pip install pmidcite
and got the following error:
Collecting pmidcite
Using cached pmidcite-0.0.36.tar.gz (2.6 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [7 lines of output]
running egg_info
creating /private/var/folders/lg/xq84d0w15qv8wrgxqbrxznkm0000gn/T/pip-pip-egg-info-zi1zdbyn/pmidcite.egg-info
writing /private/var/folders/lg/xq84d0w15qv8wrgxqbrxznkm0000gn/T/pip-pip-egg-info-zi1zdbyn/pmidcite.egg-info/PKG-INFO
writing dependency_links to /private/var/folders/lg/xq84d0w15qv8wrgxqbrxznkm0000gn/T/pip-pip-egg-info-zi1zdbyn/pmidcite.egg-info/dependency_links.txt
writing top-level names to /private/var/folders/lg/xq84d0w15qv8wrgxqbrxznkm0000gn/T/pip-pip-egg-info-zi1zdbyn/pmidcite.egg-info/top_level.txt
writing manifest file '/private/var/folders/lg/xq84d0w15qv8wrgxqbrxznkm0000gn/T/pip-pip-egg-info-zi1zdbyn/pmidcite.egg-info/SOURCES.txt'
error: package directory 'src/pmidcite/eutils/pubmed/mesh' does not exist
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
I'm using an M1 Mac running macOS Monterey 12.3 and Python 3.8. Any ideas why?
Great package, ty. Rather than making API requests for a list of PMIDs, is there a dump of the PubMed citation network somewhere? I would just need the list of edges in two columns (source, destination). I figured this would max out API requests fairly quickly using the examples provided. I know this is a big and maybe impossible ask, but I wanted to find out how I could obtain as much of the network as possible to assist in a recommendation engine I'm working on. Thank you.
icite -H $PMID -c > $PMID.txt
I would love to use this tool in a program, but the only issue I have is that it is meant to be human-readable, rather than machine-readable. I mean this in the sense that it is space-delimited to keep the columns in line, but the number of spaces is naturally inconsistent.
Is there an option for the output of the above command to be tab-delimited instead? Comma delimiting seems dangerous due to the paper names potentially containing commas. Standard commands to convert spaces to tabs fail due to the paper names and limiting to only consecutive spaces causes issues with columns 5, 6, and 7 where they are often only separated by a single space.
Thanks!
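For anyone who has the fields available as separate values in Python rather than as aligned text, writing tab-delimited rows with the standard csv module is one way to get machine-readable output in which commas and spaces in titles are harmless. This is a sketch of the output format only: rows_to_tsv is a hypothetical helper, and the row values are made up for illustration rather than taken from real icite output.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_tsv(rows: Iterable[Sequence[object]]) -> str:
    """Serialize rows as tab-delimited text, one row per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Made-up values; the last field is a title containing a comma and spaces
rows = [["TOP", 12345, 2020, 10, 2, 30, "A title, with commas"]]
text = rows_to_tsv(rows)

# Reading it back recovers every field intact, as strings
parsed = list(csv.reader(io.StringIO(text), delimiter="\t"))
print(parsed)
```

Because the delimiter is a tab, the title survives without quoting, and a downstream program can split each line on tabs alone.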