peneira

It's time to sift through some articles 🤭

With this CLI you can search for papers for your research in different sources and export the results.

DISCLAIMER: This is a work in progress. The code is under active development and it's not ready for production use.

Available sources

...and many more to come! Feel free to contribute. There is a world of papers out there!

OpenAlex

Here are some details about this source:

This library obeys the rate limits of the OpenAlex API (10 requests per second).
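
As an illustration of what obeying a limit of 10 requests per second can look like, here is a minimal sketch that spaces calls evenly. It is not taken from peneira's code: the `RateLimiter` class and its parameters are hypothetical names used only for this example.

```python
import time


class RateLimiter:
    """Allow at most `max_calls` calls per `period` seconds by spacing them evenly.

    Hypothetical helper for illustration; not part of peneira's API.
    """

    def __init__(self, max_calls, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until the next call is allowed, then record it."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# 10 requests per second, the OpenAlex limit quoted above.
open_alex_limiter = RateLimiter(max_calls=10, period=1.0)
```

Calling `open_alex_limiter.wait()` before each request would keep consecutive requests at least 0.1 seconds apart.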

Usage

CLI

You can interact with the CLI using peneira. For example, to search for papers on "artificial intelligence" and "syndromic surveillance" and save the results to a file, you can run:

peneira -s open_alex -s semantic_scholar --filename my-papers.json

You will be prompted to enter the search query for each source. The library will search for papers in OpenAlex and Semantic Scholar and store the results in a file named my-papers.json. If no filename is provided, the results will be stored in results.json.

You can also export the results to a BibTeX file:

peneira -s open_alex -s semantic_scholar --format bibtex --filename my-papers.bib

peneira's People

Contributors

anapaulagomes, dependabot[bot]

Forkers

gap10

peneira's Issues

Add support to CORE API registered API KEY

References:
https://api.core.ac.uk/docs/v3#section/Rate-limits

  • Registered Personal users: 10,000 tokens per day, maximum 10 per minute
  • Registered Academic (Supporting / Sustaining members) and Non-academic users: 200k tokens per day
  • Monitor your current api limit by looking for our customised HTTP headers: X-RateLimitRemaining, X-RateLimit-Retry-After, X-RateLimit-Limit

What needs to be done here:

  • Read the API Key from the env var
  • If there is a key, increase the rate limits to a maximum of 10 per minute
  • If there is a key, check the headers, calculate if it is close to the limit and show warning messages
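
The three steps above could be sketched roughly as follows. This is only an assumption about how it might look: the environment variable name `CORE_API_KEY`, the function names, the `anonymous_limit` parameter, and the 10% warning threshold are all hypothetical. The header names are the ones listed in the CORE documentation excerpt above.

```python
import os
import warnings


def core_api_key(env=os.environ):
    """Return the CORE API key, or None. CORE_API_KEY is an assumed variable name."""
    return env.get("CORE_API_KEY") or None


def requests_per_minute(api_key, anonymous_limit):
    """With a key, use the 10-per-minute limit quoted above; otherwise keep the current one."""
    return 10 if api_key else anonymous_limit


def check_remaining(headers, threshold=0.1):
    """Warn when the remaining quota is at or below `threshold` (a fraction) of the limit.

    Returns True if a warning was issued.
    """
    try:
        remaining = int(headers["X-RateLimitRemaining"])
        limit = int(headers["X-RateLimit-Limit"])
    except (KeyError, ValueError):
        return False  # headers missing or unparsable: nothing to report
    if limit > 0 and remaining <= limit * threshold:
        warnings.warn(f"CORE API quota almost exhausted: {remaining} of {limit} left")
        return True
    return False
```

Here `headers` is a plain dict, so the lookup is case-sensitive; an HTTP client's response headers object may behave differently.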

Add log

It will be helpful to debug errors and warnings.
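
A minimal sketch of what this could look like with the standard `logging` module. The logger name `peneira`, the `configure_logging` function, and its `verbose` flag are assumptions made for this example, not existing code.

```python
import logging


def configure_logging(verbose=False):
    """Set up a package-level logger; DEBUG when verbose, WARNING otherwise."""
    logger = logging.getLogger("peneira")  # assumed logger name
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```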

Add support to Semantic Scholar registered API KEY

Reference:
https://www.semanticscholar.org/product/api#api-key
https://api.semanticscholar.org/api-docs/
https://www.semanticscholar.org/product/api/tutorial#key

What needs to be done here:

  • Read the API Key from the env var
  • If there is a key, set the header x-api-key (case-sensitive)
  • If there is a key, increase the rate limits to the following (received via email):
      • 1 request per second for the following endpoints: /paper/batch, /paper/search, /recommendations
      • 10 requests per second for all other calls
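
A rough sketch of the points above, under stated assumptions: the environment variable name `SEMANTIC_SCHOLAR_API_KEY` and both function names are hypothetical, and matching endpoints by substring is just one simple way to tell the slower endpoints apart.

```python
import os

# Endpoints limited to 1 request per second, per the list above.
SLOW_ENDPOINTS = ("/paper/batch", "/paper/search", "/recommendations")


def auth_headers(env=os.environ):
    """Build request headers. SEMANTIC_SCHOLAR_API_KEY is an assumed variable name."""
    api_key = env.get("SEMANTIC_SCHOLAR_API_KEY")
    return {"x-api-key": api_key} if api_key else {}


def requests_per_second(path):
    """Return the keyed rate limit that applies to a request path."""
    if any(endpoint in path for endpoint in SLOW_ENDPOINTS):
        return 1
    return 10
```

The result of `requests_per_second` could then be used to choose how long to wait between calls to a given endpoint.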

Month should be an integer in bibtex entry

legacy month field 'Apr' in entry 'Oyebode_Ndulue_Adib_Mulchandani_Suruliraj_Orji_Chambers_Meier_Orji_2021' is not an integer - this will probably not sort properly.

Instead of Apr it should be 4.
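
One possible way to normalize the field before writing the entry, as a sketch only. The `month_to_int` function is a hypothetical helper, and it assumes English month names or abbreviations.

```python
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}


def month_to_int(value):
    """Convert 'Apr', 'April' or '4' to 4; return None if the value is not a month."""
    text = str(value).strip()
    if text.isdigit():
        number = int(text)
        return number if 1 <= number <= 12 else None
    # Only the first three letters are compared, so full names also match.
    return MONTHS.get(text[:3].lower())
```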

Adapt search string using LLM

Each source has its own syntax for search strings. To avoid asking the user to write a separate query for each one, an LLM could generate the query for each source, using few-shot prompting to guide it with examples.

We can also ask the user for confirmation and let them pass their own queries or correct the one generated by the LLM.
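
To make the few-shot idea concrete, here is a sketch that only assembles the prompt text; it does not call any LLM. The `build_prompt` function, the prompt wording, and the shape of `examples` are assumptions for illustration.

```python
def build_prompt(query, target_source, examples):
    """Assemble a few-shot prompt.

    `examples` is a list of (source, original_query, adapted_query) tuples
    that show the model how queries are written for each source.
    """
    lines = ["Rewrite the search string using the syntax of the target source.", ""]
    for source, original, adapted in examples:
        lines += [f"Source: {source}", f"Input: {original}", f"Output: {adapted}", ""]
    # The final block is left open so the model completes the output.
    lines += [f"Source: {target_source}", f"Input: {query}", "Output:"]
    return "\n".join(lines)
```

The generated query could then be shown to the user for confirmation or correction before the search runs.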

peneira aborts

I get errors when I try to run the module.

C:\Users\niels>peneira -s open_alex -o bibtex -f my-papers.bib
Please enter the search string for open_alex: 'ceramic spheres' uhpc
open_alex: Fetching articles for OPEN_ALEX... 24 papers distributed in 1 pages.
Executing the search...

I attach the log of the error messages here.

Traceback (most recent call last).txt

/Frank
