eimenhmdt / autoresearcher

⚡ Automating scientific workflows with AI ⚡

License: MIT License

Python 100.00%

autoresearcher's Introduction

👋 Hi, I’m Eimen
🫶 Social scientist turned engineer
👀 I enjoy working on transformational products
🏗️ Currently building https://isaaceditor.com

autoresearcher's Issues

Some new features

I like your project, as I am building a research GPT. The following features would add value:

Add relevance scoring for the top 20 citations. The score could be based on the percentage of keywords matched.
For a given sentence (which a researcher might know from expert knowledge of the area), get relevant citations to back the statement. This would be a very useful feature in a literature review.
Shortlist and save papers. Find research gaps in the shortlisted papers and summarize them.

Please see if you can implement them. We could also collaborate if you are open to it; I am ready to code some of these features.
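
To make the first suggestion concrete, here is a minimal sketch of a keyword-matching percentage. It is not part of autoresearcher; the function name `keyword_match_score` is hypothetical, and it assumes single-word keywords matched against a paper's title or abstract.

```python
import re


def keyword_match_score(keywords, text):
    """Return the percentage (0-100) of keywords that occur as words in text."""
    # Lowercase and split the text into alphanumeric words.
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    wanted = [k.strip().lower() for k in keywords if k.strip()]
    if not wanted:
        return 0.0
    hits = sum(1 for k in wanted if k in words)
    return 100.0 * hits / len(wanted)
```

Papers could then be sorted by this score in descending order. Multi-word keywords would need phrase matching, which this sketch does not attempt.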

I kept getting a JSONDecodeError from the function get_citation_by_doi(): raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0).

For now I worked around it by commenting out the function, the extract-citation section, and the DOI-related logic under 'for papers in papers', so the citation is just the paper ("url"). I wanted to flag it in case anyone else hits a similar issue.
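
"Expecting value: line 1 column 1 (char 0)" usually means the response body was empty or was not JSON at all (for example an HTML error page). A less invasive workaround than commenting the function out might be to guard the decode step. This is only a sketch: `parse_json_or_none` is a hypothetical helper, and how get_citation_by_doi() actually obtains its response body is not shown in this issue.

```python
import json


def parse_json_or_none(body):
    """Return the decoded JSON, or None when body is empty or not valid JSON."""
    if not body or not body.strip():
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # Non-JSON payload, e.g. an HTML error page from the citation service.
        return None
```

The caller could then fall back to the paper URL only when the result is None, instead of dropping DOI citations for every paper.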

Surpassing the max number of tokens allowed

Using this research question: "How to optimize the demand response process using Surrogates trained by Active Learning"

I get the following error running the code in Colab:

InvalidRequestError Traceback (most recent call last)
in <cell line: 2>()
1 # Run the Literature Review
----> 2 researcher = literature_review(research_question, output_file=file)
3 researcher()

7 frames
/usr/local/lib/python3.9/dist-packages/openai/api_requestor.py in _interpret_response_line(self, rbody, rcode, rheaders, stream)
677 stream_error = stream and "error" in resp.data
678 if stream_error or not 200 <= rcode < 300:
--> 679 raise self.handle_error_response(
680 rbody, rcode, resp.data, rheaders, stream_error=stream_error
681 )

InvalidRequestError: This model's maximum context length is 4097 tokens. However, you requested 4255 tokens (2455 in the messages, 1800 in the completion). Please reduce the length of the messages or completion.

I guess this could be fixed easily by bounding the number of tokens in the prompt generated by joining the answers (line 48 in autoresearcher/workflows/literature_review/literature_review.py), or by dynamically adjusting the number of completion tokens requested from OpenAI based on the number of tokens already used in the prompt.
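
A rough sketch of the second option, using the numbers from the error above. The helper name `completion_budget` is hypothetical, and counting the prompt tokens is assumed to happen elsewhere with a tokenizer.

```python
def completion_budget(prompt_tokens, requested=1800, context_limit=4097):
    """Cap the completion size so prompt + completion fits in the context window."""
    available = context_limit - prompt_tokens
    if available <= 0:
        # Shrinking the completion cannot help; the prompt itself must be shortened.
        raise ValueError("prompt alone exceeds the context limit")
    return min(requested, available)
```

With the 2455 prompt tokens reported in the error, this would request 1642 completion tokens instead of 1800, which fits exactly within the 4097 limit.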

Expose `fetch_and_sort_papers()` parameters in `literature_review()`

I think it would be great to be able to configure the following in the literature_review() function:

keyword combinations
year_range
top_n papers
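
Roughly what I have in mind, as a sketch only. `literature_review_sketch` and the injected `fetch_papers` callable are hypothetical stand-ins, not the package's real signatures; the defaults mirror the "2000-2023" range and "top 20" that the package prints and passes today.

```python
def literature_review_sketch(
    research_question,
    fetch_papers,
    keyword_combinations=None,
    year_range="2000-2023",
    top_n=20,
):
    """Forward the search settings to the fetcher instead of hardcoding them."""
    # Fall back to the raw question when no combinations are supplied.
    combos = keyword_combinations or [research_question]
    return fetch_papers(
        research_question,
        keyword_combinations=combos,
        year_range=year_range,
        top_n=top_n,
    )
```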

Failed to fetch data from API: 400

Research question: Omnichannel marketing
Auto Researcher initiated!
Generating keyword combinations...
Keyword combinations generated!
Fetching top 20 papers...
Exception Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_22004/801009983.py in
----> 1 researcher = literature_review(research_question)

~\anaconda3\envs\Python38\lib\site-packages\autoresearcher\workflows\literature_review\literature_review.py in literature_review(research_question, output_file)
73 search_query = research_question
74 print(colored("Fetching top 20 papers...", "yellow"))
---> 75 top_papers = SemanticScholar.fetch_and_sort_papers(search_query, keyword_combinations=keyword_combinations, year_range="2000-2023")
76 print(colored("Top 20 papers fetched!", "green"))
77

~\anaconda3\envs\Python38\lib\site-packages\autoresearcher\data_sources\web_apis\semantic_scholar_loader.py in fetch_and_sort_papers(self, search_query, limit, top_n, year_range, keyword_combinations, weight_similarity)
25
26 for combination in keyword_combinations:
---> 27 papers.extend(self.fetch_data(combination, limit, year_range))
28
29 max_citations = max(papers, key=lambda x: x['citationCount'])['citationCount']

~\anaconda3\envs\Python38\lib\site-packages\autoresearcher\data_sources\web_apis\semantic_scholar_loader.py in fetch_data(self, search_query, limit, year_range)
16 params["year"] = year_range
17
---> 18 data = self.make_request("", params=params)
19 return data.get('data', [])
20

~\anaconda3\envs\Python38\lib\site-packages\autoresearcher\data_sources\web_apis\base_web_api_data_loader.py in make_request(self, endpoint, params)
18 return data
19 else:
---> 20 raise Exception(f"Failed to fetch data from API: {response.status_code}")
21

Exception: Failed to fetch data from API: 400

Exception: Failed to fetch data from API: 429

Cool package!

I am receiving this error:

Research question: What mutations in the N gene of sars cov 2 are involved in rapid antigen test failurre?
Auto Researcher initiated!
Generating keyword combinations...
Keyword combinations generated!
Fetching top 20 papers...

Traceback (most recent call last):
File "", line 1, in
File "/home/amar/miniconda3/envs/write-the/lib/python3.9/site-packages/autoresearcher/workflows/literature_review/literature_review.py", line 85, in literature_review
top_papers = SemanticScholar.fetch_and_sort_papers(search_query, keyword_combinations=keyword_combinations, year_range="2000-2023")
File "/home/amar/miniconda3/envs/write-the/lib/python3.9/site-packages/autoresearcher/data_sources/web_apis/semantic_scholar_loader.py", line 27, in fetch_and_sort_papers
papers.extend(self.fetch_data(combination, limit, year_range))
File "/home/amar/miniconda3/envs/write-the/lib/python3.9/site-packages/autoresearcher/data_sources/web_apis/semantic_scholar_loader.py", line 18, in fetch_data
data = self.make_request("", params=params)
File "/home/amar/miniconda3/envs/write-the/lib/python3.9/site-packages/autoresearcher/data_sources/web_apis/base_web_api_data_loader.py", line 20, in make_request
raise Exception(f"Failed to fetch data from API: {response.status_code}")
Exception: Failed to fetch data from API: 429

My code:

from autoresearcher import literature_review

research_question = "What mutations in the N gene of sars cov 2 are involved in rapid antigen test failurre?"
researcher = literature_review(research_question)
researcher = literature_review(research_question, output_file="my_literature_review.txt")

The 429 response suggests too many requests.
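
One possible mitigation is to retry with exponential backoff. The sketch below is an assumption, not something the package provides: `with_backoff` is a hypothetical helper, and it detects rate limiting by looking for "429" in the exception message, since the traceback shows a plain Exception being raised.

```python
import time


def with_backoff(call, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call `call()`, retrying on 429 errors with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception as exc:
            # Only retry rate-limit errors; re-raise anything else, or the last failure.
            if "429" not in str(exc) or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage would look like `with_backoff(lambda: literature_review(research_question))`. Note that my code above calls literature_review() twice in a row, which doubles the number of API requests.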

Discord invite link has expired

The Discord link https://discord.gg/PnQDR5h9 in the readme does not work!

The same paper is picked up many times

In some instances, it seems autoresearcher is reading from the same paper multiple times. Sometimes it goes as far as 5+ times. As expected, this happens more often with more specific questions that may have less literature available in the given time window.

It may be useful to find a way to avoid repeating papers, even if there aren't enough to satisfy the maximum asked for. Just return something like "no more papers found" and break.
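
The traceback in the 400 issue shows fetch_and_sort_papers() extending one list with the results of every keyword combination, so a paper returned by several combinations would appear several times. A minimal de-duplication sketch follows; `dedupe_papers` is a hypothetical helper, and the `paperId` and `title` keys are assumptions about the shape of each paper dict.

```python
def dedupe_papers(papers, top_n=20):
    """Keep the first occurrence of each paper, returning at most top_n papers."""
    seen = set()
    unique = []
    for paper in papers:
        # Prefer a stable ID; fall back to a normalized title.
        key = paper.get("paperId") or (paper.get("title") or "").strip().lower()
        if key and key in seen:
            continue
        if key:
            seen.add(key)
        unique.append(paper)
    return unique[:top_n]
```

If fewer than top_n unique papers exist, the shorter list is returned as-is rather than being padded with repeats.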

Old Topic Paper Prioritisation

For some questions about long-established knowledge, such as topics that have been studied for a very long time, the AI still only looks up recent papers (going back to 2000). This is likely because older papers are not widely available in a digital text format, only as PDFs.

Perhaps, when it is clear that the topic is long-studied, the AI should choose papers not only by their top position, which usually reflects the novelty they bring, but also by the number of references or another heuristic that identifies papers giving a good overview of the field, such as literature reviews. Otherwise, the AI struggles to understand the topic well because it does not read the foundational papers and jumps straight to the cutting edge.
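
As an illustration of the kind of heuristic meant here, the sketch below boosts papers whose titles look like reviews. Everything in it is an assumption: the names `overview_score` and `rank_for_overview`, the hint words, the bonus factor, and the use of a `citationCount` field as the base score.

```python
REVIEW_HINTS = ("review", "survey", "overview", "meta-analysis")


def overview_score(paper, review_bonus=2.0):
    """Score a paper by citations, multiplied by a bonus for review-like titles."""
    citations = paper.get("citationCount") or 0
    title = (paper.get("title") or "").lower()
    is_review = any(hint in title for hint in REVIEW_HINTS)
    return citations * (review_bonus if is_review else 1.0)


def rank_for_overview(papers):
    """Return papers sorted so that well-cited overview papers come first."""
    return sorted(papers, key=overview_score, reverse=True)
```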

How do I change the model to gpt-4 when calling literature_review?

In the following places, I found that use_gpt4=True can be passed to switch the model from GPT-3.5 to GPT-4:

autoresearcher/llms/openai.py
autoresearcher/workflows/literature_review/combine_answers.py
autoresearcher/workflows/literature_review/extract_answers_from_papers.py

However, how do I do the same when calling literature_review? For example:

researcher = literature_review(
    research_question, output_file="answer.txt"
)

Question about search string

Hi, I am not a coder; a friend helped set it up, but autoresearcher is amazing!
My issue is that autoresearcher missed the most obvious articles no matter what I asked.

I asked about drug interactions and the herb Echinacea. I tried various search strings, from complex to a very simple one: "Echinacea, drug interactions"

Each time, autoresearcher produced accurate keyword combinations to search for papers, such as: 1. Echinacea, medication, interactions, 2. Herbal remedies, drug interactions, Echinacea, 3. Echinacea, prescription drugs, interactions, 4. Echinacea, supplements, drug interactions, 5. Echinacea, adverse effects, drug interactions

However, each time autoresearcher failed to identify and use the two articles listed below. Only when I included the exact titles of the papers in the search string did autoresearcher include them.

The most obvious articles were not found, and some articles with little relevance were included.

My questions are: why is this, and what could I do to improve the search questions?
Thank you very much.

These two articles were not included. The first one even has Echinacea and drug interactions in its title. The second paper has a detailed abstract and also mentions Echinacea and drug interactions.

A critical evaluation of drug interactions with Echinacea spp
Camille Freeman, Kevin Spelman
PMID: 18618481 DOI: 10.1002/mnfr.200700113

Review and Assessment of Medicinal Safety Data of Orally Used Echinacea Preparations
Karin Ardjomand-Woelkart, Rudolf Bauer
PMID: 26441065 DOI: 10.1055/s-0035-1558096

Keyword combinations used to search for papers: 1. Echinacea, medication, interactions, 2. Herbal remedies, drug interactions, Echinacea, 3. Echinacea, prescription drugs, interactions, 4. Echinacea, supplements, drug interactions, 5. Echinacea, adverse effects, drug interactions

Literature Review:

Echinacea is a commonly used herbal remedy for the prevention of common cold, but its efficacy remains inconclusive or contradictory (Izzo et al., 2016). Moreover, it may cause potentially serious adverse events, including herb-drug interactions (Izzo et al., 2016). A study by Qato et al. (2016) found that 15.1% of older adults were at risk for potential major drug-drug interactions, and most of these interactions involved medications and dietary supplements increasingly used in 2010-2011, including echinacea. Sparreboom et al. (2004) reported that echinacea has the potential to significantly modulate the activity of drug-metabolizing enzymes and/or the drug transporter P-glycoprotein, and participates in potential pharmacokinetic interactions with anticancer drugs. Sachar and Ma (2013) suggested that echinacea may cause herb-drug interactions through nuclear receptors (NRs) activation, resulting in NR-mediated HDIs. Parvez and Rishi (2019) warned that there exists a potential risk of herb-drug interactions leading to adverse side effects, including hepatotoxicity.

Despite the potential risks associated with echinacea use, it is still one of the most commonly used herbal remedies in the presurgical population (Tsen et al., 2000). However, the article by Chen et al. (2012) did not mention echinacea in their abstract, which focused on possible pharmacokinetic, pharmacodynamic, and herbal drug interactions occurring in the elderly.

Overall, the literature suggests that echinacea may have potential herb-drug interactions and adverse events, and caution should be exercised when using it in combination with other medications or supplements. Further research is needed to fully understand the mechanisms and clinical implications of echinacea-related herb-drug interactions.

References:

Chen, X.-W., B. Sneed, K., Pan, S.-Y., Cao, C., R. Kanwar, J., Chew, H., & Zhou, S.-F. (2012, May 1). Herb-Drug Interactions and Mechanistic and Clinical Considerations. Current Drug Metabolism. Bentham Science Publishers Ltd. http://doi.org/10.2174/1389200211209050640

Izzo, A. A., Hoon-Kim, S., Radhakrishnan, R., & Williamson, E. M. (2016, February 17). A Critical Approach to Evaluating Clinical Efficacy, Adverse Events and Drug Interactions of Herbal Remedies. Phytotherapy Research. Wiley. http://doi.org/10.1002/ptr.5591

Parvez, M. K., & Rishi, V. (2019, June 11). Herb-Drug Interactions and Hepatotoxicity. Current Drug Metabolism. Bentham Science Publishers Ltd. http://doi.org/10.2174/1389200220666190325141422

Qato, D. M., Wilder, J., Schumm, L. P., Gillet, V., & Alexander, G. C. (2016, April 1). Changes in Prescription and Over-the-Counter Medication and Dietary Supplement Use Among Older Adults in the United States, 2005 vs 2011. JAMA Internal Medicine. American Medical Association (AMA). http://doi.org/10.1001/jamainternmed.2015.8581

Sachar, M., & Ma, X. (2013, January 21). Nuclear receptors in herb–drug interactions. Drug Metabolism Reviews. Informa UK Limited. http://doi.org/10.3109/03602532.2012.753902

Sparreboom, A., Cox, M. C., Acharya, M. R., & Figg, W. D. (2004, June 15). Herbal Remedies in the United States: Potential Adverse Interactions With Anticancer Agents. Journal of Clinical Oncology. American Society of Clinical Oncology (ASCO). http://doi.org/10.1200/jco.2004.08.182

Tsen, L. C., Segal, S., Pothier, M., & Bader, A. M. (2000, July 1). Alternative Medicine Use in Presurgical Patients. Anesthesiology. Ovid Technologies (Wolters Kluwer Health). http://doi.org/10.1097/00000542-200007000-00025

Gujjarlamudi, H. (2016). Polytherapy and drug interactions in elderly. Journal of Mid-life Health. Medknow. http://doi.org/10.4103/0976-7800.191021
