Exploring Time Trends and Public Opinions on COVID-19-related Therapeutics on Twitter

This repo contains the official code and analysis results for the paper Using Twitter Data to Understand Public Perceptions of Approved versus Off-label Use for COVID-19-related Medications, published in JAMIA, 2022. We release the following resources:

The NLP pipeline for analyzing public perception of drug use, in this repository.
A RoBERTa-based drug stance detection model, drug-stance-bert, available off the shelf on HuggingFace.

If you use our pipeline or models, please cite our work:

@article{10.1093/jamia/ocac114,
    author = {Hua, Yining and Jiang, Hang and Lin, Shixu and Yang, Jie and Plasek, Joseph M and Bates, David W and Zhou, Li},
    title = "{Using Twitter Data to Understand Public Perceptions of Approved versus Off-label Use for COVID-19-related Medications}",
    journal = {Journal of the American Medical Informatics Association},
    year = {2022},
    month = {07},
    issn = {1527-974X},
    doi = {10.1093/jamia/ocac114},
    url = {https://doi.org/10.1093/jamia/ocac114},
    note = {ocac114},
    eprint = {https://academic.oup.com/jamia/advance-article-pdf/doi/10.1093/jamia/ocac114/44371833/ocac114.pdf},
}

NLP Pipeline

1. Preprocessing

All preprocessing steps are summarized in the preprocessing README. Follow the instructions there.

2. Time trend analysis

The time trend analysis is in the time_trend folder. Follow Processor.ipynb to generate the data for the trend plots, then run plot_trend.py to plot them.
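As a rough illustration of what "data for plotting the trends" can look like, the sketch below counts tweets per day and per drug. It is not the code in Processor.ipynb: the function name `daily_mention_counts`, the `created_at` and `text` field names, the substring matching rule, and the toy tweets are all assumptions made for this example.

```python
from collections import Counter
from datetime import date, datetime


def daily_mention_counts(tweets, drugs):
    """Count tweets per (day, drug) using case-insensitive substring matching.

    `tweets` is a list of dicts with assumed keys "created_at" (ISO 8601
    timestamp) and "text"; `drugs` is a list of lowercase drug names.
    """
    counts = Counter()
    for tweet in tweets:
        day = datetime.fromisoformat(tweet["created_at"]).date()
        text = tweet["text"].lower()
        for drug in drugs:
            if drug in text:
                counts[(day, drug)] += 1
    return counts


# Toy data invented for this example; not taken from the study's dataset.
tweets = [
    {"created_at": "2020-03-20T10:00:00", "text": "Is hydroxychloroquine safe?"},
    {"created_at": "2020-03-20T18:30:00", "text": "Hydroxychloroquine and remdesivir news"},
    {"created_at": "2020-03-21T09:15:00", "text": "Remdesivir trial results"},
]
drugs = ["hydroxychloroquine", "remdesivir"]

counts = daily_mention_counts(tweets, drugs)
for (day, drug), n in sorted(counts.items()):
    print(day.isoformat(), drug, n)

# Look up a single day/drug pair.
first_day_hcq = counts[(date(2020, 3, 20), "hydroxychloroquine")]
```

A table of (day, drug, count) rows like this is the kind of input a plotting script can turn into one line per drug over time.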

3. Stance detection

To train the stance models, follow the trainer. To run inference on tweets, follow inference.ipynb.

4. Geoinference

Use processor.ipynb to process the output of stance detection, unify user locations, and calculate the statewide average stance. Use state_map.Rmd to visualize the results.
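The two steps named above, unifying user locations and averaging stance by state, can be sketched as follows. This is a minimal illustration, not the logic in processor.ipynb: the alias table, the stance label set and its numeric scores, and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup from free-text location fragments to state codes.
STATE_ALIASES = {
    "massachusetts": "MA", "boston": "MA", "ma": "MA",
    "california": "CA", "los angeles": "CA", "ca": "CA",
}

# Assumed stance labels and scores; the released model's labels may differ.
STANCE_SCORE = {"against": -1, "neutral": 0, "favor": 1}


def normalize_location(raw):
    """Map a free-text profile location to a state code, or None if unknown."""
    parts = [p.strip().lower() for p in raw.split(",")]
    # Check trailing fragments first, so "Boston, MA" resolves via "ma".
    for part in reversed(parts):
        if part in STATE_ALIASES:
            return STATE_ALIASES[part]
    return None


def statewide_average_stance(records):
    """Average the stance score per state over (location, stance) pairs."""
    totals = defaultdict(lambda: [0, 0])  # state -> [score sum, tweet count]
    for location, stance in records:
        state = normalize_location(location)
        if state is None:
            continue  # skip locations that cannot be resolved to a state
        totals[state][0] += STANCE_SCORE[stance]
        totals[state][1] += 1
    return {state: total / n for state, (total, n) in totals.items()}


# Toy records invented for this example.
records = [
    ("Boston, MA", "favor"),
    ("massachusetts", "against"),
    ("Los Angeles, CA", "favor"),
    ("Somewhere on Earth", "neutral"),
    ("Cambridge, MA", "favor"),
]
averages = statewide_average_stance(records)
print(averages)
```

Unresolvable locations are dropped rather than guessed, which keeps the state averages from being skewed by ambiguous profile text.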

5. Content analysis

Follow clustering.ipynb to cluster and visualize tweets. Use NER.ipynb to extract named entities and visualize the results as word clouds.

6. Demographic analysis

Follow the README.md file inside demographic_analysis to replicate the demographic and political orientation inference and analysis.

Acknowledgement

This study was approved by the Mass General Brigham Institutional Review Board. The NLP pipeline is joint work between Yining Hua (@ningkko) and Hang Jiang (@hjian42). We thank our coauthors Shixu Lin, Jie Yang, Joseph Plasek, David Bates, and Li Zhou. We also thank the MIT Center for Constructive Communication (CCC) for funding the American politician dataset from Ballotpedia.
