arf9999 / rtweetxtras

Twitter analysis functions built for use with the rtweet package (by mkearney)

License: Creative Commons Zero v1.0 Universal

R 100.00%
twitter rstats-package rstats-twitter rtweet twitter-analysis

rtweetxtras's Introduction

rtweetXtras

A collection of helper functions for Twitter analysis using the {rtweet} package. NB: these functions have MANY dependencies. No warranty is offered, but please feel free to log issues. NB: this branch is now in development mode for the upcoming {rtweet} version 1.0, so things will break here. For compatibility with rtweet 0.7, please use the legacy version branch instead.

To install the package, use the {remotes} or {devtools} package:

remotes::install_github("arf9999/rtweetXtras")

hashtagcloud

hashtagcloud(rtweet_timeline_df, num_words = 200)

Delivers a wordcloud of hashtag terms in an rtweet tibble.

df <- rtweet::get_timeline("jack", n=1000)
rtweetXtras::hashtagcloud(df, num_words = 200)
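
As a rough illustration of the counting step that sits behind a hashtag wordcloud, the sketch below tallies hashtags from a few made-up tweet texts in base R. It is not the package's implementation: the `texts` vector, the regular expression, and the lower-casing are assumptions made for illustration, and the drawing of the cloud itself is left out.

```r
# Hypothetical tweet texts (toy data, not fetched from Twitter)
texts <- c(
  "Learning #rstats with #ggplot2",
  "More #rstats tips",
  "No tags here",
  "#RStats and #dataviz"
)

# Pull out every hashtag; the pattern is an assumption, not the package's rule
tags <- unlist(regmatches(texts, gregexpr("#[[:alnum:]_]+", texts)))

# Treat "#RStats" and "#rstats" as the same term
tags <- tolower(tags)

# Frequency table, most common first
freq <- sort(table(tags), decreasing = TRUE)

# Keep only the top terms, in the spirit of the num_words argument
num_words <- 2
top <- head(freq, num_words)
print(top)
```

A wordcloud function would then size each term according to its count in `top`.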

profilecloud

profilecloud(rtweet_timeline_df, num_words = 200)
Delivers a wordcloud of terms in unique twitter user profiles in an rtweet tibble.

df1 <- rtweet::search_tweets("rstats",lang = "en", n= 2000)
rtweetXtras::profilecloud(df1, 100)

bar_plot_mentions

bar_plot_mentions(rtweet_df, no_of_bars = 20, title = NULL)
Delivers a bar plot of the count of user mentions in an rtweet tibble. Defaults to top twenty mentioned accounts.

df1 <- rtweet::search_tweets("#rstats", lang = "en", n = 2000)
rtweetXtras::bar_plot_mentions(df1, 20, title = paste("Barplot of user mentions in twitter search for \"#rstats\"", Sys.Date()))

common_follower_analysis

common_follower_analysis (user_list, follower_depth = 200, no_of_sets = 7, token = NULL)
This function creates an UpSetR graph of common followers. Code cribbed from Bob Rudis’ 21 Recipes for Mining Twitter with Rtweet: https://rud.is/books/21-recipes/visualizing-intersecting-follower-sets-with-upsetr.html

rstats_list <- c("hadleywickham", "dataandme", "juliasilge", "statgarrett", "thomasp85")

rtweetXtras::common_follower_analysis(rstats_list, follower_depth = 1000, no_of_sets = 5, token = NULL)

common_follower_matrix

common_follower_matrix (user_list, follower_depth = 200, token = NULL)
This function creates a matrix of followers of a list of twitter users, sums the number of common followers, and then ranks them in descending order.

rstats_list <- c("hadleywickham", "dataandme", "juliasilge", "statgarrett", "thomasp85")

fm <- rtweetXtras::common_follower_matrix(rstats_list, follower_depth = 200, token = NULL)
dplyr::glimpse(fm)

## Rows: 902
## Columns: 9
## $ screen_name       <chr> "statistician_dr", "ntthong", "Sahar62195425", "TimF…
## $ user_id           <chr> "1537438062564626433", "133186653", "124793356659759…
## $ hadleywickham     <dbl> 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1…
## $ dataandme         <dbl> 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0…
## $ juliasilge        <dbl> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0…
## $ statgarrett       <dbl> 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1…
## $ thomasp85         <dbl> 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ sum_intersections <dbl> 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2…
## $ ranking           <int> 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3…
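
To make the shape of this output concrete, here is a minimal base R sketch of the same idea on toy data: build a 0/1 membership matrix, sum across accounts, and rank in descending order. The account names and follower ids are hypothetical, and the use of a dense rank (ties share a rank, with no gaps) is an assumption inferred from the output above rather than taken from the package source.

```r
# Hypothetical follower lists for three accounts (toy data)
followers <- list(
  alice = c("u1", "u2", "u3"),
  bob   = c("u2", "u3", "u4"),
  carol = c("u3", "u5")
)

# Every follower id seen across all accounts
all_ids <- sort(unique(unlist(followers)))

# One 0/1 column per account: 1 if the user follows that account
membership <- sapply(followers, function(ids) as.numeric(all_ids %in% ids))

fm_sketch <- data.frame(user_id = all_ids, membership, stringsAsFactors = FALSE)

# Number of listed accounts each user follows
fm_sketch$sum_intersections <- rowSums(membership)

# Dense rank: highest sum gets rank 1, ties share a rank
fm_sketch$ranking <- match(
  fm_sketch$sum_intersections,
  sort(unique(fm_sketch$sum_intersections), decreasing = TRUE)
)

# Most widely shared followers first
fm_sketch <- fm_sketch[order(fm_sketch$ranking), ]
rownames(fm_sketch) <- NULL
print(fm_sketch)
```

Here `u3` follows all three toy accounts, so it lands in the first row with rank 1.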

account_activity

account_activity(account_name, depth = 3200, time_zone = "Africa/Johannesburg", no_of_weeks = 4, token = NULL)
This function creates a bubble plot of account activity by hour for a single Twitter screen_name
(inspired by a Python script by Twitter user @Conspirator0).

rtweetXtras::account_activity("arfness", depth = 1000, time_zone = "Africa/Johannesburg", no_of_weeks = 5, token = NULL)
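
The time_zone argument matters because tweet timestamps are normally stored in UTC, while activity by hour is only meaningful in local time. The sketch below shows one way to bucket timestamps by local day and hour in base R. The timestamps are made up, and the grouping shown is an assumption about the general approach, not the package's own code.

```r
# Hypothetical tweet timestamps in UTC (toy data)
created_at <- as.POSIXct(
  c("2022-06-01 22:30:00", "2022-06-02 06:15:00", "2022-06-02 06:45:00"),
  tz = "UTC"
)

time_zone <- "Africa/Johannesburg"  # UTC+2, no daylight saving time

# Express each timestamp in local time before extracting day and hour
local_hour <- as.integer(format(created_at, "%H", tz = time_zone))
local_day  <- format(created_at, "%Y-%m-%d", tz = time_zone)

# Count tweets per (day, hour) cell; these counts would drive the bubble sizes
activity <- as.data.frame(
  table(day = local_day, hour = local_hour),
  stringsAsFactors = FALSE
)
activity <- activity[activity$Freq > 0, ]
print(activity)
```

Note that the 22:30 UTC tweet falls on the following calendar day in Johannesburg, which is why the conversion has to happen before the grouping.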

follower_dot_plot

follower_dot_plot(follower_df, point_colour = "statuses_count", show_legend = TRUE, suppress_warnings = TRUE, include_loess_smooth = FALSE, include_lm = FALSE, print_immediately = TRUE, log_transform = FALSE, viridis_option = "magma")

This builds a ggplot2 scatter plot with the creation date of twitter followers mapped to the order in which they followed. Various options of dot colouring are available to examine the follower’s attributes. Optionally, a loess smoothed curve and/or a linear model can be overlaid to analyse the change in follower creation dates over time.

arfness_followers <- rtweetXtras::get_followers_fast("arfness")

## [1] "arfness follower count = 6859"
## [1] "followers captured: 6859 out of 6859"

rtweetXtras::follower_dot_plot(arfness_followers, point_colour = "earliest_follow")

scatter_ts_interactive

scatter_ts_interactive(df, title = "", print_immediately = TRUE)

Builds an interactive timeline of tweets showing the time of each tweet, its author, its type (tweet, retweet, quote, reply), and the rate of tweets in posts per second, calculated as a moving average over 10 tweets. Posts per second are calculated separately for content tweets (tweets, quotes and replies) and copy tweets (retweets). Output is an Apache ECharts HTML widget.

mickey <- rtweet::search_tweets("Mickey Mouse", n = 100)
rtweetXtras::scatter_ts_interactive(mickey, title = "100 tweets about Mickey Mouse")
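
The posts-per-second figure can be pictured with a small base R sketch. It assumes the rate for a window of 10 consecutive tweets is the number of intervals between them (9) divided by the elapsed seconds. That formula, the function name `posts_per_second`, and the toy timestamps are all assumptions for illustration, since the exact calculation used by the package is not described here.

```r
# Hypothetical timestamps: 30 tweets, one every 2 seconds (toy data)
created_at <- as.POSIXct("2022-06-01 12:00:00", tz = "UTC") +
  seq(0, by = 2, length.out = 30)

# Moving-average rate over `window` consecutive tweets (illustrative helper)
posts_per_second <- function(times, window = 10) {
  times <- sort(times)
  n <- length(times)
  rate <- rep(NA_real_, n)
  if (n < window) return(rate)
  for (i in window:n) {
    elapsed <- as.numeric(difftime(times[i], times[i - window + 1], units = "secs"))
    # `window` tweets are separated by window - 1 intervals;
    # leave NA when all tweets in the window share one timestamp
    rate[i] <- if (elapsed > 0) (window - 1) / elapsed else NA_real_
  }
  rate
}

rate <- posts_per_second(created_at)
print(rate[10])
```

To mirror the separate treatment of content tweets and retweets, the same helper could be applied to each subset of timestamps in turn.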

Other functions

The package includes some additional tools and wrappers for rtweet functions:

get_followers_fast and get_friends_fast wrap rtweet functions to deliver a tibble of followers/friends that includes user details. Three columns are added: the order of following or befriending, the name of the account being followed or befriended, and the earliest following or befriending date.
In addition, a list of tokens can be passed to these functions to manage rate limiting when querying accounts with a large number of followers or friends.

rtweet_net and save_csv_edgelist create an igraph network and save it as an edgelist for use in external visualisation software. They are no longer maintained and have been replaced by create_gexf.

create_gexf creates a gexf file for export to Gephi for visualisation.

write_csv_compatible saves a csv file of an rtweet tibble with a modified “text” column that includes “RT @retweet_screen_name:” for all retweets. The original text column is saved as an additional column, “text2”.
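
The text rewrite described for write_csv_compatible can be sketched in base R as follows. The toy data frame and the `is_retweet` flag are assumptions for illustration; only the “text”, “text2” and retweet_screen_name names come from the description above, and the real function may differ in detail.

```r
# Hypothetical tweets (toy data); is_retweet is an assumed flag column
tweets <- data.frame(
  text = c("Hello world", "Great thread on R"),
  is_retweet = c(FALSE, TRUE),
  retweet_screen_name = c(NA, "someuser"),
  stringsAsFactors = FALSE
)

# Keep the original text in a separate column
tweets$text2 <- tweets$text

# Prefix retweets with "RT @screen_name: " and leave other tweets unchanged
tweets$text <- ifelse(
  tweets$is_retweet,
  paste0("RT @", tweets$retweet_screen_name, ": ", tweets$text2),
  tweets$text2
)

# Write to a temporary file so the sketch does not touch the working directory
out <- tempfile(fileext = ".csv")
write.csv(tweets, out, row.names = FALSE)
print(tweets$text)
```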

snscrape_search is a function that uses the Python snscrape library to search historical twitter. Python 3.8 and snscrape need to be installed. See https://github.com/JustAnotherArchivist/snscrape for more information.

snscrape_get_timeline is a function that uses the Python snscrape library to pull a Twitter user timeline, and rtweet to rehydrate it. Note: currently, unlike rtweet::get_timeline(), no retweets are captured, and the API limit of 3200 statuses does not apply because snscrape uses Twitter's web search facility.

get_perspective is a function to query the Google Perspective API to classify toxicity in text. More information here: https://www.perspectiveapi.com/ NB: A Google Cloud API key is required to use this function. Instructions on how to set it up are here: https://developers.perspectiveapi.com/s/docs-get-started

perspective_rtweet queries the text from an rtweet dataframe, sequentially by row, using the get_perspective function. A Perspective API key is required. In addition, the number of queries per second can be set if a higher rate has been negotiated with the Perspective team.

check_shadowban is a function to check whether an account has been temporarily suppressed from search or display results by Twitter.

check_shadowban_list allows a list of Twitter handles to be passed to check_shadowban.

rtweetxtras's People

Contributors

arf9999

rtweetxtras's Issues

snscrape_search problem

Hi,
I tried to use snscrape_search() but I get an error.

I use the following code:

snscrape_search(
  "BTC",
  since_date = '2022-01-01',
  until_date = '2022-01-02',
  n = 100,
  file = "temp",
  token = NULL,
  delete_tempfile = TRUE
)

The error is the following:

Error: 'temp1655840672.35401.txt' does not exist in current working directory ('C:/Users/bahlo/Documents')
Any help, please!

Error "Unknown or uninitialised column:"

I get errors when I try the examples listed in your README.

I have R 4.2.1 and rtweet 1.0.2, so I need rtweetXtras from your master branch and not the legacy branch.

When I check the rtweetXtras version, I get this:

Package: rtweetXtras
Title: Extra Analysis Functions for rtweet package
Version: 0.4.1.0000

I don't know whether it is the latest version (I installed it via remotes::install_github("arf9999/rtweetXtras")).

My error is "Unknown or uninitialised column:".
Here is a log of the errors:

> df <- rtweet::get_timeline("jack", n=100)
> rtweetXtras::hashtagcloud(df, n=200)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
    filter, lag
The following objects are masked from ‘package:base’:
    intersect, setdiff, setequal, union
Package version: 3.2.3
Unicode version: 13.0
ICU version: 69.1
Parallel computing: 8 of 8 threads used.
See https://quanteda.io for tutorials and examples.
Error in `dplyr::filter()`:
! Problem while computing `..1 = !is.na(rtweet_timeline_df$hashtags)`.
✖ Input `..1` must be of size 100 or 1, not size 0.
Run `rlang::last_error()` to see where the error occurred.
Warning message:
Unknown or uninitialised column: `hashtags`.
> df1 <- rtweet::search_tweets("rstats",lang = "en", n=2000)
> rtweetXtras::profilecloud(df1, 100)
Error in `dplyr::filter()`:
! Problem while computing `..1 = !is.na(rtweet_timeline_df$description)`.
✖ Input `..1` must be of size 1977 or 1, not size 0.
Run `rlang::last_error()` to see where the error occurred.
Warning message:
Unknown or uninitialised column: `description`.
> colnames(df)
 [1] "created_at"                    "id"                            "id_str"                        "full_text"                     "truncated"                     "display_text_range"            "entities"                      "source"                        "in_reply_to_status_id"        
[10] "in_reply_to_status_id_str"     "in_reply_to_user_id"           "in_reply_to_user_id_str"       "in_reply_to_screen_name"       "geo"                           "coordinates"                   "place"                         "contributors"                  "retweeted_status"             
[19] "is_quote_status"               "retweet_count"                 "favorite_count"                "favorited"                     "retweeted"                     "lang"                          "possibly_sensitive"            "quoted_status_id"              "quoted_status_id_str"         
[28] "quoted_status_permalink"       "quoted_status"                 "text"                          "favorited_by"                  "scopes"                        "display_text_width"            "quote_count"                   "timestamp_ms"                  "reply_count"                  
[37] "filter_level"                  "metadata"                      "query"                         "withheld_scope"                "withheld_copyright"            "withheld_in_countries"         "possibly_sensitive_appealable"
> colnames(df1)
 [1] "created_at"                    "id"                            "id_str"                        "full_text"                     "truncated"                     "display_text_range"            "entities"                      "metadata"                      "source"                       
[10] "in_reply_to_status_id"         "in_reply_to_status_id_str"     "in_reply_to_user_id"           "in_reply_to_user_id_str"       "in_reply_to_screen_name"       "geo"                           "coordinates"                   "place"                         "contributors"                 
[19] "is_quote_status"               "retweet_count"                 "favorite_count"                "favorited"                     "retweeted"                     "lang"                          "possibly_sensitive"            "retweeted_status"              "quoted_status_id"             
[28] "quoted_status_id_str"          "quoted_status"                 "text"                          "favorited_by"                  "scopes"                        "display_text_width"            "quoted_status_permalink"       "quote_count"                   "timestamp_ms"                 
[37] "reply_count"                   "filter_level"                  "query"                         "withheld_scope"                "withheld_copyright"            "withheld_in_countries"         "possibly_sensitive_appealable"

Regards

cannot install

Hi,

I am getting the following error (in RStudio):

  • installing source package ‘rtweetXtras’ ...
    ** R
    ** byte-compile and prepare package for lazy loading
    Error in assign(x, get(x, envir = backports), envir = pkg) :
    cannot add bindings to a locked environment
    Error : unable to load R code in package ‘rtweetXtras’
    ERROR: lazy loading failed for package ‘rtweetXtras’

Thank you

Cannot find scatter_ts_interactive

Hi again,

I really like your package, so I tried your update, but it does not work for me.
When I run this code: rtweetXtras::scatter_ts_interactive(mickey, title = "100 tweets about Mickey Mouse")
I get this error: Error: 'scatter_ts_interactive' is not an object exported from 'namespace:rtweetXtras'

When I run this one: scatter_ts_interactive(mickey, title = "", print_immediately = TRUE), R does not find the function "scatter_ts_interactive".

Sorry for all the reports, but I will be showing your package and your beautiful data viz to my students!

Error with follower_dot_plot

Hi, I just got this error:
follower_dot_plot(rsr_followers, point_colour = "earliest_follow")

Scale for 'colour' is already present. Adding another
scale for 'colour', which will replace the existing
scale.

Error in as.Date.numeric(value) : 'origin' must be supplied

Here is the traceback:

30. stop("'origin' must be supplied")
29. as.Date.numeric(value)
28. as.Date(value)
27. [<-.Date(*tmp*, finite & x < range[1], value = NA_real_)
26. [<-(*tmp*, finite & x < range[1], value = NA_real_)
25. f(...)
24. self$oob(x, range = limits)
23. f(...)
22. self$rescaler(x, from = range)
21. f(..., self = self)
20. self$rescale(self$oob(x, range = limits), limits)
19. f(..., self = self)
18. self$map(df[[j]])
17. FUN(X[[i]], ...)
16. lapply(aesthetics, function(j) self$map(df[[j]]))
15. f(..., self = self)
14. scale$map_df(df = df)
13. FUN(X[[i]], ...)
12. lapply(scales$scales, function(scale) scale$map_df(df = df))
11. unlist(lapply(scales$scales, function(scale) scale$map_df(df = df)), recursive = FALSE)
10. FUN(X[[i]], ...)
9. lapply(data, scales_map_df, scales = npscales)
8. ggplot_build.ggplot(x)
7. ggplot_build(x)
6. print.ggplot(plot)
5. print(plot)
4. print(plot)
3. withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
2. suppressWarnings(print(plot))
1. follower_dot_plot(rsr_followers, point_colour = "earliest_follow")
