This project forked from cran-task-views/webtechnologies

CRAN Task View for interacting with data on the web via web services, and parsing data from the web

Home Page: http://cran.r-project.org/web/views/WebTechnologies.html

CRAN Task View: Web Technologies and Services


Maintainer: Scott Chamberlain, Thomas Leeper, Patrick Mair, Karthik Ram, Christopher Gandrud
Contact: scott at ropensci.org
Version: 2014-09-17


This task view contains information about using R to obtain and parse data from the web. The base version of R does not ship with many tools for interacting with the web. Thankfully, an increasingly large number of contributed packages fill that gap. A list of available packages and functions is presented below, grouped by the type of activity. If you have any comments or suggestions for additions or improvements to this task view, go to GitHub and submit an issue, or make some changes and submit a pull request. If you can't contribute on GitHub, send Scott an email. If you have an issue with one of the packages discussed below, please contact the maintainer of that package.

Tools for Working with the Web from R

Parsing Data from the Web

  • txt, csv, etc.: you can use read.csv() after acquiring the csv file from the web via e.g., getURL() from RCurl. read.csv() works with http but not https, i.e.: read.csv("http://..."), but not read.csv("https://...").
  • The repmis package contains a source_data() command to load and cache plain-text data from a URL (either http or https). It also includes source_Dropbox() for downloading/caching plain-text data from non-public Dropbox folders and source_XlsxData() for downloading/caching Excel xlsx sheets.
  • The package XML contains functions for parsing XML and HTML, and supports xpath for searching XML (think regex for strings). A helpful function to read data from one or more HTML tables is readHTMLTable().
  • XML2R: The XML2R package is a collection of convenient functions for coercing XML into data frames. The development version is on GitHub here.
  • An alternative to XML is selectr, which parses CSS3 Selectors and translates them to XPath 1.0 expressions. The XML package is often used for parsing xml and html, but because selectr translates CSS selectors to XPath, you can use CSS selectors instead of XPath. The selectorgadget browser extension can be used to identify page elements.
  • The rjson package converts R objects into JavaScript Object Notation (JSON) objects and vice versa.
  • An alternative to rjson is RJSONIO, which also converts to and from data in JSON format (it is fast for parsing).
  • An alternative to rjson and RJSONIO is jsonlite, a fork of RJSONIO. It includes the parser from RJSONIO, but implements a different mapping between R objects and JSON strings.
  • Custom formats: Some web APIs provide custom data formats which are usually modified xml or json, and handled by XML and rjson or RJSONIO, respectively.
  • The RHTMLForms package reads HTML documents and returns a description of each form they contain, along with its elements and hidden fields.
  • scrapeR provides additional tools for scraping data from HTML and XML documents.
  • The tldextract package extracts top level domains and subdomains from a host name. It's a port of a Python library of the same name.
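
As a rough sketch of the first bullet: once the raw text of a CSV file has been downloaded (for example with getURL() from RCurl), base R's read.csv() can parse it through its text argument, which keeps the download step and the parsing step separate. The sample data and the column names station and temp_c are made up for illustration, and the JSON half assumes jsonlite is installed.

```r
# Stand-in for the character string a call such as RCurl::getURL() would return.
# The contents are invented for this example.
csv_text <- "station,temp_c\nA,12.5\nB,9.1\n"

# read.csv() accepts already-downloaded text through its `text` argument.
stations <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# The same idea for JSON, only attempted if jsonlite is available.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_text <- '[{"station":"A","temp_c":12.5},{"station":"B","temp_c":9.1}]'
  # An array of records is simplified to a data frame by default.
  stations_json <- jsonlite::fromJSON(json_text)
}
```

Parsing from a string rather than directly from a URL also makes it easy to cache the downloaded text or to inspect it when parsing fails.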

Curl, HTTP, FTP, HTML, XML, SOAP

  • RCurl: A low level curl wrapper that allows one to compose general HTTP requests and provides convenient functions to fetch URIs, get/post forms, etc. and process the results returned by the Web server. This provides a great deal of control over the HTTP/FTP connection and the form of the request while providing a higher-level interface than is available just using R socket connections. It also provides tools for Web authentication.
  • httr: A light wrapper around RCurl that makes many things easier, but still allows you to access the lower level functionality of RCurl. It has convenient http verbs: GET(), POST(), PUT(), DELETE(), PATCH(), HEAD(), BROWSE(). These wrapper functions are more convenient to use, though less configurable, than their counterparts in RCurl. The equivalent of httr's GET() in RCurl is getForm(). Likewise, the equivalent of httr's POST() in RCurl is postForm(). http status codes are helpful for debugging http calls, and httr makes them easy to use: for example, stop_for_status() gets the http status code from a response object and stops the function if the call was not successful. See also warn_for_status(). Note that you can pass additional curl options to the config parameter in http calls.
  • The XMLRPC package provides an implementation of XML-RPC, a relatively simple remote procedure call mechanism that uses HTTP and XML. This can be used for communicating between processes on a single machine or for accessing Web services from within R.
  • The XMLSchema package provides facilities in R for reading XML schema documents and processing them to create definitions for R classes and functions for converting XML nodes to instances of those classes. It provides the framework for meta-computing with XML schema in R.
  • RTidyHTML interfaces to the libtidy library for correcting HTML documents that are not well-formed. This library corrects common errors in HTML documents.
  • W3CMarkupValidator provides an R Interface to W3C Markup Validation Services for validating HTML documents.
  • SSOAP provides a client-side SOAP (Simple Object Access Protocol) mechanism. It aims to provide a high-level interface to invoke SOAP methods provided by a SOAP server.
  • Rcompression: Interface to zlib and bzip2 libraries for performing in-memory compression and decompression in R. This is useful when receiving or sending contents to remote servers, e.g. Web services, HTTP requests via RCurl.
  • The CGIwithR package allows one to use R scripts as CGI programs for generating dynamic Web content. HTML forms and other mechanisms to submit dynamic requests can be used to provide input to R scripts via the Web to create content that is determined within that R script.
  • httpRequest: HTTP Request protocols. Implements the GET, POST and multipart POST request.
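
A minimal sketch of the httr workflow described in the list above, assuming httr is installed. The helper name fetch_text and the 10-second default timeout are illustrative choices, not part of httr.

```r
# Hypothetical helper: fetch a URL and return the response body as text,
# stopping with an R error if the request was not successful.
fetch_text <- function(url, seconds = 10) {
  # Extra curl options are passed as config objects; timeout() is one of them.
  resp <- httr::GET(url, httr::timeout(seconds))
  # Converts an unsuccessful http status into an R error.
  httr::stop_for_status(resp)
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Example call (needs network access, so it is left commented out):
# page <- fetch_text("http://cran.r-project.org/web/views/WebTechnologies.html")
```

Swapping stop_for_status() for warn_for_status() would let the function carry on after a failed request and only issue a warning.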

Authentication

  • Using web resources can require authentication, either via API keys, OAuth, username:password combination, or via other means. Additionally, some web resources require that the authentication details be sent in the header of an http call, which takes a little bit of extra work. API keys and username:password combos can be combined within a url for a call to a web resource (api key: http://api.foo.org/?key=yourkey; user/pass: http://username:password@api.foo.org), or can be specified via commands in RCurl or httr. OAuth is the most complicated authentication process, and can be most easily done using httr. See the 6 demos within httr, three for OAuth 1.0 (linkedin, twitter, vimeo) and three for OAuth 2.0 (facebook, GitHub, google). ROAuth is a package that provides a separate R interface to OAuth. OAuth is easier to do in httr, so start there.
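
The two simplest schemes can be sketched as follows. The host api.foo.org and the key parameter come from the example URLs above; the function names api_key_url, get_with_key, and get_with_password are hypothetical, and the last two assume httr is installed.

```r
# Put an API key in the query string, e.g. http://api.foo.org/?key=yourkey.
# URLencode() escapes characters that are not safe inside a URL.
api_key_url <- function(base_url, key) {
  paste0(base_url, "?key=", utils::URLencode(key, reserved = TRUE))
}

api_key_url("http://api.foo.org/", "yourkey")

# The same two schemes via httr. These need network access,
# so they are defined here but not called.
get_with_key <- function(url, key) {
  # httr appends the named list to the URL as a query string.
  httr::GET(url, query = list(key = key))
}

get_with_password <- function(url, user, password) {
  # authenticate() sends HTTP basic auth credentials by default.
  httr::GET(url, httr::authenticate(user, password))
}
```

Passing credentials as arguments rather than pasting them into the URL by hand keeps them out of hard-coded strings and lets the library handle the escaping.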

Web Frameworks

  • The shiny package makes it easy to build interactive web applications with R.
  • The Rook web server interface contains the specification and convenience software for building and running Rook applications.
  • The opencpu framework for embedded statistical computation and reproducible research exposes a web API interfacing R, LaTeX and Pandoc. This API is used for example to integrate statistical functionality into systems, share and execute scripts or reports on centralized servers, and build R based apps.
  • A package by Yihui Xie called servr provides a simple HTTP server to serve files under a given directory based on the httpuv package.
  • The httpuv package, made by Joe Cheng at RStudio, provides low-level socket and protocol support for handling HTTP and WebSocket requests directly within R. A related package, which httpuv perhaps replaces, is websockets, also made by Joe Cheng.
  • websockets (not on CRAN): A simple HTML5 websocket interface for R, by Joe Cheng.
  • Plot.ly is a company that allows you to create visualizations in the web using R (and Python). They have an R package in development here, as well as access to their services via a REST API.
  • The WADL package provides tools to process Web Application Description Language (WADL) documents and to programmatically generate R functions to interface to the REST methods described in those WADL documents.
  • The RDCOMServer provides a mechanism to export R objects as (D)COM objects in Windows. It can be used along with the RDCOMClient package which provides user-level access from R to other COM servers.
  • The RSelenium package (development version on GitHub here) provides a set of R bindings for the Selenium 2.0 webdriver using the JsonWireProtocol. Selenium automates browsers. Using RSelenium you can automate browsers locally or remotely. This can aid in automated application testing, load testing and web scraping. Examples are given interacting with popular projects such as shiny and sauceLabs.
  • rapporter.net provides an online environment (SaaS) to host and run rapport statistical report templates in the cloud.
  • neocities wraps the API for the Neocities web hosting service.
  • The Tiki Wiki CMS/Groupware framework has an R plugin (PluginR) to run R code from wiki pages, and use data from their own collected web databases (trackers). A demo: http://r.tiki.org. More info in a useR!2013 presentation.
  • MediaWiki has an extension (Extension:R) to run R code from wiki pages, and use uploaded data. Links to demo pages (in German) can be found at the category page for R scripts at MM-Stat. A mailing list is available: R-sig-mediawiki.
  • whisker: Implementation of logicless templating based on Mustache in R. Mustache syntax is described in http://mustache.github.io/mustache.5.html

JavaScript

  • ggvis makes it easy to describe interactive web graphics in R. It fuses the ideas of ggplot2 and shiny, rendering graphics on the web with Vega.
  • rCharts (not on CRAN) allows for interactive Javascript charts from R.
  • rVega (not on CRAN) is an R wrapper for Vega.
  • clickme (not on CRAN) is an R package to create interactive plots.
  • animint (not on CRAN) allows an interactive animation to be defined using a list of ggplots with clickSelects and showSelected aesthetics, then exported to CSV/JSON/D3/JavaScript for viewing in a web browser.
  • The SpiderMonkey package provides a means of evaluating JavaScript code, creating JavaScript objects and calling JavaScript functions and methods from within R. This can work by embedding the JavaScript engine within an R session or by embedding R in a browser such as Firefox and being able to call R from JavaScript and call back to JavaScript from R.
  • d3Network: Tools for creating D3 JavaScript network, tree, dendrogram, and Sankey graphs from R.

Data Sources on the Web Accessible via R

Agriculture | Amazon web services | Chemistry | Data depots | Earth Science | Ecology/Evolution | Economics/Business | E-commerce | Finance | Genes/Genomes | Google web services | Government | Literature/Text-mining | Machine learning | Maps | Marketing | Media: Images/video/etc. | News | Other | Public Health | Social media | Sports | Web analytics

# Agriculture

  • FAOSTAT: The package hosts a list of functions to download, manipulate, construct and aggregate agricultural statistics provided by the FAOSTAT (Food and Agricultural Organization of the United Nations) database.
  • cimis: R package for retrieving data from CIMIS, the California Irrigation Management Information System. Available in CRAN archives only.

# Amazon Web Services

  • AWS.tools: An R package to interact with Amazon Web Services (EC2/S3).
  • RAmazonS3 package provides the basic infrastructure within R for communicating with the S3 Amazon storage server. This is a commercial server that allows one to store content and retrieve it from any machine connected to the Internet.
  • RAmazonDBREST provides an interface to Amazon's Simple DB API.
  • MTurkR: Access to Amazon Mechanical Turk Requester API via R. Development version on GitHub here.

# E-commerce

# Chemistry

  • rpubchem: Interface to the PubChem Collection.

# Data Depots

  • dvn: Provides access to The Dataverse Network API.
  • rfigshare: Programmatic interface for Figshare.
  • factualR: Thin wrapper for the Factual.com server API.
  • dataone: Read/write access to data and metadata from the DataONE network of Member Node data repositories.
  • yhatr: Lets you deploy, maintain, and invoke models via the Yhat REST API.
  • RSocrata: Provided with a Socrata dataset resource URL, or a Socrata SoDA web API query, returns an R data frame. Converts dates to POSIX format. Supports CSV and JSON. Manages throttling by Socrata.
  • Quandl: A package that interacts directly with the Quandl API to offer data in a number of formats usable in R, as well as the ability to upload and search.
  • rdatamarket: Fetches data from DataMarket.com, either as timeseries in zoo form (dmseries) or as long-form data frames (dmlist).
  • infochimps: An R wrapper for the infochimps.com API services, from Drew Conway. The CRAN version is archived. Development is available on GitHub here.

# Earth Science

  • RNCEP: Obtain, organize, and visualize NCEP weather data.
  • crn: Provides the core functions required to download and format data from the Climate Reference Network. Both daily and hourly data are downloaded from the ftp, a consolidated file of all stations is created, and station metadata is extracted. In addition, functions for selecting individual variables and creating R friendly datasets for them are provided.
  • BerkeleyEarth: Data input for Berkeley Earth Surface Temperature. Archived on CRAN.
  • waterData: An R Package for retrieval, analysis, and anomaly calculation of daily hydrologic time series data.
  • CHCN: A compilation of historical through contemporary climate measurements scraped from the Environment Canada website, including tools for scraping data, creating metadata and formatting temperature files.
  • decctools: Provides functions for retrieving energy statistics from the United Kingdom Department of Energy and Climate Change and related data sources. The current version focuses on total final energy consumption statistics at the local authority, MSOA, and LSOA geographies. Methods for calculating the generation mix of grid electricity and its associated carbon intensity are also provided.
  • Metadata: Collates metadata for climate surface stations. Archived on CRAN.
  • sos4R: A client for Sensor Observation Services (SOS) as specified by the Open Geospatial Consortium (OGC). It allows users to retrieve metadata from SOS web services and to interactively create requests for near real-time observation data based on the available sensors, phenomena, observations, etc. using thematic, temporal and spatial filtering.
  • raincpc: The Climate Prediction Center's (CPC) daily rainfall data for the entire world, from 1979 to the present, at a resolution of 50 km (0.5 degrees lat-lon). This package provides functionality to download and process the raw data from CPC.
  • weatherData: Functions that help in fetching weather data from websites. Given a location and a date range, these functions help fetch weather data (temperature, pressure etc.) for any weather related analysis.
  • soilDB: A collection of functions for reading data from USDA-NCSS soil databases.
  • rnoaa: R interface to NOAA Climate data API.
  • GhcnDaily: A package that downloads and processes Global Historical Climatology Network (GHCN) daily data from the National Climatic Data Center (NCDC).
  • okmesonet: Retrieves Oklahoma (USA) Mesonet climatological data provided by the Oklahoma Climatological Survey.
  • rainfreq: Estimates of rainfall at desired frequency and desired duration are often required in the design of dams and other hydraulic structures, catastrophe risk modeling, environmental planning and management. One major source of such estimates for the USA is the NOAA National Weather Service's (NWS) division of Hydrometeorological Design Studies Center (HDSC). Raw data from NWS-HDSC is available at 1-km resolution and comes as a huge number of GIS files.
  • rnrfa: Utility functions to retrieve data from the UK National River Flow Archive via an API (http://www.ceh.ac.uk/data/nrfa/). There are functions to retrieve stations falling in a bounding box, to generate a map and extracting time series and general information.

# Ecological and Evolutionary Biology

  • rvertnet: A wrapper to the VertNet collections database API.
  • rgbif: Interface to the Global Biodiversity Information Facility API methods.
  • rfishbase: A programmatic interface to fishbase.org.
  • treebase: An R package for discovery, access and manipulation of online phylogenies.
  • taxize: Taxonomic information from around the web.
  • dismo: Species distribution modeling, with wrappers to some APIs.
  • rWBclimate: R interface for the World Bank climate data.
  • rbison: Wrapper to the USGS Bison API.
  • neotoma (not on CRAN): Programmatic R interface to the Neotoma Paleoecological Database.
  • rnpn (not on CRAN): Wrapper to the National Phenology Network database API.
  • rfisheries: Package for interacting with fisheries databases at openfisheries.org.
  • rebird: A programmatic interface to the eBird database.
  • flora: Retrieve taxonomical information of botanical names from the Flora do Brasil website.
  • Rcolombos: This package provides programmatic access to Colombos, a web based interface for exploring and analyzing comprehensive organism-specific cross-platform expression compendia of bacterial organisms.
  • Reol: An R interface to the Encyclopedia of Life (EOL) API. Includes functions for downloading and extracting information off the EOL pages.
  • rPlant: An R interface to the many computational resources iPlant offers through their RESTful application programming interface. Currently, rPlant functions interact with the iPlant foundational API, the Taxonomic Name Resolution Service API, and the Phylotastic Taxosaurus API. Before using rPlant, users will have to register with the iPlant Collaborative.
  • ecoengine: ecoengine (http://ecoengine.berkeley.edu/) provides access to more than 2 million georeferenced specimen records from the Berkeley Natural History Museums. http://bnhm.berkeley.edu/
  • spocc: A programmatic interface to many species occurrence data sources, including GBIF, USGS's BISON, iNaturalist, Berkeley Ecoinformatics Engine, eBird, AntWeb, and more as those sources become easily available.
  • paleobioDB: Functions to wrap each endpoint of the PaleobioDB API, plus functions to visualize and process the fossil data. The API documentation for the Paleobiology Database can be found at http://paleobiodb.org/data1.1/.
  • rnbn: An R interface to the UK National Biodiversity Network. Development version on GitHub here.
  • rYoutheria: A programmatic interface to web-services of Youtheria, an online database of mammalian trait data. Development version on GitHub here
  • The tpl package, created by Gustavo Carvalho, doesn't interact with the web directly, but queries locally stored data from theplantlist.org, and data will be updated when theplantlist updates, which is not very often. There is another package for interacting with this same data, called Taxonstand.
  • TR8: TR8 contains a set of tools which take care of retrieving trait data for plant species from publicly available databases via web services (including: Biolflor, The Ecological Flora of the British Isles, LEDA traitbase, Ellenberg values for Italian Flora, Mycorrhizal intensity database).

# Economics and Business

  • WDI: Search, extract and format data from the World Bank's World Development Indicators.
  • The Zillow package provides an R interface to the Zillow Web Service API. It allows one to get the Zillow estimate for the price of a particular property specified by street address and ZIP code (or city and state), to find information (e.g. size of property and lot, number of bedrooms and bathrooms, year built.) about a given property, and to get comparable properties.
  • sweSCB: Interface for the REST API of Statistics Sweden. Fetch information on data hierarchy stored behind the API; extract metadata; fetch actual data; and clean up results.
  • psidR: Contains functions to download and format longitudinal datasets from the Panel Study of Income Dynamics (PSID).
  • ONETr searches and retrieves occupational data from O*NET Online. Development version on GitHub here.

# Finance

  • RDatastream (not on CRAN): An R interface to the Thomson Dataworks Enterprise SOAP API (paid), with some convenience functions for retrieving Datastream data specifically.
  • Datastream2R (not on CRAN): Another package for accessing the Datastream service. This package downloads data from the Thomson Reuters DataStream DWE server, which provides XML access to the Datastream database of economic and financial information.
  • quantmod: Functions for financial quantitative modelling as well as data acquisition, plotting and other utilities.
  • TFX: Connects to TrueFX(tm) for free streaming real-time and historical tick-by-tick market data for dealable interbank foreign exchange rates with millisecond detail.
  • fImport: Environment for teaching "Financial Engineering and Computational Finance"
  • Rbitcoin: Interact with Bitcoin. Both public and private API calls. Supports HTTP over SSL. Debug messages of Rbitcoin, debug messages of RCurl, error handling.
  • RCryptsy: Wraps the API for the Cryptsy crypto-currency trading platform. Development version on GitHub here. The package was archived on 2014-08-07 because it "no longer works with pubapi.cryptsy.com", according to the CRAN overlords.
  • Thinknum: Interacts with the Thinknum API.
  • pdfetch: A package for downloading economic and financial time series from public sources.
  • tseries: Includes the get.hist.quote for historical financial data.
  • rbitcoinchartsapi: An R package for the BitCoinCharts.com API. From their website: "Bitcoincharts provides financial and technical data related to the Bitcoin network and this data can be accessed via a JSON application programming interface (API)."
  • ustyc: US Treasury yield curve data retrieval. Development version on GitHub here.

# Genes and Genomes

  • cgdsr: R-Based API for accessing the MSKCC Cancer Genomics Data Server (CGDS).
  • rsnps: This package is a programmatic interface to various SNP datasets on the web: openSNP, NCBI's dbSNP database, and Broad Institute SNP Annotation and Proxy Search. This package started as a library to interact with openSNP alone, so most functions deal with openSNP.
  • rentrez: Talk with NCBI entrez using R.
  • seqinr: Exploratory data analysis and data visualization for biological sequence (DNA and protein) data.
  • seq2R: Detect compositional changes in genomic sequences - with some interaction with GenBank. Archived on CRAN.
  • primerTree: Visually Assessing the Specificity and Informativeness of Primer Pairs.
  • hoardeR: Information retrieval from NCBI databases, with main focus on Blast.
  • RISmed: Download content from NCBI databases. Intended for analyses of NCBI database content, not reference management. See rpubmed for more literature oriented stuff from NCBI.
  • The mygene.r package is an R client for accessing Mygene.info annotation and query services.

# Google Web Services

  • RGoogleStorage provides programmatic access to the Google Storage API. This allows R users to access and store data on Google's storage. We can upload and download content, create, list and delete folders/buckets, and set access control permissions on objects and buckets.
  • The RGoogleDocs package is an example of using the RCurl and XML packages to quickly develop an interface to the Google Documents API.
  • translate: Bindings for the Google Translate API v2
  • translateR provides bindings for both Google and Microsoft translation APIs.
  • googlePublicData: An R library to build Google's public data explorer DSPL metadata files.
  • googleVis: Interface between R and the Google chart tools.
  • gooJSON: A Google JSON data interpreter for R which contains a suite of helper functions for obtaining data from the Google Maps API JSON objects.
  • plotGoogleMaps: Plot SP or SPT(STDIF,STFDF) data as HTML map mashup over Google Maps.
  • plotKML: Visualization of spatial and spatio-temporal objects in Google Earth.
  • bigrquery (not on CRAN): An interface to Google's bigquery from R.
  • GFusionTables (not on CRAN): An R interface to Google Fusion Tables. Google Fusion Tables is a data management system in the cloud. This package provides R functions to browse the Fusion Tables catalog, retrieve data from Fusion Tables dtd storage to R, and upload data from R to Fusion Tables.
  • RGoogleAnalytics: Provides functions for accessing and retrieving data from the Google Analytics API. Available on GitHub. There is another R package for the same service (RGA); see next entry.
  • RGA: Provides functions for accessing and retrieving data from the Google Analytics APIs. Supports OAuth 2.0 authorization. Also, the RGA package provides a shiny app to explore data. There is another R package for the same service (RGoogleAnalytics); see above entry.
  • RGoogleTrends provides programmatic access to Google Trends data. This is information about the popularity of a particular query.

# Government

  • acs: Download, manipulate, and present data from the US Census American Community Survey.
  • BerlinData: Easy access to http://daten.berlin.de. It allows you to search through the data catalogue and to download the data directly from within R. Development version on GitHub here.
  • dkstat (not on CRAN): A package to access the StatBank API from Statistics Denmark.
  • EIAdata: U.S. Energy Information Administration (EIA) API client.
  • federalregister: Client package for the U.S. Federal Register API. Development version on GitHub here.
  • govStatJPN: Functions to get public survey data in Japan.
  • pollstR: An R client for the Huffpost Pollster API. Development version on GitHub here.
  • pvsR: An R package to interact with the Project Vote Smart API for scientific research.
  • recalls: Access U.S. Federal Government Recall Data. Development version on GitHub here.
  • RPublica: ProPublica API Client. Development version on GitHub here.
  • rsunlight (not on CRAN): R client for the Sunlight Labs APIs. There are functions for Sunlight Labs Congress, Transparency, Open States, Real Time Congress, Capitol Words, and Influence Explorer APIs. Data outputs are R lists. There are also a few convenience functions for visualizing data and writing data to .csv.
  • rtimes (not on CRAN): R client for the New York Times APIs, including the Congress, Article Search, Campaign Finance, and Geographic APIs. The focus is on those that deal with political data, but throwing in Article Search and Geographic for good measure.
  • sorvi: Various tools for retrieving and working with Finnish open government data. Development version on GitHub here.
  • wethepeople: An R client for interacting with the White House's "We The People" petition API.
  • polidata: Access to various political data APIs, including e.g. Google Civic Information API or Sunlight Congress API for US Congress data, and POPONG API for South Korea National Assembly data. Available on GitHub.

# Literature, Metadata, Text, and Altmetrics

  • rplos: A programmatic interface to the Web Service methods provided by the Public Library of Science journals for search.
  • rbhl: R interface to the Biodiversity Heritage Library (BHL) API.
  • rmetadata (not on CRAN): Get scholarly metadata from around the web.
  • RMendeley: Implementation of the Mendeley API in R. Archived on CRAN.
  • rentrez: Talk with NCBI entrez using R.
  • rorcid (not on CRAN): A programmatic interface to the Orcid.org API.
  • rpubmed (not on CRAN): Tools for extracting and processing Pubmed and Pubmed Central records.
  • rAltmetric: Query and visualize metrics from Altmetric.com.
  • alm: R wrapper to the almetrics API platform developed by PLoS.
  • ngramr: Retrieve and plot word frequencies through time from the Google Ngram Viewer.
  • scholar provides functions to extract citation data from Google Scholar. Convenience functions are also provided for comparing multiple scholars and predicting future h-index values.
  • The Sxslt package is an R interface to Dan Veillard's libxslt translator. It allows R programmers to use XSLT directly from within R, and also allows XSL code to make use of R functions.
  • The Aspell package provides an interface to the aspell library for checking the spelling of words and documents.
  • OAIHarvester: Harvest metadata using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
  • RefManageR: Import and Manage BibTeX and BibLaTeX references with RefManageR.
  • pubmed.mineR: An R package for text mining of PubMed Abstracts. Supports fetching text and XML from PubMed.
  • tm.plugin.webmining: Extensible text retrieval framework for news feeds in XML (RSS, ATOM) and JSON formats. Currently, the following feeds are implemented: Google Blog Search, Google Finance, Google News, NYTimes Article Search, Reuters News Feed, Yahoo Finance and Yahoo Inplay.
  • boilerpipeR: Generic Extraction of main text content from HTML files; removal of ads, sidebars and headers using the boilerpipe Java library.
  • WikipediR: WikipediR is a wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia. Available on GitHub.

# Machine Learning as a Service

  • bigml: BigML, a machine learning web service.
  • MTurkR: Access to Amazon Mechanical Turk Requester API via R.

# Maps

  • RgoogleMaps: This package serves two purposes: It provides a comfortable R interface to query the Google server for static maps, and use the map as a background image to overlay plots within R.
  • The R2GoogleMaps package, which is different from RgoogleMaps, provides a mechanism to generate JavaScript code from R that displays data using Google Maps.
  • osmar: This package provides infrastructure to access OpenStreetMap data from different sources to work with the data in common R manner and to convert data into available infrastructure provided by existing R packages (e.g., into sp and igraph objects).
  • ggmap: Allows for the easy visualization of spatial data and models on top of Google Maps, OpenStreetMaps, Stamen Maps, or CloudMade Maps using ggplot2.
  • The GeoIP package maps IP addresses and host names to geographic locations - latitude, longitude, region, city, zip code, etc.
  • RKML is an implementation that provides users with high-level facilities to generate KML, the Keyhole Markup Language, for display in, e.g., Google Earth.
  • RKMLDevice creates R graphics in KML format so that they can be displayed on Google Earth (or Google Maps).
  • leafletR: Allows you to display your spatial data on interactive web-maps using the open-source JavaScript library Leaflet.

# Marketing

  • anametrix: Bidirectional connector to Anametrix API.

# Media: Images, Graphics, Videos, Music

  • colourlovers: Extracts colors and multi-color patterns from COLOURlovers, for use in creating R graphics color palettes. Development version on GitHub here.
  • imguR: A package to share plots using the image hosting service Imgur.com. The development version is on GitHub here. knitr also has a function imgur_upload() to load images from literate programming documents.
  • meme (not on CRAN): Provides the ability to create internet memes from template images using several online meme-generation services.
  • RLastFM: A package to interface to the last.fm API. Archived on CRAN.
  • rscribd (not on CRAN): API client for publishing documents to Scribd.
  • The RUbigraph package provides an R interface to a Ubigraph server for drawing interactive, dynamic graphs. You can add and remove vertices/nodes and edges in a graph and change their attributes/characteristics such as shape, color, size.

# News

  • GuardianR: Provides an interface to the Open Platform's Content API of the Guardian Media Group. It retrieves content from news outlets The Observer, The Guardian, and guardian.co.uk from 1999 to current day.
  • RNYTimes provides interfaces to several of the New York Times Web services for searching articles, meta-data, user-generated content and best seller lists.

# Other

  • sos4R: R client for the OGC Sensor Observation Service.
  • datamart: Provides an S4 infrastructure for unified handling of internal datasets and web based data sources. Examples include dbpedia, eurostat and sourceforge.
  • rDrop (not on CRAN): Dropbox interface.
  • zendeskR: This package provides an R wrapper for the Zendesk API.
  • AWS.tools: An R package to interact with Amazon Web Services (EC2/S3).
  • qualtrics (not on CRAN): Provides functions to interact with the Qualtrics online survey tool.
  • Rmonkey (not on CRAN): Provides programmatic access to Survey Monkey for creating simple surveys and retrieving survey results.
  • redcapAPI: Access data stored in REDCap databases using an API. REDCap (Research Electronic Data CAPture) is a web application for building and managing online surveys and databases developed at Vanderbilt University. Available on GitHub.
  • RForcecom: RForcecom provides a connection to Force.com and Salesforce.com from R.
  • mailR: Interface to Apache Commons Email to send emails from within R.
  • gmailr: Access the Gmail RESTful API from R.
  • RPushbullet: Provides an easy-to-use interface for the Pushbullet service, which provides fast and efficient notifications between computers, phones and tablets. By Dirk Eddelbuettel.
  • slackr: R client for the Slack.com messaging platform. Available on GitHub.

# Public Health

# Social media

  • streamR: This package provides a series of functions that allow R users to access Twitter's filter, sample, and user streams, and to parse the output into data frames. OAuth authentication is supported.
  • twitteR: Provides an interface to the Twitter web API.
  • The Rflickr package provides an R interface to the Flickr photo management and sharing application Web service.
  • Rfacebook: Provides an interface to the Facebook API.
  • plusser has been designed to facilitate the retrieval of Google+ profiles, pages and posts. It also provides search facilities. Currently a Google+ API key is required for accessing Google+ data.
  • SocialMediaMineR is an analytic tool that returns information about the popularity of a URL on social media sites.

# Sports

  • nhlscrapr: Compiling the NHL Real Time Scoring System Database for easy use in R.
  • pitchRx: Tools for Collecting and Visualizing Major League Baseball PITCHfx Data.
  • bbscrapeR (not on CRAN): Tools for Collecting Data from nba.com and wnba.com.
  • fbRanks: Association Football (Soccer) Ranking via Poisson Regression. Uses time-dependent Poisson regression and a record of goals scored in matches to rank teams via estimated attack and defense strengths.

# Web Analytics

  • rgauges: This package provides functions to interact with the Gaug.es API. Gaug.es is a web analytics service, like Google Analytics. You need a Gaug.es account to use this package.
  • RSiteCatalyst: Functions for accessing the Adobe Analytics (Omniture SiteCatalyst) Reporting API.
  • RGoogleAnalytics: Provides functions for accessing and retrieving data from the Google Analytics API. Available on GitHub. There is another R package for the same service (RGA); see next entry.
  • RGA: Provides functions for accessing and retrieving data from the Google Analytics APIs. Supports OAuth 2.0 authorization. Also, the RGA package provides a shiny app to explore data. There is another R package for the same service (RGoogleAnalytics); see above entry.
  • RGoogleTrends provides programmatic access to Google Trends data. This is information about the popularity of a particular query.

CRAN packages:

Related links:
