ropensci / taxize

A taxonomic toolbelt for R

Home Page: https://docs.ropensci.org/taxize

License: Other

Makefile 0.23% R 99.76% TeX 0.01%
taxonomy data api biodiversity biology rstats nomenclature darwincore taxize api-wrapper

taxize's Introduction

taxize

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

taxize allows users to search over many taxonomic data sources for species names (scientific and common) and download up and downstream taxonomic hierarchical information - among other things.

Package documentation: https://docs.ropensci.org/taxize/

Installation

Stable version from CRAN

install.packages("taxize")

Development version from GitHub

Windows users: install Rtools first.

install.packages("remotes")
remotes::install_github("ropensci/taxize")
library('taxize')

Contributing

See our CONTRIBUTING document.

Contributors

Collected via GitHub Issues: honors all contributors in alphabetical order. Code contributors are in bold.

afkoeppel - ahhurlbert - albnd - Alectoria - andzandz11 - anirvan - antagomir - arendsee - ArielGreiner - arw36 - ashenkin - ashiklom - benjaminschwetz - benmarwick - bienflorencia - binkySallly - bomeara - BridgettCollis - bw4sz - cboettig - cdeterman - ChrKoenig - chuckrp - clarson2191 - claudenozeres - cmzambranat - cparsania - daattali - DanielGMead - DarrenObbard - davharris - davidvilanova - diogoprov - dlebauer - dlenz1 - dougwyu - dschlaep - EDiLD - edwbaker - emhart - eregenyi - fdschneider - fgabriel1891 - fischhoff - fmichonneau - fozy81 - gedankenstuecke - gimoya - git-og - glaroc - gpli - gustavobio - hlapp - ibartomeus - Ironholds - jabard89 - jangorecki - jarioksa - jebyrnes - jeroen - jimmyodonnell - joelnitta - johnbaums - jonmcalder - jordancasey - josephwb - jsgosnell - JulietteLgls - jwilk - kamapu - karthik - katrinleinweber - KevCaz - kgturner - kmeverson - Koalha - ljvillanueva - maelle - Markus2015 - matutosi - mcsiple - MikkoVihtakari - millerjef - miriamgrace - mpnelsen - MUSEZOOLVERT - nate-d-olson - nmatzke - npch - ocstringham - p-neves - p-schaefer - padpadpadpad - paternogbc - patperu - pederengelstad - philippi - Phylloxera - pmarchand1 - pozsgaig - pssguy - raredd - rec3141 - Rekyt - RodgerG - rossmounce - sariya - sastoudt - scelmendorf - sckott - SimonGoring - snsheth - snubian - Squiercg - sunray1 - taddallas - tdjames1 - tmkurobe - toczydlowski - tpaulson1 - tpoisot - TrashBirdEcology - trvinh - vijaybarve - wcornwell - willpearse - wpetry - yhg926 - zachary-foster

Road map

Check out our milestones to see what we plan to get done for each version.

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for taxize in R doing citation(package = 'taxize')
  • Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

taxize's Issues

I'm working on a vignette

I tried the knitr-vignette approach (since we already have a markdown template), but could not get it to pass R CMD check.

I will rewrite it in LaTeX.

Bad request?

What's going on here?

> require(taxize)
> get_tsn("Chironomus riparius", "anymatch", by_ = "name")
Error: Bad Request
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] taxize_0.0.1      stringr_0.6.1     RPostgreSQL_0.3-2 DBI_0.2-5        

loaded via a namespace (and not attached):
[1] digest_0.5.2   httr_0.1.1     plyr_1.7.1     RCurl_1.91-1   ritis_0.0.1    RJSONIO_0.99-0 tools_2.15.1   XML_3.93-0    

However, this species is in ITIS under TSN 129313 (http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=129313)

Error installing package

On issuing the command

install_github('taxize_','ropensci')

The error is

Error in .install_package_code_files(".", instdir) :
files in 'C:/Users/Vijay/AppData/Local/Temp/RtmpygH3RG/R.INSTALL228835225118/taxize/R'
missing from 'Collate' field:
  gni_parse.R
ERROR: unable to collate and parse R files for package 'taxize'

Progress on being able to submit to CRAN - pkg dependencies

There are two packages that I have taxize depend on: ritis and rentrez. I just submitted ritis to CRAN, so hopefully that will be up soon. rentrez may be a bit further away. Perhaps we can use a different package that hits the NCBI database until rentrez is up? Or we could keep rentrez in, but then R CMD check will not pass because rentrez is not on CRAN yet. Thoughts?

get_tsn should not ask so often about species

Currently, when more than one TSN is found, get_tsn asks which one to take.
It should first check whether there is a direct match, and take that one if so.

SP.NAMES <- c("Clostridium", "Amanita", "Abies", "Lynx")
require(taxize)
tsn <- get_tsn(SP.NAMES)

This asks 3 times (always one). A simple lowercase match between the input and combinedname should solve this, and the user should be prompted only if there is no direct match.
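The direct-match idea could look roughly like the sketch below. This is not the package's code: `pick_direct_match()` is a hypothetical helper, and the candidate table (with `combinedname` and `tsn` columns and made-up IDs) only stands in for whatever the ITIS search returns.

```r
# Hypothetical helper: return the TSN whose name matches the query exactly
# (ignoring case and surrounding whitespace), or NA if there is no unique match.
pick_direct_match <- function(query, candidates) {
  hit <- tolower(trimws(candidates$combinedname)) == tolower(trimws(query))
  if (sum(hit) == 1) candidates$tsn[hit] else NA_character_
}

# Made-up candidate table, for illustration only (the IDs are not real TSNs)
candidates <- data.frame(
  combinedname = c("Lynx", "Lynx rufus", "Lynx lynx"),
  tsn = c("111", "222", "333"),
  stringsAsFactors = FALSE
)

pick_direct_match("lynx", candidates)  # unique direct match, no prompt needed
pick_direct_match("Lyn", candidates)   # NA, so fall back to prompting the user
```

Returning NA rather than prompting inside the helper keeps the matching step separate from the interactive step, so the prompt only runs when it is actually needed.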

ubio_namebank

ubio_namebank(searchName = "elephant", sci = 1, vern = 0)
## Error: object of type 'externalptr' is not subsettable

I think the error comes from content()...
See also the vignette in my 'clean' branch...

Problems with eol

Can't run the examples from eol_hierarchy:

eol_hierarchy('34345893')
Opening and ending tag mismatch: hr line 5 and body
Opening and ending tag mismatch: body line 3 and html
Premature end of data in tag html line 1
Error: 1: Opening and ending tag mismatch: hr line 5 and body
2: Opening and ending tag mismatch: body line 3 and html
3: Premature end of data in tag html line 1

Tropicos tests (prefix `tp_`) don't pass check - will try to fix this

on testthat::check() I get:

* checking tests ...
  Running 'test-all.R' [5s/69s]
 ERROR
Running the tests in 'tests/test-all.R' failed.
Last 13 lines of output:
  itis_taxrank : http://www.itis.gov/ITISWebService/services/ITISService/getTaxonomicRankNameFromTSN?tsn=202385
  ..
  iucn_summary : ..
  phylomatic_tree : http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=36616
  http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=19322
  http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=183327
  ...
  tp_acceptednames : 4
  tp_namedistributions : 5
  tp_namereferences : Error in getOption("tropicoskey", stop("need an API key for Tropicos")) : 
    need an API key for Tropicos
  Calls: test_package ... eval -> eval -> tp_namereferences -> paste -> getOption
  Execution halted
Error: Command failed (1)
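One way to keep the check from halting when no key is configured would be to guard the `tp_` tests. A minimal sketch, assuming the key is read from the `tropicoskey` option as the error message above suggests; `has_tropicos_key()` is a hypothetical helper and not part of taxize.

```r
# Hypothetical helper: TRUE only if a non-empty Tropicos key is set as an option.
has_tropicos_key <- function() {
  key <- getOption("tropicoskey")
  is.character(key) && length(key) == 1 && !is.na(key) && nzchar(key)
}

if (has_tropicos_key()) {
  message("Running Tropicos tests")  # the tp_* tests would be called here
} else {
  message("Skipping Tropicos tests: option 'tropicoskey' is not set")
}
```

With this guard the remaining tests still run on machines without a key, and the skip is visible in the test output.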

itis does not return anything

itis does not return anything, which breaks some functions (eg get_tsn).

itis("Ursus", "searchbyscientificname")

Will take a look at it...

README

The link to the taxize Tutorial is broken.

Also, would it be worth migrating the Tutorial to the GitHub wiki?

TODO: taxize, error checking and TBMap?

  1. Carl got errors with:

    tree <- get_phylomatic_tree(phyformat, 'TRUE', 'POST', 'new', 'FALSE')
    ((annona_cherimola:0.333333,annona_muricata:45.333336)annona:12.666667)euphyllophyte:1.000000;
    Warning message:
    In read.tree(text = z) : empty character string.
    This is a problem with the 'POST' format Phylomatic query, possibly just use GET format queries.

  2. Possibly interface with Rod Page's taxonomy database 'TBMap', which already queries many taxonomic sources, including ITIS. No response from Rod Page by email so far.

On check error with encoding for me

On check I get

  • checking R files for syntax errors ... WARNING
    Warning in Sys.setlocale("LC_CTYPE", "en_US.utf8") :
    OS reports request to set locale to "en_US.utf8" cannot be honored

I also get

  • checking examples ... WARNING
    checking a package with encoding 'UTF-8' in an ASCII locale

What do you run, Linux or Windows? I wonder if I should change something in my computer's settings or in R?

Windows check

I submitted to the Windows checker, and there is a note about doMC being suggested, but not available for testing. Do you think this is a problem? I don't think it is, but perhaps you do?

  • using log directory 'd:/RCompile/CRANguest/R-release/taxize.Rcheck'
  • using R version 2.15.2 (2012-10-26)
  • using platform: x86_64-w64-mingw32 (64-bit)
  • using session charset: ISO8859-1
  • checking for file 'taxize/DESCRIPTION' ... OK
  • this is package 'taxize' version '0.0.1'
  • checking CRAN incoming feasibility ... OK
    Maintainer: 'Scott Chamberlain [email protected]'
  • checking package namespace information ... OK
  • checking package dependencies ... NOTE
    Package suggested but not available for checking: 'doMC'
  • checking if this is a source package ... OK
  • checking if there is a namespace ... OK
  • checking whether package 'taxize' can be installed ... OK
  • checking installed package size ... OK
  • checking package directory ... OK
  • checking for portable file names ... OK
  • checking DESCRIPTION meta-information ... OK
  • checking top-level files ... OK
  • checking for left-over files ... OK
  • checking index information ... OK
  • checking package subdirectories ... OK
  • checking R files for non-ASCII characters ... OK
  • checking R files for syntax errors ... OK
  • loading checks for arch 'i386'
    ** checking whether the package can be loaded ... OK
    ** checking whether the package can be loaded with stated dependencies ... OK
    ** checking whether the package can be unloaded cleanly ... OK
    ** checking whether the namespace can be loaded with stated dependencies ... OK
    ** checking whether the namespace can be unloaded cleanly ... OK
    ** checking loading without being on the library search path ... OK
  • loading checks for arch 'x64'
    ** checking whether the package can be loaded ... OK
    ** checking whether the package can be loaded with stated dependencies ... OK
    ** checking whether the package can be unloaded cleanly ... OK
    ** checking whether the namespace can be loaded with stated dependencies ... OK
    ** checking whether the namespace can be unloaded cleanly ... OK
    ** checking loading without being on the library search path ... OK
  • checking for unstated dependencies in R code ... OK
  • checking S3 generic/method consistency ... OK
  • checking replacement functions ... OK
  • checking foreign function calls ... OK
  • checking R code for possible problems ... OK
  • checking Rd files ... OK
  • checking Rd metadata ... OK
  • checking Rd cross-references ... OK
  • checking for missing documentation entries ... OK
  • checking for code/documentation mismatches ... OK
  • checking Rd \usage sections ... OK
  • checking Rd contents ... OK
  • checking for unstated dependencies in examples ... OK
  • checking contents of 'data' directory ... OK
  • checking data for non-ASCII characters ... OK
  • checking data for ASCII and uncompressed saves ... OK
  • checking examples ...
    ** running examples for arch 'i386' ... OK
    ** running examples for arch 'x64' ... OK
  • checking for unstated dependencies in tests ... OK
  • checking tests ...
    ** running tests for arch 'i386' [75s] OK
    Running 'test-all.R' [75s]
    ** running tests for arch 'x64' [79s] OK
    Running 'test-all.R' [78s]
  • checking PDF version of manual ... OK
    NOTE: There was 1 note.
    See
    'd:/RCompile/CRANguest/R-release/taxize.Rcheck/00check.log'
    for details.

Add remaining tests back in to /inst/tests

tropicos tests are back in and appear to be working so far. @Edild can you check and see if the tropicos tests are working on your end? I submitted the package to Win Builder to see if it passes checks on windows.

get_tsn

Just an idea: shouldn't get_tsn() only return the tsn-numbers?
So no need for a) 'itistermscomname', 'itistermssciname', or 'tsnsvernacular', 'tsnfullhir', 'tsnhirdown',
and b)
the data.frame returned by the ritis functions (only the tsn col is necessary).

I don't know, there are two ways: Change all the ritis functions
or
keep them as they are and wrap their output into get_tsn. (as in my previously closed pull request?)
I think the second way would be less work? What do you think?

Problems with ITIS

I am facing some problems with the ITIS-API:

df <- read.table(header = TRUE, as.is = TRUE, text = '
          SVN              Taxon CountValue
1 WP220110711  Zaitzevia.parvula        484
2 WP220110711           Tvetenia        109
3 WP220110711        Tubificidae       1054
4 WP220110711            Sweltsa         11
5 WP220110711 Suwallia.pallidula         32
6 WP220110711     Stempellinella         11
')
df$Taxon <- gsub("\\.", " ", df$Taxon)

require(taxize)
#query itis
get_tsn(df$Taxon)

Retrieving data for species ' Zaitzevia parvula '
Error in function (type, msg, asError = TRUE)  : 
  Recv failure: Connection reset by peer

The same with the itis_* functions

llply(df$Taxon, function(x) itis(x, "searchbyscientificname"))
Error in function (type, msg, asError = TRUE)  : 
  Recv failure: Connection reset by peer

I tried a Sys.sleep() before every request, but that didn't fix the problem...

Any suggestions? I think the problem is on the server-side ?!
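If the resets are intermittent, retrying with an increasing pause may help more than a fixed Sys.sleep() before each request. A base R sketch under that assumption: `with_retry()` is a hypothetical helper, and `flaky()` only simulates a failing request rather than calling ITIS.

```r
# Hypothetical helper: call f() up to `tries` times, doubling the pause
# between attempts, and return the first successful result.
with_retry <- function(f, tries = 3, wait = 1) {
  for (i in seq_len(tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (i < tries) Sys.sleep(wait * 2^(i - 1))  # 1x, 2x, 4x ... the base wait
  }
  stop("all ", tries, " attempts failed: ", conditionMessage(result))
}

# Simulated request that fails twice, then succeeds
attempts <- 0
flaky <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("Recv failure: Connection reset by peer")
  "ok"
}

with_retry(flaky, tries = 5, wait = 0)
```

In real use the wrapped function would be the ITIS call itself, for example `function() get_tsn(x)`, with a non-zero `wait`. If the server keeps resetting the connection, this will still fail after the last attempt.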

Dealing with large data.frame results in `get_tsn` and `get_uid`

Sometimes you get a data.frame from get_tsn and get_uid that is too large to scroll up and see all the options. For example:

get_tsn("Andrena", "sciname")

gives a huge data.frame with 1627 rows, in which you can't scroll to the top to see all the options in RStudio. It seems that you can scroll to the top in regular R.app though

Is there any way to get around this?

get_tsn addition: "no match"

When using get_tsn, sometimes the given options don't match your input. Perhaps you can add an option that is "none of the above", which will just insert a character string e.g, "no_match", so that you can continue on in case you are calling get_tsn in an apply call or for loop.
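A sketch of how such an option might behave, assuming the prompt accepts a row number and treats anything else as "none of the above". `resolve_choice()` and the candidate table are hypothetical, and the IDs are made up.

```r
# Hypothetical helper: map what the user typed to a TSN, or to a
# placeholder string when the input is not a valid row number.
resolve_choice <- function(choice, candidates, no_match = "no_match") {
  row <- suppressWarnings(as.integer(choice))
  if (is.na(row) || row < 1 || row > nrow(candidates)) return(no_match)
  candidates$tsn[row]
}

candidates <- data.frame(tsn = c("111", "222"), stringsAsFactors = FALSE)  # made-up IDs

# In an interactive session `choice` would come from readline();
# here the choices are hard-coded so the loop runs unattended.
vapply(c("2", "0", "abc"), resolve_choice, character(1), candidates = candidates)
```

Because the helper always returns a length-one character value, it can sit inside an apply call or a for loop without stopping on an unmatched name.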

Put dialogue for classification below data.frame output

In cases where the returned data.frame is very large, it would be nice to have the dialogue, e.g,:

More than one TSN found for species ' Quercus '!

          Enter rowname of species to take:

be below the data.frame

e.g. run this to see what I mean:

classification(get_tsn(c("Quercus", "Pinus"), 'sciname'))

Phylomatic limit on API calls?

From Simon Goring:

"So, as far as the error in get_phylomatic_tree, I think it's actually
a hard limit on the length of the text string that gets sent out. It
works out to about 200 species for me, I'm not sure how many
characters that is, but I bet you could just generate a random string
of increasing length to figure it out."
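If the limit really is on the length of the request string, one workaround would be to split the species list into batches that each stay under a cap and query them separately. A sketch in base R: `batch_by_length()` is a hypothetical helper, and the `max_chars` default is a placeholder because the actual limit is not known.

```r
# Split a vector of names into batches whose joined length (each name plus
# one separator) stays at or below max_chars. The default cap is a guess.
batch_by_length <- function(taxa, max_chars = 2000, sep = "\n") {
  batches <- list()
  current <- character(0)
  current_len <- 0
  for (name in taxa) {
    add <- nchar(name) + nchar(sep)
    # Start a new batch if adding this name would exceed the cap
    if (length(current) > 0 && current_len + add > max_chars) {
      batches[[length(batches) + 1]] <- current
      current <- character(0)
      current_len <- 0
    }
    current <- c(current, name)
    current_len <- current_len + add
  }
  if (length(current) > 0) batches[[length(batches) + 1]] <- current
  batches
}

# Illustrative names only; a tiny cap is used so the split is visible
taxa <- c("annona cherimola", "annona muricata", "quercus robur", "pinus contorta")
batches <- batch_by_length(taxa, max_chars = 35)
lengths(batches)
```

The trees returned for separate batches would still have to be combined afterwards, which this sketch does not attempt.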

the function `downstream` I started doesn't work

The point of the function is to automate getting all data you would like downstream of a certain taxon. For example, you may want to get all the families (and their TSNs) with the Class Insecta. I think I may have a working version somewhere on my machine, will try to dig it up.

The reason we need this stems from the fact that the ITIS API only allows you to go down to direct children using their method getHierarchyDownFromTSN (e.g., you can only get species within a genus, not subspecies too).
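Since only direct children are available per call, the function would have to walk down one level at a time until it reaches the requested rank. A sketch of that recursion on a toy hierarchy: `downstream_sketch()`, `toy_children`, and `toy_rank` are hypothetical and stand in for repeated direct-children and rank lookups against ITIS.

```r
# Toy parent -> children table standing in for direct-children lookups
toy_children <- list(
  Insecta = c("Diptera", "Coleoptera"),
  Diptera = c("Chironomidae"),
  Coleoptera = c("Elmidae"),
  Chironomidae = character(0),
  Elmidae = character(0)
)
toy_rank <- c(Insecta = "class", Diptera = "order", Coleoptera = "order",
              Chironomidae = "family", Elmidae = "family")

# Walk down one level at a time until taxa of the target rank are reached.
downstream_sketch <- function(taxon, target_rank, children_of, rank_of) {
  if (rank_of(taxon) == target_rank) return(taxon)
  kids <- children_of(taxon)
  unlist(lapply(kids, downstream_sketch, target_rank = target_rank,
                children_of = children_of, rank_of = rank_of))
}

downstream_sketch(
  "Insecta", "family",
  children_of = function(x) toy_children[[x]],
  rank_of = function(x) toy_rank[[x]]
)
```

Passing the lookups in as functions means the same recursion could later be pointed at real API calls. Each level costs one request per taxon, so a large class would need many calls.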

E.g. of general class: Pull down all genera within a tribe or class

From Carl Boettiger:

One thing that would be really useful to get from taxize is: given a higher taxonomic level, like class, return all the genus names that it includes. It seems I cannot search treebase by class (I can for fishbase at least), so it would be nice to simply get a list of every genus in the class and pipe that through the treebase search as a workaround. What do you think?

I can then pull some traits for fish down from fishbase, and would have a working example combining three packages.

treebase and fishbase can pull down publication info, so it wouldn't be too much further to wrap in mendeley, though not sure why it would be useful -- readership is probably a poor proxy for data quality, but it would be possible.

CRAN errors

Looks like we have errors on three different platforms: see here

I thought that the tests were fine when we tested them, and it passed the windows check

This one doesn't make any sense: that is, it says it cannot find the function paste0, weird.

Another Idea

Using classification(), we could write a function to construct species trees from taxonomy.
Sure they may not reflect phylogeny and user should be cautious, but when there is no molecular information available...
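A rough sketch of the idea, assuming the taxonomy has already been arranged as one row per species and one column per rank. `to_newick()` is a hypothetical helper and the table is hand-written sample data, not output from classification(). It produces topology only, with no branch lengths.

```r
# Hand-written classification table: one row per species, one column per rank
tax <- data.frame(
  family  = c("Fagaceae", "Fagaceae", "Pinaceae"),
  genus   = c("Quercus", "Quercus", "Pinus"),
  species = c("Quercus_robur", "Quercus_alba", "Pinus_contorta"),
  stringsAsFactors = FALSE
)

# Recursively nest rows by successive rank columns into a Newick fragment.
to_newick <- function(df, ranks) {
  if (length(ranks) == 1) return(paste(df[[ranks]], collapse = ","))
  # Keep groups in order of first appearance
  groups <- split(df, factor(df[[ranks[1]]], levels = unique(df[[ranks[1]]])))
  parts <- vapply(names(groups), function(g) {
    paste0("(", to_newick(groups[[g]], ranks[-1]), ")", g)
  }, character(1))
  paste(parts, collapse = ",")
}

newick <- paste0("(", to_newick(tax, c("family", "genus", "species")), ");")
newick
```

Ranks with a single child end up as single-child nodes in the string, which some tree readers may need collapsed before plotting.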
