Giter Site home page Giter Site logo

dplyr's Introduction

dplyr

Build Status AppVeyor Build Status CRAN_Status_Badge Coverage Status

Overview

dplyr is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges:

  • mutate() adds new variables that are functions of existing variables
  • select() picks variables based on their names.
  • filter() picks cases based on their values.
  • summarise() reduces multiple values down to a single summary.
  • arrange() changes the ordering of the rows.

These all combine naturally with group_by() which allows you to perform any operation "by group". You can learn more about them in vignette("dplyr"). As well as these single-table verbs, dplyr also provides a variety of two-table verbs, which you can learn about in vignette("two-table").

dplyr is designed to abstract over how the data is stored. That means as well as working with local data frames, you can also work with remote database tables, using exactly the same R code. Install the dbplyr package then read vignette("databases", package = "dbplyr").

If you are new to dplyr, the best place to start is the data import chapter in R for data science.

Installation

# The easiest way to get dplyr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just dplyr:
install.packages("dplyr")

# Or the development version from GitHub:
# install.packages("devtools")
devtools::install_github("tidyverse/dplyr")

If you encounter a clear bug, please file a minimal reproducible example on github. For questions and other discussion, please use the manipulatr mailing list.

Usage

library(dplyr)

starwars %>% 
  filter(species == "Droid")
#> # A tibble: 5 x 13
#>   name  height  mass hair… skin… eye_… birt… gend… home… spec… films vehi…
#>   <chr>  <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <lis> <lis>
#> 1 C-3PO    167  75.0 <NA>  gold  yell… 112   <NA>  Tato… Droid <chr… <chr…
#> 2 R2-D2     96  32.0 <NA>  "whi… red    33.0 <NA>  Naboo Droid <chr… <chr…
#> 3 R5-D4     97  32.0 <NA>  "whi… red    NA   <NA>  Tato… Droid <chr… <chr…
#> 4 IG-88    200 140   none  metal red    15.0 none  <NA>  Droid <chr… <chr…
#> 5 BB8       NA  NA   none  none  black  NA   none  <NA>  Droid <chr… <chr…
#> # ... with 1 more variable: starships <list>

starwars %>% 
  select(name, ends_with("color"))
#> # A tibble: 87 x 4
#>   name             hair_color skin_color    eye_color
#>   <chr>            <chr>      <chr>         <chr>    
#> 1 "Luke Skywalker" blond      fair          blue     
#> 2 C-3PO            <NA>       gold          yellow   
#> 3 R2-D2            <NA>       "white, blue" red      
#> 4 "Darth Vader"    none       white         yellow   
#> 5 "Leia Organa"    brown      light         brown    
#> # ... with 82 more rows

starwars %>% 
  mutate(name, bmi = mass / ((height / 100)  ^ 2)) %>%
  select(name:mass, bmi)
#> # A tibble: 87 x 4
#>   name             height  mass   bmi
#>   <chr>             <int> <dbl> <dbl>
#> 1 "Luke Skywalker"    172  77.0  26.0
#> 2 C-3PO               167  75.0  26.9
#> 3 R2-D2                96  32.0  34.7
#> 4 "Darth Vader"       202 136    33.3
#> 5 "Leia Organa"       150  49.0  21.8
#> # ... with 82 more rows

starwars %>% 
  arrange(desc(mass))
#> # A tibble: 87 x 13
#>   name   heig…  mass hair… skin… eye_… birt… gend… home… spec… films vehi…
#>   <chr>  <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <lis> <lis>
#> 1 "Jabb…   175  1358 <NA>  "gre… oran… 600   herm… "Nal… Hutt  <chr… <chr…
#> 2 Griev…   216   159 none  "bro… "gre…  NA   male  Kalee Kale… <chr… <chr…
#> 3 IG-88    200   140 none  metal red    15.0 none  <NA>  Droid <chr… <chr…
#> 4 "Dart…   202   136 none  white yell…  41.9 male  Tato… Human <chr… <chr…
#> 5 Tarff…   234   136 brown brown blue   NA   male  Kash… Wook… <chr… <chr…
#> # ... with 82 more rows, and 1 more variable: starships <list>

starwars %>%
  group_by(species) %>%
  summarise(
    n = n(),
    mass = mean(mass, na.rm = TRUE)
  ) %>%
  filter(n > 1)
#> # A tibble: 9 x 3
#>   species      n  mass
#>   <chr>    <int> <dbl>
#> 1 Droid        5  69.8
#> 2 Gungan       3  74.0
#> 3 Human       35  82.8
#> 4 Kaminoan     2  88.0
#> 5 Mirialan     2  53.1
#> # ... with 4 more rows

dplyr's People

Contributors

hadley avatar romainfrancois avatar krlmlr avatar lionel- avatar lindbrook avatar hannes avatar jimhester avatar kevinushey avatar ilarischeinin avatar sfirke avatar zeehio avatar leondutoit avatar cosinequanon avatar javierluraschi avatar yutannihilation avatar arunsrinivasan avatar eibanez avatar pimentel avatar maurolepore avatar coolbutuseless avatar dgromer avatar xiaodaigh avatar ncarchedi avatar mkuhn avatar jayhesselberth avatar imanuelcostigan avatar daattali avatar rbdixon avatar bpbond avatar ajschumacher avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.