flumenhabitans / hsds

This project forked from abohyndoe/hsds

Data package for the data sets from the book "A Handbook of Small Data Sets" by David Hand (1994)

Home Page: https://abohyndoe.github.io/HSDS/

License: Creative Commons Zero v1.0 Universal

R 100.00%

hsds's Introduction

HSDS

The goal of HSDS is to make all the data sets from the book “A Handbook of Small Data Sets” (1994) by David J. Hand available. These data sets are particularly useful for demonstrating functions or statistical tests, and for teaching statistics and R.

All data sets are already available individually in this repo: https://github.com/JedStephens/Handbook-of-Small-Data-Sets/tree/master. However, they are not immediately usable in R and are undocumented. This package aims to solve this by providing clean, documented data sets.

Do you like this package and want to support me? “Buy Me A Coffee”

Installation

You can install the development version of HSDS like so:

devtools::install_github("ABohynDOE/HSDS")

Available data sets

The book contains more than 500 data sets. For the moment, only some are available. They are summarized in the table below, along with their names, what they contain, their structure, and the type of variables present.

| name | Title | Structure | Variables |
|------|-------|-----------|-----------|
|      | Germinating seeds | 48 × 3 | factor(2), numeric(1) |
|      | Guessing lengths | 113 × 3 | character(1), numeric(2) |
|      | Darwin’s cross-fertilized and self-fertilized plants | 30 × 3 | factor(1), integer(1), numeric(1) |
|      | Intervals between cars on the M1 motorway | 41 × 2 | character(2) |
|      | Tearing factor for paper | 20 × 2 | numeric(2) |
|      | Abrasion loss | 30 × 3 | numeric(3) |
|      | Mortality and water hardness | 61 × 5 | factor(1), numeric(4) |
|      | Tensile strength of cement | 21 × 2 | numeric(2) |
|      | Weight gain in rats | 40 × 3 | factor(2), numeric(1) |
|      | Weight of chickens | 24 × 3 | factor(2), numeric(1) |
|      | Flicker frequency | 27 × 4 | factor(3), numeric(1) |
|      | Effect of ammonium chloride on yield | 32 × 5 | factor(4), numeric(1) |
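The Structure and Variables columns can be reproduced for any data frame with a few lines of base R. The sketch below is only an illustration: `describe_structure()` is a hypothetical helper, not a function exported by HSDS, and it is applied to `warpbreaks` (a data set that ships with base R) so that it runs without the package installed. How the table above was actually generated is not stated here.

```r
# Hypothetical helper (not part of HSDS): summarise a data frame as
# "rows x columns" plus a count of columns per class.
describe_structure <- function(df) {
  # Keep only the first class of each column (e.g. "factor", "numeric").
  classes <- vapply(df, function(col) class(col)[1], character(1))
  counts <- table(classes)
  list(
    # "\u00d7" is the multiplication sign used in the table.
    structure = paste(nrow(df), "\u00d7", ncol(df)),
    variables = paste0(names(counts), "(", as.integer(counts), ")",
                       collapse = ", ")
  )
}

# Stand-in data set from base R; replace it with an HSDS data set once
# the package is installed.
info <- describe_structure(warpbreaks)
print(info$structure)
print(info$variables)
```

Once HSDS is loaded, the same call on one of its data sets should give a summary in the same form as the corresponding table row.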

Example

This is a basic example which shows you how to use a data set to make a nice plot:

library(HSDS)
library(ggplot2)

ggplot(germin, aes(x = water, y = seeds, color = box)) +
  geom_boxplot(na.rm = TRUE) +
  theme_bw()
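The data sets are also meant for demonstrating statistical tests. As a rough sketch of that use, the code below fits a two-way analysis of variance with base R. It uses `warpbreaks` from base R as a stand-in with the same shape as the plotted data (one numeric response and two factors), because the column types of `germin` are an assumption here; swap in an HSDS data set and its own column names once the package is installed.

```r
# Stand-in data from base R: one numeric response (breaks) and two
# factors (wool, tension). This is NOT an HSDS data set.
fit <- aov(breaks ~ wool + tension, data = warpbreaks)

# The ANOVA table has one row per factor plus a row for the residuals.
anova_table <- summary(fit)[[1]]
print(anova_table)
```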

Contributing

We are far from the 500 data sets, so any help is welcome! If you want to contribute, all raw data sets are already present in the repo (at data-raw/data-files), so feel free to clean one or more…! If you do so, please follow these guidelines:

  • data sets should be named after the data structure index of the book (available here)

  • all variables in the data set should be labelled (using the labelled package for example)

  • data sets should be documented using the text from the book
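A cleaning script following these guidelines might look roughly like the sketch below. Several things in it are assumptions: the layout of the raw files in data-raw/data-files is not described here, so a temporary file written from `warpbreaks` (a base R data set) stands in for one, and the labels are set through the base `label` attribute, which is the attribute that `labelled::var_label()` reads and writes.

```r
# Simulate a raw text file. In the repo the raw files live under
# data-raw/data-files; their real layout may differ from this stand-in.
raw_path <- tempfile(fileext = ".txt")
write.table(head(warpbreaks, 6), raw_path, row.names = FALSE, quote = FALSE)

# 1. Read the raw file and give each column an explicit type.
cleaned <- read.table(raw_path, header = TRUE, stringsAsFactors = FALSE)
cleaned$wool <- factor(cleaned$wool)
cleaned$tension <- factor(cleaned$tension)

# 2. Label every variable. With the labelled package installed,
#    labelled::var_label(cleaned$breaks) <- "..." sets the same attribute.
attr(cleaned$breaks, "label") <- "Number of breaks"
attr(cleaned$wool, "label") <- "Type of wool"
attr(cleaned$tension, "label") <- "Level of tension"

# 3. Check that no variable was left unlabelled.
labels <- vapply(cleaned, function(col) attr(col, "label"), character(1))
print(labels)

# In the package, the object would then be named after the book's data
# structure index, saved (for example with usethis::use_data()), and
# documented using the text from the book.
```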
