Well Hello Stats

Mauricio Garnier-Villarreal 01 February, 2024

Welcome

Welcome to Well Hello Stats. This is a page for learning R for social scientists: a series of tutorials that will teach you how to use R for research in the social sciences. Throughout the tutorials, you will learn how to install and set up R and RStudio, get your data into R, manage your data, and implement some of the most commonly used methods in quantitative social science research using R and RStudio.

There are many great resources out there to learn R. This series of tutorials is set up to teach you the necessary skills with one consistent approach.

What is R and why should you learn it?

R is an open-source statistical programming language that is currently among the most popular languages for research in the social sciences. In comparison to other popular software packages in social scientific research, such as SPSS and Stata, R has several notable advantages:

  • R is a programming language, which makes it much more versatile. While R focuses on statistical analysis at heart, it supports a wide range of features, and virtually any tool for data analysis can be implemented.
  • The range of things you can do with R is constantly growing. R is open-source, meaning that anyone can contribute to its development. In particular, people can develop new packages that can easily and safely be installed from within R with a single command. Since many scholars and industry professionals use R, it is likely that any cutting-edge technique you are interested in is already available. You can think of it as an app store for all your data-analysis needs!
  • R is free. While for students this is not yet a big deal due to free or cheap student and university licences, it can be a big plus in the commercial sector, especially for small businesses and freelancers. It also democratizes access to cutting-edge data analysis methods for people who would otherwise not have access to a proprietary program.
  • Using syntax-based software improves our ability to reproduce and replicate our results and to track down and fix mistakes, and we can save and reuse syntax for future projects.

RStudio is the most commonly used editor for working with R. RStudio makes it easy to write and save code (the instructions for the tasks you want R to execute), to view and plot your data, and to manage your workspace (e.g., the code, data files, and output you are working with).

Prerequisites

Our goal is to make this series of tutorials self-sufficient. This means that there are no prerequisites in terms of knowledge of working with R and RStudio. We will start from the very beginning, with how to install R and RStudio on your computer, how to set up RStudio for an easy workflow, and the very basics of working with data. If you are familiar with other programming languages or statistical analysis software (like Stata or SPSS), you will be able to learn R even faster.

Importantly, these tutorials are not a substitute for education in quantitative research methods. They do teach you how to implement different methods in R, but they do not cover questions about research design, what the best method might be for the question you are asking, or how these methods work and what their assumptions are. Thus, you are responsible for making sure that your analyses are sound.

How to use this resource

If you have not worked with R before, it is best to follow the series of tutorials from the beginning. Before we cover specific methods in the social sciences, we start with the installation of R and RStudio, clarify the most important basics for working with R, and teach you how to import data into R.

If you are generally familiar with R, you can skip the tutorials on the R basics. If you want to follow tutorials on several methods, we recommend that you have a brief look at the tutorial on downloading the data from the World Values Survey (WVS). The WVS is used throughout most of the following tutorials, so it is a good idea to download the data set before continuing. If you are also familiar with importing data, you can jump right to the tutorials on methods.

If you are generally familiar with R, already have your own data set, and just want to know how to implement a specific method in R, you can simply jump to the tutorial that covers the method you are interested in.

In these tutorials we start with and focus on base R for data manipulation, rather than the tidyverse approach. We believe that useRs should first be comfortable with the base R commands and can then, if desired, transition to the tidyverse as an extension of R rather than as the default.
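As a rough illustration of what base R data manipulation looks like, the sketch below builds a small made-up data frame and then subsets rows, adds a column, and summarizes by group without loading any packages. The data frame `survey` and its variables `age`, `country`, and `trust` are hypothetical and are not taken from the WVS or from the tutorials.

```r
# Toy data frame; all names and values are invented for illustration
survey <- data.frame(
  age     = c(23, 35, 47, 52, 61, 29),
  country = c("NL", "NL", "DE", "DE", "NL", "DE"),
  trust   = c(3, 4, 2, 5, 4, 1)
)

# Select rows with a logical condition and keep two columns
older <- survey[survey$age >= 35, c("age", "trust")]

# Create a new variable from an existing one
survey$age_decades <- survey$age / 10

# Mean of trust within each country
trust_by_country <- aggregate(trust ~ country, data = survey, FUN = mean)
print(trust_by_country)
```

The same steps could be written with tidyverse verbs; the base R version is shown here only because it needs nothing beyond a standard R installation.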

The tutorials' .Rmd and .md files are found in the tutorials folder. The following section links to the respective .md files in a structured format, so that you can navigate the tutorials from this page.

The tutorials

Here you will find links to the respective tutorials, each with a short description. They are ordered by increasing complexity, roughly following the structure of a course.

Set up

Basics

  • R Basics: basic use of R, data types, data structures, importing data, functions, install and load packages, working directory and saving your work.
  • Download the WVS data set: where to request access to the World Values Survey (WVS) data set, used in a lot of the tutorials.
  • Import data sets (long): introduction to data formats, base R, use of packages foreign, haven and rio, general recommendations. Packages: rio, haven, foreign.
  • Import data sets (short): introduction to data formats, use of package rio, general recommendations.
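To give a feel for the import step, here is a minimal sketch that uses only base R: it writes a small invented data set to a temporary CSV file and reads it back. The object names and columns are hypothetical; the tutorials themselves work with the WVS files and also cover the rio, haven, and foreign packages.

```r
# Write a small invented data set to a temporary CSV file
tmp <- tempfile(fileext = ".csv")
example <- data.frame(id = 1:3, score = c(10.5, 8.0, 9.25))
write.csv(example, tmp, row.names = FALSE)

# Read it back with base R and inspect the structure
dat <- read.csv(tmp)
str(dat)

# Clean up the temporary file
unlink(tmp)
```

With the rio package installed, `rio::import()` offers a single function that picks the reader based on the file extension, which is likely why the short import tutorial relies on it.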

Initial work with data

  • Data management 1: setting up R session, import data set, recode items, create composite scores, variable calculations, selecting subsets. Packages: rio, car, psych.
  • Data management 2: setting up R session, import data set, variable types, use the factor function.
  • Descriptive Statistics: setting up R session, import data set, continuous items, categorical items, data frame summary. Packages: rio, summarytools.
  • Basic plots: setting up R session, import data set, ggplot2 basics, histogram, scatter plot, bar plot, box plot. Packages: rio, ggplot2.
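The data management topics listed above (recoding items, composite scores, factors, simple descriptives) might look roughly like this in base R. This is a sketch on invented data, assuming three hypothetical 5-point items of which `item3` is reverse-worded; the tutorials use the WVS and the packages listed above instead.

```r
# Invented responses to three 5-point items; item3 is reverse-worded
items <- data.frame(
  item1 = c(1, 4, 5, 2),
  item2 = c(2, 5, 4, 1),
  item3 = c(5, 2, 1, 4),
  sex   = c(1, 2, 2, 1)
)

# Reverse-code item3 so that high values mean the same as in item1 and item2
items$item3_r <- 6 - items$item3

# Composite score: mean of the three items per respondent
items$scale_mean <- rowMeans(items[, c("item1", "item2", "item3_r")])

# Turn a numeric code into a labelled factor
items$sex <- factor(items$sex, levels = c(1, 2), labels = c("male", "female"))

# Quick descriptives for a continuous and a categorical variable
summary(items$scale_mean)
table(items$sex)
```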

Reporting

Relations between variables

  • Correlation: setting up R session, import data set, scatter plot, Pearson correlation, Spearman correlation, Kendall's tau correlation, extracting the matrices, correlogram, pairs plot. Packages: rio, psych, corrplot, ggplot2, GGally.
  • Contingency tables
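The three correlation coefficients named above are all available through base R's `cor()`. The sketch below uses the built-in `mtcars` data set as a stand-in (an assumption for illustration; the tutorial works with the WVS and with the psych package).

```r
# Built-in example data set shipped with R
data(mtcars)

# Pearson, Spearman, and Kendall's tau for the same pair of variables
r_pearson  <- cor(mtcars$mpg, mtcars$wt, method = "pearson")
r_spearman <- cor(mtcars$mpg, mtcars$wt, method = "spearman")
r_kendall  <- cor(mtcars$mpg, mtcars$wt, method = "kendall")

# Correlation matrix for several variables at once
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp")])
round(cor_matrix, 2)

# Significance test and confidence interval for the Pearson correlation
cor.test(mtcars$mpg, mtcars$wt)
```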

Scale evaluation

  • Reliability: what is reliability analysis?, preparation, reliability analysis (Cronbach’s alpha, McDonald’s omega). Packages: rio, psych, car.
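For readers who want to see what Cronbach's alpha computes, the following sketch applies the standard formula by hand to invented data. It is not the tutorial's method (the tutorial lists the psych package); the data frame and the name `cronbach_alpha` are hypothetical.

```r
# Invented responses: 5 respondents, 3 items
items <- data.frame(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, 2, 3, 5, 5),
  q3 = c(1, 3, 3, 4, 4)
)

k <- ncol(items)                  # number of items
item_vars <- sapply(items, var)   # variance of each item
total_var <- var(rowSums(items))  # variance of the sum score

# Cronbach's alpha: k / (k - 1) * (1 - sum of item variances / total variance)
cronbach_alpha <- (k / (k - 1)) * (1 - sum(item_vars) / total_var)
cronbach_alpha
```

In practice a dedicated function such as `psych::alpha()` is more convenient, since it also reports item-level statistics.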

General Linear Models

  • t-test: setting up R session, import data set, difference in means, one sample t-test, two sample t-test, paired sample t-test, effect sizes, graphing your results. Packages: rio, effectsize, ggpubr.
  • Repeated Measures - ANOVA: setting up R session, import data set, convert data from wide to long format, run repeated measures ANOVA and mixed design RM-ANOVA, effect size, post-hoc pairwise and planned comparisons, plot effects. Packages: rio, dplyr, reshape2, marginaleffects, afex, sjlabelled, effectsize.
  • MANOVA: introduction to MANOVA, setting up R session, import data set, MANOVA, effect sizes, homogeneity of variances, HE plots, LDA.
  • Linear regression: setting up R session, import data set, simple linear regression, linear regression with a binary predictor, multiple linear regression, standardized solution, assumptions, effect size, plots, interpretation. Packages: rio, psych, effectsize, visreg, rockchalk, ggplot2.
  • Moderation with lm: what is moderation, setting up R session, import data set, moderation analysis steps, categorical and continuous moderator, main effects, interaction models, effect size, probing, plotting, interpretation. Packages: rio, effectsize, visreg, reghelper.
  • Moderation with the PROCESS macro: what is moderation, installing PROCESS macro, setting up R session, import data set, moderation analysis steps, categorical and continuous moderator, main effects, interaction models, effect size, probing, plotting, interpretation. Packages: rio, PROCESS macro.
  • Mediation with path analysis: what is mediation, setting up R session, import data set, mediation analysis steps, total effect, indirect effect, NHST methods, recommendations and interpretation. Packages: rio, lavaan, semTools.
  • Mediation with the PROCESS macro: what is mediation, installing PROCESS macro, setting up R session, import data set, mediation analysis steps, total effect, indirect effect, NHST methods, recommendations and interpretation. Packages: rio, PROCESS macro.
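Several of the models in this group can be fitted with functions that ship with R. As a hedged sketch, the example below runs a two-sample t-test and a regression with an interaction term (the basic form of a moderation model with `lm`) on the built-in `mtcars` data; the choice of data set and variables is an assumption made for illustration only.

```r
data(mtcars)

# Two-sample t-test: mpg for automatic (am = 0) versus manual (am = 1) cars
t_res <- t.test(mpg ~ am, data = mtcars)
t_res

# Multiple regression with an interaction (moderation) term;
# wt * hp expands to wt + hp + wt:hp
mod <- lm(mpg ~ wt * hp, data = mtcars)
summary(mod)

# Names of the estimated coefficients
names(coef(mod))
```

The coefficient on `wt:hp` is the interaction effect; the moderation tutorials go further by probing and plotting it.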

Mixture models

  • LCA with depmixS4 (categorical indicators): latent class analysis. depmixS4 basics, dichotomous indicator example, class enumeration, and interpretation. Packages: rio, depmixS4, sjlabelled, summarytools, ggplot2.
  • LCA with tidySEM (categorical indicators): latent class analysis. tidySEM basics, dichotomous indicator example, class enumeration, and interpretation. Packages: rio, tidySEM, sjlabelled, summarytools, ggplot2, tidyr.
  • HMM with depmixS4 (categorical indicators): hidden Markov models, depmixS4 basics, dichotomous indicator example, time invariant and time variant HMM, class enumeration, plots. Packages: rio, depmixS4, sjlabelled, summarytools, ggplot2, ggseqplot, tidyr, TraMineR.

Factor Analysis

Progress

As with most things in life, these tutorials are a work in progress, so we will continue updating them and adding new ones.

These tutorials started as a request from the Sociology department, as they are transitioning out of proprietary software. But we expect this to go beyond the departmental needs.

You are welcome to suggest new tutorials and/or collaborate on one.
