stat_rethinking_2023's Introduction

Statistical Rethinking (2023 Edition)

For the 2024 version of the course see: https://github.com/rmcelreath/stat_rethinking_2024

Instructor: Richard McElreath

Lectures: Uploaded and pre-recorded, two per week

Discussion: Online (Zoom), Fridays 3pm-4pm Central European (Berlin) Time

Purpose

This course teaches data analysis, but it focuses on scientific models. The unfortunate truth about data is that nothing much can be done with it until we say what caused it. We will prioritize conceptual, causal models and precise questions about those models. We will use Bayesian data analysis to connect scientific models to evidence. And we will learn powerful computational tools for coping with high-dimensional, imperfect data of the kind that biologists and social scientists face.

Format

Online, flipped instruction. I will pre-record the lectures each week. We'll meet online once a week for an hour to discuss the material. The discussion time (3-4pm Berlin Time) should allow people in the Americas to join in their morning.

We'll use the 2nd edition of my book, <Statistical Rethinking>, and possibly some draft chapters for the 3rd edition. I'll provide a PDF of the book to enrolled students.

Registration: Closed.

Calendar & Topical Outline

There are 10 weeks of instruction. Links to lecture recordings will appear in this table. Weekly problem sets are assigned on Fridays and due the next Friday, when we discuss the solutions in the weekly online meeting.

Full lecture playlist: <Statistical Rethinking 2023 Playlist>

Note about slides: In some browsers, the slides don't show correctly. If points are missing from plots, download the slides PDF instead of viewing in browser.

Week     Meeting date  Reading              Lectures
Week 01  06 January    Chapters 1, 2 and 3  [1] <Science Before Statistics> <Slides>
                                            [2] <Garden of Forking Data> <Slides>
Week 02  13 January    Chapter 4            [3] <Geocentric Models> <Slides>
                                            [4] <Categories and Curves> <Slides>
Week 03  20 January    Chapters 5 and 6     [5] <Elemental Confounds> <Slides>
                                            [6] <Good and Bad Controls> <Slides>
Week 04  27 January    Chapters 7, 8 and 9  [7] <Overfitting> <Slides>
                                            [8] <MCMC> <Slides>
Week 05  03 February   Chapters 10 and 11   [9] <Modeling Events> <Slides>
                                            [10] <Counts and Confounds> <Slides>
Week 06  10 February   Chapters 11 and 12   [11] <Ordered Categories> <Slides>
                                            [12] <Multilevel Models> <Slides>
Week 07  17 February   Chapter 13           [13] <Multilevel Adventures> <Slides>
                                            [14] <Correlated Features> <Slides>
Week 08  24 February   Chapter 14           [15] <Social Networks> <Slides>
                                            [16] <Gaussian Processes> <Slides>
Week 09  03 March      Chapter 15           [17] <Measurement> <Slides>
                                            [18] <Missing Data> <Slides>
Week 10  10 March      Chapters 16 and 17   [19] <Generalized Linear Madness> <Slides>
                                            [20] <Horoscopes> <Slides>

Coding

This course involves a lot of scripting. Students can engage with the material using either the original R code examples or one of several conversions to other computing environments. The conversions are not always exact, but they are rather complete. Each option is listed below.

Original R Flavor

To use the original R code examples from the print book, you need to install the rethinking R package. The code is all on GitHub at https://github.com/rmcelreath/rethinking/ and there are additional details about the package there, including information about using the more up-to-date cmdstanr instead of rstan as the underlying MCMC engine.

R + Tidyverse + ggplot2 + brms

The <Tidyverse/brms> conversion is very high quality and complete through Chapter 14.

Python and PyMC3

The <Python/PyMC3> conversion is quite complete.

Julia and Turing

The <Julia/Turing> conversion is not as complete, but is growing fast and presents the Rethinking examples in multiple Julia engines, including the great <TuringLang>.

Other

There are several other conversions. See the full list at https://xcelab.net/rm/statistical-rethinking/.

Homework and solutions

I will also post problem sets and solutions. Check the folders at the top of the repository.

stat_rethinking_2023's Issues

height and age as piecewise linear in 03_howell_new_weight_model.r not working

Let me start by expressing my gratitude for the great lecture and book. Thanks for making this available on youtube and GitHub.

I did notice that the last few lines in https://github.com/rmcelreath/stat_rethinking_2023/blob/main/scripts/03_howell_new_weight_model.r are unfinished. There is a missing closing parenthesis as well as missing prior specifications.

#######
# height and age as piecewise linear

data(Howell1)
d <- Howell1

dat <- list(
    H = d$height,
    A = d$age )

m <- quap(
    alist(
        H ~ dnorm(mu,sigma),
        mu <- a1*(1-exp(-b1*A)) + a2*(1-exp(-b2*(A))
    )
)
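
For reference, a minimal base-R sketch of the mean function from the snippet above with its parentheses balanced, followed by a quick prior simulation. The prior families and their parameters (`rlnorm`, `rexp` and the numbers passed to them) are assumptions made up for illustration; they are not taken from the course script, and the helper name `mu_height` is hypothetical.

```r
# Mean height as a sum of two saturating growth curves in age A.
# Same expression as in the issue, with the closing parenthesis restored.
mu_height <- function(A, a1, b1, a2, b2) {
  a1 * (1 - exp(-b1 * A)) + a2 * (1 - exp(-b2 * A))
}

# ASSUMED priors, for illustration only: positive asymptotes and rates.
set.seed(1)
n_sim <- 20
prior <- data.frame(
  a1 = rlnorm(n_sim, log(80), 0.2),
  b1 = rexp(n_sim, 2),
  a2 = rlnorm(n_sim, log(80), 0.2),
  b2 = rexp(n_sim, 10)
)

# Prior predictive mean curves over a grid of ages: one column per draw.
ages <- seq(0, 80, by = 1)
curves <- sapply(seq_len(n_sim), function(i) {
  with(prior[i, ], mu_height(ages, a1, b1, a2, b2))
})
```

Plotting the columns of `curves` against `ages` (for example with `matplot(ages, curves, type = "l")`) gives a quick check of whether the assumed priors produce plausible growth curves. Inside `quap`, each of `a1`, `b1`, `a2`, `b2` and `sigma` would presumably need its own prior line in the `alist()`.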

Garden of forking paths not rendering

Hi, I was not a signed-up student, but I am following along with your 2023 lectures, and I couldn't get the initial 02_garden_animation.r to render without this error:

Error in garden(arc = arc, possibilities = c(0, 0, 0, 1), data = dat,  : 
  argument "prog_dat" is missing, with no default

Since no one else has complained about this yet, it may just be a user error on my part, as I'm neither an RStudio nor a dev regular. But I found that the error was resolved by adding a default value for prog_dat in line 93:

prog_dat=c(-1,-1,-1)
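
A small base-R sketch of why a default value removes this kind of error. The function `draw_garden` and its body are hypothetical stand-ins, not the real `garden()` from the animation script; only the argument name `prog_dat` and the default `c(-1,-1,-1)` come from the report above.

```r
# Hypothetical stand-in for a plotting function with a required argument.
draw_garden <- function(possibilities, prog_dat) {
  # Using prog_dat forces its evaluation; with no default and no value
  # supplied, R raises 'argument "prog_dat" is missing, with no default'.
  sum(prog_dat >= 0) + length(possibilities)
}

# Capture the error message instead of stopping the script.
err <- tryCatch(
  draw_garden(c(0, 0, 0, 1)),
  error = function(e) conditionMessage(e)
)

# Same function with the default value suggested in the issue.
draw_garden_fixed <- function(possibilities, prog_dat = c(-1, -1, -1)) {
  sum(prog_dat >= 0) + length(possibilities)
}
ok <- draw_garden_fixed(c(0, 0, 0, 1))
```

The alternative to editing the function is to pass `prog_dat` explicitly at every call site; a default is less intrusive when the argument is optional in practice.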

blank

The example code uses the function "blank".
However, it is not clear from the code which package I need to load.

blank(w=2,h=0.7)
Error in blank(w = 2, h = 0.7) : could not find function "blank"
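
One way to narrow this down is to ask R where a name is defined among the packages currently loaded. The sketch below uses `utils::getAnywhere`; the wrapper name `locate_function` is hypothetical. It only searches loaded packages and namespaces, so if `blank` lives in a package that has not been loaded (possibly the rethinking package, though that is an assumption here), the result will be empty until `library()` is called for that package.

```r
# Return the places (attached packages, namespaces) where `name` is defined.
# Only searches what is currently loaded in the session.
locate_function <- function(name) {
  hits <- utils::getAnywhere(name)
  hits$where
}

# Example with a function that is always available in a standard session.
where_sd <- locate_function("sd")

# A name that is defined nowhere yields an empty character vector.
where_none <- locate_function("surely_not_a_function_xyz")
```

Running `locate_function("blank")` before and after loading a candidate package shows whether that package provides the function.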

`sapply()` is an avoidable complication in code block 2.1

I've just been watching lecture 2, and when R example 2.1 comes up (35:00), you explain that

For those of you who don't use R, sapply is just a loop, it's just a function that loops over a list

Because R vectorises functions by default, you don't need to use sapply here. This might make the code flow clearer, since you wouldn't need to explain the purpose of sapply.

I've written a small counter-example that uses vectorisation rather than sapply:

sample <- c("W", "L", "W", "W", "W", "L", "W", "L", "W")
W <- sum(sample == "W")
L <- sum(sample == "L")
p <- c(0, 0.25, 0.5, 0.75, 1)

get_ways <- function(q) (q*4)^W * ((1-q)*4)^L
ways <- get_ways(p)

prob <- ways/sum(ways)

cbind(p, ways, prob)
#>         p ways       prob
#> [1,] 0.00    0 0.00000000
#> [2,] 0.25   27 0.02129338
#> [3,] 0.50  512 0.40378549
#> [4,] 0.75  729 0.57492114
#> [5,] 1.00    0 0.00000000

Created on 2023-03-20 with reprex v2.0.2

Possible error in the variance-covariance matrix in the r chunks 3.17 and 3.19

I am following the code of the third chapter of the third draft (I am a little late), and I have found a discrepancy. In the m3.1 model, using quap, the estimators calculated by running the code match perfectly with those in the draft. However, when I try to display the variance-covariance matrix, the values do not match.

I think the problem is that the matrix that appears in the draft is not the correct one, because if I square the sd of the estimators, they match the variance that I get using vcov(m3.1).

I report it here in case it is worth checking!
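
The check described above can be sketched in base R. This uses `lm` on the built-in `cars` data as a stand-in, not the `quap` model m3.1 from the draft, so it only illustrates the relationship between the variance-covariance matrix and the standard errors.

```r
# Fit any model that reports standard errors and a vcov matrix.
fit <- lm(dist ~ speed, data = cars)

V  <- vcov(fit)                             # variance-covariance matrix
se <- coef(summary(fit))[, "Std. Error"]    # reported standard errors

# The diagonal of V holds the variances, i.e. the squared standard errors.
variances <- diag(V)
check <- all.equal(unname(variances), unname(se^2))

# Off-diagonal entries are covariances; cov2cor() rescales V to correlations.
R <- cov2cor(V)
```

If `check` is `TRUE` for the fitted model but a printed matrix disagrees, the printed matrix is the more likely source of the discrepancy.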

Issues with installing the "rethinking" package in RStudio

A colleague and I are going through the 'Statistical Rethinking' series, but we are unable to install the "rethinking" package in RStudio. This is the warning message in R, "package ‘rethinking’ is not available for this version of R".

We are unsure why this is happening because we are using the latest version of R. We would appreciate your suggestions on how to go about the issue.

Thanks!
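
For context: `install.packages("rethinking")` reports "not available for this version of R" because the package is not on CRAN, not because of the R version. A sketch of the GitHub route follows, based on the instructions in the rethinking repository linked in the Coding section. The wrapper name `install_rethinking` is hypothetical, and the dependency list may have changed since this was written, so check the repository README first. Setting up cmdstanr or rstan as the MCMC engine is a separate step described there.

```r
# Install rethinking from GitHub (it is not distributed through CRAN).
# Wrapped in a function so nothing is downloaded until it is called.
install_rethinking <- function() {
  # CRAN packages needed first (list assumed from the repository README).
  deps <- c("coda", "mvtnorm", "devtools", "loo", "dagitty", "shape")
  missing_deps <- deps[!vapply(deps, requireNamespace, logical(1), quietly = TRUE)]
  if (length(missing_deps) > 0) install.packages(missing_deps)

  # Then the package itself, straight from the author's repository.
  devtools::install_github("rmcelreath/rethinking")
}

# Run once in an interactive session with network access:
# install_rethinking()
```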

make_bar from compute_posterior function (Lecture 2) not found

Hello! I was wondering if I need to install a separate library to be able to run make_bar in the compute_posterior function that was shown in Lecture #2. I tried typing out all the code and running it, but it produced this error: Error in make_bar(q): could not find function "make_bar"
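
Until the source of `make_bar` is confirmed, a stand-in can be defined by hand. The sketch below is a hypothetical replacement that draws a text bar proportional to a probability; it is not the author's definition. The body of `compute_posterior` is likewise a reconstruction that assumes the counting logic of R example 2.1.

```r
# HYPOTHETICAL stand-in: a text bar whose length is proportional to q.
make_bar <- function(q, width = 20) {
  strrep("#", round(q * width))
}

# Reconstruction (assumed): relative number of ways each proportion of
# water could produce the sample, normalised to a posterior.
compute_posterior <- function(the_sample, poss = c(0, 0.25, 0.5, 0.75, 1)) {
  W <- sum(the_sample == "W")   # number of W observed
  L <- sum(the_sample == "L")   # number of L observed
  ways <- (poss * 4)^W * ((1 - poss) * 4)^L
  post <- ways / sum(ways)
  bars <- vapply(post, make_bar, character(1))
  data.frame(poss, ways, post = round(post, 3), bars)
}

out <- compute_posterior(c("W", "L", "W", "W", "W", "L", "W", "L", "W"))
```

Printing `out` shows one row per candidate proportion, with a bar of `#` characters next to each posterior probability.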

week07_solutions - sim() may be broken or behaving unexpectedly

The code chunk below gives an error that may point to the sim() call. When run, it breaks and gives the error listed after it.

pU0 <- sapply( 1:61 ,
    function(dist)
        sim(m4,vars=c("Ks","C"),data=list(A=Asim,U=rep(1,n),D=rep(dist,n)))$C
)

Error in if (left == var) { : the condition has length > 1

When I tried to debug it, the error seemed to come from the `if (left == var) {` line of the sim() function, excerpted below. I might be wrong, but I am not able to fix it either. Or maybe I missed the whole point, actually.

sim_vars <- list()
for (var in vars) {
    f <- fit@formula[[1]]
    for (i in 1:length(fit@formula)) {
        f <- fit@formula[[i]]
        left <- as.character(f[[2]])
        if (left == var) {
            if (debug == TRUE)
                print(f)
            break
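
A possible explanation, sketched in base R: when the left-hand side of a formula is a call rather than a single symbol, `as.character()` returns more than one string, so `left == var` has length greater than one. Since R 4.2.0, `if()` treats such a condition as an error (earlier versions only warned). The formula below is hypothetical; whether model m4 actually contains a line of this shape is an assumption.

```r
# A formula whose left-hand side is a call, e.g. a shared prior for two names.
f <- quote(c(Ks, C) ~ dnorm(mu, sigma))   # hypothetical formula line

left <- as.character(f[[2]])   # "c" "Ks" "C": three strings, not one
var  <- "C"
cond <- left == var            # logical vector of length 3

# if() needs a single TRUE/FALSE; capture what happens with a longer one.
res <- tryCatch(
  if (cond) "match" else "no match",
  error   = function(e) "error",
  warning = function(w) "warning"
)

# A comparison that always yields exactly one logical value.
same_as_var <- identical(left, var)                          # FALSE here
simple_lhs  <- identical(as.character(quote(C ~ x)[[2]]), var)  # TRUE
```

`identical()` is used here only as one length-safe option; it matches when the left-hand side is exactly the variable and skips compound left-hand sides. It is not presented as the package's own fix.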

Book version 3

Hi,
I hope this is not the wrong place to ask, but I couldn't find information elsewhere:
Is there a rough date for when the 3rd version of the book gets published?

Circular Bayesian statistics

I was wondering, do you have some ideas/books/articles on circular statistics (using Bayesian methods as a core)? I have the book 'Circular statistics in R', but that is more non-Bayesian testing. Thanks for the help.
All the best,
Victor

Splines to compare treatments

Say I have some time-series data for 2 groups: 1 group received placebo and the other received treatment. Can the effect of treatment be modeled with splines, similar to a linear model? For example:

# DATA
n_A <- 10 # Number of rats in Group A
n_B <- 10 # Number of rats in Group B

m_NREMS <- 4 # number of measurements for NREMS.  Here, we'll assume that the rhythms of NREMS express over 6-h blocks, giving us 4 blocks.

NREMS_A <- data.frame("Blk0" = rep(0, n_A),
                      "Blk1" = rgamma(n = n_A, shape=144, rate=6/5), # 2h NREMS
                      "Blk2" = rgamma(n=n_A, shape=324, rate=9/5), # 3h NREMS
                      "Blk3" = rgamma(n=n_A, shape=900, rate=3), # 5h NREMS
                      "Blk4" = rgamma(n=n_A, shape=576, rate=12/5)) # 4h NREMS

NREMS_B <- data.frame("Blk0" = rep(0, n_B),
                      "Blk1" = rgamma(n = n_B, shape=144, rate=6/5) + rnorm(n=n_B, mean=30, sd=6), # Similar to NREMS_A, but with the effect of S added
                      "Blk2" = rgamma(n=n_B, shape=324, rate=9/5) + rnorm(n=n_B, mean=20, sd=5),
                      "Blk3" = rgamma(n=n_B, shape=900, rate=3) + rnorm(n=n_B, mean=10, sd=4),
                      "Blk4" = rgamma(n=n_B, shape=576, rate=12/5) + rnorm(n=n_B, mean=5, sd=3))

NREMS_A_cuml <- NREMS_A
for(i in 2:ncol(NREMS_A)) {
  NREMS_A_cuml[,i] <- NREMS_A_cuml[,i] + NREMS_A_cuml[,i-1]
}

NREMS_B_cuml <- NREMS_B
for(i in 2:ncol(NREMS_B)) {
  NREMS_B_cuml[,i] <- NREMS_B_cuml[,i] + NREMS_B_cuml[,i-1]
}

NREMS_All <- data.frame("NREMS" = c(as.numeric(unlist(NREMS_A_cuml)), as.numeric(unlist(NREMS_B_cuml))),
                        "Group" = c(rep("A", (m_NREMS+1)*n_A), c(rep("B", (m_NREMS+1)*n_B))),
                        "Block" = c(rep(0:4, each=n_A), c(rep(0:4, each=n_B))))
NREMS_All$treatment <- ifelse(test = NREMS_All$Group=="A",
                              yes = FALSE,
                              no = TRUE)
NREMS_All$minute <- NREMS_All$Block*360

# SPLINES
num_knots <- 4
knot_list <- quantile(NREMS_All$minute, probs=seq(from=0, to=1, length.out=num_knots))
knot_degree <- 3

B <- bs(NREMS_All$minute,
        knots=knot_list[-c(1, num_knots)],
        degree=knot_degree,
        intercept=TRUE)

# MODEL
NREMS_model_1 <- quap(
  alist(
    NREMS ~ dnorm(mu, sigma),
      mu <- a +
            B %*% w[treatment],
        a ~ dnorm(180, 10),
        w[treatment] ~ dnorm(1, 1),
      sigma ~ dexp(1)
  ), data=list(NREMS=NREMS_All$NREMS,
               treatment=as.integer(NREMS_All$treatment),
               B=B),
     start=list(w=rep(0, ncol(B)))
)

I think the problem is that B is a matrix and w is a vector, so w[treatment] is read as an index on the vector w, but I want it to work as it does in the random effects models.

Any help much appreciated.
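
The indexing being asked for can be written out in base R: give each group its own row of spline weights, so the weights form a groups-by-basis matrix rather than a single vector. The sketch below uses simulated placeholder values and hypothetical names (`W`, `group`); it shows the arithmetic only, not how to express it inside `quap` or `ulam`. Note that R indices start at 1, so the group index is coded 1/2 here, whereas `as.integer()` of a logical gives 0/1.

```r
library(splines)

set.seed(2)
minute <- rep(seq(0, 1440, by = 360), times = 4)   # 5 time points, repeated
group  <- rep(c(1L, 2L), each = 10)                # 1 = placebo, 2 = treatment

# N x K basis matrix; unclass() leaves a plain numeric matrix.
B <- unclass(bs(minute, df = 5, degree = 3, intercept = TRUE))
K <- ncol(B)

# One row of weights per group: W is G x K instead of a length-K vector.
W <- matrix(rnorm(2 * K), nrow = 2, ncol = K)

# mu[i] = a + sum over k of B[i, k] * W[group[i], k]
a  <- 180
mu <- a + rowSums(B * W[group, , drop = FALSE])
```

`W[group, ]` expands the weights to one row per observation, so the elementwise product with `B` followed by `rowSums` applies each observation's own group weights. The treatment effect at any time point is then the difference between the two groups' spline curves.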

Enquiry for 2024 enrolment

Is there any information about a 2024 course and how to enrol in it?
For online learning, I am much more successful if I am part of a cohort of students learning the same material at the same time.

phylogenetic imputation of a binary predictor

Thanks for yet another round of awesome lectures!

I have a question about the imputation procedure in the missing data lecture.

So, in the following model, the primate phylogeny is used to impute missing data in predictor G (group size).

mBMG_OU3 <- ulam(
    alist(
        B ~ multi_normal( mu , K ),
        mu <- a + bM*M + bG*G,
        G ~ multi_normal( nu , KG ),
        nu <- aG + bMG*M,
        M ~ normal(0,1),
        matrix[N_spp,N_spp]:K <- cov_GPL1(Dmat,etasq,rho,0.01),
        matrix[N_spp,N_spp]:KG <- cov_GPL1(Dmat,etasqG,rhoG,0.01),
        c(a,aG) ~ normal( 0 , 1 ),
        c(bM,bG,bMG) ~ normal( 0 , 0.5 ),
        c(etasq,etasqG) ~ half_normal(1,0.25),
        c(rho,rhoG) ~ half_normal(3,0.25)
    ), data=dat_all , chains=4 , cores=4 , sample=TRUE )

My question is, what if G was a binary predictor? For instance, we might code a species either as solitary (S=0) or social (S=1) and use that to predict brain size B. We then want to use phylogenetic information in the imputation of S.

My guess is that the likelihood for S would not be multivariate normal, but what would the code look like then?

Thanks in advance!
