The bio144 from opetchey

No barplots!

From the second edition of Getting Started with R:
"We note that in the first edition of this book, we also showed tools to build bar charts ± error bars. However, we since decided we don’t like these, so we changed. Many other people don’t like bar charts... they can hide too much [1]."

[1] http://dx.doi.org/10.1371/journal.pbio.1002128

Key for count data

Link function.
Exponential link and implications for coefficient.
Why it is now deviance, and no longer variance.
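
A minimal sketch of the coefficient implication, assuming a Poisson GLM with the usual log link (so the inverse link is the exponential). The intercept and slope values are made up for illustration and do not come from any course dataset.

```python
import math

# Hypothetical coefficients on the link (log) scale; illustration only.
intercept = 0.5
slope = 0.2

def expected_count(x):
    """Inverse of the log link: linear predictor -> expected count."""
    return math.exp(intercept + slope * x)

# A one-unit increase in x multiplies the expected count by exp(slope),
# whatever the starting value of x.
ratio_low = expected_count(1) / expected_count(0)
ratio_high = expected_count(11) / expected_count(10)
print(ratio_low, ratio_high, math.exp(slope))
```

The point for students: on the response scale a coefficient is a multiplicative effect, not an additive one.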

Check all internal openedx links

Outlook (week 14)

computer science, marketing, advertising, machine learning, financial forecasting, ecology, medicine

GA 5, question 2

Owen, could you please again check question 2 in GA5? There still seems to be something wrong with it, thanks.

Explain what is the difference between counts and number of successes

This might seem obvious to us, but it came up recently. The confusion can arise, at least, when one describes how binary data can be coded with a "count" of the number of successes in one column. So I think it's worth reinforcing with students:

Count data: theoretically no upper limit on the number of times an "event" occurs (e.g., number of birds observed in a forest plot), not possible to express as a proportion.

Binomial data: upper limit on the number of times an "event" is observed (e.g., number of deaths cannot be greater than number of living individuals), possible to express as a proportion.
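
A small sketch of the distinction, using made-up numbers: the count column stands alone, while the number of successes only makes sense next to its number of trials.

```python
# Made-up numbers, for illustration only.
birds_per_plot = [0, 3, 12, 7]   # count data: no fixed upper limit

deaths = [2, 5, 0]               # binomial data: number of "successes" ...
individuals = [10, 8, 6]         # ... out of a known number of trials

# Binomial data can be expressed as proportions because the denominator
# is known; there is no such denominator for the bird counts.
proportions = [d / n for d, n in zip(deaths, individuals)]
print(proportions)
```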

"Significance" question in week 1

Hi Owen, I opened an issue, really nice feature:-)

I just spotted that there is a question on "Significance" in the graded assessment of week 1. Would it be more appropriate to move it to week 9, where we discuss this topic?

In the option
"Statistical significance is often said to be when something is unexpected, given our expectation about what would happen by chance alone"
I think it is better to replace "by chance alone" by "under the null hypothesis".

Make a "Why R" video lecture.

check "finished" in show answer is working correctly

move everything to openedx!

put stuff from mercury exercise about log transform etc into healthcare

Lecture 8, slide 11-12

Nice to put the quadratic lm in, because it shows that although the nonlinearity is well dealt with, the scale-location plot still shows an increase. This indicates that in the real data the variance increases with the mean (fitted value), while the linear model with identity link assumes the variance is constant and independent of the mean (fitted value). I think it's worth mentioning this on slide 12 as one of the problems. This would link well with the content on slide 13 about the variance of the Poisson increasing as the mean increases.
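
The mean-variance relationship of the Poisson can be checked numerically. This is a rough sketch using only the Python standard library; the sampler is Knuth's multiplication method, and the seed, means and sample size are arbitrary choices.

```python
import math
import random
import statistics

def rpois(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

rng = random.Random(144)  # arbitrary seed, for reproducibility
results = {}
for lam in (2, 20):
    draws = [rpois(lam, rng) for _ in range(20000)]
    # For a Poisson variable the sample variance should be close to the mean.
    results[lam] = (statistics.mean(draws), statistics.variance(draws))
    print(lam, results[lam])
```

A model that assumes constant variance cannot reproduce this pattern, which is what the increasing scale-location plot is picking up.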

On slide 22, first bullet, I suggest pointing out the big difference in the diagnostic plots is in the scale-location plot, which no longer shows an increasing relationship. This is the big effect of using a glm in this case.

Second bullet on slide 22... a glm is still linear regression :)

Non-standard responsibilities

(I is Steffi, you is Owen)

In IC practical 4, I'll provide the part for the linAlg, while you will provide an ANCOVA example (right?)
In lecture 11, I will do the measurement error part, you will do the mixed models part (right?)
In IC practical 11, I'll provide the measurement error part, while you will provide the exercise for the mixed models (right?)

Talk about effect sizes

r-scripts

I just solved the algae ANOVA example as if I were a student. Of course, I'm supposed to be much faster than them. However, I noticed that I lose too much time looking up the right dplyr and ggplot commands (it looks like I'm a seasoned R veteran, but I'm really happy ;-)). I'd highly appreciate having the solution scripts, so that I can be more efficient. It will also help me not to show my different opinion on that to the students, haha.
Also I think that our TAs should have the scripts, so all of us will definitely give the very same advice to the students. I know that by far not all of the TAs have so far embraced the Hadleyverse ;)

Rescale course size in 2018

E.g. fewer practical exercises

practical 2

In practical 2, Question 1 of "reporting your results" section, there seems to be a problem with the right answers.

redo the g2 4 4 video to use qplot

In week 01 things to do before class

Rationalise Owen and Steffi's Week 1 intro presentations

put a lecture number column in the schedule

Lectures numbered 1-12

Have a quiz on what test for what question / data.

Talk about power analysis

Make intro to ggplot video

Add plant growth script to week 3 BC material

Update the reaction time script for practical 0

It is for an old version of the dataset.
Also update the solution script, if required.

graded assessment 5

In GA 5 I have tried the questions and didn't get all correct:-)

Q1: the t-test is not actually a model. Maybe reformulating the question to "Which of these is based on a linear model?" would be more precise?
Q2: I can't do a generalised linear model with the lm function;) Generalized is all the stuff with link function etc, so in general this isn't correct, right?
Q4: I was confused by the degrees of freedom question. The design that is described there seems to be nested (hierarchical) instead of factorial. Unfortunately, I think we don't have time to cover nested designs, too. Did I misunderstand something here?
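
On the Q1 wording ("based on a linear model"): a sketch, with made-up data, showing that the equal-variance two-sample t statistic is the same number as the slope t statistic from regressing the response on a 0/1 group dummy. This covers the pooled-variance t-test only, not Welch's version.

```python
import math

# Made-up measurements for two groups.
a = [4.1, 5.0, 5.6, 4.7, 5.2]
b = [5.9, 6.3, 5.5, 6.8, 6.1, 6.4]

def mean(v):
    return sum(v) / len(v)

def ss(v):
    """Sum of squared deviations from the mean."""
    m = mean(v)
    return sum((x - m) ** 2 for x in v)

# Classical two-sample t statistic with pooled variance.
df = len(a) + len(b) - 2
pooled_var = (ss(a) + ss(b)) / df
t_classic = (mean(b) - mean(a)) / math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))

# The same data as a simple regression of y on a 0/1 dummy for group b.
x = [0] * len(a) + [1] * len(b)
y = a + b
sxx = ss(x)
slope = sum((xi - mean(x)) * (yi - mean(y)) for xi, yi in zip(x, y)) / sxx
intercept = mean(y) - slope * mean(x)
resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(resid_ss / df / sxx)
t_lm = slope / se_slope

print(t_classic, t_lm)
```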

Edit proof all material

Expected competencies

check from all relevant previous courses

MAT183
BIO134

Move all info from schedule to each week page

then tell Steffi to stop using schedule pdf

Check dates, particularly which Thurs and Fris are really available.

For mixed model week

In first week, record each of the five trials. So then we can do a mixed model in later week.

Include info / resources from my "bio 144" teaching folder, integrate in "bio 144 stuff" folder

Overfitting

Numerical demonstration of how more variables (= parameters) decreases the residual error, but can eventually lead to greater prediction error, due to increases in parameter uncertainty.

Include r-squared and adjusted r-squared.
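
One possible shape for the numerical demonstration, as a sketch rather than a finished teaching script. It assumes a simulated response that depends on one predictor only, with further pure-noise predictors added one at a time; sample size, number of noise predictors and seed are arbitrary. Least squares is solved with the normal equations in plain Python to keep the example self-contained.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit(X, y):
    """Ordinary least squares via the normal equations X'X b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def rss(X, y, beta):
    """Residual sum of squares of the fitted coefficients on (X, y)."""
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def simulate(n, n_noise, rng):
    """y depends on the first predictor only; the others are pure noise."""
    X = [[1.0] + [rng.gauss(0, 1) for _ in range(1 + n_noise)] for _ in range(n)]
    y = [1.0 + 2.0 * row[1] + rng.gauss(0, 1) for row in X]
    return X, y

rng = random.Random(144)  # arbitrary seed
n, n_noise = 30, 15
X_train, y_train = simulate(n, n_noise, rng)
X_test, y_test = simulate(n, n_noise, rng)
mean_y = sum(y_train) / n
tss = sum((yi - mean_y) ** 2 for yi in y_train)

summary = []
for k in range(1, n_noise + 2):      # k predictors plus the intercept
    cols = k + 1
    Xtr = [row[:cols] for row in X_train]
    Xte = [row[:cols] for row in X_test]
    beta = fit(Xtr, y_train)
    train_rss = rss(Xtr, y_train, beta)
    r2 = 1 - train_rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - cols)
    test_mse = rss(Xte, y_test, beta) / n
    summary.append((k, train_rss, r2, adj_r2, test_mse))
    print(k, round(train_rss, 2), round(r2, 3), round(adj_r2, 3), round(test_mse, 3))
```

The training residual error can only go down as predictors are added, so r-squared can only go up. Adjusted r-squared and the error on the independent test set are the columns to watch for signs of overfitting.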

Describe Rethinking course text

Practical 3, milk

Correct the questions and any text about the relationship between mass and neocortex.

look at some papers with statistical tests

Video 3 in BC of week 5

I just noticed that in the BC video 3 of week 5 there is a problem with the interpretation of the significance of the interaction term (earthworm example, around minute 13). Actually one would have to use the anova table again to test the interaction between Magenumf and Gattung and not the single p-values.

(Btw, for next year I'll plan to bring an example of an interaction term with at least 3 levels already in the lecture, because I have now only used a binary variable with interaction in lecture 4...)

Include Clement Aldebert

New postdoc in Owen's group starting Jan 2017.

Here is what he has previously taught:

statistics and probability (30h/year): basics of univariate and bivariate statistics, probability laws, combinatory (1st year course)
computing (~20h/year): introduction to R language (3rd year course)
modelling (26h/year): introduction to models based on ODEs and discrete maps, equilibrium and linear stability analysis, introduction to bifurcations (3rd year course)
mathematics (26h/year): numerical methods to approximate functions, solve integrals, ODEs, root-finding in systems of linear and non-linear equations, generalized linear models for data analysis (3rd year course)
theoretical ecology (2h/year): lecture on my research activities (4th year course).

experimental design lesson

Interpretation of summary tables

from simple to more complex, from lm to glm to lmm: what changes?

Give help with this.
Perhaps an overview

ensure super clear and thorough learning objective for each week, and test to them

Clean up the repo and make a "release" branch.

Talk about meta-analysis

Degrees of freedom video lecture

What makes 37?
Son walking down platform looking at clock, 17.20.

How many numbers am I allowed to use?

How much freedom will you give me.

How much freedom does the have to go wrong?

Lots of freedom gives lots of power.

Total
Mean
Slope intercept
Number of things estimated.

First, what do we use df for, in practice... checking design and model, looking up the p-value.
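
A tiny sketch of the "how many numbers am I free to choose" idea, with made-up values: once the mean is fixed, the last number is forced, and each further thing estimated uses up one more degree of freedom.

```python
# Made-up values, for illustration only.
free_values = [12, 9, 11]   # three numbers chosen freely
n = 4
target_mean = 10

# With the mean fixed, the fourth number has no freedom left.
last = target_mean * n - sum(free_values)

# Degrees of freedom = number of observations - number of things estimated.
df_mean = n - 1   # one thing estimated: the mean
df_line = n - 2   # two things estimated: intercept and slope
print(last, df_mean, df_line)
```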

Talk about reporting findings

Add a couple of peer review assignments in 2018

Week by week learning objectives, with final exam questions to match

check randomisation of option order in questions

to make cheating in final exam harder

Make intro to tidyverse video

Steffi's required reading:

week 1: No additional material except lecture notes.

week 2: Stahel script "Lineare regression", chapter 2.
Alternatively: "Statistische Datenanalyse" book chapters 13.1-13.4.

week 3: Stahel script chapters 3.1, 3.2a-q, 4.1, 4.2f, 4.3a-e;
"Statistische Datenanalyse" book, Chapter 11.2

week 4: Stahel script chapters 3.2u-x, 3.3, 4.1-4.5

week 5: Stahel book ("Statistische Datenanalyse") chapter 12;
"The new statistics with R", chapter 2
GSWR chapters 5.6 and 6.2

week 6: GSWR chapter 6.3
"The new Statistics with R" chapter 7 (ANCOVA) (possibly remove this one?)
Stahel Script chapters 3.4, 3.5 (pp 39-42) and 3.A (pp 43-45)

week 7: Stahel script chapters 5.1-5.4
"Choice and Interpretation of Models" (Clayton/Hills) chapters 27.1 + 27.2 (pdf provided as a scan)

week 8: Self-study week, see papers and articles provided.

week 9: No BC reading (because covered by self-study week).

week 10: GSWR chapter 7
