bio144's People

Contributors

duriah, erikpwillems, opetchey, stefaniemuff

bio144's Issues

No barplots!

From the second edition of Getting Started with R:
"We note that in the first edition of this book, we also showed tools to build bar charts ± error bars. However, we since decided we don’t like these, so we changed. Many other people don’t like bar charts... they can hide too much [1]."

  [1] http://dx.doi.org/10.1371/journal.pbio.1002128

Key for count data

  1. Link function.
  2. Log link (its inverse is the exponential) and implications for interpreting coefficients.
  3. Why it is now deviance, and no longer variance.
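
The three points above can be sketched numerically. This is a minimal illustration, assuming a Poisson model with a log link (so the inverse link is the exponential). The coefficients `b0` and `b1` and the helper names are hypothetical and not taken from the course material.

```python
import math

def poisson_mean(x, b0, b1):
    """Expected count under a log link: log(mu) = b0 + b1 * x."""
    return math.exp(b0 + b1 * x)

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)); 0 * log(0) is taken as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = 0.5, 0.3

# A one-unit increase in x multiplies the expected count by exp(b1),
# whatever the starting value of x.
ratio = poisson_mean(3, b0, b1) / poisson_mean(2, b0, b1)
print(ratio, math.exp(b1))

# The deviance plays the role that the residual sum of squares plays in a linear model:
# it is zero for a perfect fit and grows as fitted means move away from the observations.
print(poisson_deviance([1, 2, 3], [1.0, 2.0, 3.0]))
print(poisson_deviance([1, 2, 3], [2.0, 2.0, 2.0]))
```

The multiplicative reading of the coefficient is the practical consequence of the log link: effects are additive on the log scale and multiplicative on the count scale.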

Outlook (week 14)

computer science, marketing, advertising, machine learning, financial forecasting, ecology, medicine

GA 5, question 2

Owen, could you please check question 2 in GA5 again? There still seems to be something wrong with it, thanks.

Explain the difference between counts and number of successes

This might seem obvious to us, but came up recently. The confusion / uncertainty can arise, at least, when one describes how binary data can be coded with a "count" of the number of successes in one column. So I think it's worth reinforcing with students:

Count data: theoretically no upper limit on the number of times an "event" can occur (e.g., number of birds observed in a forest plot), so it is not possible to express it as a proportion.

Binomial data: there is an upper limit on the number of times an "event" can be observed (e.g., the number of deaths cannot be greater than the number of living individuals), so it is possible to express it as a proportion.
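
The distinction can be made concrete by comparing the two probability distributions usually associated with these data types. The sketch below assumes a Poisson distribution for counts and a binomial distribution for number of successes. The values of `n`, `p`, and `lam` are made up for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with mean lam; positive for every k >= 0."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def binomial_pmf(k, n, p):
    """P(Y = k) for k successes out of n trials; zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical values: 10 trials with success probability 0.4,
# and a Poisson count with the same mean.
n, p = 10, 0.4
lam = n * p

for k in (0, 4, 10, 11, 15):
    print(k, poisson_pmf(k, lam), binomial_pmf(k, n, p))
```

For `k` greater than `n` the binomial probability is exactly zero, while the Poisson probability is small but still positive: the count has no upper limit, so there is no denominator to turn it into a proportion.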

"Significance" question in week 1

Hi Owen, I opened an issue, really nice feature:-)

I just spotted that there is a question on "Significance" in the graded assessment of week 1. Would it be more appropriate to move it to week 9, where we discuss this topic?

In the option
"Statistical significance is often said to be when something is unexpected, given our expectation about what would happen by chance alone"
I think it is better to replace "by chance alone" by "under the null hypothesis".

Lecture 8, slide 11-12

Nice to put the quadratic lm in, because it shows that although the nonlinearity is well dealt with, the scale-location plot still shows an increase. This indicates that in the real data the variance increases with the mean (fitted value), while the linear model with identity link assumes that the variance is constant and independent of the mean (fitted value). I think it's worth mentioning this on slide 12 as one of the problems. This would link well with the content on slide 13 about the variance of the Poisson increasing as the mean increases.

On slide 22, first bullet, I suggest pointing out the big difference in the diagnostic plots is in the scale-location plot, which no longer shows an increasing relationship. This is the big effect of using a glm in this case.

Second bullet on slide 22... a glm is still linear regression :)
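
The mean-variance relationship mentioned for slide 13 can be checked numerically. The following is a small sketch, not part of the lecture material: it sums the Poisson probabilities directly and shows that the variance equals the mean. The function name and the chosen means are assumptions.

```python
import math

def poisson_mean_var(lam, k_max=200):
    """Mean and variance of a Poisson(lam) count, summed over k = 0..k_max."""
    p = math.exp(-lam)  # P(Y = 0); the k = 0 term adds nothing to either sum
    mean = 0.0
    second_moment = 0.0
    for k in range(1, k_max + 1):
        p *= lam / k  # recurrence: P(Y = k) = P(Y = k - 1) * lam / k
        mean += k * p
        second_moment += k * k * p
    return mean, second_moment - mean**2

for lam in (1.0, 5.0, 20.0):
    mean, var = poisson_mean_var(lam)
    print(f"lambda = {lam:5.1f}  mean = {mean:7.3f}  variance = {var:7.3f}")
```

Because the variance grows with the fitted value, a model that assumes constant variance shows an increasing trend in the scale-location plot, which is what the quadratic lm still displays.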

Non-standard responsibilities

(I is Steffi, you is Owen)

  • In IC practical 4, I'll provide the part for the linAlg, while you will provide an ANCOVA example (right?)
  • In lecture 11, I will do the measurement error part, you will do the mixed models part (right?)
  • In IC practical 11, I'll provide the measurement error part, while you will provide the exercise for the mixed models (right?)

r-scripts

I just solved the algae ANOVA example as if I was a student. Of course, I'm supposed to be much faster than them. However, I noticed that I lose too much time looking up the right dplyr and ggplot commands (looks like I'm a seasoned R veteran, but I'm really happy;-)). I'd highly appreciate having the solution scripts so that I can be more efficient. Also it will help me to not show my different opinion on that to the students, haha.
Also I think that our TAs should have the scripts, so all of us will definitely give the very same advice to the students. I know that by far not all of the TAs have so far embraced the Hadleyverse;)

practical 2

In practical 2, Question 1 of "reporting your results" section, there seems to be a problem with the right answers.

graded assessment 5

In GA 5 I have tried the questions and didn't get all correct:-)

  • Q1: the t-test is not actually a model. Maybe reformulating the question to "Which of these is based on a linear model?" would be more precise?

  • Q2: I can't do a generalised linear model with the lm function;) Generalized is all the stuff with link function etc, so in general this isn't correct, right?

  • Q4: I was confused by the degrees of freedom question. The design that is described there seems to be nested (hierarchical) instead of factorial. Unfortunately, I think we don't have time to cover nested designs, too. Did I misunderstand something here?
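
The wording suggested for Q1 ("based on a linear model") can be backed up with a small numerical check. The sketch below assumes the equal-variance (pooled) two-sample t-test and uses made-up data. It shows that the t statistic equals the slope t statistic from a linear regression of the response on a 0/1 group indicator.

```python
import math

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (group b minus group a)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)  # pooled variance estimate
    return (mb - ma) / math.sqrt(sp2 * (1 / na + 1 / nb))

def slope_t(x, y):
    """t statistic for the slope in the simple linear regression y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return slope / se

# Made-up measurements for two groups, for illustration only.
group_a = [4.1, 5.0, 3.8, 4.6, 5.2]
group_b = [5.9, 6.3, 5.1, 6.8, 6.0, 5.5]

# Code group membership as a 0/1 dummy variable and stack the responses.
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

print(pooled_t(group_a, group_b), slope_t(x, y))
```

Both numbers agree up to floating-point error, because the slope is the difference in group means and the residual variance is the pooled variance.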

For mixed model week

In the first week, record each of the five trials, so that we can fit a mixed model in a later week.

Overfitting

Numerical demonstration of how adding more variables (= parameters) decreases the residual error, but can eventually lead to greater prediction error, due to increased parameter uncertainty.

Include r-squared and adjusted r-squared.
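
One possible shape for such a demonstration is sketched below, using simulated data in which only the first predictor matters and all further predictors are pure noise. Everything here (sample sizes, true coefficients, seed, function names) is an assumption made for illustration. The least-squares fit uses the normal equations, which is adequate for a problem this small.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients via the normal equations."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def rss(X, y, beta):
    """Residual sum of squares of the fitted values X beta against y."""
    return sum((yi - sum(b * xi for b, xi in zip(beta, r))) ** 2 for r, yi in zip(X, y))

def simulate(n, n_noise, rng):
    """y depends on the first predictor only; the other n_noise predictors are noise."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(1 + n_noise)]
        X.append([1.0] + x)  # leading 1.0 is the intercept column
        y.append(2.0 + 1.5 * x[0] + rng.gauss(0, 1))
    return X, y

rng = random.Random(144)
n_train, n_test, n_noise = 30, 1000, 15
X_train, y_train = simulate(n_train, n_noise, rng)
X_test, y_test = simulate(n_test, n_noise, rng)
mean_y = sum(y_train) / n_train
tss = sum((yi - mean_y) ** 2 for yi in y_train)

results = []
for k in range(1, n_noise + 2):  # k = number of predictors, excluding the intercept
    Xtr = [r[: k + 1] for r in X_train]
    Xte = [r[: k + 1] for r in X_test]
    beta = ols(Xtr, y_train)
    train_rss = rss(Xtr, y_train, beta)
    r2 = 1 - train_rss / tss
    adj_r2 = 1 - (1 - r2) * (n_train - 1) / (n_train - k - 1)
    test_mse = rss(Xte, y_test, beta) / n_test
    results.append((k, train_rss, r2, adj_r2, test_mse))
    print(f"{k:2d}  {train_rss:8.3f}  {r2:6.3f}  {adj_r2:6.3f}  {test_mse:6.3f}")
```

The printed columns are the number of predictors, training residual sum of squares, r-squared, adjusted r-squared, and mean squared error on new data. The training error can only go down as predictors are added, whereas adjusted r-squared is penalised for each extra parameter and the error on new data is free to go up.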

Practical 3, milk

Correct the questions and any text about the relationship between mass and neocortex.

Video 3 in BC of week 5

I just noticed that in BC video 3 of week 5 there is a problem with the interpretation of the significance of the interaction term (earthworm example, around minute 13). Actually, one would have to use the anova table again to test the interaction between Magenumf and Gattung, rather than the individual p-values.

(Btw, for next year I'll plan to bring an example of an interaction term with at least 3 levels already in the lecture, because I have now only used a binary variable with interaction in lecture 4...)

Include Clement Aldebert

New postdoc in Owen's group starting Jan 2017.

Here is what he has previously taught:

  • statistics and probability (30h/year): basics of univariate and bivariate statistics, probability laws, combinatorics (1st year course)
  • computing (~20h/year): introduction to R language (3rd year course)
  • modelling (26h/year): introduction to models based on ODEs and discrete maps, equilibrium and linear stability analysis, introduction to bifurcations (3rd year course)
  • mathematics (26h/year): numerical methods to approximate functions, solve integrals, ODEs, root-finding in systems of linear and non-linear equations, generalized linear models for data analysis (3rd year course)
  • theoretical ecology (2h/year): lecture on my research activities (4th year course).

Degrees of freedom video lecture

What makes 37?
Son walking down platform looking at clock, 17.20.

How many numbers am I allowed to use?

How much freedom will you give me?

How much freedom does the have to go wrong?

Lots of freedom gives lots of power.

Total
Mean
Slope intercept
Number of things estimated.

First, what do we use df for, in practice... checking design and model, looking up pvalue.

Steffi's required reading:

week 1: No additional material except lecture notes.

week 2: Stahel script "Lineare regression", chapter 2.
Alternatively: "Statistische Datenanalyse" book chapters 13.1-13.4.

week 3: Stahel script chapters 3.1, 3.2a-q, 4.1, 4.2f, 4.3a-e;
"Statistische Datenanalyse" book, Chapter 11.2

week 4: Stahel script chapters 3.2u-x, 3.3, 4.1-4.5

week 5: Stahel book ("Statistische Datenanalyse") chapter 12;
"The new statistics with R", chapter 2
GSWR chapters 5.6 and 6.2

week 6: GSWR chapter 6.3
"The new Statistics with R" chapter 7 (ANCOVA) (possibly remove this one?)
Stahel Script chapters 3.4, 3.5 (pp 39-42) and 3.A (pp 43-45)

week 7: Stahel script chapters 5.1-5.4
"Choice and Interpretation of Models" (Clayton/Hills) chapters 27.1 + 27.2 (pdf provided as a scan)

week 8: Self-study week, see papers and articles provided.

week 9: No BC reading (because covered by self-study week).

week 10: GSWR chapter 7
