mics-whales's People

Contributors

edderic

mics-whales's Issues

More 2s than yellow 1s

@rsullivan-lord Are new calves always recorded in a new column? In other words, if a mom gives birth, does the calf always get included in a new row? It seems to me that there are a lot more "2"s than yellow "1"s. Why is that?

[Screenshot from 2019-02-25 21-23-08]

Found a pyABC bug

Hi @rsullivan-lord,

Here's an update. While working on this project last weekend, I found a bug in pyABC. Discrete priors, such as the starting Years Since Previous Birth (i.e. yspb for 1980), weren't being handled correctly by pyABC, so it got stuck and never finished inference. That's okay, because we can use continuous values instead (e.g. a yspb of 6.7 years), which avoids the issue. One of the pyABC maintainers also suggested that we could write some extra code to handle the discrete case.

Anyway, I'm now able to run inference, and pyABC is working correctly. I'm now focusing on priors and distance functions so that the simulator produces data that is close enough to the observed data for a specific individual. The first individual, H002, was observed in most years. With the simple distance function (subtract the simulated and observed data pairwise, take the absolute values, then sum), inference concludes that the whale is very likely to show up (which is good) but has a very low chance of giving birth (which is incorrect). I think this comes from the distance function: births only happen once in a while, so the distance above still yields a small epsilon just by guessing that the whale is alive, likely to be seen, and not giving birth. I'm playing around with the idea of weighting differences in observed births more heavily so that the model can learn that H002 does give birth from time to time!
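
A minimal sketch of the two distance functions discussed above, to show why the unweighted one lets a "never gives birth" simulation score well. The encoding (0 = not sighted, 1 = sighted without a calf, 2 = sighted with a new calf), the function names, and the `birth_weight` value are all assumptions made for illustration, not the project's actual code.

```python
# Hypothetical encoding (an assumption, not the project's actual format):
# 0 = not sighted, 1 = sighted without a calf, 2 = sighted with a new calf.
BIRTH = 2


def absolute_distance(simulated, observed):
    """Sum of pairwise absolute differences between two yearly histories."""
    return sum(abs(s - o) for s, o in zip(simulated, observed))


def birth_weighted_distance(simulated, observed, birth_weight=5.0):
    """Like absolute_distance, but a year in which exactly one of the two
    histories records a birth counts birth_weight times as much."""
    total = 0.0
    for s, o in zip(simulated, observed):
        diff = abs(s - o)
        if (s == BIRTH) != (o == BIRTH):
            diff *= birth_weight
        total += diff
    return total


observed = [1, 1, 2, 1, 1, 1, 2, 1]      # made-up history with two births
never_gives_birth = [1] * len(observed)  # simulation that never produces a calf

print(absolute_distance(never_gives_birth, observed))        # 2
print(birth_weighted_distance(never_gives_birth, observed))  # 10.0
```

With the plain distance, missing both births costs only 2, which can easily fall under the acceptance threshold. The weighted version makes the same mistake cost 10, so simulations that never produce a birth are more likely to be rejected.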

Use Approximate Bayesian Computation

@rsullivan-lord I tried over the week to use the PyMC3 package for regular MCMC, but I had a hard time getting it to work, so I'm back to using ABC. The good news is that I found pyABC, which looks very promising. Instead of the simple Rejection ABC algorithm I used earlier (which is super slow), we could use the Sequential Monte Carlo (SMC) sampler available in pyABC, which is orders of magnitude faster. Nothing really changes on your end: we're still answering the same questions. I just wanted to give you an update. 😄
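
For reference, here is a rough sketch of what plain Rejection ABC does, using a toy model rather than the whale simulator. Everything in it is an assumption for illustration: the simulator draws an independent birth each year, the prior on the birth probability is Uniform(0, 1), and the distance compares total birth counts. It does not use pyABC.

```python
import random


def simulate_births(birth_prob, n_years, rng):
    """Toy simulator: one independent birth/no-birth draw per year."""
    return [1 if rng.random() < birth_prob else 0 for _ in range(n_years)]


def rejection_abc(observed, epsilon, n_draws, rng):
    """Keep the prior draws whose simulated birth count is within epsilon
    of the observed birth count."""
    accepted = []
    for _ in range(n_draws):
        birth_prob = rng.random()  # draw from the Uniform(0, 1) prior
        simulated = simulate_births(birth_prob, len(observed), rng)
        if abs(sum(simulated) - sum(observed)) <= epsilon:
            accepted.append(birth_prob)
    return accepted


rng = random.Random(0)
observed = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0]  # made-up: 3 births in 12 years
posterior = rejection_abc(observed, epsilon=0, n_draws=20000, rng=rng)
print(len(posterior), sum(posterior) / len(posterior))
```

Most draws are thrown away here because every proposal comes straight from the prior, which is why this gets slow. ABC-SMC instead runs several generations with a shrinking epsilon and proposes new parameters near the ones accepted in the previous generation, so far fewer simulations are wasted.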

Here are some resources we should cite:

Missing years (?) in individual sighting histories

https://esajournals.onlinelibrary.wiley.com/action/downloadSupplement?doi=10.1002%2Fecs2.1796&file=ecs21796-sup-0001-AppendixS1.pdf

Here's some of the supplemental information from Arso Civil et al., 2017. It shows that, to deal with gaps in the sighting histories of individuals, they selected the parts of each individual's sighting history that were unbroken. The selection wasn't strict, though: like us, they assumed that an individual couldn't give birth in consecutive years, so a ? in the year right before or after an observed birth was treated as a 0 even if the dolphin wasn't sighted that year.
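
The gap rule described above could be sketched like this. The encoding (1 = birth observed, 0 = no birth, "?" = not sighted) and the function name are assumptions for illustration, not taken from the paper or from our data files.

```python
def fill_gaps_next_to_births(history):
    """Replace an unknown year ('?') with 0 when the year immediately before
    or after it has an observed birth (1). Other entries are left as they are."""
    filled = list(history)
    for i, value in enumerate(history):
        if value != "?":
            continue
        before = history[i - 1] if i > 0 else None
        after = history[i + 1] if i + 1 < len(history) else None
        if before == 1 or after == 1:
            filled[i] = 0
    return filled


history = [0, "?", 1, "?", "?", 0, 1, "?"]  # made-up birth history
print(fill_gaps_next_to_births(history))
# [0, 0, 1, 0, '?', 0, 1, 0]
```

The "?" in position 5 stays unknown because neither neighbouring year has an observed birth, so the no-consecutive-births assumption says nothing about it.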

Plausible models

Hey @rsullivan-lord,

I was able to get pyABC set up and working. It looks like it's able to get posteriors that make sense, which is good! Instead of predicting for the whole data set (i.e. 1980 to 2016), I've limited it to the period of study (i.e. 2005 to 2016), which should make the inference faster and more accurate.

I've only tried it for the first individual so far. Now I'm going to do inference for all individuals in the data set who were unobserved for at least one of the years in the study. Here's the model I'm thinking of, which is similar to what you've seen so far.

[Screen Shot 2019-04-05 at 9 35 14 PM]

The U term is some unobserved cause of birth. I have two models so far, one with this U term, and one without it. For the U term we'll use the year as a proxy (e.g. 0 for 2005, 1 for 2006, etc.). Do these make sense? Do you have other models you're thinking of?

Note: I'm not using Years Since Previous Birth (YSPB) as a predictor; I'm now only looking at whether or not the whale gave birth the previous year. The reason is partly computational: as I pointed out in #7, pyABC doesn't handle discrete variables well yet, and YSPB is discrete. It could be treated as a continuous variable, but I don't know how hard that would be to implement. Programming its replacement (whether the whale gave birth the year before), on the other hand, is easy. Plus, given the many unknown variables, I feel like removing it shouldn't make a big difference: with many unknown variables, we can fix one of them to some value and the others will adjust to fit the new constraint.
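
To make the difference between the two predictors concrete, here is a small sketch. The function names, the 0/1 birth encoding, and the choice to set the first year's lagged value to 0 are assumptions for illustration.

```python
def gave_birth_previous_year(births):
    """For each year, 1 if the whale gave birth the year before, else 0.
    The first year has no previous year on record, so it gets 0 here
    (an assumption made for this sketch)."""
    return [0] + list(births[:-1])


def years_since_previous_birth(births, initial_yspb):
    """For each year, the number of years since the most recent earlier birth.
    initial_yspb is the value for the first year and must be supplied,
    because births before the record starts are unknown."""
    result = []
    yspb = initial_yspb
    for gave_birth in births:
        result.append(yspb)
        yspb = 1 if gave_birth else yspb + 1
    return result


births = [0, 1, 0, 0, 1, 0]  # made-up history: 1 = gave birth that year
print(gave_birth_previous_year(births))       # [0, 0, 1, 0, 0, 1]
print(years_since_previous_birth(births, 3))  # [3, 4, 1, 2, 3, 1]
```

YSPB needs a starting value for the first year, which is the discrete prior that caused trouble in pyABC. The lagged indicator only depends on the previous year's simulated birth, so it avoids that prior.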

Here's what I'm thinking that's left for the computation side of things:

For each individual that has unobservables in 2005-2016, set priors for the 3 or so models that we're thinking of, then find posterior distributions. We then generate samples from the posterior distributions and keep the ones that align well enough with the observed data. Once we do this enough times, we can compute credible intervals for the birth rate for each year in 2005-2016. I think we'll be able to get this done this weekend, if not next.
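
The last step, turning posterior samples into per-year credible intervals, could look roughly like this. The `draws` data below is random placeholder data standing in for real posterior simulations, and the function name and the equal-tailed 95% interval are assumptions for illustration.

```python
import random


def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior samples, using
    nearest-rank empirical quantiles."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower_index = int(round(tail * last))
    upper_index = int(round((1.0 - tail) * last))
    return ordered[lower_index], ordered[upper_index]


rng = random.Random(1)
years = list(range(2005, 2017))

# Placeholder posterior draws (NOT real results): draws[d][y] is the
# simulated population birth rate for year index y in posterior draw d.
draws = [[rng.betavariate(3, 9) for _ in years] for _ in range(2000)]

intervals = {}
for y, year in enumerate(years):
    yearly_samples = [draw[y] for draw in draws]
    intervals[year] = credible_interval(yearly_samples)

for year in years:
    low, high = intervals[year]
    print(f"{year}: {low:.3f} to {high:.3f}")
```

In the real pipeline, each row of `draws` would come from simulating all individuals with parameters sampled from their posteriors and then computing the birth rate per year across the population.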
