bayesmadesimple's Introduction

Bayesian Statistics Made Simple

Allen Downey

Bayesian statistical methods are becoming more common, but there are not many resources to help beginners get started. People who know Python can use their programming skills to get a head start.

In this tutorial, I introduce Bayesian methods using grid algorithms, which help develop understanding and prepare you for MCMC, a powerful algorithm for real-world problems.

It is based on my book, Think Bayes, a class I teach at Olin College, and my blog, “Probably Overthinking It.”

Slides for this tutorial are here.
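
As a rough illustration of what a grid algorithm looks like, here is a minimal sketch in plain Python. It is not taken from the tutorial notebooks: the function name `grid_posterior`, the coin-flip setting, and the counts (140 heads, 110 tails) are all assumptions chosen for the example. The idea is to list candidate values of an unknown probability, weight each one by the likelihood of the data, and normalize.

```python
def grid_posterior(heads, tails, n_points=101):
    """Return (grid, posterior) for a coin's probability of heads.

    Assumes a uniform prior over n_points equally spaced values in [0, 1].
    """
    grid = [i / (n_points - 1) for i in range(n_points)]
    prior = [1.0] * n_points
    # Likelihood of the observed flips under each candidate value of p.
    likelihood = [p**heads * (1 - p)**tails for p in grid]
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


# Hypothetical data, used only for illustration.
grid, posterior = grid_posterior(heads=140, tails=110)

# Most probable grid value and the posterior mean.
best = max(zip(posterior, grid))[1]
mean = sum(p * w for p, w in zip(grid, posterior))
print(f"most probable p = {best:.2f}, posterior mean = {mean:.3f}")
```

A finer grid (a larger `n_points`) gives a closer approximation at the cost of more computation, which is one reason methods like MCMC become attractive for larger problems.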

Installation instructions

Note: Please try to install everything you need for this tutorial before you leave home!

To prepare for this tutorial, you have two options:

  1. Install Jupyter on your laptop and download my code from GitHub.

  2. Run the Jupyter notebooks on a virtual machine on Binder.

I'll provide instructions for both, but here's the catch: if everyone chooses Option 2, the wireless network might not be able to handle the load. So, I strongly encourage you to try Option 1 and only resort to Option 2 if you can't get Option 1 working.

Option 1A: If you already have Jupyter installed.

Code for this workshop is in a Git repository on GitHub.
You can download it in this zip file. When you unzip it, you should get a directory named BayesMadeSimple.

Or, if you have a Git client installed, you can clone the repo by running:

    git clone https://github.com/AllenDowney/BayesMadeSimple

It should create a directory named BayesMadeSimple.

To run the notebooks, you need Python 3 with Jupyter, NumPy, SciPy, matplotlib and Seaborn. If you are not sure whether you have those modules already, the easiest way to check is to run my code and see if it works.

You will also need a small library I wrote, called empyrical-dist. You can see it on PyPI and you can install it using pip:

    pip install empyrical-dist
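
If you would rather check for the modules before opening a notebook, a small script along these lines may help. This is only a sketch: the helper name `missing_modules` is hypothetical, and the import names in `required` are assumptions (for example, `notebook` for the classic Jupyter Notebook, and `empiricaldist` as the import name that corresponds to the `empyrical-dist` package on PyPI).

```python
import importlib.util


def missing_modules(names):
    """Return the names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages this tutorial needs.
required = ["notebook", "numpy", "scipy", "matplotlib", "seaborn", "empiricaldist"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Finding a module does not guarantee that it imports cleanly, so running the first notebook cell remains the real test.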

To start Jupyter, run:

    cd BayesMadeSimple
    jupyter notebook

Jupyter should launch your default browser or open a tab in an existing browser window. If not, the Jupyter server should print a URL you can use. For example, when I launch Jupyter, I get

    ~/BayesMadeSimple$ jupyter notebook
    [I 10:03:20.115 NotebookApp] Serving notebooks from local directory: /home/downey/BayesMadeSimple
    [I 10:03:20.115 NotebookApp] 0 active kernels
    [I 10:03:20.115 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
    [I 10:03:20.115 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

In this case, the URL is http://localhost:8888.
When you start your server, you might get a different URL. Whatever it is, if you paste it into a browser, you should see a home page with a list of the notebooks in the repository.

Click on 01_cookie.ipynb. It should open the first notebook for the tutorial.

Select the cell with the import statements and press "Shift-Enter" to run the code in the cell. If it works and you get no error messages, you are all set.

If you get error messages about missing packages, you can install the packages you need using your package manager, or try Option 1B and install Anaconda.

Option 1B: If you don't already have Jupyter.

I highly recommend installing Anaconda, which is a Python distribution that contains everything you need for this tutorial. It is easy to install on Windows, Mac, and Linux, and because it does a user-level install, it will not interfere with other Python installations.

Information about installing Anaconda is here.

Choose the Python 3.7 distribution.

After you install Anaconda, you can install the packages you need like this:

    conda install jupyter numpy scipy matplotlib seaborn
    pip install empyrical-dist

Or you can create a Conda environment just for the workshop, like this:

    cd BayesMadeSimple
    conda env create -f environment.yml
    conda activate BayesMadeSimple

Then go to Option 1A to make sure you can run my code.

Option 2: If Option 1 failed.

You can run my notebooks in a virtual machine on Binder. To launch the VM, press this button:

Binder

You should see a home page with a list of the files in the repository.

If you want to try the exercises, open 01_cookie.ipynb. You should be able to run the notebooks in your browser and try out the examples.

However, be aware that the virtual machine you are running is temporary.
If you leave it idle for more than an hour or so, it will disappear along with any work you have done.

Special thanks to the people who run Binder, which makes it easy to share and reproduce computation.

bayesmadesimple's People

Contributors

allendowney, colcarroll, hsm207, pleabargain

bayesmadesimple's Issues

Can't calculate credible intervals nor quantiles

When I try to do it, I get the following error:

NotImplementedError Traceback (most recent call last)
in
1 for i, b in enumerate(beliefs):
----> 2 print(b.mean(), b.credible_interval(0.9))

c:\users...\appdata\local\programs\python\python36-32\lib\site-packages\empiricaldist\empiricaldist.py in credible_interval(self, p)
716 tail = (1 - p) / 2
717 ps = [tail, 1 - tail]
--> 718 return self.quantile(ps)
719
720 @staticmethod

c:\users...\appdata\local\programs\python\python36-32\lib\site-packages\empiricaldist\empiricaldist.py in quantile(self, ps, **kwargs)
137 :return: float
138 """
--> 139 return self.make_cdf().quantile(ps, **kwargs)
140
141 def choice(self, *args, **kwargs):

c:\users...\appdata\local\programs\python\python36-32\lib\site-packages\empiricaldist\empiricaldist.py in inverse(self, **kwargs)
846 )
847
--> 848 interp = interp1d(self.ps, self.qs, **kwargs)
849 return interp
850

c:\users...\appdata\local\programs\python\python36-32\lib\site-packages\scipy\interpolate\interpolate.py in __init__(self, x, y, kind, axis, copy, bounds_error, fill_value, assume_sorted)
443 elif kind not in ('linear', 'nearest'):
444 raise NotImplementedError("%s is unsupported: Use fitpack "
--> 445 "routines for other types." % kind)
446 x = array(x, copy=self.copy)
447 y = array(y, copy=self.copy)

NotImplementedError: next is unsupported: Use fitpack routines for other types.
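
The traceback shows `credible_interval` splitting the leftover probability into two equal tails and asking for the quantiles at `tail` and `1 - tail`, with the installed SciPy's `interp1d` rejecting `kind='next'`. As a library-independent sketch of the same calculation, the following plain-Python version works on a dictionary that maps quantities to probabilities. The function names mirror the traceback, but this is not the library's implementation, and whether it matches `empiricaldist` results in every case is an assumption.

```python
def quantile(pmf, p):
    """Return the smallest quantity whose cumulative probability is at least p.

    pmf maps quantities to probabilities that sum to 1.
    """
    total = 0.0
    for q in sorted(pmf):
        total += pmf[q]
        if total >= p - 1e-12:  # small tolerance for floating-point error
            return q
    return max(pmf)


def credible_interval(pmf, p):
    """Central interval with probability p, using the tail split from the traceback."""
    tail = (1 - p) / 2
    return quantile(pmf, tail), quantile(pmf, 1 - tail)


# Example with a fair six-sided die.
d6 = {x: 1 / 6 for x in range(1, 7)}
print(credible_interval(d6, 0.9))
```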

add_dist failed at statement "twice = d6.add_dist(d6)" in 01_cookie.ipynb

The error:
AttributeError: 'Pmf' object has no attribute 'add_dist'

It looks like you have removed add_dist from the Pmf class in the latest version of empyrical_dist, and that seems to break the code in the notebook.

My questions/requests to you are:

  1. Did you change the module name between empyrical_dist and empiricaldist? Are they two different packages, or is one a later version of the other?
  2. If you have changed the name or code of the module, could you please make the necessary changes in the Jupyter notebook files as well, so that it is easier for people who are following your lecture on YouTube?

Unable to import empyrical-dist module

I installed the empyrical-dist module in a pipenv virtual environment, but when I try to run 01_cookie.ipynb, the first cell fails at "from empiricaldist import Pmf" with "ModuleNotFoundError: No module named 'empiricaldist'". Could you please look into this and confirm that this is not an issue with the package itself?

Incorrect solution: Dungeons & Dragons Bonus

First of all, thank you for putting on a great tutorial!

I believe I found a bad solution in the cookie notebook.

Bonus exercise: In Dungeons and Dragons, the amount of damage a goblin can withstand is the sum of two six-sided dice. The amount of damage you inflict with a short sword is determined by rolling one six-sided die.

Suppose you are fighting a goblin and you have already inflicted 3 points of damage. What is your probability of defeating the goblin with your next successful attack?

The provided solution is:

    d6 = Pmf()
    for x in [1,2,3,4,5,6]:
        d6[x] = 1
    d6.normalize()

    twice = d6.add_dist(d6)
    twice[2] = 0
    twice[3] = 0
    twice.normalize()

    >>> d6.ge_dist(twice)
    0.11111111111111109

This solution accounts for the 3 points of damage only by conditioning the posterior over the goblin's health on its not being 3 or less (it rules out totals of 2 and 3). It never subtracts the damage, so it compares a single d6 against a health total in [4, 12]. I believe this is not correct: after the blow, the goblin's remaining health must lie in the interval [1, 9], not [4, 12].

The correct solution, I believe, would be:

    d6 = Pmf()
    for x in [1,2,3,4,5,6]:
        d6[x] = 1
    d6.normalize()

    twice = d6.add_dist(d6)
    goblin_health = twice.copy()

    # 3 HP of damage already dealt:
    dmg3 = Pmf()
    dmg3[3] = 1.
    sword = d6.copy().add_dist(dmg3)

    >>> sword.ge_dist(goblin_health)
    0.5
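
Both numbers can be checked without the Pmf API by enumerating every combination of dice. The sketch below is an independent check under stated assumptions, not code from the notebook: the helper `prob` and the variable names are hypothetical. It reproduces the provided answer (1/9) and the proposed answer (1/2). It also computes a third variant that adds the 3 points and conditions on the goblin having survived the first blow; that variant is included only for comparison, since the exercise text above does not settle which reading is intended.

```python
from fractions import Fraction
from itertools import product


def prob(event, condition=lambda h, d: True):
    """P(event | condition) over goblin health h (2d6) and sword damage d (1d6)."""
    hits = total = 0
    for a, b, d in product(range(1, 7), repeat=3):
        h = a + b
        if condition(h, d):
            total += 1
            hits += event(h, d)  # True counts as 1
    return Fraction(hits, total)


# Provided solution: rule out health of 3 or less, compare the raw die to total health.
provided = prob(lambda h, d: d >= h, lambda h, d: h > 3)

# Proposed solution: add the 3 points already dealt, with no conditioning.
proposed = prob(lambda h, d: d + 3 >= h)

# Comparison variant (an assumption): add the 3 points and require h > 3.
combined = prob(lambda h, d: d + 3 >= h, lambda h, d: h > 3)

print(provided, proposed, combined)
```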

Include a `requirements.txt` file

To make it easier for attendees to install the necessary packages, it would be nice to include a requirements.txt, e.g.

    # requirements.txt
    scipy
    numpy
    matplotlib
    pandas

Attendees can then run pip install -r requirements.txt to get the required packages installed.
