
Comments (7)

tbuckl commented on July 22, 2024

The importance of versioning notebooks and their outputs is not limited to the "working" outputs. In fact, it may be more pressing for debugging. What's the best way to share errors in notebooks that need to be debugged? Recently, we committed to master a copy-and-paste of the errors from the Simulation notebook as a text file: BayAreaMetro@4f462d2

This is clearly not a good way to proceed. So what else might we do? Committing those changes on a temporary branch would probably be wiser, but is that the best way to work?

One related issue is that before the Simulation notebook is run, the Estimation notebook might (should?) be run, and this will produce outputs that change YAML files in "configs". This is especially confusing for a new user (especially one who is working with git). It seems that all of the changed YAML files in /configs/ should also be checked in so that Simulation can be debugged with them. However, the user might not know whether those configs were relevant to the Simulation bug.

from bayarea_urbansim.

fscottfoti commented on July 22, 2024

Generally speaking, I think notebooks are terrible for version control for all the obvious reasons. I've begun to use them just for development; polished outputs come from a straight Python script, e.g.:

https://github.com/synthicity/bayarea_urbansim/blob/master/Simulation.py

As for estimation, I do not think we should be running estimation before simulation every time. Estimation rarely needs to change once we get coefficients we believe in, and we just modify simulation inputs and rerun. It does make sense to test estimation on a regular basis to make sure it works, but I would then discard the results. That said, I have often wanted a feature that would check whether the coefficients agree to 1 or 2 decimal places and not update the YAML files if they are the same at that degree of precision.
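
The closeness check described above could be sketched roughly as follows. This is a minimal sketch, not an existing UrbanSim feature: the function names `coefficients_close` and `choose_coefficients`, and the representation of coefficients as a plain dict of floats, are assumptions for illustration. Reading and writing the YAML files themselves is left out.

```python
def coefficients_close(old, new, decimals=2):
    """Return True if both dicts have the same keys and every pair of
    coefficients differs by less than half a unit in the given decimal place."""
    if old.keys() != new.keys():
        return False
    tol = 0.5 * 10 ** -decimals
    return all(abs(old[name] - new[name]) < tol for name in old)


def choose_coefficients(old, new, decimals=2):
    """Keep the old coefficients when the new ones are effectively the same,
    so the config file on disk would not need to be rewritten."""
    return old if coefficients_close(old, new, decimals) else new
```

With a check like this, a re-estimation that only moves coefficients in the third decimal place would leave the configs untouched, so `git status` would stay clean after a routine estimation run.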


tbuckl commented on July 22, 2024

Thanks @fscottfoti. So it sounds like the best thing would be for us to share Python scripts when it comes to the input. What do you make of sharing the outputs? Should we just write the standard error and standard output as Simulation.stderr and Simulation.stdout and commit those? Also, should we commit these on a new branch each time? It seems that we never need to merge outputs back into master, but we would like to keep track of them and share them.
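
For concreteness, writing the two streams to files might look something like the sketch below. It uses only the Python standard library; the helper name `run_and_capture` and the `out_dir` parameter are made up for illustration, and the file names are the ones proposed above.

```python
import subprocess
import sys
from pathlib import Path


def run_and_capture(cmd, out_dir="."):
    """Run cmd, writing its standard output and standard error to
    Simulation.stdout and Simulation.stderr inside out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stdout_path = out_dir / "Simulation.stdout"
    stderr_path = out_dir / "Simulation.stderr"
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        result = subprocess.run(cmd, stdout=out, stderr=err)
    return result.returncode, stdout_path, stderr_path


# Hypothetical usage, run from the repository root:
# code, out_file, err_file = run_and_capture([sys.executable, "Simulation.py"])
```

The same result can be had from a shell with plain redirection; the Python version is only convenient if the run is already being driven from a script.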


fscottfoti commented on July 22, 2024

Makes sense. Honestly just making a gist seems like a good idea for some of these things. Saving and sharing the stdout on the outputs makes a lot of sense. I don't think that these are really version controlled though - I mean there's random noise every time you run so you can't really compare them. I mean you just run them and tag them with a date and some git hashes and just save them. I wonder if you could just make the output directory sync with Box and do it that way?
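
As a rough illustration of "tag them with a date and some git hashes", something like the following could name an output directory per run. It is only a sketch: the function names and the directory naming scheme are invented here, and the only external command used is `git rev-parse --short HEAD`.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def current_git_hash(repo_dir="."):
    """Short hash of HEAD in repo_dir, or "nogit" if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return "nogit"
    return result.stdout.strip()


def run_label(git_hashes, when=None):
    """Build a label such as <date>_<time>_<hash>_<hash> for one run."""
    when = when or datetime.now()
    return "_".join([f"{when:%Y-%m-%d_%H%M%S}", *git_hashes])


def make_run_dir(base_dir, git_hashes, when=None):
    """Create and return a fresh output directory named after the run label."""
    run_dir = Path(base_dir) / run_label(git_hashes, when)
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


# Hypothetical usage: one hash per repository involved in the run.
# run_dir = make_run_dir("runs", [current_git_hash(".")])
```

Since the directories are never compared or merged, they could live in a synced folder rather than in the repository.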


tbuckl commented on July 22, 2024

I can't speak to the random noise. @mkreilly, any thoughts on that?


tbuckl commented on July 22, 2024

That said, I will remove MetropolitanTransportationCommission@4f462d2 and put it on a branch with the configs.


tbuckl commented on July 22, 2024

This is how grumpy cat feels about random noise:
[image: how grumpy cat feels about noise]

