indaba-pracs-2019's People

Contributors

elanvb, jamesallingham, sasha115

indaba-pracs-2019's Issues

Clarity / comparability of methods

Sub issues:

  • Consistent starting x and y in the Banana function section #12
  • Update the SGD-with-decay hyper-parameters so that this method is somewhat competitive in the “Putting it all into practice” section #13
  • Fix code variable naming/notation to be consistent (the learning rate decay section differs from all the others) #14
  • Make pseudocode notation consistent (spacing around “*”, variable names, more pseudocode-like and less Python-like, etc.) #15

Clarity of descriptions / text

Sub issues:

  • Update language used to be more friendly to non-native English speakers #8
  • Ensure that main practical goals are clearly met #9
  • Tweak some descriptions for better clarity #10
  • Ensure all relevant terminology is introduced prior to it being needed (or at least link to a source) #11

Relevant terminology

Ensure all relevant terminology is introduced prior to it being needed (or at least link to a source).

Interactive batch size insight

It would be great to add an interactive cell where the students can play around with batch size and see how it affects the quality of the gradient estimate (and perhaps the effect on computation time).
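A minimal sketch of what such a cell could compute (a toy least-squares problem; all names here are illustrative, not the practical's actual code): for several batch sizes, compare the mini-batch gradient against the full-batch gradient and report the average error, which shrinks as the batch grows.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1024, 2))                 # toy inputs
    y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=1024)
    w = np.zeros(2)                                # current parameters

    def grad(X_b, y_b, w):
        """Gradient of the mean squared error 0.5 * mean((X_b @ w - y_b)**2)."""
        return X_b.T @ (X_b @ w - y_b) / len(y_b)

    full_grad = grad(X, y, w)
    for batch_size in [1, 8, 64, 512]:
        errors = []
        for _ in range(200):                       # repeat to average out sampling noise
            idx = rng.choice(len(y), batch_size, replace=False)
            errors.append(np.linalg.norm(grad(X[idx], y[idx], w) - full_grad))
        print(f"batch size {batch_size:4d}: mean gradient error {np.mean(errors):.4f}")

Wrapping the batch size in an ipywidgets slider would make this properly interactive, and timing each gradient computation would cover the computation-time angle as well.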

[main issue] Improve the optimisation practical

Sub issues:

  • Fix errors #2
    • Fix pseudocode (errors in Adam section)
  • Clarity of descriptions / text #3
    • Update language used to be more friendly to non-native English speakers
    • Ensure that main practical goals are clearly met
    • Tweak some descriptions for better clarity
    • Ensure all relevant terminology is introduced before it is needed (or at least link to a source)
  • Clarity / comparability of methods #4
    • Consistent starting x and y in the Banana function section
    • Update the SGD-with-decay hyper-parameters so that this method is somewhat competitive in the “Putting it all into practice” section
    • Fix code variable naming/notation to be consistent (the learning rate decay section differs from all the others)
    • Make pseudocode notation consistent (spacing around “*”, variable names, more pseudocode-like and less Python-like, etc.)
  • Increase intuition #5
    • Add more interactive elements
      • “what are gradients” section
      • seeing the effect of batch size in practice
  • Answers and solutions at the ready #6
  • Aesthetics #7
    • Enlarge the global minimum star in the plot
  • Add comments to the main code #28

Non-native English friendly language

There are words and phrases in the tutorial that I think will make it less accessible to non-native English speakers. A particular example is the metaphor used for gradient descent.

Aesthetics

  • Enlarge the "Minimum" star on the Banana plots so that we can see whether the "End" star is in the same position (see the sketch below)
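One possible fix, assuming the stars are drawn with matplotlib's scatter (the coordinates and colours here are placeholders; the Banana/Rosenbrock minimum really is at (1, 1)): draw the minimum star larger and underneath, so any overlap with the end point is visible.

    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter([1.0], [1.0], marker='*', s=600, c='gold', zorder=2,
               label='Minimum')                    # enlarged global-minimum star
    ax.scatter([1.0], [1.0], marker='*', s=150, c='red', zorder=3,
               label='End')                        # final iterate drawn on top
    ax.legend()
    plt.show()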

Add comments to the main code

TensorFlow can be confusing to those who have not seen much of it before. I think we should add some basic comments to the code to make sure people can easily follow what is going on.

For instance, seeing Tensor.assign_sub doesn't make it clear that we are assigning a new value to the tensor by subtracting the given value from its current value.

I think it would also be great to put the relevant pieces of mathematics as comments next to the code that computes them.
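A sketch of the level of commenting I have in mind (TF2-style syntax with a toy loss; the practical's actual variables will differ):

    import tensorflow as tf

    learning_rate = 0.1
    w = tf.Variable([2.0, 2.0])               # parameters theta

    with tf.GradientTape() as tape:
        loss = tf.reduce_sum(w ** 2)          # toy loss L(theta) = sum(theta_i^2)
    grads = tape.gradient(loss, w)            # dL/dtheta

    # Gradient descent update: theta <- theta - learning_rate * dL/dtheta.
    # assign_sub(x) updates the variable in place to (current value - x).
    w.assign_sub(learning_rate * grads)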

Interactive gradient visualisation

It would be great to add an interactive cell that gives a better feel for the "what are gradients" section. I think just recreating the image as an interactive plot would be excellent.
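A minimal version using ipywidgets (the function f(x) = x^2 is only an illustrative stand-in for whatever the image shows): a slider moves the point at which the tangent line, whose slope is the gradient, is drawn.

    import numpy as np
    import matplotlib.pyplot as plt
    from ipywidgets import interact

    def plot_tangent(x0=1.0):
        x = np.linspace(-3, 3, 200)
        slope = 2 * x0                            # gradient f'(x0) of f(x) = x^2
        plt.plot(x, x ** 2, label='f(x) = x^2')
        plt.plot(x, slope * (x - x0) + x0 ** 2, '--',
                 label=f'tangent, slope = {slope:.1f}')
        plt.scatter([x0], [x0 ** 2], zorder=3)    # the point being inspected
        plt.ylim(-2, 10)
        plt.legend()
        plt.show()

    interact(plot_tangent, x0=(-2.5, 2.5, 0.1))   # slider over x0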

Increase intuition

Sub issues:

Add more interactive elements

  • “what are gradients” section #16
  • seeing the effect of batch size in practice #17

Better SGD hyper-parameters

We should give SGD with decay better hyper-parameters so that this method is somewhat competitive in the “Putting it all into practice” section. Otherwise, it might look like this method can be ignored.

I have played around a bit, and I think a learning rate of 0.5 and a decay of 0.01 work quite well.

I was also thinking of adding the Momentum with decay example, because it is easy enough to leave as an exercise and it shows that many of these techniques can be combined. Hyper-parameter values that give decent performance in that case are a learning rate of 0.5, a momentum of 0.9, and a decay of 0.01.
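In code, the two suggested configurations would look something like this (assuming the common 1 / (1 + decay * t) learning-rate schedule; the practical's exact decay rule may differ):

    learning_rate, decay, momentum = 0.5, 0.01, 0.9

    def sgd_decay_step(w, grad, t):
        lr_t = learning_rate / (1 + decay * t)        # decayed learning rate
        return w - lr_t * grad

    def momentum_decay_step(w, velocity, grad, t):
        lr_t = learning_rate / (1 + decay * t)
        velocity = momentum * velocity - lr_t * grad  # accumulated velocity
        return w + velocity, velocity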

Better clarity

Some sections could be improved in terms of clarity. For instance, I feel the sentence order in the RMSProp section should be rearranged: we should first point out that a single learning rate is applied to the gradients of all the variables, and then mention that some parameters might not need to be changed as much as others, etc.
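For reference, the standard RMSProp step makes that ordering natural (names and default values here are illustrative): a running mean of squared gradients gives each parameter its own effective step size.

    import numpy as np

    def rmsprop_step(w, s, grad, learning_rate=0.01, rho=0.9, eps=1e-8):
        s = rho * s + (1 - rho) * grad ** 2                # running mean of squared gradients
        w = w - learning_rate * grad / (np.sqrt(s) + eps)  # per-parameter scaled step
        return w, s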

Answers and solutions at the ready

Sub issues:

  • We should have a list of answers for the (partner) exercises ready to give the tutors for this practical
  • We should have the coding sections done so that we can merge them in after the Indaba (and perhaps give those implementations to the tutors too, so that they can help the students)

Consistent variable names

Variable naming/notation should be consistent across all optimisation methods (the learning rate decay section uses different variable names to all other sections).

Fix errors

  • The pseudocode in the Adam section has errors: both "mixing_prop" and "beta" are used, but only "beta" should be used (see the corrected sketch below).
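For reference, the standard Adam update with "beta" used consistently (a sketch; the practical's surrounding variable names may differ):

    import numpy as np

    def adam_step(w, m, v, grad, t, learning_rate=0.001,
                  beta_1=0.9, beta_2=0.999, eps=1e-8):
        m = beta_1 * m + (1 - beta_1) * grad           # first-moment estimate
        v = beta_2 * v + (1 - beta_2) * grad ** 2      # second-moment estimate
        m_hat = m / (1 - beta_1 ** t)                  # bias correction, t starting at 1
        v_hat = v / (1 - beta_2 ** t)
        return w - learning_rate * m_hat / (np.sqrt(v_hat) + eps), m, v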

Consistent pseudocode

Pseudocode notation should be consistent (spacing around “*”, variable names, more pseudocode-like and less Python-like, etc.).

Main practical goals

I feel that some of the main goals are not met. Specifically:

  • An explicit definition of an optimiser is never given
  • An explicit description of how optimisers are used in deep learning is never given
  • I don't think students will leave with a proper understanding of how batch size affects optimisation (or the quality of the gradient) - related to #5 and [batch size interactive]
