indaba-pracs-2019's People

Contributors

elanvb, jamesallingham, sasha115

indaba-pracs-2019's Issues

Clarity / comparability of methods

Sub issues:

  • Consistent starting x and y in the Banana function section #12
  • Update the SGD-with-decay hyper-parameters so that this method is somewhat competitive in the “Putting it all into practice” section #13
  • Fix code variable naming/notation to be consistent (the learning rate decay section differs from all the others) #14
  • Make pseudocode notation consistent (spacing around “*”, variable names, more pseudocode-like and less Python-like, etc.) #15

Clarity of descriptions / text

Sub issues:

  • Update language used to be more friendly to non-native English speakers #8
  • Ensure that main practical goals are clearly met #9
  • Tweak some descriptions for better clarity #10
  • Ensure all relevant terminology is introduced prior to it being needed (or at least link to a source) #11

Relevant terminology

Ensure all relevant terminology is introduced prior to it being needed (or at least link to a source).

Interactive batch size insight

It would be great to add an interactive cell where the students can play around with batch size and see how it affects the quality of the gradient estimate (and perhaps the effect on computation time).
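A minimal sketch of what such a cell could compute (a toy least-squares problem; all names here are illustrative, not the practical's actual code): for several batch sizes, compare the mini-batch gradient against the full-batch gradient and report the average error, which shrinks as the batch grows.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1024, 2))                 # toy inputs
    y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=1024)
    w = np.zeros(2)                                # current parameters

    def grad(X_b, y_b, w):
        """Gradient of the mean squared error 0.5 * mean((X_b @ w - y_b)**2)."""
        return X_b.T @ (X_b @ w - y_b) / len(y_b)

    full_grad = grad(X, y, w)
    for batch_size in [1, 8, 64, 512]:
        errors = []
        for _ in range(200):                       # repeat to average out sampling noise
            idx = rng.choice(len(y), batch_size, replace=False)
            errors.append(np.linalg.norm(grad(X[idx], y[idx], w) - full_grad))
        print(f"batch size {batch_size:4d}: mean gradient error {np.mean(errors):.4f}")

Wrapping the batch size in an ipywidgets slider would make this properly interactive, and timing each gradient computation would cover the computation-time angle as well.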

[main issue] Improve the optimisation practical

Sub issues:

  • Fix errors #2
    • Fix pseudocode (errors in Adam section)
  • Clarity of descriptions / text #3
    • Update language used to be more friendly to non-native English speakers
    • Ensure that main practical goals are clearly met
    • Tweak some descriptions for better clarity
    • Ensure all relevant terminology is introduced before it is needed (or at least link to a source)
  • Clarity / comparability of methods #4
    • Consistent starting x and y in the Banana function section
    • Update the SGD-with-decay hyper-parameters so that this method is somewhat competitive in the “Putting it all into practice” section
    • Fix code variable naming/notation to be consistent (the learning rate decay section differs from all the others)
    • Make pseudocode notation consistent (spacing around “*”, variable names, more pseudocode-like and less Python-like, etc.)
  • Increase intuition #5
    • Add more interactive elements
      • “what are gradients” section
      • seeing the effect of batch size in practice
  • Answers and solutions at the ready #6
  • Aesthetics #7
    • Enlarge the global minimum star in the plot
  • Add comments to the main code #28

Non-native English friendly language

There are words and phrases in the tutorial that I think will make it less accessible to non-native English speakers. A particular example is the metaphor used for gradient descent.

Aesthetics

  • Enlarge the "Minimum" star on the Banana plots so that we can see whether the "End" star is in the same position (see the sketch below)
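One possible fix, assuming the stars are drawn with matplotlib's scatter (the coordinates and colours here are placeholders; the Banana/Rosenbrock minimum really is at (1, 1)): draw the minimum star larger and underneath, so any overlap with the end point is visible.

    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter([1.0], [1.0], marker='*', s=600, c='gold', zorder=2,
               label='Minimum')                    # enlarged global-minimum star
    ax.scatter([1.0], [1.0], marker='*', s=150, c='red', zorder=3,
               label='End')                        # final iterate drawn on top
    ax.legend()
    plt.show()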

Add comments to the main code

TensorFlow can be confusing to those who have not seen much of it before. I think we should add some basic comments to the code to make sure people can easily follow what is going on.

For instance, seeing Tensor.assign_sub doesn't make it clear that we are assigning a new value to the tensor by subtracting the given value from its current value.

I think it would also be great to put the relevant pieces of mathematics as comments next to the code that computes them.
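A sketch of the level of commenting I have in mind (TF2-style syntax with a toy loss; the practical's actual variables will differ):

    import tensorflow as tf

    learning_rate = 0.1
    w = tf.Variable([2.0, 2.0])               # parameters theta

    with tf.GradientTape() as tape:
        loss = tf.reduce_sum(w ** 2)          # toy loss L(theta) = sum(theta_i^2)
    grads = tape.gradient(loss, w)            # dL/dtheta

    # Gradient descent update: theta <- theta - learning_rate * dL/dtheta.
    # assign_sub(x) updates the variable in place to (current value - x).
    w.assign_sub(learning_rate * grads)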

Interactive gradient visualisation

It would be great to add an interactive cell that gives a better feel for the "what are gradients" section. I think just recreating the image as an interactive plot would be excellent.
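A minimal version using ipywidgets (the function f(x) = x^2 is only an illustrative stand-in for whatever the image shows): a slider moves the point at which the tangent line, whose slope is the gradient, is drawn.

    import numpy as np
    import matplotlib.pyplot as plt
    from ipywidgets import interact

    def plot_tangent(x0=1.0):
        x = np.linspace(-3, 3, 200)
        slope = 2 * x0                            # gradient f'(x0) of f(x) = x^2
        plt.plot(x, x ** 2, label='f(x) = x^2')
        plt.plot(x, slope * (x - x0) + x0 ** 2, '--',
                 label=f'tangent, slope = {slope:.1f}')
        plt.scatter([x0], [x0 ** 2], zorder=3)    # the point being inspected
        plt.ylim(-2, 10)
        plt.legend()
        plt.show()

    interact(plot_tangent, x0=(-2.5, 2.5, 0.1))   # slider over x0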

Increase intuition

Sub issues:

Add more interactive elements

  • “what are gradients” section #16
  • seeing the effect of batch size in practice #17

Better SGD hyper-parameters

We should give SGD with decay better hyper-parameters so that this method is somewhat competitive in the “Putting it all into practice” section. Otherwise, it might look like this method can be ignored.

I have played around a bit, and I think a learning rate of 0.5 and a decay of 0.01 work quite well.

I was also thinking of adding the Momentum with decay example, because it is easy enough to leave as an exercise and it shows that many of these techniques can be combined. Hyper-parameter values that give decent performance in that case are a learning rate of 0.5, a momentum of 0.9, and a decay of 0.01.
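In code, the two suggested configurations would look something like this (assuming the common 1 / (1 + decay * t) learning-rate schedule; the practical's exact decay rule may differ):

    learning_rate, decay, momentum = 0.5, 0.01, 0.9

    def sgd_decay_step(w, grad, t):
        lr_t = learning_rate / (1 + decay * t)        # decayed learning rate
        return w - lr_t * grad

    def momentum_decay_step(w, velocity, grad, t):
        lr_t = learning_rate / (1 + decay * t)
        velocity = momentum * velocity - lr_t * grad  # accumulated velocity
        return w + velocity, velocity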

Better clarity

Some sections could be improved in terms of clarity. For instance, I feel the sentence order in the RMSProp section should be rearranged: we should first point out that a single learning rate is applied to the gradients of all the variables, and then mention that some parameters might not need to be changed as much as others, etc.
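For reference, the standard RMSProp step makes that ordering natural (names and default values here are illustrative): a running mean of squared gradients gives each parameter its own effective step size.

    import numpy as np

    def rmsprop_step(w, s, grad, learning_rate=0.01, rho=0.9, eps=1e-8):
        s = rho * s + (1 - rho) * grad ** 2                # running mean of squared gradients
        w = w - learning_rate * grad / (np.sqrt(s) + eps)  # per-parameter scaled step
        return w, s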

Answers and solutions at the ready

Sub issues:

  • We should have a list of answers for the (partner) exercises ready to give the tutors for this practical
  • We should have the coding sections done so that we can merge them in after the Indaba (and perhaps give those implementations to the tutors too, so that they can help the students)

Consistent variable names

Variable naming/notation should be consistent across all optimisation methods (the learning rate decay section uses different variable names to all other sections).

Fix errors

  • The pseudocode in the Adam section has errors: both "mixing_prop" and "beta" are used, but only "beta" should be used (see the corrected sketch below).
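For reference, the standard Adam update with "beta" used consistently (a sketch; the practical's surrounding variable names may differ):

    import numpy as np

    def adam_step(w, m, v, grad, t, learning_rate=0.001,
                  beta_1=0.9, beta_2=0.999, eps=1e-8):
        m = beta_1 * m + (1 - beta_1) * grad           # first-moment estimate
        v = beta_2 * v + (1 - beta_2) * grad ** 2      # second-moment estimate
        m_hat = m / (1 - beta_1 ** t)                  # bias correction, t starting at 1
        v_hat = v / (1 - beta_2 ** t)
        return w - learning_rate * m_hat / (np.sqrt(v_hat) + eps), m, v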

Consistent pseudocode

Pseudocode notation should be consistent (spacing around “*”, variable names, more pseudocode-like and less Python-like, etc.).

Main practical goals

I feel that some of the main goals are not met. Specifically:

  • An explicit definition of an optimiser is never given
  • An explicit description of how optimisers are used in deep learning is never given
  • I don't think students will leave with a proper understanding of how batch size affects optimisation (or the quality of the gradient) - related to #5 and [batch size interactive]
