Comments (6)
This is cool. A small note that it's not just dense updates that we'd stop on - one could imagine other update types (e.g. "randomly set the values by drawing from a normal distribution with seed N" or "set all the values to 1") which are not dense per se (i.e. they don't involve storing explicit parameter values) but do involve setting all the parameter values while ignoring the previous values. It's probably best to distinguish between updates that are truly updates (i.e. they rely on modifying the previous state) and those that aren't, and just look for the first instance of the latter kind. As an aside, I think it's informative to think about a from-scratch training run - ideally the first commit would just say "add these parameter groups and initialize them in this way".
I had something else to say but I forgot; maybe I will think of it another time.
from git-theta.
Yeah, we definitely want to support stopping at other update types.
I think a recursive solution would handle that. A "true update" would look up the previous update type in git and call its `.get` (or whatever) method. If the previous one is also a "true update" it will continue the recursion. A "fake update" like dense or your random-value one would just return its values as-is and function as a base case, without needing to enumerate which update types overwrite all params.
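A toy sketch of that recursion; the class and method names (`DenseUpdate`, `SparseUpdate`, `get`) are illustrative assumptions, not git-theta's real API:

```python
class DenseUpdate:
    """A "fake" update: stores full values and ignores previous state (base case)."""

    def __init__(self, values):
        self.values = values

    def get(self):
        return list(self.values)


class SparseUpdate:
    """A "true" update: depends on whatever the previous update produced."""

    def __init__(self, previous, indices, values):
        self.previous = previous  # the prior update, as looked up from git
        self.indices = indices
        self.values = values

    def get(self):
        # Recurse; any update type that ignores history terminates the chain,
        # so nobody has to enumerate which update types overwrite all params.
        params = self.previous.get()
        for i, v in zip(self.indices, self.values):
            params[i] = v
        return params


base = DenseUpdate([1.0, 1.0, 1.0, 1.0])  # e.g. "set all the values to 1"
patched = SparseUpdate(base, indices=[2], values=[-3.0])
print(patched.get())  # [1.0, 1.0, -3.0, 1.0]
```

The nice property is that adding a new history-ignoring update type only requires it to return its own values; the recursion terminates there automatically.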
#84 was very close to implementing this approach; however, it seems like it is not possible to do this correctly when using git to time travel (`git checkout ${commit}`, `git checkout branch`, etc.).
The gist of it is that the code that looks back through the git history to rebuild a parameter needs to know where in the history to start looking. In something like a checkout, we only know the commit we are currently at; within the smudge filter there isn't a way to know what commit we are going to.
So basically the result is that whenever we time travel, we end up with the smudged model checkpoint of where we were, not where we wanted to be. Running `git reset --hard` fixes this, but we don't want to have to run that every time.
I talked with @nkandpa2 about this issue and neither of us found a way to fix it. Thus we took a lot of the ideas from how this implementation of updates worked and applied them to a file-system-based method of tracking and applying updates in #92.
I'm closing this as we don't think the git approach will work, but I'll leave the branch with the implementation on my fork as it may be useful to revisit in the future.
Re-opening this discussion. I can't remember if we talked about this solution, but why wouldn't it work to store the hash of HEAD in the metadata file at clean time?
For example:
- I stage a model for the first time and the hash of HEAD is 1. When it gets staged, the metadata file contains the key `"previous_commit": 1`. I commit this checkpoint and now the hash of HEAD is 2.
- I make a sparse update to the model and stage that. The staged metadata file contains `"previous_commit": 2`. I commit this checkpoint and now HEAD is 3.
- I make another sparse update to the model and stage it. The staged metadata file contains `"previous_commit": 3`. I commit this checkpoint and now HEAD is 4.
- I make as many other commits as I like.
Now say I run `git checkout 4`. The smudge filter reads the metadata file and loads up the files in commit 4. It sees that commit 4 was a sparse update, so it looks up the `"previous_commit"` key in the metadata file and recursively loads commit 3. Since 3 is also a sparse update, it looks up the key in its metadata file and recursively loads commit 2. Finally, commit 2 is a dense update, so we don't need to recurse any further.
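A stand-alone sketch of that chain. Real git-theta would read metadata out of git objects; here the history is a plain dict so the recursion over `"previous_commit"` links can run by itself, and the metadata layout is an assumption for illustration:

```python
# Toy commit history keyed by commit id; layout is hypothetical.
HISTORY = {
    2: {"update_type": "dense", "values": [0.0, 0.0, 0.0, 0.0]},
    3: {"update_type": "sparse", "previous_commit": 2,
        "indices": [1], "values": [5.0]},
    4: {"update_type": "sparse", "previous_commit": 3,
        "indices": [3], "values": [7.0]},
}


def smudge(commit):
    """Rebuild full parameter values at `commit` by following the chain."""
    metadata = HISTORY[commit]
    if metadata["update_type"] == "dense":
        # Base case: a dense update stores the full values directly.
        return list(metadata["values"])
    # Recursive case: rebuild the previous checkpoint, then overlay
    # this commit's sparse changes on top of it.
    params = smudge(metadata["previous_commit"])
    for i, v in zip(metadata["indices"], metadata["values"]):
        params[i] = v
    return params


print(smudge(4))  # [0.0, 5.0, 0.0, 7.0]
```

Because the starting commit comes from the metadata file itself rather than from HEAD, the smudge filter no longer needs to know "where we are going" during a checkout.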
Are there any issues with this solution?
We talked about this solution and it seems like it will work. This branch has some tools for getting files from the git history which should help in the multi-pointer PR too.
We can get this up and running once the multi-pointer branch is working with dense updates.
One of the main questions for us to explore is whether we will be able to track the last update directly or will need to iterate through history to find it, but either way it will work.
In the original git-tracks-updates implementation I occasionally had times where it was slow to re-build indices on something like a checkout. In the new format, the only file getting indexed is the main metadata file (not each parameter file), so it should be faster?
One question this does bring up is our tree-processing algorithms. Currently we essentially process the parameter tree depth-first, where each parameter is processed individually (which might involve moving backwards through the git history), and that could cause repeated work. It might be more efficient to collect all parameters that have changed in a batch and then go back in time once, updating each parameter as appropriate. But before a large refactor like that we should 1) test that it is actually an issue and 2) check whether memoizing our "get file from git history" function fixes whatever issue there is.
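The memoization idea could be as small as wrapping the history lookup in `functools.lru_cache`, so that several parameters walking back through the same commit only trigger one underlying read. The function below is a counting stand-in for the real git read, not git-theta's actual helper:

```python
import functools

CALLS = {"n": 0}  # counts how often the expensive lookup actually runs


@functools.lru_cache(maxsize=None)
def get_file_from_history(commit, path):
    # Stand-in for reading `path` at `commit` out of git; arguments must
    # be hashable for lru_cache to work, which commit ids and paths are.
    CALLS["n"] += 1
    return f"contents of {path} at {commit}"


# Three parameters that each walk back through the same commit
# only hit the underlying read once.
for _ in range(3):
    get_file_from_history("abc123", "model/metadata.json")
print(CALLS["n"])  # 1
```

If the cache alone removes the slowdown, the depth-first traversal can stay as-is and the batched refactor may not be needed.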
Closed by #114