Comments (5)
Depends on #46
from git-theta.
Took a first cut at this. Here's a sample for some feedback:
@craffel I think you said this depends on #46 because we want the diff tool to give some idea of what intermediate changes were applied between two versions of a checkpoint (e.g., between commit A and commit B there were two sparse updates and one dense update). As far as I understand, this will not be possible with the information that git provides to external diff tools. The interface git enforces for diff tools is that they take the following parameters: `diff-tool path old-file old-hash old-mode new-file new-hash new-mode`. Since there's no information about which commit the two versions come from, we won't be able to display the intermediate changes applied.
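To make the constraint concrete, here is a minimal sketch of how a diff tool might read the seven parameters listed above. The names `ExternalDiffArgs` and `parse_external_diff_args` are made up for illustration and are not part of git-theta; the sketch only covers the invocation shape described in this comment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExternalDiffArgs:
    """The seven positional parameters git hands to an external diff tool."""

    path: str
    old_file: str
    old_hash: str
    old_mode: str
    new_file: str
    new_hash: str
    new_mode: str


def parse_external_diff_args(argv: List[str]) -> ExternalDiffArgs:
    """Parse the arguments that follow the tool name (hypothetical helper).

    Only the first seven values are used. Note that none of them identifies
    the commits being compared, which is the limitation discussed above.
    """
    if len(argv) < 7:
        raise ValueError(f"expected at least 7 arguments, got {len(argv)}")
    return ExternalDiffArgs(*argv[:7])
```

A tool would call `parse_external_diff_args(sys.argv[1:])`; everything it learns about the two versions has to come from the two files and their blob hashes.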
from git-theta.
I think with the way I set up the parameter updates on the FS we can actually figure out the updates applied.
Basically, each update to a parameter is stored in a sub-dir that is named with the hash of the parameters for that update. So if we follow the metadata backpointers from `.../params/updates/${new-file["${param_name}"]["hash"]}` (or the parameter metadata file, which I think has duplicate information right now) until we hit `old-file["${param_name}"]["hash"]`, we should be able to tell what updates were applied.
I think this is dependent on the diff direction though? Like, we need the on-disk `.git_theta/...` dir to be the one for the most recent model. If the on-disk value is from an older commit we can't do this.
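A rough sketch of the backpointer walk, under loud assumptions: the on-disk layout is replaced by an in-memory dict keyed by update hash, and the `"previous"` and `"update_type"` keys are hypothetical stand-ins for whatever the real metadata stores. `updates_between` is not a git-theta function.

```python
from typing import Dict, List, Optional


def updates_between(
    new_hash: str,
    old_hash: str,
    metadata_by_hash: Dict[str, dict],
) -> Optional[List[str]]:
    """Follow backpointers from new_hash until old_hash is reached.

    Returns the update types applied, oldest first, or None when the chain
    does not lead back to old_hash (for example because the needed update
    directories are not on disk).
    """
    applied: List[str] = []
    seen = set()
    current: Optional[str] = new_hash
    while current is not None and current != old_hash:
        if current in seen:
            raise ValueError(f"cycle in backpointers at {current}")
        seen.add(current)
        entry = metadata_by_hash.get(current)
        if entry is None:
            # Metadata for this update is missing, so the walk cannot continue.
            return None
        applied.append(entry["update_type"])  # hypothetical key
        current = entry.get("previous")  # hypothetical backpointer key
    if current != old_hash:
        return None
    applied.reverse()
    return applied
```

Returning `None` covers the direction problem raised here: if the on-disk directory belongs to the older version, the walk starting at the newer hash finds no metadata and gives up.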
A demo FS dir can be found here https://github.com/r-three/git-theta/blob/3e8566a83b6ef1e5cc5a24dc41ad7e0d73b8a0d3/git_theta/updates/base.py
from git-theta.
Yeah, just took a look at the updates PR and this seems feasible for diff-ing versions that do not have a dense update between them (since the updates folder is purged upon a dense update). I guess I'll wait on implementing this feature until after #92 gets merged.
from git-theta.
Finished in #207
from git-theta.