Comments (4)
There was talk of using the metadata file to implement this, so that we avoid reading both the checkpoint file and the git-theta model dir. This is a great idea. I have a suggestion on the API/implementation for this, though.
Instead of looping through the model parameters and comparing them to the metadata file, we can make reading the git-theta model dir lazy. Instead of returning actual parameter values, it returns an object holding tensor metadata (shape, dtype, hash, and maybe even an `__eq__` function that knows how to hash/compare to a real tensor [I got myself excited with this lol]!), with methods to read the real tensor and replace itself with it.
With this API, comparisons between the checkpoint tree and the model dir tree would look just like any tree intersection instead of being custom.
It could be overkill: this is similar to an approach we have in t5x, where the main use of lazy arrays is checkpoint manipulation for huge models, which might not be a use case we care about now (but it could be useful for parameter-efficient update applications in the future).
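As a rough illustration of the "just a tree intersection" point above, here is a minimal sketch. It assumes both trees are nested dicts and that leaves compare with `==`. `LeafMeta`, `flatten`, and `changed_paths` are hypothetical names made up for this example and are not part of git-theta.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Tuple

Path = Tuple[str, ...]


@dataclass(frozen=True)
class LeafMeta:
    """Hypothetical lazy leaf: tensor metadata only, no parameter values."""
    shape: Tuple[int, ...]
    dtype: str
    hash: str


def flatten(tree: Dict[str, Any], prefix: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, leaf) pairs from a nested dict."""
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from flatten(value, prefix + (key,))
        else:
            yield prefix + (key,), value


def changed_paths(old_tree, new_tree) -> Tuple[List[Path], List[Path], List[Path]]:
    """Return (added, removed, modified) leaf paths between two trees."""
    old = dict(flatten(old_tree))
    new = dict(flatten(new_tree))
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # The old (lazy) leaf is on the left so that its comparison logic is used.
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, modified
```

Nothing in `changed_paths` is specific to lazy leaves; any leaf type with a suitable `__eq__` would work, which is the appeal of putting the comparison on the leaf object.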
from git-theta.
We talked in coworking and the main approach we decided on was to:
- Build the model_dict mapping parameter names to real tensor values
- Build a lazy_model_dict mapping parameter names to LazyArray values
  - This will be built from the previous metadata file, which may need to be retrieved with `git show HEAD:path/to/checkpoint`
- Iterate through the two versions together checking lazy == real (note lazy needs to be the LHS for now)
- Only write values when they don't evaluate to equal.
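The steps above could be sketched roughly as follows. This is only an illustration under assumptions: real tensors are replaced by raw `bytes`, `hash_bytes` stands in for `hash_array`, `LazyStub` is a cut-down placeholder for `LazyArray`, and `previous_metadata_text` and `params_to_write` are hypothetical helper names.

```python
import hashlib
import subprocess
from typing import Dict


def hash_bytes(value: bytes) -> str:
    """Hypothetical stand-in for hash_array, applied to raw bytes."""
    return hashlib.sha256(value).hexdigest()


class LazyStub:
    """Cut-down placeholder for LazyArray: keeps only the old value's hash."""

    def __init__(self, hash: str):
        self.hash = hash

    def __eq__(self, other) -> bool:
        if isinstance(other, LazyStub):
            return self.hash == other.hash
        return self.hash == hash_bytes(other)


def previous_metadata_text(path: str) -> str:
    """Return the committed version of `path` using `git show HEAD:<path>`."""
    result = subprocess.run(
        ["git", "show", f"HEAD:{path}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def params_to_write(lazy_model_dict: Dict[str, LazyStub],
                    model_dict: Dict[str, bytes]) -> Dict[str, bytes]:
    """Keep only parameters that are new or whose value changed."""
    to_write = {}
    for name, real in model_dict.items():
        lazy = lazy_model_dict.get(name)
        # lazy must be the left-hand side so that its __eq__ does the hashing.
        if lazy is None or not (lazy == real):
            to_write[name] = real
    return to_write
```

Parsing the text returned by `previous_metadata_text` into a `lazy_model_dict` is left out here because it depends on the metadata file format.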
LazyArray skeleton
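class LazyArray:
    """Tensor metadata (shape, dtype, hash) standing in for the real values."""

    def __init__(self, shape, dtype, hash):
        self._shape = shape
        self._dtype = dtype
        self._hash = hash

    @property
    def shape(self):
        return self._shape

    @property
    def dtype(self):
        return self._dtype

    @property
    def hash(self):
        return self._hash

    def __eq__(self, other):
        # Cheap metadata check first: a different shape or dtype means not equal.
        if self.shape != other.shape or self.dtype != other.dtype:
            return False
        if isinstance(other, LazyArray):
            return self.hash == other.hash
        # `other` is a real tensor: hash it and compare with the stored hash.
        # hash_array is not defined in this skeleton and must come from elsewhere.
        return self.hash == hash_array(other)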
Eventually we plan to extend the lazy array with the ability to load the value it represents.
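One possible shape for that extension, as a sketch only: the lazy object is given a zero-argument `loader` callable and caches whatever it returns on first use. The class name `LoadableLazyArray`, the `loader` parameter, and the `load`/`loaded` members are assumptions for illustration and are not taken from the plan above.

```python
from typing import Any, Callable, Tuple


class LoadableLazyArray:
    """Lazy array that can fetch the value it describes on demand."""

    def __init__(self, shape: Tuple[int, ...], dtype: str, hash: str,
                 loader: Callable[[], Any]):
        self.shape = shape
        self.dtype = dtype
        self.hash = hash
        self._loader = loader  # hypothetical: reads the tensor from storage
        self._value = None
        self._loaded = False

    @property
    def loaded(self) -> bool:
        """True once the real value has been read."""
        return self._loaded

    def load(self) -> Any:
        """Read the real value on first call, then return the cached copy."""
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value
```

Keeping the loader as a plain callable means the lazy object does not need to know where the value is stored.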
Note: #90 has landed. It converts the metadata file into a nested format, which should make creating the lazy arrays easier.
Note: The new multi-pointer approach (#114) will change how we solve this. That change also makes this important, as it allows us to skip piping parameters to git-lfs.