Comments (4)

blester125 commented on August 14, 2024

There was talk of using the metadata file to implement this, to avoid reading both the checkpoint file and the git-theta model dir. This is a great idea. I have a suggestion on the API/implementation for this, though.

Instead of something like looping through the model parameters and comparing them to the metadata file, we can make reading the git-theta model dir lazy. Instead of returning actual parameter values, it returns an object holding tensor metadata (shape, dtype, hash, and maybe even an __eq__ function that knows how to hash/compare to a real tensor [I got myself excited with this lol]!), with methods to read the real tensor and replace itself with it.

With this API, comparisons between the checkpoint tree and the model dir tree would look just like any tree intersection instead of being custom.
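A rough sketch of what such a tree intersection could look like, assuming both trees are nested dicts and that leaves support ==. The names flatten and changed_paths are made up for illustration and are not part of git-theta; plain strings stand in for lazy and real leaves here.

```python
def flatten(tree, prefix=()):
    """Yield (path, leaf) pairs from a nested dict."""
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def changed_paths(old_tree, new_tree):
    """Return sorted paths present in both trees whose leaves compare unequal.

    The old leaf is kept on the left-hand side of ==, so if it is a lazy
    object, its __eq__ decides the result.
    """
    old = dict(flatten(old_tree))
    new = dict(flatten(new_tree))
    shared = old.keys() & new.keys()
    return sorted(path for path in shared if not (old[path] == new[path]))


# Strings stand in for leaves; only layer1/b is in both trees and differs.
old = {"layer1": {"w": "hash-a", "b": "hash-b"}, "layer2": {"w": "hash-c"}}
new = {"layer1": {"w": "hash-a", "b": "hash-x"}, "layer3": {"w": "hash-d"}}
print(changed_paths(old, new))
```

Because the walk only relies on ==, the same function would work unchanged once the leaves of the old tree are lazy objects and the leaves of the new tree are real tensors.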

It could be overkill. It is similar to an approach we have in t5x, where the main use of lazy arrays is checkpoint manipulation for huge models, which might not be a use case we care about now (but could be useful for parameter-efficient update applications in the future).

from git-theta.

blester125 commented on August 14, 2024

We talked in coworking and the main approach we decided on was to:

  1. Build the model_dict mapping parameter names to real tensor values
  2. Build a lazy_model_dict mapping parameter names to LazyArray values
     • This will be built from the previous metadata file, which may need to be gotten through git show HEAD:path/to/checkpoint
  3. Iterate through the two versions together checking lazy == real (note lazy needs to be the LHS for now)
  4. Only write values when they don't evaluate to equal.
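The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: FakeTensor is a stand-in for a real framework tensor, hash_array is a hypothetical SHA-256 helper, the LazyArray here is a cut-down version of the skeleton, and params_to_write is a made-up name. None of these are taken from the git-theta code base.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTensor:
    """Stand-in for a framework tensor: shape, dtype, and raw bytes."""
    shape: tuple
    dtype: str
    data: bytes


def hash_array(tensor):
    # Hypothetical helper: hash the raw bytes of the tensor.
    return hashlib.sha256(tensor.data).hexdigest()


class LazyArray:
    """Cut-down skeleton: holds metadata only, never the tensor values."""

    def __init__(self, shape, dtype, hash):
        self.shape, self.dtype, self.hash = shape, dtype, hash

    def __eq__(self, other):
        if self.shape != other.shape or self.dtype != other.dtype:
            return False
        if isinstance(other, LazyArray):
            return self.hash == other.hash
        return self.hash == hash_array(other)


def params_to_write(lazy_model_dict, model_dict):
    """Return names of parameters that are new or differ from the old version."""
    changed = []
    for name, real in model_dict.items():
        lazy = lazy_model_dict.get(name)
        # lazy stays on the left-hand side so LazyArray.__eq__ is the one used.
        if lazy is None or not (lazy == real):
            changed.append(name)
    return changed


w = FakeTensor((2,), "float32", b"\x00" * 8)
b_old = FakeTensor((1,), "float32", b"\x00" * 4)
b_new = FakeTensor((1,), "float32", b"\x01" * 4)

# Step 2: the lazy dict would normally come from the previous metadata file.
previous = {
    name: LazyArray(t.shape, t.dtype, hash_array(t))
    for name, t in {"w": w, "b": b_old}.items()
}
# Step 1: the current checkpoint with real values.
current = {"w": w, "b": b_new, "c": w}

print(params_to_write(previous, current))  # ['b', 'c']
```

Here w is unchanged and is skipped, b has new contents, and c did not exist in the previous version, so only b and c would be written.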

LazyArray skeleton

class LazyArray:
  def __init__(self, shape, dtype, hash):
    self._shape = shape
    self._dtype = dtype
    self._hash = hash
    
  @property
  def shape(self):
    return self._shape
    
  @property
  def dtype(self):
    return self._dtype
    
  @property
  def hash(self):
    return self._hash
    
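  def __eq__(self, other):
    # Cheap metadata checks first: a different shape or dtype means not equal.
    if self.shape != other.shape or self.dtype != other.dtype:
      return False
    if isinstance(other, LazyArray):
      return self.hash == other.hash
    # other is a real tensor: hash it with hash_array (a helper that is not
    # defined in this skeleton) and compare against the stored hash.
    return self.hash == hash_array(other)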

Eventually we plan to extend the lazy array with the ability to load the value it represents.
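One possible shape for that extension, purely as a sketch: the lazy object takes a zero-argument loader callable and caches what it returns. The class name LoadableLazyArray, the loader argument, and the load method are assumptions for illustration, not the planned git-theta API, and a plain list stands in for the tensor value.

```python
class LoadableLazyArray:
    """Metadata is available up front; the value is read only on first use."""

    def __init__(self, shape, dtype, hash, loader):
        self.shape = shape
        self.dtype = dtype
        self.hash = hash
        self._loader = loader  # zero-argument callable returning the value
        self._value = None
        self._loaded = False

    def load(self):
        # Call the loader at most once and cache its result.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


calls = []


def read_from_disk():
    # Stand-in for reading a parameter from the model dir.
    calls.append("read")
    return [0.0, 1.0, 2.0]


lazy = LoadableLazyArray((3,), "float32", "abc123", read_from_disk)
first = lazy.load()
second = lazy.load()
print(first, len(calls))
```

Shape, dtype, and hash can be inspected without touching the loader, so comparisons stay cheap and the read only happens when a value is actually needed.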

blester125 commented on August 14, 2024

Note: #90 has landed. It converts the metadata file into a nested format, which should make the creation of the lazy array easier.

blester125 commented on August 14, 2024

Note: The new multi-pointer approach (#114) will change how we solve this. That change also makes this important, as it allows us to skip piping parameters to git-lfs.
