
lorax's Issues

Equinox support?

This looks neat! I'm just curious about supporting Equinox as a possible backend neural network library.

This is typically called as:

model = eqx.nn.MLP(...)
model(data)

but it can still be cast into an init/apply paradigm if you want:

init = eqx.nn.MLP
apply = eqx.nn.MLP.__call__

params = init(...)
apply(params, data)

cf. also this example

So I'm guessing this should be straightforward/elegant to support.
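The init/apply reading above can be sketched with a pure-Python stand-in (a toy dataclass, not `eqx.nn.MLP`, and no Equinox dependency — just the shape of the idea):

```python
from dataclasses import dataclass

# Toy stand-in for an Equinox-style module: the "params" ARE the module
# instance, and calling the instance is the pure apply function.
@dataclass
class MLP:
    weight: float
    bias: float

    def __call__(self, x):
        return self.weight * x + self.bias

# The same object viewed through an init/apply split:
init = MLP                 # init(...) -> params (the module itself)
apply = MLP.__call__       # apply(params, data) -> output

params = init(weight=2.0, bias=1.0)
out = apply(params, 3.0)   # identical to params(3.0)
```

Because the module instance doubles as the parameter pytree, any transform that expects `apply(params, data)` can be handed `MLP.__call__` directly.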

(I'll own up to the fact that I'm discussing compatibility with one of my own projects here!)

Predicting LoRA weights

I would like to use a separate neural network to predict LoRA weights for a main neural network, while training both networks at the same time. How can I manipulate the pytrees to achieve this, if it is possible at all?
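One way to picture this hypernetwork setup (a numpy sketch with hypothetical names, not lorax's API; in JAX the hypernetwork parameters and the frozen base weight would live in one pytree so a single optimizer step updates everything jointly):

```python
import numpy as np

rng = np.random.default_rng(0)

# Main-network weight (frozen) and LoRA rank.
d_in, d_out, rank = 8, 4, 2
W0 = rng.normal(size=(d_in, d_out))

def hypernet(theta, z):
    """Toy linear hypernetwork: maps a conditioning vector z to a flat
    vector that is reshaped into LoRA factors A (d_in x rank) and
    B (rank x d_out)."""
    flat = z @ theta
    A = flat[: d_in * rank].reshape(d_in, rank)
    B = flat[d_in * rank :].reshape(rank, d_out)
    return A, B

n_lora = d_in * rank + rank * d_out
theta = rng.normal(size=(3, n_lora)) * 0.01  # hypernetwork params (trainable)
z = rng.normal(size=(3,))                    # conditioning input

A, B = hypernet(theta, z)
x = rng.normal(size=(d_in,))
y = x @ (W0 + A @ B)                         # main net with predicted LoRA update
```

Gradients with respect to `theta` then flow through `A` and `B` into the main network's loss, which is what "training both at the same time" amounts to.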

How to use lorax in python3.8

Nice work!
I'd like to use lorax with Python 3.8, but qax-0.2.0 requires Python 3.10. Is there a way to use lorax with Python 3.8?
Please help!

ValueError: safe_zip() argument 2 is shorter than argument 1

Thank you for your work. However, I encountered a bug when running the simple example.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/media/wenbo/12T/manipulation_project/lorax/examples/simple.py", line 36, in <module>
    lora_model(lora_params, jnp.ones((dim,)))
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/qax/implicit/implicit_array.py", line 59, in implicit_f
    outs_flat = f_wrapped.call_wrapped(*flat_args)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/jax/_src/linear_util.py", line 192, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/media/wenbo/12T/manipulation_project/lorax/examples/simple.py", line 10, in model
    x = jax.nn.relu(x @ massive_w)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/jax/_src/numpy/array_methods.py", line 265, in deferring_binary_op
    return binary_op(*args)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/qax/implicit/implicit_array.py", line 302, in process_primitive
    outs = _default_handlers[primitive.name](primitive, *vals, params=params)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/qax/implicit/implicit_array.py", line 401, in _handle_pjit
    outs = primitive.bind(*subfuns, *flat_inputs, **bind_params)
ValueError: safe_zip() argument 2 is shorter than argument 1

Integration into EasyLM

Hi! Cool project. I wonder how hard it would be to integrate this library with something like @young-geng's EasyLM. That would make lorax really easy to use, since all the training would be handled by EasyLM.

How to save the parameters non-destructively?

Hi, thank you for this very useful library.
How do I save the LoRA parameters?
I need to implement training checkpointing, so I don't want to merge the parameters in a way that cannot be unmerged. When I run msgpack on the LoRA weights, I get an error: TypeError: can not serialize 'LoraWeight' object.
I guess it would be possible to export the dictionary as a regular PyTree of JAX arrays and then import the weights again as LoraWeight objects. However, I am not sure how to write this.
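The round-trip described above can be sketched with a stand-in dataclass (the real qax `LoraWeight` and its field names may differ, so treat this only as the shape of the idea, not the library's API):

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical stand-in for qax's LoraWeight.
@dataclass
class LoraWeight:
    w: np.ndarray  # frozen base weight
    a: np.ndarray  # LoRA factor A
    b: np.ndarray  # LoRA factor B

def to_plain(lw):
    """Export as a regular dict of arrays that msgpack/checkpointers accept."""
    return {"w": lw.w, "a": lw.a, "b": lw.b}

def from_plain(d):
    """Rebuild the LoraWeight without ever merging w with the a/b factors."""
    return LoraWeight(w=d["w"], a=d["a"], b=d["b"])

lw = LoraWeight(w=np.zeros((4, 4)), a=np.ones((4, 2)), b=np.ones((2, 4)))
restored = from_plain(to_plain(lw))
```

In JAX one would map these two conversions over the parameter tree (e.g. with `jax.tree_util`, treating each LoRA leaf specially) before and after serialization, so the merge is never performed.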

UserWarning in `qax`

Hello, I am testing lorax on my Flax model with jax 0.4.26, and I got a warning like this:

/home/zyh/anaconda3/envs/jax_v0.4.26/lib/python3.10/site-packages/qax/implicit/implicit_array.py:306: UserWarning: Primitive concatenate was not handled by class LoraWeight, so implicit args will be materialized.
  warnings.warn(f'Primitive {primitive.name} was not handled by class {implicit_cls.__name__}, so implicit args will be materialized.')

Will this behavior cause numerical errors in the computation? Thanks!
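For intuition on what "materialized" means here: the dense matrix is formed before the unhandled primitive runs. Up to floating-point rounding, the factored and materialized paths agree mathematically, so the cost is memory and compute rather than correctness. A numpy illustration (not qax's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 5))          # frozen base weight
A = rng.normal(size=(6, 2))          # LoRA factor A
B = rng.normal(size=(2, 5))          # LoRA factor B
x = rng.normal(size=(6,))

lazy = x @ W + (x @ A) @ B           # factored path a LoRA-aware handler uses
materialized = x @ (W + A @ B)       # dense path after materialization
```

Both expressions compute the same linear map; `np.allclose(lazy, materialized)` holds, with only tiny rounding differences between the two evaluation orders.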

Using lorax with class-based models

Great work with lorax and qax!
I am trying to add LoRA to a pretrained model of Octo.
The main model, OctoModel, is a class rather than a function. When I call lora_model = lorax.lora(model) on model = OctoModel.load_pretrained("hf://rail-berkeley/octo-base"), the output is a function instead of a class. All attributes of model are present in lora_model, but all of model's methods are lost (the methods associated with @struct.dataclass and with class OctoModel). Is there a way to use lorax and ensure that lora_model is a class with the same methods and attributes as model? While fine-tuning I hit a KeyError at train_step telling me that the model used should not be a function.
Thanks a lot!

P.S. I've used your example to edit the fine-tuning script, but model = FlaxGPT2LMHeadModel.from_pretrained('gpt2') also gives a function, not a class.
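One generic pattern for this situation (a pure-Python sketch with hypothetical names, not Octo's or lorax's actual API): wrap a pure apply function rather than the class instance, so the class keeps all of its methods and attributes while the transform only ever sees a function:

```python
import numpy as np

# Hypothetical stand-in for a class-based model (not Octo's real API).
class TinyModel:
    def __init__(self, name):
        self.name = name                 # class attributes survive untouched

    def apply(self, params, x):          # pure with respect to params
        return x @ params["w"]

model = TinyModel(name="demo")
params = {"w": np.arange(4.0).reshape(2, 2)}

# Wrap ONLY a pure function around the class's apply; a function-to-function
# transform (as lorax.lora appears to be) would then wrap apply_fn, leaving
# the class object itself intact.
def apply_fn(params, x):
    return model.apply(params, x)

# lora_apply = lorax.lora(apply_fn)      # hypothetical usage
out = apply_fn(params, np.ones(2))
```

The train step then calls the wrapped function with the (LoRA) params, while everything else continues to use `model` directly.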

lorax with haiku

Awesome work!

I'm trying to combine lorax with Haiku. However, given Haiku's design, I'm not sure whether lorax can work with it. I hope you can provide some assistance.

Here is the first part, which prepares a linear module.

import numbers
from typing import Union, Sequence
import haiku as hk
import jax.numpy as jnp
import numpy as np

class Linear(hk.Module):
  """Protein folding specific Linear module.

  This differs from the standard Haiku Linear in a few ways:
    * It supports inputs and outputs of arbitrary rank
    * Initializers are specified by strings
  """

  def __init__(self,
               num_output: Union[int, Sequence[int]],
               initializer: str = 'linear',
               num_input_dims: int = 1,
               use_bias: bool = True,
               bias_init: float = 0.,
               precision = None,
               name: str = 'linear'):
    """Constructs Linear Module.

    Args:
      num_output: Number of output channels. Can be tuple when outputting
          multiple dimensions.
      initializer: What initializer to use, should be one of {'linear', 'relu',
        'zeros'}
      num_input_dims: Number of dimensions from the end to project.
      use_bias: Whether to include trainable bias
      bias_init: Value used to initialize bias.
      precision: What precision to use for matrix multiplication, defaults
        to None.
      name: Name of module, used for name scopes.
    """
    super().__init__(name=name)
    if isinstance(num_output, numbers.Integral):
      self.output_shape = (num_output,)
    else:
      self.output_shape = tuple(num_output)
    self.initializer = initializer
    self.use_bias = use_bias
    self.bias_init = bias_init
    self.num_input_dims = num_input_dims
    self.num_output_dims = len(self.output_shape)
    self.precision = precision

  def __call__(self, inputs):
    """Connects Module.

    Args:
      inputs: Tensor with at least num_input_dims dimensions.

    Returns:
      output of shape [...] + num_output.
    """

    num_input_dims = self.num_input_dims

    if self.num_input_dims > 0:
      in_shape = inputs.shape[-self.num_input_dims:]
    else:
      in_shape = ()

    # get_initializer_scale is an external helper (AlphaFold-style), defined elsewhere
    weight_init = get_initializer_scale(self.initializer, in_shape)

    in_letters = 'abcde'[:self.num_input_dims]
    out_letters = 'hijkl'[:self.num_output_dims]

    weight_shape = in_shape + self.output_shape
    weights = hk.get_parameter('weights', weight_shape, inputs.dtype,
                               weight_init)

    equation = f'...{in_letters}, {in_letters}{out_letters}->...{out_letters}'

    output = jnp.einsum(equation, inputs, weights, precision=self.precision)

    if self.use_bias:
      bias = hk.get_parameter('bias', self.output_shape, inputs.dtype,
                              hk.initializers.Constant(self.bias_init))
      output += bias

    return output

These are some of my attempts; obtaining LoraWeight objects through lorax.init_lora went smoothly.

import jax

def _model(x, n_out):
    module = Linear(num_output=n_out, name='linear')
    return module(x)

model = hk.transform(_model)

n_in = 5000
n_out = 3000
dummy_x = jnp.ones((n_in,))
rng_key = jax.random.PRNGKey(42)
params = model.init(rng=rng_key, x=dummy_x, n_out=n_out)

import lorax
from lorax.constants import LORA_FREEZE, LORA_FULL

def decision_fn(path, param):
    if 'bias' in path:
        print(f'Fully finetuning param {path}')
        return LORA_FULL
    dim = 32
    print(f'Using LoRA with dim={dim} for param {path}')
    return dim

lora_spec = lorax.simple_spec(params, decision_fn=decision_fn, tune_vectors=True)
lora_params = lorax.init_lora(params, lora_spec, jax.random.PRNGKey(42))

However, obtaining the lora_model and using the lora_params did not work for me.

# code roughly like this, which fails
lora_model = lorax.lora(model)
out = lora_model(params=lora_params, x=dummy_x, rng=jax.random.PRNGKey(42), n_out=n_out)

Looking forward to your response!:)
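For readers hitting the same wall: what a lora-style transform does to a pure function can be sketched in plain numpy (all names below are illustrative stand-ins, not lorax internals; with Haiku, the function to wrap would likely be the pure apply produced by hk.transform rather than the Transformed object itself):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 5, 3, 2

def model(w, x):
    return x @ w                     # original function of (params, input)

# A lora-style transform conceptually replaces each frozen weight w with
# the triple (w, a, b) and rewrites x @ w as x @ w + (x @ a) @ b, without
# ever forming the dense update w + a @ b.
def lora_model(params, x):
    w, a, b = params
    return x @ w + (x @ a) @ b

w = rng.normal(size=(d_in, d_out))
a = rng.normal(size=(d_in, rank)) * 0.01
b = np.zeros((rank, d_out))          # b starts at zero: lora output == original

x = rng.normal(size=(d_in,))
y = lora_model((w, a, b), x)
```

With `b` initialized to zero the transformed function reproduces the original exactly, which is the usual LoRA starting point before fine-tuning.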
