
lorax's Issues

Equinox support?

This looks neat! I'm just curious about supporting Equinox as a possible backend neural network library.

This is typically called as:

model = eqx.nn.MLP(...)
model(data)

but it can still be cast into an init/apply paradigm if you want:

init = eqx.nn.MLP
apply = eqx.nn.MLP.__call__

params = init(...)
apply(params, data)

cf. also this example

So I'm guessing this should be straightforward/elegant to support.
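The init/apply reading above can be sketched with a pure-Python stand-in (a toy dataclass, not `eqx.nn.MLP`, and no Equinox dependency — just the shape of the idea):

```python
from dataclasses import dataclass

# Toy stand-in for an Equinox-style module: the "params" ARE the module
# instance, and calling the instance is the pure apply function.
@dataclass
class MLP:
    weight: float
    bias: float

    def __call__(self, x):
        return self.weight * x + self.bias

# The same object viewed through an init/apply split:
init = MLP                 # init(...) -> params (the module itself)
apply = MLP.__call__       # apply(params, data) -> output

params = init(weight=2.0, bias=1.0)
out = apply(params, 3.0)   # identical to params(3.0)
```

Because the module instance doubles as the parameter pytree, any transform that expects `apply(params, data)` can be handed `MLP.__call__` directly.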

(I'll own up to the fact that I'm discussing compatibility with one of my own projects here!)

Predicting LoRA weights

I would like to use a separate neural network to predict LoRA weights for a main neural network, while training both networks at the same time. How can I manipulate the pytrees to achieve this, if it is possible at all?
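One way to picture this hypernetwork setup (a numpy sketch with hypothetical names, not lorax's API; in JAX the hypernetwork parameters and the frozen base weight would live in one pytree so a single optimizer step updates everything jointly):

```python
import numpy as np

rng = np.random.default_rng(0)

# Main-network weight (frozen) and LoRA rank.
d_in, d_out, rank = 8, 4, 2
W0 = rng.normal(size=(d_in, d_out))

def hypernet(theta, z):
    """Toy linear hypernetwork: maps a conditioning vector z to a flat
    vector that is reshaped into LoRA factors A (d_in x rank) and
    B (rank x d_out)."""
    flat = z @ theta
    A = flat[: d_in * rank].reshape(d_in, rank)
    B = flat[d_in * rank :].reshape(rank, d_out)
    return A, B

n_lora = d_in * rank + rank * d_out
theta = rng.normal(size=(3, n_lora)) * 0.01  # hypernetwork params (trainable)
z = rng.normal(size=(3,))                    # conditioning input

A, B = hypernet(theta, z)
x = rng.normal(size=(d_in,))
y = x @ (W0 + A @ B)                         # main net with predicted LoRA update
```

Gradients with respect to `theta` then flow through `A` and `B` into the main network's loss, which is what "training both at the same time" amounts to.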

How to use lorax in python3.8

Nice work!
I'd like to use lorax with Python 3.8, but qax-0.2.0 requires Python 3.10. Is there a way to use lorax with Python 3.8?
Please help!

ValueError: safe_zip() argument 2 is shorter than argument 1

Thank you for your work. However, I encountered a bug when running the simple example.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/wenbo/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/media/wenbo/12T/manipulation_project/lorax/examples/simple.py", line 36, in <module>
    lora_model(lora_params, jnp.ones((dim,)))
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/qax/implicit/implicit_array.py", line 59, in implicit_f
    outs_flat = f_wrapped.call_wrapped(*flat_args)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/jax/_src/linear_util.py", line 192, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/media/wenbo/12T/manipulation_project/lorax/examples/simple.py", line 10, in model
    x = jax.nn.relu(x @ massive_w)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/jax/_src/numpy/array_methods.py", line 265, in deferring_binary_op
    return binary_op(*args)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/qax/implicit/implicit_array.py", line 302, in process_primitive
    outs = _default_handlers[primitive.name](primitive, *vals, params=params)
  File "/home/wenbo/Documents/data/miniconda3/envs/octo/lib/python3.10/site-packages/qax/implicit/implicit_array.py", line 401, in _handle_pjit
    outs = primitive.bind(*subfuns, *flat_inputs, **bind_params)
ValueError: safe_zip() argument 2 is shorter than argument 1

Integration into EasyLM

Hi! Cool project. I wonder how hard it would be to integrate this library with something like @young-geng's EasyLM. That would make lorax really easy to use, since all the training would be handled by EasyLM.

How to save the parameters non-destructively?

Hi, thank you for this very useful library.
How do I save the LoRA parameters?
I need to implement training checkpointing, so I don't want to merge the parameters in a way that cannot be unmerged. When I run msgpack on the LoRA weights, I get an error: TypeError: can not serialize 'LoraWeight' object.
I guess it would be possible to export the dictionary as a regular PyTree of JAX arrays and then import the weights again as LoraWeight objects. However, I am not sure how to write this.
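The round-trip described above can be sketched with a stand-in dataclass (the real qax `LoraWeight` and its field names may differ, so treat this only as the shape of the idea, not the library's API):

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical stand-in for qax's LoraWeight.
@dataclass
class LoraWeight:
    w: np.ndarray  # frozen base weight
    a: np.ndarray  # LoRA factor A
    b: np.ndarray  # LoRA factor B

def to_plain(lw):
    """Export as a regular dict of arrays that msgpack/checkpointers accept."""
    return {"w": lw.w, "a": lw.a, "b": lw.b}

def from_plain(d):
    """Rebuild the LoraWeight without ever merging w with the a/b factors."""
    return LoraWeight(w=d["w"], a=d["a"], b=d["b"])

lw = LoraWeight(w=np.zeros((4, 4)), a=np.ones((4, 2)), b=np.ones((2, 4)))
restored = from_plain(to_plain(lw))
```

In JAX one would map these two conversions over the parameter tree (e.g. with `jax.tree_util`, treating each LoRA leaf specially) before and after serialization, so the merge is never performed.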

UserWarning in `qax`

Hello, I am testing lorax on my Flax model with jax 0.4.26, and I got a warning like this:

/home/zyh/anaconda3/envs/jax_v0.4.26/lib/python3.10/site-packages/qax/implicit/implicit_array.py:306: UserWarning: Primitive concatenate was not handled by class LoraWeight, so implicit args will be materialized.
  warnings.warn(f'Primitive {primitive.name} was not handled by class {implicit_cls.__name__}, so implicit args will be materialized.')

Will this behavior cause numerical errors in the computation? Thanks!
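For intuition on what "materialized" means here: the dense matrix is formed before the unhandled primitive runs. Up to floating-point rounding, the factored and materialized paths agree mathematically, so the cost is memory and compute rather than correctness. A numpy illustration (not qax's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 5))          # frozen base weight
A = rng.normal(size=(6, 2))          # LoRA factor A
B = rng.normal(size=(2, 5))          # LoRA factor B
x = rng.normal(size=(6,))

lazy = x @ W + (x @ A) @ B           # factored path a LoRA-aware handler uses
materialized = x @ (W + A @ B)       # dense path after materialization
```

Both expressions compute the same linear map; `np.allclose(lazy, materialized)` holds, with only tiny rounding differences between the two evaluation orders.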

Using lorax with class-based models

Great work with lorax and qax!
I am trying to add LoRA to a pretrained model of Octo.
The main model, OctoModel, is a class rather than a function. When I call lora_model = lorax.lora(model) on model = OctoModel.load_pretrained("hf://rail-berkeley/octo-base"), the output is a function instead of a class. All attributes of model are present in lora_model, but all of model's methods are lost (the methods associated with @struct.dataclass and with class OctoModel). Is there a way to use lorax and ensure that lora_model is a class with the same methods and attributes as model? While fine-tuning I hit a KeyError at train_step telling me that the model used should not be a function.
Thanks a lot!

P.S. I've used your example to edit the fine-tuning script, but model = FlaxGPT2LMHeadModel.from_pretrained('gpt2') also gives a function, not a class.
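One generic pattern for this situation (a pure-Python sketch with hypothetical names, not Octo's or lorax's actual API): wrap a pure apply function rather than the class instance, so the class keeps all of its methods and attributes while the transform only ever sees a function:

```python
import numpy as np

# Hypothetical stand-in for a class-based model (not Octo's real API).
class TinyModel:
    def __init__(self, name):
        self.name = name                 # class attributes survive untouched

    def apply(self, params, x):          # pure with respect to params
        return x @ params["w"]

model = TinyModel(name="demo")
params = {"w": np.arange(4.0).reshape(2, 2)}

# Wrap ONLY a pure function around the class's apply; a function-to-function
# transform (as lorax.lora appears to be) would then wrap apply_fn, leaving
# the class object itself intact.
def apply_fn(params, x):
    return model.apply(params, x)

# lora_apply = lorax.lora(apply_fn)      # hypothetical usage
out = apply_fn(params, np.ones(2))
```

The train step then calls the wrapped function with the (LoRA) params, while everything else continues to use `model` directly.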

lorax with haiku

Awesome work!

I'm trying to combine lorax with Haiku. However, given Haiku's design, I'm not sure whether lorax can work with it. I hope you can provide some assistance.

Here is the first part, which prepares a linear module.

import numbers
from typing import Union, Sequence
import haiku as hk
import jax.numpy as jnp
import numpy as np

class Linear(hk.Module):
  """Protein folding specific Linear module.

  This differs from the standard Haiku Linear in a few ways:
    * It supports inputs and outputs of arbitrary rank
    * Initializers are specified by strings
  """

  def __init__(self,
               num_output: Union[int, Sequence[int]],
               initializer: str = 'linear',
               num_input_dims: int = 1,
               use_bias: bool = True,
               bias_init: float = 0.,
               precision = None,
               name: str = 'linear'):
    """Constructs Linear Module.

    Args:
      num_output: Number of output channels. Can be tuple when outputting
          multiple dimensions.
      initializer: What initializer to use, should be one of {'linear', 'relu',
        'zeros'}
      num_input_dims: Number of dimensions from the end to project.
      use_bias: Whether to include trainable bias
      bias_init: Value used to initialize bias.
      precision: What precision to use for matrix multiplication, defaults
        to None.
      name: Name of module, used for name scopes.
    """
    super().__init__(name=name)
    if isinstance(num_output, numbers.Integral):
      self.output_shape = (num_output,)
    else:
      self.output_shape = tuple(num_output)
    self.initializer = initializer
    self.use_bias = use_bias
    self.bias_init = bias_init
    self.num_input_dims = num_input_dims
    self.num_output_dims = len(self.output_shape)
    self.precision = precision

  def __call__(self, inputs):
    """Connects Module.

    Args:
      inputs: Tensor with at least num_input_dims dimensions.

    Returns:
      output of shape [...] + num_output.
    """

    num_input_dims = self.num_input_dims

    if self.num_input_dims > 0:
      in_shape = inputs.shape[-self.num_input_dims:]
    else:
      in_shape = ()

    # get_initializer_scale is an external helper (AlphaFold-style), defined elsewhere
    weight_init = get_initializer_scale(self.initializer, in_shape)

    in_letters = 'abcde'[:self.num_input_dims]
    out_letters = 'hijkl'[:self.num_output_dims]

    weight_shape = in_shape + self.output_shape
    weights = hk.get_parameter('weights', weight_shape, inputs.dtype,
                               weight_init)

    equation = f'...{in_letters}, {in_letters}{out_letters}->...{out_letters}'

    output = jnp.einsum(equation, inputs, weights, precision=self.precision)

    if self.use_bias:
      bias = hk.get_parameter('bias', self.output_shape, inputs.dtype,
                              hk.initializers.Constant(self.bias_init))
      output += bias

    return output

These are some of my attempts; obtaining LoraWeight objects through lorax.init_lora went smoothly.

import jax

def _model(x, n_out):
    module = Linear(num_output=n_out, name='linear')
    return module(x)

model = hk.transform(_model)

n_in = 5000
n_out = 3000
dummy_x = jnp.ones((n_in,))
rng_key = jax.random.PRNGKey(42)
params = model.init(rng=rng_key, x=dummy_x, n_out=n_out)

import lorax
from lorax.constants import LORA_FREEZE, LORA_FULL

def decision_fn(path, param):
    if 'bias' in path:
        print(f'Fully finetuning param {path}')
        return LORA_FULL
    dim = 32
    print(f'Using LoRA with dim={dim} for param {path}')
    return dim

lora_spec = lorax.simple_spec(params, decision_fn=decision_fn, tune_vectors=True)
lora_params = lorax.init_lora(params, lora_spec, jax.random.PRNGKey(42))

However, obtaining the lora_model and using the lora_params did not work for me.

# code roughly like this, which fails
lora_model = lorax.lora(model)
out = lora_model(params=lora_params, x=dummy_x, rng=jax.random.PRNGKey(42), n_out=n_out)

Looking forward to your response!:)
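For readers hitting the same wall: what a lora-style transform does to a pure function can be sketched in plain numpy (all names below are illustrative stand-ins, not lorax internals; with Haiku, the function to wrap would likely be the pure apply produced by hk.transform rather than the Transformed object itself):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 5, 3, 2

def model(w, x):
    return x @ w                     # original function of (params, input)

# A lora-style transform conceptually replaces each frozen weight w with
# the triple (w, a, b) and rewrites x @ w as x @ w + (x @ a) @ b, without
# ever forming the dense update w + a @ b.
def lora_model(params, x):
    w, a, b = params
    return x @ w + (x @ a) @ b

w = rng.normal(size=(d_in, d_out))
a = rng.normal(size=(d_in, rank)) * 0.01
b = np.zeros((rank, d_out))          # b starts at zero: lora output == original

x = rng.normal(size=(d_in,))
y = lora_model((w, a, b), x)
```

With `b` initialized to zero the transformed function reproduces the original exactly, which is the usual LoRA starting point before fine-tuning.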
