
Pre-trained Neural Network models in Axon (+ 🤗 Models integration)

License: Apache License 2.0

axon elixir nx hugging-face pre-trained machine-learning transformer

bumblebee's Introduction

Bumblebee


Bumblebee provides pre-trained Neural Network models on top of Axon. It includes integration with 🤗 Models, allowing anyone to download and perform Machine Learning tasks with a few lines of code.

Numbat and Bumblebees

Getting started

The best way to get started with Bumblebee is with Livebook. Our announcement video shows how to use Livebook's Smart Cells to perform different Neural Network tasks with a few clicks. You can then tweak the code and deploy it.

We also provide single-file examples of running Neural Networks inside your Phoenix (+ LiveView) apps in the examples/phoenix folder.

You may also check our official docs, which include notebooks and our API reference. The "Tasks" section in the sidebar covers high-level APIs for using Bumblebee. The remaining modules in the sidebar list all supported architectures.

Installation

First add Bumblebee and EXLA as dependencies in your mix.exs. EXLA is an optional dependency, but an important one, as it allows you to compile models just-in-time and run them on CPU/GPU:

def deps do
  [
    {:bumblebee, "~> 0.5.3"},
    {:exla, ">= 0.0.0"}
  ]
end

Then configure Nx to use the EXLA backend by default in your config/config.exs file:

import Config

config :nx, default_backend: EXLA.Backend

To use GPUs, you must set the XLA_TARGET environment variable accordingly.

In notebooks and scripts, use the following Mix.install/2 call to both install and configure dependencies:

Mix.install(
  [
    {:bumblebee, "~> 0.5.3"},
    {:exla, ">= 0.0.0"}
  ],
  config: [nx: [default_backend: EXLA.Backend]]
)
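
If you plan to use a GPU from a notebook, the XLA_TARGET environment variable mentioned above has to be set before EXLA is compiled. A minimal sketch, assuming a CUDA 11.8 setup (adjust the value to your installation; Mix.install/2 accepts a :system_env option for this):

Mix.install(
  [
    {:bumblebee, "~> 0.5.3"},
    {:exla, ">= 0.0.0"}
  ],
  config: [nx: [default_backend: EXLA.Backend]],
  # Illustrative target value; pick the one matching your CUDA version
  system_env: [{"XLA_TARGET", "cuda118"}]
)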

Usage

To get a sense of what Bumblebee does, look at this example:

{:ok, model_info} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

serving = Bumblebee.Text.fill_mask(model_info, tokenizer)
Nx.Serving.run(serving, "The capital of [MASK] is Paris.")
#=> %{
#=>   predictions: [
#=>     %{score: 0.9279842972755432, token: "france"},
#=>     %{score: 0.008412551134824753, token: "brittany"},
#=>     %{score: 0.007433671969920397, token: "algeria"},
#=>     %{score: 0.004957548808306456, token: "department"},
#=>     %{score: 0.004369721747934818, token: "reunion"}
#=>   ]
#=> }

We load the BERT model from the Hugging Face Hub, plug it into an end-to-end pipeline in the form of a "serving", and finally use the serving to get our task done. For more details check out the documentation.
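
In a long-running application (for example a Phoenix app), a common pattern is to start the serving under your supervision tree instead of calling Nx.Serving.run/2 directly. A minimal sketch, where the MyApp.BertServing name and the batch options are illustrative choices rather than something Bumblebee prescribes:

# In your application supervisor
children = [
  {Nx.Serving,
   serving: serving,
   name: MyApp.BertServing,
   batch_size: 8,
   batch_timeout: 100}
]

# Later, from any process in the application
Nx.Serving.batched_run(MyApp.BertServing, "The capital of [MASK] is Paris.")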

HuggingFace Hub

HuggingFace Hub is a platform hosting models, datasets and demo apps (Spaces), all using Git repositories (with Git LFS for large files). For further information check out the Hub documentation and explore the model repositories.

Models

Model repositories are regular Git repositories, therefore they can store arbitrary files. However, most repositories store models saved using the Python Transformers library. Bumblebee is an Elixir counterpart of Transformers and allows for importing those models, as long as they are implemented in Bumblebee.

A repository in the Transformers format does not store an actual model, only the trained parameters and a configuration file. The configuration file specifies the model type (e.g. BERT) and high-level properties, such as the number of layers and their size. The model implementation lives in the library code (both Transformers and Bumblebee). When loading a model, the library fetches the configuration and builds a matching model, then it fetches the trained parameters to pair them with the model. The key takeaway is that in order to use any given model, it needs to have an implementation in Bumblebee.
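
To make this flow concrete, here is a minimal sketch using Bumblebee's lower-level functions; Bumblebee.load_model/1 effectively wraps these steps for you:

# Fetch the configuration and build a matching Axon model (no parameters yet)
{:ok, spec} = Bumblebee.load_spec({:hf, "google-bert/bert-base-uncased"})
model = Bumblebee.build_model(spec)

# load_model/1 does the above and additionally fetches the trained parameters
{:ok, %{model: model, params: params, spec: spec}} =
  Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})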

Model repository

Here is a list of files commonly found in a repository following the Transformers format; a sketch of how Bumblebee loads them follows the list.

  • config.json - model configuration, specifies the model type and model-specific options. You can think of this as a blueprint for how the model should be constructed

  • pytorch_model.bin - raw model parameters (tensors) serialized from a PyTorch model using PyTorch format (supported by Bumblebee)

  • model.safetensors - raw model parameters (tensors) serialized from a PyTorch model using Safetensors (supported by Bumblebee)

  • flax_model.msgpack, tf_model.h5 - raw model parameters (tensors) serialized from Flax and Tensorflow models respectively (not supported by Bumblebee)

  • tokenizer.json, tokenizer_config.json - tokenizer configuration, describes how to convert text input to model inputs (tensors). See Tokenizer support

  • preprocessor_config.json - featurizer configuration, describes how to convert real-world input (image, audio) to model inputs (tensors)

  • generation_config.json - a set of configuration options specific to text generation, such as token sampling strategy and various constraints
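
Roughly speaking, each of these files is handled by a dedicated loading function in Bumblebee. A hedged sketch (which files a given repository actually contains varies, so some of these calls may not apply to every repository):

repo = {:hf, "google-bert/bert-base-uncased"}

# config.json + pytorch_model.bin / model.safetensors
{:ok, model_info} = Bumblebee.load_model(repo)

# tokenizer.json (and tokenizer_config.json)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)

# preprocessor_config.json, for image/audio models such as "microsoft/resnet-50"
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

# generation_config.json, for generative models such as "gpt2"
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})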

Model support

As pointed out above, in order to load a model, the given model type must be implemented in Bumblebee. To find out whether a model is supported, you can call Bumblebee.load_model({:hf, "model-repo"}) or use this tool to run a number of checks against the repository.

If you prefer to poke around the code, open the config.json file in the model repository and copy the class name under "architectures". Next, search the Bumblebee codebase for that keyword. If you find a match, the model is supported.

Also note that certain repositories include multiple models in separate subdirectories, for example stabilityai/stable-diffusion-2. In such cases, use Bumblebee.load_model({:hf, "model-repo", subdir: "..."}).
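
For instance, with Stable Diffusion each component lives in its own subdirectory and is loaded separately. A sketch (the subdirectory names below are the ones conventionally used by Stable Diffusion repositories):

repository_id = "stabilityai/stable-diffusion-2"

{:ok, text_encoder} = Bumblebee.load_model({:hf, repository_id, subdir: "text_encoder"})
{:ok, unet} = Bumblebee.load_model({:hf, repository_id, subdir: "unet"})
{:ok, scheduler} = Bumblebee.load_scheduler({:hf, repository_id, subdir: "scheduler"})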

Tokenizer support

The Transformers library distinguishes two types of tokenizer implementations:

  • "slow tokenizer" - a tokenizer implemented in Python and stored as tokenizer_config.json and a couple extra files

  • "fast tokenizer" - a tokenizer implemented in Rust and stored in a single file - tokenizer.json

Bumblebee relies on the Rust implementations (through bindings to Tokenizers) and therefore always requires the tokenizer.json file. Many repositories only include files for a "slow tokenizer". When you stumble upon such a repository, there are two options you can try.

First, if the repository is clearly a fine-tuned version of another model, you can look for tokenizer.json in the original model repository. For example, textattack/bert-base-uncased-yelp-polarity only includes tokenizer_config.json, but it is a fine-tuned version of bert-base-uncased, which does include tokenizer.json. Consequently, you can safely load the model from textattack/bert-base-uncased-yelp-polarity and the tokenizer from bert-base-uncased.
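
In code, that mix-and-match is just two separate loads (a sketch of the exact scenario above):

# Model weights from the fine-tuned repository...
{:ok, model_info} = Bumblebee.load_model({:hf, "textattack/bert-base-uncased-yelp-polarity"})

# ...tokenizer from the original base repository
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-uncased"})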

Otherwise, the Transformers library includes conversion rules to load a "slow tokenizer" and convert it to a corresponding "fast tokenizer", which is possible in most cases. You can generate the tokenizer.json file using this tool. Once successful, you can follow the steps to submit a PR adding tokenizer.json to the model repository. Note that you do not have to wait for the PR to be merged; instead, you can copy the commit SHA from the PR and load the tokenizer with Bumblebee.load_tokenizer({:hf, "model-repo", revision: "..."}).

License

Copyright (c) 2022 Dashbit

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

bumblebee's People

Contributors

benjamin-philip, blackeuler, bunste, cmeon, coderrg, connorrigby, dbernheisel, edennis, fhunleth, grzuy, haavars, ityonemo, jlxq0, joelpaulkoch, jonatanklosko, josevalim, kianmeng, linusdm, msluszniak, nyo16, preciz, rajrajhans, seanmor5, sitch, thiagopromano, wtedw


bumblebee's Issues

Got OOM message with RTX 3060

I've been trying to run Stable Diffusion on the GPU.

But it failed and I got an OOM message.

Is this error message due to insufficient GPU memory?
Is it possible to make it work by adjusting some parameters?
Stable Diffusion 1.4 runs on this GPU in a TensorFlow environment, so it would be nice if it worked with Bumblebee too.

It's working fine with :host. It's amazing how easy it is to use neural networks with Livebook!

OS: Ubuntu 22.04 on WSL2
GPU: RTX 3060 (12GB)
Livebook v0.8.0
Elixir v1.14.2
XLA_TARGET=cuda111
CUDA Version: 11.7

05:32:56.019 [info] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

05:32:56.023 [info] XLA service 0x7fb39437dac0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

05:32:56.023 [info]   StreamExecutor device (0): NVIDIA GeForce RTX 3060, Compute Capability 8.6

05:32:56.023 [info] Using BFC allocator.

05:32:56.023 [info] XLA backend allocating 10641368678 bytes on device 0 for BFCAllocator.

05:32:58.662 [info] Start cannot spawn child process: No such file or directory
05:34:00.234 [info] total_region_allocated_bytes_: 10641368576 memory_limit_: 10641368678 available bytes: 102 curr_region_allocation_bytes_: 21282737664

05:34:00.234 [info] Stats: 
Limit:                     10641368678
InUse:                      5530766592
MaxInUse:                   7566778624
NumAllocs:                        3199
MaxAllocSize:                399769600
Reserved:                            0
PeakReserved:                        0
LargestFreeBlock:                    0

05:34:00.234 [warn] **********___***********************************************************____________________________

05:34:00.234 [error] Execution of replica 0 failed: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 3546709984 bytes.
BufferAssignment OOM Debugging.
BufferAssignment stats:
             parameter allocation:    3.84GiB
              constant allocation:       144B
        maybe_live_out allocation:   768.0KiB
     preallocated temp allocation:    3.30GiB
  preallocated temp fragmentation:       304B (0.00%)
                 total allocation:    7.15GiB
              total fragmentation:   821.0KiB (0.01%)

The whole log is in oommessage.log.

Stable Diffusion, load previously downloaded model

Is it possible to load an already downloaded SD model using a file path, rather than downloading it from Huggingface using the repo name?

I think it would be a nice addition, especially since some people might already have a working Stable Diffusion setup and want to skip downloading the models again.

Support object detection

Note: I'm posting this issue at Sean Moriarity's (emailed) suggestion. However, I'm not at all sure what needs to be done here, let alone how. So, I'll just summarize the use case.

As discussed here, I'd like there to be a way to scan videos of technical presentations, extract the text and layout of slides, and generate corresponding Markdown files. Aside from making it possible to search the slides, this could help to make the presentations more accessible to blind and visually impaired users.

Let's assume that a video has been downloaded from a web site (e.g., via VLC media player) and that we can extract individual images from the resulting file (e.g., via Membrane).

On edited videos, these images will often contain regions showing the presenter, a slide, and assorted fill. Before we can process the slide (e.g., via Tesseract OCR), we need to extract it from the surrounding image. And, before we can do that, we need to determine its boundaries. According to Sean:

This is an object segmentation task. It's a task available in pre-trained models on HuggingFace like DETR -- which means we can certainly build the same functionality into Bumblebee. Object segmentation outlines the boundary region of an image as you describe, and then you can use that boundary region to do whatever you want.

As a side note, various related tasks will need to be addressed. For example, a production system should identify and handle duplicate images, dynamic content, embedded graphics, etc. It would also be nice to generate and incorporate transcriptions from the audio track (and a pony...).

Issue with .load_model/1 matching on :zip.unzip (bad_central_directory)

On Arch Linux, calling:

Bumblebee.load_model({:hf, "stanford-crfm/pubmedgpt"})

Throws

** (MatchError) no match of right hand side value: {:error, :bad_central_directory}                                              
    (bumblebee 0.1.2) lib/bumblebee/conversion/pytorch/loader.ex:29: Bumblebee.Conversion.PyTorch.Loader.load_zip!/1
    (bumblebee 0.1.2) lib/bumblebee/conversion/pytorch.ex:24: Bumblebee.Conversion.PyTorch.load_params!/4
    (bumblebee 0.1.2) lib/bumblebee.ex:399: Bumblebee.load_params/4
    (bumblebee 0.1.2) lib/bumblebee.ex:378: Bumblebee.load_model/2

The problem is with https://github.com/elixir-nx/bumblebee/blob/main/lib/bumblebee/conversion/pytorch/loader.ex#L29

Seems to be some sort of erlang decoding issue in: https://github.com/erlang/otp/blob/master/lib/stdlib/src/zip.erl

The file isn't corrupt, as I am able to unzip it using the Linux unzip binary. Also, the file size is 10GB, and I currently have >50GB of free RAM.

Unify model inputs

Currently there are some inputs applicable to most models (input embeds, head mask, position ids), but not all models accept them. We should add the missing inputs to align the models as much as possible.

CLIP models

We currently have ClipText; we should also add ClipVision. Then we can add a Clip model that combines both and make sure that loading parameters works across them.

Nx.LazyContainer not implemented for %Bumblebee.Diffusion.PndmScheduler

Hi!
I'm trying to run Stable Diffusion in Livebook without using the pre-built Nx.Serving setup (aka Bumblebee.Diffusion.StableDiffusion.text_to_image).

I'm getting stuck though because this call keeps failing:

{_state, new_latents} = Bumblebee.Diffusion.PndmScheduler.step(scheduler, scheduler_state, latents, noise_pred) 

where scheduler comes from {:ok, scheduler} = Bumblebee.load_scheduler({:hf, repository_id, subdir: "scheduler"})

The error I'm getting is:

** (Protocol.UndefinedError) protocol Nx.LazyContainer not implemented for %Bumblebee.Diffusion.PndmScheduler{num_train_steps: 1000, beta_schedule: :quadratic, beta_start: 8.5e-4, beta_end: 0.012, alpha_clip_strategy: :alpha_zero, timesteps_offset: 1, reduce_warmup: true} of type Bumblebee.Diffusion.PndmScheduler (a struct), data-structures given to defn/Nx must implement either Nx.LazyContainer or Nx.Container. This protocol is implemented for the following type(s): Any, Atom, Complex, Float, Integer, List, Map, Nx.Batch, Nx.Tensor, Tuple

I'm not sure why this doesn't work since it seems to be the same call that the text_to_image function uses internally. Is it because it's within Livebook? The failing notebook is here (last cell)

Support BigBird

I'm happy to give this a shot but if it's not far off something that exists and isn't too much effort... ;)

Validate model configuration

Since #55, we now have model configuration/docs as a data structure, so we can expand on that and use NimbleOptions for validation. One thing to keep in mind is that in our case the configuration is incremental: we want to re-configure rather than pass all the options, so we just need to merge things accordingly.

Perhaps we could generate most of the hf/transformers converters based on option types.

Support additional Stable Diffusion modes

As discussed in #111, Stable Diffusion supports a number of different modes.

Currently, only text-to-image is supported, but the other modes are considered in scope (verified by seanmor5).

Currently Supported

Currently Unsupported

As #111 was closed when the general support for Stable Diffusion 2 was added, it seemed appropriate to track these separately.

Huggingface model keeps downloading after stopping cell evaluation

I found that stopping the evaluation of a cell doesn't stop the model download from Huggingface, for a statement like:

Bumblebee.load_model({:hf, repository_id, subdir: "text_encoder"},
  log_params_diff: false
)

The network activity stops only when I completely shutdown the Livebook server.

Cannot set temperature, top-k, etc for GPT-2 models

Text generation tasks are very susceptible to repetition for anything longer than the shortest outputs. How can we set temperature, top-k or other parameters that are normally used to avoid this in GPT-2 output?


Unable to configure XLA_TARGET=cuda118 to use GPU

I've been trying to use XLA_TARGET=cuda118 within a livebook app.

I'm running from the Livebook GitHub repository (https://github.com/livebook-dev/livebook.git, my HEAD is 361455cd4eb1b527e6fd04d5c51f2901cbb4ed90).

I start it with XLA_TARGET=cuda118 MIX_ENV=prod mix phx.server.

I also tried setting XLA_TARGET=cuda118 inside the environment variables of the livebook settings.

No matter what I do, when I run an example Neural Network, it always prints out:

14:30:19.336 [info] TfrtCpuClient created.

If I'm targeting the GPU, I assume it would print out a different client, right?

Notebook dependencies and setup are:

Mix.install(
  [
    {:kino_bumblebee, "~> 0.1.0"},
    {:exla, "~> 0.4.1"}
  ],
  config: [nx: [default_backend: EXLA.Backend]]
)

Bumblebee seems to be the correct version:

Application.spec(:kino_bumblebee, :vsn)
'0.1.0'

Also:

Livebook v0.8.0
Elixir v1.14.2

I believe I have a supported version of the NVIDIA driver on the host:

$ nvidia-smi 
Fri Dec  9 14:37:59 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.56.06    Driver Version: 520.56.06    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   37C    P8     6W / 120W |     15MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     14189      G   ...xorg-server-1.20.14/bin/X        9MiB |
|    0   N/A  N/A     14217      G   ...hell-43.1/bin/gnome-shell        2MiB |
+-----------------------------------------------------------------------------+

Support Stable Diffusion 2

It looks like there are some minor things broken for Stable Diffusion 2.1:

** (RuntimeError) conversion failed, expected "attention_head_dim" to be a number, got: [5, 10, 20, 20]
    (bumblebee 0.1.0) lib/bumblebee/shared/converters.ex:20: anonymous fn/3 in Bumblebee.Shared.Converters.convert!/2
    (elixir 1.14.2) lib/enum.ex:2468: Enum."-reduce/3-lists^foldl/2-0-"/3
    (bumblebee 0.1.0) lib/bumblebee/shared/converters.ex:14: Bumblebee.Shared.Converters.convert!/2
    (bumblebee 0.1.0) lib/bumblebee/diffusion/unet_2d_conditional.ex:341: Bumblebee.HuggingFace.Transformers.Config.Bumblebee.Diffusion.UNet2DConditional.load/2
    (bumblebee 0.1.0) lib/bumblebee.ex:279: Bumblebee.load_spec/2
    (bumblebee 0.1.0) lib/bumblebee.ex:372: Bumblebee.load_model/2
    (stdlib 4.0.1) erl_eval.erl:744: :erl_eval.do_apply/7
    (stdlib 4.0.1) erl_eval.erl:492: :erl_eval.expr/6

Integrate tokenizers

We need a wrapper API for tokenizers to automatically handle pairs of sentences, batching, and generating the attention mask. Also, we should add an API for loading tokenizers, similarly to featurizers.

Simplify tokenizer modules

Currently each tokenizer module is the same, other than the default special tokens (see #141). Maybe we should kill the behaviour altogether, but we still need a place for the default special tokens.

This picture may change if we have a tokenizer that doesn't follow the current scheme exactly, such as whisper (#107).

Error compiling XLA

==> xla
Compiling 2 files (.ex)
Generated xla app
rm -f /home/livebook/.cache/xla_extension/tf-d5b57ca93e506df258271ea00fc29cf98383a374/tensorflow/compiler/xla/extension && \
	ln -s "/home/livebook/.cache/mix/installs/elixir-1.14.2-erts-12.3.2.2/1afae0bfefe756b720b6a2ccf0818979/deps/xla/extension" /home/livebook/.cache/xla_extension/tf-d5b57ca93e506df258271ea00fc29cf98383a374/tensorflow/compiler/xla/extension && \
	cd /home/livebook/.cache/xla_extension/tf-d5b57ca93e506df258271ea00fc29cf98383a374 && \
	bazel build --define "framework_shared_object=false" -c opt   --config=cuda //tensorflow/compiler/xla/extension:xla_extension && \
	mkdir -p /home/livebook/.cache/xla/0.4.2/cache/build/ && \
	cp -f /home/livebook/.cache/xla_extension/tf-d5b57ca93e506df258271ea00fc29cf98383a374/bazel-bin/tensorflow/compiler/xla/extension/xla_extension.tar.gz /home/livebook/.cache/xla/0.4.2/cache/build/xla_extension-x86_64-linux-gnu-cuda.tar.gz
ln: failed to create symbolic link '/home/livebook/.cache/xla_extension/tf-d5b57ca93e506df258271ea00fc29cf98383a374/tensorflow/compiler/xla/extension': No such file or directory
make: *** [Makefile:27: /home/livebook/.cache/xla/0.4.2/cache/build/xla_extension-x86_64-linux-gnu-cuda.tar.gz] Error 1
could not compile dependency :xla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile xla", update it with "mix deps.update xla" or clean it with "mix deps.clean xla"

Using the official Docker Livebook image with no modifications.

Tips for creating a `sentence-transformer` model?

First of all, awesome project! I'm looking forward to using many models from here.

One use case I'm excited to try out is semantic search over text and images based on SentenceTransformers. My first try was to export a model to ONNX and use it in Axon, but I ran into this issue mortont/axon_onnx#48. While I'm still trying to fix that I was wondering...

How do you actually create these models? Did you really implement them with the paper as a resource, or is there a tip on how to implement them? I'm a machine learning beginner. :-)

Support more text generation strategies

Currently Bumblebee.Text.Generation.generate/5 supports only a basic greedy strategy for selecting the next token. We should add options for more sophisticated strategies, in particular beam search and sampling.

Here is a good reference explaining the various options, and contrastive search is a more recent development.

Add support for HuggingFace Datasets

Datasets could be downloaded directly instead of via Bumblebee. However, the key need is support for Parquet. It is my understanding that Parquet is going to be the "standard" for HF datasets. I couldn't find an Elixir library that can explode Parquet.

Remap layer names

Ideally we should use any layer names we want and then have an explicit name/pattern mapping from hf/transformers names. This way we can keep the models consistent, and also share more parts of the transformer models (currently they often use different layer naming).

Support for stable diffusion >2.0

Stable Diffusion 1.5 works perfectly (might be worth updating the smart cell to point to 1.5 rather than 1.4), but 2.0 and 2.1 need some attention. (Looks like older models only supported one attention head dim and now the models have multiple?):

** (RuntimeError) conversion failed, expected "attention_head_dim" to be a number, got: [5, 10, 20, 20]
    (bumblebee 0.1.2) lib/bumblebee/shared/converters.ex:20: anonymous fn/3 in Bumblebee.Shared.Converters.convert!/2
    (elixir 1.14.2) lib/enum.ex:2468: Enum."-reduce/3-lists^foldl/2-0-"/3
    (bumblebee 0.1.2) lib/bumblebee/shared/converters.ex:14: Bumblebee.Shared.Converters.convert!/2
    (bumblebee 0.1.2) lib/bumblebee/diffusion/unet_2d_conditional.ex:341: Bumblebee.HuggingFace.Transformers.Config.Bumblebee.Diffusion.UNet2DConditional.load/2
    (bumblebee 0.1.2) lib/bumblebee.ex:279: Bumblebee.load_spec/2
    (bumblebee 0.1.2) lib/bumblebee.ex:372: Bumblebee.load_model/2
    #cell:25h4u7t3mfavivpqrylftjgs5u6ptd3t:10: (file)

To reproduce, just use the SD 1.4 smart cell and replace repository_id with "stabilityai/stable-diffusion-2-1".

`Bumblebee.Diffusion.VaeKl` could use a public `sample` method

When using Stable Diffusion to create an image-to-image model, the process is:
Image -> VAE encoder -> Posterior -> Sample from posterior to get latent -> Add noise to latent, etc.

Right now there's no public function to sample from the posterior that the VAE encoder outputs. It's not hard to write, but it would be nice to have, given that it's probably a common thing to do.

Side note - I'm probably wrong about this but is this correct? https://github.com/elixir-nx/bumblebee/blob/main/lib/bumblebee/diffusion/vae_kl.ex#L414

Shouldn't this be:

    posterior.std
    |> Axon.multiply(z)
    |> Axon.add(posterior.mean)

so that the mean isn't also being multiplied by z?

Unpickler load op bug

I have a bert-base model I fine-tuned on a token classification problem. I trained on the GPU, so I wonder if this is related to the MPS issue we had earlier. Anyway, loading the model I get:

** (FunctionClauseError) no function clause matching in Unpickler.load_op/2    
    
    The following arguments were given to Unpickler.load_op/2:
    
        # 1
        nil
    
        # 2
        %{
          memo: %{},
          metastack: [],
          object_resolver: #Function<2.26982889/1 in Bumblebee.Conversion.PyTorch.Loader.object_resolver>,
          persistent_id_resolver: #Function<5.26982889/1 in Bumblebee.Conversion.PyTorch.Loader.load_zip!/1>,
          refs: %{},
          stack: []
        }
    
    Attempted function clauses (showing 1 out of 1):
    
        defp load_op(<<opcode, rest::binary()>>, state)
    
    (unpickler 0.1.0) lib/unpickler.ex:236: Unpickler.load_op/2
    (bumblebee 0.1.0) lib/bumblebee/conversion/pytorch/loader.ex:37: Bumblebee.Conversion.PyTorch.Loader.load_zip!/1
    (bumblebee 0.1.0) lib/bumblebee/conversion/pytorch.ex:25: Bumblebee.Conversion.PyTorch.load_params!/4
    (bumblebee 0.1.0) lib/bumblebee.ex:318: Bumblebee.load_params/4
    (bumblebee 0.1.0) lib/bumblebee.ex:295: Bumblebee.load_model/2

Let me know if you want me to send you the models.

Load tokenizer special tokens

Currently our tokenizer implementations assume certain special tokens, like [PAD] for BERT; however, there are repos on the Hub that override them. Example:

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"})
Bumblebee.apply_tokenizer(tokenizer, "foo")

This fails, because the padding token is actually <pad>.

EXLA compile issue in Livebook

I'm messing around with the Stable Diffusion example, and I'm getting the following error in Livebook when trying to add the exla dependency:

could not compile dependency :exla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile exla", update it with "mix deps.update exla" or clean it with "mix deps.clean exla"

** (MatchError) no match of right hand side value: {"x86_64", "windows"}
...

Not sure what I could be doing wrong... any suggestions?

Document image format expectations for Bumblebee.Vision.ImageClassification

I would like to contribute some documentation that clarifies the expected image format to Bumblebee.Vision.image_classification. The type t:Bumblebee.Vision.image says:

@type image() :: Nx.Container.t()
A term representing an image.
Either Nx.Tensor in HWC order or a struct implementing Nx.Container and
resolving to such tensor.

However it does not clarify:

  • If the image should be resized first to the same size as that used to train the model (224 x 224 for the resnet models?)
  • Whether the image data should be {:u, 8} or some other type (some models suggest data should be in the range [0.0..1.0])
  • Whether the image can have an alpha layer (reading the code suggests yes, but perhaps that is model dependent)
  • Whether the image should be preprocessed (this Stack Overflow article suggests it should be)

If I can get some guidance I'll write a doc PR.

Unable to force build of Tokenizers

I'm attempting to get a Google Colab going that runs Livebook with Bumblebee to give developers easy access to a GPU. The crux is getting everything to work on Ubuntu 18.04 when all the precompiled binaries require a newer GLIBC.

Generally that hasn't been a problem for EXLA, but the Tokenizers library (compiled nif via Rustler) isn't behaving correctly.

While Rustler's force_build config option doesn't appear to work at all, setting the env variable TOKENIZERS_BUILD=true works perfectly to compile and launch Livebook, but when running within the livebook, it seems to revert to using the prebuilt binaries.

So, a call to Bumblebee.load_tokenizer results in:

** (UndefinedFunctionError) function Tokenizers.Native.from_file/1 is undefined (module Tokenizers.Native is not available)
    (tokenizers 0.2.0) Tokenizers.Native.from_file("/root/.cache/bumblebee/huggingface/hnu6qkd3fooybwwjvnddfafwua.ei2geojyhbrggy3dhfsggnlbmrqwgzbugazwgmbqmi2dombuhe3tmmjzgy2tiodghara")
    (bumblebee 0.1.2) lib/bumblebee/utils/tokenizers.ex:120: Bumblebee.Utils.Tokenizers.load!/1
    (bumblebee 0.1.2) lib/bumblebee/text/gpt2_tokenizer.ex:37: Bumblebee.HuggingFace.Transformers.Config.Bumblebee.Text.Gpt2Tokenizer.load/2
    (bumblebee 0.1.2) lib/bumblebee.ex:577: Bumblebee.load_tokenizer/2
    #cell:f6mspfr4bdyx57zlfg26jfl3c4fqc2gq:2: (file)

22:11:53.535 [warn] The on_load function for module Elixir.Tokenizers.Native returned:
{:error,
 {:load_failed,
  'Failed to load NIF library: \'/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28\' not found

Add Google Colab link?

Hey! I saw @josevalim's YouTube video announcing Bumblebee and noted that he didn't have easy access to a GPU for the demo.

I put together a Google Colab notebook that runs Livebook with Bumblebee and CUDA support here ->
https://github.com/lukegalea/LiveBook_GoogleColab/blob/main/Google_Colab_hosted_Elixir_LiveBook_%2B_BumbleeBee_on_GPU_(Stable_Diffusion_%2B_GPT_2)_v1_0.ipynb

If you've got Colab Pro+, you can assign a high-memory, high-performance GPU instance and have 52GB of RAM and 40GB of VRAM, enough to run something like GPT-J, etc.

Think it's worth linking to?

Error on Linux when attempting to load models from Hugging Face

When trying out image classification in Livebook, the models seem to fail to load. I am running an up-to-date install of Arch Linux with the latest versions of Erlang and Elixir installed through asdf.

hansihe:~/ $ asdf current
elixir          1.14.2-otp-25
erlang          25.1.2

I'm running the following full code snippet from Livebook:

Mix.install([
  {:bumblebee, "~> 0.1.0"},
  {:nx, "~> 0.4.1"},
  {:exla, "~> 0.4.1"},
  {:axon, "~> 0.3.1"},
  {:kino, "~> 0.8.0"}
])

Nx.global_default_backend(EXLA.Backend)

{:ok, resnet} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

:ok

This errors:

** (File.RenameError) could not rename from "/tmp/bumblebee_yb7uihnimydvsi55mie2rgydcyih322y" to "/home/hansihe/.cache/bumblebee/huggingface/45jmafnchxcbm43dsoretzry4i.eiztamryhfrtsnzzgjstmnrymq3tgyzzheytqmrzmm4dqnbshe3tozjsmi4tanjthera": cross-domain link
    (elixir 1.14.2) lib/file.ex:766: File.rename!/2
    (bumblebee 0.1.0) lib/bumblebee/huggingface/hub.ex:63: Bumblebee.HuggingFace.Hub.cached_download/2
    (bumblebee 0.1.0) lib/bumblebee.ex:250: Bumblebee.load_spec/2
    (bumblebee 0.1.0) lib/bumblebee.ex:372: Bumblebee.load_model/2
    #cell:a3yfrzehbpz4mpgcbs7lpry7b3sia35g:1: (file)

This seems to happen because we are trying to move a file from /tmp to /home, which are mounted on different filesystems:

hansihe:~/ $ mount
/dev/nvme0n1p3 on / type ext4 (rw,relatime)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,nr_inodes=1048576,inode64)
[...]
