h2oai / h2o-llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/

Home Page: https://gpt-gm.h2o.ai

License: Apache License 2.0

Makefile 0.63% Python 97.68% Shell 0.48% Dockerfile 0.21% Groovy 0.85% Gherkin 0.14%
ai chatbot chatgpt fine-tuning finetuning generative generative-ai gpt llama llama2

h2o-llmstudio's Introduction

Welcome to H2O LLM Studio, a framework and no-code GUI designed for
fine-tuning state-of-the-art large language models (LLMs).

(Screenshots: home and logs views of the GUI)


With H2O LLM Studio, you can

  • easily and effectively fine-tune LLMs without any coding experience.
  • use a graphical user interface (GUI) specially designed for large language models.
  • fine-tune any LLM using a large variety of hyperparameters.
  • use recent fine-tuning techniques such as Low-Rank Adaptation (LoRA) and 8-bit model training with a low memory footprint.
  • use Reinforcement Learning (RL) to fine-tune your model (experimental).
  • use advanced evaluation metrics to judge the answers generated by the model.
  • track and compare your model performance visually; Neptune integration is also available.
  • chat with your model and get instant feedback on its performance.
  • easily export your model to the Hugging Face Hub and share it with the community.

Quickstart

For questions, discussion, or just hanging out, come and join our Discord!

We offer several ways of getting started quickly.

Using the CLI for fine-tuning LLMs: quickstart notebooks are available on Kaggle and in Google Colab.

What's New

  • PR 592 Added KTOPairLoss for DPO modeling, allowing models to be trained with simple preference data. Data currently needs to be prepared manually by randomly matching positive and negative examples as pairs.
  • PR 592 Started deprecating RLHF in favor of DPO/IPO optimization. Training is disabled, but old experiments are still viewable. RLHF will be fully removed in a future release.
  • PR 530 Introduced a new problem type for DPO/IPO optimization. This optimization technique can be used as an alternative to RLHF.
  • PR 288 Introduced DeepSpeed for sharded training, allowing larger models to be trained on machines with multiple GPUs. Requires NVLink. This feature replaces FSDP and offers more flexibility. DeepSpeed requires a system installation of cudatoolkit; we recommend version 11.8. See Recommended Install.
  • PR 449 Introduced a new problem type for Causal Classification Modeling that allows training binary and multiclass models using LLMs.
  • PR 364 User secrets are now handled more securely and flexibly. Support for handling secrets using the 'keyring' library was added. Existing user settings are migrated automatically where possible.

Please note that due to the current rapid development we cannot guarantee full backwards compatibility of new functionality. We therefore recommend pinning the framework version to the one you used for your experiments. To reset, please delete or back up your data and output folders.

Setup

H2O LLM Studio requires a machine with Ubuntu 16.04+ and at least one recent Nvidia GPU with Nvidia drivers version >= 470.57.02. For larger models, we recommend at least 24GB of GPU memory.

For more information about installation prerequisites, see the Set up H2O LLM Studio guide in the documentation.

For a performance comparison of different GPUs, see the H2O LLM Studio performance guide in the documentation.

Recommended Install

The recommended way to install H2O LLM Studio is using pipenv with Python 3.10. To install Python 3.10 on Ubuntu 16.04+, execute the following commands:

System installs (Python 3.10)

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10
sudo apt-get install python3.10-distutils
curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10

Installing NVIDIA Drivers (if required)

If deploying on a 'bare metal' machine running Ubuntu, you may need to install the required Nvidia drivers and CUDA. The following commands show how to retrieve the latest drivers for a machine running Ubuntu 20.04 as an example. Adjust them based on your OS.

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2004-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

Alternatively, you can install the CUDA toolkit in a conda environment:

conda create -n llmstudio python=3.10
conda activate llmstudio
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit

Create virtual environment (pipenv)

The following command will create a virtual environment using pipenv and install the dependencies:

make setup

If you are having trouble installing the flash_attn package, consider running

make setup-no-flash

instead. This will install the dependencies without the flash_attn package. Note that this disables Flash Attention 2, so model training will be slower and consume more memory.

Using requirements.txt

If you wish to use conda or another virtual environment, you can also install the dependencies using the requirements.txt file:

pip install -r requirements.txt
pip install flash-attn==2.5.5 --no-build-isolation  # optional for Flash Attention 2
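
If you want a quick sanity check after installation (a minimal sketch, not part of the official setup), the following Python snippet confirms that CUDA is visible to PyTorch and that the optional flash_attn package can be imported:

import torch

print("CUDA available:", torch.cuda.is_available())
try:
    import flash_attn  # optional; only needed for Flash Attention 2
    print("flash-attn version:", flash_attn.__version__)
except ImportError:
    print("flash-attn not installed; Flash Attention 2 will be unavailable")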

Run H2O LLM Studio GUI

You can start H2O LLM Studio using the following command:

make llmstudio

This command will start the H2O wave server and app. Navigate to http://localhost:10101/ (we recommend using Chrome) to access H2O LLM Studio and start fine-tuning your models!

If you are running H2O LLM Studio with a custom environment other than Pipenv, you need to start the app as follows:

H2O_WAVE_APP_ADDRESS=http://127.0.0.1:8756 \
H2O_WAVE_MAX_REQUEST_SIZE=25MB \
H2O_WAVE_NO_LOG=true \
H2O_WAVE_PRIVATE_DIR="/download/@output/download" \
wave run app

Run H2O LLM Studio GUI using Docker from a nightly build

Install Docker first by following the instructions from NVIDIA Containers. Make sure to have nvidia-container-toolkit installed on your machine as outlined in the instructions.

H2O LLM Studio images are stored in the h2oai GCR vorvan container repository.

mkdir -p `pwd`/data
mkdir -p `pwd`/output

# make sure to pull the latest image if you still have a prior version cached
docker pull gcr.io/vorvan/h2oai/h2o-llmstudio:nightly

# run the container
docker run \
    --runtime=nvidia \
    --shm-size=64g \
    --init \
    --rm \
    -u `id -u`:`id -g` \
    -p 10101:10101 \
    -v `pwd`/data:/workspace/data \
    -v `pwd`/output:/workspace/output \
    -v ~/.cache:/home/llmstudio/.cache \
    gcr.io/vorvan/h2oai/h2o-llmstudio:nightly

Navigate to http://localhost:10101/ (we recommend using Chrome) to access H2O LLM Studio and start fine-tuning your models!

(Note: other helpful Docker commands are docker ps and docker kill.)

Run H2O LLM Studio GUI by building your own Docker image

docker build -t h2o-llmstudio .

mkdir -p `pwd`/data
mkdir -p `pwd`/output

docker run \
    --runtime=nvidia \
    --shm-size=64g \
    --init \
    --rm \
    -u `id -u`:`id -g` \
    -p 10101:10101 \
    -v `pwd`/data:/workspace/data \
    -v `pwd`/output:/workspace/output \
    -v ~/.cache:/home/llmstudio/.cache \
    h2o-llmstudio

Alternatively, you can run H2O LLM Studio GUI by using our self-hosted Docker image available here.

Run H2O LLM Studio with command line interface (CLI)

You can also use H2O LLM Studio with the command line interface (CLI) by specifying a configuration .yaml file that contains all the experiment parameters. To fine-tune using H2O LLM Studio with the CLI, activate the pipenv environment by running make shell, and then use the following command:

python train.py -Y {path_to_config_yaml_file}

To run on multiple GPUs in DDP mode, run the following command:

bash distributed_train.sh {NR_OF_GPUS} -Y {path_to_config_yaml_file}

By default, the framework will run on the first k GPUs. If you want to run on specific GPUs, set the CUDA_VISIBLE_DEVICES environment variable before the command, as shown below.
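
For example, to run distributed training on the second and third GPU of a machine (the device indices and config path are illustrative):

CUDA_VISIBLE_DEVICES=1,2 bash distributed_train.sh 2 -Y {path_to_config_yaml_file}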

To start an interactive chat with your trained model, use the following command:

python prompt.py -e {experiment_name}

where experiment_name is the output folder of the experiment you want to chat with (see configuration). The interactive chat will also work with models that were fine-tuned using the UI.

To publish the model to Hugging Face, use the following command:

make shell 

python publish_to_hugging_face.py -p {path_to_experiment} -d {device} -a {api_key} -u {user_id} -m {model_name} -s {safe_serialization}

  • path_to_experiment is the output folder of the experiment.
  • device is the target device for running the model, either 'cpu' or 'cuda:0'. Default is 'cuda:0'.
  • api_key is the Hugging Face API key. It can be omitted if the user is logged in.
  • user_id is the Hugging Face user ID. It can be omitted if the user is logged in.
  • model_name is the name of the model to be published on Hugging Face. It can be omitted.
  • safe_serialization is a flag indicating whether safe serialization should be used. Default is True.
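
For example (illustrative values; the experiment path and model name are placeholders, and api_key/user_id are omitted on the assumption that you are already logged in to Hugging Face):

python publish_to_hugging_face.py -p output/user/experiment_name -d cuda:0 -m my-fine-tuned-model -s True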

Data format and example data

For details on the data format required when importing your data or example data that you can use to try out H2O LLM Studio, see Data format in the H2O LLM Studio documentation.

Training your model

With H2O LLM Studio, training your large language model is easy and intuitive. First, upload your dataset and then start training your model. Start by creating an experiment. You can then monitor and manage your experiment, compare experiments, or push the model to Hugging Face to share it with the community.

Example: Run on OASST data via CLI

As an example, you can run an experiment on the OASST data via CLI. For instructions, see Run an experiment on the OASST data guide in the H2O LLM Studio documentation.

Model checkpoints

All open-source datasets and models are posted on H2O.ai's Hugging Face page and our H2OGPT repository.

Documentation

Detailed documentation and frequently asked questions (FAQs) for H2O LLM Studio can be found at https://docs.h2o.ai/h2o-llmstudio/. If you wish to contribute to the docs, navigate to the /documentation folder of this repo and refer to the README.md for more information.

Contributing

We are happy to accept contributions to the H2O LLM Studio project. Please refer to the CONTRIBUTING.md file for more information.

License

H2O LLM Studio is licensed under the Apache 2.0 license. Please see the LICENSE file for more information.

h2o-llmstudio's People

Contributors

arnocandel, chathurindaranasinghe, diegomiranda02, dinukah2o, eltociear, fatihozturkh2o, gaborfodor, glavin001, haqishen, itsmunishbhardwaj, jfarland, lakinduakash, maxjeblick, oshi98, osiire, pascal-pfeiffer, psinger, quetzalcohuatl, richardscottoz, shaunyogeshwaran, sherenem, shivance, srisatish, tmm1, tomkraljevic


h2o-llmstudio's Issues

[CODE IMPROVEMENT] Can we skip installation of python somehow?

🔧 Proposed code refactoring

When I run make setup, it seems to take a LONG time running the $(PIPENV) install --python $(PYTHON_VERSION) command. I don't need to install python because I've already got it installed to the right version in my conda env. I'd like to be able to skip this if it's already installed somehow.

Motivation

My understanding is that I need to re-run make setup after pulling the latest code to complete the upgrade. Right now I've just commented out the part where it installs python, but I'd like it to be easier to change that behavior.

[BUG] Pipfile.lock is corrupted as of commit 04da297

πŸ› Bug

This commit accidentally included some merge text into the pip lockfile:
04da297

To Reproduce

git pull
make setup
python3.10 -m pip install pip --upgrade
Requirement already satisfied: pip in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (23.1.2)
python3.10 -m pip install pipenv==2022.10.4
Requirement already satisfied: pipenv==2022.10.4 in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (2022.10.4)
Requirement already satisfied: certifi in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (from pipenv==2022.10.4) (2022.12.7)
Requirement already satisfied: setuptools>=36.2.1 in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (from pipenv==2022.10.4) (67.7.2)
Requirement already satisfied: virtualenv-clone>=0.2.5 in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (from pipenv==2022.10.4) (0.5.7)
Requirement already satisfied: virtualenv in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (from pipenv==2022.10.4) (20.22.0)
Requirement already satisfied: distlib<1,>=0.3.6 in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (from virtualenv->pipenv==2022.10.4) (0.3.6)
Requirement already satisfied: filelock<4,>=3.11 in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (from virtualenv->pipenv==2022.10.4) (3.12.0)
Requirement already satisfied: platformdirs<4,>=3.2 in /home/ubuntu/.conda/envs/jupyter/lib/python3.10/site-packages (from virtualenv->pipenv==2022.10.4) (3.3.0)
python3.10 -m pipenv install --python 3.10
Pipfile.lock is corrupted, replaced with (4f56b8)...
Locking [packages] dependencies...
Building requirements...
Resolving dependencies...
✘ Locking Failed!

[CODE IMPROVEMENT] Push to Huggingface improvements

Currently, we only push the model weights to Hugging Face. We could improve this process by adding some of these additional artifacts (see the sketch after the list):

  • Tokenizer
  • LLM Studio CFG / YAML
  • Automated Model card
  • Preprocessing/Postprocessing Pipeline
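
As a rough sketch of what pushing two of these artifacts could look like using the transformers and huggingface_hub APIs (the repo id and local experiment path are hypothetical, and the model card content is illustrative, not LLM Studio's implementation):

from huggingface_hub import ModelCard
from transformers import AutoTokenizer

repo_id = "my-org/my-finetuned-model"            # hypothetical target repo
experiment_dir = "output/user/experiment_name"   # hypothetical local experiment output

# Tokenizer: load it from the experiment output and push it alongside the weights.
tokenizer = AutoTokenizer.from_pretrained(experiment_dir)
tokenizer.push_to_hub(repo_id)

# Automated model card: a plain markdown card with YAML front matter.
card = ModelCard(
    "---\nlicense: apache-2.0\n---\n"
    "# my-finetuned-model\n\n"
    "Fine-tuned with H2O LLM Studio."
)
card.push_to_hub(repo_id)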

[CODE IMPROVEMENT] Check for available disk space

🔧 Proposed code refactoring

We can check for available space for:

  • Starting an experiment
  • Pushing model to HF

Motivation

Running out of space will make an experiment fail, as weights need to be saved at the end of the run; the same holds for pushing to HF.
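
A minimal sketch of such a check (the path and the 50 GB threshold are illustrative, not what LLM Studio uses):

import shutil

def has_enough_disk_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem containing `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3

# Warn before starting an experiment or pushing to HF if space looks tight.
if not has_enough_disk_space(".", required_gb=50):
    print("Warning: low disk space; saving weights or pushing to HF may fail.")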

[BUG] New Experiment resets API keys

πŸ› Bug

Starting a new experiment via New Experiment from an existing one appears to reset the secret tokens.


The reason is most likely that the new YAML functionality does not preserve the tokens, so they would need to be reloaded from the default settings.

[CODE IMPROVEMENT] Improve functionality for separator and stop tokens

🔧 Proposed code refactoring

Add the separator tokens as special tokens.
Potentially also add a separate setting to use the separator tokens as stop tokens.
We should at least make the selection of stop tokens easier, maybe with a string list that we split.

This also means we need to dump the tokenizer.

So this also depends on #5

Motivation

Some separator tokens might currently be encoded as multiple tokens.

Also, it can be cumbersome to manually add new tokens to the list of stop tokens.
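
A minimal sketch of adding separator tokens as special tokens, using the Hugging Face tokenizers API (the backbone name is taken from an experiment elsewhere on this page, the separator strings are illustrative, and this is not LLM Studio's implementation):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-12b-deduped")

# Register the separator tokens as special tokens so each is encoded as a single id.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|prompt|>", "<|answer|>"]}
)
print("added", num_added, "special tokens")

# The model's embedding matrix must then be resized, e.g.
# model.resize_token_embeddings(len(tokenizer)),
# and the tokenizer has to be saved ("dumped") alongside the model:
tokenizer.save_pretrained("output/tokenizer")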

[CODE IMPROVEMENT] Improve stopping criteria

We currently cannot apply the stopping criteria for batches larger than 1.

We can try improving it by either making the stopping work at the batch level and/or adding additional post-processing steps.

See #32 for discussion.
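
As a rough illustration of the batch-level direction, here is a minimal sketch built on the Hugging Face transformers StoppingCriteria API (not the project's actual criteria; generation still stops only once every sequence in the batch has produced a stop token, so per-sequence trimming would remain a post-processing step):

import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class BatchStopOnTokens(StoppingCriteria):
    """Stop generation once every sequence in the batch ends with a stop token id."""

    def __init__(self, stop_token_ids):
        self.stop_token_ids = stop_token_ids

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        last_tokens = input_ids[:, -1]
        finished = torch.zeros_like(last_tokens, dtype=torch.bool)
        for stop_id in self.stop_token_ids:
            finished |= last_tokens == stop_id
        # generate() stops for the whole batch at once, so only return True
        # when all sequences have hit a stop token.
        return bool(finished.all())

# Usage (illustrative): model.generate(..., stopping_criteria=StoppingCriteriaList([BatchStopOnTokens([eos_id])]))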

[BUG] Pushing to HF failed

πŸ› Bug

I tried to push a model to HF, but got the following error with instructions to post the bug here :=)


q.app
script_sources: ['/_f/68bd9c4d-864b-44b4-b1c2-3f3f9b12805c/tmpx7ptm808.min.js']
initialized: True
wave_utils_stack_trace_str: ### stacktrace
Traceback (most recent call last):

  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 441, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)

  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 668, in _save
    zip_file.write_record(name, storage.data_ptr(), num_bytes)

RuntimeError: [enforce fail at inline_container.cc:471] . PytorchStreamWriter failed writing file data/34: file write failed


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

  File "/home/bratanic_tomaz/h2o-llmstudio/./app_utils/handlers.py", line 301, in handle
    await experiment_push_to_huggingface_dialog(q)

  File "/home/bratanic_tomaz/h2o-llmstudio/./app_utils/sections/experiment.py", line 1553, in experiment_push_to_huggingface_dialog
    model.backbone.push_to_hub(

  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/transformers/utils/hub.py", line 781, in push_to_hub
    self.save_pretrained(work_dir, max_shard_size=max_shard_size)

  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1843, in save_pretrained
    save_function(shard, os.path.join(save_directory, shard_file))

  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 440, in save
    with _open_zipfile_writer(f) as opened_zipfile:

  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 291, in __exit__
    self.file_like.write_end_of_file()

RuntimeError: [enforce fail at inline_container.cc:337] . unexpected pos 6548765184 vs 6548765080

q.user
q.client
app_db: <app_utils.db.Database object at 0x7f108171cb80>
client_initialized: True
mode_curr: error
theme_dark: True
experiment/display/experiment_path: output/user/ambitious-raven/
experiment/display/charts: {'cfg': {'experiment_name': 'ambitious-raven', 'llm_backbone': 'EleutherAI/pythia-12b-deduped', 'train_dataframe': 'data/user/Cypher/train_llm.csv', 'backbone_dtype': 'float16', 'lora': True, 'lora_r': 4, 'lora_alpha': 16, 'batch_size': 3, 'epochs': 10, 'metric': 'GPT4', ...}, 'train': {...}, 'meta': {...}, 'validation': {...}}
(remaining client state and per-step loss/learning-rate chart data omitted)
951.0, 954.0, 957.0, 960.0, 963.0, 966.0, 969.0, 972.0, 975.0, 978.0, 981.0, 984.0, 987.0, 990.0, 993.0, 996.0, 999.0, 1002.0, 1005.0, 1008.0, 1011.0, 1014.0, 1017.0, 1020.0, 1023.0, 1026.0, 1029.0, 1032.0, 1035.0, 1038.0, 1041.0, 1044.0, 1047.0, 1050.0, 1053.0, 1056.0, 1059.0, 1062.0, 1065.0, 1068.0, 1071.0, 1074.0, 1077.0, 1080.0, 1083.0, 1086.0, 1089.0, 1092.0, 1095.0, 1098.0, 1101.0, 1104.0, 1107.0, 1110.0, 1113.0, 1116.0, 1119.0, 1122.0, 1125.0, 1128.0, 1131.0, 1134.0, 1137.0, 1140.0, 1143.0, 1146.0, 1149.0, 1152.0, 1155.0, 1158.0, 1161.0, 1164.0, 1167.0, 1170.0, 1173.0, 1176.0, 1179.0, 1182.0, 1185.0, 1188.0, 1191.0, 1194.0, 1197.0, 1200.0, 1203.0, 1206.0, 1209.0, 1212.0, 1215.0, 1218.0, 1221.0, 1224.0, 1227.0, 1230.0, 1233.0, 1236.0, 1239.0, 1242.0, 1245.0, 1248.0, 1251.0, 1254.0, 1257.0, 1260.0, 1263.0, 1266.0, 1269.0, 1272.0, 1275.0, 1278.0, 1281.0, 1284.0, 1287.0, 1290.0, 1293.0, 1296.0, 1299.0, 1302.0, 1305.0, 1308.0, 1311.0, 1314.0, 1317.0, 1320.0, 1323.0, 1326.0, 1329.0, 1332.0, 1335.0, 1338.0, 1341.0, 1344.0, 1347.0, 1350.0, 1353.0, 1356.0, 1359.0, 1362.0, 1365.0, 1368.0, 1371.0, 1374.0, 1377.0, 1380.0, 1383.0, 1386.0, 1389.0, 1392.0, 1395.0, 1398.0, 1401.0, 1404.0, 1407.0, 1410.0, 1413.0, 1416.0, 1419.0, 1422.0, 1425.0, 1428.0, 1431.0, 1434.0, 1437.0, 1440.0, 1443.0, 1446.0, 1449.0, 1452.0, 1455.0, 1458.0, 1461.0, 1464.0, 1467.0, 1470.0, 1473.0, 1476.0, 1479.0, 1482.0, 1485.0, 1488.0, 1491.0, 1494.0, 1497.0, 1500.0, 1503.0, 1506.0, 1509.0, 1512.0, 1515.0, 1518.0, 1521.0, 1524.0, 1527.0, 1530.0, 1533.0, 1536.0, 1539.0, 1542.0, 1545.0, 1548.0, 1551.0, 1554.0, 1557.0, 1560.0, 1563.0, 1566.0, 1569.0, 1572.0, 1575.0, 1578.0, 1581.0, 1584.0, 1587.0, 1590.0, 1593.0, 1596.0, 1599.0, 1602.0, 1605.0, 1608.0, 1611.0, 1614.0, 1617.0, 1620.0, 1623.0, 1626.0, 1629.0, 1632.0, 1635.0, 1638.0, 1641.0, 1644.0, 1647.0, 1650.0, 1653.0, 1656.0, 1659.0, 1662.0, 1665.0, 1668.0, 1671.0, 1674.0, 1677.0, 1680.0, 1683.0, 1686.0, 1689.0, 1692.0, 1695.0, 1698.0, 1701.0, 1704.0, 1707.0, 1710.0, 1713.0, 1716.0, 1719.0, 1722.0, 1725.0, 1728.0, 1731.0, 1734.0, 1737.0, 1740.0, 1743.0, 1746.0, 1749.0, 1752.0, 1755.0, 1758.0, 1761.0, 1764.0, 1767.0, 1770.0, 1773.0, 1776.0, 1779.0, 1782.0, 1785.0, 1788.0, 1791.0, 1794.0, 1797.0, 1800.0, 1803.0, 1806.0, 1809.0, 1812.0, 1815.0, 1818.0, 1821.0, 1824.0, 1827.0, 1830.0, 1833.0, 1836.0, 1839.0, 1842.0, 1845.0, 1848.0, 1851.0, 1854.0, 1857.0, 1860.0, 1863.0, 1866.0, 1869.0, 1872.0, 1875.0, 1878.0, 1881.0, 1884.0, 1887.0, 1890.0, 1893.0, 1896.0, 1899.0, 1902.0, 1905.0, 1908.0, 1911.0, 1914.0, 1917.0, 1920.0, 1923.0, 1926.0, 1929.0, 1932.0, 1935.0, 1938.0, 1941.0, 1944.0, 1947.0, 1950.0, 1953.0, 1956.0, 1959.0, 1962.0, 1965.0, 1968.0, 1971.0, 1974.0, 1977.0, 1980.0]}, 'epoch': {'steps': [198, 396, 594, 792, 990, 1188, 1386, 1584, 1782, 1980], 'values': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]}}}
experiment/display/chat: True
experiment/display/download_logs: False
experiment/display/download_predictions: False
experiment/display/push_to_huggingface: True
experiment/display/refresh: False
experiment/list/current: False
experiment/display/chat/box: first
experiment/display/chat/messages: [['Create a Cypher statement to answer the following question:List all movies released in 2020', True], ['MATCH (m:Movie {year: 2020}) RETURN {movie: m.title} AS result', False]]
experiment/display/chat/chatbot: Create a Cypher statement to answer the following question:List all movies released in 2020
keep_meta: False
cancel: False
experiment/display/push_to_huggingface/model_name: movie-cypher-generator
experiment/display/push_to_huggingface_submit: True
home: False
report_error: True
q.events
q.args
report_error: True
stacktrace
Traceback (most recent call last):
  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 441, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 668, in _save
    zip_file.write_record(name, storage.data_ptr(), num_bytes)
RuntimeError: [enforce fail at inline_container.cc:471] . PytorchStreamWriter failed writing file data/34: file write failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/bratanic_tomaz/h2o-llmstudio/./app_utils/handlers.py", line 301, in handle
    await experiment_push_to_huggingface_dialog(q)
  File "/home/bratanic_tomaz/h2o-llmstudio/./app_utils/sections/experiment.py", line 1553, in experiment_push_to_huggingface_dialog
    model.backbone.push_to_hub(
  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/transformers/utils/hub.py", line 781, in push_to_hub
    self.save_pretrained(work_dir, max_shard_size=max_shard_size)
  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1843, in save_pretrained
    save_function(shard, os.path.join(save_directory, shard_file))
  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 440, in save
    with _open_zipfile_writer(f) as opened_zipfile:
  File "/home/bratanic_tomaz/.local/share/virtualenvs/h2o-llmstudio-G1hO3quW/lib/python3.10/site-packages/torch/serialization.py", line 291, in __exit__
    self.file_like.write_end_of_file()
RuntimeError: [enforce fail at inline_container.cc:337] . unexpected pos 6548765184 vs 6548765080

Error
None

To Reproduce

I fine-tuned an EleutherAI/pythia-12b-deduped model and tried to push it to HF.

Requirements file

In addition to the Pipfile, we can also generate a requirements.txt file so that users can install outside of pipenv, potentially with other Python versions as well.

We can add a make command to generate the requirements.txt file from the Pipfile.

[CODE IMPROVEMENT] Save cfg as yaml instead of dill/pickle

🔧 Proposed code refactoring

Move from pickle format of saving cfg_last.p to yaml (or similar format).

Motivation

  • Human readable data format
  • Better comparability to possible code changes (pickle breaks easily)
  • No security concerns when sharing configurations
  • Saving/Ingestion to config.yaml is already implemented in the UI

Note

cfg_last.p stores

['_seed',
'_stop_words_ids',
 '_tokenizer_sep_token',
 '_tokenizer_mask_token_id',
 '_tokenizer_eos_token']

which are currently not stored in config.yaml. The seed is stored in the logs, while the remaining attributes are created in the get_tokenizer method.
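
A minimal sketch of what this could look like, assuming the cfg object is a (nested) dataclass and all public fields are YAML-serializable; function names and paths are illustrative, not existing LLM Studio code:

import dataclasses
import yaml

def save_cfg_yaml(cfg, path: str) -> None:
    # Drop private attributes such as _stop_words_ids; they are recreated
    # via get_tokenizer() on load instead of being persisted.
    # Assumes the remaining values are plain Python types.
    cfg_dict = {
        key: value
        for key, value in dataclasses.asdict(cfg).items()
        if not key.startswith("_")
    }
    with open(path, "w") as f:
        yaml.safe_dump(cfg_dict, f)

def load_cfg_yaml(path: str) -> dict:
    with open(path, "r") as f:
        return yaml.safe_load(f)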

adding reference to trained checkpoint in GUI

🔧 Proposed code refactoring

Currently, it is possible to start a fine-tuning job based on the last experiment's checkpoint by toggling an option in the GUI. However, I noticed that if I do that more than once (i.e. I create more than one 'new' experiment from a "base" fine-tuned experiment), the new experiment only loads the weights from the backbone and not the checkpoint. This also happens if I update the original data source in the GUI to another fine-tuning dataset. In these cases, I have resorted to using the CLI and hard-coding the checkpoint I want to start my training from. It would be a good addition to the GUI to allow specifying the exact checkpoint instead of just having a toggle button, to prevent this error.

Motivation

see above.

[CODE IMPROVEMENT] Improve mask prompt labels for chained parent data

The setting Mask Prompt Labels allows fully masking the prompt labels and calculating the loss only on the output.
When chaining conversation data during training, however, we will have a mix of prompts and answers, and currently this setting only calculates the loss on the last output.

We potentially want to calculate the loss of all outputs in the whole chained conversation.
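
A minimal sketch of the proposed behavior, assuming token spans for prompts and answers are known (all names here are illustrative, not the actual LLM Studio implementation): every prompt span is masked with -100, while all answer spans of the chained conversation contribute to the loss, not only the last one.

import torch

IGNORE_INDEX = -100  # ignored by torch.nn.CrossEntropyLoss

def build_chained_labels(input_ids, spans):
    # spans: list of (start, end, is_answer) token ranges covering the sample
    labels = torch.full_like(input_ids, IGNORE_INDEX)
    for start, end, is_answer in spans:
        if is_answer:
            labels[start:end] = input_ids[start:end]
    return labels

# Example: prompt1 + answer1 + prompt2 + answer2 chained into one sample
input_ids = torch.arange(12)
spans = [(0, 3, False), (3, 6, True), (6, 9, False), (9, 12, True)]
labels = build_chained_labels(input_ids, spans)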

[FEATURE] Allow local LLM backbones

🚀 Feature

Currently there is a fixed list of LLM backbones hosted and accessible via Huggingface. It would be nice to also be able to specify a local path to a pre-trained model in the UI dropdown. Since "only" CausalLanguageModeling is currently supported, loading or training the pre-trained model is of course subject to certain conditions that would have to be checked accordingly.

Motivation

The scenarios or motivations for the feature are many. On the one hand, I create models myself with the tool, which I may want to fine-tune further. On the other hand, there are models that I try out locally and want to work with. Also, this would be a transitional solution for models located in private Hugging Face repositories.

What do you think?

Document H2O LLM Studio

  • Set up Makersaurus + templates in the repo
  • "What is LLM Studio?" page
  • Key terms
  • Concepts
  • Set up LLM Studio + pre-reqs + installation
  • Supported data connectors and data format
  • Import a dataset (for local, upload, AWS, and Kaggle)
  • Configure dataset
  • View datasets
  • Merge datasets
  • Dataset settings
  • Create an experiment
  • View experiments
  • Monitoring experiments
  • Compare experiments
  • Experiment settings
  • Tutorial
  • Document any details about Kaggle and Colab integration
  • Example use case
  • Publish and host documentation

[FEATURE] Add RLHF training

🚀 Feature

As a next step, we should add RLHF training to continue fine tuning.

This may include two steps.

  1. Train a reward model
  2. RL using the reward model from 1.

Open questions:
Data labeling? (The human Feedback)

Motivation

Better models

[BUG] failing chat & hf push

πŸ› Bug

I realized that when I have Backbone Dtype selected as "int8", I'm not able to run Chat or push the model into HF with the following error thrown in the logs

Log:

Using cls_token, but it is not set yet.
Using sep_token, but it is not set yet.
2023-04-25 13:45:25,249 - INFO: dtype: torch.float16
2023-04-25 13:45:25,262 - ERROR: Unknown exception
Traceback (most recent call last):
  File "/home/fatih/h2o-llmstudio/./app_utils/handlers.py", line 308, in handle
    await experiment_display(q)
  File "/home/fatih/h2o-llmstudio/./app_utils/sections/experiment.py", line 1063, in experiment_display
    cfg, model, tokenizer = load_cfg_model_tokenizer(
  File "/home/fatih/h2o-llmstudio/./app_utils/sections/experiment.py", line 1600, in load_cfg_model_tokenizer
    model = cfg.architecture.model_class(cfg)
  File "/home/fatih/h2o-llmstudio/./llm_studio/src/models/text_causal_language_modeling_model.py", line 48, in __init__
    self.backbone = create_nlp_backbone(
  File "/home/fatih/h2o-llmstudio/./llm_studio/src/utils/modeling_utils.py", line 620, in create_nlp_backbone
    backbone = model_class.from_config(config, **kwargs)
  File "/home/fatih/h2o-llmstudio/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 411, in from_config
    return model_class._from_config(config, **kwargs)
  File "/home/fatih/h2o-llmstudio/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1146, in _from_config
    model = cls(config, **kwargs)
TypeError: OPTForCausalLM.__init__() got an unexpected keyword argument 'device_map'

Based on my tests, the issue is caused by the backbone dtype setting. If I change it back to float16, everything works. Also, looking at the top of the logs, there is still a torch.float16 entry, even though it comes from the int8 experiment.

To reproduce the issue, you can run an experiment with train_full.csv (the choice of backbone doesn't change the result, as far as I've tested).
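
For reference, a minimal sketch reproducing what the traceback suggests (the backbone name is illustrative): kwargs such as device_map are understood by from_pretrained but not by from_config, which only instantiates the architecture, so int8-specific kwargs would need to be stripped when the backbone is built from a config.

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("facebook/opt-125m")  # illustrative backbone
kwargs = {"device_map": {"": 0}}  # int8-style kwarg meant for from_pretrained

try:
    # from_config only builds the architecture and forwards unknown kwargs to
    # the model __init__, which is what raises the TypeError seen above
    model = AutoModelForCausalLM.from_config(config, **kwargs)
except TypeError as e:
    print(e)  # ... got an unexpected keyword argument 'device_map'

# Dropping the incompatible kwargs lets the model build from the config
model = AutoModelForCausalLM.from_config(config)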

Using both LORA and FSDP results in error

πŸ› Bug

Setting both LORA and FSDP options to true while fine-tuning results in

ValueError: FlatParameter requires uniform dtype but got torch.float16 and torch.float32

To Reproduce

Run an experiment with the OASST data set (https://www.kaggle.com/code/philippsinger/openassistant-conversations-dataset-oasst1?scriptVersionId=126228752) with both LORA and FSDP turned on

I have also attached the experiment configuration and logs
logs_osst-example-fsdp.zip

[CODE IMPROVEMENT] Rework plotting

The current plotting functionality could use some rework; some suggestions:

  • For training plotting, I would just plot the whole sample, and (color)mark the labels.
  • For validation insights we need better visual depictions of the different parts and clearer separation between samples.

FAQ Section

Would be great to have some FAQs and templates/notebooks for common questions.

  • How to generate outputs outside of LLM Studio with trained weights pushed to HF
  • How to continue training from previous experiments, how to load local weights
  • (Lack-of) backward compatibility

Thanks for this repo! (Place to discuss?)

I trained my first LLM yesterday using this, and I really appreciate you creating and open sourcing it!

On a related note, do you happen to have a Slack or Discord anywhere to discuss LLM fine tuning? I'd love to bounce some things off of other folks using this.

[CODE IMPROVEMENT] Improve README for Huggingface integration

🔧 Proposed code refactoring

Add example code on how to use Huggingface models, in particular, how tokenization looks

inputs = tokenizer("<|prompt|>How are you?<|endoftext|><|answer|>", return_tensors="pt", add_special_tokens=False).to("cuda")

Motivation

People may easily miss the correct prompt template.
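
For example, a fuller hedged snippet for the README (the model name is a placeholder for any model exported from LLM Studio) that shows both the prompt template and generation:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "h2oai/your-exported-model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")

# The prompt template used during fine-tuning has to be reproduced exactly
prompt = "<|prompt|>How are you?<|endoftext|><|answer|>"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

tokens = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))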

[CODE IMPROVEMENT] Chat experience

There are several potential things to improve the experience of the chat window:

  • Block the chat window if other training runs are active
  • Make the actual model loading procedure a separate step
  • Add sliders for inference settings / or add more helper output

training with int8 leads to error

πŸ› Bug

int8 quantization leads to an easy-to-fix error

To Reproduce

In the tutorial notebook, change the backbone dtype from 'float16' to 'int8'. This leads to the following error:
AttributeError: 'NoneType' object has no attribute 'device'

To Fix

See: tloen/alpaca-lora#14 (comment)

I fixed the code locally by changing line 598 in llm_studio/src/utils/modeling_utils.py from:

kwargs["device_map"] = {"": cfg.environment._device}

into

kwargs["device_map"] = {"":0}

There is probably a more elegant way to do this, as it only works with single GPU training, not distributed training.
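
A hypothetical variant of the fix above that respects the configured device instead of hard-coding GPU 0 (still single-GPU only; device_map_for is an illustrative helper, not existing LLM Studio code):

import torch

def device_map_for(device: str) -> dict:
    # e.g. cfg.environment._device == "cuda:1" -> {"": 1}
    index = torch.device(device).index
    return {"": index if index is not None else 0}

kwargs = {}
kwargs["device_map"] = device_map_for("cuda:0")  # illustrative device string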

[CODE IMPROVEMENT] Redirect stdout logging from huggingface download to the logging module

🔧 Proposed code refactoring

Redirect stdout logging from huggingface download to the logging module

Something along these lines could solve it (untested):

import logging
from transformers import logging as transformers_logging

logger = logging.getLogger(__name__)
transformers_logging.set_verbosity_info()
transformers_logging.disable_default_handler()

# Let Hugging Face log records propagate into our own logging configuration
transformers_logging.enable_propagation()

# Download progress bars write to stdout directly and bypass logging; disable them
transformers_logging.disable_progress_bar()

Motivation

Improve logs that are shown in the web app

[BUG] INT8 Push to HF and loading weights

πŸ› Bug

When training a model with INT8 and LORA, pushing it to HF does not work:

2023-04-23 08:33:14,581 - ERROR: Unknown exception                                                                                                                 
Traceback (most recent call last):                                                                                                                                 
  File "/home/philipp/llm_studio/./app_utils/handlers.py", line 301, in handle                                                                                     
    await experiment_push_to_huggingface_dialog(q)                                                                                                                 
  File "/home/philipp/llm_studio/./app_utils/sections/experiment.py", line 1544, in experiment_push_to_huggingface_dialog                                          
    model.backbone = model.backbone.merge_and_unload()                                                                                                             
  File "/home/philipp/.local/share/virtualenvs/llm_studio-BMnV59HB/lib/python3.8/site-packages/peft/tuners/lora.py", line 309, in merge_and_unload                 
    raise ValueError("Cannot merge LORA layers when the model is loaded in 8-bit mode")                                                                            
ValueError: Cannot merge LORA layers when the model is loaded in 8-bit mode 

Additionally, when loading the weights for the Chat window, we need to set pretrained=True.

There is also something else that is weird when loading the weights with missing keys, but inference and scoring work fine:

2023-04-23 08:26:20,825 - WARNING: Only a part of the pretrained weights was loaded. Some layers can't be initialized with pretrained weights: Error(s) in loading state_dict for Model:
        Missing key(s) in state_dict: "backbone.base_model.model.gpt_neox.layers.0.attention.query_key_value.weight", "backbone.base_model.model.gpt_neox.layers.0.attention.dense.weight", "backbone.base_model.model.gpt_neox.layers.0.mlp.dense_h_to_4h.weight", "backbone.base_model.model.gpt_neox.layers.0.mlp.dense_4h_to_h.weight", "backbone.base_model.model.gpt_neox.layers.1.attention.query_key_value.weight", "backbone.base_model.model.gpt_neox.layers.1.attention.dense.weight", "backbone.base_model.model.gpt_neox.layers.1.mlp.dense_h_to_4h.weight", "backbone.base_model.model.gpt_neox.layers.1.mlp.dense_4h_to_h.weight", "backbone.base_model.model.gpt_neox.layers.2.attention.query_key_value.weight", "backbone.base_model.model.gpt_neox.layers.2.attention.dense.weight", "backbone.base_model.model.gpt_neox.layers.2.mlp.dense_h_to_4h.weight", "backbone.base_model.model.gpt_neox.layers.2.mlp.dense_4h_to_h.weight", ....
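
A possible workaround, sketched under the assumption that the LoRA adapter weights are available in PEFT format (paths, the backbone name, and the repo name are illustrative): reload the backbone in float16 rather than 8-bit, attach the adapter, and only then merge and push.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Merging LoRA layers is not supported in 8-bit mode, so reload in float16
backbone = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-1b-deduped",  # illustrative backbone
    torch_dtype=torch.float16,
)

model = PeftModel.from_pretrained(backbone, "output/experiment/adapter")  # illustrative adapter path

merged = model.merge_and_unload()
merged.push_to_hub("my-org/my-merged-model")  # placeholder repo name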

[CODE IMPROVEMENT] Update Starlette

🔧 Proposed code refactoring

Update Starlette to v 0.25.0

Motivation

Fixes the "Starlette allows an unauthenticated and remote attacker to specify any number of form fields or files" security issue, see here.

[FEATURE] Default dataset

While we describe steps to get and load OASST demo data, one useful improvement could be to directly load the data into the GUI by default.

[BUG] Issues with Cloudflare proxy - Support for RunPod

πŸ› Bug

Hi there!
I'm working on getting native support for h2o-llmstudio on RunPod containers.
I was able to get it fully packaged into a Docker container that includes training, downloads, etc.
The issue is that the web UI is not able to connect when we use our Cloudflare proxy (screenshot attached).

On the other hand, if I set up a direct connection using IP:PORT, I can access the web UI without issues.
Any ideas what the issue could be here? Maybe it's an issue with some headers?

error in AutoModelForCausalLM.from_pretrained after exporting/merging LoRA llama-based model

πŸ› Bug

After successfully fine-tuning a llama-based model and downloading/merging the model weights through the GUI interface I get the following error whenever I want to load the model back into memory:

Some weights of the model checkpoint at models/model were not used when initializing LlamaForCausalLM: ['lm_head.0.weight']

  • This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Some weights of LlamaForCausalLM were not initialized from the model checkpoint at models/model and are newly initialized: ['lm_head.weight']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

It seems that the saving or the LoRA-merging procedure has an issue with saving the lm_head. This is not a problem if I load the model (no LoRA merging) with the "load_cfg_model_tokenizer" function from the app_utils.sections.experiment module.

To be clear, fine-tuning and inference work great (thanks for that!). The problem only appears once I want to export the model.

To Reproduce

Fine-tune a llama-based model (I used elinas/llama-7b-hf-transformers-4.29) and save/download it. Then try to load it back into memory using AutoModelForCausalLM.from_pretrained().
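
A purely hypothetical one-off repair (untested; the shard path is illustrative, and sharded checkpoints would need per-file handling): rename the misnamed key so that AutoModelForCausalLM.from_pretrained can pick up the lm_head weights.

import torch

path = "models/model/pytorch_model.bin"  # illustrative checkpoint path
state_dict = torch.load(path, map_location="cpu")

if "lm_head.0.weight" in state_dict:
    # The export wrote 'lm_head.0.weight' while LlamaForCausalLM expects 'lm_head.weight'
    state_dict["lm_head.weight"] = state_dict.pop("lm_head.0.weight")
    torch.save(state_dict, path)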

Getting BLEU score 0 in validation set

Hi Team, I was trying to train pythia-1.4b-deduped using the CLI, and during training I am getting a BLEU score of 0 for the validation set, so basically it's not able to generate any text. FYI - I followed the same tutorial shared in the Colab notebook of the repo.

I have shared the logs below; can you please tell me if I am going wrong somewhere?

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/dell/.conda/envs/h2oai/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
2023-04-26 17:53:58,234 - INFO: Global random seed: 1
2023-04-26 17:53:58,234 - INFO: Preparing the data...
2023-04-26 17:53:58,234 - INFO: Setting up automatic validation split...
2023-04-26 17:53:58,376 - INFO: Preparing train and validation data
2023-04-26 17:53:58,377 - INFO: Loading train dataset...
Using pad_token, but it is not set yet.
Using cls_token, but it is not set yet.
Using sep_token, but it is not set yet.
2023-04-26 17:53:58,506 - INFO: Stop token ids: [tensor([])]
2023-04-26 17:53:58,563 - INFO: Sample prompt: As an AI what are your personal feelings on Sarah Connor?<|endoftext|>
2023-04-26 17:53:58,563 - INFO: Loading validation dataset...
Using pad_token, but it is not set yet.
Using cls_token, but it is not set yet.
Using sep_token, but it is not set yet.
2023-04-26 17:53:58,677 - INFO: Stop token ids: [tensor([])]
2023-04-26 17:53:58,681 - INFO: Sample prompt: What types of tests do we have in software development?<|endoftext|>
2023-04-26 17:53:58,681 - INFO: Number of observations in train dataset: 8191
2023-04-26 17:53:58,681 - INFO: Number of observations in validation dataset: 83
2023-04-26 17:53:58,688 - INFO: dtype: torch.float16
/home/dell/.conda/envs/h2oai/lib/python3.10/site-packages/peft/tuners/lora.py:173: UserWarning: fan_in_fan_out is set to True but the target module is not a Conv1D. Setting fan_in_fan_out to False.
warnings.warn(
trainable params: 786432 || all params: 1415434240 || trainable%: 0.055561182411413196
2023-04-26 17:54:05,371 - INFO: Training Epoch: 1 / 1
2023-04-26 17:54:05,372 - INFO: train loss: 0%| | 0/2047 [00:00<?, ?it/s]
Using pad_token, but it is not set yet.
Using cls_token, but it is not set yet.
Using sep_token, but it is not set yet.
2023-04-26 17:54:05,779 - INFO: Stop token ids: [tensor([])]
2023-04-26 17:55:01,890 - INFO: train loss: 2.22: 5%|4 | 102/2047 [00:56<17:57, 1.80it/s]
2023-04-26 17:55:15,374 - INFO: train loss: 2.22: 5%|4 | 102/2047 [01:10<17:57, 1.80it/s]
2023-04-26 17:55:42,947 - INFO: train loss: 2.21: 10%|9 | 204/2047 [01:37<14:16, 2.15it/s]
2023-04-26 17:55:55,376 - INFO: train loss: 2.21: 10%|9 | 204/2047 [01:50<14:16, 2.15it/s]
2023-04-26 17:56:18,324 - INFO: train loss: 2.17: 15%|#4 | 306/2047 [02:12<11:55, 2.43it/s]
2023-04-26 17:56:35,377 - INFO: train loss: 2.17: 15%|#4 | 306/2047 [02:30<11:55, 2.43it/s]
2023-04-26 17:56:55,301 - INFO: train loss: 1.91: 20%|#9 | 408/2047 [02:49<10:42, 2.55it/s]
2023-04-26 17:57:05,378 - INFO: train loss: 1.91: 20%|#9 | 408/2047 [03:00<10:42, 2.55it/s]
2023-04-26 17:57:30,784 - INFO: train loss: 2.27: 25%|##4 | 510/2047 [03:25<09:37, 2.66it/s]
2023-04-26 17:57:45,380 - INFO: train loss: 2.27: 25%|##4 | 510/2047 [03:40<09:37, 2.66it/s]
2023-04-26 17:58:06,714 - INFO: train loss: 2.04: 30%|##9 | 612/2047 [04:01<08:47, 2.72it/s]
2023-04-26 17:58:25,381 - INFO: train loss: 2.04: 30%|##9 | 612/2047 [04:20<08:47, 2.72it/s]
2023-04-26 17:58:41,145 - INFO: train loss: 2.22: 35%|###4 | 714/2047 [04:35<07:57, 2.79it/s]
2023-04-26 17:58:55,383 - INFO: train loss: 2.22: 35%|###4 | 714/2047 [04:50<07:57, 2.79it/s]
2023-04-26 17:59:16,807 - INFO: train loss: 2.03: 40%|###9 | 816/2047 [05:11<07:17, 2.81it/s]
2023-04-26 17:59:35,384 - INFO: train loss: 2.03: 40%|###9 | 816/2047 [05:30<07:17, 2.81it/s]
2023-04-26 17:59:52,680 - INFO: train loss: 1.96: 45%|####4 | 918/2047 [05:47<06:39, 2.82it/s]
2023-04-26 18:00:05,386 - INFO: train loss: 1.96: 45%|####4 | 918/2047 [06:00<06:39, 2.82it/s]
2023-04-26 18:00:33,621 - INFO: train loss: 2.27: 50%|####9 | 1020/2047 [06:28<06:18, 2.71it/s]
2023-04-26 18:00:45,388 - INFO: train loss: 2.27: 50%|####9 | 1020/2047 [06:40<06:18, 2.71it/s]
2023-04-26 18:01:12,389 - INFO: train loss: 2.12: 55%|#####4 | 1122/2047 [07:07<05:44, 2.69it/s]
2023-04-26 18:01:25,389 - INFO: train loss: 2.12: 55%|#####4 | 1122/2047 [07:20<05:44, 2.69it/s]
2023-04-26 18:01:53,475 - INFO: train loss: 2.11: 60%|#####9 | 1224/2047 [07:48<05:13, 2.62it/s]
2023-04-26 18:02:05,391 - INFO: train loss: 2.11: 60%|#####9 | 1224/2047 [08:00<05:13, 2.62it/s]
2023-04-26 18:02:31,822 - INFO: train loss: 2.00: 65%|######4 | 1326/2047 [08:26<04:33, 2.63it/s]
2023-04-26 18:02:45,393 - INFO: train loss: 2.00: 65%|######4 | 1326/2047 [08:40<04:33, 2.63it/s]
2023-04-26 18:03:07,590 - INFO: train loss: 2.02: 70%|######9 | 1428/2047 [09:02<03:49, 2.70it/s]
2023-04-26 18:03:25,394 - INFO: train loss: 2.02: 70%|######9 | 1428/2047 [09:20<03:49, 2.70it/s]
2023-04-26 18:03:47,965 - INFO: train loss: 2.08: 75%|#######4 | 1530/2047 [09:42<03:15, 2.64it/s]
2023-04-26 18:04:05,396 - INFO: train loss: 2.08: 75%|#######4 | 1530/2047 [10:00<03:15, 2.64it/s]
2023-04-26 18:04:23,403 - INFO: train loss: 2.01: 80%|#######9 | 1632/2047 [10:18<02:33, 2.71it/s]
2023-04-26 18:04:35,397 - INFO: train loss: 2.01: 80%|#######9 | 1632/2047 [10:30<02:33, 2.71it/s]
2023-04-26 18:04:59,964 - INFO: train loss: 2.11: 85%|########4 | 1734/2047 [10:54<01:54, 2.73it/s]
2023-04-26 18:05:15,399 - INFO: train loss: 2.11: 85%|########4 | 1734/2047 [11:10<01:54, 2.73it/s]
2023-04-26 18:05:39,237 - INFO: train loss: 2.09: 90%|########9 | 1836/2047 [11:33<01:18, 2.69it/s]
2023-04-26 18:05:55,401 - INFO: train loss: 2.09: 90%|########9 | 1836/2047 [11:50<01:18, 2.69it/s]
2023-04-26 18:06:16,531 - INFO: train loss: 2.05: 95%|#########4| 1938/2047 [12:11<00:40, 2.70it/s]
2023-04-26 18:06:35,402 - INFO: train loss: 2.05: 95%|#########4| 1938/2047 [12:30<00:40, 2.70it/s]
2023-04-26 18:06:55,230 - INFO: train loss: 2.00: 100%|#########9| 2040/2047 [12:49<00:02, 2.68it/s]
2023-04-26 18:06:57,568 - INFO: train loss: 1.93: 100%|##########| 2047/2047 [12:52<00:00, 2.65it/s]
2023-04-26 18:06:57,570 - INFO: Starting validation inference
2023-04-26 18:06:57,570 - INFO: validation progress: 0%| | 0/21 [00:00<?, ?it/s]
2023-04-26 18:06:58,259 - INFO: validation progress: 5%|4 | 1/21 [00:00<00:13, 1.45it/s]
2023-04-26 18:06:58,528 - INFO: validation progress: 10%|9 | 2/21 [00:00<00:08, 2.26it/s]
2023-04-26 18:06:58,768 - INFO: validation progress: 14%|#4 | 3/21 [00:01<00:06, 2.86it/s]
2023-04-26 18:06:58,991 - INFO: validation progress: 19%|#9 | 4/21 [00:01<00:05, 3.34it/s]
2023-04-26 18:06:59,203 - INFO: validation progress: 24%|##3 | 5/21 [00:01<00:04, 3.73it/s]
2023-04-26 18:06:59,507 - INFO: validation progress: 29%|##8 | 6/21 [00:01<00:04, 3.57it/s]
2023-04-26 18:06:59,732 - INFO: validation progress: 33%|###3 | 7/21 [00:02<00:03, 3.81it/s]
2023-04-26 18:06:59,918 - INFO: validation progress: 38%|###8 | 8/21 [00:02<00:03, 4.20it/s]
2023-04-26 18:07:00,148 - INFO: validation progress: 43%|####2 | 9/21 [00:02<00:02, 4.25it/s]
2023-04-26 18:07:00,367 - INFO: validation progress: 48%|####7 | 10/21 [00:02<00:02, 4.34it/s]
2023-04-26 18:07:00,607 - INFO: validation progress: 52%|#####2 | 11/21 [00:03<00:02, 4.29it/s]
2023-04-26 18:07:00,818 - INFO: validation progress: 57%|#####7 | 12/21 [00:03<00:02, 4.42it/s]
2023-04-26 18:07:01,090 - INFO: validation progress: 62%|######1 | 13/21 [00:03<00:01, 4.16it/s]
2023-04-26 18:07:01,353 - INFO: validation progress: 67%|######6 | 14/21 [00:03<00:01, 4.04it/s]
2023-04-26 18:07:01,541 - INFO: validation progress: 71%|#######1 | 15/21 [00:03<00:01, 4.36it/s]
2023-04-26 18:07:01,781 - INFO: validation progress: 76%|#######6 | 16/21 [00:04<00:01, 4.30it/s]
2023-04-26 18:07:02,006 - INFO: validation progress: 81%|######## | 17/21 [00:04<00:00, 4.34it/s]
2023-04-26 18:07:02,221 - INFO: validation progress: 86%|########5 | 18/21 [00:04<00:00, 4.43it/s]
2023-04-26 18:07:02,434 - INFO: validation progress: 90%|######### | 19/21 [00:04<00:00, 4.51it/s]
2023-04-26 18:07:02,800 - INFO: validation progress: 95%|#########5| 20/21 [00:05<00:00, 3.77it/s]
2023-04-26 18:07:03,217 - INFO: validation progress: 100%|##########| 21/21 [00:05<00:00, 3.22it/s]
2023-04-26 18:07:03,269 - INFO: validation progress: 100%|##########| 21/21 [00:05<00:00, 3.69it/s]
2023-04-26 18:07:03,449 - INFO: Mean validation loss: 2.13012
2023-04-26 18:07:03,520 - INFO: Validation BLEU: 0.00000
2023-04-26 18:07:03,891 - INFO: Saving last model checkpoint: val_loss 2.1301, val_BLEU 0.0 to output/demo_oasst-data/
all done

[BUG] Neptune logging not working

πŸ› Bug

It appears the new UX upgrade in Neptune has broken the logging, or it is another issue.

Logs on Neptune are empty.

The suggestion is to upgrade the client and try using append instead of log, as described in the docs.
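
A hedged example of the suggested change with a recent neptune client (the project name is a placeholder); append replaces the deprecated log call:

import neptune

run = neptune.init_run(project="my-workspace/llm-studio")  # placeholder project

for step, loss in enumerate([2.24, 0.29, 0.26]):
    # old client: run["train/loss"].log(loss)
    run["train/loss"].append(loss, step=step)

run.stop()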

[FEATURE] Support nested tree conversation data

🚀 Feature

Support tree-like conversation data - i.e. chains of thoughts such as the OASST data provides.

Motivation

Currently, we only support prompt/output data structures. While one can manually add previous conversations to the input dataframe, it would be very helpful to support this out of the box.

This will be specifically helpful for conversational bots with history.

Proposed features & solution

A first part of the solution is to support providing such information in the input data. I am proposing the following:

  1. Allow setting a parent_column in the dataset. This column will allow linking individual conversations with each other.
  2. Have potentially two extra settings - could also be only one:
    • Probability to link conversations together while training
    • Number of conversations to link together

Additionally, I would suggest adding an augmentation setting:

  • Random probability to link any random conversations together.

This might help to differentiate between unrelated and related context.
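
A minimal sketch of how a parent_column could be consumed when building a training sample (column names, the prompt template, and the per-link probability are illustrative assumptions):

import random
import pandas as pd

def build_chained_prompt(df: pd.DataFrame, row_id: str, link_probability: float = 0.5) -> str:
    # Walk parent links upwards, prepending previous prompt/answer turns
    current = df.loc[row_id]
    parts = [f"<|prompt|>{current['prompt']}<|answer|>"]
    parent_id = current["parent_id"]
    while pd.notna(parent_id) and random.random() < link_probability:
        parent = df.loc[parent_id]
        parts.insert(0, f"<|prompt|>{parent['prompt']}<|answer|>{parent['answer']}")
        parent_id = parent["parent_id"]
    return "".join(parts)

# Each row may reference an earlier conversation turn via parent_id
df = pd.DataFrame(
    {
        "prompt": ["Hi", "How are you?"],
        "answer": ["Hello!", "Great, thanks."],
        "parent_id": [None, "a"],
    },
    index=["a", "b"],
)
print(build_chained_prompt(df, "b"))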

Potential additional tasks:

  • Update README
  • Update configs
  • Update Kaggle Dataset notebook

[BUG] The responses from LLM studio and HF differ greatly

πŸ› Bug

I am not sure if this is a bug or just my lack of understanding. However, I have fine-tuned a model (EleutherAI/pythia-1b) to generate Cypher statements. In the studio chat, it all looks fantastic:
Screenshot from 2023-04-21 15-37-48

I have uploaded the weights to HF and tried to replicate the text generation with transformers:

from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import GPTNeoXForCausalLM

device = "cuda:0"

tokenizer = AutoTokenizer.from_pretrained("tomasonjo/movie-generator-small")

model = GPTNeoXForCausalLM.from_pretrained("tomasonjo/movie-generator-small").to(device)


inputs = tokenizer("Create a Cypher statement to answer the following question:What movies did Tom Hanks star in?", return_tensors="pt").to(device)
tokens = model.generate(**inputs, max_new_tokens=256).to(device)
tokenizer.decode(tokens[0])

I get the following result:

Create a Cypher statement to answer the following question:What movies did Tom Hanks star in?Create a movieCreate a movieCreate a movieSend this message to:Create a movieCreate a movieSend this message to:Create a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate a movieCreate

What am I missing?

To Reproduce

  1. Fine-tune a model: EleutherAI/pythia-1b
  2. Push the model to HF
  3. Run model in transformers

I've tried both AutoModelForCausalLM and GPTNeoXForCausalLM

I can share the training data if needed, and have also made the HF model public

[CODE IMPROVEMENT] Store weights in AutoModelForCausalLM format

🔧 Proposed code refactoring

Currently, model weights are stored in the LLM Studio format, which is a small wrapper around AutoModelForCausalLM. Instead, store the model weights in AutoModelForCausalLM format, as well as the tokenizer/model configs, in the experiment output directory.

Motivation

This allows directly using the output directory within the Huggingface universe.
Related to #10.
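
A hedged sketch of what the proposal amounts to (the hf_checkpoint subdirectory name is illustrative): persist the unwrapped backbone and the tokenizer with the standard Hugging Face methods into the experiment output directory.

import os

def save_in_hf_format(model, tokenizer, output_dir: str) -> None:
    hf_dir = os.path.join(output_dir, "hf_checkpoint")
    # model.backbone is the wrapped AutoModelForCausalLM instance
    model.backbone.save_pretrained(hf_dir)  # weights + config.json
    tokenizer.save_pretrained(hf_dir)       # tokenizer files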

[FEATURE] GPT_EVAL_MAX editable via UI

🚀 Feature

As far as I can see, there is currently no control in the UI that lets me edit the mentioned environment variable, which means the statement WARNING: More than 100 validation records. Safeguarding against OpenAI API costs. Setting metric to BLEU. Change GPT_EVAL_MAX to run GPT gets "swallowed" in the log.

Motivation

If possible, it would probably make sense not to have to reboot the system to set this value. It might be unlikely that several experiments run in parallel - nevertheless, this could probably be avoided.

What do you think?
