linonetwo / moss-dockerfile


License: MIT License

Topics: chatgpt, docker, moss, chatglm, ai, cuda, deeplearning, pytorch, gpu


MOSS-DockerFile

https://hub.docker.com/repository/docker/linonetwo/moss/general

Runs Fudan University's MOSS language model inside Docker, with a Gradio-powered WebUI.

Run

Runtime environment

For setting up a Docker environment on Windows, see the Zhihu article 2022 最新 Docker 和 WSL2,炼丹环境配置指南 (by 无人知晓). Then use Docker Desktop to download and manage the image.

The smallest int4-quantized version needs a single 3090 Ti. It occupies about 14 GB of VRAM, and usage grows as the context gets longer.

Download the model

See https://huggingface.co/fnlp/moss-moon-003-sft-int4:

# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/fnlp/moss-moon-003-sft-int4

# if you want to clone without large files – just their pointers –
# prepend your git clone with the following env var:
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/fnlp/moss-moon-003-sft-int4

Personally, I first run GIT_LFS_SKIP_SMUDGE=1 git clone to fetch config.json and the other small files, then download the 10 GB pytorch_model.bin separately in a browser, which is faster.

Afterwards, mount this moss-moon-003-sft-int4 folder as a volume at /mnt/llm inside the container.
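Before launching the container, it can save a debugging round to check that the folder you are about to mount really contains the model files rather than git-lfs pointer stubs. A minimal preflight sketch (the helper name and the 1 KB threshold are my own, not part of this repo):

```python
import os

# Files MOSS needs to load the int4 checkpoint: the config plus the weights.
REQUIRED_FILES = ["config.json", "pytorch_model.bin"]

def check_model_dir(path: str) -> list:
    """Return a list of problems found in the model directory (empty = looks OK)."""
    problems = [f for f in REQUIRED_FILES
                if not os.path.isfile(os.path.join(path, f))]
    bin_path = os.path.join(path, "pytorch_model.bin")
    # A git-lfs pointer stub is only ~130 bytes; the real int4 weights are ~10 GB.
    if os.path.isfile(bin_path) and os.path.getsize(bin_path) < 1024:
        problems.append("pytorch_model.bin is an lfs pointer stub, not real weights")
    return problems
```

Run it against your local model folder before the docker run step; any non-empty result means the clone or browser download is incomplete.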

Run docker

Example shell command to run it on Windows 11 with WSL2 enabled:

docker run --gpus all -it --volume=C:\Users\linonetwo\Documents\model\LanguageModel:/mnt/llm -d moss:latest

My models are inside C:\Users\linonetwo\Documents\model\LanguageModel, which includes C:\Users\linonetwo\Documents\model\LanguageModel\MOSS\moss-moon-003-sft-plugin-int4\pytorch_model.bin.

If your models are in a different location, change the --volume=xxx:/mnt/llm part of the command.

In Docker Desktop, it looks like this:

(screenshot: llm-volume)

Build

If you modify the Dockerfile, rebuild the image like this:

docker build -t moss . -f ./moss-int4-cuda117.dockerfile

Screenshots

(screenshot: images/webui-demo.jpg)

(screenshot: images/resource-usage.jpg)

FAQ

Killed

If Docker reports Killed, it is usually because of insufficient memory (RAM); increase the amount of memory WSL2 gives the virtual machine:

See https://stackoverflow.com/a/72693871/4617295
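The linked answer generally amounts to editing %UserProfile%\.wslconfig on the Windows side. The values below are illustrative, not recommendations; size them to your machine:

```ini
# %UserProfile%\.wslconfig — example values, adjust to your hardware
[wsl2]
memory=24GB
swap=8GB
```

Run wsl --shutdown afterwards so the new limits take effect, then restart Docker Desktop.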

failed to solve with frontend dockerfile.v0

 => ERROR [internal] load metadata for docker.io/nvidia/cuda:11.7.1-cudnn8-devel-ubuntu22.04                                    0.2s
------
 > [internal] load metadata for docker.io/nvidia/cuda:11.7.1-cudnn8-devel-ubuntu22.04:
------
failed to solve with frontend dockerfile.v0: failed to create LLB definition: unexpected status code [manifests 11.7.1-cudnn8-devel-ubuntu22.04]: 403 Forbidden

You may need to disable BuildKit; see https://stackoverflow.com/a/70153060/4617295

Failed to run image

Probably ENTRYPOINT was not set, so the container tried to execute the base image's /opt/nvidia/nvidia_entrypoint.sh, which has been deleted:

Failed to run image. Error invoking remote method 'docker-run-container': Error: (HTTP code 400) unexpected - failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/opt/nvidia/nvidia_entrypoint.sh": stat /opt/nvidia/nvidia_entrypoint.sh: no such file or directory: unknown

Adding it back fixes it:

ENTRYPOINT ["/usr/bin/python3"]
CMD ["moss_web_demo_gradio.py"]
# ENTRYPOINT ["/usr/bin/env"]
# CMD ["bash"]

moss_web_demo_gradio.py

Adapted from https://github.com/OpenLMLab/MOSS/blob/main/moss_web_demo_gradio.py . The upstream version does not handle quantized models, so it cannot run on an ordinary PC; this copy adds support for the quantized checkpoints and automatically works around the name 'autotune' is not defined error raised by the official demo.

GPU not found

The key is to pass --gpus all, i.e. docker run --gpus all

Credit

Based on mortals-debuging/pytorch-docker

Contributors

commissarster, linonetwo


Issues

Running the docker command on Windows 11 reports an error

/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py:614 in _get_config_dict

    # Load from local folder or from cache or download fro
    resolved_config_file = cached_file(
        pretrained_model_name_or_path,
        configuration_file,
        cache_dir=cache_dir,

/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:409 in cached_file

    # Load from URL or cache if already cached
    resolved_file = hf_hub_download(
        path_or_repo_id,
        filename,
        subfolder=None if len(subfolder) == 0 else subfolder,

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py:112 in _inner_fn

    if arg_name in ["repo_id", "from_id", "to_id"]:
        validate_repo_id(arg_value)

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py:160 in validate_repo_id

    if repo_id.count("/") > 1:
        raise HFValidationError(
            "Repo id must be in the form 'repo_name' or 'namespace/rep
            f" '{repo_id}'. Use repo_type argument if needed."
        )

HFValidationError: Repo id must be in the form 'repo_name' or
'namespace/repo_name': '/mnt/llm/MOSS/moss-moon-003-sft-int4'. Use repo_type
argument if needed.

During handling of the above exception, another exception occurred:

/opt/project/MOSS/moss_web_demo_gradio.py:36

    if ('int8' in args.model_name or 'int4' in args.model_name) and num_gp
        raise ValueError("Quantized models do not support model parallel.

    config = MossConfig.from_pretrained(args.model_name)
    tokenizer = MossTokenizer.from_pretrained(args.model_name)

/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py:532 in from_pretrained

    config_dict, kwargs = cls.get_config_dict(pretrained_model_nam

/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py:559 in get_config_dict

    config_dict, kwargs = cls._get_config_dict(pretrained_model_na

/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py:635 in _get_config_dict

    raise EnvironmentError(
        f"Can't load the configuration of '{pretrained_mod

OSError: Can't load the configuration of '/mnt/llm/MOSS/moss-moon-003-sft-int4'.
If you were trying to load it from 'https://huggingface.co/models', make sure
you don't have a local directory with the same name. Otherwise, make sure
'/mnt/llm/MOSS/moss-moon-003-sft-int4' is the correct path to a directory
containing a config.json file
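The two stacked errors fit together: transformers first looks for config.json in the local folder, and only when that fails does it fall back to treating the string as a Hub repo id, whose validator rejects any id containing more than one "/". A simplified stand-in for that resolution logic (illustrative only, not the real transformers code):

```python
import os

def resolve_config(path_or_repo_id: str) -> str:
    """Sketch of transformers' config resolution: local folder first, Hub second."""
    config_file = os.path.join(path_or_repo_id, "config.json")
    if os.path.isfile(config_file):
        return config_file  # local load succeeds, the Hub is never consulted
    # Fallback: treat the string as a Hub repo id, mirroring
    # huggingface_hub's validate_repo_id check seen in the traceback above.
    if path_or_repo_id.count("/") > 1:
        raise ValueError(
            "Repo id must be in the form 'repo_name' or 'namespace/repo_name': "
            f"'{path_or_repo_id}'"
        )
    return "would download from the Hub: " + path_or_repo_id
```

So the confusing HFValidationError really means "no config.json found at that local path": check that the directory mounted at /mnt/llm actually contains MOSS/moss-moon-003-sft-int4/config.json.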

gradio: add the server_name option

Gradio needs the server_name option set; the default is 127.0.0.1, which makes the WebUI unreachable through Docker's port mapping:

demo.queue().launch(share=False, inbrowser=True, server_name="0.0.0.0", server_port=7860)

Learning from the autotune workaround!

Haha, learned something! A quick-and-dirty way to solve the missing autotune problem:

shutil.copyfile('../moss-moon-003-sft-plugin-int4/custom_autotune.py', '/root/.cache/huggingface/modules/transformers_modules/local/custom_autotune.py')
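A slightly more defensive variant of the same workaround (the function name is mine; the source and destination paths are the ones from the comment above and may differ on your setup). Creating the destination directory first avoids a FileNotFoundError in a fresh cache:

```python
import os
import shutil

def install_custom_autotune(src: str, dst: str) -> str:
    """Copy custom_autotune.py into the transformers_modules cache."""
    # e.g. dst = '/root/.cache/huggingface/modules/transformers_modules/local/custom_autotune.py'
    os.makedirs(os.path.dirname(dst), exist_ok=True)  # create missing parent dirs
    return shutil.copyfile(src, dst)
```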

Cannot start the Docker image

(screenshot)
I pulled the image from the Docker repository; when I run the command with the volume mounted, it reports the "datatime" not-found error shown above, and the container then stops automatically.
What could be causing this? The server uses multiple GPUs.
