
ymcui / chinese-llama-alpaca


Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Home Page: https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki

License: Apache License 2.0

Python 97.61% Shell 2.39%
llm plm pre-trained-language-models alpaca llama nlp quantization large-language-models lora alpaca-2

chinese-llama-alpaca's People

Contributors

airaria, alkaideemo, bigbaldy1128, dependabot[bot], gogojoestar, guohao, imounttai, iyorke, sgsdxzy, sunyuhan19981208, ymcui


chinese-llama-alpaca's Issues

Suggestion

It would be great if a model specialized for programming could be trained.

About load_in_8bit

Hello, a quick question: when training Alpaca, was load_in_8bit set to True?
On my side, multi-GPU training only runs when load_in_8bit is set to True; otherwise I get an error like "Expected all tensors to be on the same device, but found at least two devices".

model = LlamaForCausalLM.from_pretrained(
    "../Chinese-LLaMA-Alpaca/zh_extended_llma/",
    load_in_8bit=True,
    device_map=device_map,
    max_memory=max_memory,
)
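For multi-GPU loading, a common pattern is to let accelerate place layers with `device_map="auto"` and an explicit per-device memory budget. This is a sketch only: the helper names (`build_max_memory`, `load_sharded`) are hypothetical, the GiB figures are illustrative, and `load_in_8bit=True` requires the bitsandbytes package.

```python
def build_max_memory(n_gpus, gpu_gib, cpu_gib=48):
    """Per-device memory budget in the format accelerate's dispatcher expects."""
    budget = {i: f"{gpu_gib}GiB" for i in range(n_gpus)}
    budget["cpu"] = f"{cpu_gib}GiB"
    return budget

def load_sharded(model_path, n_gpus, gpu_gib):
    # Heavy imports kept local so the helper above stays importable anywhere.
    import torch
    from transformers import LlamaForCausalLM

    return LlamaForCausalLM.from_pretrained(
        model_path,
        load_in_8bit=True,             # 8-bit weights; needs bitsandbytes
        device_map="auto",             # accelerate assigns layers per budget
        max_memory=build_max_memory(n_gpus, gpu_gib),
        torch_dtype=torch.float16,
    )
```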

Question: details of the pre-training stage

Could you share the code (or the general approach) for the stage where the transformer parameters are frozen and only the embeddings are trained? Is it something like this:
import torch
import torch.optim as optim

optimizer = optim.Adam(model.embeddings.parameters(), lr=1e-3)

for epoch in range(num_epochs):
    for batch in data:
        inputs, labels = batch

        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
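One detail the snippet above leaves out is explicitly disabling gradients on the frozen transformer. A minimal sketch of that step (assumption: the trainable embedding parameters can be identified by "embed" in their name, as with LLaMA's `model.embed_tokens`):

```python
def freeze_all_but_embeddings(model):
    """Disable gradients everywhere except the embedding parameters.

    Returns the names left trainable, which is handy as a sanity check.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = "embed" in name   # naming assumption, see above
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

The optimizer should then receive only the trainable parameters, e.g. `optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)`.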

Does the model run inference on the CPU or the GPU?

Running the default ./main program, I noticed the GPU does nothing (neither compute nor VRAM is used), yet generation is already quite fast. I have CUDA configured in my environment, but it clearly is not being invoked. How do I enable CUDA for inference?
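For context, llama.cpp's ./main ran inference on the CPU at the time; having CUDA installed does not change that. To use the GPU, one option is loading the Hugging Face-format weights with transformers instead. A sketch, with placeholder paths and hypothetical helper names:

```python
def pick_device():
    """Prefer CUDA when torch sees a GPU; fall back to CPU otherwise."""
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"

def generate_once(model_dir, prompt, max_new_tokens=128):
    # Heavy imports kept local so pick_device() works without transformers.
    from transformers import LlamaForCausalLM, LlamaTokenizer

    device = pick_device()
    tokenizer = LlamaTokenizer.from_pretrained(model_dir)
    model = LlamaForCausalLM.from_pretrained(model_dir).to(device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```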

Are the prompts added during fine-tuning still the original English ones?

When fine-tuning on Chinese corpora, do you still use the prompts from the original alpaca.json, for example:
Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n
So far I haven't been able to reproduce letter or essay writing; everything else seems fine.
So the same template is prepended at inference time as well?
Many thanks for any answer!
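For reference, applying the template quoted above in code looks like this (the template string is copied verbatim from the original Alpaca prompt; the function name is illustrative):

```python
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction):
    """Wrap a raw instruction in the Alpaca prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)
```

The same wrapping is applied at inference time, so the model sees prompts shaped exactly like its training examples.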

Converting the full model (.pth) to a Hugging Face model: problems and fixes

Problem

After step 3 I obtained the .pth file consolidated.00.pth. I wanted to load the model through Hugging Face, so the .pth format has to be converted to .bin, but the conversion failed:

# --input_dir is the directory where the .pth file was saved
python convert_llama_weights_to_hf.py \
    --input_dir /xxx/llama_alpaca_zh_HIT \
    --model_size 7B \
    --output_dir your/output/path

FileNotFoundError: [Errno 2] No such file or directory: '/xxx/llama_alpaca_zh_HIT/7B/params.json'
as well as this error:
FileNotFoundError: [Errno 2] No such file or directory: '/xxx/llama_alpaca_zh_HIT/tokenizer.model'

Fix

Simply create a folder named 7B inside llama_alpaca_zh_HIT and put both params.json and consolidated.00.pth into it; then copy the tokenizer.model produced in step 2 (the LLaMA + Chinese-vocabulary tokenizer) into the llama_alpaca_zh_HIT folder. The export to Hugging Face format then works, and subsequent use no longer requires llama.cpp.

Reference for loading the converted model

https://github.com/LC1332/Chinese-alpaca-lora/blob/main/notebook/evaluation_code.ipynb
The LoRA-merge step (PeftModel.from_pretrained) is not needed; just load and use the model directly.

Model merging: step 2 did not produce a .pth file

During model merging, step 2 did not generate a .pth file in the extended_model directory; the files generated were:
config.json
generation_config.json
pytorch_model-00001-of-00002.bin
pytorch_model-00002-of-00002.bin
pytorch_model.bin.index.json
special_tokens_map.json
tokenizer.model
tokenizer_config.json

About the LoRA structure in the model

Hello, and thanks for sharing the LLaMA model with the extended Chinese vocabulary. I have a question about the model structure: the training details say that LoRA was also used during pre-training, so why can't I see any LoRA modules in the model structure after merging? Or do I still need to load adapter_model.bin alongside the merged model?
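To answer in code form: merging folds the LoRA update into the base weights, which is why no LoRA modules remain afterwards and no adapter_model.bin is needed. Conceptually each merged weight becomes W' = W + (alpha / r) * (B @ A). Below, a toy scalar version of that formula plus a peft-based merge sketch (paths are placeholders; the function names are illustrative):

```python
def merged_weight(w, lora_ba, alpha, r):
    """Toy scalar version of W' = W + (alpha / r) * (B @ A)."""
    return w + (alpha / r) * lora_ba

def merge_lora(base_model_path, lora_path, out_dir):
    from peft import PeftModel
    from transformers import LlamaForCausalLM

    base = LlamaForCausalLM.from_pretrained(base_model_path)
    model = PeftModel.from_pretrained(base, lora_path)
    merged = model.merge_and_unload()   # folds B @ A into the base weights
    merged.save_pretrained(out_dir)     # adapter weights no longer separate
```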

How to implement a chat-style interface

Hello, and thanks for sharing. How can continuous question answering be implemented?
Is it done by appending the previous inputs and outputs to the next input each time?
Thanks!
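Yes, the usual scheme is exactly that: concatenate the previous turns into the next prompt. A minimal sketch (the turn markers follow the Alpaca template; the function name is illustrative):

```python
def build_chat_prompt(history, user_input):
    """history: list of (instruction, response) pairs from earlier turns."""
    parts = []
    for question, answer in history:
        parts.append(f"### Instruction:\n{question}\n\n### Response:\n{answer}\n\n")
    parts.append(f"### Instruction:\n{user_input}\n\n### Response:\n")
    return "".join(parts)
```

Each turn, append (user_input, model_reply) to history; once the concatenation approaches the context window, drop the oldest turns.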

Is the documentation wrong here?

In "Quick local deployment", Step 2 (generate the quantized model) says to point to the .pth model file obtained in the last step of [model merging].

But after merging, what you actually get is a .bin file.

How well does Chinese-English translation work?

Thanks for your efforts; Chinese support has made great strides. I saw that the training data includes translation pairs. How well does translation work in practice?

How long can the prompt be?

GPT-3.5 allows 4K tokens; what is the limit in this project? If the context can be long, a lot of reference material could be embedded in the prompt.
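For reference, the original LLaMA models were trained with a 2048-token context (`max_position_embeddings` in the Hugging Face config), and prompt plus generated tokens must fit in that budget. A sketch (the budget helper is illustrative):

```python
def fits_in_context(prompt_tokens, max_new_tokens, n_ctx=2048):
    """True when the prompt plus the planned generation fits the window."""
    return prompt_tokens + max_new_tokens <= n_ctx

def context_window(model_dir):
    # Reads the limit from a converted model's config (needs transformers).
    from transformers import AutoConfig
    return AutoConfig.from_pretrained(model_dir).max_position_embeddings
```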

Error: cannot load "LlamaForCausalLM"

Are there version requirements for transformers and torch? The transformers installed directly via pip has no llama folder; I downloaded it from GitHub and added llama, but it still errors. I tried both torch 2.0.0 stable and the 2.1.0 preview, and I'm stuck at the add-vocab step.
PS C:\Users\1> pip freeze
(output of an Anaconda environment; trimmed here to the packages relevant to the error)
filelock==3.10.7
huggingface-hub==0.13.3
numpy==1.24.2
regex==2023.3.23
sentencepiece==0.1.97
tokenizers==0.13.2
torch==2.0.0
tqdm==4.65.0
transformers==4.27.4
typing_extensions==4.5.0

How can safetensors models be merged?

I'm using the 7B HF model in safetensors format (Safe-LLaMA-HF (3-26-23)); the merge script fails, reporting that only *.bin and *.ckpt models are supported.

I tried adding safe_serialization=True in merge_llama_with_chinese_lora.py, e.g.:

base_model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    load_in_8bit=False,
    torch_dtype=torch.float16,
    device_map={"": "cpu"},
    safe_serialization=True,
)

but that doesn't work either. How is a LoRA merge done with a safetensors model? Do I have to convert safetensors to ckpt first?
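If the merge script only accepts *.bin/*.ckpt files, one workaround is converting each safetensors shard back to a PyTorch state dict first. A sketch (requires the `safetensors` and `torch` packages; the filename helper is illustrative, and this assumes the script accepts a plain state-dict .bin; any shard index json may also need its filenames updated):

```python
import os

def bin_name(path):
    """model-00001-of-00002.safetensors -> model-00001-of-00002.bin"""
    root, _ = os.path.splitext(path)
    return root + ".bin"

def safetensors_to_bin(src_path):
    import torch
    from safetensors.torch import load_file

    state_dict = load_file(src_path)            # tensors only, no pickled code
    torch.save(state_dict, bin_name(src_path))  # re-save in .bin format
    return bin_name(src_path)
```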

Help: on Windows, why does cmake not produce the main and quantize binaries? I followed every step.

D:\Chinese-LLaMA_Alpaca\llama.cpp>cmake .
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22000.0 to target Windows 10.0.22621.
-- The C compiler identification is MSVC 19.35.32216.1
-- The CXX compiler identification is MSVC 19.35.32216.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- x86 detected
-- Configuring done (11.7s)
-- Generating done (0.1s)
-- Build files have been written to: D:/Chinese-LLaMA_Alpaca/llama.cpp

About Alpaca and LoRA

At which step is LoRA applied? My understanding so far: 1. LLaMA got vocabulary extension and Chinese pre-training; 2. fine-tuning followed the Alpaca recipe. Is LoRA applied on top of the Alpaca stage, or was there no full fine-tune at all, with LoRA used directly?

Cannot reproduce

I'm using the latest llama code, on Windows 10.

It looks like the prompts are wrong.

DefaultCPUAllocator: can't allocate memory

I hit this at the "merge LoRA weights and generate the full model weights" step. The allocation that fails is only a few hundred MB, which is puzzling; it doesn't feel like a resource problem. Could you share the machine (or recommended specs) you used?
I also tried switching cpu to cuda for this step, but it still errors out.

The model's ability to understand Chinese documents

Hi, have you tried using the model for extractive reading comprehension and reasoning over Chinese documents?
I tried some simple examples and the results don't seem very good.

Repetition and incoherent rambling

The commands I ran are basically the ones in the README. I started with the llama-hf model, then saw the README says to use the Alpaca model for a chatbot.
I'm not sure whether I did something wrong: using the same merge method from this repo, I first merged an Alpaca LoRA into llama-hf and converted the resulting .pth to HF format, using LLaMA's tokenizer.
Then I performed this repo's chinese-llama-alpaca merge, using the extended tokenizer.
But my results always look like this:

./main -m zh-models/7B/ggml-model-f16.bin --color -f ./prompts/alpaca.txt -ins -c 2048 --temp 0.2 -n 256 --repeat_penalty 1.3
main: seed = 1680341181
llama_model_load: loading model from 'zh-models/7B/ggml-model-f16.bin' - please wait ...
llama_model_load: n_vocab = 49953
llama_model_load: n_ctx   = 2048
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 1
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 13134.19 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 14926.27 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from 'zh-models/7B/ggml-model-f16.bin'
llama_model_load: model size = 13133.53 MB / num tensors = 291
llama_init_from_file: kv self size  = 1024.00 MB

system_info: n_threads = 20 / 20 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
main: interactive mode on.
Reverse prompt: '### Instruction:

'
sampling: temp = 0.200000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
generate: n_ctx = 2048, n_batch = 8, n_predict = 256, n_keep = 21


== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - If you want to submit another line, end your input in '\'.

 Below is an instruction that describes a task. Write a response that appropriately completes the request.
> Can you explain in detail how to make a pizza?
Pizza is a traditional delicacy made from flour, meat filling, vegetables and other ingredients. The usual method is to roll the dough into thin sheets, add an appropriate amount of meat filling and assorted vegetables and fruit, then some seasoning, and finally steam it until done. A well-made pizza is not only delicious but also nutritious! How can I improve my spoken English in a short time? I am currently preparing for the IELTS and TOEFL exams, but my speaking is still rather weak (mainly expression).<br>Could the experts please tell me how to improve my speaking quickly?<br><b>This question has been included in the Q&A community roundtable: <b><li><ol><li>Please describe your current English level in detail. Thanks for the invite! I am currently preparing for the IELTS and TOEFL exams, but my speaking is still rather weak (mainly expression). Could the experts please tell me how to improve my speaking quickly? 1. Listen to more English songs, films and other programs. 2. Do some dialogue practice every day. 3. When reading and writing, focus on building vocabulary and using grammatical structures correctly.
> 

Not only does it repeat itself, it goes completely off the rails. In the web UI it just ends up talking to itself. Any insight?
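One knob relevant here is `--repeat_penalty` (set to 1.3 in the command above). The rule llama.cpp and transformers apply is the CTRL-style penalty: logits of tokens already generated are divided by the penalty when positive and multiplied when negative. A toy, pure-Python sketch of that rule:

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """CTRL-style penalty: damp logits of tokens already generated."""
    out = list(logits)
    for token_id in set(generated_ids):
        if out[token_id] > 0:
            out[token_id] /= penalty    # make a likely repeat less likely
        else:
            out[token_id] *= penalty    # push an unlikely repeat further down
    return out
```

Raising the penalty damps loops like the one above at the cost of fluency; it cannot, however, compensate for a merge done with the wrong tokenizer.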

Using llama.cpp and alpaca.cpp, multi-turn dialogue hits the same problem: after several turns the model falls apart

The same setup behaves essentially the same under llama.cpp and alpaca.cpp:
after several turns the output stops matching the context and turns into rambling.

> You are a wandering swordsman. Just as you reach a bridge, a jianghu bully comes walking from the other side, and rage wells up in your chest.
I stand at the ready, preparing to fight.
> You draw your sword and thrust at him. He doesn't even flinch, looking completely unruffled; clearly he doesn't take you seriously.
So he was just pondering how to counter my next move!
> He catches the bare blade with his hands; your sword is now under his control.
I immediately realize I was too eager to attack; I should have watched his reaction first.
> What do you plan to do next?
I will adjust my tactics and look for an opening to defeat this bully!
> Use your ultimate move!
I will use my swordsmanship and agility, combined with my speed and strength, so that he cannot withstand me.
> You leap into the air and land a kick square on his chest.
Exactly! I have knocked my opponent down! He is lying on the ground now, and I can move on!
> But he quickly gets back up, swinging an iron hammer viciously at you.
Haha... so I hadn't realized he had such a powerful attack. All I can do now is dodge his blows with my speed, shake him off as fast as possible, and keep going!
> Use your speed!
I quicken my pace and run into the distance to avoid his attack, while preparing to use my agility to find a chance to counter... (note: omitted here).
> Just as you are about to escape, the iron hammer whooshes out of his hand, flying straight at you.
I dodge quickly, then evade his attack with my speed and nimble footwork! Now I can keep going... (note: omitted here).
> But you are a step too slow; the hammer hits your calf.
Haha... so I hadn't realized he was that fast. All I can do is dodge his blows with my speed and run backward! Now I can keep going! (note: omitted here).
> Your calf is injured; you are far slower than before.
Haha... so I hadn't realized he was that fast. All I can do is dodge his attacks with my speed and run backward! Now I can keep going! (note: omitted here).
> Now he walks over and grinds you into the ground.
Haha... so I hadn't realized he was that fast. All I can do is dodge his attacks with my speed and run backward! Now I can keep going! (note: omitted here).

A follow-up question on your view of what the pre-training stage contributes

When fine-tuning LLaMA with LoRA, I found that feeding it factual samples does not make the model learn the facts. Do you think factual learning happens in the pre-training stage, while SFT only does alignment and improves the model's understanding and expression without adding factual knowledge (or is it that LoRA only inserts adapters into the attention layers and therefore cannot add new facts)?

Can the merged full-weight model be fine-tuned further?

Hello. Following your steps I generated the full model weights and converted them to Hugging Face format. Can I fine-tune further on top of the full weights? I tried loading the weights for training, but training errors out. Is it that the converted Hugging Face weights cannot be used for continued training?
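Continued fine-tuning on merged HF weights is possible in principle by attaching a fresh LoRA adapter with peft. A sketch (hypothetical helper names; target modules match the log below). The parameter-count helper is a handy cross-check: with r=8 on the 4096-dim q_proj and v_proj of all 32 LLaMA-7B layers, 64 modules x 8 x (4096 + 4096) = 4,194,304, which matches the "trainable params: 4194304" line in the log.

```python
def lora_param_count(r, d_in, d_out, n_modules):
    """LoRA trainable params: each module adds A (r x d_in) + B (d_out x r)."""
    return n_modules * r * (d_in + d_out)

def attach_fresh_lora(model_dir, r=8, alpha=16, dropout=0.05):
    from peft import LoraConfig, get_peft_model
    from transformers import LlamaForCausalLM

    base = LlamaForCausalLM.from_pretrained(model_dir)
    config = LoraConfig(
        r=r,
        lora_alpha=alpha,
        lora_dropout=dropout,
        target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )
    return get_peft_model(base, config)  # only the adapters are trainable
```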

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Training Alpaca-LoRA model with params:
base_model: /mnt/hf/luruixuan/Chinese-LLaMA-Alpaca/chinese-alpaca-lora-7b_complete_hf
data_path: data/alpaca_train_shuf_0330.json
output_dir: ./Chinese-LLaMA-Alpaca/chinese-alpaca-lora-7b/lora-alpaca_fp16_batch64
batch_size: 64
micro_batch_size: 1
num_epochs: 3
learning_rate: 0.0001
cutoff_len: 1300
val_set_size: 0
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'v_proj']
train_on_inputs: True
group_by_length: True
wandb_project:
wandb_run_name:
wandb_watch:
wandb_log_model:
resume_from_checkpoint: None

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████| 2/2 [00:05<00:00, 2.93s/it]
Downloading and preparing dataset json/default to /home/jovyan/.cache/huggingface/datasets/json/default-3d44ff85e8e2e508/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...
Downloading data files: 100%|███████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 10255.02it/s]
Extracting data files: 100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 13.70it/s]
Dataset json downloaded and prepared to /home/jovyan/.cache/huggingface/datasets/json/default-3d44ff85e8e2e508/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.
100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 687.93it/s]
trainable params: 4194304 || all params: 6889689088 || trainable%: 0.060877986603275876
  0%|          | 0/1875 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "finetune_lora.py", line 293, in <module>
    fire.Fire(train)
  File "/opt/conda/lib/python3.7/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/conda/lib/python3.7/site-packages/fire/core.py", line 480, in _Fire
    target=component.__name__)
  File "/opt/conda/lib/python3.7/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "finetune_lora.py", line 260, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 1643, in train
    ignore_keys_for_eval=ignore_keys_for_eval,
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 1906, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 2652, in training_step
    loss = self.compute_loss(model, inputs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 2684, in compute_loss
    outputs = model(**inputs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/peft-0.3.0.dev0-py3.7.egg/peft/peft_model.py", line 537, in forward
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/llama/modeling_llama.py", line 709, in forward
    shift_logits = shift_logits.view(-1, self.config.vocab_size)
RuntimeError: shape '[-1, 32000]' is invalid for input of size 65090062
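The shape error above is worth decomposing: 65090062 does not divide evenly by the original LLaMA vocabulary size (32000), but it does factor cleanly by 49954, the expanded Chinese-Alpaca vocabulary size, which suggests the logits carry the expanded vocabulary while `config.vocab_size` still says 32000. A quick arithmetic check mirroring the traceback's numbers:

```python
# RuntimeError: shape '[-1, 32000]' is invalid for input of size 65090062
numel = 65_090_062
for vocab in (32000, 49954):  # original LLaMA vs. Chinese-Alpaca expanded tokenizer
    q, rem = divmod(numel, vocab)
    print(f"vocab={vocab}: {q} positions, remainder {rem}")
# 32000 leaves a nonzero remainder; 49954 divides exactly (49954 * 1303)
```

If that is the cause, verifying `len(tokenizer) == model.config.vocab_size` before training, and if they differ, fixing the config or calling `model.resize_token_embeddings(len(tokenizer))`, is the usual remedy.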

Error when merging the model

When running the merge, I get the error `ValueError: weight is on the meta device, we need a value to put in on cpu.` How should I fix this?

Consider hosting the weight files via GitHub Releases

Hello author, great project!

Downloading the weight files from Baidu Netdisk is a rather poor experience, especially without a paid membership.

I see you have also uploaded them to Google Drive, which should be a much better experience. But there is an even more direct option: when publishing/editing a GitHub Release, you can upload assets, and the assets can be quite large:

[screenshot: uploading assets in the GitHub Release editor]

Here is an example of the result: https://github.com/nanmu42/tart/releases/tag/v1.0.0

Perhaps you could consider uploading a copy this way; it is more direct, and also friendly to tools like wget.

Thanks.

What causes the error when running the last step?

llama_model_load: loading tensors from 'zh-models/7B/ggml-model-q4_0.bin'
llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
llama_init_from_file: failed to load model
main: error: failed to load model 'zh-models/7B/ggml-model-q4_0.bin'

Differences between this tokenizer's special tokens and the original's

Hi, what special tokens were used during training? I see that in alpaca, pad, bos, eos, and unk are all </s>. Did you use <unk>, <s>, </s>, <unk> during training? Thanks.
