yu-yang-li / starwhisper

232.0 2.0 7.0 162.72 MB

StarWhisper：LLM for Astronomy

License: Apache License 2.0

Jupyter Notebook 100.00%

astronomy astrophysics large-language-models llm


starwhisper's Issues

SFT

Dear Dr. Li,
The StarGLM work has inspired me a lot. Which fine-tuning method is it based on?

Question about the model's key idea for generating galaxy images in future work

Your model notes that NGC7714 is a spiral galaxy. That's cool, but it is also a merging galaxy. The galaxy images you generated look artistic rather than scientific. Can your model generate real .fit files for scientific use, or do you intend to generate artistic images for the public only?

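A bit confused, would like to ask a question

The feature showcase in the README says the model was fine-tuned on a 200,000-sample astronomy dialogue dataset.
My question: fine-tuning a large model on its own presumably cannot add new domain knowledge, so did you only do fine-tuning (SFT) here, or did you also do pre-training (PT)?
If fine-tuning alone can add new domain knowledge, roughly what hardware configuration did you use for it? Thank you.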

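In supervised fine-tuning, how exactly should the ratio of general data to domain-specific data be adjusted to mitigate catastrophic forgetting?

Hello, the release 2.0 notes mention: "1. Retraining after dataset cleaning mitigated the catastrophic forgetting of original knowledge that earlier versions showed after Agent/tool-learning training."

Could you share the specific method used in SFT? That includes which general-purpose datasets were used, their exact ratio to the domain-specific data, and the data preprocessing done before training. Is shuffling needed, or any other processing?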
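To make the question concrete, here is a minimal sketch of one common way to mix general and domain-specific SFT examples at a fixed ratio and shuffle them. It is not the StarWhisper authors' method, which the thread does not describe; the function name `mix_sft_data`, the `general_fraction` parameter, and the toy records are all hypothetical.

```python
import random


def mix_sft_data(domain, general, general_fraction, seed=0):
    """Return a shuffled mixture of domain and general SFT examples.

    Every domain example is kept. General examples are sampled without
    replacement so that they make up roughly `general_fraction` of the
    result (fewer if `general` is too small).
    """
    if not 0.0 <= general_fraction < 1.0:
        raise ValueError("general_fraction must be in [0, 1)")
    rng = random.Random(seed)  # fixed seed keeps the mixture reproducible
    # Solve n_general / (len(domain) + n_general) = general_fraction.
    n_general = round(len(domain) * general_fraction / (1.0 - general_fraction))
    n_general = min(n_general, len(general))
    mixed = list(domain) + rng.sample(list(general), n_general)
    rng.shuffle(mixed)  # interleave the two sources before training
    return mixed


# Toy records standing in for real instruction/response pairs (hypothetical).
domain_data = [{"source": "astronomy", "id": i} for i in range(80)]
general_data = [{"source": "general", "id": i} for i in range(100)]
mixture = mix_sft_data(domain_data, general_data, general_fraction=0.2)
```

Which ratio actually limits forgetting for a given model is an empirical question; the sketch only shows the mechanics of building such a mixture.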

Potentially open-sourcing the model on HF and creating a demo there?

Hi YuYang,

Congratulations on your great work! It would be really nice if you could upload the model to the Hugging Face Hub.

This would help model discovery and integration with tools.

For example, this is the ChatGLM repo on Hugging Face. https://huggingface.co/THUDM/chatglm2-6b
With that, the model can be invoked with a few lines of code.

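from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and model from the Hugging Face Hub (needs a CUDA GPU).
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# First turn ("Hello"); `history` carries the conversation state.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
# Example output: 你好👋!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。

# Second turn ("What should I do if I can't sleep at night?"), reusing the history.
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
print(response)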

You can also fork nice demos such as https://huggingface.co/spaces/mikeee/chatglm2-6b-4bit to create your own. That would make the project much easier to use and amplify its impact.

If you run into any issues, feel free to let us know and we're happy to help. :-) My WeChat ID is zhou_a_zhou

Have you considered publishing a paper?

[Feature Request] Support InternLM

Dear StarGLM developer,

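I am 尖米, a developer and volunteer in the InternLM community. Your open-source work has inspired me a great deal. I would like to discuss the feasibility of, and an implementation path for, building StarGLM on InternLM. My WeChat ID is mzm312; I hope we can get in touch for a more in-depth exchange.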

Best regards,
尖米


Number of model training iterations


Training data

Second-stage training data

Agent capability data
