
chatdoctor's Introduction

Yunxiang Li1, Zihan Li2, Kai Zhang3, Ruilong Dan4, Steve Jiang1, You Zhang1

1 UT Southwestern Medical Center, USA
2 University of Illinois at Urbana-Champaign, USA
3 Ohio State University, USA
4 Hangzhou Dianzi University, China

License Python 3.9+ Page

Resources List

Autonomous ChatDoctor with Disease Database Demo.

100k real conversations between patients and doctors from HealthCareMagic-100k.

Real conversations between patients and doctors from icliniq-10k.

Checkpoints of ChatDoctor, link.

Stanford Alpaca data for basic conversational capabilities. Alpaca link.


In a conda env with pytorch available, run:

pip install -r requirements.txt

Interactive Demo Page:

Demo Page: Our model has not yet achieved 100% accurate output, so please do not apply it to real clinical scenarios.

For those who want to try the online demo, please register for a Hugging Face account and fill out this form: link.

Data and model:

1. ChatDoctor Dataset:

You can download the following training datasets:

100k real conversations between patients and doctors from HealthCareMagic-100k.

10k real conversations between patients and doctors from icliniq-10k.

5k generated conversations between patients and physicians from ChatGPT GenMedGPT-5k and disease database.

Our model was first fine-tuned on Stanford Alpaca's data to acquire basic conversational capabilities. Alpaca link

2. Model Weights:

Place the model weights file in the ./pretrained folder.

How to fine-tune

torchrun --nproc_per_node=4 --master_port=<your_random_port> \
   --model_name_or_path <your_path_to_hf_converted_llama_ckpt_and_tokenizer> \
   --data_path ./HealthCareMagic-100k.json \
   --bf16 True \
   --output_dir pretrained \
   --num_train_epochs 1 \
   --per_device_train_batch_size 4 \
   --per_device_eval_batch_size 4 \
   --gradient_accumulation_steps 8 \
   --evaluation_strategy "no" \
   --save_strategy "steps" \
   --save_steps 2000 \
   --save_total_limit 1 \
   --learning_rate 2e-6 \
   --weight_decay 0. \
   --warmup_ratio 0.03 \
   --lr_scheduler_type "cosine" \
   --logging_steps 1 \
   --fsdp "full_shard auto_wrap" \
   --fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \
   --tf32 True
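
For reference, the effective (global) batch size implied by the flags above is the product of the number of processes, the per-device batch size, and the gradient accumulation steps. The short check below uses only values copied from the command and assumes a single node:

```python
# Values copied from the torchrun command above.
nproc_per_node = 4
per_device_train_batch_size = 4
gradient_accumulation_steps = 8

# Sequences consumed per optimizer step across all GPUs (single node assumed).
effective_batch_size = (
    nproc_per_node * per_device_train_batch_size * gradient_accumulation_steps
)
print(effective_batch_size)  # 128
```

If you train on fewer GPUs, raising the accumulation steps by the same factor keeps this product unchanged.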

Fine-tuning with LoRA

WORLD_SIZE=6 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 torchrun --nproc_per_node=6 --master_port=4567 \
  --base_model './weights-alpaca/' \
  --data_path 'HealthCareMagic-100k.json' \
  --output_dir './lora_models/' \
  --batch_size 32 \
  --micro_batch_size 4 \
  --num_epochs 1 \
  --learning_rate 3e-5 \
  --cutoff_len 256 \
  --val_set_size 120 \
  --adapter_name lora

How to run inference

You can build a ChatDoctor model on your own machine and communicate with it.

cd Autonomous_ChatDoctor_csv
cd Autonomous_ChatDoctor_Wikipedia


We fine-tuned our model on HealthCareMagic-100k and tested it on iCliniq-10k.

Patient question from icliniq: Hi doctor,I am getting one-sided headache. I do not get them often. They are behind my right eye and temple area. Now this time I am also fighting with heavy cold and cough. It came on late morning today and it hurts more when I cough or shake my head or bend over. Prior to this, I had the same headache about four months ago. At that time it came on midafternoon and lasted till next morning. Prior to that, I had the same headache exactly on the right side and lasted for a day. I am on CCB and Protonix. I am scared of brain tumor. Please help

Real doctor answer from icliniq: Hi. Based on what you have told, this is more likely to be due to sinusitis. Sinuses are small air filled cavities in your skull bone. They can get infected and when they are infected, you will get headaches behind eyes. Such headaches will be more common during cold and cough. Brain tumor headache will be on both sides, more in forehead and you will also have vomiting in that case. Based on what you have told, this is more likely due to simple sinus infection instead of brain tumor. CT brain and CT paranasal sinuses.

Answer from ChatGPT: Hello! I'm sorry to hear you're experiencing these headaches. Based on the location of the pain, it sounds like it may be a tension headache or a migraine. These types of headaches can sometimes be triggered by colds or other illnesses. However, it's always best to get checked out by your doctor to rule out any other underlying conditions. In terms of a brain tumor, that is a very rare cause of headaches. Most people who experience headaches do not have a brain tumor. That being said, if you have any concerning symptoms such as vision changes, weakness, or numbness, it is important to seek medical attention right away. In the meantime, you can try taking over-the-counter pain relievers such as ibuprofen or acetaminophen to help with the headache. It's also important to get plenty of rest and stay hydrated. If the headaches persist or become more frequent, please make an appointment with your doctor. Take care!

Answer from ChatDoctor: Hi, Thank you for posting your query. First, I would like to reassure you that there is no need to worry about brain tumor. Also, we should keep migraine apart as well. The most likely cause of your headaches could be sinusitis. This can be confirmed by doing an X-ray of face. Treatment would consist of using decongestant tablets/ nasal sprays. Best wishes, Chat Doctor.

ChatDoctor accurately mentions sinusitis, which is also present in the real conversation in icliniq.


ChatDoctor is a next-generation AI doctor model that is based on the LLaMA model. The goal of this project is to provide patients with an intelligent and reliable healthcare companion that can answer their medical queries and provide them with personalized medical advice.

The ChatDoctor is an advanced language model that is specifically designed for medical applications. It has been trained on a large corpus of medical literature and has a deep understanding of medical terminology, procedures, and diagnoses. This model serves as the foundation for ChatDoctor, enabling it to analyze patients' symptoms and medical history, provide accurate diagnoses, and suggest appropriate treatment options.

The ChatDoctor model is designed to simulate a conversation between a doctor and a patient, using natural language processing (NLP) and machine learning techniques. Patients can interact with the ChatDoctor model through a chat interface, asking questions about their health, symptoms, or medical conditions. The model will then analyze the input and provide a response that is tailored to the patient's unique situation.

One of the key features of the ChatDoctor model is its ability to learn and adapt over time. As more patients interact with the model, it will continue to refine its responses and improve its accuracy. This means that patients can expect to receive increasingly personalized and accurate medical advice over time.

Patient-physician Conversation Dataset

The first step in fine-tuning is to collect a dataset of patient-physician conversations. In such conversations, the patient's descriptions of disease symptoms are often colloquial and cursory. Manually constructing a synthesized patient-physician conversation dataset often leads to insufficient diversity and over-specialized descriptions that are far removed from real scenarios. Collecting real patient-physician conversations is a better solution. Therefore, we collected about 100k real doctor-patient conversations from an online medical consultation website, HealthCareMagic. We filtered these data both manually and automatically, removed the identity information of the doctor and patient, and used language tools to correct grammatical errors. We named this dataset HealthCareMagic-100k. In addition, we collected approximately 10k patient-physician conversations from the online medical consultation website iCliniq to evaluate the performance of our model.

Autonomous ChatDoctor based on Knowledge Brain

Equipped with the external knowledge brain, i.e., Wikipedia or our constructed database encompassing over 700 diseases, ChatDoctor can retrieve the corresponding knowledge and reliable sources to answer patients' inquiries more accurately. After constructing the external knowledge brain, we need to let ChatDoctor retrieve the knowledge it needs autonomously, which can generally be achieved in a large language model by constructing appropriate prompts. To automate this process, we design keyword mining prompts for ChatDoctor to extract key terms for relevant knowledge seeking. Then, the top-ranked relevant passages are retrieved from the Knowledge Brain with a term-matching retrieval system. As for the disease database, since the model cannot read all the data at once, we first let the model read the data in batches and select for itself the data entries that might help answer the patient's question. Finally, all the data entries selected by the model are given to the model for a final answer. This approach better ensures that patients receive well-informed and precise responses backed by credible references.
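
The term-matching retrieval step can be illustrated with a minimal sketch. This is not the project's retrieval code: the function names, the scoring rule (a count of keyword occurrences), and the sample passages are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(keywords, passages, top_k=2):
    """Rank passages by how many keyword occurrences they contain (hypothetical scoring)."""
    terms = {t for kw in keywords for t in tokenize(kw)}
    scored = []
    for idx, passage in enumerate(passages):
        counts = Counter(tokenize(passage))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((score, idx))
    # Highest score first; ties keep the original passage order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [passages[idx] for _, idx in scored[:top_k]]


# Illustrative passages, not taken from the project's knowledge base.
passages = [
    "Sinusitis is an inflammation of the sinuses that can cause headache behind the eyes.",
    "Migraine is a recurrent headache, often one-sided.",
    "A sprained ankle causes swelling and pain when walking.",
]
top = retrieve(["headache", "sinusitis"], passages, top_k=2)
print(top)
```

In the pipeline described above, the keywords would come from the model's keyword mining prompt, and the retrieved passages would be placed into the final prompt.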


We emphasize that ChatDoctor is for academic research only and any commercial or clinical use is prohibited. There are three factors in this decision: First, ChatDoctor is based on LLaMA, which has a non-commercial license, so we necessarily inherit that restriction. Second, our model is not licensed for healthcare-related purposes. Third, we have not designed sufficient security measures, and the current model still does not guarantee the full correctness of medical diagnoses.


ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge

  title={ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge},
  author={Li, Yunxiang and Li, Zihan and Zhang, Kai and Dan, Ruilong and Jiang, Steve and Zhang, You},

chatdoctor's People


kent0n-li avatar saharmor avatar



chatdoctor's Issues

Why did you choose to train in two steps?

Thanks for your sharing, your attempt is very interesting and valuable.

However, I have some questions about the training process.

I notice that ChatDoctor is first trained on the 52K instruction-following examples provided by Stanford Alpaca, and then fine-tuned on your own data.

Why not finetune the model using a mixture of two parts of the data?

What is the insight behind this fine-tuning process?

Have you ever tried to train with two pieces of data mixed together?

Training from scratch


Can we train the model from scratch? Do you have plans to release the training code as well without loading pre-trained weights?

HFValidationError when running and files

I got the following error when I run the :

raise HFValidationError(
huggingface_hub.utils.validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'chatDoctor100k/'.
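
One likely cause, stated here as an assumption rather than a confirmed diagnosis, is that 'chatDoctor100k/' does not exist as a local directory, so the loader treats the string as a Hub repository id and rejects it. Checking the path before loading gives a clearer error. The helper name below is hypothetical:

```python
import os


def resolve_local_model_dir(path):
    """Return an absolute path to a local model directory, or fail early with a clear message."""
    abs_path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(abs_path):
        raise FileNotFoundError(
            f"Model directory not found: {abs_path!r}. "
            "Download or convert the weights before loading."
        )
    return abs_path
```

The returned path can then be passed to `from_pretrained` in place of the raw string.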


pytorch pretrained

Greetings, and thank you for this model!

Please, I can't find the file 'pytorch_model-00001-of-00003.bin' and don't know how to obtain or install it.

If you can guide me please, thanks in advance

Question about output in HealthCareMagic-100k.json

Hi, I noticed that "Chat Doctor" appears many times in the outputs of HealthCareMagic-100k.json. For example:
"Hi thanks for contacting Chat Doctor ... Your brother have both hepatitis b and c positive...."
"Hi and welcome to Chat Doctor."
"Hi and welcome to Chat Doctor. Thank you for your query. I am Chat Doctor.."

I wonder if that is caused by some post-processing? Is there any data without these words?
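
To measure how widespread the phrase is, the records can be scanned directly. The sketch below assumes the file is a JSON list of Alpaca-style records with an "output" field; that layout and both function names are assumptions, and the sample records are made up for illustration.

```python
import json


def count_phrase(records, phrase="Chat Doctor", field="output"):
    """Count records whose `field` contains `phrase` (case-sensitive)."""
    return sum(1 for rec in records if phrase in rec.get(field, ""))


def load_records(path):
    """Load a JSON file that holds a list of records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Made-up sample records in the assumed layout.
sample = [
    {"instruction": "...", "input": "...", "output": "Hi and welcome to Chat Doctor."},
    {"instruction": "...", "input": "...", "output": "Please see a physician."},
]
print(count_phrase(sample))  # 1
```

With the real file, `count_phrase(load_records("HealthCareMagic-100k.json"))` would report how many outputs contain the phrase.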

model not found in pretrained section

Error :

(base) hemang@hemang-HP-Pavilion-g6-Notebook-PC:~/Documents/GitHub/ChatDoctor$ python3.11
2023-03-30 16:16:25.135057: I tensorflow/core/platform/] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-30 16:16:26.061195: W tensorflow/compiler/tf2tensorrt/utils/] TF-TRT Warning: Could not find TensorRT
Loading ./pretrained/...
/home/hemang/.local/lib/python3.11/site-packages/torch/cuda/ UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
gpu_count 0
Loading checkpoint shards:   0%|                                                    | 0/3 [00:00<?, ?it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/hemang/.local/lib/python3.11/site-packages/transformers/ in           │
│ load_state_dict                                                                                  │
│                                                                                                  │
│    412 │   │   │   )                                                                             │
│    413 │   │   return safe_load_file(checkpoint_file)                                            │
│    414try:                                                                                  │
│ ❱  415 │   │   return torch.load(checkpoint_file, map_location="cpu")                            │
│    416except Exception as e:                                                                │
│    417 │   │   try:                                                                              │
│    418 │   │   │   with open(checkpoint_file) as f:                                              │
│                                                                                                  │
│ /home/hemang/.local/lib/python3.11/site-packages/torch/ in load              │
│                                                                                                  │
│    788if 'encoding' not in pickle_load_args.keys():                                         │
│    789 │   │   pickle_load_args['encoding'] = 'utf-8'                                            │
│    790 │                                                                                         │
│ ❱  791with _open_file_like(f, 'rb') as opened_file:                                         │
│    792 │   │   if _is_zipfile(opened_file):                                                      │
│    793 │   │   │   # The zipfile reader is going to advance the current file position.           │794 │   │   │   # If we want to actually tail call to torch.jit.load, we need to              │
│                                                                                                  │
│ /home/hemang/.local/lib/python3.11/site-packages/torch/ in _open_file_like   │
│                                                                                                  │
│    268                                                                                           │
│    269 def _open_file_like(name_or_buffer, mode):                                                │
│    270if _is_path(name_or_buffer):                                                          │
│ ❱  271 │   │   return _open_file(name_or_buffer, mode)                                           │
│    272else:                                                                                 │
│    273 │   │   if 'w' in mode:                                                                   │
│    274 │   │   │   return _open_buffer_writer(name_or_buffer)                                    │
│                                                                                                  │
│ /home/hemang/.local/lib/python3.11/site-packages/torch/ in __init__          │
│                                                                                                  │
│    249                                                                                           │
│    250 class _open_file(_opener):                                                                │
│    251def __init__(self, name, mode):                                                       │
│ ❱  252 │   │   super().__init__(open(name, mode))                                                │
│    253 │                                                                                         │
│    254def __exit__(self, *args):                                                            │
│    255 │   │   self.file_like.close()                                                            │
FileNotFoundError: [Errno 2] No such file or directory: './pretrained/pytorch_model-00001-of-00003.bin'

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/hemang/Documents/GitHub/ChatDoctor/ in <module>                                  │
│                                                                                                  │
│    40 │                                                                                          │
│    41generator = model.generate                                                             │
│    42                                                                                            │
│ ❱  43 load_model("./pretrained/")                                                                │
│    44                                                                                            │
│    45 First_chat = "ChatDoctor: I am ChatDoctor, what medical questions do you have?"            │
│    46 print(First_chat)                                                                          │
│                                                                                                  │
│ /home/hemang/Documents/GitHub/ChatDoctor/ in load_model                                │
│                                                                                                  │
│    25print('gpu_count', gpu_count)                                                          │
│    26 │                                                                                          │
│    27tokenizer = transformers.LLaMATokenizer.from_pretrained(model_name)                    │
│ ❱  28model = transformers.LLaMAForCausalLM.from_pretrained(                                 │
│    29 │   │   model_name,                                                                        │
│    30 │   │   #device_map=device_map,                                                            │31 │   │   #device_map="auto",                                                                │
│                                                                                                  │
│ /home/hemang/.local/lib/python3.11/site-packages/transformers/ in          │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   2627 │   │   │   │   mismatched_keys,                                                          │
│   2628 │   │   │   │   offload_index,                                                            │
│   2629 │   │   │   │   error_msgs,                                                               │
│ ❱ 2630 │   │   │   ) = cls._load_pretrained_model(                                               │
│   2631 │   │   │   │   model,                                                                    │
│   2632 │   │   │   │   state_dict,                                                               │
│   2633 │   │   │   │   loaded_state_dict_keys,  # XXX: rename?                                   │
│                                                                                                  │
│ /home/hemang/.local/lib/python3.11/site-packages/transformers/ in          │
│ _load_pretrained_model                                                                           │
│                                                                                                  │
│   2936 │   │   │   │   # Skip the load for shards that only contain disk-offloaded weights when  │2937 │   │   │   │   if shard_file in disk_only_shard_files:                                   │
│   2938 │   │   │   │   │   continue                                                              │
│ ❱ 2939 │   │   │   │   state_dict = load_state_dict(shard_file)                                  │
│   2940 │   │   │   │                                                                             │
│   2941 │   │   │   │   # Mistmatched keys contains tuples key/shape1/shape2 of weights in the c  │2942 │   │   │   │   # matching the weights in the model.                                      │
│                                                                                                  │
│ /home/hemang/.local/lib/python3.11/site-packages/transformers/ in           │
│ load_state_dict                                                                                  │
│                                                                                                  │
│    415 │   │   return torch.load(checkpoint_file, map_location="cpu")                            │
│    416except Exception as e:                                                                │
│    417 │   │   try:                                                                              │
│ ❱  418 │   │   │   with open(checkpoint_file) as f:                                              │
│    419 │   │   │   │   if == "version":                                                │
│    420 │   │   │   │   │   raise OSError(                                                        │
│    421 │   │   │   │   │   │   "You seem to have cloned a repository without having git-lfs ins  │
FileNotFoundError: [Errno 2] No such file or directory: './pretrained/pytorch_model-00001-of-00003.bin'
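
The traceback shows the loader expecting three checkpoint shards in ./pretrained/ and failing on the first one. A quick way to see which shards are absent is to compare the shard index against the files on disk. The sketch assumes the folder contains a `pytorch_model.bin.index.json` whose "weight_map" maps parameter names to shard file names; the function name is hypothetical.

```python
import json
import os


def missing_shards(model_dir, index_name="pytorch_model.bin.index.json"):
    """List shard files named in the index that are not present in model_dir."""
    index_path = os.path.join(model_dir, index_name)
    with open(index_path, encoding="utf-8") as f:
        weight_map = json.load(f)["weight_map"]
    shard_files = sorted(set(weight_map.values()))
    return [
        shard
        for shard in shard_files
        if not os.path.isfile(os.path.join(model_dir, shard))
    ]
```

An empty list means every shard listed in the index is present; otherwise the listed files still need to be downloaded into the folder.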

ImportError: cannot import name 'BottleneckConfig' from 'peft'

File "/content/ChatDoctor/", line 17, in
from peft import ( # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/
[2024-02-09 11:16:17,756] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 17760) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/", line 346, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/", line 806, in main
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/", line 797, in run
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/", line 264, in launch_agent
raise ChildFailedError(

/content/ChatDoctor/ FAILED

time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 17761)
error_file: <N/A>
traceback : To enable traceback see:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 17762)
error_file: <N/A>
traceback : To enable traceback see:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 17763)
error_file: <N/A>
traceback : To enable traceback see:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 4 (local_rank: 4)
exitcode : 1 (pid: 17764)
error_file: <N/A>
traceback : To enable traceback see:
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 5 (local_rank: 5)
exitcode : 1 (pid: 17765)
error_file: <N/A>
traceback : To enable traceback see:

Root Cause (first observed failure):
time : 2024-02-09_11:16:17
host : 10ca0fca3068
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 17760)
error_file: <N/A>
traceback : To enable traceback see:
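
The ImportError means that the `peft` package Python resolved does not define `BottleneckConfig`. One possible explanation, offered as an assumption, is that the training script expects a modified copy of `peft` rather than the version installed from PyPI. The hypothetical helper below checks what the importable package exposes:

```python
import importlib


def module_exposes(module_name, attr_name):
    """Return True if the module can be imported and defines attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # False here reproduces the condition behind the ImportError above.
    print(module_exposes("peft", "BottleneckConfig"))
```

If this prints False, inspect `peft.__file__` to see which installation is being picked up.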

could it be used in Chinese ?

As the title says,
LLMs usually perform very well in English but not as well in Chinese.
So how does the model perform in Chinese?

demo on the huggingface

I am a medico and I don't have a server with large GPUs. Can you put the demo online (or back online) on Hugging Face?
Thank you so much!

gpu_count 0 | Cuda issue

cuda 0

As you can see, I'm in Conda PowerShell as Admin. I have installed PyTorch 2.0 with the updated torchvision for acceleration, along with the required downloads for transformers and the tokenizer. The models also load from the pretrained folder. Additionally, I have installed the CUDA toolkit 11.7 with drivers, and my GTX 1060 GPU (6 GB VRAM) is listed as available for computing. However, when I attempt to activate CUDA, it shows as 0 or false. I am in the correct Conda environment and CUDA is installed and activated, but the issue persists. I noticed in the file that 8-bit floats appear to be disabled for the model tokenizer, which leads me to wonder whether that is related. I should also mention that the tokenizer's LLaMA name may be misspelled, because I found a GitHub thread about it: treadon/llama-7b-example#1 (comment). There may be a typo in your code in the file as well. I have been working on this issue for 3 days and would greatly appreciate any help. Thank you.
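
A short diagnostic can separate driver problems from build problems. A PyTorch build without CUDA support reports `torch.version.cuda` as `None` and `gpu_count 0`, no matter which CUDA toolkit is installed system-wide. Whether that is the cause here is an assumption, and the function name is hypothetical.

```python
def cuda_report():
    """Collect basic facts about the installed PyTorch build and visible GPUs."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means this PyTorch build was compiled without CUDA support.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


print(cuda_report())
```

If `built_with_cuda` is `None`, reinstalling a CUDA-enabled PyTorch build inside the same Conda environment is the usual next step.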


Why do the instructions start with "If you are a doctor"

Thank you for your interesting work.
In the chatdoctor5k.json and chatdoctor200k.json I see that the instructions start with "If you are a doctor".
I am curious why the instructions do not start with "You are a doctor".
Is this a common way to perform the alpaca instruction fine-tuning?

request for the pretrained weights

Hello, I have filled out the form at the link several times, but I have not received the weight files. Is there something missing here? (I have checked my spam folder.) My email is [email protected]; could you please send me the pre-trained weights? Thanks a lot.

wandb error

I got an error when running:
wandb: ERROR api_key not configured (no-tty). call wandb.login(key=[your_api_key])
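
This message appears when Weights & Biases logging is active but no API key is available in a non-interactive session. If experiment tracking is not needed, one option is to disable it through the `WANDB_MODE` environment variable before the training code starts. Placing this at the top of the training script is an assumption about where it fits in this repository.

```python
import os

# Must run before wandb (or a trainer that uses it) is initialized.
# "disabled" turns wandb calls into no-ops, so no API key is required.
os.environ["WANDB_MODE"] = "disabled"

# Alternative: keep logging enabled and supply the key through the
# WANDB_API_KEY environment variable instead.
```

The same effect can be had from the shell by exporting `WANDB_MODE=disabled` before calling torchrun.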

difference between the paper and code

It seems that in your paper the training dataset is 'InstructorDoctor-205k', but in this repo, judging from the training command, the dataset is 'HealthCareMagic-100k.json'.
In the paper, the training was 'fine-tuning on InstructorDoctor-205k' (which seems to be one step?), but in this repo: 'Our model was firstly be fine-tuned by Stanford Alpaca's data to have some basic conversational capabilities.' Does this mean the repo contains an updated method?
Training time difference: paper - 18 hours, repo - 30 minutes.
Could you provide some clarification?

pretrained/pytorch_model-00001-of-00003.bin FileNotFoundError

Hi, thank you for this model!
I am trying to build this app and getting this error message:

File "/home/ChatDoctor-main/env_doct/lib/python3.11/site-packages/transformers/", line 2939, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/home/ChatDoctor-main/env_doct/lib/python3.11/site-packages/transformers/", line 418, in load_state_dict
with open(checkpoint_file) as f:
FileNotFoundError: [Errno 2] No such file or directory: './pretrained/pytorch_model-00001-of-00003.bin'

I have already filled out this form: link.

Can you please share this file 'pytorch_model-00001-of-00003.bin'

thanks in advance

What's the code for "how to utilize conversation demonstrations synthesized via ChatGPT"?

It seems that there is no code to "utilize conversation demonstrations synthesized via ChatGPT to finetune the LLaMA model".
In the code, I see that you use HealthCareMagic-200k.json, not the "5k generated conversations between patients and physicians from ChatGPT [GenMedGPT-5k]".
How do you utilize the conversation demonstrations synthesized via ChatGPT? Can you show us the code for this?

Abandoned (core dumped)

Hello, I am a college student reading your paper. My server GPU has only 48G. Does that mean I don't have enough GPU memory to run inference?

How do you create format_dataset.csv?

Hi @Kent0n-Li, in the paper you mentioned that you used MedlinePlus as the database to create format_dataset.csv, but I found that if I directly type a name from format_dataset.csv (e.g., Panic disorder) into the MedlinePlus system, there are multiple results and the symptoms may not be complete. How did you create this file? Are there any scripts, or did you select the entries manually? Can you give an example of how you used the MedlinePlus system to derive Symptom, reason, TestsAndProcedures, and commonMedications in format_dataset.csv (e.g., for Panic disorder)?

What prompt was used to generate the dataset?


I am curious to know which prompt you used to generate the dataset. I couldnt find it in the`.

Also you might want to remove your OpenAI API key from the

generate duplicate text

I trained a ChatDoctor model using LoRA following your instructions, and evaluated it on iCliniq-10k without providing external documents (e.g., Wikipedia or the medical database). The problem is that the model generates duplicate content, as in the example below:

Patient: 'Hi doctor,My wife missed her periods and we doubt she is pregnant. Our first kid is just 1 year old now and it was a Cesarean. We are not in a position to have a second child now. Her LMP was 22nd of last month. I read that Mifepristone can induce abortion. Is it a safe drug? Do we have to consult doctor for taking this drug? What are all the possible side effects? Is that drug 100 percent reliable? Is there any other options? Whether this drug is available without doctor\s prescription? Thank you.'
Predictions: 1. Mifepristone is a safe drug. 2. Mifepristone is a safe drug. 3. Mifepristone is a safe drug. 4. Mifepristone is a safe drug. 5. Mifepristone is a safe drug. 6. Mifepristone is a safe drug. 7. Mifepristone is a safe drug. 8. Mifepristone is a safe drug. 9. Mifepristone is a safe drug. 10. Mifepristone is a safe drug. 11. Mifepristone is a safe drug. 12. Mifepristone is a safe drug. 13. Mifepristone is a safe drug. 14. Mifepristone is a safe drug. 15. Mifepristone is a safe drug. 16. Mifepristone is a safe drug. 17. Mifepristone is a safe drug. 18. Mifepristone is a safe drug. 19. Mifepristone is a safe drug.
ChatDoctor: 1. No, mifepristone is not available without doctors prescription. 2. It is a safe medication, but as with any medications there may be some side effects which include nausea, vomiting, headache, diarrhea etc. These symptoms usually subside within 48 hours after ingestion.3. Yes, it is 100% reliable if taken under supervision by a trained gynecologist.4. There are many other options for termination of early pregnancy like medical methods (like MTP kit) or surgical methods (like suction evacuation). But these need to be done under proper guidance of a qualified Gynecologist.'

In my generated texts, this phenomenon is pretty common. Did you encounter this problem before? How did you solve it? Thanks!
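
One common mitigation for this kind of looping output is a repetition penalty applied to the logits at decoding time; Hugging Face `generate` exposes `repetition_penalty` and `no_repeat_ngram_size` for this. Whether either setting resolves the case above is untested here. The sketch below is a dependency-free illustration of the idea, not the ChatDoctor decoding code: `apply_repetition_penalty` and `has_repeated_ngram` are hypothetical helper names, and the default penalty value is an arbitrary assumption.

```python
from typing import List, Sequence


def apply_repetition_penalty(
    logits: Sequence[float], generated_ids: Sequence[int], penalty: float = 1.2
) -> List[float]:
    """Make already-generated tokens less likely.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so both move toward lower probability.
    """
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] < 0:
            out[tok] *= penalty
        else:
            out[tok] /= penalty
    return out


def has_repeated_ngram(token_ids: Sequence[int], n: int = 3) -> bool:
    """Return True if any n-gram occurs more than once in the sequence."""
    seen = set()
    for i in range(len(token_ids) - n + 1):
        gram = tuple(token_ids[i : i + n])
        if gram in seen:
            return True
        seen.add(gram)
    return False


if __name__ == "__main__":
    # Tokens 0 and 1 were already generated, so their logits are penalized.
    print(apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1], penalty=2.0))
    # A looping sequence is flagged; a non-repeating one is not.
    print(has_repeated_ngram([1, 2, 3, 1, 2, 3]), has_repeated_ngram([1, 2, 3, 4]))
```

The n-gram check can also be used offline to measure how often predictions loop before and after changing decoding settings.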

About Instruction data generation

The first step in building a physician-patient conversation dataset is to collect the disease database that serves as the gold standard. Therefore, we collected and organized a database of diseases, which contains about 700 diseases with their relative symptoms, medical tests, and recommended medications. To train high-quality conversation models on an academic budget, we input each message from the disease database separately as a prompt into the ChatGPT API to automatically generate instruction data. It is worth noting that our prompts to the ChatGPT API contain the gold standard of diseases and symptoms, and drugs, so our fine-tuned ChatDoctor is not only able to achieve ChatGPT's conversational fluency but also higher diagnostic accuracy compared to ChatGPT. We finally collected 5K doctor-patient conversation instructions and named it InstructorDoctor-5K.

I'm confused by this process. Can anyone explain it more precisely?
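
As one reading of the paragraph above: each row of the disease database is turned into a separate prompt that embeds the gold-standard fields, and each prompt is then sent to the ChatGPT API to produce one conversation. The sketch below covers only the prompt-building step. The prompt wording is invented for illustration and is not the authors' prompt; the `disease` column name and the sample row are placeholders; the other column names follow those mentioned for format_dataset.csv on this page. The API call itself is omitted.

```python
import csv
import io
from typing import Dict, Iterator

# Placeholder data: one made-up row in the assumed column layout.
SAMPLE_CSV = """disease,Symptom,reason,TestsAndProcedures,commonMedications
Example disease,"symptom A, symptom B",placeholder cause,placeholder test,placeholder drug
"""


def build_prompt(row: Dict[str, str]) -> str:
    """Embed one database entry in a generation prompt (hypothetical wording)."""
    return (
        "Write a conversation between a patient and a doctor.\n"
        f"Disease: {row['disease']}\n"
        f"Symptoms: {row['Symptom']}\n"
        f"Cause: {row['reason']}\n"
        f"Tests and procedures: {row['TestsAndProcedures']}\n"
        f"Common medications: {row['commonMedications']}\n"
        "The doctor's answer must stay consistent with the facts above."
    )


def iter_prompts(csv_text: str) -> Iterator[str]:
    """Yield one prompt per database row."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield build_prompt(row)


if __name__ == "__main__":
    for prompt in iter_prompts(SAMPLE_CSV):
        print(prompt)
```

Keeping the database fields inside the prompt is what ties each generated conversation to the gold-standard entry rather than to whatever the generating model recalls on its own.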

Dataset License

Is there a specific license for the associated datasets?

DDP expects same model across all ranks, but Rank 0 has 128 params, while rank 1 has inconsistent 0 params.

Hi, I ran into an error saying that the ranks have different models. Details follow.


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to:

bin /root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/bitsandbytes/
CUDA SETUP: CUDA runtime path found: /root/anaconda3/envs/chat-doctor/lib/
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/bitsandbytes/
bin /root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/bitsandbytes/

Finetuning model with params:
base_model: /disk2/data/xk/retr-llm/files/model/llama-7b/
data_path: /disk2/data/xk/retr-llm/files/datasets/mental_health_chatbot_dataset.json
output_dir: ./lora-chatDoctor_bs192_Mbs24_ep3_len512_lr3e-5_fromAlpacaLora
batch_size: 192
micro_batch_size: 24
num_epochs: 3
learning_rate: 3e-05
cutoff_len: 256
val_set_size: 120
use_gradient_checkpointing: False
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: None
bottleneck_size: 256
non_linearity: tanh
adapter_dropout: 0.0
use_parallel_adapter: False
use_adapterp: False
train_on_inputs: True
scaling: 1.0
adapter_name: lora
target_modules: None
group_by_length: False
resume_from_checkpoint: None
Loading checkpoint shards: 100%|##########| 33/33 [00:12<00:00, 2.58it/s]
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
Map: 100%|##########| 52/52 [00:00<00:00, 687.22 examples/s]
Map: 100%|##########| 120/120 [00:00<00:00, 765.56 examples/s]
[E ProcessGroupNCCL.cpp:828] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807082 milliseconds before timing out.
Traceback (most recent call last):
File "", line 353, in
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "", line 299, in train
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/", line 1662, in train
return inner_training_loop(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/", line 1749, in _inner_training_loop
model = self._wrap_model(self.model_wrapped)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/", line 1569, in _wrap_model
model = nn.parallel.DistributedDataParallel(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/nn/parallel/", line 674, in init
_verify_param_shape_across_processes(self.process_group, parameters)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/distributed/", line 118, in _verify_param_shape_across_processes
return dist._verify_params_across_processes(process_group, tensors, logger)
RuntimeError: DDP expects same model across all ranks, but Rank 0 has 128 params, while rank 1 has inconsistent 0 params.
[E ProcessGroupNCCL.cpp:455] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[E ProcessGroupNCCL.cpp:460] To avoid data inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807082 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807414 milliseconds before timing out.
Traceback (most recent call last):
File "", line 353, in
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "", line 299, in train
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/", line 1662, in train
return inner_training_loop(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/", line 1749, in _inner_training_loop
model = self._wrap_model(self.model_wrapped)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/", line 1569, in _wrap_model
model = nn.parallel.DistributedDataParallel(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/nn/parallel/", line 674, in init
_verify_param_shape_across_processes(self.process_group, parameters)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/distributed/", line 118, in _verify_param_shape_across_processes
return dist._verify_params_across_processes(process_group, tensors, logger)
RuntimeError: DDP expects same model across all ranks, but Rank 3 has 128 params, while rank 0 has inconsistent 0 params.
[E ProcessGroupNCCL.cpp:455] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[E ProcessGroupNCCL.cpp:460] To avoid data inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807414 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807716 milliseconds before timing out.

my environment:
GPU:8 X A100 80GB
pytorch version:2.0.1

How can I solve this bug? Thanks!
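
As a sanity check on the numbers in this log, the "128 params" in the error and the "trainable params: 4194304" line are both consistent with a standard LoRA setup. The sketch below assumes LLaMA-7B dimensions (hidden size 4096, 32 layers) and two adapted square projections per layer (for example `q_proj` and `v_proj`); the log prints `lora_target_modules: None`, so the target modules are a guess. It only shows that the healthy rank's count matches the configuration and does not explain why another rank reports 0.

```python
from typing import Tuple

HIDDEN_SIZE = 4096      # assumption: LLaMA-7B hidden size
NUM_LAYERS = 32         # assumption: LLaMA-7B decoder layers
TARGETS_PER_LAYER = 2   # assumption: two adapted projections per layer
LORA_R = 8              # from the log (lora_r: 8)
ALL_PARAMS = 6_742_609_920  # from the log ("all params")


def lora_counts(hidden: int, layers: int, targets: int, r: int) -> Tuple[int, int]:
    """Return (number of trainable tensors, number of trainable scalars)."""
    adapted_modules = layers * targets
    # Each adapted module gets two low-rank matrices, A (r x hidden) and B (hidden x r).
    tensors = adapted_modules * 2
    scalars = adapted_modules * r * (hidden + hidden)
    return tensors, scalars


if __name__ == "__main__":
    tensors, scalars = lora_counts(HIDDEN_SIZE, NUM_LAYERS, TARGETS_PER_LAYER, LORA_R)
    print("trainable tensors:", tensors)
    print("trainable params:", scalars)
    print("trainable %:", 100 * scalars / ALL_PARAMS)
```

Under these assumptions the counts come out to 128 tensors and 4,194,304 scalars, matching the log, which suggests the rank reporting 128 built the adapter as configured.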

ImportError: cannot import name 'openai_object' from 'openai'

Traceback (most recent call last):
File "/content/ChatDoctor/", line 25, in
import utils
File "/content/ChatDoctor/", line 15, in
from openai import openai_object
ImportError: cannot import name 'openai_object' from 'openai' (/usr/local/lib/python3.10/dist-packages/openai/
[2024-02-09 11:04:21,257] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 14644) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/", line 346, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/", line 806, in main
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/", line 797, in run
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/", line 264, in launch_agent
raise ChildFailedError(

/content/ChatDoctor/ FAILED

time : 2024-02-09_11:04:21
host : 10ca0fca3068
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 14645)
error_file: <N/A>
traceback : To enable traceback see:
time : 2024-02-09_11:04:21
host : 10ca0fca3068
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 14646)
error_file: <N/A>
traceback : To enable traceback see:
time : 2024-02-09_11:04:21
host : 10ca0fca3068
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 14647)
error_file: <N/A>
traceback : To enable traceback see:

Root Cause (first observed failure):
time : 2024-02-09_11:04:21
host : 10ca0fca3068
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 14644)
error_file: <N/A>
traceback : To enable traceback see:
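
This import error typically points to a version mismatch: as far as I know, the `openai_object` module shipped with pre-1.0 releases of the `openai` package and is no longer present in 1.0 and later, so installing an older release (for example with `pip install "openai<1.0"`) is one option to try. The sketch below is a small diagnostic, not part of the ChatDoctor code, that reports the installed version and whether the legacy module can be imported; both function names are hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_openai_version() -> Optional[str]:
    """Return the installed openai package version, or None if it is absent."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None


def has_legacy_openai_object() -> bool:
    """Return True if `from openai import openai_object` succeeds."""
    try:
        from openai import openai_object  # noqa: F401  (legacy module)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("openai version:", installed_openai_version())
    print("legacy openai_object importable:", has_legacy_openai_object())
```

If the second line prints False while a version is installed, the environment has a release that no longer provides the module the script expects.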

Colab runs out of RAM?

I cannot load the model because Colab Free does not have enough RAM. Does anyone have a solution for this?
