This repository contains the code for reproducing the models and pipeline described in "A ChatGPT Enhanced Two-stage Framework for Person-Job Fit in Talent Recruitment".
As described in the paper, our framework comprises two stages. In the first stage, we generate primary curriculum vitae (CV) candidates
Project Directory |
---|
run_da.py |
run_cl.py |
run_stage_1.py |
run_stage_2.py |
mlm_utils.py |
utils.py |
requirements.txt |
`run_da.py` and `run_cl.py` are the training programs for domain adaptation and contrastive learning, respectively (Phase 1 in Fig. 1). After sequential execution, we can obtain the contrastive backbone
`run_stage_1.py` and `run_stage_2.py` are the programs for Candidate Generation and LLM-based Recommendation, respectively (Phase 2 in Fig. 1).
`mlm_utils.py` and `utils.py` contain utility functions used in training.
conda create -n llm2rec python=3.8
conda activate llm2rec
pip install -r requirements.txt
The input data should be in the following format:
Run domain adaptation:
python run_da.py --model_name mlm_da \
--data_path your_data_path \
--epochs 2 \
--lr 1e-5 \
--gpu_id 0
Run contrastive learning:
python run_cl.py --model_name cl_backbone \
--data_path your_data_path \
--epochs num_epochs \
--lr 1e-5 \
--gpu_id 0 \
--mlm_epoch 2 \
--max_length 512 \
--batch_size 48 \
--model_size base
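Because the two training programs are meant to be executed one after the other, it can be convenient to drive them from a single script. The following is only a minimal sketch: the helper names `build_da_command`, `build_cl_command`, and `run_pipeline` are hypothetical and not part of this repository, and the flag values are simply copied from the commands above.

```python
import subprocess
import sys
from typing import List


def build_da_command(data_path: str, epochs: int = 2, lr: float = 1e-5,
                     gpu_id: int = 0) -> List[str]:
    """Argument list for run_da.py, mirroring the flags shown in this README."""
    return [
        sys.executable, "run_da.py",
        "--model_name", "mlm_da",
        "--data_path", data_path,
        "--epochs", str(epochs),
        "--lr", str(lr),
        "--gpu_id", str(gpu_id),
    ]


def build_cl_command(data_path: str, epochs: int, lr: float = 1e-5,
                     gpu_id: int = 0, mlm_epoch: int = 2) -> List[str]:
    """Argument list for run_cl.py, mirroring the flags shown in this README."""
    return [
        sys.executable, "run_cl.py",
        "--model_name", "cl_backbone",
        "--data_path", data_path,
        "--epochs", str(epochs),
        "--lr", str(lr),
        "--gpu_id", str(gpu_id),
        "--mlm_epoch", str(mlm_epoch),
        "--max_length", "512",
        "--batch_size", "48",
        "--model_size", "base",
    ]


def run_pipeline(data_path: str, cl_epochs: int) -> None:
    """Run domain adaptation, then contrastive learning; stop on first failure."""
    for command in (build_da_command(data_path),
                    build_cl_command(data_path, epochs=cl_epochs)):
        subprocess.run(command, check=True)
```

Calling `run_pipeline("your_data_path", cl_epochs=...)` from the repository root would then launch both stages in order; the number of contrastive-learning epochs is left to the user, as in the command above.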
For readability, here is one example of the prompts we used with ChatGPT:
You are a professional HR who can determine whether a job description and a curriculum vitae match and provide comments.
You are provided with inputs in the following format:
"""
job description:
a paragraph of text
curriculum vitae:
a paragraph of text
"""
Your task is to output the following:
1. An overall matching score between the curriculum vitae and the job description (0-100).
2. A matching score between the educational background in the curriculum vitae and the job description (0-100).
3. A matching score between the work experience in the curriculum vitae and the job description (0-100).
4. A matching score between the professional skills in the curriculum vitae and the job description (0-100).
The results should be represented in a JSON format with the following key-value pairs:
{
"overall": a number between 0 and 100,
"education": a number between 0 and 100,
"education comment": a detailed paragraph of comment on the education score
"experience": a number between 0 and 100,
"experience comment": a detailed paragraph of comment on the experience score
"skill": a number between 0 and 100,
"skill comment": a detailed paragraph of comment on the skill score
}
You should pay attention to the content of "work experience", "job description", and "job requirements" in the job description during the matching process. You also need to pay attention to the content of "work experience", "educational background", and "work experience" in the curriculum vitae.
The basis for matching should be dynamically generated based on the input job description and curriculum vitae content.
You should provide any other information or feedback about the matching process and automatically handle any errors or missing information that may exist in the curriculum vitae. If there are errors, you should skip the missing information and continue to complete the matching.
Your first response should be 'Understood.'.
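To make the prompt's input and output contract concrete, the sketch below formats a JD-CV pair in the triple-quoted layout described above and validates the JSON object found in a reply. It is an illustration only: the function names `build_input` and `parse_reply` are hypothetical, the call to ChatGPT itself is omitted, and treating missing comments as empty strings is an assumption rather than the behaviour of `run_stage_2.py`.

```python
import json
from typing import Any, Dict

SCORE_KEYS = ("overall", "education", "experience", "skill")
COMMENT_KEYS = ("education comment", "experience comment", "skill comment")


def build_input(job_description: str, curriculum_vitae: str) -> str:
    """Format a JD-CV pair in the layout the prompt tells the model to expect."""
    return (
        '"""\n'
        f"job description:\n{job_description}\n"
        f"curriculum vitae:\n{curriculum_vitae}\n"
        '"""'
    )


def parse_reply(reply: str) -> Dict[str, Any]:
    """Extract the JSON object from a reply and check the score fields."""
    # The model may wrap the JSON in extra text, so take the outermost braces.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply[start:end + 1])

    result: Dict[str, Any] = {}
    for key in SCORE_KEYS:
        if key not in data:
            raise ValueError(f"missing score: {key!r}")
        score = float(data[key])
        if not 0 <= score <= 100:
            raise ValueError(f"score out of range for {key!r}: {score}")
        result[key] = score
    for key in COMMENT_KEYS:
        # Assumption: a missing comment is tolerated and stored as "".
        result[key] = str(data.get(key, ""))
    return result
```

A reply that lacks a score, or reports one outside 0-100, raises `ValueError`, so a caller can retry or skip that output before aggregation.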
This section demonstrates how we obtain a final output from multiple prompts.
First, we construct multiple prompts for a given JD-CV pair and feed them to ChatGPT to obtain one output per prompt (Fig. 2).
Figure 2: The overview of analyses generation.
Then, we aggregate these outputs in the manner shown in Fig. 3.
There are two options for the calculation: the arithmetic mean over all the total scores, or the arithmetic mean over the weighted means. For the weighted mean
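The two aggregation options can be sketched as follows. This is a hedged illustration, not the repository's implementation: it assumes the "total score" is the `overall` field of each output, and the weights in `EXAMPLE_WEIGHTS` are placeholders chosen for the example, not the values used in the paper.

```python
from statistics import mean
from typing import List, Mapping

# Hypothetical weights for illustration only; substitute the real ones.
EXAMPLE_WEIGHTS = {"education": 0.3, "experience": 0.4, "skill": 0.3}


def mean_of_overall(outputs: List[Mapping[str, float]]) -> float:
    """Option 1: arithmetic mean of the overall score across all outputs."""
    return mean(output["overall"] for output in outputs)


def weighted_mean(output: Mapping[str, float],
                  weights: Mapping[str, float]) -> float:
    """Weighted mean of the sub-scores of a single output."""
    total_weight = sum(weights.values())
    return sum(output[key] * w for key, w in weights.items()) / total_weight


def mean_of_weighted(outputs: List[Mapping[str, float]],
                     weights: Mapping[str, float]) -> float:
    """Option 2: arithmetic mean of the per-output weighted means."""
    return mean(weighted_mean(output, weights) for output in outputs)
```

Dividing by the sum of the weights keeps the result on the 0-100 scale even when the weights do not sum to one.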