BERT-based Finance Negation Detection
Baseline for 金融信息负面及主体判定 (Negative Financial Information and Entity Detection), CCF Big Data & Computing Intelligence Contest (CCF BDCI)
2019-10-06: Jinhua Su plans to use ALBERT (Chinese) to improve performance.
- Python 3.6
- PyTorch 1.0.0
- pytorch-pretrained-bert
- Train with the following command; optional arguments can be found in train.py

```shell
python train.py --model_name bert --batch_size 16 --save True
```
- Run inference with infer.py
An overview of the BERT-based baseline is given below:

```python
for entity in entities:
    # split the original text into (preceding context + ' ' + entity + ' ' + following context)
    context = preceding_context + ' ' + entity + ' ' + following_context
    input = '[CLS]' + context + '[SEP]' + entity + '[SEP]'
    output = bert(input)
    output_list.append(output)
```
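The input construction above can be written out in plain Python. This is a minimal sketch: the function name `build_bert_input` and the whitespace-joined context are illustrative assumptions, not the repo's exact code.

```python
def build_bert_input(text: str, entity: str) -> str:
    """Build the sentence-pair input '[CLS] context [SEP] entity [SEP]'.

    The text is split around the first mention of the entity, and the
    entity is re-inserted between its preceding and following context
    with surrounding spaces, matching the pseudocode above.
    """
    # Split around the first occurrence of the entity mention.
    before, _, after = text.partition(entity)
    context = before + ' ' + entity + ' ' + after
    return '[CLS]' + context + '[SEP]' + entity + '[SEP]'


# Example with a hypothetical text and entity:
pair = build_bert_input('abc', 'b')  # -> '[CLS]a b c[SEP]b[SEP]'
```

Feeding the entity again as the second segment lets BERT's sentence-pair attention focus on that entity, which is how one text with multiple entities yields one prediction per entity.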
- The data-processing functions can be modified
- The dense (classification) layer on top of BERT can be changed
- Ensemble learning can be applied
- ALBERT can be swapped in
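Of these ideas, ensembling is the most mechanical to add. A minimal sketch of probability averaging across models follows; the function name `ensemble_average` is hypothetical, and real usage would pass each model's softmax outputs.

```python
from typing import List


def ensemble_average(prob_lists: List[List[float]]) -> List[float]:
    """Average per-class probabilities predicted by several models.

    Each inner list is one model's probability distribution over the
    classes; the result is the element-wise mean across models.
    """
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]


# e.g. two models voting on a binary negative/non-negative decision:
avg = ensemble_average([[0.9, 0.1], [0.7, 0.3]])  # roughly [0.8, 0.2]
```

Averaging probabilities (soft voting) usually beats majority voting when the base models are calibrated differently, which is common when mixing BERT variants.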
| No | Model | Description | Score |
| :------- | :---------: | :---------: | :---------: |
| 1 | Bert | predict for each entity | 0.929 |
| 2 | Bert + substring + NIKE | using heuristic method to tackle second task | 0.935 |
| 3 | Mxnet_bert + substring + NIKE | predict for each text | 0.947 |
| 4 | Test scoring rule | reverse task 1 result | 0.006 |
- long_text_sentence
- task 2 for train data
- increasing the scale of the model
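The long_text_sentence item presumably means splitting over-length inputs at sentence boundaries before feeding them to BERT, which has a 512-token limit. A minimal sketch, assuming Chinese sentence-ending punctuation as delimiters (the function name and delimiter set are assumptions):

```python
import re


def split_sentences(text: str) -> list:
    """Split Chinese text at sentence-ending punctuation (。！？；),
    keeping each delimiter attached to its sentence."""
    # Lookbehind split: break *after* each delimiter without consuming it.
    parts = re.split(r'(?<=[。！？；])', text)
    return [p for p in parts if p]


sents = split_sentences('股价大跌。公司否认违规！市场观望。')
# -> ['股价大跌。', '公司否认违规！', '市场观望。']
```

The resulting sentences can then be grouped greedily so each group stays under the model's maximum sequence length, with the target entity's context prioritized.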