Chinese Traditional Sequence Annotation
git clone https://github.com/iofu728/Chinese_T-Sequence-annotation
cd Chinese_T-Sequence-annotation && git clone https://github.com/google-research/bert
pip install -r requirement.txt --user
## for BiLSTM + CRF
python run.py
## for Bert + CRF
bash run_bert.sh
| Model | Train P | Train R | Train F1 | Dev P | Dev R | Dev F1 | Best Epoch |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BiLSTM + CRF | 98.36 | 98.42 | 98.39 | 93.70 | 93.82 | 93.76 | 12 |
| Bert + CRF | - | - | - | 97.97 | 97.60 | 97.78 | 1 (only test) |
Table 1: overall P, R, and F1 (common evaluation)

| Model | Train P | Train R | Train F1 | Dev P | Dev R | Dev F1 | Best Epoch |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BiLSTM + CRF | 99.75 | 99.76 | 99.76 | 86.95 | 81.30 | 84.03 | 30 |
| Bert + CRF | - | - | - | 96.62 | 97.23 | 96.92 | 3 |
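
The F1 columns are consistent with the usual harmonic mean of precision and recall, F1 = 2PR / (P + R). The snippet below is a small sanity check of that relationship against the Dev columns of the table above; the helper name `f1_score` is illustrative and not taken from this repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing was predicted or matched
    return 2 * precision * recall / (precision + recall)


# Dev P / Dev R of the BiLSTM + CRF row
print(round(f1_score(86.95, 81.30), 2))  # 84.03
# Dev P / Dev R of the Bert + CRF row
print(round(f1_score(96.62, 97.23), 2))  # 96.92
```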
Table 2: performance by entity type (LOC, ORG, PER)

| Model | Train Acc | Train Loc P | Train Loc R | Train Loc F1 | Train Org P | Train Org R | Train Org F1 | Train Per P | Train Per R | Train Per F1 | Test Acc | Test Loc P | Test Loc R | Test Loc F1 | Test Org P | Test Org R | Test Org F1 | Test Per P | Test Per R | Test Per F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BiLSTM + CRF | 99.98 | 99.81 | 99.84 | 99.83 | 99.68 | 99.73 | 99.70 | 99.67 | 99.64 | 99.65 | 97.84 | 87.62 | 85.33 | 86.46 | 83.46 | 70.94 | 76.69 | 87.99 | 80.68 | 84.17 |
| Bert + CRF | - | - | - | - | - | - | - | - | - | - | 99.70 | 98.18 | 98.44 | 98.31 | 93.42 | 95.56 | 94.48 | 98.27 | 98.84 | 98.55 |
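
Per-type scores like those in the table above are often computed at the entity level: a predicted entity counts as correct only if its type and both boundaries match the gold annotation. The following is a minimal sketch of that idea, assuming BIO tags with `N` as the outside tag (as in this dataset). The function names are hypothetical, and the evaluation code in this repository may differ.

```python
def extract_entities(tags):
    """Return a set of (type, start, end) spans from a BIO tag sequence.

    `end` is exclusive. 'N' marks tokens outside any entity.
    """
    entities, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["N"]):  # sentinel closes the last span
        if tag.startswith("I-") and etype == tag[2:]:
            continue  # the current entity continues
        if etype is not None:
            entities.add((etype, start, i))  # close the open entity
            etype = None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]  # open a new entity
    return entities


def per_type_scores(gold_tags, pred_tags):
    """Entity-level (precision, recall, F1) for each entity type."""
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    scores = {}
    for etype in sorted({e[0] for e in gold | pred}):
        g = {e for e in gold if e[0] == etype}
        p = {e for e in pred if e[0] == etype}
        tp = len(g & p)  # exact matches on type and boundaries
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        scores[etype] = (precision, recall, f1)
    return scores


gold = ["B-PER", "I-PER", "N", "B-LOC", "I-LOC", "N", "B-ORG"]
pred = ["B-PER", "I-PER", "N", "B-LOC", "N", "N", "B-ORG"]
scores = per_type_scores(gold, pred)
print(scores)
```

In this toy example the predicted LOC span is one token too short, so it counts as both a false positive and a false negative, while PER and ORG are matched exactly.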
| Type | Train Set Num | Train Set (%) | Dev Set Num | Dev Set (%) |
| --- | --- | --- | --- | --- |
| N | 1125991 | 90.63 | 479394 | 90.46 |
| B-LOC | 25211 | 2.03 | 11002 | 2.08 |
| I-LOC | 32022 | 2.58 | 13973 | 2.64 |
| B-ORG | 9428 | 0.76 | 4062 | 0.77 |
| I-ORG | 15220 | 1.23 | 6605 | 1.23 |
| B-PER | 11562 | 0.93 | 4990 | 0.94 |
| I-PER | 22858 | 1.84 | 9884 | 1.87 |
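
A tag distribution like the one above can be obtained by counting labels over all sentences. The sketch below assumes the data has already been loaded as a list of tag sequences; the loading step and the name `tag_distribution` are assumptions, not part of this repository.

```python
from collections import Counter


def tag_distribution(sentences):
    """Map each tag to (count, percentage of all tokens)."""
    counts = Counter(tag for sent in sentences for tag in sent)
    total = sum(counts.values())
    return {tag: (n, round(100 * n / total, 2)) for tag, n in counts.items()}


# Toy data: two tagged sentences, 10 tokens in total
toy = [
    ["B-PER", "I-PER", "N", "N"],
    ["B-LOC", "I-LOC", "N", "N", "N", "N"],
]
dist = tag_distribution(toy)
for tag, (num, percent) in sorted(dist.items()):
    print(f"{tag}\t{num}\t{percent}")
```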
other_param = {
    'Model': 'BiLSTM CRF',
    'Hidden Size': 512,
    'Embed Size': 256,
    'learning rate': 0.01
}
| Batch Size | Train P | Train R | Train F1 | Dev P | Dev R | Dev F1 | Best Epoch |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 32 | 91.15 | 91.30 | 91.23 | 89.77 | 89.97 | 89.87 | 7.8 |
| 64 | 93.53 | 93.54 | 93.53 | 91.65 | 91.78 | 91.71 | 8.8 |
| 128 | 95.34 | 95.15 | 95.24 | 92.83 | 92.59 | 92.71 | 9.2 |
| 256 | 97.21 | 97.04 | 97.13 | 93.58 | 93.37 | 93.48 | 8.2 |
| 512 | 98.36 | 98.42 | 98.39 | 93.70 | 93.82 | 93.76 | 12.0 |
| 768 | 99.43 | 99.30 | 99.37 | 93.99 | 93.53 | 93.76 | 19.6 |
Hyperparameters of BiLSTM + CRF in CWS
other_param = {
    'Model': 'BiLSTM CRF',
    'Batch Size': 64,
    'learning rate': 0.01
}
| Hidden Size | Embed Size | Train P | Train R | Train F1 | Dev P | Dev R | Dev F1 | Best Epoch |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 512 | 256 | 93.53 | 93.54 | 93.53 | 91.65 | 91.78 | 91.71 | 8.8 |
| 512 | 512 | 91.81 | 91.92 | 91.86 | 90.38 | 90.56 | 90.47 | 3.8 |
| 768 | 256 | 93.51 | 93.94 | 93.72 | 91.50 | 92.03 | 91.77 | 10.8 |
other_param = {
    'Model': 'Bert CRF',
    'Batch Size': 32,
    'learning rate': 2e-5
}
| Dev P | Dev R | Dev F1 |
| --- | --- | --- |
| 97.97 | 97.60 | 97.78 |
Batch Size of BiLSTM + CRF in NER
other_param = {
    'Model': 'BiLSTM CRF',
    'Hidden Size': 512,
    'Embed Size': 256,
    'learning rate': 0.001
}
Table 1: overall P, R, and F1 (common evaluation)

| Batch Size | Train P | Train R | Train F1 | Dev P | Dev R | Dev F1 | Best Epoch |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 64 | 99.75 | 99.76 | 99.76 | 86.95 | 81.30 | 84.03 | 30 |
| 256 | 97.75 | 96.59 | 97.17 | 84.56 | 79.27 | 81.83 | 33 |
| 512 | 94.75 | 92.12 | 93.42 | 83.54 | 78.14 | 80.75 | 36 |
| 768 | 88.41 | 83.74 | 86.02 | 81.36 | 75.80 | 78.48 | 38 |
Table 2: performance by entity type (LOC, ORG, PER)

| Batch Size | Train Acc | Train Loc P | Train Loc R | Train Loc F1 | Train Org P | Train Org R | Train Org F1 | Train Per P | Train Per R | Train Per F1 | Test Acc | Test Loc P | Test Loc R | Test Loc F1 | Test Org P | Test Org R | Test Org F1 | Test Per P | Test Per R | Test Per F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 64 | 99.98 | 99.81 | 99.84 | 99.83 | 99.68 | 99.73 | 99.70 | 99.67 | 99.64 | 99.65 | 97.84 | 87.62 | 85.33 | 86.46 | 83.46 | 70.94 | 76.69 | 87.99 | 80.68 | 84.17 |
| 256 | 99.72 | 97.57 | 97.08 | 97.33 | 97.21 | 93.82 | 95.48 | 98.57 | 97.78 | 98.17 | 97.61 | 85.75 | 83.40 | 84.56 | 81.62 | 68.27 | 74.35 | 83.92 | 78.94 | 81.35 |
| 512 | 99.31 | 94.25 | 93.26 | 93.75 | 94.67 | 86.68 | 90.50 | 95.89 | 94.09 | 94.98 | 97.54 | 84.38 | 82.73 | 83.55 | 81.56 | 66.35 | 73.17 | 82.94 | 77.44 | 80.10 |
| 768 | 98.44 | 88.20 | 86.48 | 87.33 | 86.64 | 72.98 | 79.22 | 90.16 | 86.59 | 97.34 | 97.34 | 82.01 | 79.92 | 80.95 | 79.42 | 64.26 | 71.04 | 81.22 | 75.95 | 78.49 |
Hyperparameters of BiLSTM + CRF in NER
other_param = {
    'Model': 'BiLSTM CRF',
    'Batch Size': 64,
    'learning rate': 0.001
}
Table 1: overall P, R, and F1 (common evaluation)

| Hidden Size | Embed Size | Train P | Train R | Train F1 | Dev P | Dev R | Dev F1 | Best Epoch |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 512 | 256 | 99.75 | 99.76 | 99.76 | 86.95 | 81.30 | 84.03 | 30 |
| 300 | 300 | 99.00 | 97.76 | 98.38 | 87.38 | 81.10 | 84.13 | 25 |
Table 2: performance by entity type (LOC, ORG, PER)

| Hidden Size | Embed Size | Train Acc | Train Loc P | Train Loc R | Train Loc F1 | Train Org P | Train Org R | Train Org F1 | Train Per P | Train Per R | Train Per F1 | Test Acc | Test Loc P | Test Loc R | Test Loc F1 | Test Org P | Test Org R | Test Org F1 | Test Per P | Test Per R | Test Per F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 512 | 256 | 99.98 | 99.81 | 99.84 | 99.83 | 99.68 | 99.73 | 99.70 | 99.67 | 99.64 | 99.65 | 97.84 | 87.62 | 85.33 | 86.46 | 83.46 | 70.94 | 76.69 | 87.99 | 80.68 | 84.17 |
| 300 | 300 | 99.81 | 99.02 | 98.31 | 98.67 | 98.46 | 96.17 | 97.30 | 99.41 | 97.87 | 98.64 | 97.87 | 87.68 | 85.06 | 86.35 | 84.10 | 71.19 | 77.11 | 89.20 | 80.29 | 84.51 |
other_param = {
    'Model': 'Bert CRF',
    'Batch Size': 32,
    'learning rate': 2e-5
}
| Epoch | Dev Acc | Dev P | Dev R | Dev F1 | Loc P | Loc R | Loc F1 | Org P | Org R | Org F1 | Per P | Per R | Per F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 99.63 | 95.28 | 96.42 | 95.85 | 97.53 | 97.83 | 97.68 | 89.99 | 93.85 | 91.88 | 97.49 | 98.62 | 98.05 |
| 2 | 99.67 | 96.30 | 96.97 | 96.64 | 97.97 | 98.08 | 98.02 | 92.96 | 95.15 | 94.04 | 97.59 | 98.82 | 98.20 |
| 3 | 99.70 | 96.62 | 97.23 | 96.92 | 98.18 | 98.44 | 98.31 | 93.42 | 95.56 | 94.48 | 98.27 | 98.84 | 98.55 |
MIT