scofield7419 / sequence-labeling-bilstm-crf Goto Github PK

The BiLSTM-CRF model implementation in Tensorflow, for sequence labeling tasks.

License: GNU General Public License v3.0

Python 4.39% HTML 6.08% CSS 14.82% JavaScript 74.36% PHP 0.02% Makefile 0.01% CoffeeScript 0.31%

bilstm-crf tensorflow python35 sequence-labeling ner nlp

sequence-labeling-bilstm-crf's Issues

something not mentioned

1. For prediction, the sentence length should be shorter than 1000, according to utils.extractEntity.
2. If an entity ends at the end of the sentence, it cannot be predicted because of a bug in utils.extractEntity, line 86. That line can be changed to:
reg_str = r'([0-9][0-9][0-9]B'+label_hyphen + tag_str + r' )([0-9][0-9][0-9]I'+label_hyphen + tag_str + r' )*([0-9][0-9][0-9]E'+label_hyphen + tag_str + r')|([0-9][0-9][0-9]S'+label_hyphen + tag_str + r' )'

Thanks, as always, for your code.
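The effect of the proposed pattern can be checked in isolation. The sketch below is not taken from the repository: the encoding helper, the `-` separator, and the `PER` tag are assumptions inferred from the shape of the regex (a three-digit position followed by the label, separated by spaces), and the "strict" pattern is a hypothetical variant that requires a space after the E tag, used only to illustrate the reported failure.

```python
import re

label_hyphen = "-"  # assumed separator between position tag and entity type
tag_str = "PER"     # hypothetical entity type

def encode(labels):
    """Hypothetical encoding: 3-digit position + label, joined by spaces."""
    return " ".join("%03d%s" % (i, lab) for i, lab in enumerate(labels))

# Pattern proposed in the issue: no trailing space required after the E tag.
proposed = r'([0-9][0-9][0-9]B'+label_hyphen + tag_str + r' )([0-9][0-9][0-9]I'+label_hyphen + tag_str + r' )*([0-9][0-9][0-9]E'+label_hyphen + tag_str + r')|([0-9][0-9][0-9]S'+label_hyphen + tag_str + r' )'

# Hypothetical stricter variant: the E tag must be followed by a space.
strict = r'([0-9][0-9][0-9]B'+label_hyphen + tag_str + r' )([0-9][0-9][0-9]I'+label_hyphen + tag_str + r' )*([0-9][0-9][0-9]E'+label_hyphen + tag_str + r' )|([0-9][0-9][0-9]S'+label_hyphen + tag_str + r' )'

# Entity that ends exactly at the end of the sentence.
sentence = encode(["O", "B-PER", "I-PER", "E-PER"])

proposed_match = re.search(proposed, sentence)
strict_match = re.search(strict, sentence)

# Single-token entity at the end of the sentence, for comparison.
single_match = re.search(proposed, encode(["O", "S-PER"]))
```

Under these assumptions the proposed pattern finds the sentence-final entity and the strict variant does not. The S branch of the proposed pattern still expects a trailing space, so a single-token entity in final position would not match either.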

num_steps

Why does num_steps need to be the same for training and testing?

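About the dataset

@scofield7419 Hello, what kind of dataset did you use? When I tried the bundled train.in, training raised an error at the 12th iteration, and the F1 value was -1.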

char2id and label2id

Are char2id and label2id generated in advance?

It takes too long to predict a sentence

I ran a test and trained a model using example_datasets2, but prediction takes too long. Could you give me some advice on how to speed up prediction?

ValueError: Cannot reshape a tensor with 28800 elements to shape [128, 6, 6] (4608 elements) We've got an error while stopping in post-mortem: <type 'exceptions.KeyboardInterrupt'>

Could you help? When running bilstm_crf_word embedding with the command python train.py train.in model -v validation.in -e 10, I got the following error:
transitions = tf.reshape(tf.concat(0, [transitions] * self.batch_size), [self.batch_size, 6, 6])

ValueError: Cannot reshape a tensor with 28800 elements to shape [128, 6, 6] (4608 elements)
We've got an error while stopping in post-mortem: <type 'exceptions.KeyboardInterrupt'>

How can I fix this?
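The numbers in the traceback admit a simple reading: 28800 = 128 × 225 = 128 × 15 × 15, so the tensor being tiled looks like a 15 × 15 transition matrix, while the reshape target hard-codes 6 × 6. The sketch below only checks that arithmetic; the helper name is hypothetical, and the idea that 6 matches the label count of a different dataset is an assumption, not something the thread confirms.

```python
import math

def infer_transition_side(total_elements, batch_size):
    """Return n such that batch_size * n * n == total_elements, or None."""
    if total_elements % batch_size != 0:
        return None
    per_example = total_elements // batch_size
    side = math.isqrt(per_example)
    return side if side * side == per_example else None

batch_size = 128
actual_elements = 28800          # element count reported in the ValueError
hardcoded_side = 6               # from the reshape target [batch_size, 6, 6]

expected_elements = batch_size * hardcoded_side * hardcoded_side
actual_side = infer_transition_side(actual_elements, batch_size)

print("reshape target expects", expected_elements, "elements")
print("tiled tensor looks like", actual_side, "x", actual_side, "per example")
```

If this reading is right, the reshape target would need to follow the actual size of `transitions` rather than the literal 6.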

What is the purpose of self.targets_weight in BILSTM_CRF.py?

Hello, your code covers both char-sequence and word-sequence labeling, and both models contain these lines:
y_train_weight_batch = 1 + np.array((y_train_batch == label2id['B']) | (y_train_batch == label2id['E']), float)
self.targets_weight:y_train_weight_batch
It does not seem to take part in any computation. What is the purpose of self.targets_weight in BILSTM_CRF.py?
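What the quoted expression computes can be seen on a toy batch: positions labeled B or E get weight 2, everything else gets weight 1. The `label2id` mapping and the batch below are made up for illustration; whether the graph actually consumes these weights is exactly the open question above, and this sketch does not answer it.

```python
import numpy as np

# Hypothetical label mapping and a toy batch of two label sequences.
label2id = {'O': 0, 'B': 1, 'I': 2, 'E': 3}
y_train_batch = np.array([
    [0, 1, 2, 3],   # O B I E
    [1, 3, 0, 0],   # B E O O
])

# Same expression as in the issue: the boolean mask is cast to float (0.0/1.0)
# and shifted by 1, so entity boundaries are weighted 2.0 and the rest 1.0.
y_train_weight_batch = 1 + np.array(
    (y_train_batch == label2id['B']) | (y_train_batch == label2id['E']), float)

print(y_train_weight_batch)
```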

How to calculate the metrics based on test.out and test.csv

In the HandBook, you say that "calcu_measure_testout.py" can compute the metrics based on test.out and test.csv, but running this file does not produce the result. I need your help!

Logic Problem in self.prepare (DataManger.py)

I tested this project on a new dataset, but I always got:

training set size: 0 validating set size:0

I traced where this comes from; it is in the prepare function:

def prepare(self, tokens, labels, is_padding=True, return_psyduo_label=False):
        X = []
        y = []
        y_psyduo = []
        tmp_x = []
        tmp_y = []
        tmp_y_psyduo = []

        for record in zip(tokens, labels):
            c = record[0]
            l = record[1]
            if c == -1:  # empty line
                if len(tmp_x) <= self.max_sequence_length:
                    X.append(tmp_x)
                    y.append(tmp_y)
                    if return_psyduo_label: y_psyduo.append(tmp_y_psyduo)
                tmp_x = []
                tmp_y = []
                if return_psyduo_label: tmp_y_psyduo = []
            else:
                tmp_x.append(c)
                tmp_y.append(l)
                if return_psyduo_label: tmp_y_psyduo.append(self.label2id["O"])
        if is_padding:
            X = np.array(self.padding(X))
        else:
            X = np.array(X)
        y = np.array(self.padding(y))
        if return_psyduo_label:
            y_psyduo = np.array(self.padding(y_psyduo))
            return X, y_psyduo

        return X, y

Depending on is_padding and return_psyduo_label, this part runs:

        if is_padding:
            X = np.array(self.padding(X))
        else:
            X = np.array(X)

X will always be empty. Please take a look.

Thanks
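One way to narrow this down is to run the grouping loop of prepare on its own. The function below is a stripped-down copy of that loop (no padding, no pseudo labels) under a hypothetical name; it is a sketch for experimentation, not the repository's code. It shows three ways the loop as quoted can yield an empty or shortened X: no -1 separator in the input, sentences longer than max_sequence_length, and a final sentence that is not followed by -1.

```python
def group_sentences(tokens, labels, max_sequence_length):
    """Stripped-down version of the loop in prepare (hypothetical helper)."""
    X, y = [], []
    tmp_x, tmp_y = [], []
    for c, l in zip(tokens, labels):
        if c == -1:  # sentence separator (empty line in the data file)
            # A sentence is only stored when a separator is seen
            # and it is not longer than the maximum length.
            if len(tmp_x) <= max_sequence_length:
                X.append(tmp_x)
                y.append(tmp_y)
            tmp_x, tmp_y = [], []
        else:
            tmp_x.append(c)
            tmp_y.append(l)
    return X, y

# Two sentences, each terminated by -1: both are kept.
ok_X, _ = group_sentences([5, 6, -1, 7, 8, -1], [1, 2, -1, 1, 2, -1], 10)

# No separator at all: nothing is ever appended, so X is empty.
no_sep_X, _ = group_sentences([5, 6, 7, 8], [1, 2, 1, 2], 10)

# Sentence longer than the limit: it is dropped.
too_long_X, _ = group_sentences([5, 6, -1], [1, 2, -1], 1)

# Last sentence not followed by -1: it is silently lost.
unterminated_X, _ = group_sentences([5, 6, -1, 7, 8], [1, 2, -1, 1, 2], 10)
```

If a new dataset produces "training set size: 0", checking whether its token stream contains -1 separators and how its sentence lengths compare with max_sequence_length would be a reasonable first step.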

The speed of prediction is slow

It takes at least 5 minutes to load the vocab and dataManager, and prediction is too slow. I tried setting CUDA_VISIBLE_DEVICES=1, but it doesn't use CUDA to extract entities. I'd like to know why; I have made sure that CUDA is correctly configured for tensorflow-gpu.
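One thing worth ruling out is the value of CUDA_VISIBLE_DEVICES itself: it lists device indices starting at 0, so on a machine with a single GPU the value 1 refers to a device that does not exist and leaves no GPU visible. The function below is a simplified pure-Python model of that rule, written for illustration only (hypothetical name, integer indices only, no UUID support); it does not query the driver, and whether the reporter's machine has one GPU is an assumption.

```python
def visible_gpu_indices(env_value, num_physical_gpus):
    """Simplified model of CUDA_VISIBLE_DEVICES for integer indices only."""
    if env_value is None:
        # Variable not set: every physical GPU is visible.
        return list(range(num_physical_gpus))
    visible = []
    for part in env_value.split(","):
        part = part.strip()
        # An invalid index ends the list; later entries are ignored.
        if not part.isdigit() or int(part) >= num_physical_gpus:
            break
        visible.append(int(part))
    return visible

print(visible_gpu_indices("1", 1))   # single-GPU machine, value 1
print(visible_gpu_indices("0", 1))   # single-GPU machine, value 0
print(visible_gpu_indices("1", 2))   # two-GPU machine, value 1
```

This only concerns device visibility; the five-minute loading time for the vocab and dataManager is a separate matter that the GPU setting would not affect.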
