Is there any way to replace the current NER? (bootleg issue, closed)

hazyresearch commented on September 14, 2024

Is there any way to replace the current NER?

from bootleg.

Comments (5)

lorr1 commented on September 14, 2024

So I went ahead and added your function as an example in the branch here. If you use the annotator with the custom extract method, it should trigger your extractor. I haven't tested it, but it should get you started.

lorr1 commented on September 14, 2024

Hi!

Yes, you can do this. I have a list of possible extractors here. If you implement your own extractor function and add it to that list, you should be able to select it via this argument here.

As long as you have the same inputs/outputs, it should be possible.
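To illustrate the general idea of a named list of extractors selected by an argument, here is a minimal sketch. It is not bootleg's actual code: the registry `EXTRACTORS`, the decorator `register_extractor`, the function `extract`, and the assumed return shape (a list of aliases plus word-level spans) are all hypothetical names and assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical extractor signature: raw text in, (aliases, word-level spans) out.
# This is an assumption for illustration, not bootleg's documented interface.
Extractor = Callable[[str], Tuple[List[str], List[Tuple[int, int]]]]

# Hypothetical registry mapping a method name to an extractor function.
EXTRACTORS: Dict[str, Extractor] = {}


def register_extractor(name: str):
    """Decorator that adds an extractor to the registry under `name`."""
    def decorator(fn: Extractor) -> Extractor:
        EXTRACTORS[name] = fn
        return fn
    return decorator


@register_extractor("capitalized")
def capitalized_extractor(text: str):
    """Toy extractor: treat every capitalized whitespace token as a mention."""
    aliases, spans = [], []
    for i, word in enumerate(text.split()):
        if word[:1].isupper():
            aliases.append(word.strip(".,").lower())
            spans.append((i, i + 1))  # half-open word span [i, i + 1)
    return aliases, spans


def extract(text: str, method: str):
    """Dispatch to the extractor registered under `method`."""
    if method not in EXTRACTORS:
        raise ValueError(f"Unknown extract method: {method!r}")
    return EXTRACTORS[method](text)


aliases, spans = extract("Lincoln visited Paris.", "capitalized")
```

Any function with the same inputs and outputs can be registered under a new name and selected the same way, which is the property the comment above relies on.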

coolcoder001 commented on September 14, 2024

Hi,
Thanks a lot for the quick response. :)
My extractor function uses flair. It takes a string as input and returns the extracted entities in a pandas dataframe.

def entity_recognition(text):
    """Given a text document, run NER on it using flair and return a dataframe with the following columns
    text: actual raw text input
    entity: identified entity text
    entity_start: character start position of entity in raw text
    entity_end: character end position of entity in raw text
    Only ORG, PERSON, and GPE entities are kept.
    """
    import pandas as pd
    from flair.data import Sentence
    from flair.models import SequenceTagger
    from tqdm import tqdm

    tagger_fast = SequenceTagger.load('ner-ontonotes-fast')
    sentence = Sentence(text)
    tagger_fast.predict(sentence, mini_batch_size=16)

    keep_labels = {'ORG', 'PERSON', 'GPE'}
    entities = []
    # Convert the sentence once instead of calling to_dict for every field access.
    for ent in tqdm(sentence.to_dict(tag_type='ner')['entities']):
        # The first token of the label's string form is taken as the tag name.
        label = str(ent['labels'][0]).split()[0]
        # Exact set membership; the original `in 'ORG'` was a substring test.
        if label in keep_labels:
            entities.append([str(ent['text']), ent['start_pos'], ent['end_pos']])

    entities = pd.DataFrame(entities, columns=['entity', 'entity_start', 'entity_end'])
    entities['text'] = text
    return entities

Can you please help me with the changes I need to make to this function so that it can work with bootleg?

Thanks in advance.
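One adaptation a function like this may need is converting character offsets into word-level spans. The sketch below assumes, without confirmation from this thread, that the target extractor interface expects half-open word-index spans over a whitespace tokenization. The helper name `char_to_word_spans` is hypothetical, and the real tokenization may differ.

```python
import re
from typing import List, Tuple


def char_to_word_spans(
    text: str, char_spans: List[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Map half-open character spans to half-open word-index spans.

    Assumes whitespace tokenization (an assumption for this sketch).
    Character spans that overlap no token are skipped.
    """
    # Character boundaries of each whitespace-delimited token.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    word_spans = []
    for start, end in char_spans:
        # Indices of tokens that overlap the character range [start, end).
        overlapping = [
            i for i, (tok_start, tok_end) in enumerate(tokens)
            if tok_start < end and tok_end > start
        ]
        if overlapping:
            word_spans.append((overlapping[0], overlapping[-1] + 1))
    return word_spans


text = "Barack Obama visited New York."
# Character offsets of "Barack Obama" and "New York" in `text`.
word_spans = char_to_word_spans(text, [(0, 12), (21, 29)])
```

The `entity_start` and `entity_end` columns of the dataframe could be passed as the character spans, with the `entity` column supplying the mention text.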

coolcoder001 commented on September 14, 2024

Hi @lorr1, thanks a lot for your help. You are so nice and awesome :)

I am able to run this code using the Flair NER engine.

However, if I need to make more changes, can I push them directly to the branch you created, or do I need to raise a PR?

lorr1 commented on September 14, 2024

How about you raise PRs? I'll pretty much approve everything, but I'd like to keep track of what you're finding difficult/useful to implement.

Thanks!
