Comments (10)

jenojp commented on May 19, 2024

Hi @Raghu17s - I've run across this issue too, especially with scispaCy as the language model. The NER produces the entity span "no headache" instead of just "headache", so the negation word ends up inside the entity itself. Check out the example here: https://github.com/jenojp/negspacy#negations-in-noun-chunks

Basically just modify this line and it should work as you expect.

negex = Negex(nlp, language="en_clinical", chunk_prefix=["no"])

from negspacy.

Raghu17s commented on May 19, 2024

Thank you.
Should I then list all negation words in chunk_prefix, such as "not", "but", etc.? Or is "no" the only one that causes issues?

I have observed one strange thing. If I add a comma between "no" and "headache" ("no, headache" instead of "no headache"), it is picked up as a negation. But this cannot be done manually every time.

jenojp commented on May 19, 2024

I see, that makes sense: because of the comma, the scispaCy NER model does not treat "no, headache" as one entity, so "no" stays outside the entity span and is matched as a regular negation.

Yes, you should add the other words you're having this problem with, based on what the NER produces. Be careful not to be too greedy, though: in the biomedical domain, many entities "start with" a negation-like word (e.g., non hodgkin's lymphoma).
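To make the greediness concern concrete, here is a minimal plain-Python sketch. It is not negspacy's code: `starts_with_prefix` and the token lists are hypothetical stand-ins for entity spans, and the only thing it mimics is a case-insensitive comparison against an entity's first token.

```python
def starts_with_prefix(entity_tokens, chunk_prefix):
    """Return True if the entity's first token equals any prefix (case-insensitive).

    Hypothetical helper for illustration only; not part of negspacy.
    """
    if not entity_tokens:
        return False
    first = entity_tokens[0].lower()
    return any(first == prefix.lower() for prefix in chunk_prefix)


# Hypothetical entity spans, already split into tokens.
entities = [
    ["no", "headache"],                # negation word absorbed into the entity
    ["non", "hodgkin's", "lymphoma"],  # a disease name, not a negated finding
    ["headache"],                      # plain entity
]

narrow = ["no"]         # conservative prefix list
greedy = ["no", "non"]  # broader list that also catches "non"

narrow_flags = [starts_with_prefix(e, narrow) for e in entities]
greedy_flags = [starts_with_prefix(e, greedy) for e in entities]

print(narrow_flags)
print(greedy_flags)
```

With the broader list, the lymphoma entity is flagged as negated even though "non" is part of the disease name, which is the kind of false positive to watch for when extending chunk_prefix.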

stefano-marchesin commented on May 19, 2024

Hi @jenojp,
what about composite terms instead of single words? For example, I believe 'free of' is not detected by negex because in the code (starting at line 307 of negation.py):

if self.chunk_prefix:
    if any(
        c.text.lower() == doc[e.start].text.lower()
        for c in self.chunk_prefix
    ):
        e._.set(self.extension_name, True)

the check is only on the first word (i.e., doc[e.start]) and therefore 'free of' is not appropriately handled.

jenojp commented on May 19, 2024

@stefano-marchesin, that's a good catch. I hadn't had any composite terms pop up in my use cases but I can definitely see that being an issue. Could you share an example entity that this is happening on? I'm assuming you're using scispacy?

stefano-marchesin commented on May 19, 2024

Hi @jenojp! yes, I am using scispacy and within negex the language param is set to "en_clinical". As an example, consider the following: "resection margins free of dysplasia"

In this case, "free of dysplasia" is treated as a single entity, and Negex fails to catch "free of" as a chunk_prefix. As a workaround, I did the following (starting at line 307 of negation.py):

if self.chunk_prefix:
    if any(
        c.text.lower() == doc[e.start:e.start+len(c)].text.lower()
        for c in self.chunk_prefix
    ):
        e._.set(self.extension_name, True)

which solved the problem for me.
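For anyone comparing the two checks, here is a rough standalone sketch of the difference in plain Python, without spaCy. The function names and the token-list representation are hypothetical, and the guard that keeps a prefix from running past the end of the entity is an extra assumption that is not in the workaround above.

```python
def first_token_match(tokens, ent_start, prefixes):
    """First-token check: each whole prefix is compared to one token only."""
    first = tokens[ent_start].lower()
    return any(" ".join(prefix).lower() == first for prefix in prefixes)


def span_match(tokens, ent_start, ent_end, prefixes):
    """Span check: each prefix is compared to the first len(prefix) entity tokens."""
    for prefix in prefixes:
        end = ent_start + len(prefix)
        if end > ent_end:
            # Assumption: a prefix longer than the entity cannot match it.
            continue
        window = [t.lower() for t in tokens[ent_start:end]]
        if window == [p.lower() for p in prefix]:
            return True
    return False


tokens = ["resection", "margins", "free", "of", "dysplasia"]
ent_start, ent_end = 2, 5          # hypothetical entity: "free of dysplasia"
prefixes = [["no"], ["free", "of"]]  # prefixes pre-split into tokens

single = first_token_match(tokens, ent_start, prefixes)
multi = span_match(tokens, ent_start, ent_end, prefixes)

print(single, multi)
```

The first-token check can never equal a two-word prefix, while the span check compares as many tokens as the prefix contains, so "free of" is caught.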

stefano-marchesin commented on May 19, 2024

On a side note,

I've also noticed that for the same mention from my previous comment, i.e., "resection margins free of dysplasia", negex also flags "resection margins" as a negated mention. I believe this is caused by the "free" pattern included in following_negations. Is there any way to disable this behavior? As it stands, it creates inconsistencies.

jenojp commented on May 19, 2024

Hey, can you give me a de-identified test block of text to work with? I want to try to recreate what you're seeing.

jenojp commented on May 19, 2024

Also, you're able to add and remove patterns on the fly in 0.1.8. So you can remove "free" simply by doing the following:

from negspacy.negation import Negex
import spacy

nlp = spacy.load("en_core_sci_sm")
negex = Negex(nlp)
negex.remove_patterns(following_negations=["free"])

jenojp commented on May 19, 2024

Closed with release 0.1.9
