Comments (4)
This is actually by design. Most signature lines are shorter than 60 chars, so the limit is a good sanity check. 80 chars is definitely too much: you'll get a lot of false positives. We could of course raise it a bit, e.g. to 65, but there will always be an email that is one char over. 60 chars seems like a reasonable number.
from talon.
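For readers following the thread, here is a minimal sketch of the kind of length-based sanity check described in the comment above. It is not talon's actual code: the names `MAX_SIGNATURE_LINE_LENGTH`, `is_plausible_signature_line`, and `trailing_signature_candidates` are hypothetical, and only the 60-char threshold comes from the comment. The threshold is treated as inclusive here, which is also an assumption.

```python
from typing import List

# Hypothetical constant; the value 60 is the threshold discussed in the thread.
MAX_SIGNATURE_LINE_LENGTH = 60


def is_plausible_signature_line(line: str,
                                max_length: int = MAX_SIGNATURE_LINE_LENGTH) -> bool:
    """Return True if the line is non-empty and at most max_length chars long."""
    stripped = line.strip()
    return 0 < len(stripped) <= max_length


def trailing_signature_candidates(body: str, max_lines: int = 10) -> List[str]:
    """Collect short lines from the end of the message.

    Walks upward from the last line, skips blank lines, and stops at the
    first line that is too long to be a signature line.
    """
    candidates = []
    for line in reversed(body.splitlines()[-max_lines:]):
        if not line.strip():
            continue  # blank lines neither count nor stop the scan
        if not is_plausible_signature_line(line):
            break  # a long line marks the end of the candidate block
        candidates.append(line)
    return list(reversed(candidates))
```

With a check like this, a mailing-list footer whose URL runs past the limit stops the scan, which is the false-negative case raised in the next comment.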
Understood. Looking at some of my mail, many sigs are just basic name, org, url, etc. But lots of mailing lists have signature-like footers with longer list info URLs. And there are the people that send mail with huge blobs of legalese signatures which certainly go over 60 char lines. Feel free to close but maybe add some notes to the docs about the issue.
Sure. I think the best solution will be to parse those specific signatures separately, prior to detecting the "human" signature. The legalese signatures are quite different from "human" signatures and require different detection techniques.
I'm actually thinking about a separate module and maybe a different name for those cases e.g. something like "disclaimer".
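The idea of handling disclaimers in a separate pass before signature detection could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function `split_disclaimer`, the `DISCLAIMER_MARKERS` phrase list, and the rule of inspecting only the last paragraph are hypothetical and are not part of talon.

```python
import re
from typing import Tuple

# Hypothetical marker phrases; a real list would need tuning on actual mail.
DISCLAIMER_MARKERS = re.compile(
    r"\b(confidential|intended recipient|legally privileged|disclaimer)\b",
    re.IGNORECASE,
)


def split_disclaimer(body: str) -> Tuple[str, str]:
    """Split a message into (text, disclaimer).

    If the last paragraph contains a marker phrase, it is returned as the
    disclaimer and removed from the text. Otherwise the body is returned
    unchanged with an empty disclaimer.
    """
    paragraphs = re.split(r"\n\s*\n", body.rstrip())
    if len(paragraphs) > 1 and DISCLAIMER_MARKERS.search(paragraphs[-1]):
        return "\n\n".join(paragraphs[:-1]), paragraphs[-1]
    return body, ""
```

The remaining text could then be passed to the regular signature detection, so that long legalese lines never reach the 60-char check.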
I've created a separate issue for disclaimers #8. Going to close this one. Thanks for your feedback!