alfonsrv / mail-parser-reply

📧 Mail reply parser library for Python with multi-language support

Home Page: https://pypi.org/project/mail-parser-reply/

License: MIT License

mail-parser-reply's Introduction

Mail Reply Parser 📧🐍

Multi-language email reply parsing for international environments 🌍

Mail clients handle reply formatting differently, making reliable parsing difficult. Thank god we have standards. This library splits text-based emails into separate replies, using the headers that different multilingual mail clients commonly insert to mark where one reply ends and the quoted one begins.

Each reply can be returned either as the whole mail message body or with headers, signatures and common disclaimers stripped. Currently supported languages are: English (en), German (de), French (fr), Italian (it), Japanese (ja), Polish (pl). Adding more languages is quite easy.

This is an improved Python implementation of GitHub's Ruby-based email_reply_parser and an adaptation of Zapier's email-reply-parser, both of which split mails into fragments instead of distinct replies and only support English.

โญ Features

โญ Easy to implement
โญ Multilanguage Support
โญ Text-based mail parsing
โญ Detect headers, signatures and disclaimers
โญ Fully type annotated
โญ Easy-to-read code and well-tested

Overview 🔭

This library splits an incoming mail into its individual replies and provides the text content of each one, with or without signatures, disclaimers and headers.

For example, it can turn the following email:

Awesome! I haven't had another problem with it.

Thanks,
alfonsrv

On Wed, Dec 20, 2023 at 13:37, RAUSYS <[email protected]> wrote:

> The good news is that I've found a much better query for lastLocation.
> It should run much faster now. Can you double-check?

Into just the replied text content:

Awesome! I haven't had another problem with it.
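
To illustrate the idea of header-based splitting, here is a minimal sketch using only the standard library. It is not the library's implementation: the pattern `ON_WROTE_HEADER` and the function `split_on_reply_header` are hypothetical, cover only the English "On ... wrote:" header, and do not strip signatures, so the first chunk still ends with "Thanks, alfonsrv". The sender address is a placeholder.

```python
import re

# Hypothetical pattern: matches an English "On <date>, <sender> wrote:" line.
# The real library ships its own per-language patterns; this is an illustration.
ON_WROTE_HEADER = re.compile(r"^On\s.+\swrote:[ \t]*$", re.MULTILINE)


def split_on_reply_header(text: str) -> list[str]:
    """Split a plain-text mail body at each "On ... wrote:" header line."""
    parts = []
    start = 0
    for match in ON_WROTE_HEADER.finditer(text):
        parts.append(text[start:match.start()].strip())
        start = match.start()
    parts.append(text[start:].strip())
    # Drop empty chunks, e.g. when the body starts with a header line
    return [part for part in parts if part]


# Placeholder sender address; the rest mirrors the example mail above
mail_body = """Awesome! I haven't had another problem with it.

Thanks,
alfonsrv

On Wed, Dec 20, 2023 at 13:37, RAUSYS <sender@example.com> wrote:

> The good news is that I've found a much better query for lastLocation.
> It should run much faster now. Can you double-check?
"""

replies = split_on_reply_header(mail_body)
print(replies[0])
```

Each chunk after the first begins with its own header line, which is roughly what separates a "reply" from a mere text fragment.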

Get started 👾

Installation

pip install mail-parser-reply

Parse Replies

from mailparser_reply import EmailReplyParser

mail_body = 'foobar'
languages = ['en', 'de']
# languages goes to the constructor; read() is called on the instance with the mail text
mail_message = EmailReplyParser(languages=languages).read(text=mail_body)
print(mail_message.replies)

Or get only the latest reply using:

latest_reply = EmailReplyParser(languages=languages).parse_reply(text=mail_body)

Parser API

EmailMessage.text:              Mail body
EmailMessage.languages:         Languages to use for parsing headers
EmailMessage.replies:           List of EmailReply; single parsed replies
EmailMessage.include_english:   Always include English language for parsing
EmailMessage.default_language:  Default language to use if language dictionary 
                                doesn't include any other language codes

EmailMessage.HEADER_REGEX:      RegEx for identifying headers, separating mails
EmailMessage.SIGNATURE_REGEX:   RegEx for identifying signatures
EmailMessage.DISCLAIMERS_REGEX: RegEx for identifying disclaimers

EmailMessage.read(): Parse EmailMessage.text into EmailReply objects, which are
                     then stored in EmailMessage.replies

EmailReply.content:     Unprocessed mail body with headers, signatures, disclaimers
EmailReply.body:        Mail body without headers, signatures, disclaimers
EmailReply.full_body:   Mail body; just without headers

EmailReply.headers:     Identified Headers
EmailReply.signatures:  Identified Signatures
EmailReply.disclaimers: Identified disclaimers
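
A short sketch of how these attributes might be used together. The helper names `summarize_replies` and `demo` are hypothetical; the attribute names are the ones listed above. It assumes that an empty `signatures` or `disclaimers` value means nothing was identified. The import sits inside `demo` so that the helper can be defined without the package installed.

```python
def summarize_replies(mail_message):
    """Return (body, has_signature, has_disclaimer) for each parsed reply.

    Expects an object exposing .replies, where each reply has the
    .body, .signatures and .disclaimers attributes listed above.
    """
    # Assumption: an empty value means nothing was identified
    return [
        (reply.body, bool(reply.signatures), bool(reply.disclaimers))
        for reply in mail_message.replies
    ]


def demo(mail_body: str) -> None:
    # Imported here so summarize_replies() stays usable without the package
    from mailparser_reply import EmailReplyParser

    mail_message = EmailReplyParser(languages=["en", "de"]).read(text=mail_body)
    for body, has_signature, has_disclaimer in summarize_replies(mail_message):
        print(repr(body), has_signature, has_disclaimer)
```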

mail-parser-reply's People

Contributors

alfonsrv, killinsun, mkaczkow, tognee, westnet-paul

mail-parser-reply's Issues

Unable to use the parser

I have a function that calls your library. Simply copying your example (even using "foobar" as data) gives me the following error:

mail_message = EmailReplyParser.read(text=mail_body, languages=languages)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: EmailReplyParser.read() got an unexpected keyword argument 'languages'

Function

def beautify_body(body):
    # Remove HTML code
    soup = BeautifulSoup(body, 'html.parser')
    body = soup.get_text()

    mail_body = 'foobar'; languages = ['en', 'de']
    mail_message = EmailReplyParser.read(text=mail_body, languages=languages)
    print(mail_message.replies)

    return body
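
Judging by the README usage, `languages` is an argument of the `EmailReplyParser` constructor, and `read()` is called on the resulting instance with `text`. The reported call instead invokes `read()` directly on the class and passes `languages` to it, which matches the error. A possible corrected call, as a sketch: the function name `parse_replies` is hypothetical, it assumes the intent is to return the parsed replies, and it leaves out the HTML-stripping step. The import sits inside the function so that the sketch loads without the package installed.

```python
def parse_replies(mail_body: str, languages=("en", "de")):
    """Parse a plain-text mail body into its replies."""
    # Imported inside the function so the sketch loads without the package
    from mailparser_reply import EmailReplyParser

    # languages belongs to the constructor; read() is called on the instance
    parser = EmailReplyParser(languages=list(languages))
    mail_message = parser.read(text=mail_body)
    return mail_message.replies
```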

Question - why some test cases are commented?

Hi,

First of all, thank you for the great work you did on this package.

I'm coming from the Node world, and after spending some time looking for the best free and open-source solution for parsing emails and their contents (replies), I found this one. I think it offers great extensibility.

I was looking at the code, and it is pretty clean! I'm wondering why some test cases are commented out, though?

Thanks in advance.
