Comments (6)
Hi @ad-m,
the mail-parser results include a binary flag. If it is true, the sample is binary; if it is false, the sample is not binary.
If you read this part of the code, you can see:
    ...
    charset = p.get_content_charset('utf-8')  # get the charset
    ...
    if filename:
        binary = False
        mail_content_type = ported_string(p.get_content_type())
        transfer_encoding = ported_string(
            p.get('content-transfer-encoding', '')).lower()
        if transfer_encoding == "base64" or \
                (transfer_encoding == "quoted-printable" and
                 "application" in mail_content_type):
            payload = p.get_payload(decode=False)
            binary = True
        else:
            payload = ported_string(
                p.get_payload(decode=True), encoding=charset)
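To make the difference between `decode=False` and `decode=True` concrete, here is a minimal sketch that uses only Python's standard `email` package, not mail-parser itself. The function name `inspect_attachment` and the sample file name are made up for illustration; the point is only that the undecoded payload of a base64 part is text, while the decoded payload is the original bytes.

```python
import base64
from email import message_from_bytes, policy
from email.message import EmailMessage


def inspect_attachment(raw):
    """Round-trip `raw` through a MIME message and return what the parser sees.

    Returns (transfer_encoding, undecoded_payload, decoded_payload) for the
    first part that carries a filename.
    """
    # Build a small message with one binary attachment (illustrative data).
    msg = EmailMessage()
    msg["Subject"] = "demo"
    msg.set_content("body text")
    msg.add_attachment(raw, maintype="application",
                       subtype="octet-stream", filename="sample.bin")

    parsed = message_from_bytes(msg.as_bytes(), policy=policy.default)
    for part in parsed.walk():
        if part.get_filename():
            encoding = str(part.get("content-transfer-encoding", "")).lower()
            undecoded = part.get_payload(decode=False)  # still base64 text
            decoded = part.get_payload(decode=True)     # original bytes
            return encoding, undecoded, decoded
    return None


encoding, undecoded, decoded = inspect_attachment(bytes(range(256)))
print(encoding, type(undecoded).__name__, type(decoded).__name__)
```

So when the binary flag is set, the caller still has to undo the transfer encoding before the data is usable as bytes.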
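mail-parser gives you the correct payload, so you should do:

    import base64

    if binary:
        # base64.b64decode works on Python 2 and 3;
        # payload.decode("base64") is Python 2 only.
        with open(sample, "wb") as f:
            f.write(base64.b64decode(payload))
    else:
        # Write the encoded bytes in binary mode so this runs on both versions.
        with open(sample, "wb") as f:
            f.write(payload.encode("utf-8"))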
That's it.
from mail-parser.
I added a store attachments function to the mail-parser command line:
$ mailparser -f my_mail -sa -ap /tmp/attachments
from mail-parser.
I understand, but mail-parser is a parser. If you want to store a sample on the filesystem, you should handle the possible issues yourself.
I can't introduce extra logic.
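As an illustration of the kind of issues a caller would have to handle when storing a sample, here is a minimal sketch. The helper `save_attachment` and its parameters are hypothetical and not part of mail-parser. It assumes, like the snippet earlier in this thread, that a payload flagged as binary is base64 text; quoted-printable payloads are not handled here.

```python
import base64
import os


def save_attachment(filename, payload, binary, directory):
    """Write one attachment into `directory` and return the path used."""
    # Keep only the last path component so a name such as "../../x" cannot
    # escape the target directory.
    safe_name = os.path.basename(filename.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        safe_name = "attachment"

    # Assumption: binary payloads are base64 text, text payloads are str.
    data = base64.b64decode(payload) if binary else payload.encode("utf-8")

    stem, ext = os.path.splitext(safe_name)
    path = os.path.join(directory, safe_name)
    counter = 1
    while True:
        try:
            # Mode "xb" fails if the file exists, so nothing is overwritten.
            with open(path, "xb") as f:
                f.write(data)
            return path
        except FileExistsError:
            path = os.path.join(directory, "{}_{}{}".format(stem, counter, ext))
            counter += 1
```

Two attachments with the same name end up as `name.ext` and `name_1.ext` rather than the second silently replacing the first.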
from mail-parser.
In my opinion, proper decoding of binary attachments to a binary form native to Python is a part of parsing. In my opinion, the logic around decoder_map is an element of the project. This is in the scope of the project as much as the date parsing that is present today. Of course, saving attachments is out of the project's scope. Parsing mail attachments to the native binary form does not require writing them to the disk.
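A rough sketch of what such a decoder map could look like, using only the standard library. The names `DECODER_MAP` and `payload_to_bytes` are hypothetical and chosen for illustration; this is not mail-parser's actual code, and nothing is written to disk.

```python
import base64
import quopri

# Hypothetical mapping from Content-Transfer-Encoding to a decoder that
# turns the encoded payload into raw bytes.
DECODER_MAP = {
    "base64": base64.b64decode,
    "quoted-printable": quopri.decodestring,
}


def payload_to_bytes(payload, transfer_encoding="", charset="utf-8"):
    """Return the attachment payload as bytes, whatever its encoding."""
    decoder = DECODER_MAP.get(transfer_encoding.strip().lower())
    if decoder is not None:
        # Transfer-encoded payloads are plain ASCII text by construction.
        raw = payload.encode("ascii") if isinstance(payload, str) else payload
        return decoder(raw)
    # No transfer decoding needed: only convert text to bytes.
    return payload.encode(charset) if isinstance(payload, str) else payload
```

With something like this, every attachment reaches the user in one common binary form, and the base64 versus quoted-printable distinction stays inside the parser.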
from mail-parser.
First of all @fedelemantuano, thank you for your awesome module.
On this issue though, from my POV as a user, I stand on @ad-m's side.
Maybe you think attachment decoding is a trivial task, but it is one that will be done by each user who is interested in the attachments, and they will have to implement that little bit of logic in their code after having looked at this issue.
Maybe what is best on the user side is a common binary format for the payload, with the additional logic kept (and possibly simplified) on the module side?
from mail-parser.
I use mail-parser in the SpamScope tool. I developed a function that writes the attachments (see here).
I use it every day and it works well.
from mail-parser.