email-outlook-message-perl's Introduction

NAME

    Email::Outlook::Message

DESCRIPTION

    This module reads e-mail messages stored as .msg files (such as generated
    by Outlook), and converts them to Email::MIME objects. It also includes a
    command-line interface in the form of the msgconvert script.

    You do not need Outlook installed to use this module.

VERSION

    0.921

INSTALLATION

    To install this module type the following:

      perl Build.PL
      ./Build
      ./Build test
      ./Build install

    You may have to become root for that final step.

DEPENDENCIES

    This module requires these other modules:

      Carp
      Encode
      Getopt::Long
      IO::String
      Pod::Usage

      Email::MIME               - 1.923 or later
      Email::MIME::ContentType  - 1.014 or later
      Email::Sender             - 1.3 or later
      Email::Simple             - 2.206 or later
      OLE::Storage_Lite         - 0.14 or later

    For testing:

      IO::All
      Test::More

COPYRIGHT AND LICENCE

    Copyright 2002--2020 Matijs van Zuijlen.

    This program is free software; you can redistribute it and/or
    modify it under the same terms as Perl itself.

email-outlook-message-perl's People

Contributors

dependabot[bot], gerritdrost, mvz, ojwb, szabgab, tsubasaogawa, xtaran, ztravis

email-outlook-message-perl's Issues

Wide character errors

I get this error on Ubuntu when trying to convert some emails:

Wide character at /usr/local/share/perl/5.30.0/Email/Outlook/Message.pm line 397.
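
For context, Perl emits this warning when a decoded string containing code points above 0xFF is printed to a handle that has no encoding layer. The following is a minimal sketch of that general behaviour using only core modules; it is not the module's own code, and the sample string and in-memory handles are assumptions made for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A decoded Perl string containing a character above U+00FF (the euro sign).
my $text = "Price: \x{20AC}5";

# Capture warnings so the effect can be inspected programmatically.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Printing the decoded string to a byte-oriented handle triggers the warning.
my $raw_buffer = '';
open my $raw, '>', \$raw_buffer or die "Cannot open in-memory handle: $!";
print {$raw} $text;
close $raw;

# Encoding to UTF-8 bytes before printing avoids it.
my $encoded_buffer = '';
open my $out, '>', \$encoded_buffer or die "Cannot open in-memory handle: $!";
print {$out} encode('UTF-8', $text);
close $out;

printf "warnings raised: %d\n", scalar @warnings;
```

An alternative with the same effect is to put an encoding layer on the handle, for example with binmode($fh, ':encoding(UTF-8)'), and print the decoded string directly.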

Handle RTF-encapsulated HTML

The attached archive contains .msg files with HTML bodies encoded in base64.
The library fails to convert them correctly to .eml messages.

msg.zip

Does it support the reverse conversion?

The docs don't appear to specify that it does, but I'm wondering whether it is possible with this tool (or the same libraries) to take an .eml file and convert it to a .msg file. I'm aiming to use this as part of an Apache NiFi workflow on a Linux host.

Failed test 'Checking if body structure for t/files/plain_jpeg_attached.msg is the same'

While working on packaging msgconvert for Guix, I noticed that one of the tests is failing:

#   Failed test 'Checking if body structure for t/files/plain_jpeg_attached.msg is the same'
#   at t/full_structure.t line 19.
#     Structures begin differing at:
#          $got->[4][1][0] = 'content-disposition: attachment; filename="test.jpg"'
#     $expected->[4][1][0] = 'content-disposition: attachment; filename=test.jpg'
# Looks like you failed 1 test of 12.
t/full_structure.t ....... 
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/12 subtests 
t/gpg_signed.t ........... ok
t/internals.t ............ ok
t/plain_jpeg_attached.t .. ok
t/plain_uc_unsent.t ...... ok
t/plain_uc_wc_unsent.t ... ok
t/plain_unsent.t ......... ok
t/pod_coverage.t ......... skipped: Test::Pod::Coverage required for testing pod coverage

Test Summary Report
-------------------
t/full_structure.t     (Wstat: 256 Tests: 12 Failed: 1)
  Failed test:  5
  Non-zero exit status: 1
Files=10, Tests=118,  1 wallclock secs ( 0.04 usr  0.02 sys +  0.83 cusr  0.51 csys =  1.40 CPU)
Result: FAIL
Failed 1/10 test programs. 1/118 subtests failed.
command "./Build" "test" failed with status 255

Please let me know what additional details I can provide about my environment.

Thanks!
Jack

codepage CP28592 unknown

Hello,
I have received this error using msgconvert:

% msgconvert --mbox spam FW_chyba_VUK_pri_UZ.msg
Unknown encoding 'CP28592' at /usr/share/perl5/Email/Outlook/Message.pm line 399

Adding the following to $MAP_CODEPAGE helped:

28592 => 'ISO-8859-2',
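
The idea behind that workaround can be sketched as a lookup from Windows codepage numbers to names that Perl's Encode module understands. The table and the helper name encoding_for_codepage below are hypothetical and are not the module's actual $MAP_CODEPAGE; only Encode's find_encoding and decode are real API.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# Hypothetical lookup table, for illustration only.
my %codepage_to_encoding = (
    28591 => 'ISO-8859-1',
    28592 => 'ISO-8859-2',
    65001 => 'UTF-8',
);

# Return an Encode-compatible name for a codepage number, or undef if
# Encode does not know the resulting name.
sub encoding_for_codepage {
    my ($codepage) = @_;
    my $name = $codepage_to_encoding{$codepage} // "CP$codepage";
    return find_encoding($name) ? $name : undef;
}

# Byte 0xE8 is LATIN SMALL LETTER C WITH CARON (U+010D) in ISO-8859-2.
my $bytes   = "\xE8esky";
my $decoded = decode(encoding_for_codepage(28592), $bytes);

binmode STDOUT, ':encoding(UTF-8)';
print "$decoded\n";
```

Without the 28592 entry, the fallback name would be CP28592, which Encode does not recognise; that is consistent with the error reported above.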

Conversion to .eml in Windows - extra newlines break attachments

Conversion to .eml on Windows 10 (Perl v5.24.1, MSWin32-x64-multi-thread, Strawberry Perl inside Cygwin) adds a form of "newline" to each line (not just base64). As a result (?), attachments are not recognised as files by the email client, and the base64 is simply inlined into the message body.

--mbox generation is unaffected.

Under Linux, conversion of the same .msg to .eml succeeds as expected.

Example message body snippet where it transitions to the attachment, whitespace retained:

==========[SNIP BEGINS]===================================
<p class=MsoNormal><font size=3 face="Times New Roman"><span lang=EN-US

style='font-size:12.0pt'><o:p>&nbsp;</o:p></span></font></p>



</div>



</body>



</html>



--14901086630.FEb4.8904

Content-Type: application/rtf

Content-Disposition: inline

Content-Transfer-Encoding: base64



e1xydGYxXGFuc2lcYW5zaWNwZzEyNTJcZnJvbWh0bWwxIFxkZWZmMHtcZm9udHRibAoNe1xmMFxm

c3dpc3MgQXJpYWw7fQoNe1xmMVxmbW9kZXJuIENvdXJpZXIgTmV3O30KDXtcZjJcZm5pbFxmY2hh

cnNldDIgU3ltYm9sO30KDXtcZjNcZm1vZGVyblxmY2hhcnNldDAgQ291cmllciBOZXc7fQoNe1xm

NFxmc3dpc3NcZmNoYXJzZXQwIEFyaWFsO30KDXtcZjVcZnN3aXNzXGZjaGFyc2V0MCBUaW1lcyBO

ZXcgUm9tYW47fQoNe1xmNlxmbmlsXGZjaGFyc2V0MiBXaW5nZGluZ3M7fQoNe1xmN1xmbmlsXGZj
==========[SNIP ENDS]===================================

Run msgconvert as non-root

Hello!

Thanks for this nice tool.

We have a CentOS distribution, and installation via cpan -i Email::Outlook::Message works flawlessly.

However, we need to execute msgconvert from a web application which is run as apache (or www-run).

I'm not able to execute msgconvert from there and it seems quite tricky to get this working.

Any tips or considerations on how to approach this problem?

mail format

Do you know what format Microsoft uses to store its mails in OST/PST files? HTML/EML/MESSAGE/etc.?

line endings on unix system

Hello,
msgconvert writes files in DOS format. It would be great if it could strip CR characters on Unix systems.
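
Until something like this exists in the tool, the output can be post-processed. Below is a minimal sketch of such a filter; the function name crlf_to_lf is hypothetical and this is not a msgconvert option. It assumes the converted message is text-safe, since a part using 8bit or binary transfer encoding could contain CRLF byte pairs that should not be altered.

```perl
use strict;
use warnings;

# Convert CRLF line endings to LF, leaving lone CR characters untouched.
sub crlf_to_lf {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

my $dos  = "Subject: test\r\n\r\nbody line\r\n";
my $unix = crlf_to_lf($dos);
print $unix;
```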

msgconvert: keep HTML variants of the email (skips multipart/mixed properties)

Forwarding https://bugs.debian.org/801189

Version: 0.918-1
File: /usr/bin/msgconvert

I attempted to convert a mail containing plain text and HTML variants
but msgconvert only kept the plain text variant, discarding the HTML
variant. It would be nice if it could keep both of them.

pabs@chianamo ~ $ msgconvert --verbose path/to/outlook.msg 
Skipping DIR entry __nameid_version1 0 (Introductory stuff)
...
Skipping property 001F:8004 (UNKNOWN): multipart/mixed; boundary="_009_3C5F9D52E ...
...
Using    property 001F:1000 (BODY_PLAIN): ...
...

Failed tests on macOS

macOS Version: 11.0 Beta 20A4299v
perl version: v5.28.2
cpan version: /usr/bin/cpan script version 1.67, CPAN.pm version 2.20

Installed with cpan -i Email::Outlook::Message

Command failure output is as follows.

Loading internal logger. Log::Log4perl recommended for better logging
Reading '/Users/nep/.cpan/Metadata'
  Database was generated on Sat, 18 Jul 2020 20:17:03 GMT
Running install for module 'Email::Outlook::Message'
Checksum for /Users/nep/.cpan/sources/authors/id/M/MV/MVZ/Email-Outlook-Message-0.919.tar.gz ok
'YAML' not installed, will not store persistent state
Configuring M/MV/MVZ/Email-Outlook-Message-0.919.tar.gz with Build.PL
Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'Email-Outlook-Message' version '0.919'
  MVZ/Email-Outlook-Message-0.919.tar.gz
  /usr/bin/perl Build.PL -- OK
Running Build for M/MV/MVZ/Email-Outlook-Message-0.919.tar.gz
Building Email-Outlook-Message
  MVZ/Email-Outlook-Message-0.919.tar.gz
  ./Build -- OK
Running Build test
t/basics.t ............... ok
t/gpg_signed.t ........... ok
t/internals.t ............ 1/17
#   Failed test at t/internals.t line 66.
#          got: 'text/plain; charset=UTF-8'
#     expected: 'text/plain; charset="UTF-8"'
# Looks like you failed 1 test of 17.
t/internals.t ............ Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/17 subtests
t/plain_jpeg_attached.t .. 1/23
#   Failed test 'Testing content disposition'
#   at t/plain_jpeg_attached.t line 38.
#          got: 'attachment; filename=test.jpg'
#     expected: 'attachment; filename="test.jpg"'
# Looks like you failed 1 test of 23.
t/plain_jpeg_attached.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/23 subtests
t/plain_uc_unsent.t ...... ok
t/plain_uc_wc_unsent.t ... ok
t/plain_unsent.t ......... ok
t/pod_coverage.t ......... skipped: Test::Pod::Coverage required for testing pod coverage

Test Summary Report
-------------------
t/internals.t          (Wstat: 256 Tests: 17 Failed: 1)
  Failed test:  9
  Non-zero exit status: 1
t/plain_jpeg_attached.t (Wstat: 256 Tests: 23 Failed: 1)
  Failed test:  21
  Non-zero exit status: 1
Files=8, Tests=87,  1 wallclock secs ( 0.03 usr  0.02 sys +  0.76 cusr  0.25 csys =  1.06 CPU)
Result: FAIL
Failed 2/8 test programs. 2/87 subtests failed.
  MVZ/Email-Outlook-Message-0.919.tar.gz
  ./Build test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
  reports MVZ/Email-Outlook-Message-0.919.tar.gz

Charset name UTF8 does not conform to IANA

I tested msgconvert recently and got a quite well formatted .eml file.
But I noticed this multipart header in the resulting .eml:

--16153750240.8B52dA.31571
Content-Type: text/plain; charset="UTF8"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

Another (proprietary) program that displays the mail complains about the charset name UTF8. In the IANA charset list (http://www.iana.org/assignments/character-sets/character-sets.xhtml), I can see "UTF-8" and the alias "csUTF8", but no "UTF8".

Would it be possible to change the produced charset to "UTF-8" instead of "UTF8"?
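
One way to approach this, sketched here under assumptions, is to normalise the charset name just before it is written into the Content-Type header. The alias table and the function normalize_charset are hypothetical and do not come from the module.

```perl
use strict;
use warnings;

# Hypothetical alias table mapping common spellings to IANA preferred names.
my %iana_name = (
    'utf8'   => 'UTF-8',
    'utf-8'  => 'UTF-8',
    'csutf8' => 'UTF-8',
    'latin1' => 'ISO-8859-1',
);

# Return the preferred name if the alias is known; otherwise pass the
# name through unchanged.
sub normalize_charset {
    my ($name) = @_;
    return $iana_name{ lc $name } // $name;
}

my $header = sprintf 'Content-Type: text/plain; charset="%s"',
    normalize_charset('UTF8');
print "$header\n";
```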

Can't open files with leading space characters

I was processing an Office 365 Individual Message export and it created some msg files with leading spaces in the filename. (Since the filenames are based on the message Subject, this is understandable.)

I modified line 114 of my local copy of Message.pm so that it uses the filehandle approach given in the second example of the synopsis at http://search.cpan.org/~jmcnamara/OLE-Storage_Lite-0.19/lib/OLE/Storage_Lite.pm:


--- Message.pm.orig	2015-07-04 19:20:08.000000000 -0400
+++ Message.pm	2018-01-04 14:57:08.352770214 -0500
@@ -111,7 +111,13 @@
 
   $self->{EMBEDDED} = 0;
 
-  my $msg = OLE::Storage_Lite->new($file);
+  use IO::File;
+  my $oIo = new IO::File;
+  $oIo->open("$file", "r" ,0666);
+  binmode($oIo);
+  my $msg = OLE::Storage_Lite->new($oIo);
+
+  #my $msg = OLE::Storage_Lite->new($file);
   my $pps = $msg->getPpsTree(1);
   $pps or croak "Parsing $file as OLE file failed";
   $self->_set_verbosity($verbose);

This solved the problem for me (or at least I no longer got the croak error message).
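
The same idea can be written with a lexical filehandle and three-argument open, which takes the file name literally, including any leading or trailing whitespace. This is a minimal sketch using core modules only; the temporary file and its name are made up for the demonstration, and the handle would then be passed to the parser as in the patch above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Create a placeholder file whose name starts with a space.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, ' leading space.msg');

open my $out, '>', $path or die "Cannot create '$path': $!";
binmode $out;
print {$out} "placeholder bytes";
close $out;

# Three-argument open does not interpret or trim the file name.
open my $in, '<', $path or die "Cannot open '$path': $!";
binmode $in;
my $content = do { local $/; <$in> };
close $in;

print "read ", length($content), " bytes\n";
```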

Keep knowledge of encoding for text properties around

Working on #14 brought to light that reading text properties results in a different type of result depending on whether the property type is PT_STRING8 or PT_UNICODE. In the latter case, the result is a Perl string (a sequence of Unicode code points), while in the former case, it is a sequence of bytes.

After reading the property, the knowledge of how it was encoded is discarded, so subsequent code needs to either guess at the data type of the property value, or just ignore it.

It would be better to keep the full knowledge of underlying data and type until the property is used. How it is to be decoded in the case of PT_STRING8 depends both on which property it is, and on the value of the PidTagInternetCodepage and PidTagMessageCodepage properties.
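
A rough sketch of what "keeping the knowledge around" could look like: store the raw bytes together with the property type and decode only at the point of use. The helper names make_text_property and property_text are hypothetical, and the fallback encoding parameter stands in for whatever the codepage properties of the message would indicate; none of this is the module's actual design.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# MAPI property type codes for the two text types.
use constant PT_STRING8 => 0x001E;
use constant PT_UNICODE => 0x001F;

# Hypothetical container: keep the raw bytes and the type together.
sub make_text_property {
    my ($type, $raw_bytes) = @_;
    return { type => $type, raw => $raw_bytes };
}

# Decode lazily. PT_UNICODE data is UTF-16LE; PT_STRING8 data needs an
# encoding supplied by the caller.
sub property_text {
    my ($prop, $fallback_encoding) = @_;
    my $bytes = $prop->{raw};    # copy, so the stored bytes stay untouched
    return decode('UTF-16LE', $bytes) if $prop->{type} == PT_UNICODE;
    return decode($fallback_encoding, $bytes);
}

my $unicode = make_text_property(PT_UNICODE, encode('UTF-16LE', "caf\x{E9}"));
my $string8 = make_text_property(PT_STRING8, "caf\xE9");

# Both properties yield the same Perl string once decoded.
print property_text($unicode, 'cp1252') eq property_text($string8, 'cp1252')
    ? "same text\n"
    : "different text\n";
```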

Run as a container

It is useful to run it in Docker because it can then be used regardless of the host OS, middleware, or language runtime.

If you'd like, you can merge the branch.

Installation instructions for Windows

I'm developing some Python CLI scripts which parse Office 365 emails for deployment on Windows 10 machines using Anaconda/Miniconda.

Your site gives install instructions for Linux, but not Windows.

What am I missing? Can you orient me to how this package might relate to development on Windows?

A converted message has a broken encoding in the message body (cp1251 in the .msg)

Input: a .msg file saved by Outlook in cp1251.
Output:
- headers: good
- attachments: good
- message body: encoding is broken

The converted file itself seems to be in UTF-8.

The body looks something like this:
--16855398770.0877C.31779
Content-Type: text/plain; charset="UTF-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Óâàæàåìûé ...!

Îçíàêîìüòåñü ñ ... ... ...

It should look like this (from the same message re-exported as Unicode):
--16855410730.aC1d0Ed.5308
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

Уважаемый ...!

Ознакомьтесь с ... ... ...

According to this decoder (https://2cyr.com/decode/?lang=en), the source encoding is WINDOWS-1251 but the text is displayed as WINDOWS-1252.
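
The symptom can be reproduced in isolation with Perl's core Encode module. The sketch below only demonstrates the mismatch between the two codepages; it makes no claim about where in the module the wrong decoding happens.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode decode);

binmode STDOUT, ':encoding(UTF-8)';

# The first word of the expected body, as cp1251 bytes.
my $original     = 'Уважаемый';
my $cp1251_bytes = encode('cp1251', $original);

# Decoding the same bytes with the wrong and the right codepage.
my $wrong = decode('cp1252', $cp1251_bytes);
my $right = decode('cp1251', $cp1251_bytes);

print "decoded as cp1252: $wrong\n";
print "decoded as cp1251: $right\n";
```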

Email::Outlook::Message version: 0.921

Wide character in print at /usr/bin/vendor_perl/msgconvert line 397, error with latest update (0.920)

Thank you for providing this software.

I have been unable to convert a specific email which I can provide if absolutely necessary (company email).

After some experimenting, I found that it works in an Ubuntu WSL environment (version 0.919), so I tried installing the old version (0.919) on Manjaro, which successfully converts the mail. There the last output is "Wide character in print at /usr/bin/vendor_perl/msgconvert line 58.", but the email is still converted and at first sight includes everything important.

Environments where it doesn't work with 0.920:
Linux 5.9.16-1-MANJARO x86_64 GNU/Linux, AUR package "perl-email-outlook-message"
An Alpine-based image on the same machine, with the following packages installed:

apk add --no-cache --virtual .build-deps \
        perl-utils \
        perl-module-build \
        perl-app-cpanminus
    apk add perl-email-address-xs perl-doc perl-params-util
    cpanm Email::Outlook::Message

In the Alpine container, the error message is exactly the same. After downgrading to version 0.919 it also works, but with a slightly different final output: "Wide character in print at /usr/bin/msgconvert line 62."

How to display the application/rtf part in Thunderbird?

When I compare the generated .eml file with what Outlook sent to the Internet, I see that the .eml file contains the unmodified application/rtf part from the original .msg file. After importing the message into Thunderbird, only the plain text part is shown.

However, if I import the .msg file to Outlook and forward it to the Internet the received mail contains a text/html part and Thunderbird can show the message in its full glory.

I tried to extract the rtf part and convert it with RTF::HTML::Converter, but all the styles are lost. Do you know a better way?
