Giter Site home page Giter Site logo

depayfi / mac-mails-parse-export-csv-json Goto Github PK

View Code? Open in Web Editor NEW
3.0 3.0 2.0 580 KB

Parse & extract Mail app (Mac OS) emails & save to structured CSV or JSON

Python 100.00%
apple-mail mail mail-extractor mail-parser macos-mail mail-to-csv mail-to-json

mac-mails-parse-export-csv-json's Introduction

Mac OS Mail - Email Extractor

Mail App Emails to CSV or JSON

What it (currently) does:

Email Export from Apple's Mail App on Mac OS -> Plaintext input -> Parse/Extract -> Save to CSV or JSON

Sometimes it can be handy to get structured data from a mailbox on Mac OS (Mail). This is a quick (still) hacky script which parses a plaintext export email export from the Mail app in Mac OS and returns a structured CSV.

Features

  • TXT mode: Parse manually exported emails from a txt file.
  • EMXL mode: Parse emlx files created by the Mail app
  • Export formats & versions per script execution
    • Export versions
      • "plain"
      • "exploded": email addresses are split in case of multiple email recipients, each row contains only one receiving address (all other fields are duplicated)
    • Formats (for each version)
      • CSV
      • JSON
  • Custom Columns based on Regex-extraction:
    • Custom regex extraction schemes can be defined in config.py:

Input (text file mode)

  1. Select the emails you want to parse in Mac OS Mail (hold shift in order to select multiple, CMD + A in order to select all from the current mailbox)
  2. File > Save As > Select Format: Plain Text > Save to the script directory & rename to "input.txt"

Input (emlx parser mode)

Edit the config.py file with your Mac OS filepath to your Mailbox that you want to extract email data from

Run

Tested with Python 3.10.6

pip install -r requirements.txt
  1. edit config.py
  2. run either in txt file mode:
rename your txt file input.txt, place in script dir, then execute:
python3 extract_txt.py

or

  1. run in emlx parser mode:
don't forget to edit config.py first, then execute:
python3 extract_txt.py

if you want to use a CLI with args (implementation in progress)

python3 mac-os-extract-mails.py

[Options]
-e : run emlx parser
-t : run txt file parser
-c : open config file for editing
-sinput: select input file for txt file parser
-sodir: select folder for outputs

Outputs

Output Columns


- "date_utc"
- "date_iso"
- "date"
- "from"
- "from_name"
- "from_mail"
- "subject"
- "to"
- "to_name"
- "to_mail"
- "from"
- "reply_to"
- "reply_to_name"
- "reply_to_mail"
- "body"
- "x-universally-unique-identifier"
- "message-id"
- "mime-version"
- "content-type"

Todo

  • Plaintext mode
  • Emlx mode
  • Explode outputs: Split multiple recipient email addresses into separate data rows (1 email = 1..* rows). Currently: 1 row = 1 email, recipient email addresses are comma-separated
  • Custom columns from RegEx outputs
  • Export json
  • Command line tool
  • EMXL mode: Add caching
  • EMXL mode: Add mailbox folder picker (tkinter?)
  • TXT mode: Add file picker for input file (+cli path option)
  • Option: Create Google Sheet from Outputs
  • Option: Setting utc min-date
  • Option: Ignore Signatures
  • Option: Traverse all mbox'es if a mail box for mail account folder is undefined

mac-mails-parse-export-csv-json's People

Contributors

lxpzurich avatar spape avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Forkers

shawnhank

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.