Giter Site home page Giter Site logo

negachov / p5-sisimai Goto Github PK

View Code? Open in Web Editor NEW

This project forked from sisimai/p5-sisimai

0.0 1.0 0.0 3.83 MB

Mail Analyzing Interface: A Perl module to parse RFC5322 bounce emails and generating structured data as JSON from parsed results. Formerly known as bounceHammer 4: an error mail analyzer.

Home Page: http://libsisimai.org

License: Other

Perl 89.38% Makefile 1.19% Perl 6 9.43%

p5-sisimai's Introduction

License Coverage Status Build Status Perl CPAN

What is Sisimai ? | シシマイ?

Sisimai is a Perl module for analyzing RFC5322 bounce emails and generating structured data from parsed results. Sisimai is the system formerly known as bounceHammer 4. "Sisimai" is a coined word: Sisi (the number 4 is pronounced "Si" in Japanese) and MAI (acronym of "Mail Analyzing Interface").

Sisimai(シシマイ)はRFC5322準拠のエラーメールを解析し、解析結果をデータ構造に 変換するインターフェイスを提供するPerlモジュールです。 "シシマイ"はbounceHammer version 4として開発していたものであり、Version 4なので "シ"から始まりマイ(MAI: Mail Analyzing Interface)を含む名前になりました。

Key Features | 主な特徴的機能

  • Convert Bounce Mails to Structured Data | エラーメールをデータ構造に変換
    • Supported formats are Perl and JSON | Perlのデータ形式とJSONに対応
  • Easy to Install, Use. | インストールも使用も簡単
    • cpanm
    • git clone & make
  • High Precision of Analysis | 高い解析精度
    • 2 times higher than bounceHammer | 解析精度はbounceHammerの二倍
    • Support 21 known MTAs and 5 unknown MTAs | 26種類のMTAに対応
    • Support 19 major MSPs(Mail Service Providers) | 19種類の著名なMSPに対応
    • Support Feedback Loop Message(ARF) | Feedback Loopにも対応
    • Can detect 25 error reasons | 25種類のエラー理由を検出
  • Faster than bounceHammer version 2.7.X | bounceHammer 2.7.Xよりも高速に解析
    • About 1.7 times faster | 1.7倍程高速

Setting Up Sisimai | シシマイを使う準備

System requirements | 動作環境

More details about system requirements are available at available at Sisimai — How to Install and Start page.

Install | インストール

From CPAN

% sudo cpanm Sisimai
--> Working on Sisimai
Fetching http://www.cpan.org/authors/id/A/AK/AKXLIX/Sisimai-4.14.0.tar.gz ... OK
...
1 distribution installed
% perldoc -l Sisimai
/usr/local/lib/perl5/site_perl/5.20.0/Sisimai.pm

From GitHub

% cd /usr/local/src
% git clone https://github.com/sisimai/p5-Sisimai.git
% cd ./p5-Sisimai
% sudo make install-from-local
--> Working on .
Configuring Sisimai-4.14.0 ... OK
1 distribution installed

Usage | 使い方

Basic usage | 基本的な使い方

make() method provides feature for getting parsed data from bounced email messages like following.

#! /usr/bin/env perl
use Sisimai;
my $v = Sisimai->make('/path/to/mbox'); # or path to Maildir/

if( defined $v ) {
    for my $e ( @$v ) {
        print ref $e;                   # Sisimai::Data
        print ref $e->recipient;        # Sisimai::Address
        print ref $e->timestamp;        # Sisimai::Time

        print $e->addresser->address;   # [email protected] # From
        print $e->recipient->address;   # [email protected]   # To
        print $e->recipient->host;      # example.jp
        print $e->deliverystatus;       # 5.1.1
        print $e->replycode;            # 550
        print $e->reason;               # userunknown

        my $h = $e->damn();             # Convert to HASH reference
        my $j = $e->dump('json');       # Convert to JSON string
        print $e->dump('json');         # JSON formatted bounce data
    }
}

# Get JSON string from parsed mailbox or Maildir/
my $j = Sisimai->dump('/path/to/mbox'); # or path to Maildir/
                                        # dump() is added in v4.1.27
print $j;                               # parsed data as JSON
[{"recipient": "[email protected]", "addresser": "[email protected]", "feedbacktype": "", "action": "failed", "subject": "Nyaaaaan", "smtpcommand": "DATA", "diagnosticcode": "550 Unknown user [email protected]", "listid": "", "destination": "example.jp", "smtpagent": "Courier", "lhost": "1jo.example.org", "deliverystatus": "5.0.0", "timestamp": 1291954879, "messageid": "[email protected]", "diagnostictype": "SMTP", "timezoneoffset": "+0900", "reason": "filtered", "token": "ce999a4c869e3f5e4d8a77b2e310b23960fb32ab", "alias": "", "senderdomain": "1jo.example.org", "rhost": "mfsmax.example.jp"}, {"diagnostictype": "SMTP", "timezoneoffset": "+0900", "reason": "userunknown", "timestamp": 1381900535, "messageid": "[email protected]", "token": "9fe754876e9133aae5d20f0fd8dd7f05b4e9d9f0", "alias": "", "senderdomain": "example.org", "rhost": "mx.bouncehammer.jp", "action": "failed", "addresser": "[email protected]", "recipient": "[email protected]", "feedbacktype": "", "smtpcommand": "DATA", "subject": "バウンスメールのテスト(日本語)", "destination": "bouncehammer.jp", "listid": "", "diagnosticcode": "550 5.1.1 <[email protected]>... User Unknown", "deliverystatus": "5.1.1", "lhost": "p0000-ipbfpfx00kyoto.kyoto.example.co.jp", "smtpagent": "Sendmail"}]

上記のようにSisimaiのmake()メソッドをmboxかMaildirのPATHを引数にして実行すると 解析結果が配列リファレンスで返ってきます。

One-Liner | ワンライナーで

Beginning with Sisimai 4.1.27, dump() method is available and you can get parsed data as JSON using the method.

% perl -MSisimai -lE 'print Sisimai->dump(shift)' /path/to/mbox

Sisimai 4.1.27から登場したdump()メソッドを使うとワンライナーでJSON化した解析結果 が得られます。

Sisimai Specification | シシマイの仕様

Differences between ver.2 and Sisimai | 新旧の違い

The following table show the differences between ver.2 (bounceHammer 2.7.13p1) and Sisimai.

Features bounceHammer Sisimai
System requirements(Perl) 5.10 - 5.14 5.10 - 5.22
Command line tools Available N/A
Modules for Commercial MTAs and MPSs N/A Included
WebUI/API Included N/A
Database schema for storing parsed bounce data Available N/A[1]
Analytical precision ratio(2000 emails)[2] 0.52 1.00
The speed of parsing email(2000 emails) 7.07s 4.10s
Parse 2 or more bounces in a single email Only 1st rcpt ALL
Parse FeedBack Loop Message/ARF format mail N/A OK
Classification based on recipient domain Available N/A
Output format of parsed data YAML,JSON,CSV JSON only
Easy to install No Yes
Install using cpan or cpanm command N/A OK
Dependencies (Except core modules of Perl) 24 modules 2 modules
LOC:Source lines of code 18200 lines 8900 lines
The number of tests in t/, xt/ directory 27365 tests 158000 tests
License GPLv2 or Perl 2 clause BSD
Support Contract provided by Developer Available Contact Us
  1. Implement yourself with using DBI or any O/R Mapper you like
  2. See ./ANALYTICAL-PRECISION

公開中のbounceHammer version 2.7.13p1とSisimai(シシマイ)は下記のような違いが あります。

機能 bounceHammer Sisimai
動作環境(Perl) 5.10 - 5.14 5.10 - 5.22
コマンドラインツール あり 無し
商用MTAとMSP対応解析モジュール 無し あり(同梱)
WebUIとAPI あり 無し
解析済バウンスデータを保存するDBスキーマ あり 無し[1]
解析精度の割合(2000通のメール)[2] 0.52 1.00
メール解析速度(2000通のメール) 7.07秒 4.10秒
2件以上のバウンスがあるメールの解析 1件目だけ 全件解析可能
FeedBack Loop/ARF形式のメール解析 非対応 対応済
宛先ドメインによる分類項目 あり 無し
解析結果の出力形式 YAML,JSON,CSV JSON
インストール作業が簡単かどうか やや面倒 簡単で楽
cpanまたはcpanmコマンドでのインストール 非対応 対応済
依存モジュール数(Perlのコアモジュールを除く) 24モジュール 2モジュール
LOC:ソースコードの行数 18200行 8900行
テスト件数(t/,xt/ディレクトリ) 27365件 158000件
ライセンス GPLv2かPerl 二条項BSD
開発会社によるサポート契約 提供中 問合せ
  1. DBIまたは好きなORMを使って自由に実装してください
  2. ./ANALYTICAL-PRECISIONを参照

MTA/MSP Modules | MTA/MSPモジュール一覧

The following table is the list of MTA/MSP:(Mail Service Provider) modules.

Module Name(Sisimai::) Description
MTA::Activehunter TransWARE Active!hunter
MTA::ApacheJames Java Apache Mail Enterprise Server(> v4.1.26)
MTA::Courier Courier MTA
MTA::Domino IBM Domino Server
MTA::Exchange Microsoft Exchange Server
MTA::Exim Exim
MTA::IMailServer IPSWITCH IMail Server
MTA::InterScanMSS Trend Micro InterScan Messaging Security Suite
MTA::MXLogic McAfee SaaS
MTA::MailFoundry MailFoundry
MTA::MailMarshalSMTP Trustwave Secure Email Gateway
MTA::McAfee McAfee Email Appliance
MTA::MessagingServer Oracle Communications Messaging Server
MTA::mFILTER Digital Arts m-FILTER
MTA::Notes Lotus Notes
MTA::OpenSMTPD OpenSMTPD
MTA::Postfix Postfix
MTA::qmail qmail
MTA::Sendmail V8Sendmail: /usr/sbin/sendmail
MTA::SurfControl WebSense SurfControl
MTA::V5sendmail Sendmail version 5
MTA::X1 Unknown MTA #1
MTA::X2 Unknown MTA #2
MTA::X3 Unknown MTA #3
MTA::X4 Unknown MTA #4 qmail clones(> v4.1.23)
MTA::X5 Unknown MTA #5 (> v4.13.0 )
MSP::DE::EinsUndEins 1&1: http://www.1and1.de
MSP::DE::GMX GMX: http://www.gmx.net
MSP::JP::Biglobe BIGLOBE: http://www.biglobe.ne.jp
MSP::JP::EZweb au EZweb: http://www.au.kddi.com/mobile/
MSP::JP::KDDI au by KDDI: http://www.au.kddi.com
MSP::RU::MailRu @mail.ru: https://mail.ru
MSP::RU::Yandex Yandex.Mail: http://www.yandex.ru
MSP::UK::MessageLabs Symantec.cloud http://www.messagelabs.com
MSP::US::AmazonSES AmazonSES(Sending): http://aws.amazon.com/ses/
MSP::US::Aol Aol Mail: http://www.aol.com
MSP::US::Bigfoot Bigfoot: http://www.bigfoot.com
MSP::US::Facebook Facebook: https://www.facebook.com
MSP::US::Google Google Gmail: https://mail.google.com
MSP::US::Outlook Microsoft Outlook.com: https://www.outlook.com/
MSP::US::ReceivingSES AmazonSES(Receiving): http://aws.amazon.com/ses/
MSP::US::SendGrid SendGrid: http://sendgrid.com/
MSP::US::Verizon Verizon Wireless: http://www.verizonwireless.com
MSP::US::Yahoo Yahoo! MAIL: https://www.yahoo.com
MSP::US::Zoho Zoho Mail: https://www.zoho.com
ARF Abuse Feedback Reporting Format
RFC3464 Fallback Module for MTAs
RFC3834 Detector for auto replied message (> v4.1.28)

上記はSisimaiに含まれてるMTA/MSP(メールサービスプロバイダ)モジュールの一覧です。

Bounce Reason List | バウンス理由の一覧

Sisimai can detect the following 25 bounce reasons. More details about reasons are available at Sisimai — Bounce Reason List Sisimai detects page.

Reason(理由) Description 理由の説明
Blocked Blocked due to client IP address IPアドレスによる拒否
ContentError Invalid format email 不正な形式のメール
ExceedLimit Message size exceeded the limit(5.2.3) メールサイズの超過
Expired Delivery time expired 配送時間切れ
Feedback Bounced for a complaint of the message 元メールへの苦情によるバウンス
Filtered Rejected after DATA command DATAコマンド以降で拒否された
HasMoved Destination mail addrees has moved 宛先メールアドレスは移動した
HostUnknown Unknown destination host name 宛先ホスト名が存在しない
MailboxFull Recipient's mailbox is full メールボックスが一杯
MailerError Mailer program error メールプログラムのエラー
MesgTooBig Message size is too big(5.3.4) メールが大き過ぎる
NetworkError Network error: DNS or routing DNS等ネットワーク関係のエラー
SpamDetected Detected a message as spam メールはスパムとして判定された
NotAccept Destinaion does not accept any message 宛先ホストはメールを受けとらない
OnHold Deciding the bounce reason is on hold エラー理由の特定は保留
Rejected Rejected due to envelope from address エンベロープFromで拒否された
NoRelaying Relay access denied リレーの拒否
SecurityError Virus detected or authentication error ウィルスの検出または認証失敗
Suspend Recipient's account is suspended 宛先アカウントは一時的に停止中
SystemError Some error on the destination host 宛先サーバでのOSレベルのエラー
SystemFull Disk full on the destination host 宛先サーバのディスクが一杯
TooManyConn Connection rate limit exceeded 接続制限数を超過した
UserUnknown Recipient's address does not exist 宛先メールアドレスは存在しない
Vacation Auto replied message (> v4.1.28 ) 自動応答メッセージ(> v4.1.28 )
Undefined Could not decide the error reason バウンスした理由は特定出来ず

Sisimaiは上記のエラー25種を検出します。

Parsed data structure | 解析後のデータ構造

The following table shows a data structure(Sisimai::Data) of parsed bounce mail. More details about data structure are available at available at Sisimai — Data Structure of Sisimai::Data Object page.

Name Description 値の説明
action The value of Action: header Action:ヘッダの値
addresser The From address 送信者のアドレス
alias Alias of the recipient 受信者アドレスのエイリアス
destination The domain part of the "recipinet" "recipient"のドメイン部分
deliverystatus Delivery Status(DSN) 配信状態(DSN)の値
diagnosticcode Error message エラーメッセージ
diagnostictype Error message type エラーメッセージの種別
feedbacktype Feedback Type Feedback-Typeのフィールド
lhost local host name(local MTA) 送信側MTAのホスト名
listid List-Id: header of each ML List-Idヘッダの値
messageid Message-Id: of the original message 元メールのMessage-Id
reason Detected bounce reason 検出したバウンスした理由
recipient Recipient address which bounced バウンスした受信者のアドレス
replycode SMTP Reply Code SMTP応答コード
rhost Remote host name(remote MTA) 受信側MTAのホスト名
senderdomain The domain part of the "addresser" "addresser"のドメイン部分
softbounce The bounce is soft bounce or not ソフトバウンスであるかどうか
smtpagent MTA name(Sisimai::MTA::, MSP::) MTA名(Sisimai::MTA::,MSP::)
smtpcommand The last SMTP command in the session セッション中最後のSMTPコマンド
subject Subject of the original message(UTF8) 元メールのSubject(UTF-8)
timestamp Date: header in the original message 元メールのDate
timezoneoffset Time zone offset(seconds) タイムゾーンの時差
token MD5 value of addresser and recipient 送信者と受信者のハッシュ値

上記の表は解析後のバウンスメールの構造(Sisimai::Data)です。

Emails could not be parsed | 解析出来ないメール

Bounce mails which could not be parsed by Sisimai are saved in the directory set-of-emails/to-be-debugged-because/sisimai-cannot-parse-yet. If you find any bounce email cannot be parsed using Sisimai, please add the email into the directory and send Pull-Request to this repository.

解析出来ない__バウンスメール__はset-of-emails/to-be-debugged-because/sisimai-cannot-parse-yet ディレクトリにはいっています。もしもSisimaiで解析出来ないメールを見つけたら、 このディレクトリに追加してPull-Requestを送ってください。

Other Information | その他の情報

Related Sites | 関連サイト

SEE ALSO | 参考サイト

AUTHOR | 作者

@azumakuniyuki

COPYRIGHT | 著作権

Copyright (C) 2014-2016 azumakuniyuki, All Rights Reserved.

LICENSE | ライセンス

This software is distributed under The BSD 2-Clause License.

p5-sisimai's People

Contributors

0xcdcdcdcd avatar azumakuniyuki avatar jcbf avatar jleroy avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.