Comments (34)
OK grabbed latest beta:
Traceback (most recent call last):
File "dupReport.py", line 213, in
globs.inServer.processMessage(nxtMsg)
File "C:\dupReport-2.1.0_Beta\dremail.py", line 307, in processMessage
decode1 = email.header.decode_header(msg['Message-Id'])[0][0]
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\email\header.py", line 80, in decode_header
if not ecre.search(header):
TypeError: expected string or bytes-like object
Attached is pruned log, last entry (same email, it looks like.)
dupReport.log
from dupreport.
OK, as I suspected, for some reason dR is not getting a message ID from the email, so everything else is crashing behind it. Highly unusual. The easy fix is to add something like
if (no header)
skip message
However, before I do that, would it be possible to post the entire email that's causing the error, including the headers? I want to see if there is a message format I need to code for. Also, did you say this worked on other mail servers or was that the original error you were talking about?
from dupreport.
I did also try it on a cPanel-based mail server and it worked without any issues (although it's an empty mailbox right now.)
I will look for that specific email later tonight and pm you privately the contents.
from dupreport.
OK, thanks. I've coded up a potential fix. I can post that now and see if it solves the problem, or wait for your email and do some additional research to see if more changes are needed. Your choice.
from dupreport.
Uploaded a possible fix to the Beta branch. Please have a look and see if it addresses your issue.
from dupreport.
Not sure if this is related to above but I was able to reproduce the error above by having an email message with a blank subject.
This was done with the most current code.
File "./dupReport.py", line 214, in <module>
globs.inServer.processMessage(nxtMsg)
File "/root/dupReport-2.1.0_Beta/dremail.py", line 337, in processMessage
decode = email.header.decode_header(msg['Subject'])[0]
File "/usr/lib/python3.5/email/header.py", line 80, in decode_header
if not ecre.search(header):
from dupreport.
@dcurrey , interesting, because a message with a blank subject should be simply rejected as "not a message of interest." Can you post the error message you're getting and the last 20 lines of the log file?
from dupreport.
Think I am on to something here looking at the log and the messageid it looks like dupreport is trying to add the blank subject message.
Database.searchForMessage(<20171228112123.Horde.p77A5h25wJZ9ESmi-G1jyFY@hordetest.mydomain.com>)
Database.execSqlStmt(): Executing SQL command.
Message ID [<20171228112123.Horde.p77A5h25wJZ9ESmi-G1jyFY@hordetest.mydomain.com>] does not exist. Adding to DB
As you stated it shouldn't
Here is the full message header notice something missing
Return-Path: <[email protected]>
Delivered-To: [email protected]
Received: from localhost (localhost.localdomain [127.0.0.1])
by beta.mydomain.com (Postfix) with ESMTP id 92ED84543EEB
for <[email protected]>; Thu, 28 Dec 2017 11:21:23 -0500 (EST)
X-Virus-Scanned: Debian amavisd-new at beta.mydomain.com
X-Spam-Flag: NO
X-Spam-Score: 6.956
X-Spam-Level: ******
X-Spam-Status: No, score=6.956 tagged_above=3 required=999
tests=[HELO_DYNAMIC_IPADDR=3.243, MISSING_SUBJECT=1.767,
RDNS_DYNAMIC=0.363, SPF_FAIL=0.919, TO_EQ_FM_DIRECT_MX=2.499,
TO_EQ_FM_DOM_SPF_FAIL=0.001, TO_EQ_FM_SPF_FAIL=0.001, TXREP=-1.837]
autolearn=no autolearn_force=no
Received: from beta.mydomain.com ([127.0.0.1])
by localhost (beta.mydomain.com [127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id oF2DDRqrgkR2 for <[email protected]>;
Thu, 28 Dec 2017 11:21:23 -0500 (EST)
Received: by beta.mydomain.com (Postfix, from userid 5002)
id 10BEF4543ED3; Thu, 28 Dec 2017 11:21:23 -0500 (EST)
Received: from mh-esr1-xxx-xxx-xxx-xxx.,myisp.net (mh-esr1-xxx-xxx-xxx-xxx.myisp.net
[xxx.xxx.xxx.xxx]) by hordetest.mydomain.com (Horde Framework) with HTTPS;
Thu, 28 Dec 2017 11:21:23 -0500
Date: Thu, 28 Dec 2017 11:21:23 -0500
Message-ID: <20171228112123.Horde.p77A5h25wJZ9ESmi-G1jyFY@hordetest.mydomain.com>
From: [email protected]
To: [email protected]
User-Agent: Horde Application Framework 5
Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes
MIME-Version: 1.0
Content-Disposition: inline
some text
from dupreport.
OK updated.. it runs for a long time (log is too large to read in n++), but it ends with this error:
Traceback (most recent call last):
File "dupReport.py", line 214, in
globs.inServer.processMessage(nxtMsg)
File "C:\dupReport-2.1.0_Beta\dremail.py", line 443, in processMessage
globs.log.write(3, 'endTimeStamp=[{}] beginTimeStamp=[{}]'.format(drdatetime.fromTimestamp(dateParts['endTimestamp']), drdatetime.fromTimestamp(dateParts['beginTimestamp'])))
File "C:\dupReport-2.1.0_Beta\drdatetime.py", line 134, in fromTimestamp
dt = datetime.datetime.fromtimestamp(float(ts))
TypeError: float() argument must be a string or a number, not 'NoneType'
My .rc file is unchanged from default other than the server access/credentials:
[main]
version = 2.1.0
dbpath = C:\dupReport-2.1.0_Beta
logpath = C:\dupReport-2.1.0_Beta
verbose = 1
logappend = false
subjectregex = ^Duplicati Backup report for
srcregex = \w*
destregex = \w*
srcdestdelimiter = -
dateformat = MM/DD/YYYY
timeformat = HH:MM:SS
warnoncollect = false
applyutcoffset = false
show24hourtime = true
purgedb = false
Another question - I tried running it again without verbose logging so I could read the .log, and I noticed that more often than not, it says "Message ID [[email protected]] does not exist. Adding to DB"
However, since I've already run it once before, should the majority (if not all) of these existing messages now be in the DB, even if it crashes out?
from dupreport.
Quick and dirty fix for no subject problem
Line 336 dremail.py
# get Subject
if not msg['Subject']:
msg['Subject'] = '[No Subject]'
decode = email.header.decode_header(msg['Subject'])[0]
from dupreport.
@ninja6o4 , the program doesn't commit the database until all the emails are parsed, so if it crashed in the middle it won't register those emails in the db. They will be scanned again on the next run.
@dcurrey , I'm thinking an easier fix is to just skip the message entirely if no subject, since it won't be from a Duplicati backup session. dR won't be able to do anything with it anyway. Continuing to process, even with a bogus subject will just lead to problems down the line. Thoughts?
from dupreport.
Hence the quick and dirty fix. :)
It does look like the check is just a few lines down in the dremail.py
So just do a return at the if statement instead of bogus subject
if not msg['Subject']:
globs.log.write(1, 'Message [{}] is not a Message of Interest, No Subject.'.format(msgParts['messageId']))
return None, None # Not a message of Interest
But I am good on whatever works.
from dupreport.
On the road today. Will take a look at it tonight an work up a permanent fix. Will also look at the ts parsing problem, which I suspect is related.
from dupreport.
Just posted potential fix code to the Beta branch. I basically punted the problem by checking for essential message elements and skipping the message if any of them are missing:
if msg['Message-Id'] is None or msg['Message-Id'] == '':
globs.log.write(3,'No message-Id. Abandoning processMessage()')
return None, None
if msg['Subject'] is None or msg['Subject'] == '':
globs.log.write(3,'No Subject. Abandoning processMessage()')
return None, None
if msg['Date'] is None or msg['Date'] == '':
globs.log.write(3,'No Date. Abandoning processMessage()')
return None, None
I also removed the debug lines that logged multiple copies of the email. That'll cut down on the log size a bit.
@ninja6o4 , I'll revise my earlier explanation for the multiple "does not exist in DB" messages. It's been a while since I've been in this section of the code. "Upon further review" it looks like dR only stores the data for emails that it successfully parses as Duplicati result messages. If it can't match an acceptable subject line (as defined in [main] subjectregex= in the .rc file) it just outputs an error message ("... not a message of interest...") and skips to the next message. As a result, if you have a lot of non-Duplicati messages in your inbox they will be reviewed every time the program runs. This was a time/space trade-off. I sacrificed some processing time to prevent storing lots of irrelevant message data in the database.
@mr-flibble, you may want to download and test this fix. You seem to have emails that like to break dR (:-)). Since this fix altered the way email headers are processed and matched it would be good to have you run through it as well.
Let me know how it goes.
from dupreport.
My current log folder is ~4000 messages, so it does take a few minutes to run, but probably not a big deal once i get this working and automated. My main concern is that it looks like it is literally downloading the entire folder every time it runs, which is ~200MB each time. I don't know if it would overly burden the db to log all message IDs and a flag as to if it is relevant or not. Maybe an option for the user as to whether or not to save it?
In any event, still getting the odd date format error. Log shows:
, Failed: .*, 24, 1
options.getRcSectionDateTimeFmt(, )
drDateTime.toTimestamp(, MM/DD/YYYY, HH:MM:SS, -28800)
drDateTime.toTimestamp(, None, None, -28800)
drdatetime.fromTimestamp(None, MM/DD/YYYY, HH:MM:SS)
from dupreport.
@ninja6o4, can you post the 20 or so lines around the format error message so I can see what's going on there?
from dupreport.
, Errors: [.*^], 24, 1
EmailServer.searchMesagePart(RGVsZXRlZEZpbGVzOiAwDQpEZWxldGVkRm9sZGVyczogMA0KTW9kaWZpZWRGaWxlczogMA0KRXhh
bWluZWRGaWxlczogMzY4MQ0KT3BlbmVkRmlsZXM6IDANCkFkZGVkRmlsZXM6IDANClNpemVPZk1v
ZGlmaWVkRmlsZXM6IDANClNpemVPZkFkZGVkRmlsZXM6IDANClNpemVPZkV4YW1pbmVkRmlsZXM6
IDI4ODEwNTc1OTQNClNpemVPZk9wZW5lZEZpbGVzOiAwDQpOb3RQcm9jZXNzZWRGaWxlczogMA0K
QWRkZWRGb2xkZXJzOiAwDQpUb29MYXJnZUZpbGVzOiAwDQpGaWxlc1dpdGhFcnJvcjogMA0KTW9k
aWZpZWRGb2xkZXJzOiAwDQpNb2RpZmllZFN5bWxpbmtzOiAwDQpBZGRlZFN5bWxpbmtzOiAwDQpE
ZWxldGVkU3ltbGlua3M6IDANClBhcnRpYWxCYWNrdXA6IEZhbHNlDQpEcnlydW46IEZhbHNlDQpN
YWluT3BlcmF0aW9uOiBCYWNrdXANClBhcnNlZFJlc3VsdDogU3VjY2Vzcw0KVmVyYm9zZU91dHB1
dDogRmFsc2UNClZlcmJvc2VFcnJvcnM6IEZhbHNlDQpFbmRUaW1lOiAxMi8yNy8yMDE3IDEwOjAw
OjMyIEFNDQpCZWdpblRpbWU6IDEyLzI3LzIwMTcgMTA6MDA6MDAgQU0NCkR1cmF0aW9uOiAwMDow
MDozMi42MDg3NjAzDQpNZXNzYWdlczogWw0KICAgIE5vIHJlbW90ZSBmaWxlc2V0cyB3ZXJlIGRl
bGV0ZWQsDQogICAgcmVtb3ZpbmcgZmlsZSBsaXN0ZWQgYXMgVGVtcG9yYXJ5OiBkdXBsaWNhdGkt
YmRiZWZjMzA2ZDRmYzQxYjc5MjQxN2EzZWRhZjU4MTk1LmRibG9jay56aXAuYWVzLA0KICAgIHJl
bW92aW5nIGZpbGUgbGlzdGVkIGFzIFRlbXBvcmFyeTogZHVwbGljYXRpLWliY2MxOGM0NjcyMWY0
MTBlOTc2ZTUxMDFhZWMyNDc1ZS5kaW5kZXguemlwLmFlcw0KXQ0KV2FybmluZ3M6IFtdDQpFcnJv
cnM6IFtdDQoNCg==
, Details: .*, 24, 1
EmailServer.searchMesagePart(RGVsZXRlZEZpbGVzOiAwDQpEZWxldGVkRm9sZGVyczogMA0KTW9kaWZpZWRGaWxlczogMA0KRXhh
bWluZWRGaWxlczogMzY4MQ0KT3BlbmVkRmlsZXM6IDANCkFkZGVkRmlsZXM6IDANClNpemVPZk1v
ZGlmaWVkRmlsZXM6IDANClNpemVPZkFkZGVkRmlsZXM6IDANClNpemVPZkV4YW1pbmVkRmlsZXM6
IDI4ODEwNTc1OTQNClNpemVPZk9wZW5lZEZpbGVzOiAwDQpOb3RQcm9jZXNzZWRGaWxlczogMA0K
QWRkZWRGb2xkZXJzOiAwDQpUb29MYXJnZUZpbGVzOiAwDQpGaWxlc1dpdGhFcnJvcjogMA0KTW9k
aWZpZWRGb2xkZXJzOiAwDQpNb2RpZmllZFN5bWxpbmtzOiAwDQpBZGRlZFN5bWxpbmtzOiAwDQpE
ZWxldGVkU3ltbGlua3M6IDANClBhcnRpYWxCYWNrdXA6IEZhbHNlDQpEcnlydW46IEZhbHNlDQpN
YWluT3BlcmF0aW9uOiBCYWNrdXANClBhcnNlZFJlc3VsdDogU3VjY2Vzcw0KVmVyYm9zZU91dHB1
dDogRmFsc2UNClZlcmJvc2VFcnJvcnM6IEZhbHNlDQpFbmRUaW1lOiAxMi8yNy8yMDE3IDEwOjAw
OjMyIEFNDQpCZWdpblRpbWU6IDEyLzI3LzIwMTcgMTA6MDA6MDAgQU0NCkR1cmF0aW9uOiAwMDow
MDozMi42MDg3NjAzDQpNZXNzYWdlczogWw0KICAgIE5vIHJlbW90ZSBmaWxlc2V0cyB3ZXJlIGRl
bGV0ZWQsDQogICAgcmVtb3ZpbmcgZmlsZSBsaXN0ZWQgYXMgVGVtcG9yYXJ5OiBkdXBsaWNhdGkt
YmRiZWZjMzA2ZDRmYzQxYjc5MjQxN2EzZWRhZjU4MTk1LmRibG9jay56aXAuYWVzLA0KICAgIHJl
bW92aW5nIGZpbGUgbGlzdGVkIGFzIFRlbXBvcmFyeTogZHVwbGljYXRpLWliY2MxOGM0NjcyMWY0
MTBlOTc2ZTUxMDFhZWMyNDc1ZS5kaW5kZXguemlwLmFlcw0KXQ0KV2FybmluZ3M6IFtdDQpFcnJv
cnM6IFtdDQoNCg==
, Failed: .*, 24, 1
options.getRcSectionDateTimeFmt(, )
drDateTime.toTimestamp(, MM/DD/YYYY, HH:MM:SS, -28800)
drDateTime.toTimestamp(, None, None, -28800)
drdatetime.fromTimestamp(None, MM/DD/YYYY, HH:MM:SS)
from dupreport.
Hard to tell from the context, but it looks like these are non-Duplicati emails. In those cases, the date/time strings wouldn't be in the place dupReport expects to find them. dR should catch this and skip the message, but for some reason it is not.
I'll need a bigger patch of log to trace where/why this is happening, since the snip you sent is all email body. Can you snip from the end back to where is says "EmailServer.process_message()"? That will give me a full trace of that message and why it is crashing there.
from dupreport.
I am certain they are non-Duplicati, as I just started using it, but dumping logs into my generic log folder that collects from other various services. Good way to test your email parsing/detection :D
Here it is. I trimmed out the encoded contents of each EmailServer.searchMesagePart(, it is the same as pasted above, and repeats for every instance of that entry.
EmailServer.process_message()
Database.searchForMessage([email protected])
Database.execSqlStmt(): Executing SQL command.
Message ID [[email protected]] does not exist. Adding to DB
drDateTime.toTimestamp(2017/12/27 00:00:40, YYYY/MM/DD, HH:MM:SS, None)
Date/Time converted to [1514361640.0]
drdatetime.fromTimestamp(1514361640.0, MM/DD/YYYY, HH:MM:SS)
Message [[email protected]] is not a Message of Interest.
dremail.getNextMessage()
EmailServer.process_message()
Database.searchForMessage([email protected])
Database.execSqlStmt(): Executing SQL command.
Database.execSqlStmt(): Executing SQL command.
Database.dbCommit(): Commiting transaction.
dremail.getNextMessage()
EmailServer.process_message()
Database.searchForMessage(DD79R41783U4.UJ34CIL6V9EC@Ironhide)
Database.execSqlStmt(): Executing SQL command.
Message ID [DD79R41783U4.UJ34CIL6V9EC@Ironhide] does not exist. Adding to DB
drDateTime.toTimestamp(2017/12/27 02:00:33, YYYY/MM/DD, HH:MM:SS, None)
Date/Time converted to [1514368833.0]
drdatetime.fromTimestamp(1514368833.0, MM/DD/YYYY, HH:MM:SS)
Database.searchSrcDestPair(, )
Database.execSqlStmt(): Executing SQL command.
EmailServer.searchMesagePart(
, DeletedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, DeletedFolders: \d+, 0, 0
EmailServer.searchMesagePart(
, ModifiedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, ExaminedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, OpenedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, AddedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, SizeOfModifiedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, SizeOfAddedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, SizeOfExaminedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, SizeOfOpenedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, NotProcessedFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, AddedFolders: \d+, 0, 0
EmailServer.searchMesagePart(
, TooLargeFiles: \d+, 0, 0
EmailServer.searchMesagePart(
, FilesWithError: \d+, 0, 0
EmailServer.searchMesagePart(
, ModifiedFolders: \d+, 0, 0
EmailServer.searchMesagePart(
, ModifiedSymlinks: \d+, 0, 0
EmailServer.searchMesagePart(
, AddedSymlinks: \d+, 0, 0
EmailServer.searchMesagePart(
, DeletedSymlinks: \d+, 0, 0
EmailServer.searchMesagePart(
, PartialBackup: \w+, 0, 1
EmailServer.searchMesagePart(
, Dryrun: \w+, 0, 1
EmailServer.searchMesagePart(
, MainOperation: \w+, 0, 1
EmailServer.searchMesagePart(
, ParsedResult: \w+, 0, 1
EmailServer.searchMesagePart(
, VerboseOutput: \w+, 0, 1
EmailServer.searchMesagePart(
, VerboseErrors: \w+, 0, 1
EmailServer.searchMesagePart(
, EndTime: .*, 0, 1
EmailServer.searchMesagePart(
, BeginTime: .*, 0, 1
EmailServer.searchMesagePart(
, Duration: .*, 0, 1
EmailServer.searchMesagePart(
, Messages: [.*^], 24, 1
EmailServer.searchMesagePart(
, Warnings: [.*^], 24, 1
EmailServer.searchMesagePart(
, Errors: [.*^], 24, 1
EmailServer.searchMesagePart(
, Details: .*, 24, 1
EmailServer.searchMesagePart(
, Failed: .*, 24, 1
options.getRcSectionDateTimeFmt(, )
drDateTime.toTimestamp(, MM/DD/YYYY, HH:MM:SS, -28800)
drDateTime.toTimestamp(, None, None, -28800)
drdatetime.fromTimestamp(None, MM/DD/YYYY, HH:MM:SS)
from dupreport.
OK, tried some more fixing. Please pull down the latest beta and see if that solves the problem.
Also, make sure you are running at logging level 3 (command line -v 3 or [main] verbose=3). It'll blow up the size of your log, but traces most of the variables in the code to give more debugging info.
Let me know how it goes.
from dupreport.
EmailServer.process_message()
Subject=[Duplicati Backup report for Test - Success]
srcregex=[\w*-] destRegex=[-\w*]
msg[Message-Id]=[DD79R41783U4.UJ34CIL6V9EC@Ironhide]
msgParts[messageId]=[DD79R41783U4.UJ34CIL6V9EC@Ironhide]
Database.searchForMessage(DD79R41783U4.UJ34CIL6V9EC@Ironhide)
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT messageId FROM emails WHERE messageId='DD79R41783U4.UJ34CIL6V9EC@Ironhide']
Message ID [DD79R41783U4.UJ34CIL6V9EC@Ironhide] does not exist. Adding to DB
drDateTime.toTimestamp(2017/12/27 02:00:33, YYYY/MM/DD, HH:MM:SS, None)
Date/Time converted to [1514368833.0]
drdatetime.fromTimestamp(1514368833.0, MM/DD/YYYY, HH:MM:SS)
Converted [1514368833.0] to [12/27/2017 02:00:33]
emailDate=[1514368833.0]-[('12/27/2017', '02:00:33')]
sourceComp=[] destComp=[] emailTimestamp=[1514368833.0] subject=[Duplicati Backup report for Test - Success]
Database.searchSrcDestPair(, )
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT source, destination FROM backupsets WHERE source='' AND destination='']
Source/Destination pair [/] already in database.
Body=[RGVsZXRlZEZpbGVzOiAwDQpEZWxldGVkRm9sZGVyczogMA0KTW9kaWZpZWRGaWxlczogMA0KRXhh
bWluZWRGaWxlczogMzY4MQ0KT3BlbmVkRmlsZXM6IDANCkFkZGVkRmlsZXM6IDANClNpemVPZk1v
ZGlmaWVkRmlsZXM6IDANClNpemVPZkFkZGVkRmlsZXM6IDANClNpemVPZkV4YW1pbmVkRmlsZXM6
IDI4ODEwNTc1OTQNClNpemVPZk9wZW5lZEZpbGVzOiAwDQpOb3RQcm9jZXNzZWRGaWxlczogMA0K
QWRkZWRGb2xkZXJzOiAwDQpUb29MYXJnZUZpbGVzOiAwDQpGaWxlc1dpdGhFcnJvcjogMA0KTW9k
aWZpZWRGb2xkZXJzOiAwDQpNb2RpZmllZFN5bWxpbmtzOiAwDQpBZGRlZFN5bWxpbmtzOiAwDQpE
ZWxldGVkU3ltbGlua3M6IDANClBhcnRpYWxCYWNrdXA6IEZhbHNlDQpEcnlydW46IEZhbHNlDQpN
YWluT3BlcmF0aW9uOiBCYWNrdXANClBhcnNlZFJlc3VsdDogU3VjY2Vzcw0KVmVyYm9zZU91dHB1
dDogRmFsc2UNClZlcmJvc2VFcnJvcnM6IEZhbHNlDQpFbmRUaW1lOiAxMi8yNy8yMDE3IDEwOjAw
OjMyIEFNDQpCZWdpblRpbWU6IDEyLzI3LzIwMTcgMTA6MDA6MDAgQU0NCkR1cmF0aW9uOiAwMDow
MDozMi42MDg3NjAzDQpNZXNzYWdlczogWw0KICAgIE5vIHJlbW90ZSBmaWxlc2V0cyB3ZXJlIGRl
bGV0ZWQsDQogICAgcmVtb3ZpbmcgZmlsZSBsaXN0ZWQgYXMgVGVtcG9yYXJ5OiBkdXBsaWNhdGkt
YmRiZWZjMzA2ZDRmYzQxYjc5MjQxN2EzZWRhZjU4MTk1LmRibG9jay56aXAuYWVzLA0KICAgIHJl
bW92aW5nIGZpbGUgbGlzdGVkIGFzIFRlbXBvcmFyeTogZHVwbGljYXRpLWliY2MxOGM0NjcyMWY0
MTBlOTc2ZTUxMDFhZWMyNDc1ZS5kaW5kZXguemlwLmFlcw0KXQ0KV2FybmluZ3M6IFtdDQpFcnJv
cnM6IFtdDQoNCg==
]
EmailServer.searchMesagePart(
^^ At this point it repeats the email contents for each variable, as before.
from dupreport.
The log entry: Database.searchSrcDestPair(, )
means that the program isn't finding the source and destination properly. The source and destination should be shown between the parentheses, as in Database.searchSrcDestPair(Test,Success )
Can you remove the spaces after "Test" and before "Success" in your backup job name? Instead of
Test - Success
your backup job needs to be named
Test-Success (no spaces).
I did some testing and was able to reproduce this part of your problem. The regex parser is confused by the extra spaces between source and destination and not properly catching the source and destination names. Try running the job under the new name (or somehow editing the email subject lines) and rerun. Regex is a delicate thing, unfortunately. I need to update the docs on this point.
I'm still concerned that your email body looks like encoded text. If the email in question is from a Duplicati backup report, your log should look something like this:
Source/Destination pair [XXX/YYY] already in database.
Body=[DeletedFiles: 0DeletedFolders: 0ModifiedFiles: 0ExaminedFiles: 223314OpenedFiles: 0AddedFiles: 0SizeOfModifiedFiles: 0SizeOfAddedFiles: 0SizeOfExaminedFiles: 885093604482SizeOfOpenedFiles: 0NotProcessedFiles: 0AddedFolders: 0TooLargeFiles: 0FilesWithError: 0ModifiedFolders: 0ModifiedSymlinks: 0AddedSymlinks: 0DeletedSymlinks: 0PartialBackup: FalseDryrun: FalseMainOperation: BackupParsedResult: SuccessVerboseOutput: FalseVerboseErrors: FalseEndTime: 10/27/2017 3:15:24 PMBeginTime: 10/27/2017 2:59:48 PMDuration: 00:15:36.5097932Messages: [ No remote filesets were deleted, removing file listed as Temporary: duplicati-b3039a471f4a843cfb51316b892db4950.dblock.zip.aes, removing file listed as Temporary: duplicati-i127aa9063f0741cd95d96baaaf178c6d.dindex.zip.aes]Warnings: []Errors: []]
EmailServer.searchMesagePart(DeletedFiles: 0DeletedFolders: 0ModifiedFiles: 0ExaminedFiles: 223314OpenedFiles: 0AddedFiles:
.
.
.
The encoded text will definitely confuse the message parser, but let's fix one problem at a time. First, matching the proper source/destination pairs...
from dupreport.
OK, i'll update the subject line. I'll also have to delete the existing reports and wait to see how they do going forward. I'll keep you posted.
EDIT: Actually, I just tried on my cPanel mailbox (which is empty except for Duplicati emails) and it parsed the emails just fine. It doesn't differentiate between the two backup jobs, but I do get a report.
from dupreport.
A speedier option would be to run a manual job or two with the new job name. You won't get much in the way of file difference but you'll at least get couple of new emails to parse. Then run the program and it should only pick up the new result emails.
Your choice, I'm standing by either way. Thanks for helping me dig into this. :-)
from dupreport.
OK I got a different error but the log seems like it thinks it was successful. I think the outbound connection times out and is closed before dR has a chance to finish going through the emails.
btw - I reread what you said. Originally I had added " - %PARSEDRESULT%" to the end of Duplicati's default subject line, so that I can see at a glance if there was a problem with the job or not. Is there some way I can do this and still use dR? Nevermind - reading your documentation on src/dest pairing, I will adjust accordingly
Traceback (most recent call last):
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 354, in send
self.sock.sendall(s)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 972, in sendall
v = self.send(byte_view[count:])
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 941, in send
return self._sslobj.write(data)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 642, in write
return self._sslobj.write(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote hostDuring handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "dupReport.py", line 255, in
globs.outServer.sendEmail(msgHtml, msgText)
File "C:\dupReport-2.1.0_Beta\dremail.py", line 484, in sendEmail
self.server.sendmail(sender, receiver, msg.as_string().encode('utf-8'))
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 861, in sendmail
(code, resp) = self.mail(from_addr, esmtp_opts)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 534, in mail
self.putcmd("mail", "FROM:%s%s" % (quoteaddr(sender), optionlist))
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 367, in putcmd
self.send(str)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 357, in send
raise SMTPServerDisconnected('Server not connected')
smtplib.SMTPServerDisconnected: Server not connected
Here is the end of log snippet:
sendNoBackupWarnings()
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT source, destination FROM backupsets ORDER BY source, destination]
getLatestTimestamp(, )
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT lastTimestamp FROM backupsets WHERE source = "" AND destination = ""]
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT max(endTimeStamp) FROM emails WHERE sourceComp = '' AND destComp= '' AND endTimeStamp > 1514610397.0]
Returning backupsets timestamp: 1514610397.0
report.pastBackupWarningThreshold(, , -1)
options.getRcOption(-, nobackupwarn)
Nobackup warning threshold is 0 days.
pastBackupWarningThreshold returning False
getLatestTimestamp(Test, Success)
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT lastTimestamp FROM backupsets WHERE source = "Test" AND destination = "Success"]
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT max(endTimeStamp) FROM emails WHERE sourceComp = 'Test' AND destComp= 'Success' AND endTimeStamp > 1514612362.0]
Returning backupsets timestamp: 1514612362.0
report.pastBackupWarningThreshold(Test, Success, -1)
options.getRcOption(Test-Success, nobackupwarn)
Nobackup warning threshold is 0 days.
pastBackupWarningThreshold returning False
sendEmail(msgHtml=, msgText=Duplicati Backup Summary Report Date Time Files +/- Size +/- Added Deleted Modified Errors Result ***** Source: Destination: ***** No new activity. Last activity on 12/29/2017 at 21:06:37 (-1 days ago) ***** Source: Test Destination: Success ***** No new activity. Last activity on 12/29/2017 at 21:39:22 (-1 days ago) Running Time: 960.377 seconds. , subject=None, sender=None, receiver=None)
Duplicati Backup Summary Report Date Time Files +/- Size +/- Added Deleted Modified Errors Result Source: Destination: No new activity. Last activity on 12/29/2017 at 21:06:37 (-1 days ago) Source: Test Destination: Success No new activity. Last activity on 12/29/2017 at 21:39:22 (-1 days ago) Running Time: 960.377 seconds.
from dupreport.
ninja6o4 I once used the following with duplicti and dupreport
Duplicti would be setup as
--send-mail-subject=Duplicati %OPERATIONNAME% (%PARSEDRESULT%) report for %backup-name%
DupReport you would set subjectregex in the .rc file
subjectregex = ^Duplicati Backup \((Success|Error|Warn|Failed)\) report for
from dupreport.
Running for 960 seconds (16 minutes) will definitely cause the backend server to timeout. The good news is that should only affect the outgoing summary emails. The incoming Duplicati emails are all read on the front end of the program and that connection is constantly in use, so that shouldn't time out.
As a workaround during debugging, use the -f <file>,html option to send the output to <file> and the -x option to suppress the final summary email. That way you'll get the report without needing to send the final email. I've opened an issue about the timeout problem.
Using %PARSEDRESULT% is certainly an option. @dcurrey had a post about doing this in the Forum at this link. This post shows another way of doing it.
So the last question is should there have been entries in the final report, or is the "No new activity" message correct? I'm assuming the empty Source/Destination output is from the previous runs with the bad source/destination spec?
from dupreport.
@dcurrey Thanks for the suggestion! I will do that.
@HandyGuySoftware I believe no activity to be correct, since no files were uploaded in my test runs. For the timeout, I don't know how python works, but is it possible to do the initial connection at the start to confirm working configuration, close it, and then start another connection at the end once it's ready to send?
from dupreport.
So, I think I figured out the focus of dupReport 2.1.1: More efficient email handling! :-)
It's certainly possible, and in fact the outgoing email connection wasn't made until much later in the program in earlier releases. However, I added a feature to send warning emails to users if they pass the nobackupwarn threshold (see Issue #44 ) and that forced the connection to be opened earlier in the program run.
I'm currently researching the possibility of adding some sort of keepalive signal to the SMTP server as an interim measure. I'll let you know as soon as I work up something.
from dupreport.
Just added a short keepalive routine in the read-messages loop. It's quick & dirty and I'm not sure it will work, but it's worth a try. Uploaded new code to the beta branch.
Remove the -x from the command line and let me know if you're still getting the timeout on the outbound server. (You can keep the -f if you want. I always find it easier to just look at the output file instead of always going back to email to see the result. Personal choice).
from dupreport.
Looks like the whole thing is crashing out prematurely when SMTP is closed now. It's not finished scanning and has this error:
Traceback (most recent call last):
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 387, in getreply
line = self.file.readline(_MAXLINE + 1)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\socket.py", line 586, in readinto
return self._sock.recv_into(b)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 1009, in recv_into
return self.read(nbytes, buffer)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 871, in read
return self._sslobj.read(len, buffer)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 631, in read
v = self._sslobj.read(len, buffer)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote hostDuring handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "dupReport.py", line 216, in
globs.outServer.keepAlive()
File "C:\dupReport-2.1.0_Beta\dremail.py", line 506, in keepAlive
retval = self.server.noop()
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 514, in noop
return self.docmd("noop")
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 421, in docmd
return self.getreply()
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 391, in getreply
+ str(e))
smtplib.SMTPServerDisconnected: Connection unexpectedly closed: [WinError 10054] An existing connection was forcibly closed by the remote host
Last entry in the log:
--===============1810988833==--
]
outServer.keepAlive()
from dupreport.
So quick & dirty didn't work, so went with a more thorough approach. Expanded KeepAlive routine to re-connect to server if the connection drops. Tested on a 20-minute delay and seems to work. Give it a try and see if it fixes your timeout issue.
from dupreport.
Yup that did it. Thanks!
outServer.keepAlive()
Server smtp.office365.com timed out. Reconnecting.
EmailServer.Connect(protocol=[smtp] address=[smtp.office365.com] port=[587] account=[xxx.com] passwd=[*] encryption=[ssl])
Using SMTP
Logged in. retVal=235 retMsg=b'2.7.0 Authentication successful target host CY1PR17MB0647.namprd17.prod.outlook.com'
Program completed in 872.879 seconds. Exiting
Closing everything...
Closing inbound email server...
Closing outbound email server...
Closing database file...
Database.dbClose(): Closing database object.
Closing log file...
from dupreport.
Excellent! That'll close out two issues. Let me know if you're having any more problems processing emails as they start to build up.
In version 2.1.1 I'll do some work to make the reading of emails more efficient so you don't have to download all of them. That should cut down on your run time sufficiently!
from dupreport.
Related Issues (20)
- Fix Typos in convert.py file HOT 1
- Add OS Name & Version to Log File HOT 1
- dupReport.py installer changes folder name HOT 2
- Error crash when target destination is unavailable HOT 1
- smtplib.SMTPDataError: (501, b'Syntax error - line too long') HOT 18
- Typo in [report] section HOT 1
- Change description of -verbose command line option HOT 1
- Program crashes when can't connect to SMTP server HOT 1
- Fix Typos in docs/WhatIsDupreport.md HOT 1
- Indicate Warning or Error in subject line of the mail report HOT 16
- Only send on failure HOT 1
- Parsing failures HOT 4
- Enabling "Less secure apps" is required for Gmail access HOT 2
- "Nonetype has no attribute groups" - dupreport crash HOT 7
- Add an Ignore option to the Source-Destination Section of the .RC file HOT 1
- Need more flexibility in defining Subject Line and S-D regex in .rc file (Was: RegEx suddenly does not match anymore) HOT 16
- dupReport is ignoring Backup Warning Threshold HOT 1
- Typo in the docs for the-b option HOT 1
- Syslog output crashes ELK Logstash 8.1.2 HOT 1
- Ignored S/D Pairs still show up in LastSeen Report
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from dupreport.