Giter Site home page Giter Site logo

Comments (34)

ninja6o4 avatar ninja6o4 commented on May 25, 2024

OK grabbed latest beta:
Traceback (most recent call last):
File "dupReport.py", line 213, in
globs.inServer.processMessage(nxtMsg)
File "C:\dupReport-2.1.0_Beta\dremail.py", line 307, in processMessage
decode1 = email.header.decode_header(msg['Message-Id'])[0][0]
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\email\header.py", line 80, in decode_header
if not ecre.search(header):
TypeError: expected string or bytes-like object

Attached is pruned log, last entry (same email, it looks like.)
dupReport.log

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

OK, as I suspected, for some reason dR is not getting a message ID from the email, so everything else is crashing behind it. Highly unusual. The easy fix is to add something like

if (no header)
    skip message

However, before I do that, would it be possible to post the entire email that's causing the error, including the headers? I want to see if there is a message format I need to code for. Also, did you say this worked on other mail servers or was that the original error you were talking about?

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

I did also try it on a cPanel-based mail server and it worked without any issues (although it's an empty mailbox right now.)

I will look for that specific email later tonight and pm you privately the contents.

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

OK, thanks. I've coded up a potential fix. I can post that now and see if it solves the problem, or wait for your email and do some additional research to see if more changes are needed. Your choice.

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

Uploaded a possible fix to the Beta branch. Please have a look and see if it addresses your issue.

from dupreport.

dcurrey avatar dcurrey commented on May 25, 2024

Not sure if this is related to above but I was able to reproduce the error above by having an email message with a blank subject.

This was done with the most current code.

  File "./dupReport.py", line 214, in <module>
    globs.inServer.processMessage(nxtMsg)
  File "/root/dupReport-2.1.0_Beta/dremail.py", line 337, in processMessage
    decode = email.header.decode_header(msg['Subject'])[0]
  File "/usr/lib/python3.5/email/header.py", line 80, in decode_header
    if not ecre.search(header):

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

@dcurrey , interesting, because a message with a blank subject should be simply rejected as "not a message of interest." Can you post the error message you're getting and the last 20 lines of the log file?

from dupreport.

dcurrey avatar dcurrey commented on May 25, 2024

Think I am on to something here looking at the log and the messageid it looks like dupreport is trying to add the blank subject message.

Database.searchForMessage(<20171228112123.Horde.p77A5h25wJZ9ESmi-G1jyFY@hordetest.mydomain.com>)
Database.execSqlStmt(): Executing SQL command.
Message ID [<20171228112123.Horde.p77A5h25wJZ9ESmi-G1jyFY@hordetest.mydomain.com>] does not exist. Adding to DB

As you stated it shouldn't

Here is the full message header notice something missing

Return-Path: <[email protected]>
Delivered-To: [email protected]
Received: from localhost (localhost.localdomain [127.0.0.1])
	by beta.mydomain.com (Postfix) with ESMTP id 92ED84543EEB
	for <[email protected]>; Thu, 28 Dec 2017 11:21:23 -0500 (EST)
X-Virus-Scanned: Debian amavisd-new at beta.mydomain.com
X-Spam-Flag: NO
X-Spam-Score: 6.956
X-Spam-Level: ******
X-Spam-Status: No, score=6.956 tagged_above=3 required=999
	tests=[HELO_DYNAMIC_IPADDR=3.243, MISSING_SUBJECT=1.767,
	RDNS_DYNAMIC=0.363, SPF_FAIL=0.919, TO_EQ_FM_DIRECT_MX=2.499,
	TO_EQ_FM_DOM_SPF_FAIL=0.001, TO_EQ_FM_SPF_FAIL=0.001, TXREP=-1.837]
	autolearn=no autolearn_force=no
Received: from beta.mydomain.com ([127.0.0.1])
	by localhost (beta.mydomain.com [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id oF2DDRqrgkR2 for <[email protected]>;
	Thu, 28 Dec 2017 11:21:23 -0500 (EST)
Received: by beta.mydomain.com (Postfix, from userid 5002)
	id 10BEF4543ED3; Thu, 28 Dec 2017 11:21:23 -0500 (EST)
Received: from mh-esr1-xxx-xxx-xxx-xxx.,myisp.net (mh-esr1-xxx-xxx-xxx-xxx.myisp.net
 [xxx.xxx.xxx.xxx]) by hordetest.mydomain.com (Horde Framework) with HTTPS;
 Thu, 28 Dec 2017 11:21:23 -0500
Date: Thu, 28 Dec 2017 11:21:23 -0500
Message-ID: <20171228112123.Horde.p77A5h25wJZ9ESmi-G1jyFY@hordetest.mydomain.com>
From: [email protected]
To: [email protected]
User-Agent: Horde Application Framework 5
Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes
MIME-Version: 1.0
Content-Disposition: inline

some text

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

OK updated.. it runs for a long time (log is too large to read in n++), but it ends with this error:

Traceback (most recent call last):
File "dupReport.py", line 214, in
globs.inServer.processMessage(nxtMsg)
File "C:\dupReport-2.1.0_Beta\dremail.py", line 443, in processMessage
globs.log.write(3, 'endTimeStamp=[{}] beginTimeStamp=[{}]'.format(drdatetime.fromTimestamp(dateParts['endTimestamp']), drdatetime.fromTimestamp(dateParts['beginTimestamp'])))
File "C:\dupReport-2.1.0_Beta\drdatetime.py", line 134, in fromTimestamp
dt = datetime.datetime.fromtimestamp(float(ts))
TypeError: float() argument must be a string or a number, not 'NoneType'

My .rc file is unchanged from default other than the server access/credentials:

[main]
version = 2.1.0
dbpath = C:\dupReport-2.1.0_Beta
logpath = C:\dupReport-2.1.0_Beta
verbose = 1
logappend = false
subjectregex = ^Duplicati Backup report for
srcregex = \w*
destregex = \w*
srcdestdelimiter = -
dateformat = MM/DD/YYYY
timeformat = HH:MM:SS
warnoncollect = false
applyutcoffset = false
show24hourtime = true
purgedb = false

Another question - I tried running it again without verbose logging so I could read the .log, and I noticed that more often than not, it says "Message ID [[email protected]] does not exist. Adding to DB"

However, since I've already run it once before, should the majority (if not all) of these existing messages now be in the DB, even if it crashes out?

from dupreport.

dcurrey avatar dcurrey commented on May 25, 2024

Quick and dirty fix for no subject problem

Line 336 dremail.py

        # get Subject
        if not msg['Subject']:
            msg['Subject'] = '[No Subject]'
        decode = email.header.decode_header(msg['Subject'])[0]

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

@ninja6o4 , the program doesn't commit the database until all the emails are parsed, so if it crashed in the middle it won't register those emails in the db. They will be scanned again on the next run.

@dcurrey , I'm thinking an easier fix is to just skip the message entirely if no subject, since it won't be from a Duplicati backup session. dR won't be able to do anything with it anyway. Continuing to process, even with a bogus subject will just lead to problems down the line. Thoughts?

from dupreport.

dcurrey avatar dcurrey commented on May 25, 2024

Hence the quick and dirty fix. :)

It does look like the check is just a few lines down in the dremail.py

So just do a return at the if statement instead of bogus subject

if not msg['Subject']:
     globs.log.write(1, 'Message [{}] is not a Message of Interest, No Subject.'.format(msgParts['messageId']))
     return None, None    # Not a message of Interest

But I am good on whatever works.

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

On the road today. Will take a look at it tonight an work up a permanent fix. Will also look at the ts parsing problem, which I suspect is related.

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

Just posted potential fix code to the Beta branch. I basically punted the problem by checking for essential message elements and skipping the message if any of them are missing:

        if msg['Message-Id'] is None or msg['Message-Id'] == '':
            globs.log.write(3,'No message-Id. Abandoning processMessage()')
            return None, None
        if msg['Subject'] is None or msg['Subject'] == '':
            globs.log.write(3,'No Subject. Abandoning processMessage()')
            return None, None
        if msg['Date'] is None or msg['Date'] == '':
            globs.log.write(3,'No Date. Abandoning processMessage()')
            return None, None

I also removed the debug lines that logged multiple copies of the email. That'll cut down on the log size a bit.

@ninja6o4 , I'll revise my earlier explanation for the multiple "does not exist in DB" messages. It's been a while since I've been in this section of the code. "Upon further review" it looks like dR only stores the data for emails that it successfully parses as Duplicati result messages. If it can't match an acceptable subject line (as defined in [main] subjectregex= in the .rc file) it just outputs an error message ("... not a message of interest...") and skips to the next message. As a result, if you have a lot of non-Duplicati messages in your inbox they will be reviewed every time the program runs. This was a time/space trade-off. I sacrificed some processing time to prevent storing lots of irrelevant message data in the database.

@mr-flibble, you may want to download and test this fix. You seem to have emails that like to break dR (:-)). Since this fix altered the way email headers are processed and matched it would be good to have you run through it as well.

Let me know how it goes.

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

My current log folder is ~4000 messages, so it does take a few minutes to run, but probably not a big deal once i get this working and automated. My main concern is that it looks like it is literally downloading the entire folder every time it runs, which is ~200MB each time. I don't know if it would overly burden the db to log all message IDs and a flag as to if it is relevant or not. Maybe an option for the user as to whether or not to save it?

In any event, still getting the odd date format error. Log shows:

, Failed: .*, 24, 1
options.getRcSectionDateTimeFmt(, )
drDateTime.toTimestamp(, MM/DD/YYYY, HH:MM:SS, -28800)
drDateTime.toTimestamp(, None, None, -28800)
drdatetime.fromTimestamp(None, MM/DD/YYYY, HH:MM:SS)

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

@ninja6o4, can you post the 20 or so lines around the format error message so I can see what's going on there?

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

, Errors: [.*^], 24, 1
EmailServer.searchMesagePart(RGVsZXRlZEZpbGVzOiAwDQpEZWxldGVkRm9sZGVyczogMA0KTW9kaWZpZWRGaWxlczogMA0KRXhh

bWluZWRGaWxlczogMzY4MQ0KT3BlbmVkRmlsZXM6IDANCkFkZGVkRmlsZXM6IDANClNpemVPZk1v

ZGlmaWVkRmlsZXM6IDANClNpemVPZkFkZGVkRmlsZXM6IDANClNpemVPZkV4YW1pbmVkRmlsZXM6

IDI4ODEwNTc1OTQNClNpemVPZk9wZW5lZEZpbGVzOiAwDQpOb3RQcm9jZXNzZWRGaWxlczogMA0K

QWRkZWRGb2xkZXJzOiAwDQpUb29MYXJnZUZpbGVzOiAwDQpGaWxlc1dpdGhFcnJvcjogMA0KTW9k

aWZpZWRGb2xkZXJzOiAwDQpNb2RpZmllZFN5bWxpbmtzOiAwDQpBZGRlZFN5bWxpbmtzOiAwDQpE

ZWxldGVkU3ltbGlua3M6IDANClBhcnRpYWxCYWNrdXA6IEZhbHNlDQpEcnlydW46IEZhbHNlDQpN

YWluT3BlcmF0aW9uOiBCYWNrdXANClBhcnNlZFJlc3VsdDogU3VjY2Vzcw0KVmVyYm9zZU91dHB1

dDogRmFsc2UNClZlcmJvc2VFcnJvcnM6IEZhbHNlDQpFbmRUaW1lOiAxMi8yNy8yMDE3IDEwOjAw

OjMyIEFNDQpCZWdpblRpbWU6IDEyLzI3LzIwMTcgMTA6MDA6MDAgQU0NCkR1cmF0aW9uOiAwMDow

MDozMi42MDg3NjAzDQpNZXNzYWdlczogWw0KICAgIE5vIHJlbW90ZSBmaWxlc2V0cyB3ZXJlIGRl

bGV0ZWQsDQogICAgcmVtb3ZpbmcgZmlsZSBsaXN0ZWQgYXMgVGVtcG9yYXJ5OiBkdXBsaWNhdGkt

YmRiZWZjMzA2ZDRmYzQxYjc5MjQxN2EzZWRhZjU4MTk1LmRibG9jay56aXAuYWVzLA0KICAgIHJl

bW92aW5nIGZpbGUgbGlzdGVkIGFzIFRlbXBvcmFyeTogZHVwbGljYXRpLWliY2MxOGM0NjcyMWY0

MTBlOTc2ZTUxMDFhZWMyNDc1ZS5kaW5kZXguemlwLmFlcw0KXQ0KV2FybmluZ3M6IFtdDQpFcnJv

cnM6IFtdDQoNCg==

, Details: .*, 24, 1
EmailServer.searchMesagePart(RGVsZXRlZEZpbGVzOiAwDQpEZWxldGVkRm9sZGVyczogMA0KTW9kaWZpZWRGaWxlczogMA0KRXhh

bWluZWRGaWxlczogMzY4MQ0KT3BlbmVkRmlsZXM6IDANCkFkZGVkRmlsZXM6IDANClNpemVPZk1v

ZGlmaWVkRmlsZXM6IDANClNpemVPZkFkZGVkRmlsZXM6IDANClNpemVPZkV4YW1pbmVkRmlsZXM6

IDI4ODEwNTc1OTQNClNpemVPZk9wZW5lZEZpbGVzOiAwDQpOb3RQcm9jZXNzZWRGaWxlczogMA0K

QWRkZWRGb2xkZXJzOiAwDQpUb29MYXJnZUZpbGVzOiAwDQpGaWxlc1dpdGhFcnJvcjogMA0KTW9k

aWZpZWRGb2xkZXJzOiAwDQpNb2RpZmllZFN5bWxpbmtzOiAwDQpBZGRlZFN5bWxpbmtzOiAwDQpE

ZWxldGVkU3ltbGlua3M6IDANClBhcnRpYWxCYWNrdXA6IEZhbHNlDQpEcnlydW46IEZhbHNlDQpN

YWluT3BlcmF0aW9uOiBCYWNrdXANClBhcnNlZFJlc3VsdDogU3VjY2Vzcw0KVmVyYm9zZU91dHB1

dDogRmFsc2UNClZlcmJvc2VFcnJvcnM6IEZhbHNlDQpFbmRUaW1lOiAxMi8yNy8yMDE3IDEwOjAw

OjMyIEFNDQpCZWdpblRpbWU6IDEyLzI3LzIwMTcgMTA6MDA6MDAgQU0NCkR1cmF0aW9uOiAwMDow

MDozMi42MDg3NjAzDQpNZXNzYWdlczogWw0KICAgIE5vIHJlbW90ZSBmaWxlc2V0cyB3ZXJlIGRl

bGV0ZWQsDQogICAgcmVtb3ZpbmcgZmlsZSBsaXN0ZWQgYXMgVGVtcG9yYXJ5OiBkdXBsaWNhdGkt

YmRiZWZjMzA2ZDRmYzQxYjc5MjQxN2EzZWRhZjU4MTk1LmRibG9jay56aXAuYWVzLA0KICAgIHJl

bW92aW5nIGZpbGUgbGlzdGVkIGFzIFRlbXBvcmFyeTogZHVwbGljYXRpLWliY2MxOGM0NjcyMWY0

MTBlOTc2ZTUxMDFhZWMyNDc1ZS5kaW5kZXguemlwLmFlcw0KXQ0KV2FybmluZ3M6IFtdDQpFcnJv

cnM6IFtdDQoNCg==

, Failed: .*, 24, 1
options.getRcSectionDateTimeFmt(, )
drDateTime.toTimestamp(, MM/DD/YYYY, HH:MM:SS, -28800)
drDateTime.toTimestamp(, None, None, -28800)
drdatetime.fromTimestamp(None, MM/DD/YYYY, HH:MM:SS)

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

Hard to tell from the context, but it looks like these are non-Duplicati emails. In those cases, the date/time strings wouldn't be in the place dupReport expects to find them. dR should catch this and skip the message, but for some reason it is not.

I'll need a bigger patch of log to trace where/why this is happening, since the snip you sent is all email body. Can you snip from the end back to where is says "EmailServer.process_message()"? That will give me a full trace of that message and why it is crashing there.

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

I am certain they are non-Duplicati, as I just started using it, but dumping logs into my generic log folder that collects from other various services. Good way to test your email parsing/detection :D

Here it is. I trimmed out the encoded contents of each EmailServer.searchMesagePart(, it is the same as pasted above, and repeats for every instance of that entry.

EmailServer.process_message()
Database.searchForMessage([email protected])
Database.execSqlStmt(): Executing SQL command.
Message ID [[email protected]] does not exist. Adding to DB
drDateTime.toTimestamp(2017/12/27 00:00:40, YYYY/MM/DD, HH:MM:SS, None)
Date/Time converted to [1514361640.0]
drdatetime.fromTimestamp(1514361640.0, MM/DD/YYYY, HH:MM:SS)
Message [[email protected]] is not a Message of Interest.
dremail.getNextMessage()
EmailServer.process_message()
Database.searchForMessage([email protected])
Database.execSqlStmt(): Executing SQL command.
Database.execSqlStmt(): Executing SQL command.
Database.dbCommit(): Commiting transaction.
dremail.getNextMessage()
EmailServer.process_message()
Database.searchForMessage(DD79R41783U4.UJ34CIL6V9EC@Ironhide)
Database.execSqlStmt(): Executing SQL command.
Message ID [DD79R41783U4.UJ34CIL6V9EC@Ironhide] does not exist. Adding to DB
drDateTime.toTimestamp(2017/12/27 02:00:33, YYYY/MM/DD, HH:MM:SS, None)
Date/Time converted to [1514368833.0]
drdatetime.fromTimestamp(1514368833.0, MM/DD/YYYY, HH:MM:SS)
Database.searchSrcDestPair(, )
Database.execSqlStmt(): Executing SQL command.
EmailServer.searchMesagePart(

, DeletedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, DeletedFolders: \d+, 0, 0
EmailServer.searchMesagePart(

, ModifiedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, ExaminedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, OpenedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, AddedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, SizeOfModifiedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, SizeOfAddedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, SizeOfExaminedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, SizeOfOpenedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, NotProcessedFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, AddedFolders: \d+, 0, 0
EmailServer.searchMesagePart(

, TooLargeFiles: \d+, 0, 0
EmailServer.searchMesagePart(

, FilesWithError: \d+, 0, 0
EmailServer.searchMesagePart(

, ModifiedFolders: \d+, 0, 0
EmailServer.searchMesagePart(

, ModifiedSymlinks: \d+, 0, 0
EmailServer.searchMesagePart(

, AddedSymlinks: \d+, 0, 0
EmailServer.searchMesagePart(

, DeletedSymlinks: \d+, 0, 0
EmailServer.searchMesagePart(

, PartialBackup: \w+, 0, 1
EmailServer.searchMesagePart(

, Dryrun: \w+, 0, 1
EmailServer.searchMesagePart(

, MainOperation: \w+, 0, 1
EmailServer.searchMesagePart(

, ParsedResult: \w+, 0, 1
EmailServer.searchMesagePart(

, VerboseOutput: \w+, 0, 1
EmailServer.searchMesagePart(

, VerboseErrors: \w+, 0, 1
EmailServer.searchMesagePart(

, EndTime: .*, 0, 1
EmailServer.searchMesagePart(

, BeginTime: .*, 0, 1
EmailServer.searchMesagePart(

, Duration: .*, 0, 1
EmailServer.searchMesagePart(

, Messages: [.*^], 24, 1
EmailServer.searchMesagePart(

, Warnings: [.*^], 24, 1
EmailServer.searchMesagePart(

, Errors: [.*^], 24, 1
EmailServer.searchMesagePart(

, Details: .*, 24, 1
EmailServer.searchMesagePart(

, Failed: .*, 24, 1
options.getRcSectionDateTimeFmt(, )
drDateTime.toTimestamp(, MM/DD/YYYY, HH:MM:SS, -28800)
drDateTime.toTimestamp(, None, None, -28800)
drdatetime.fromTimestamp(None, MM/DD/YYYY, HH:MM:SS)

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

OK, tried some more fixing. Please pull down the latest beta and see if that solves the problem.

Also, make sure you are running at logging level 3 (command line -v 3 or [main] verbose=3). It'll blow up the size of your log, but traces most of the variables in the code to give more debugging info.

Let me know how it goes.

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

EmailServer.process_message()
Subject=[Duplicati Backup report for Test - Success]
srcregex=[\w*-] destRegex=[-\w*]
msg[Message-Id]=[DD79R41783U4.UJ34CIL6V9EC@Ironhide]
msgParts[messageId]=[DD79R41783U4.UJ34CIL6V9EC@Ironhide]
Database.searchForMessage(DD79R41783U4.UJ34CIL6V9EC@Ironhide)
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT messageId FROM emails WHERE messageId='DD79R41783U4.UJ34CIL6V9EC@Ironhide']
Message ID [DD79R41783U4.UJ34CIL6V9EC@Ironhide] does not exist. Adding to DB
drDateTime.toTimestamp(2017/12/27 02:00:33, YYYY/MM/DD, HH:MM:SS, None)
Date/Time converted to [1514368833.0]
drdatetime.fromTimestamp(1514368833.0, MM/DD/YYYY, HH:MM:SS)
Converted [1514368833.0] to [12/27/2017 02:00:33]
emailDate=[1514368833.0]-[('12/27/2017', '02:00:33')]
sourceComp=[] destComp=[] emailTimestamp=[1514368833.0] subject=[Duplicati Backup report for Test - Success]
Database.searchSrcDestPair(, )
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT source, destination FROM backupsets WHERE source='' AND destination='']
Source/Destination pair [/] already in database.
Body=[RGVsZXRlZEZpbGVzOiAwDQpEZWxldGVkRm9sZGVyczogMA0KTW9kaWZpZWRGaWxlczogMA0KRXhh

bWluZWRGaWxlczogMzY4MQ0KT3BlbmVkRmlsZXM6IDANCkFkZGVkRmlsZXM6IDANClNpemVPZk1v

ZGlmaWVkRmlsZXM6IDANClNpemVPZkFkZGVkRmlsZXM6IDANClNpemVPZkV4YW1pbmVkRmlsZXM6

IDI4ODEwNTc1OTQNClNpemVPZk9wZW5lZEZpbGVzOiAwDQpOb3RQcm9jZXNzZWRGaWxlczogMA0K

QWRkZWRGb2xkZXJzOiAwDQpUb29MYXJnZUZpbGVzOiAwDQpGaWxlc1dpdGhFcnJvcjogMA0KTW9k

aWZpZWRGb2xkZXJzOiAwDQpNb2RpZmllZFN5bWxpbmtzOiAwDQpBZGRlZFN5bWxpbmtzOiAwDQpE

ZWxldGVkU3ltbGlua3M6IDANClBhcnRpYWxCYWNrdXA6IEZhbHNlDQpEcnlydW46IEZhbHNlDQpN

YWluT3BlcmF0aW9uOiBCYWNrdXANClBhcnNlZFJlc3VsdDogU3VjY2Vzcw0KVmVyYm9zZU91dHB1

dDogRmFsc2UNClZlcmJvc2VFcnJvcnM6IEZhbHNlDQpFbmRUaW1lOiAxMi8yNy8yMDE3IDEwOjAw

OjMyIEFNDQpCZWdpblRpbWU6IDEyLzI3LzIwMTcgMTA6MDA6MDAgQU0NCkR1cmF0aW9uOiAwMDow

MDozMi42MDg3NjAzDQpNZXNzYWdlczogWw0KICAgIE5vIHJlbW90ZSBmaWxlc2V0cyB3ZXJlIGRl

bGV0ZWQsDQogICAgcmVtb3ZpbmcgZmlsZSBsaXN0ZWQgYXMgVGVtcG9yYXJ5OiBkdXBsaWNhdGkt

YmRiZWZjMzA2ZDRmYzQxYjc5MjQxN2EzZWRhZjU4MTk1LmRibG9jay56aXAuYWVzLA0KICAgIHJl

bW92aW5nIGZpbGUgbGlzdGVkIGFzIFRlbXBvcmFyeTogZHVwbGljYXRpLWliY2MxOGM0NjcyMWY0

MTBlOTc2ZTUxMDFhZWMyNDc1ZS5kaW5kZXguemlwLmFlcw0KXQ0KV2FybmluZ3M6IFtdDQpFcnJv

cnM6IFtdDQoNCg==

]
EmailServer.searchMesagePart(
^^ At this point it repeats the email contents for each variable, as before.

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

The log entry: Database.searchSrcDestPair(, ) means that the program isn't finding the source and destination properly. The source and destination should be shown between the parentheses, as in Database.searchSrcDestPair(Test,Success )

Can you remove the spaces after "Test" and before "Success" in your backup job name? Instead of

Test - Success

your backup job needs to be named

Test-Success (no spaces).

I did some testing and was able to reproduce this part of your problem. The regex parser is confused by the extra spaces between source and destination and not properly catching the source and destination names. Try running the job under the new name (or somehow editing the email subject lines) and rerun. Regex is a delicate thing, unfortunately. I need to update the docs on this point.

I'm still concerned that your email body looks like encoded text. If the email in question is from a Duplicati backup report, your log should look something like this:

Source/Destination pair [XXX/YYY] already in database.
Body=[DeletedFiles: 0DeletedFolders: 0ModifiedFiles: 0ExaminedFiles: 223314OpenedFiles: 0AddedFiles: 0SizeOfModifiedFiles: 0SizeOfAddedFiles: 0SizeOfExaminedFiles: 885093604482SizeOfOpenedFiles: 0NotProcessedFiles: 0AddedFolders: 0TooLargeFiles: 0FilesWithError: 0ModifiedFolders: 0ModifiedSymlinks: 0AddedSymlinks: 0DeletedSymlinks: 0PartialBackup: FalseDryrun: FalseMainOperation: BackupParsedResult: SuccessVerboseOutput: FalseVerboseErrors: FalseEndTime: 10/27/2017 3:15:24 PMBeginTime: 10/27/2017 2:59:48 PMDuration: 00:15:36.5097932Messages: [ No remote filesets were deleted, removing file listed as Temporary: duplicati-b3039a471f4a843cfb51316b892db4950.dblock.zip.aes, removing file listed as Temporary: duplicati-i127aa9063f0741cd95d96baaaf178c6d.dindex.zip.aes]Warnings: []Errors: []]
EmailServer.searchMesagePart(DeletedFiles: 0DeletedFolders: 0ModifiedFiles: 0ExaminedFiles: 223314OpenedFiles: 0AddedFiles:
.
.
.

The encoded text will definitely confuse the message parser, but let's fix one problem at a time. First, matching the proper source/destination pairs...

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

OK, i'll update the subject line. I'll also have to delete the existing reports and wait to see how they do going forward. I'll keep you posted.

EDIT: Actually, I just tried on my cPanel mailbox (which is empty except for Duplicati emails) and it parsed the emails just fine. It doesn't differentiate between the two backup jobs, but I do get a report.

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

A speedier option would be to run a manual job or two with the new job name. You won't get much in the way of file difference but you'll at least get couple of new emails to parse. Then run the program and it should only pick up the new result emails.

Your choice, I'm standing by either way. Thanks for helping me dig into this. :-)

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

OK I got a different error but the log seems like it thinks it was successful. I think the outbound connection times out and is closed before dR has a chance to finish going through the emails.

btw - I reread what you said. Originally I had added " - %PARSEDRESULT%" to the end of Duplicati's default subject line, so that I can see at a glance if there was a problem with the job or not. Is there some way I can do this and still use dR? Nevermind - reading your documentation on src/dest pairing, I will adjust accordingly

Traceback (most recent call last):
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 354, in send
self.sock.sendall(s)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 972, in sendall
v = self.send(byte_view[count:])
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 941, in send
return self._sslobj.write(data)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 642, in write
return self._sslobj.write(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "dupReport.py", line 255, in
globs.outServer.sendEmail(msgHtml, msgText)
File "C:\dupReport-2.1.0_Beta\dremail.py", line 484, in sendEmail
self.server.sendmail(sender, receiver, msg.as_string().encode('utf-8'))
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 861, in sendmail
(code, resp) = self.mail(from_addr, esmtp_opts)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 534, in mail
self.putcmd("mail", "FROM:%s%s" % (quoteaddr(sender), optionlist))
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 367, in putcmd
self.send(str)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 357, in send
raise SMTPServerDisconnected('Server not connected')
smtplib.SMTPServerDisconnected: Server not connected

Here is the end of log snippet:

sendNoBackupWarnings()
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT source, destination FROM backupsets ORDER BY source, destination]
getLatestTimestamp(, )
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT lastTimestamp FROM backupsets WHERE source = "" AND destination = ""]
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT max(endTimeStamp) FROM emails WHERE sourceComp = '' AND destComp= '' AND endTimeStamp > 1514610397.0]
Returning backupsets timestamp: 1514610397.0
report.pastBackupWarningThreshold(, , -1)
options.getRcOption(-, nobackupwarn)
Nobackup warning threshold is 0 days.
pastBackupWarningThreshold returning False
getLatestTimestamp(Test, Success)
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT lastTimestamp FROM backupsets WHERE source = "Test" AND destination = "Success"]
Database.execSqlStmt(): Executing SQL command.
SQL stmt=[SELECT max(endTimeStamp) FROM emails WHERE sourceComp = 'Test' AND destComp= 'Success' AND endTimeStamp > 1514612362.0]
Returning backupsets timestamp: 1514612362.0
report.pastBackupWarningThreshold(Test, Success, -1)
options.getRcOption(Test-Success, nobackupwarn)
Nobackup warning threshold is 0 days.
pastBackupWarningThreshold returning False
sendEmail(msgHtml=

Duplicati Backup Summary Report
DateTimeFiles+/-Size+/-AddedDeletedModifiedErrorsResult
Source: Destination:
No new activity. Last activity on 12/29/2017 at 21:06:37 (-1 days ago)
Source: Test Destination: Success
No new activity. Last activity on 12/29/2017 at 21:39:22 (-1 days ago)
Running Time: 960.377 seconds.
, msgText=Duplicati Backup Summary Report Date Time Files +/- Size +/- Added Deleted Modified Errors Result ***** Source: Destination: ***** No new activity. Last activity on 12/29/2017 at 21:06:37 (-1 days ago) ***** Source: Test Destination: Success ***** No new activity. Last activity on 12/29/2017 at 21:39:22 (-1 days ago) Running Time: 960.377 seconds. , subject=None, sender=None, receiver=None)

from dupreport.

dcurrey avatar dcurrey commented on May 25, 2024

ninja6o4 I once used the following with duplicti and dupreport

Duplicti would be setup as

--send-mail-subject=Duplicati %OPERATIONNAME% (%PARSEDRESULT%) report for %backup-name%

DupReport you would set subjectregex in the .rc file

subjectregex = ^Duplicati Backup \((Success|Error|Warn|Failed)\) report for 

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

Running for 960 seconds (16 minutes) will definitely cause the backend server to timeout. The good news is that should only affect the outgoing summary emails. The incoming Duplicati emails are all read on the front end of the program and that connection is constantly in use, so that shouldn't time out.

As a workaround during debugging, use the -f <file>,html option to send the output to <file> and the -x option to suppress the final summary email. That way you'll get the report without needing to send the final email. I've opened an issue about the timeout problem.

Using %PARSEDRESULT% is certainly an option. @dcurrey had a post about doing this in the Forum at this link. This post shows another way of doing it.

So the last question is should there have been entries in the final report, or is the "No new activity" message correct? I'm assuming the empty Source/Destination output is from the previous runs with the bad source/destination spec?

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

@dcurrey Thanks for the suggestion! I will do that.

@HandyGuySoftware I believe no activity to be correct, since no files were uploaded in my test runs. For the timeout, I don't know how python works, but is it possible to do the initial connection at the start to confirm working configuration, close it, and then start another connection at the end once it's ready to send?

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

So, I think I figured out the focus of dupReport 2.1.1: More efficient email handling! :-)

It's certainly possible, and in fact the outgoing email connection wasn't made until much later in the program in earlier releases. However, I added a feature to send warning emails to users if they pass the nobackupwarn threshold (see Issue #44 ) and that forced the connection to be opened earlier in the program run.

I'm currently researching the possibility of adding some sort of keepalive signal to the SMTP server as an interim measure. I'll let you know as soon as I work up something.

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

Just added a short keepalive routine in the read-messages loop. It's quick & dirty and I'm not sure it will work, but it's worth a try. Uploaded new code to the beta branch.

Remove the -x from the command line and let me know if you're still getting the timeout on the outbound server. (You can keep the -f if you want. I always find it easier to just look at the output file instead of always going back to email to see the result. Personal choice).

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

Looks like the whole thing is crashing out prematurely when SMTP is closed now. It's not finished scanning and has this error:

Traceback (most recent call last):
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 387, in getreply
line = self.file.readline(_MAXLINE + 1)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\socket.py", line 586, in readinto
return self._sock.recv_into(b)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 1009, in recv_into
return self.read(nbytes, buffer)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 871, in read
return self._sslobj.read(len, buffer)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\ssl.py", line 631, in read
v = self._sslobj.read(len, buffer)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "dupReport.py", line 216, in
globs.outServer.keepAlive()
File "C:\dupReport-2.1.0_Beta\dremail.py", line 506, in keepAlive
retval = self.server.noop()
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 514, in noop
return self.docmd("noop")
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 421, in docmd
return self.getreply()
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36-32\lib\smtplib.py", line 391, in getreply
+ str(e))
smtplib.SMTPServerDisconnected: Connection unexpectedly closed: [WinError 10054] An existing connection was forcibly closed by the remote host

Last entry in the log:

--===============1810988833==--
]
outServer.keepAlive()

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

So quick & dirty didn't work, so went with a more thorough approach. Expanded KeepAlive routine to re-connect to server if the connection drops. Tested on a 20-minute delay and seems to work. Give it a try and see if it fixes your timeout issue.

from dupreport.

ninja6o4 avatar ninja6o4 commented on May 25, 2024

Yup that did it. Thanks!

outServer.keepAlive()
Server smtp.office365.com timed out. Reconnecting.
EmailServer.Connect(protocol=[smtp] address=[smtp.office365.com] port=[587] account=[xxx.com] passwd=[*] encryption=[ssl])
Using SMTP
Logged in. retVal=235 retMsg=b'2.7.0 Authentication successful target host CY1PR17MB0647.namprd17.prod.outlook.com'
Program completed in 872.879 seconds. Exiting
Closing everything...
Closing inbound email server...
Closing outbound email server...
Closing database file...
Database.dbClose(): Closing database object.
Closing log file...

from dupreport.

HandyGuySoftware avatar HandyGuySoftware commented on May 25, 2024

Excellent! That'll close out two issues. Let me know if you're having any more problems processing emails as they start to build up.

In version 2.1.1 I'll do some work to make the reading of emails more efficient so you don't have to download all of them. That should cut down on your run time sufficiently!

from dupreport.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.