Comments (14)
I'll check it out, sounds like a weird bug.
from discord-history-tracker.
Also, what version of the tracker do you have? It's visible in Settings.
from discord-history-tracker.
I'm not sure if it's the same bug, but I ran into a similar problem (other party's nick getting replaced by mine) in my log.
Unfortunately, because of your userindex-based approach and varying ordering of the userindex list, there's a lot of these in the diffs I tried between my older dumps, making it difficult to identify when the bug occurred with trivial shell scripting:
- "u": 0
+ "u": 1
- "u": 1
+ "u": 0
from discord-history-tracker.
OK, I whipped up a quick little Python script to let me identify between which two old copies of the log one of the two user IDs became less prevalent despite more messages being added, and it looks like whatever went wrong happened between these two dumps:
-rw-rw-r-- 1 ssokolow ssokolow 1.8M Apr 28 19:14 dht.txt.old.13
-rw-rw-r-- 1 ssokolow ssokolow 2.1M Jul 3 21:59 dht.txt.old.14
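A script along those lines might look like the minimal sketch below. Only the userindex list and the "u" field are confirmed by this thread; the assumption that the list lives under meta.userindex, that data maps channel IDs to message IDs to message objects, and all function names are hypothetical.

```python
import json
from collections import Counter


def count_by_user_id(archive):
    """Count messages per user ID, resolving each "u" index via userindex."""
    userindex = archive["meta"]["userindex"]  # assumed location of the list
    counts = Counter()
    for messages in archive["data"].values():  # assumed: channel -> messages
        for message in messages.values():
            counts[userindex[message["u"]]] += 1
    return counts


def find_shrinking_users(dumps):
    """Yield (older_name, newer_name, user_id) where a user's count dropped.

    dumps is a list of (name, parsed_archive) pairs ordered oldest first.
    """
    for (old_name, old), (new_name, new) in zip(dumps, dumps[1:]):
        old_counts, new_counts = count_by_user_id(old), count_by_user_id(new)
        for user_id, old_count in old_counts.items():
            if new_counts[user_id] < old_count:
                yield old_name, new_name, user_id


def load_dumps(paths):
    """Load dumps from disk, oldest first."""
    loaded = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            loaded.append((path, json.load(handle)))
    return loaded
```

Calling list(find_shrinking_users(load_dumps(paths))) with the old copies in chronological order would report every adjacent pair of dumps between which some user ID lost messages, regardless of how the userindex list was ordered in each dump.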
That said, were I in your situation, I'd never have gone with the userindex-based approach specifically because it's so easy to introduce this kind of data-loss bug. It just feels like premature optimization at the cost of fragility to me to not store something like "u": "398493450724704277" in the first place. If space becomes an issue, either introduce gzip or implement chunking, depending on where it's becoming an issue.
from discord-history-tracker.
File size would definitely be an issue and unfortunately there are no browser APIs for compression (and a third party compression library would've made DHT several times larger and possibly not fit within bookmarklet/URL limits). Browsers can already take a long time to generate the download and run out of memory while tracking messages, which I think is beyond the realm of premature optimizations.
I'll look into the issue more, would be nice to have minimal reproduction steps but I suspect the issue is somewhere in archive combining code.
from discord-history-tracker.
> Browsers can already take a long time to generate the download and run out of memory while tracking messages, which I think is beyond the realm of premature optimizations.
I only use DHT on a single private conversation per file, but that concern did come to mind. Have you considered support for chunked output as a complement to the default "pause tracking on encountering already-seen messages" behaviour?
from discord-history-tracker.
As for reproduction code, if you can provide me with something more suitable for batch operation to test with (ideally, something that'll run on the command-line under Node.js), I kept every revision that dht.txt went through to hedge against just this kind of thing.
If I can trigger the problem with any of those, I can pare down the chatlog to something I'm willing to share.
from discord-history-tracker.
I haven't considered chunked output, DHT started as "save whatever your computer and browser can handle", so I tried to compact the JSON structure so that you could save a reasonable amount of messages (i.e. all of one person's DMs at minimum, up to a few hundred thousand messages).
What the project could really use is unit tests, but I'm barely working on it nowadays because it's low priority and there appear to be much more user-friendly alternatives :P but anyway, I'll go over the code and try to find a reproducible example.
from discord-history-tracker.
> What the project could really use is unit tests, but I'm barely working on it nowadays
That sounds familiar... though mine being low priority is far less my choice than I'd like. :)
from discord-history-tracker.
Oh, speaking of which...
> and there appear to be much more user-friendly alternatives
Which ones are you thinking of? Yours was the only Linux-compatible solution that turned up last time I googled around.
from discord-history-tracker.
> Which ones are you thinking of? Yours was the only Linux-compatible solution that turned up last time I googled around.
Fair enough, though Discord Chat Exporter has a multiplatform CLI version, which may still be more "user-friendly" than dealing with the mess I made :P. Haven't used it though, so I can't tell.
Anyway I found the bug, or at least one bug - the archive combining code has a safeguard in case a user was missing from the index, and it coerced "undefined" and "0" into the same thing, so a valid user was being considered invalid with a fallback to its original (and now wrong) ID from the other archive.
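The same class of mistake is easy to reproduce outside JavaScript. The sketch below is a Python analogue and not DHT's actual code: the function names and the index map are hypothetical. A truthiness check treats a missing entry (None) and a valid index 0 alike, so a user who maps to index 0 silently keeps the stale index from the other archive.

```python
def remap_index_buggy(index_map, old_index):
    """Translate an index from the other archive; wrong when the result is 0."""
    new_index = index_map.get(old_index)
    # Bug: 0 is falsy, so a valid mapping to index 0 looks like "missing".
    return new_index if new_index else old_index


def remap_index_fixed(index_map, old_index):
    """Fall back only when the user is genuinely absent from the map."""
    new_index = index_map.get(old_index)
    return old_index if new_index is None else new_index


# Hypothetical map: the other archive's index 1 is this archive's index 0.
index_map = {0: 1, 1: 0}
print(remap_index_buggy(index_map, 1))  # prints 1 (stale index kept)
print(remap_index_fixed(index_map, 1))  # prints 0
```

With the buggy version both old indexes end up as 1, which matches the symptom of one party's messages being attributed to the other.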
Stupid mistake on my part, but at least this could only happen when combining archives after tracking messages, which is probably why barely anyone noticed: the recommended steps are to upload the archive first and only then start tracking new messages.
Do you remember uploading the archive after tracking, or combining multiple archives together, around the time the diffs show the changes? Otherwise there may be more than one issue.
from discord-history-tracker.
I think I remember doing "upload, track, re-upload" (with the same dht.txt) at some point in time for some reason that now escapes me, but my memory for dates and times is terrible.
from discord-history-tracker.
Well, the diff looks like what happened in my test case and what happened to OP, but the OP mentioned renamed user accounts which didn't make sense.
I'll push the fix and close the issue, then. Unfortunately the only reliable way to fix corrupted archives is to load the archive and re-track all messages. Even with your full revision history, it'd probably take less time to re-track than to script a fix based on the diffs.
from discord-history-tracker.
I already re-dumped but, just in case, I might try a little script in the future to be sure.
It shouldn't be too difficult or time-consuming for me to whip up a little Python script which walks through from oldest to newest revision, building its own list of message dicts with user IDs rather than indexes, and then raise the alarm if, after the process is finished, there are any mismatches between the first appearance of a given message ID and the most recent dump's copy.
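A minimal sketch of that check follows. It assumes the same archive layout as the thread implies (a userindex list plus "u" indexes on messages), with the meta.userindex location, the channel-to-messages nesting under data, and the function names all being hypothetical.

```python
def messages_by_id(archive):
    """Map message ID -> user ID, resolving "u" through this dump's userindex."""
    userindex = archive["meta"]["userindex"]  # assumed layout
    resolved = {}
    for messages in archive["data"].values():  # assumed: channel -> messages
        for message_id, message in messages.items():
            resolved[message_id] = userindex[message["u"]]
    return resolved


def find_author_mismatches(revisions):
    """Compare each message's first-seen author with the newest dump's author.

    revisions is a list of parsed dumps ordered oldest to newest.
    Returns {message_id: (first_seen_user_id, latest_user_id)}.
    """
    first_seen = {}
    for archive in revisions:
        for message_id, user_id in messages_by_id(archive).items():
            # setdefault keeps the author from the oldest dump containing it.
            first_seen.setdefault(message_id, user_id)
    latest = messages_by_id(revisions[-1])
    return {
        message_id: (first_seen[message_id], user_id)
        for message_id, user_id in latest.items()
        if first_seen[message_id] != user_id
    }
```

An empty result would suggest the newest dump attributes every message to the same user ID as when it was first recorded; any entries point at messages whose author changed somewhere along the revision history.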
from discord-history-tracker.