telegram-history-dump's Introduction

telegram-history-dump

This utility is the successor to telegram-json-backup, written from the ground up in Ruby. It can create backups of your Telegram user and (super)group dialogs using telegram-cli's remote control feature.

Compared to the old project, telegram-history-dump:

  • Has better support for media downloads
  • Supports output formats other than JSON and is extensible with custom formats
  • Supports incremental backup (only new messages are downloaded)
  • Does not depend on unstable Python/Lua bindings within telegram-cli
  • Has a separate YAML formatted configuration file

The default configuration will back up all dialogs to a directory named output, in JSON format, without downloading any media.

Usage

First time setup

  1. Compile telegram-cli and start it once to link your Telegram account
  2. Make sure Ruby 2+ is installed on your system: ruby --version
  3. Optionally configure your backup routine by editing config.yaml

Performing a backup

  1. Start telegram-cli with at least the following options: telegram-cli --json -P 9009
  2. While telegram-cli is running, execute the script: ruby telegram-history-dump.rb

Formatters

History will always be stored in JSON Lines compliant files. However, additional output formats can be produced by uncommenting a few lines in the configuration file.
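Because the files are JSON Lines, each line is a separate JSON document and a file as a whole is not a single JSON value. A minimal sketch of reading such data in Ruby follows; the helper name parse_jsonl and the sample messages are illustrative assumptions, not part of the tool.

```ruby
require 'json'

# Parse JSON Lines text: one independent JSON document per non-empty line.
# parse_jsonl is a hypothetical helper, not part of telegram-history-dump.
def parse_jsonl(text)
  text.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

# Made-up sample data using field names that appear in real dumps.
sample = <<~JSONL
  {"event":"message","id":1,"text":"hello","date":1470169330}
  {"event":"message","id":2,"text":"world","date":1470169390}
JSONL

messages = parse_jsonl(sample)
puts messages.map { |m| m['text'] }.join(' ')
```

For a real dump, the same helper could be fed File.read on one of the .jsonl files in the output directory.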

You can enable one or more of the following formatter modules:

html creates styled, paginated chat logs viewable with a web browser.

plaintext creates human-readable text files, organized as one file per day.

bare outputs only the actual message texts without any context. It is meant for linguistic / statistical analysis.

pisg creates daily logs compatible with the EnergyMech IRC logging format as input for the PISG chat statistics generator. Also see telegram-pisg.

You can also implement a custom formatter; see formatters/lib/formatter_base.rb for details.

Command line options

Most of the backup configuration is done through the config file, but a few specific options are available as CLI options. None of them are mandatory.

Usage: telegram-history-dump.rb [options]
    -c, --config=cfg.yaml            Path to YAML configuration file
    -k, --kill-tg                    Kill telegram-cli after backup
    -h, --help                       Show help
    -d, --dir=DIR                    Subdirectory for output files
                                     (relative to backup_dir in YAML config)
    -l, --limit=LIMIT                Maximum number of messages to backup
                                     for each target (overrides YAML config)

Notes

Usage notes:

  • It is possible to run telegram-cli on a different machine, e.g. as a daemon on a server. In this case you must pass --accept-any-tcp to telegram-cli and firewall the port appropriately to prevent unwanted exposure. Keep in mind that some options regarding media files will not work in a remote setup.
  • Be careful with decreasing chunk_delay or increasing chunk_size. Telegram seems to rate limit history requests. Going too fast may cause an operation to time out and force the script to skip part of a dump.
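To illustrate how chunk_size and chunk_delay interact, here is a rough sketch of a chunked request loop. It is not the script's actual code: the helper chunk_ranges, the total of 250 messages, and the values used are assumptions chosen for illustration, not the tool's defaults.

```ruby
# Split message positions 1..total into [first, last] chunks of at most chunk_size.
# chunk_ranges is a hypothetical helper, not part of telegram-history-dump.
def chunk_ranges(total, chunk_size)
  (1..total).step(chunk_size).map do |first|
    [first, [first + chunk_size - 1, total].min]
  end
end

chunk_size  = 100   # messages per history request (illustrative value)
chunk_delay = 0.05  # seconds to pause between requests (illustrative value)

ranges = chunk_ranges(250, chunk_size)
ranges.each do |first, last|
  puts "Dumping range #{first}-#{last}" # placeholder for the real history request
  sleep(chunk_delay)                    # a longer pause means fewer requests per second
end
```

A larger chunk_size means fewer but heavier requests, and a smaller chunk_delay means more requests per second; both push the request rate up.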

Telegram-cli issues known to affect telegram-history-dump:

  • vysheng/tg#947 can cause crashes when dumping channels with more than 100 messages.
  • vysheng/tg#904 can cause crashes when dialogs contain certain media files. If you get this, recompile telegram-cli with the suggested workaround.

telegram-history-dump's People

Contributors

4r0n05, amalani, araishikeiwai, felipesanches, gorlug, hennes-maertins, hiyorimi, lgommans, mildsunrise, phylliida, the-glu, tmmsartor, tvdstaaij

telegram-history-dump's Issues

Backup group before becoming supergroup

If I do a dump of a supergroup I only back up until the date it was converted to supergroup. How could I back up conversations and media from before being converted to supergroup?

Output

Hi, Thanks for all your help.

I ran the whole thing successfully! However, it outputs only JSON files, even after I uncomment formatters. It doesn't matter which one I uncomment; nothing other than the JSON file is produced. Can you please help me?

Thanks again
M

dumping channel has no author/user associated with the messages

When I dump a channel there's no author associated with any of the messages.

When I look at the jsonl file there is a "to" and "from" for each message but it looks like it's just the channel. Which sounds weird, but the "from" -> "id" matches the channel's ID.

"from":{ "id":"$050000008e0df13b169d06XXXXXXXX", "peer_type":"channel", "peer_id":XXXXXXXXX, "print_name":"XXXXXXXXX", "flags":65601, "title":"XXXXXXXX", "participants_count":0, "admins_count":0, "kicked_count":0 }, "to": the same,

Poking around the telegram-cli I'm able to see the list of users in the channel, but when I run something like history my_channel_name I get the same thing (channel ID where I think should be getting the actual user who sent the message).

Is this a bug in telegram-cli? If so, does anyone know of a work around? If that's all it is maybe this could go on your list of known issues.

(And this is a wonderful little utility, thanks so much, even with the issues I've had, it's saved me a lot of time.)

Formatting problem

I am using the new version and I have a problem with formatting.
I want the plaintext with the configuration below:

formatters: {
plaintext: {
date_format: '%Y-%m-%d %H:%M:%S'
},
}

But the app still uses json to export. Would you please kindly help?

JSONL / conversation / day

The JSONL format is nice. However, files get big, and I would like to process (encrypt) them on a daily basis. I looked at the output of the plain text formatter, but that misses a lot of meta data.

Is it possible to have one JSONL file per conversation per day?
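As a possible workaround outside the tool, the existing JSONL output could be split by day in a post-processing step. The sketch below assumes each message carries a "date" field in Unix seconds, as in the dumps quoted elsewhere on this page; the helper group_by_day and the sample messages are hypothetical.

```ruby
require 'json'

# Group parsed messages by the UTC calendar day of their "date" field (Unix seconds).
# group_by_day is a hypothetical helper, not part of telegram-history-dump.
def group_by_day(messages)
  messages.group_by { |m| Time.at(m['date']).utc.strftime('%Y-%m-%d') }
end

# Made-up sample messages; real ones would be parsed from a .jsonl file.
messages = [
  { 'id' => 1, 'date' => 0 },
  { 'id' => 2, 'date' => 3600 },
  { 'id' => 3, 'date' => 86_400 }
]

by_day = group_by_day(messages)
by_day.each do |day, msgs|
  # A real script would write each group to its own file, one JSON document per line.
  puts "#{day}: #{msgs.map { |m| JSON.generate(m) }.join(' ')}"
end
```

Grouping by UTC day is a design choice in this sketch; local time could be used instead by dropping the utc call.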

Fails with latest telegram-cli

telegram-history-dump.rb fails with latest telegram-cli. The error is

$ ./telegram-history-dump.rb 
I, [2016-04-26T14:40:55.234242 #8579]  INFO -- : Attaching to telegram-cli control socket at localhost:9009
I, [2016-04-26T14:40:55.346012 #8579]  INFO -- : Skipping 0 dialogs: (none)
I, [2016-04-26T14:40:55.346120 #8579]  INFO -- : Backing up 57 dialogs: [omissis]
I, [2016-04-26T14:40:55.365495 #8579]  INFO -- : Dumping "user#3417111" (range 1-100)
/home/giovanni/software/telegram/telegram-history-dump/dumpers/lib/dumper_interface.rb:39:in `>': comparison of String with 14 failed (ArgumentError)
    from /home/giovanni/software/telegram/telegram-history-dump/dumpers/lib/dumper_interface.rb:39:in `msg_fresh?'
    from ./telegram-history-dump.rb:84:in `block in dump_dialog'
    from ./telegram-history-dump.rb:67:in `reverse_each'
    from ./telegram-history-dump.rb:67:in `dump_dialog'
    from ./telegram-history-dump.rb:273:in `block in <main>'
    from ./telegram-history-dump.rb:269:in `each'
    from ./telegram-history-dump.rb:269:in `each_with_index'
    from ./telegram-history-dump.rb:269:in `<main>'

It works correctly with commit 160231b of telegram-cli. I did not have time to find the precise commit that makes it fail. I'm trying this on an up-to-date Debian sid system.

Backup chat with special contact

Hello

I want to back up my conversations with a specific contact. I tried this by editing config.yaml and adding the contact's username, print name, or ID to backup_users, but it does not work!

Is there any way to do this?

Sometimes last message is dumped again

The tool is working great, except for a small bug: for some chats at every execution of telegram-history-dump the last message is added again to the jsonl file (and therefore to the other formatters). For example:

{"event":"message","unread":false,"out":false,"text":"Такой статьи у меня в базе пока нет!\nI don't have this article in database yet","id":"010000004bebb509376a0000000000004c0bdeeeb86ea71d","flags":256,"from":{"first_name":"Sci-Hub","id":"$010000004bebb5094c0bdeeeb86ea71d","username":"scihubot","peer_type":"user","peer_id":162917195,"print_name":"Sci-Hub","flags":1,"last_name":""},"to":{"id":"$010000005fd382005407330023a867ff","first_name":"Giovanni","username":"giomasce","peer_type":"user","peer_id":8573791,"print_name":"Giovanni_Mascellani","flags":524289,"when":"2016-08-02 22:30:57","last_name":"Mascellani","phone":"[omitted]"},"service":false,"date":1470169330}
{"event":"message","unread":false,"out":false,"text":"Такой статьи у меня в базе пока нет!\nI don't have this article in database yet","id":"010000004bebb509376a0000000000004c0bdeeeb86ea71d","flags":256,"from":{"first_name":"Sci-Hub","id":"$010000004bebb5094c0bdeeeb86ea71d","username":"scihubot","peer_type":"user","peer_id":162917195,"print_name":"Sci-Hub","flags":1,"last_name":""},"to":{"id":"$010000005fd382005407330023a867ff","first_name":"Giovanni","username":"giomasce","peer_type":"user","peer_id":8573791,"print_name":"Giovanni_Mascellani","flags":524289,"when":"2016-08-02 22:27:21","last_name":"Mascellani","phone":"[omitted]"},"service":false,"date":1470169330}

At each execution, another similar row will be added to the top of the file. It happens with just a few chats, maybe one or two of the dozens I have. I cannot see any pattern: I used to think that it only happened when the last message was an attachment, but the example above disproves that.
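Until the cause is found, duplicates like the two rows above could be filtered out after the fact, since they share the same "id" and differ only in the "when" field of the "to" peer. A minimal sketch, assuming every line has an "id" field; dedupe_by_id is a hypothetical helper and the sample lines are made up.

```ruby
require 'json'
require 'set'

# Keep only the first line seen for each message "id".
# dedupe_by_id is a hypothetical helper, not part of telegram-history-dump.
def dedupe_by_id(lines)
  seen = Set.new
  lines.select do |line|
    id = JSON.parse(line)['id']
    seen.add?(id) # Set#add? returns nil when the id was already present
  end
end

# Made-up sample: the first two lines share an id but differ in another field.
lines = [
  '{"event":"message","id":"aa01","to":{"when":"2016-08-02 22:30:57"}}',
  '{"event":"message","id":"aa01","to":{"when":"2016-08-02 22:27:21"}}',
  '{"event":"message","id":"bb02","to":{"when":"2016-08-02 22:27:21"}}'
]

unique = dedupe_by_id(lines)
puts unique
```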

partial backup

Hi and thanks for your script.
I want to back up my chats every day, but doing it this way takes a lot of time because I'm a member of many groups.
Can you add the ability to back up only the changes?

backup history from deleted account

My problem is that deleted accounts have no name that I can insert into the "backup_users" array in the config file. In dialog_list they are named like "deleted user#123456789"; I tested this name, but it does not work.
On the other hand, if I leave the array empty, these deleted accounts do not appear under "Backing up dialogs".
So how can I back up history from a deleted account?

Reduce memory usage

Running on a machine with 1GB RAM is hellishly slow when processing a decent number of messages, would be a great help if you made it not load so much into memory :) Love the project, very useful for some of my own :D

Backup chats by chat ids

Hi @tvdstaaij thank you for the great tool!

I have a question about backing up chats (I could try it myself but right now I'm dumping chats which has taken so long and probably will take more time).

What if I want to setup which chat/group to backup by their chat ids? Since many of my groups have 'very dynamic' chat titles (the title may change up to 10 times a day 😂)

I was thinking about writing the ids to the YAML file and then start the tg-cli with --disable-names option.

Will that work?

convert jsonl to html manually

Hello

Is there any way to convert .jsonl files to .html manually? (In my case I can no longer access telegram-cli, and telegram-history-dump stopped while it was working.)

View Counts of a channel message

Hi,
When I dump messages from a public channel, there is no attribute for the view count, which is visible in the official Telegram app. Is it not included in telegram-history-dump, or can I just not find it?

connection refused !

I start telegram-cli by bin/telegram-cli --json -P 9009 then type ctrl + c to halt the window then type ruby ./telegram-history-dump.rb . then I get the following error :

:~/tg# ruby ./telegram-history-dump.rb
I, [2016-08-22T22:26:18.023648 #5512]  INFO -- : Attaching to telegram-cli control socket at localhost:9009
./telegram-history-dump.rb:26:in `initialize': Connection refused - connect(2) for "localhost" port 9009 (Errno::ECONNREFUSED)
        from ./telegram-history-dump.rb:26:in `open'
        from ./telegram-history-dump.rb:26:in `connect_socket'
        from ./telegram-history-dump.rb:276:in `<main>'

What should I do now?
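ECONNREFUSED means nothing is listening on the port, which is expected if telegram-cli was stopped with ctrl + c before the script ran. A small sketch for checking this from Ruby before starting a dump; port_open? is a hypothetical helper, and localhost:9009 is taken from the log above.

```ruby
require 'socket'

# Return true if a TCP connection to host:port can be established within the timeout.
# port_open? is a hypothetical helper, not part of telegram-history-dump.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

if port_open?('localhost', 9009)
  puts 'Something is listening on port 9009; the dump script should be able to attach.'
else
  puts 'Nothing is listening on port 9009; start telegram-cli with --json -P 9009 and leave it running.'
end
```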

No dialogs found

telegram-history-dump$ ruby telegram-history-dump.rb
I, [2016-07-28T22:58:33.908942 #19975]  INFO -- : Attaching to telegram-cli control socket at localhost:9009
telegram-history-dump.rb:282:in `<main>': No dialogs found (RuntimeError)

undefined method 'each' when dumping supergroup as channels

Read over the issues surrounding supergroups, used both the main branch and the dev branch you recommended. Doesn't seem to be an issue on the telegram-cli side of things anyway but I wanted to rule that out.

When trying to dump my group, I get the following:

brandon@LINUX:~/telegram-history-dump$ ruby telegram-history-dump.rb
I, [2016-06-28T21:45:20.620050 #28188]  INFO -- : Attaching to telegram-cli control socket at localhost:9009
telegram-history-dump.rb:161:in 'backup_target?': undefined method 'each' for "SuperGroup":String (NoMethodError)
    from telegram-history-dump.rb:268:in 'block in <main>'
    from telegram-history-dump.rb:250:in 'each'
    from telegram-history-dump.rb:250:in '<main>'
brandon@LINUX:~/telegram-history-dump$

I used the script perfectly for dumping a single conversation and tested with a normal group which worked fine.

Here is my config:
---

  backup_users: [
null
  ]

  backup_groups: [
null
  ]

  backup_channels: SuperGroup

Let me know if I am just missing something or if you need any more debug info.
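The error message suggests that the script calls each on the backup_channels value, which fails here because the YAML above defines it as a plain string rather than a list. The sketch below reproduces that difference with Ruby's YAML library; it is not the script's own code, and the Array() normalization is only one possible way to tolerate a scalar.

```ruby
require 'yaml'

# The same shape as the config above: two lists and one scalar string.
config = YAML.safe_load(<<~CONFIG)
  backup_users: [null]
  backup_groups: [null]
  backup_channels: SuperGroup
CONFIG

raw = config['backup_channels']
puts raw.class                 # a String, which has no #each method
puts raw.respond_to?(:each)

# Kernel#Array wraps a scalar in a one-element array and leaves arrays unchanged.
channels = Array(raw)
channels.each { |name| puts "Would back up channel: #{name}" }
```

Writing the value as a list in the config file, for example backup_channels: [SuperGroup], would give the script an array to iterate over.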

Absolute path for backup_dir

Hi,

I set the backup_dir in config.yaml to:

backup_dir: '/home/araishikeiwai/Google Drive/Others/Telegram Backups'

But the output is stored in ./home/araishikeiwai/Google Drive/Others/Telegram Backups (relative path)

EDIT

I misread the line:
# It his is a relative path it will be relative to the script's directory

Did you mean "if this is a" or "this is a"?

If the latter, could you make it so that we can provide absolute path?
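For reference, one common way to support both kinds of path in Ruby is File.expand_path with a base directory: relative paths are resolved against the base, while absolute paths are returned as they are. This is a sketch of that technique, not necessarily how the script resolves backup_dir; resolve_backup_dir and the script directory /opt/telegram-history-dump are assumptions.

```ruby
# Resolve backup_dir against the script's directory unless it is already absolute.
# resolve_backup_dir is a hypothetical helper, not part of telegram-history-dump.
def resolve_backup_dir(backup_dir, script_dir)
  File.expand_path(backup_dir, script_dir)
end

script_dir = '/opt/telegram-history-dump' # assumed install location

relative = resolve_backup_dir('output', script_dir)
absolute = resolve_backup_dir('/home/araishikeiwai/Google Drive/Others/Telegram Backups', script_dir)

puts relative
puts absolute
```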

Conversion Error

how to solve this error?
298:in `join': no implicit conversion of nil into String (TypeError)

get smilies in html-output

is it possible to get the smilies from telegram to the html-output!?? at this time it shows only characters like this one:

🙈😱😳🙈😒�

thx for your work!!!

Dumping channels with the test branch in cygwin

I have built the test branch like this:

  1. Installed Complete Cygwin
  2. Then Cloned the test branch:
    git clone --recursive https://github.com/vysheng/tg.git -b test --single-branch tg-test
  3. configured (./configure)
  4. Patched the makefile and loop.c as Instructed Here
  5. executed make command.
  6. The telegram-cli is built without any problem I run telegram-cli with the following parameters:
    telegram-cli --json -P 9009
    as Instructed.
    I also run the "ruby telegram-history-dump/telegram-history-dump.rb" command in another terminal.
    But it skips all the dialogs and says:
I, [2016-01-30T21:07:27.425554 #4304]  INFO -- : Loading dumper module 'json'
I, [2016-01-30T21:07:27.427640 #4304]  INFO -- : Attaching to telegram-cli control socket at localhost:9009
I, [2016-01-30T21:09:53.108433 #4304]  INFO -- : Skipping 16 dialogs: "کتابخانه_الف#1", "++maryam", "Maman", "خانوادمون", "کتابخانه_الف", "Top10kala.ir", "Ali_Nfr", "Amani", "Mohamad_Ferdosi", "Ali_Sedaghat", "Saeed_Zhiani", "جوكستان_الف", "بچه_های_رجایی", "PWUTX", "کتابخانه_الف", "Top10kala.ir"
I, [2016-01-30T21:09:53.108510 #4304]  INFO -- : Backing up 0 dialogs: (none)
I, [2016-01-30T21:09:53.108537 #4304]  INFO -- : Finished

(It gets stuck a little bit after Attaching to telegram-cli control socket at localhost:9009 and then skips the dialogs and finishes)
I have tried changing the config.json5 But it doesn't affect anything. I have dumped history using the master branch build of telegram-cli and I didn't have any problem. but as you can see I can't dump anything with the test branch build. Any help would be appreciated.

Separate downloading data from exporting to formats

I think the program should always dump to one data structure and then export it into an output format like pisg. Jsonl(.gz) or sqlite seem good candidates for the 'internal' dump format because they are popular and portable (easy to work with when using external tools).

Advantages of this change:

  • Changing the output format (dumper) no longer causes a complete re-download, which is very inefficient and completely unnecessary.
  • Similarly, upgrading the chosen data structure (e.g. new pisg version) would not require a complete re-download.
  • If a backup is interrupted, no data would be lost. 'Complex' data formats that do not support append/prepend-only formatting (such as html) currently need to wait until the whole backup is complete before they can write files to disk.
  • Old downloads would not need to be re-parsed upon updates (partial downloads). For example to update an html export currently, you need to parse old html files, add the new messages, and finally dump everything into (a) file(s) again.
  • I think having the option of exporting to different formats, as it does now, adds a lot of value to the software. It could be said that external programs can format the data into any imaginable complex format (Unix philosophy: write programs that do one thing and do it well) but I think the exporting feature makes the software more useful and belongs in the project.

Reasons against this change:

  • Duplicate data: all events would be stored both in the internal database as well as zero or more exported formats. It only applies to text and not media, though, so it compresses nicely.
  • The cost of change (rewriting, testing, documenting).

Some background: despite being new to Ruby, I've been working on an html dumper to make browseable day-by-day dumps of chats, but ran into problems (basically everything mentioned above). I would also have liked to dump my chats into pisg format for the statistics program, just for fun, but this is not possible without downloading everything again. That's quite a waste of resources so I decided not to do that. Overall it seems like a good idea to change the way the dumping works because it makes it more modular.

In my case, it would make it easier to write a formatter/dumper/exporter: I could just look in the internal data structure to see what fields there are and only need to figure out how to convert that to my desired output using Ruby code. Currently I have to look at which events exist and are called when and handle them properly (start_backup, dialog_start, dump_msg, get_filename_for_date, etc.), I have to worry about downloads being interrupted due to errors or user interruption, a backup might be an update/partial... Not all of it is hard to understand or do, and some Ruby experience certainly helps as well, but altogether it's a relatively big task even as an otherwise quite experienced programmer.

I was/am very tempted to just write a simple python script to dump the jsonl into html, which would take no time at all. The reason I didn't is because I'd like to contribute back, both to the community in general and to this project in particular (because I like it and think it's very useful). I could have uploaded the python script to my own profile, but without a link from here to that parser, nobody would benefit unless they are specifically looking for an html export and randomly stumble upon my repository. Also it's good for me to practice something new (Ruby), but having to understand the whole program first goes a bit far, even if it is quite a simple program. Data conversion is a lot simpler to do than writing a dumper in its current form (plus the other advantages mentioned).

I'm also thinking of using this program as a back-end for allowing users to back up their chat history through a website, putting all chats in readable format (e.g. html) in a zip, but that goal is way beyond the scope of this application (and this issue) of course.

The change to this code base doesn't even have to take that long if someone experienced with Ruby does it: the current jsonl files can be kept and other formats just use the existing files as database. If there are no existing files, call the json dumper first. Other formats need to be more or less rewritten, but bare and plaintext are simple enough, and even pisg doesn't look like it would take too much time. I'd contribute a html dumper in that case (I'm hereby committing myself ;) ).

In other news, looking at the size of this post by now, I think I put way too much thought into this.

unexpected token (JSON::ParserError)

Hi!
I've been using this tool for a while without any issue until yesterday.
When i launch telegram-history-dump program quits with this exception right after connecting to telegram-cli:

/usr/lib/ruby/2.1.0/json/common.rb:155:in 'parse': 757: unexpected token at User *one_name*: 0 unread (JSON::ParserError)

from /usr/lib/ruby/2.1.0/json/common.rb:155:in parse
from telegram-history-dump.rb:37:in <main>
from telegram-history-dump.rb:278:in <main>

one_name: a Telegram Name, always the same one

How can I fix this?
Thank you!

Formatting media

is there any way to include media (videos, images, and captions included) when formatting as html?

Can't dump with test branch of tg-cli

I can't combine Test version of tg-cli + telegram-history-dump to dump anything.

I can dump users and groups if I use the master tg-cli (without channels_backup - with channels_backup it unsurprisingly throws error)

but if I switch to the test branch of tg-cli... telegram-history-dump runs without throwing error including channels_backup but doesn't actually dump anything.

I'm a newbie so I Followed suggestions here by @Polpetta / @vysheng - to install the test branch of telegram-cli

suspect I'm making an obvious error somewhere - but as a newbie I can't find it...

Is possible to backup media?

I used your script to back up a chat history, but the media it downloaded seems to be corrupted (I couldn't open it in an image viewer, and it appears as a broken file in the html output).

Is there any way to download media correctly?

Thank you in advance =)

Thank you!

This is not really an issue, but a compliment! Your script worked flawless on my system, and i got all my precious chat messages. Thank you so much! As a thank-you gift i wrote a little Python script that converts from the JSONL format to CSV for easier viewing.

Missing messages in groups, from people not in contact list. Need to re-run the script.

I've been trying to export message history from some group chats. I got messages like the following in the process:

W, [2015-10-25T16:29:00.220902 #29023]  WARN -- : Message without date: {"event"=>"message", "id"=>372588}

And the json line ends up like:

{"event":"message","id":372588}

These come from certain users not in my contact list; but there are some others not in my contact list whose messages are exported correctly and I can't tell what determines whose messages are exported and whose not.

Then I noticed that if I re-run the script with same settings, no warnings are shown and all messages are exported. Like if they were cached in the first run so second one goes smoothly; maybe asynchronous calls to get the user info?

I repeatedly get this behavior with different chats. I also get the same result if I delete my ~/.telegram-cli folder and start again.

Name too long - ignoring it

I want to download everything except one group, because its name is too long (261 chars).
Is it possible to ignore some groups/chats?

Not dumping voice messages

This isn't dumping the voice messages… is there an option somewhere I didn't notice, or is this feature missing?

  download_media:
    photo: true
    document: true
    audio: true
    video: true

Pictures are downloaded fine.

{"event":"message","unread":false,"id":69985,"flags":258,"media":{"type":"???"}

Invalid JSON

Hey, this produces invalid json files with multiple root elements. Is this intentional?
Makes automatic parsing of the output files impossible...

Crashing when trying to backup ImageBot conversation

I'm not quite sure if this is the issue of your awesome backup script, or of the telegram-cli.
It seems like old content from @imagebot is no longer available and when the backup got to that conversation, both the telegram-cli and the backup script crashed.

Sadly, I couldn't simply add the ImageBot to a blacklist and backup everything else, so I deleted the chat history (of ImageBot) and now it works fine. (But it wasn't very important anyways)

I wouldn't class it as a high-priority issue, but it would be great if it didn't crash just because of some deleted content. ;)

Here are the logs:

telegram-history-dump:

I, [2016-02-17T16:10:42.932076 #4652]  INFO -- : Dumping "ImageBot" (range 1-100)
E, [2016-02-17T16:10:43.052428 #4652] ERROR -- : Failed to download media file: no implicit conversion of nil into String
E, [2016-02-17T16:10:43.053209 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.053626 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.053914 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.054132 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.054244 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.054463 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.054590 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.054679 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.054788 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.055270 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.055480 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.055955 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.056277 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.056553 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.056802 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.057216 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.057532 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.057897 #4652] ERROR -- : Failed to download media file: Broken pipe
E, [2016-02-17T16:10:43.058162 #4652] ERROR -- : Failed to download media file: Broken pipe
telegram-history-dump.rb:32:in `write': Broken pipe (Errno::EPIPE)
        from telegram-history-dump.rb:32:in `puts'
        from telegram-history-dump.rb:32:in `exec_tg_command'
        from telegram-history-dump.rb:58:in `block in dump_dialog'
        from /usr/lib/ruby/2.1.0/timeout.rb:91:in `block in timeout'
        from /usr/lib/ruby/2.1.0/timeout.rb:35:in `block in catch'
        from /usr/lib/ruby/2.1.0/timeout.rb:35:in `catch'
        from /usr/lib/ruby/2.1.0/timeout.rb:35:in `catch'
        from /usr/lib/ruby/2.1.0/timeout.rb:106:in `timeout'
        from telegram-history-dump.rb:57:in `dump_dialog'
        from telegram-history-dump.rb:193:in `block in <main>'
        from telegram-history-dump.rb:189:in `each'
        from telegram-history-dump.rb:189:in `each_with_index'
        from telegram-history-dump.rb:189:in `<main>'

telegram-cli:

> SIGNAL received
h/opt/telegram-cli/bin/telegram-cli(print_backtrace+0x20)[0x46f7a0]
/opt/telegram-cli/bin/telegram-cli(termination_signal_handler+0x64)[0x46f824]
/lib/x86_64-linux-gnu/libc.so.6(+0x35180)[0x7ff027458180]
/opt/telegram-cli/bin/telegram-cli(tgl_do_load_photo+0xd)[0x49e10d]
/opt/telegram-cli/bin/telegram-cli(interpreter_ex+0x7d6)[0x477c76]
/opt/telegram-cli/bin/telegram-cli[0x46fd05]
/usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(+0x1b2aa)[0x7ff02979e2aa]
/usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc)[0x7ff0297933dc]
/opt/telegram-cli/bin/telegram-cli(net_loop+0xa4)[0x470cb4]
/opt/telegram-cli/bin/telegram-cli(loop+0x195)[0x471f75]
/opt/telegram-cli/bin/telegram-cli(main+0x2c4)[0x46e084]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7ff027444b45]
/opt/telegram-cli/bin/telegram-cli[0x46e180]

Exclude bots

It would be nice to have an option to exclude backing up bots, I have one that sends me a bunch of notifications daily and none of it is anything I need backed up. Alternatively, selectively excluding users similar to #40.

Store history by ID/username instead of display name?

Is it possible to make this store the chat history by username or user ID instead of the display name? Most of my friends change their display names rather regularly, so it throws off properly appending to their old logs on a subsequent incremental backup. Keeping a map of username/ID/display name could be used to convert it to display name for things like the HTML output.

html.rb undefined method

After dumping the file from Telegram the html formatter quits with "html.rb:195:in `pagination': undefined method `[]' for nil:NilClass (NoMethodError)"

Supergroups

Telegram-history-dump appears to be unable to dump messages from supergroups. When I perform a complete backup, it always comes up with "Skipping x dialogs: [names of every supergroup I'm in]". Even specifying JUST one of these in the config file still has it skipped. Strangely, the entire chat history from before I migrated the group comes up fine, under the name of the normal group that no longer exists.

Any word on this behavior?

Relative media paths

Excerpt from output/json/john_doe.jsonl:

{"event":"message","id":"010000008d6ba40c4104000000000000c4a5a5cf527658c5","flags":256,"from":{"id":"$010000008d6ba40cc4a5a5d536234sb6","peer_type":"user","peer_id":3634534f9,"print_name":"John_Doe","flags":65537,"first_name":"John","when":"2016-11-28 22:39:35","last_name":"Doe","phone":"4917661903543"},"to":{"id":"$0100000018ec6a110aca4f3645aff3d6","peer_type":"user","peer_id":292223423,"print_name":"Foo_Bar","flags":524289,"first_name":"Foo","when":"2016-11-28 22:39:53","last_name":"Bar","phone":"491741344354","username":"foobar"},"out":false,"unread":false,"service":false,"date":1480176787,"media":{"type":"photo","caption":"","file":"/home/foobar/telegram-history-dump/output/media/John_Doe/download_535454028_154356.jpg"}}

Note the absolute path:

/home/foobar/telegram-history-dump/output/media/John_Doe/download_535454028_154356.jpg

Absolute media paths break when migrating output to a different location. Is it possible to let the tool use relative paths?
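Until the tool offers this, existing dumps could be rewritten in a post-processing step. A minimal sketch using Ruby's Pathname, with the path from the excerpt above; relative_media_path is a hypothetical helper, and treating the output directory as the base is an assumption.

```ruby
require 'pathname'

# Express an absolute media path relative to the backup's output directory.
# relative_media_path is a hypothetical helper, not part of telegram-history-dump.
def relative_media_path(absolute_file, output_dir)
  Pathname.new(absolute_file).relative_path_from(Pathname.new(output_dir)).to_s
end

output_dir = '/home/foobar/telegram-history-dump/output'
file = '/home/foobar/telegram-history-dump/output/media/John_Doe/download_535454028_154356.jpg'

rel = relative_media_path(file, output_dir)
puts rel # prints media/John_Doe/download_535454028_154356.jpg
```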

Exception: Channel_list

Running for the first time, I get "can not find command 'Channel_List'"

Has this been removed by Telegram-CLI?

Possible temporary ban for using this tool

Do you know if it is possible to be temporarily banned for using this tool? I have chats going back 2-3 years with 4000 shared media files, and this tool could take several hours to do the job.

I wrote to a member of the support that says "i think using this tool is not "normal usage of telegram" so could be some temporal limitations like a temporary ban".

But i know that he had no idea how the tool works and i think that there's nothing to be worried about. What do you think?

Selectively exclude channels

It would be nice if one could also opt out of channels rather than opting into them. Say you want every group you're in except for one really large group that gets 2k+ messages a day. Instead of having to manually add each one, it'd be nice to just exclude the big one(s) so that you don't miss new groups.

Backup from first message

Hi, thank you for your good script...
I would like to know whether this script can dump messages starting from the first message of a dialog.
Or is there any way to dump a dialog for a specific period of time, for example chats from January to March?
Best regards
