osteele / matrix-archive
Export a Matrix room message archive
License: MIT License
When an image cannot be downloaded (for example, when the hosting server has gone offline), the script aborts with an AssertionError without downloading the rest of the images.
Stack Trace:
$ python download_images.py --no-thumbnails
Skipping 872 already-downloaded images
Downloading 73 new images...
Traceback (most recent call last):
File "download_images.py", line 63, in <module>
download_images()
File "/home/naka/.local/share/virtualenvs/matrix-archive-j2607j-8/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/naka/.local/share/virtualenvs/matrix-archive-j2607j-8/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/naka/.local/share/virtualenvs/matrix-archive-j2607j-8/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/naka/.local/share/virtualenvs/matrix-archive-j2607j-8/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "download_images.py", line 59, in download_images
run_downloads(new_messages, download_dir, prefer_thumbnails=thumbnails)
File "download_images.py", line 23, in run_downloads
assert res.status_code == 200
AssertionError
Rather than asserting a 200 response, I'd propose catching requests.exceptions.RequestException and continuing with the remaining images.
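A minimal sketch of that proposal, not the script's actual code: the function name download_all, its parameters, and the returned dictionaries are hypothetical. Note that requests does not raise on a non-200 status by itself, so the sketch calls raise_for_status() to turn 4xx/5xx responses into an HTTPError, which is a subclass of RequestException.

```python
import requests


def download_all(urls, session=None, timeout=30):
    """Fetch every URL, recording failures instead of aborting on the first one.

    Hypothetical helper for illustration; returns (downloaded, failed), where
    downloaded maps URL -> bytes and failed maps URL -> the exception raised.
    """
    session = session or requests.Session()
    downloaded, failed = {}, {}
    for url in urls:
        try:
            res = session.get(url, timeout=timeout)
            # Convert 4xx/5xx statuses into requests.exceptions.HTTPError
            res.raise_for_status()
        except requests.exceptions.RequestException as exc:
            # Covers connection errors, timeouts, and bad HTTP statuses
            failed[url] = exc
            continue
        downloaded[url] = res.content
    return downloaded, failed
```

The failed dictionary can be printed at the end of the run so that unreachable images are reported once, after everything else has been downloaded.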
I encountered this when exporting a large chat (137988 messages):
Traceback (most recent call last):
File "/media/d/temp/git/matrix-archive/export_messages.py", line 80, in <module>
export_archive()
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/media/d/temp/git/matrix-archive/export_messages.py", line 66, in export_archive
print(f"Writing {len(messages)} messages to {filename!r}")
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/mongoengine/queryset/queryset.py", line 62, in __len__
list(self._iter_results())
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/mongoengine/queryset/queryset.py", line 110, in _iter_results
self._populate_cache()
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/mongoengine/queryset/queryset.py", line 129, in _populate_cache
self._result_cache.append(next(self))
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/mongoengine/queryset/base.py", line 1619, in __next__
raw_doc = next(self._cursor)
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/cursor.py", line 1207, in next
if len(self.__data) or self._refresh():
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/cursor.py", line 1124, in _refresh
self.__send_message(q)
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/cursor.py", line 999, in __send_message
response = client._run_operation_with_response(
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1368, in _run_operation_with_response
return self._retryable_read(
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1471, in _retryable_read
return func(session, server, sock_info, slave_ok)
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1360, in _cmd
return server.run_operation_with_response(
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/server.py", line 135, in run_operation_with_response
_check_command_response(first, sock_info.max_wire_version)
File "/home/bodqhrohro/.local/lib/python3.8/site-packages/pymongo/helpers.py", line 160, in _check_command_response
raise OperationFailure(errmsg, code, response, max_wire_version)
pymongo.errors.OperationFailure: Executor error during find command :: caused by :: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit., full error: {'ok': 0.0, 'errmsg': 'Executor error during find command :: caused by :: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.', 'code': 96, 'codeName': 'OperationFailed'}
This can be worked around, though adding an actual index would be better.
Thanks for sharing this script!
It would be cool if it supported end-to-end-encrypted rooms too. Do you think you could add support?
I have set up matrix-archive as described in https://github.com/osteele/matrix-archive, set the environment variables (MATRIX_*), and run pipenv run list, but I get this traceback:
Signing into https://my_matrix_server...
Traceback (most recent call last):
File "list_rooms.py", line 23, in <module>
list_rooms()
File "/root/.local/share/virtualenvs/matrix-archive-8WJWP9GR/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/root/.local/share/virtualenvs/matrix-archive-8WJWP9GR/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/root/.local/share/virtualenvs/matrix-archive-8WJWP9GR/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/root/.local/share/virtualenvs/matrix-archive-8WJWP9GR/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "list_rooms.py", line 13, in list_rooms
rooms = matrix_client().get_rooms()
File "/opt/matrix-archive/matrix_connection.py", line 21, in matrix_client
password=MATRIX_PASSWORD)
File "/root/.local/share/virtualenvs/matrix-archive-8WJWP9GR/lib/python3.6/site-packages/matrix_client/client.py", line 249, in login_with_password
return self.login(username, password, limit, sync=True)
File "/root/.local/share/virtualenvs/matrix-archive-8WJWP9GR/lib/python3.6/site-packages/matrix_client/client.py", line 280, in login
self._sync()
File "/root/.local/share/virtualenvs/matrix-archive-8WJWP9GR/lib/python3.6/site-packages/matrix_client/client.py", line 562, in _sync
for room_id, invite_room in response['rooms']['invite'].items():
KeyError: 'invite'
I use Python 3.6.5 from a Docker container:
docker run -it python:3.6.5 bash
pip install --upgrade pip
pip install pipenv
cd /opt/
git clone https://github.com/osteele/matrix-archive.git
cd matrix-archive/
pipenv install
export MATRIX_HOST=https://my_matrix_server
export MATRIX_USER=my_username
export MATRIX_PASSWORD=my_password
pipenv run list
I use 2fa3a22129b8 matrixdotorg/synapse:latest. Thanks.
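The KeyError above suggests that the sync response from this server had no 'invite' entry under 'rooms', while the client indexes it unconditionally. The sketch below shows the tolerant access pattern on a hand-written dictionary; iter_room_section is a hypothetical helper, not part of matrix_client, and the sample response is made up for illustration.

```python
def iter_room_section(sync_response, section):
    """Yield (room_id, room_data) pairs for one section of a sync response.

    Missing or empty keys are treated as "no rooms" rather than an error.
    """
    rooms = sync_response.get("rooms") or {}
    yield from (rooms.get(section) or {}).items()


# Hypothetical response that lists joined rooms but has no 'invite' key
response = {"rooms": {"join": {"!abc:example.org": {"timeline": {"events": []}}}}}

invites = list(iter_room_section(response, "invite"))
joined = [room_id for room_id, _ in iter_room_section(response, "join")]
```

Here invites is simply empty instead of raising, and joined still lists the one room in the sample response.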