
Memento is a Python app that records everything you do on your computer and lets you go back in time, search, and chat with an LLM (Large Language Model) to retrieve information about what you did.

License: MIT License

Python 100.00%

llms productivity pygame python

memento's Introduction

Memento


demo_memento.mp4

This project is heavily inspired by rewind.ai

How it works:

The app takes a screenshot every 2 seconds
It compiles the screenshots into h264 video segments for storage efficiency
It uses OCR to extract text from the images
It indexes the text in a sqlite3 database and a vectordb
It uses FTS5 to search the text
It uses an LLM (GPT through OpenAI's API) to chat with the timeline
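The indexing and search steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not Memento's actual schema: the table name `frames`, its columns, and the sample rows are assumptions, and it requires an SQLite build with FTS5 enabled.

```python
import sqlite3

# Hypothetical schema: one row per OCR'd screenshot (not Memento's real tables).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE frames USING fts5(captured_at UNINDEXED, window_title, ocr_text)"
)
conn.executemany(
    "INSERT INTO frames VALUES (?, ?, ?)",
    [
        ("2023-08-01T10:00:00", "Terminal", "pip install -e . finished successfully"),
        ("2023-08-01T10:00:02", "Firefox", "tesseract ocr language packs documentation"),
        ("2023-08-01T10:00:04", "Editor", "def search(query): run the full text query"),
    ],
)


def search(query, limit=10):
    """Return (captured_at, window_title) of matching frames, most recent first."""
    cur = conn.execute(
        "SELECT captured_at, window_title FROM frames "
        "WHERE frames MATCH ? ORDER BY captured_at DESC LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()


print(search("tesseract"))
```

Marking `captured_at` as UNINDEXED keeps the timestamp out of the full-text index while still letting results be sorted by it.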

Branches:

The main branch is the latest release
The dev branch contains the latest "stable" improvements that will be merged into main periodically
Any other branch is a feature currently being developed

Disk space and performance considerations

Right now, Memento produces about 120 MB of data per hour
We are working on ways to reduce this number
TODO: profile the CPU usage of Memento
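To put the 120 MB per hour figure in perspective, here is a rough back-of-the-envelope estimate. It assumes the 2-second capture interval from "How it works", decimal units (1 MB = 1000 KB), and a constant data rate; the helper name `storage_gb` is made up for this example.

```python
CAPTURE_INTERVAL_S = 2   # one screenshot every 2 seconds
MB_PER_HOUR = 120        # figure quoted above

frames_per_hour = 3600 // CAPTURE_INTERVAL_S          # 1800 frames
kb_per_frame = MB_PER_HOUR * 1000 / frames_per_hour   # average stored size per frame


def storage_gb(hours_per_day, days):
    """Estimated disk usage in GB, assuming a constant data rate."""
    return MB_PER_HOUR * hours_per_day * days / 1000


print(f"{frames_per_hour} frames/hour, about {kb_per_frame:.0f} KB per frame")
print(f"8 hours a day for 30 days: about {storage_gb(8, 30):.1f} GB")
```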

Installation

This project was tested on Ubuntu 22.04.

$ pip install -e .

You also need to install tesseract-ocr on your system. To install the latest version (Tesseract 5.x.x):

$ sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
$ sudo apt update
$ sudo apt install tesseract-ocr

Then install the language packs you need, for example:

$ sudo apt install tesseract-ocr-eng
$ sudo apt install tesseract-ocr-fra

You also need to set an environment variable (this is the path on my system; it may be different on yours):

export TESSDATA_PREFIX=/usr/share/tesseract-ocr/5/tessdata/

If you want to chat with the timeline through an LLM, you need an OpenAI API key in your environment as OPENAI_API_KEY.
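Before launching, it can help to verify that both variables are visible to Python. The helper below is a hypothetical convenience, not part of Memento; it only checks that the variables are set and that the tessdata path is a directory.

```python
import os


def check_environment(env=os.environ):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    tessdata = env.get("TESSDATA_PREFIX")
    if not tessdata:
        problems.append("TESSDATA_PREFIX is not set")
    elif not os.path.isdir(tessdata):
        problems.append(f"TESSDATA_PREFIX points to a missing directory: {tessdata}")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set; LLM features will be unavailable")
    return problems


for problem in check_environment():
    print("warning:", problem)
```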

Usage

Run the background process:

$ memento-bg

Run the timeline:

$ memento-timeline

Controls:

Scroll horizontally or vertically to navigate the timeline.
Ctrl+scroll to zoom the timeline in and out.
Hover over the timeline to see a preview of the screenshot at that time; click to navigate there.
Press d for debug mode (useful for development).
Ctrl+F to open the search sidebar.
Ctrl+T to open the chat sidebar.
Click and drag on a screenshot to select text, then Ctrl+C to copy it to the clipboard.

Contributing:

Feel free to contribute!

Fork the repo, and submit a PR to the dev branch.


memento's Issues

[Timeline] Handle different screen sizes

[Backend] Maybe go back to previous embedding function

Did it run locally?

With OpenAI embeddings you could get:

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised 
APIConnectionError: Error communicating with OpenAI: HTTPSConnectionPool(host='api.openai.com', port=443): Max retries 
exceeded with url: /v1/embeddings (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 
0x7f8289f978b0>: Failed to resolve 'api.openai.com' ([Errno -3] Temporary failure in name resolution)")).

[Timeline] Show that the background process is running

A little icon somewhere

[Timeline chat] Currently selected text as context

[Timeline] Sliding window timebar

Hey, I ran into this issue on an Ubuntu 22.04 virtual machine running on a real macOS machine.

[Timeline] Better search UI / UX

Show latest match first, iterate backwards
Search buffer ? -> faster search
filter by apps / time period

[App] Make a nice logo :)

libXrandr.so.2 problem after memento-bg

6.5.0-15-generic
Linux machine 6.5.0-15-generic #15~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC x86_64 x86_64 x86_64 GNU/Linux
Linux Mint
i5 12400 integrated graphics

useruser1@computerr:/$ memento-bg
pygame 2.5.0 (SDL 2.28.0, Python 3.10.8)
Hello from the pygame community. https://www.pygame.org/contribute.html
No OPENAI_API_KEY environment variable found, LLM related features will not be available
Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/bin/memento-bg", line 33, in <module>
    sys.exit(load_entry_point('memento', 'console_scripts', 'memento-bg')())
  File "/home/useruser1/Documents/folder1/folder2/memento/memento/__init__.py", line 6, in bg
    backgound = Background()
  File "/home/useruser1/Documents/folder1/folder2/memento/memento/background.py", line 62, in __init__
    self.sct = mss.mss()
  File "/home/linuxbrew/.linuxbrew/opt/[email protected]/lib/python3.10/site-packages/mss/factory.py", line 34, in mss
    return linux.MSS(**kwargs)
  File "/home/linuxbrew/.linuxbrew/opt/[email protected]/lib/python3.10/site-packages/mss/linux.py", line 302, in __init__
    self.xrandr = cdll.LoadLibrary(_XRANDR)
  File "/home/linuxbrew/.linuxbrew/opt/[email protected]/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/home/linuxbrew/.linuxbrew/opt/[email protected]/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libXrandr.so.2: cannot open shared object file: No such file or directory

I tried this, but the issue remains:

sudo aptitude install libxrandr2
sudo aptitude install libxrandr2:i386
sudo aptitude install libxrandr-dev
sudo apt install libxrandr-dev:i386

Sorry, guys, if I am being stupid. I am a newbie and I created this GitHub account only to solve this issue and use your software.
Great software, by the way.

[Timeline] Fix text selection

Currently broken

[Timeline] Ability to select region in image and run ocr on it

Like in normcap

[Backend] opencv + av imshow not working

Known issue:

https://stackoverflow.com/questions/72604912/cant-show-image-with-opencv-when-importing-av

Maybe I'll use pygame to display UI and everything

[Backend] Try langchain integration of unstructured as is ?

https://python.langchain.com/docs/integrations/document_loaders/image

let unstructured do the work
returns a document object with metadata, put this in vectordb as is
see if it is better for the chat
no need for custom json metadata anymore ?

[Timeline] Choose a nice color palette (material / pastel?) and stick with it

initialize it at first launch, update it when new app is seen

[App] Profile the cpu load and average storage consumption

memento-timeline just showing a completely black window instead of the expected GUI.

As the title states. I've had to make some modifications to work with my Hyprland environment in the memento/background.py file and now memento-bg is working for me and taking the expected screenshots, etc. However when I run memento-timeline all I get is a fully-black window. There's no error-output in the terminal and I haven't made any modifications to the files in memento/timeline so I'm not quite sure what the issue is. Any help would be appreciated. Thanks.

[Backend] Cache metadata ?

Maybe not, it's one big file anyways
Maybe split metadata the same way as video for faster loading and better caching

Run memento-bg error

OS: macOS Monterey
Device: MacBook Pro M1
Python: 3.10

error log:
Traceback (most recent call last):
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/bin/memento-bg", line 33, in <module>
    sys.exit(load_entry_point('memento', 'console_scripts', 'memento-bg')())
  File "/Users/zhoulingfeng/Desktop/code/Memento/memento/__init__.py", line 6, in bg
    backgound = Background()
  File "/Users/zhoulingfeng/Desktop/code/Memento/memento/background.py", line 77, in __init__
    self.workers[i].start()
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/zhoulingfeng/miniconda3/envs/memnto/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'sqlite3.Connection' object
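The traceback goes through popen_spawn_posix, so the workers are started with the "spawn" method, which pickles everything handed to the child process. A live sqlite3 connection cannot be pickled, while a database path can. The sketch below illustrates that difference and the usual workaround of opening the connection inside the worker. The names `can_cross_process_boundary` and `worker` are hypothetical, and this is not a patch against Memento's background.py.

```python
import pickle
import sqlite3


def can_cross_process_boundary(obj):
    """True if obj survives pickling, which 'spawn' requires of anything sent to a child."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


def worker(db_path):
    """Hypothetical worker body: open the connection inside the child process."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT 1").fetchone()[0]
    finally:
        conn.close()


live_connection = sqlite3.connect(":memory:")
print(can_cross_process_boundary(live_connection))  # a connection object
print(can_cross_process_boundary(":memory:"))       # a plain path string
```

With this pattern, a process would be created with the path as its argument (for example `multiprocessing.Process(target=worker, args=(db_path,))`) rather than with an object that holds an open connection.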

[Backend] Optimize storage and computing power by not saving and recomputing existing (or very close?) frames

need to find a fast and smart way to diff images
rely on app classification ?
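One cheap way to diff frames is to compare only a sparse sample of bytes from the two raw pixel buffers. This is a sketch under assumptions, not something Memento does today: the function name, the sampling step, and both thresholds are made up and would need tuning on real screenshots.

```python
def frames_are_similar(a, b, step=97, tolerance=8, max_changed_ratio=0.01):
    """Compare two equally sized raw pixel buffers by sampling every `step`-th byte.

    A sampled byte counts as changed when it differs by more than `tolerance`;
    the frames are similar when at most `max_changed_ratio` of samples changed.
    """
    if len(a) != len(b):
        return False
    sampled = range(0, len(a), step)
    changed = sum(1 for i in sampled if abs(a[i] - b[i]) > tolerance)
    return changed <= max_changed_ratio * len(sampled)


previous = bytes([10]) * 30000
current = bytes([200]) * 15000 + bytes([10]) * 15000  # half of the frame changed
print(frames_are_similar(previous, previous))
print(frames_are_similar(previous, current))
```

Sampling keeps the cost independent of most of the image, at the price of missing small changes that fall between samples.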

[Timeline] Gracefully handle timeline startup when no records are found

[Backend] Optimize cpu consumption for background process

Right now it is too much

Try to restrict to one cpu
Mostly chromadb add that takes all cpus at once
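Restricting the process to one CPU can be sketched with `os.sched_setaffinity`, which is only available on some Unix platforms, Linux among them. The helper name is hypothetical, and whether this is enough to tame the chromadb workload is untested here.

```python
import os


def pin_to_single_cpu(cpu=None):
    """Restrict the current process to one CPU where the OS supports it.

    Returns the resulting affinity set, or None on platforms without
    sched_setaffinity (for example macOS and Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)
    target = cpu if cpu is not None else min(allowed)
    os.sched_setaffinity(0, {target})
    return os.sched_getaffinity(0)


print(pin_to_single_cpu())
```

Calling it early, before any worker threads or processes are started, lets them inherit the restricted affinity.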

[Timeline] Handle special keys in search and chat

[Backend] Smart fusing of frames into segments in db

If we detect overlap in the text of frames that are next to each other, we can aggregate the text into a segment for better llm context
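The overlap test could be as simple as a Jaccard similarity on the word sets of neighbouring frames. The sketch below assumes each frame is reduced to one OCR text string; the function names and the 0.5 threshold are illustrative choices, not tuned values.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def fuse_frames(texts, threshold=0.5):
    """Group consecutive frame texts whose overlap with the previous frame is >= threshold."""
    segments = []
    for text in texts:
        if segments and jaccard(segments[-1][-1], text) >= threshold:
            segments[-1].append(text)
        else:
            segments.append([text])
    return segments


frames = ["open the report draft", "open the report draft now", "buy milk and eggs"]
print(fuse_frames(frames))
```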

[App] Better way to run memento

Just a command "memento". The timeline opens, you can activate/deactivate recording. When you close it, the recording process still runs in background

[Backend] Use paragraphs instead of words in chromadb

Adding a lot of independent words to chromadb takes a long time and is inefficient in terms of semantic search
Try to merge the bboxes to make paragraphs; the OCR can run efficiently on crops of images
Then, when querying, get the paragraph and re-run OCR to get the precise word?
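Merging word boxes into paragraph boxes could start from a simple vertical-gap rule. This sketch assumes words arrive as `(text, x, y, w, h)` tuples and that words on the same line share the same `y`. It ignores columns entirely, and the function name and `max_gap_ratio` are made up for illustration.

```python
def merge_into_paragraphs(words, max_gap_ratio=1.0):
    """Merge word boxes into paragraphs based on vertical gaps.

    words: list of (text, x, y, w, h).
    Returns a list of (paragraph_text, (x, y, w, h)).
    """
    paragraphs = []  # each entry: {"words": [...], "box": [x0, y0, x1, y1]}
    for text, x, y, w, h in sorted(words, key=lambda t: (t[2], t[1])):
        current = paragraphs[-1] if paragraphs else None
        # Same paragraph if the word starts within max_gap_ratio * h of its bottom edge.
        if current is not None and y - current["box"][3] <= max_gap_ratio * h:
            current["words"].append(text)
            box = current["box"]
            box[0], box[1] = min(box[0], x), min(box[1], y)
            box[2], box[3] = max(box[2], x + w), max(box[3], y + h)
        else:
            paragraphs.append({"words": [text], "box": [x, y, x + w, y + h]})
    return [
        (" ".join(p["words"]),
         (p["box"][0], p["box"][1], p["box"][2] - p["box"][0], p["box"][3] - p["box"][1]))
        for p in paragraphs
    ]


words = [
    ("world", 60, 10, 40, 12),
    ("Hello", 10, 10, 40, 12),
    ("again", 10, 24, 40, 12),
    ("Footer", 10, 200, 50, 12),
]
print(merge_into_paragraphs(words))
```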

[Timeline] Dynamically refresh timeline

[Backend] Build a good context for llm query

The vectordb contains sentences, their bounding boxes, and some metadata (window title and date)
When making an LLM query, we should custom-build a relevant context
- Retrieve relevant frames
- Extract all text -> try to package it in blocks / paragraphs

Windows

Is there even a chance for it to work on Windows?

Currently, the tesserocr wheels don't compile on Windows, and the installed binary is not found.

Also, OPENAI_API_KEY cannot be found no matter what I do. I guess I need to try to restart the system as a last resort. Yes, restarting actually helped somehow.

> memento-bg
pygame 2.5.0 (SDL 2.28.0, Python 3.10.12)
Hello from the pygame community. https://www.pygame.org/contribute.html
Traceback (most recent call last):
  File "C:\Users\dzidm\miniconda3\Scripts\memento-bg-script.py", line 33, in <module>
    sys.exit(load_entry_point('memento', 'console_scripts', 'memento-bg')())
  File "c:\users\dzidm\downloads\memento\memento\__init__.py", line 6, in bg
    backgound = Background()
  File "c:\users\dzidm\downloads\memento\memento\background.py", line 75, in __init__
    self.workers[i].start()
  File "C:\Users\dzidm\miniconda3\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\dzidm\miniconda3\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\dzidm\miniconda3\lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
  File "C:\Users\dzidm\miniconda3\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\dzidm\miniconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'sqlite3.Connection' object
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\dzidm\miniconda3\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\dzidm\miniconda3\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

For now I am giving up on this...

[Timeline] rework the UI of the timeline (like rewind.ai)

[Timeline] Sort by apps + different found frames display

like in rewind.ai

[Timeline UI] Better ui for navigation on long recordings (and live)

It is hard to know where we are right now
It takes a long time to go far back in time

One solution: drag the timeline

As metadata
To give more context ?
Use an image-to-text solution?

test_h264.py contains a first try

need to be able to find back frames from timestamps
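Looking up a frame from a timestamp can be sketched as a binary search over the sorted capture times. The layout assumed here (a fixed number of frames per video segment, timestamps in seconds) and the function name are hypothetical and not taken from test_h264.py.

```python
from bisect import bisect_right


def locate_frame(timestamps, query, frames_per_segment=5):
    """Find the last frame captured at or before `query`.

    timestamps: sorted capture times in seconds.
    Returns (segment_index, frame_index_in_segment), or None if `query`
    precedes the first frame.
    """
    i = bisect_right(timestamps, query) - 1
    if i < 0:
        return None
    return divmod(i, frames_per_segment)


capture_times = list(range(0, 20, 2))  # one frame every 2 seconds
print(locate_frame(capture_times, 5))
print(locate_frame(capture_times, 12))
```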

Something related to "time machine"? Or just "Time Machine", actually
Best idea yet : "Memento"
Multivac
Simply "Timeline" or "Timewarp"