This project forked from flepied/second-brain-agent

🧠 Second Brain AI agent

License: GNU General Public License v3.0

Introducing the Second Brain AI Agent Project: Empowering Your Personal Knowledge Management

Are you overwhelmed with the information you collect daily? Do you often find yourself lost in a sea of markdown files, videos, web pages, and PDFs? What if there's a way to seamlessly index, search, and even interact with all this content like never before? Welcome to the future of Personal Knowledge Management: The Second Brain AI Agent Project.

πŸ“ Inspired by Tiago Forte's Second Brain Concept

Tiago Forte's groundbreaking idea of the Second Brain has revolutionized the way we think about note-taking. It's not just about jotting down ideas; it's about creating a powerful tool that enhances learning and creativity. To learn more, see Building a Second Brain by Tiago Forte.

💼 What Can the Second Brain AI Agent Project Do for You?

  1. Automated Indexing: No more manually sorting through files! Automatically index the content of your markdown files along with contained links, such as PDF documents, YouTube videos, and web pages.

  2. Smart Search Engine: Ask questions about your content, and our AI will provide precise answers, using the robust OpenAI Large Language Model. It's like having a personal assistant that knows your content inside out!

  3. Effortless Integration: Whether you follow the Second Brain method or have your own unique way of note-taking, our system seamlessly integrates with your style, helping you harness the true power of your information.

  4. Enhanced Productivity: Spend less time organizing and more time innovating. By accessing your information faster and more efficiently, you can focus on what truly matters.

✅ Who Can Benefit?

  • Professionals: Streamline your workflow and find exactly what you need in seconds.
  • Students: Make study sessions more productive by quickly accessing and understanding your notes.
  • Researchers: Dive deep into your research without getting lost in information overload.
  • Creatives: Free your creativity by organizing your thoughts and ideas effortlessly.

🚀 Get Started Today

Don't let your notes and content overwhelm you. Make them your allies in growth, innovation, and productivity. Join us in transforming the way you manage your personal knowledge and take the leap into the future.

Details

If you take notes in markdown files, whether you follow the Second Brain method or your own, this project automatically indexes the content of the markdown files and the links they contain (PDF documents, YouTube videos, web pages) and lets you ask questions about your content using the OpenAI Large Language Model.

The system is built on top of the LangChain framework and the ChromaDB vector store.

The system takes as input a directory where you store your markdown notes. For example, I take my notes with Obsidian. The system then processes any change in these files automatically with the following pipeline:

graph TD
A[Markdown files from Obsidian]-->B[Text files from markdown and pointers]-->C[Text Chunks]-->D[Vector Database]-->E[Second Brain AI Agent]

From a markdown file, transform_md.py extracts the text, then follows the links inside the file (PDF documents, URLs, YouTube videos) and transforms their content into text. There is some support for extracting history data from markdown files: if a file has a ## History section or its name contains History, it is split into multiple parts according to <day> <month> <year> sections such as ### 10 Sep 2023.
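The history splitting described above can be sketched as follows. This is only an illustration of the idea, not the code in transform_md.py: the function name `split_history`, the regular expression, and the returned data shape are all assumptions.

```python
import re
from datetime import datetime

# Matches level-3 headings such as "### 10 Sep 2023" (assumed heading format).
DATE_HEADING = re.compile(r"^### (\d{1,2} [A-Za-z]{3} \d{4})[ \t]*$", re.MULTILINE)


def split_history(markdown_text):
    """Return a list of (date, body) pairs, one per dated section."""
    matches = list(DATE_HEADING.finditer(markdown_text))
    sections = []
    for i, match in enumerate(matches):
        # A section runs from the end of its heading to the next dated heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown_text)
        # "%b" parses abbreviated month names in the current locale (English assumed).
        day = datetime.strptime(match.group(1), "%d %b %Y").date()
        sections.append((day, markdown_text[match.end():end].strip()))
    return sections


notes = """## History

### 10 Sep 2023

Met with the team.

### 12 Sep 2023

Reviewed the indexing pipeline.
"""

for day, body in split_history(notes):
    print(day.isoformat(), "->", body)
```

Keeping the date with each part would let every part be indexed as its own document, so a question about a given day could match only that day's entry.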

From these text files, transform_txt.py breaks the text into chunks, creates vector embeddings for them, and stores the embeddings in a vector database.
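To give a feel for the chunking step, here is a minimal character-based splitter with overlap. It is a sketch under assumptions: the name `chunk_text`, the sizes, and the character-based strategy are illustrative, and the project itself may rely on a LangChain text splitter instead.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text cut at one
    boundary is repeated at the start of the next chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "abcdefghij" * 5  # 50 characters
for chunk in chunk_text(sample, chunk_size=20, overlap=5):
    print(len(chunk), chunk)
```

Each chunk would then be passed to an embedding model and stored with its vector; that part is omitted here because it depends on the embedding model and the vector store configuration.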

The second brain agent uses the vector database to answer questions about your documents with a large language model.
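The retrieval idea behind this step can be sketched in plain Python: rank stored chunks by cosine similarity to the question's embedding, then place the best matches in the prompt sent to the language model. Everything below is hypothetical (the toy 3-dimensional vectors, the function names, the prompt wording); a real setup would use actual embeddings and the vector store's own query API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, store, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble a prompt that restricts the model to the retrieved chunks."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"


# Toy vectors stand in for real embeddings.
store = [
    ("LangChain is a framework for LLM applications.", [0.9, 0.1, 0.0]),
    ("ChromaDB is a vector store.", [0.1, 0.9, 0.0]),
    ("Obsidian is a markdown note-taking app.", [0.0, 0.1, 0.9]),
]
query_vector = [0.8, 0.2, 0.0]  # pretend embedding of "What is LangChain?"

print(build_prompt("What is LangChain?", top_k(query_vector, store)))
```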

Installation

You need a Python 3 interpreter, poetry, and inotify-tools installed. All of this has been tested under Fedora Linux 38 on my laptop and Ubuntu latest in the CI workflows. Let me know if it works on your system.

Get the source code:

$ git clone https://github.com/flepied/second-brain-agent.git

Copy the example .env file and edit it to suit your settings:

$ cp example.env .env
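The settings themselves are listed in example.env and are not reproduced here. As a rough illustration of how a script could read a file of that shape, the sketch below parses KEY=VALUE lines. The key names `NOTES_DIR` and `INDEX_DIR` are invented for the example and are not the project's real settings.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical content; the real keys are defined in example.env.
example = """
# where the markdown notes live (hypothetical key)
NOTES_DIR="/home/me/notes"
# where generated files are written (hypothetical key)
INDEX_DIR=/home/me/.second-brain
"""

settings = parse_env(example)
print(settings["NOTES_DIR"])
print(settings["INDEX_DIR"])
```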

Install the dependencies using poetry:

$ poetry install

There is a bug involving poetry, torch, and PyPI. To work around it, run:

$ poetry run pip install torch

Then to use the created virtualenv, do:

$ poetry shell

systemd services

To install systemd services that automatically manage the different scripts when the operating system starts, use the following command (requires sudo access):

$ ./install-systemd-services.sh

To see the output of the md and txt services:

$ journalctl --unit=sba-md.service
$ journalctl --unit=sba-txt.service

Doing a similarity search with the vector database

$ ./similarity.py "What is LangChain?" type=note
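The trailing `type=note` argument looks like a metadata filter that restricts the search to documents of a given type. Assuming that reading is right, the sketch below shows one way such arguments could be turned into a filter; `parse_args` and `matches` are hypothetical helpers, not functions taken from similarity.py.

```python
def parse_args(argv):
    """Split arguments into a query string and a dict of key=value filters."""
    query, *pairs = argv
    filters = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        filters[key] = value
    return query, filters


def matches(metadata, filters):
    """True when the document metadata satisfies every filter."""
    return all(metadata.get(key) == value for key, value in filters.items())


query, filters = parse_args(["What is LangChain?", "type=note"])

# Hypothetical document metadata records.
documents = [
    {"type": "note", "title": "LangChain basics"},
    {"type": "pdf", "title": "Vector databases"},
]
print(query, filters)
print([doc["title"] for doc in documents if matches(doc, filters)])
```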

Launching the web UI

Run this command to start the web UI:

$ streamlit run second_brain_agent.py
  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8502
  Network URL: http://192.168.121.112:8502

Here is an example:

Screenshot

Development

Install the extra dependencies using poetry:

$ poetry install --with test

And then run the tests, like this:

$ poetry run pytest

pre-commit

Before submitting a PR, make sure to activate pre-commit:

$ poetry run pre-commit install
