clara's People

Contributors

cristobalcl

clara's Issues

Rate limit reached for default-text-embedding-ada-00

openai.error.RateLimitError: Rate limit reached for default-text-embedding-ada-002 in organization org-foo on tokens per min. Limit: 1000000 / min. Current: 0 / min. Contact us through our help center at help.openai.com if you continue to have issues.

The MediaWiki code base is 130+ MB, and ingesting it exceeds the rate limit.

Perhaps a switch to slow down the creation of the vector DB, so that we do not hit this limit, would be nice for larger codebases.

Although perhaps not... a repo of this size might just go over the token limit anyway.
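A rough sketch of what such a "slow down" switch could look like, assuming the embedding requests can be sent in batches. Nothing here is clara's or OpenAI's actual API: `TokenBudget`, `batch_by_tokens`, and `estimate_tokens` are hypothetical names, and the 4-characters-per-token estimate is only a crude heuristic.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def batch_by_tokens(
    texts: Iterable[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> Iterator[Tuple[List[str], int]]:
    """Group texts into batches whose estimated size stays within max_tokens.

    A single text larger than max_tokens still gets a batch of its own.
    """
    batch: List[str] = []
    size = 0
    for text in texts:
        n = count_tokens(text)
        if batch and size + n > max_tokens:
            yield batch, size
            batch, size = [], 0
        batch.append(text)
        size += n
    if batch:
        yield batch, size


class TokenBudget:
    """Fixed 60-second window: sleep when the next batch would exceed the limit."""

    def __init__(self, tokens_per_minute: int,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = tokens_per_minute
        self.sleep = sleep
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def acquire(self, tokens: int) -> None:
        now = self.clock()
        if now - self.window_start >= 60:
            # The previous window has expired; start a fresh one.
            self.window_start, self.used = now, 0
        if self.used + tokens > self.limit:
            # Wait out the rest of the window, then start a fresh one.
            self.sleep(max(0.0, 60 - (now - self.window_start)))
            self.window_start, self.used = self.clock(), 0
        self.used += tokens
```

The ingestion loop would then call `budget.acquire(size)` before sending each batch to the embedding endpoint. This only spreads requests over time; it does not help if the overall job is larger than the account's quota allows.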

Refactor code base ingestion

The way the files are ingested is very inefficient.

Also, LangChain has since released its own approach for parsing code, and we should leverage it: using the new LanguageParser will improve and simplify the code.

ERROR: No matching distribution found for clara-ai

I'm trying to use your superpower tool on a Mac M1. Unfortunately, the installation fails with an error.
Is this repository suitable for Mac M1?

(clara-code-reader) @macbook-pro clara % pip install clara-ai
ERROR: Could not find a version that satisfies the requirement clara-ai (from versions: none)
ERROR: No matching distribution found for clara-ai

Directories with "." in their name appear to break the parser.

Loading /home/foo/Github/mediawiki/resources/src/mediawiki.less …  index.py:51
Traceback (most recent call last):
  File "/home/foo/may15venv/lib64/python3.11/site-packages/langchain/document_loaders/text.py", line 40, in load
    with open(self.file_path, encoding=self.encoding) as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IsADirectoryError: [Errno 21] Is a directory: '/home/foo/Github/mediawiki/resources/src/mediawiki.less'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/foo/may15venv/bin/clara", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/cli.py", line 177, in main
    fire.Fire(Clara())
  File "/home/foo/may15venv/lib64/python3.11/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/cli.py", line 105, in chat
    index, chat = setup(path, memory_storage)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/cli.py", line 30, in setup
    index.ingest()
  File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/index.py", line 73, in ingest
    texts = self._get_texts()
            ^^^^^^^^^^^^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/index.py", line 53, in _get_texts
    documents.extend(loader.load_and_split())
                     ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/langchain/document_loaders/base.py", line 43, in load_and_split
    docs = self.load()
           ^^^^^^^^^^^
  File "/home/foo/may15venv/lib64/python3.11/site-packages/langchain/document_loaders/text.py", line 56, in load
    raise RuntimeError(f"Error loading {self.file_path}") from e
RuntimeError: Error loading /home/foo/Github/mediawiki/resources/src/mediawiki.less
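The traceback suggests that a directory whose name ends in a known extension (here `mediawiki.less`) is being handed to the text loader as if it were a file. A minimal sketch of one possible guard, assuming files are selected by suffix; `iter_source_files` is a hypothetical helper, not a function from clara or LangChain:

```python
from pathlib import Path
from typing import Iterable, Iterator


def iter_source_files(root: str, suffixes: Iterable[str]) -> Iterator[Path]:
    """Yield regular files under root whose suffix is in suffixes.

    Directories that merely look like files (e.g. a folder named
    "mediawiki.less") are skipped, but their contents are still visited.
    """
    wanted = {s.lower() for s in suffixes}
    for path in sorted(Path(root).rglob("*")):
        # is_file() is the important part: a suffix match alone is not enough.
        if path.suffix.lower() in wanted and path.is_file():
            yield path
```

Only paths returned by such a filter would then be passed to the document loader, so `open()` is never called on a directory.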

Add a way to ignore files

I tried clara on a couple of projects and ran into errors in the document loading process.

           Loading node_modules/ipaddr.js …                              
Traceback (most recent call last):                                                                                         
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/text.py", line 40, in load
    with open(self.file_path, encoding=self.encoding) as f:                                                                
IsADirectoryError: [Errno 21] Is a directory: 'node_modules/ipaddr.js'
                                                                                                                                                                                             
  File "/opt/homebrew/lib/python3.10/site-packages/clara/cli.py", line 30, in setup   
    index.ingest()
  File "/opt/homebrew/lib/python3.10/site-packages/clara/index.py", line 73, in ingest 
    texts = self._get_texts()
  File "/opt/homebrew/lib/python3.10/site-packages/clara/index.py", line 53, in _get_texts
    documents.extend(loader.load_and_split())
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/base.py", line 43, in load_and_split
    docs = self.load()
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/text.py", line 56, in load
    raise RuntimeError(f"Error loading {self.file_path}") from e
RuntimeError: Error loading /Users/tmm1/fancybits/chrome-capture-for-channels/node_modules/ipaddr.js
           Loading ext/libhdhomerun/README.md …                           index.py:51
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/text.py", line 41, in load
    text = f.read()
  File "/opt/homebrew/Cellar/python@3.10/3.10.12/Frameworks/Python.framework/Versions/3.10/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 10: invalid start byte

If there was a way to specify directories to ignore, I could tell it to stop traversing into directories like node_modules, vendor/gems and ext in these projects.
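As an illustration of the request, here is a small sketch of a traversal that prunes ignored directories and skips files that are not valid UTF-8. It is only an assumption about how such an option might work; `walk_text_files` and its `ignore_dirs` parameter are hypothetical and not part of clara.

```python
import fnmatch
import os
from typing import Iterable, Iterator, Tuple


def _is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Match a pattern against the directory name or its path relative to the root."""
    rel = rel_path.replace(os.sep, "/")
    name = rel.rsplit("/", 1)[-1]
    return any(fnmatch.fnmatch(rel, p) or fnmatch.fnmatch(name, p) for p in patterns)


def walk_text_files(root: str, ignore_dirs: Iterable[str] = ()) -> Iterator[Tuple[str, str]]:
    """Yield (path, text) for UTF-8 files under root, skipping ignored directories."""
    patterns = list(ignore_dirs)
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune in place so os.walk never descends into ignored directories.
        dirnames[:] = [
            d for d in dirnames
            if not _is_ignored(os.path.normpath(os.path.join(rel_dir, d)), patterns)
        ]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    yield path, f.read()
            except UnicodeDecodeError:
                # Not valid UTF-8 (binary or another encoding): skip the file.
                continue
```

For the projects above this might be called as `walk_text_files(".", ["node_modules", "vendor/gems", "ext"])`. Because `os.walk` reports directories separately from files, a directory such as `node_modules/ipaddr.js` is never opened as a file. Opening with `errors="replace"` instead of skipping is another option if partially garbled text is acceptable.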

langchain dep doesn't need extras = ["llms"] ?

 pip install clara-ai


Collecting nvidia-cuda-nvrtc-cu11==11.7.99
  Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.0/21.0 MB 10.3 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu11==11.7.99
  Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 849.3/849.3 kB 5.5 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu11==11.7.101
  Downloading nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 11.4 MB/s eta 0:00:00

Collecting nvidia-cudnn-cu11==8.5.0.96
  Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 557.1/557.1 MB 3.3 MB/s eta 0:00:00
Collecting nvidia-cublas-cu11==11.10.3.66
  Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 5.3 MB/s eta 0:00:00
Collecting nvidia-cufft-cu11==10.9.0.58
  Downloading nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 168.4/168.4 MB 6.5 MB/s eta 0:00:00
Collecting nvidia-curand-cu11==10.2.10.91
  Downloading nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.6/54.6 MB 8.8 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu11==11.4.0.1
  Downloading nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)

I would predict that you don't need the llms extras from langchain for this project.

langchain = {version = ">=0.0.139", extras = ["llms"]}
