seednapseai / clara
CLARA: Code Language Assistant & Repository Analyzer
License: BSD 3-Clause "New" or "Revised" License
openai.error.RateLimitError: Rate limit reached for default-text-embedding-ada-002 in organization org-foo on tokens per min. Limit: 1000000 / min. Current: 0 / min. Contact us through our help center at help.openai.com if you continue to have issues.
The MediaWiki code base is 130+ MB,
and ingesting it exceeds the rate limit.
A switch that slows down creation of the vector DB, so that larger codebases stay under this limit, would be nice.
Then again, perhaps not: a repo of this size might just go over the token limit anyway.
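One way such a slow-down switch could work is sketched below. This is not CLARA's code: the function name `throttled_batches`, its parameters, and the rough "4 characters per token" estimate are all assumptions made for illustration. The idea is to group chunks into batches that fit a tokens-per-minute budget and to wait out the rest of the minute between batches.

```python
import time
from typing import Callable, Iterable, Iterator, List


def throttled_batches(
    chunks: Iterable[str],
    tokens_per_minute: int,
    # Assumption: roughly 4 characters per token; swap in a real tokenizer if available.
    count_tokens: Callable[[str], int] = lambda text: max(1, len(text) // 4),
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List[str]]:
    """Yield batches whose estimated token count fits the per-minute budget."""
    window_start = clock()
    used = 0
    batch: List[str] = []
    for chunk in chunks:
        cost = count_tokens(chunk)
        if batch and used + cost > tokens_per_minute:
            yield batch
            batch = []
            # Wait out the rest of the current one-minute window before continuing.
            remaining = 60.0 - (clock() - window_start)
            if remaining > 0:
                sleep(remaining)
            window_start = clock()
            used = 0
        # A single chunk larger than the budget still gets a batch of its own.
        batch.append(chunk)
        used += cost
    if batch:
        yield batch
```

Each yielded batch would then be sent to the embedding API by the caller. Because the clock is read after the caller resumes the generator, time spent embedding counts toward the one-minute window.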
File ingestion is currently very inefficient.
LangChain has since released its own approach to parsing code, so switching to the new LanguageParser
should improve and simplify the code.
I'm trying to use this tool on a Mac M1. Unfortunately, the installation fails with an error.
Is this repository suitable for Mac M1?
(clara-code-reader) @macbook-pro clara % pip install clara-ai
ERROR: Could not find a version that satisfies the requirement clara-ai (from versions: none)
ERROR: No matching distribution found for clara-ai
Loading /home/foo/Github/mediawiki/resources/src/mediawiki.less … index.py:51
Traceback (most recent call last):
File "/home/foo/may15venv/lib64/python3.11/site-packages/langchain/document_loaders/text.py", line 40, in load
with open(self.file_path, encoding=self.encoding) as f:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IsADirectoryError: [Errno 21] Is a directory: '/home/foo/Github/mediawiki/resources/src/mediawiki.less'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/foo/may15venv/bin/clara", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/cli.py", line 177, in main
fire.Fire(Clara())
File "/home/foo/may15venv/lib64/python3.11/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/cli.py", line 105, in chat
index, chat = setup(path, memory_storage)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/cli.py", line 30, in setup
index.ingest()
File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/index.py", line 73, in ingest
texts = self._get_texts()
^^^^^^^^^^^^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/clara/index.py", line 53, in _get_texts
documents.extend(loader.load_and_split())
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/langchain/document_loaders/base.py", line 43, in load_and_split
docs = self.load()
^^^^^^^^^^^
File "/home/foo/may15venv/lib64/python3.11/site-packages/langchain/document_loaders/text.py", line 56, in load
raise RuntimeError(f"Error loading {self.file_path}") from e
RuntimeError: Error loading /home/foo/Github/mediawiki/resources/src/mediawiki.less
I tried clara on a couple of projects and ran into errors in the document loading process.
Loading node_modules/ipaddr.js …
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/text.py", line 40, in load
with open(self.file_path, encoding=self.encoding) as f:
IsADirectoryError: [Errno 21] Is a directory: 'node_modules/ipaddr.js'
File "/opt/homebrew/lib/python3.10/site-packages/clara/cli.py", line 30, in setup
index.ingest()
File "/opt/homebrew/lib/python3.10/site-packages/clara/index.py", line 73, in ingest
texts = self._get_texts()
File "/opt/homebrew/lib/python3.10/site-packages/clara/index.py", line 53, in _get_texts
documents.extend(loader.load_and_split())
File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/base.py", line 43, in load_and_split
docs = self.load()
File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/text.py", line 56, in load
raise RuntimeError(f"Error loading {self.file_path}") from e
RuntimeError: Error loading /Users/tmm1/fancybits/chrome-capture-for-channels/node_modules/ipaddr.js
Loading ext/libhdhomerun/README.md … index.py:51
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/langchain/document_loaders/text.py", line 41, in load
text = f.read()
File "/opt/homebrew/Cellar/python@3.10/3.10.12/Frameworks/Python.framework/Versions/3.10/lib/python3.10/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 10: invalid start byte
If there were a way to specify directories to ignore, I could tell it to stop traversing into directories like node_modules, vendor/gems and ext in these projects.
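A minimal sketch of what such an ignore option might look like, using only the standard library. The helper names `iter_source_files` and `read_text_files` and the default ignore list are hypothetical, not part of clara or LangChain, and matching is by directory basename only (so `vendor` rather than `vendor/gems`). Because `os.walk` reports directories and files separately, a directory named `ipaddr.js` or `mediawiki.less` is never opened as a file, and files that are not valid UTF-8 are skipped instead of aborting the run.

```python
import os
from typing import Iterable, Iterator, Tuple

# Assumption: a default ignore list chosen for illustration only.
DEFAULT_IGNORE_DIRS = ("node_modules", "vendor", "ext", ".git")


def iter_source_files(
    root: str, ignore_dirs: Iterable[str] = DEFAULT_IGNORE_DIRS
) -> Iterator[str]:
    """Yield paths of regular files under root, skipping ignored directories."""
    ignored = set(ignore_dirs)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into ignored directories.
        dirnames[:] = [d for d in dirnames if d not in ignored]
        for name in filenames:
            yield os.path.join(dirpath, name)


def read_text_files(
    root: str, ignore_dirs: Iterable[str] = DEFAULT_IGNORE_DIRS
) -> Iterator[Tuple[str, str]]:
    """Yield (path, text) pairs for files that decode cleanly as UTF-8."""
    for path in iter_source_files(root, ignore_dirs):
        try:
            with open(path, encoding="utf-8") as f:
                yield path, f.read()
        except UnicodeDecodeError:
            # Skip files that are not valid UTF-8, such as Latin-1 text or binaries.
            continue
```

Skipping undecodable files silently is a design choice for this sketch. Logging the skipped path, or retrying with another encoding, may be preferable in practice.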
pip install clara-ai
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.0/21.0 MB 10.3 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu11==11.7.99
Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 849.3/849.3 kB 5.5 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu11==11.7.101
Downloading nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 11.4 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu11==8.5.0.96
Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 557.1/557.1 MB 3.3 MB/s eta 0:00:00
Collecting nvidia-cublas-cu11==11.10.3.66
Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 5.3 MB/s eta 0:00:00
Collecting nvidia-cufft-cu11==10.9.0.58
Downloading nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 168.4/168.4 MB 6.5 MB/s eta 0:00:00
Collecting nvidia-curand-cu11==10.2.10.91
Downloading nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.6/54.6 MB 8.8 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu11==11.4.0.1
Downloading nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
I suspect this project doesn't need the llms extra from langchain, which appears to be what pulls in the large CUDA wheels shown above.
langchain = {version = ">=0.0.139", extras = ["llms"]}