hannibal046 / awesome-llm Goto Github PK

View Code? Open in Web Editor NEW

15.0K 15.0K 1.2K 14.28 MB

Awesome-LLM: a curated list of Large Language Model

License: Creative Commons Zero v1.0 Universal

awesome-llm's People

Contributors

Stargazers

Watchers

Forkers

patrick-tssn n0wwa closegoingaway fskeo spicyguml vamoko qianxinchun-bug dumpmemory xfxcnlc d3p10y fuxianghua sinclaircoder crazyofapple tutuna crackhopper luluchou zaku-zaku xupercoin codecmd recklessronan sev777 liujingxiu23 xiexukang cerviny minisoco alphadl hay-man tufo830 hertera1 zzmjohn sinwang20 yyht hitxujian luciusssss tingchenfu zhaopu7 moerehman pli76 gowithwind codedcclxxvii zyrain chenchy jak528 ankit-da cyril-jz charlesmartin14 machinelearningsystem happylynn jxzhangjhu alex-songs deluair ai-avant-garde-research scottsuk0306 aanchala cloudbee7 synehe hongwen-sun apspecial readmecode finley1991 brunoscaglione chakravarthiponmudi pmaksymiak pnnngchg jeff3071 yankaifyyy adaptivefailure strudland havealex petertakahashi eddieburning https-github-com-eddieburning yanndd1 yangcaot joemocha maximlf kirawang23 kunpeng199494 ataymano omkardash leonlahoud xiaobo996 tolgayan vpegasus praveenvijayan pjvazquez piotrlnordea lecheng butub1 gmh5225 yxuansu xinchaosong cloverforks hejingcao codingonion stanleyjacob glmapper mhsnalm samuellan coreuxt

awesome-llm's Issues

LLM data repos

https://github.com/Hannibal046/Awesome-LLM/blame/7e4bf169521e45f1209f826e7850fe0566ae6a05/README.md#L10

I found below two GitHub repositories regarding the LLM dataset:
https://github.com/Zjh-819/LLMDataHub
https://github.com/yaodongC/awesome-instruction-dataset

Logarithm Math for Neural Nets

Suggestion for LLM neural net model speedup which is very easy to achieve, will VASTLY speed up processing and the end result will basically be identical. LOGARITHMIC MATH.

Refer to the MATHOLOGER video on ANTI-SHAPESHIFTERS.

Simply put, in neural net matrixes, there are endless multiplication and divide functions. However in LOGARITHMIC math, LOG(X)+LOG(Y)=LOG(X * Y).
Likewise, LOG(X)-LOG(Y)=LOG(X / Y).

So, how to take advantage of these speedups?
If your neural net matrix is Algebraic in format, you convert the values to the LOG(X) value. Then all the multiply & divide math becomes addition and subtraction which uses considerably less processing. The end calculation result is in LOG(X) format, so you convert it back into Algebraic format by using the inverse logarithm routine. The matrix math as logarithm math will speed up everything by a factor of 10 per decimal point of accuracy (8 decimal points netting you a math speedup of 10000000 times the previous efforts using multiplication and division math).

Help

Include Falcon 40B

Please include the Falcon Family of models in Awesome-LLM. AFAIK this family includes the 7B, 40B, and 40B-instruct variants, all now available under the Apache 2.0 license.

Additional info:

Falcon LLM - Home
https://huggingface.co/tiiuae/falcon-40b
https://huggingface.co/tiiuae/falcon-40b-instruct
https://huggingface.co/tiiuae/falcon-7b
Foundation model trained on 1,000B tokens of RefinedWeb.

Open source UIs/webapps for LLM chatbots

Hello, great list, thanks for maintaining.

Felt this might be helpful to include:

https://github.com/snowfort-ai/awesome-llm-webapps

Collection of high quality, open source webapps to use as a starting point for LLM chatbots, etc. Much faster to start from an app than to start from a framework! Not sure where it would fit on the page, though.

Could you add a paper in the section of Applications of LLM?

paper title: SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support
paper link: https://arxiv.org/pdf/2305.00450.pdf
code repo: https://github.com/qiuhuachuan/smile

our recent survey may be suitable for "Instruction-Tuning" & "LLM-Alignment"

Hi,

our recent survey Aligning Large Language Models with Human: A Survey may be a good fit for your planned section "Instruction-Tuning" & "LLM-Alignment".

Let me know your thoughts on this!

LLM for learning a code repo and answering questions about the code design architecture etc?

Is there an LLM that we can download and install on the local system which will go through a code repo and can answer question about the architecture and design of the code. Like for eg: snap from chatgpt

Open LLM list

Section on LLM evaluation tools

Can you add a section on some eval tools, that would be really helpful

Mistral?

Any particular reason Mistral models aren't mentioned in this list (https://arxiv.org/abs/2310.06825)

Doubt: add a new "other papers"

I would like to contribute with this paper, but I do not know for sure which section fits better.

It is a Big Data Architecture for Question Answering systems
https://doi.org/10.5220/0011842700003467

help

您好，我从https://github.com/yixinL7/SimCLS看到您问yixinL7的问题，我在跑SimCLS这个模型时遇到一些错误；
错误;RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

请问您知道这个是什么问题嘛？希望您有空的时候可以帮我看看，万分感激！

Job opportunity

Hi Hannibal,
My apologies for contacting you through your opensource project, but this is one method we use when headhunting exceptional talents on behalf of our clients within AI. I took a look through your work, and I am very impressed.

We are partnered with a US startup called Play.ht who have developed advanced text to speech technology to generate human grade AI voices from text in rapid time. After an incredibly successful launch they have over 7000 customers from small startups to enterprise level multinationals including SalesForce, Samsung and Verizon, content creators and video editors who can make use of 800 voices across 60 languages.

The three major use cases for their voice technology are video editing & voice cloning for post production edits, voiceovers for YouTube and video content, e-learning and training, IVR systems and audio accessibility. The platform decreases time for recording and updating and cost. Hiring voice actors is expensive and future editing is limited.

They are looking to scale their Engineering and Research team with Senior to principal level hires to build large generative LLM based models, end to end. You will be fully hands on building, training and deploying advanced voice models and deliver a human grade, context understanding speech product across two product areas, a library of over 100 ready to use voices and a text to voice editor API. To be successful in this role you will have previous experience building and training large generative models to deployment in a production setting at scale. Any experience in the TTS/ Voice space is a big plus but not essential.

In return you can expect a strong comp package including meaningful equity and a strong salary. You will also be one of the early team, responsible for bringing some of the most advanced and realistic voice products to market to be deployed across a number of industries and used worldwide.

This position can be fully remote, anywhere in the world, and you will work with a small, but rapidly scaling international team of world class engineers and developers.

Please can you let me know if you and I can organise a time to speak?
Here is a link to my LinkedIn profile - https://www.linkedin.com/in/jackdev/
And here is a link to my calendar - https://calendly.com/jackcubiq/20min

add palm2?

Probably this paper needs to be added:

https://ai.google/static/documents/palm2techreport.pdf

ai00_rwkv_server

AI00 RWKV Server is an inference API server based on the RWKV model.

It supports VULKAN/DX12/openGL parallel and concurrent batched inference and can run on all GPUs that support VULKAN/DX12/openGL. No need for Nvidia cards!!! AMD cards and even integrated graphics can be accelerated!!!

No need for bulky pytorch, CUDA and other runtime environments, it's compact and ready to use out of the box!

Compatible with OpenAI's ChatGPT API interface.

100% open source and commercially usable, under the MIT license.

If you are looking for a fast, efficient, and easy-to-use LLM API server, then AI00 RWKV Server is your best choice. It can be used for various tasks, including chatbots, text generation, translation, and Q&A.

https://github.com/cgisky1980/ai00_rwkv_server