Topic: bpe
Something interesting about bpe
bpe,Sentiment-based classification of stock article titles using PhoBERT
User: 209sontung
bpe,GPT3 encoder & decoder tool written in Swift
User: aespinilla
bpe,High performance unsupervised text tokenization for Ruby
User: ankane
bpe,R package for Byte Pair Encoding based on YouTokenToMe
Organization: bnosac
bpe,Subword-augmented Embedding for Cloze Reading Comprehension (COLING 2018)
User: cooelf
Home Page: https://arxiv.org/abs/1806.09103
bpe,Auto summarization from BPE tokenization
User: crodriguez1a
bpe,Low-resource language machine translation (az, be, tr -> en).
User: cxia0209
bpe,Java library implementing Byte-Pair Encoding Tokenization
User: deepanprabhu
bpe,Byte-Pair Encoding (BPE) (subword-based tokenization) algorithm implementations from scratch with Python
User: dolbyuuu
bpe,Generating new titles for movie posters using a combination of image features and pre-trained subword embeddings
User: eliyetres
bpe,A light stemmer for MDA (Moroccan Dialect Arabic) based on the BPE (Byte Pair Encoding) algorithm, implemented in TypeScript
User: essofyany
bpe,Machine Learning for Phishing Website Detection
User: faizann24
Home Page: https://faizanahmad.tech/blog/2020/02/phishytics-machine-learning-for-phishing-websites-detection/
bpe,Fast bare-bones BPE for modern tokenizer training
User: gautierdag
bpe,n8n node for working with BPE Tokens with GPT in mind.
User: geckse
bpe,(Python package) Train your own tokenizer based on the BPE algorithm for LLMs (supports regex patterns and special tokens)
User: hk669
Home Page: https://pypi.org/project/bpetokenizer/
bpe,Natural Language EnCoder-Decoder: word, char, bpe etc
Organization: isi-nlp
Home Page: https://isi-nlp.github.io/nlcodec
bpe,Subword Encoding in Lattice LSTM for Chinese Word Segmentation
User: jiesutd
bpe,Byte-Pair Encoding tokenizer for training large language models on huge datasets
User: jmaczan
bpe,An extremely simple and restricted tool/lib for converting binary data into text that can be processed with unsupervised character-level natural language processing tools/libs
Organization: kolanich-libs
bpe,Explains NLP building blocks in a simple manner.
User: kyubyong
bpe,Detect whether the text is AI-generated by training a new tokenizer and combining it with tree classification models or by training language models on a large dataset of human & AI-generated texts.
User: lizhecheng02
Home Page: https://www.kaggle.com/competitions/llm-detect-ai-generated-text
bpe,The fastest JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT-2 / GPT-3 / GPT-4 / GPT-4o. Port of OpenAI's tiktoken with additional features.
User: niieani
Home Page: https://gpt-tokenizer.dev
bpe,Fast and customizable text tokenization library with BPE and SentencePiece support
Organization: opennmt
Home Page: https://opennmt.net/
bpe,Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
User: rsennrich
bpe,BPE (Byte-Pair Encoding) Encoder Decoder for OpenAI's GPT-2 / GPT-3 Implemented In Pure PHP, Zero Dependency, Multi Byte Supported.
User: sajjadh47
bpe,Go BPE tokenizer (Encoder+Decoder) for GPT2 and GPT3
User: samber
Home Page: https://pkg.go.dev/github.com/samber/go-gpt-3-encoder
bpe,Byte Pair Encoding (BPE)
User: seonbeomkim
bpe,BPE tokenizer used for Dart/Flutter applications when calling ChatGPT APIs
User: simonwang9610
Home Page: https://pub.dev/packages/flutter_gpt_tokenizer
bpe,Word/Image/Audio Embedding models, Tokenizer models, Ngram language models, MatrixModels, Corpus building, Vocabulary Building, Language modelling
Organization: spydazwebai-nlp
bpe,Learning BPE embeddings by first learning a segmentation model and then training word2vec
User: stephantul
bpe,Fast and versatile tokenizer for language models with BPE, Unigram and WordPiece tokenization. Compatible with SentencePiece, Tokenizers, Tiktoken and more.
User: systemcluster
bpe,Byte Pair Encoding (BPE) Tokenization for Natural Language Processing
User: teleprint-me
bpe,A python package to build a corpus vocabulary using the byte pair methodology and also a tokenizer to tokenize input texts based on the built vocab.
User: vatsalsaglani
bpe,BPE Tokenizer implementations in C# for Anthropic, OpenAI LLM offerings
User: veerashayyagari
bpe,Unsupervised text tokenizer focused on computational efficiency
Organization: vkcom
bpe,A modified, secure version of BPE algorithm
User: yash-srivastava19
bpe,In this repo I will share different topics on anything I want to know in NLP and LLMs
User: zeyadusf
bpe,Simple-to-use scoring function for arbitrarily tokenized texts.
User: zouharvi