Some APIs around AI models.
- CPU-only Docker setup
- Expects preloaded models, no (annoying) auto-downloads
- Stats about token usage (partial / WIP)
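A minimal sketch of the "preloaded models only" idea: fail fast when a model directory is missing instead of fetching it. The `require_local_model` helper and the directory layout are hypothetical and not taken from this project.

```python
from pathlib import Path


def require_local_model(models_dir: str, name: str) -> Path:
    """Return the local path of a preloaded model, or raise instead of downloading.

    Hypothetical helper: the directory layout is an assumption for illustration.
    """
    path = Path(models_dir) / name
    if not path.is_dir():
        raise FileNotFoundError(
            f"model {name!r} is not preloaded under {models_dir!r}; "
            "download it explicitly first"
        )
    return path
```

Calling it before loading any model keeps a missing model visible as an error rather than a silent network fetch.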
Markers: (stable) = most stable, (experimental) = very experimental / unstable
- Text / Sentences (stable)
- Image
- Code (stable)
By Stanza (stable):
- Locale Identification
- Sentence Segmentation
- Token Classification (NER, POS, MWT)
- Sequence Classification (Sentiment)
- Lemmatization
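The Stanza tasks above could be combined into one pipeline roughly as follows. This is a sketch, not this project's code: `TASK_PROCESSORS`, `processors_for` and `analyze` are hypothetical names, and it assumes the Stanza models for the chosen language are already on disk.

```python
# Stanza processor names needed per task; "tokenize" also does sentence segmentation.
TASK_PROCESSORS = {
    "segmentation": ["tokenize"],
    "mwt": ["tokenize", "mwt"],  # only for languages that ship an MWT model
    "pos": ["tokenize", "pos"],
    "lemma": ["tokenize", "pos", "lemma"],
    "ner": ["tokenize", "ner"],
    "sentiment": ["tokenize", "sentiment"],
}


def processors_for(tasks):
    """Build Stanza's comma-separated processor string, without duplicates."""
    ordered = []
    for task in tasks:
        for proc in TASK_PROCESSORS[task]:
            if proc not in ordered:
                ordered.append(proc)
    return ",".join(ordered)


def analyze(text, lang="en", tasks=("pos", "lemma", "ner", "sentiment")):
    """Run the selected tasks on preloaded models and return plain dicts."""
    import stanza  # imported lazily so processors_for works without Stanza

    # download_method=None tells Stanza to use only models already on disk.
    nlp = stanza.Pipeline(
        lang=lang, processors=processors_for(tasks), download_method=None
    )
    doc = nlp(text)
    return [
        {
            "text": sentence.text,
            "sentiment": sentence.sentiment,  # 0 = negative, 1 = neutral, 2 = positive
            "words": [(w.text, w.upos, w.lemma) for w in sentence.words],
            "entities": [(e.text, e.type) for e in sentence.ents],
        }
        for sentence in doc.sentences
    ]
```

Locale identification is left out here; in Stanza it runs as a separate `langid` pipeline with `lang="multilingual"`.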
- Image to Data (by donut) (experimental)
- Visual Document Question Answering (Image) (by donut)
- WIP Document Classification (Image) (by dit) (experimental)
  - (dataset) RVL-CDIP: "letter", "form", "email", "handwritten", "advertisement", "scientific report", "scientific publication", "specification", "file folder", "news article", "budget", "invoice", "presentation", "questionnaire", "resume", "memo"
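To illustrate how the RVL-CDIP labels listed above might be used, here is a small sketch that turns raw classifier scores into a label and a probability. It assumes the model emits one score per class in the order the labels are listed; that index order and the `classify` helper are assumptions, not something this README confirms.

```python
import math

# Labels in the order listed above; the index order is an assumption.
RVL_CDIP_LABELS = [
    "letter", "form", "email", "handwritten", "advertisement",
    "scientific report", "scientific publication", "specification",
    "file folder", "news article", "budget", "invoice",
    "presentation", "questionnaire", "resume", "memo",
]


def classify(logits):
    """Return (label, probability) for the highest-scoring class via softmax."""
    if len(logits) != len(RVL_CDIP_LABELS):
        raise ValueError("expected one score per RVL-CDIP class")
    peak = max(logits)
    # Subtract the peak before exponentiating for numerical stability.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    best = max(range(len(exps)), key=exps.__getitem__)
    return RVL_CDIP_LABELS[best], exps[best] / total
```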
general Natural Language Inference
- Question Answering
- Question Answer Generation (experimental)
- Question Generation (experimental)
- Question Natural Language Inference / QNLI (stable)
- Semantic Search (stable)
- WIP Sentence Clustering
- todo Topic Clustering (by BERTopic)
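Semantic search is commonly done by ranking embeddings by cosine similarity. The sketch below shows only that ranking step with plain Python lists; the `cosine` and `search` helpers are hypothetical, and in practice the vectors would come from a sentence-embedding model rather than being written by hand.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, corpus, top_k=3):
    """Rank corpus entries (text -> embedding) by similarity to the query."""
    scored = [(cosine(query_vec, vec), text) for text, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy 2-dimensional "embeddings", made up purely for illustration.
corpus = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = search([1.0, 0.0], corpus, top_k=2)
```

Here `results` holds the two entries closest in direction to the query vector, best match first.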
poetry lock --no-update
poetry install --sync
# poetry lock --no-update && poetry install --sync
Run the CLI inside the Docker container:
docker compose run --rm baistro bash
poetry run cli
# download models:
poetry run cli download