Bram Vanroy's Projects
Robust recipes to align language models with human and AI preferences
An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. Useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
Demo app to illustrate ASTrED
A word aligner based on multilingual encoders
A small repo showing how to easily use BERT (or other transformers) for inference
8-bit CUDA functions for PyTorch
Public repo for HF blog posts
Character-based MT evaluation and difference highlighting
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency
A repo that implements Stanford CRFM's HELM Instruct with adaptable evaluation criteria
Proceedings of EAMT 2022 (23rd Annual Conference of the European Association for Machine Translation)
Extract text from eBooks with Python
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
An open, efficient LLM for Dutch
Transformer trainer for a variety of classification problems that has been used in-house at LT3 for different research topics.
Sentence-Level Text Simplification for Dutch
Segmentation interface for the TPR-DB to manually tokenize text and segment it into sentences
MAchine Translation Evaluation Online (MATEO)
Benchmarking throughput of MBART