manueltonneau
Name: Manuel
Type: User
Twitter: ManuelTonneau
Location: Berlin, Germany
Blog: manueltonneau.com
Repository containing code for "How to Train BERT with an Academic Budget"
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Small utility to schedule start and stop times of SelfControl
TensorFlow code and pre-trained models for BERT
BERT-related papers
DBMDZ BERT models
Implementation of "BotPercent: Estimating Twitter Bot Populations from Groups to Crowds"
Python libraries for Google Colaboratory
A Code-First Introduction to NLP course
BERT models pretrained on the CORD-19 Kaggle dataset
Browse Covid-19 & SARS-CoV-2 Scientific Papers with Transformers 🦠 📖
Python scripts to process a German wiki dump. This generates a German text corpus for supervised word representation learning, especially for training a BILM.
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from 2 big corpora (English Wikipedia and Twitter, 330 million English tweets).
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Bayesian ideal points of French politicians, based on Twitter data.
A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM
Emoji terminal output for Python
Super easy library for BERT based NLP models
Unsupervised Language Model Pre-training for French
Single Python script to get tweet JSON objects from a list of tweet IDs
A text classification model with pretrained GloVe embeddings
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
An implementation of training for GPT2, supports TPUs
GPT-2 French demo
GPT2 for Multiple Languages, including pretrained models. GPT2 multilingual support, with a 1.5-billion-parameter Chinese pretrained model
Describes and solves some simple HACT models in Julia. The notes and code are modified and translated from Benjamin Moll's notes and codes: http://www.princeton.edu/~moll/notes.htm and http://www.princeton.edu/~moll/HACTproject.htm.
Resources for WOAH 2024 paper: "From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets"
Markdown Cheatsheet for Github Readme.md