Eric Lam's Projects
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Extract files from any type of archive from the command line
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Create your development Env like LEGO blocks, run your projects on any device - be it a PC, Web, Phone or Tablet!
Showcase for "A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies."
[AAAI 2019] Generating Distractors for Reading Comprehension Questions from Real Examinations
A multilingual version of DPR
Data and code for paper "EQG-RACE: Examination-Type Question Generation" at AAAI2021.
End-to-End Speech Processing Toolkit
An example homework assignment using Python.
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Library for system monitoring in Python / Web API (CPU/GPU/DISK/MEM/NET/SERVICE)
⚡ Dynamically generated stats for your GitHub READMEs
Join the GitHub Graduation Yearbook and "walk the stage" on June 5.
PTT Gossiping board Q&A, positive, Chinese corpus
🐱💻 GPU Info API is an API that provides detailed information about Nvidia, AMD, and Intel GPUs. The information is extracted from Wikipedia and stored in JSON format.
🔍 End-to-end Python framework for building natural language search interfaces to data. Leverages Transformers and the State-of-the-Art of NLP. Supports DPR, Elasticsearch, HuggingFace’s Modelhub, and much more!
Extract clustering features from HuBERT
Pre-train HuBERT using the Hugging Face Trainer