bigscience-workshop
Name: BigScience Workshop
Type: Organization
Bio: Research workshop on large language models - The Summer of Language Models 21
Twitter: BigScienceW
Managing your machine learning lifecycle with MLflow and Amazon SageMaker
A list of BigScience publications
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
Alternative to https://github.com/Dynalon/mdwiki-seed
Tools for curating biomedical training data for large-scale language modeling
A repo for running model shrinking experiments
A repository for `codecarbon` logs.
Scripts to prepare catalogue data
Track emissions from Compute and recommend ways to reduce their impact on the environment.
Code used for sourcing and cleaning the BigScience ROOTS corpus
This directory gathers the tools developed by the Data Sourcing Working Group
Tools for managing datasets for governance and training.
Generate statistics over datasets used in the context of BigScience
Code and Data for Evaluation WG
Tools for evaluating model robustness and consistency
BigScience working group on language models for historical texts
Libraries, Archives and Museums (LAM)
A framework for few-shot evaluation of autoregressive language models.
Framework for BLOOM probing
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining.
BLOOM+1: Adapting BLOOM model to support a new unseen language
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
PII Processing code to detect and remediate PII in BigScience datasets. Reference implementation for the PII Hackathon
Toolkit for creating, sharing and using natural language prompts.
scaling-laws-tokenization