Giter Site home page Giter Site logo

lookoncetohear's Introduction

Look Once to Hear

Ab Gradio demo

This repository provides code for the paper, Look Once to Hear: Target Speech Hearing with Noisy Examples. Look Once to Hear is an intelligent hearable system where users choose to hear a target speaker by just looking at them for a few seconds. This paper won best paper honorable mention ๐Ÿ† at CHI 2024.

loh.mov

Setup

conda create -n ts-hear python=3.9
conda activate ts-hear
pip install -r requirements.txt

Training

Training data includes clean speech, background sounds, head-related transfer functions (HRTFs) and binaural room impulse responses (BRIRs). We use Scaper toolkit to synthetically generate audio mixtures. Each audio mixture is generated on-the-fly, during training or evaluation, using Scaper's generate_from_jams function on a .jams specification file.

We provide self-contained datasets here, with the source .jams specifications we used for training. To perform a training run, it is sufficient to download the .zip files provided there, unzip the contents to data/ directory and run this command:

python -m src.trainer --config <configs/tsh.json> --run_dir <runs/tsh> [--frac <0.05 (% train/val batches)>]

To resume a partial run:

python -m src.trainer --config <configs/tsh.json> --run_dir <runs/tsh>

Evaluation

Evaluation is done on speech mixture in similar format as training samples. Checkpoints of the embedding model and the target speech hearing (TSH) model are available here.

python src.ts_hear_test

lookoncetohear's People

Contributors

vb000 avatar mbrydak avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.