Collaborative Watermarking

This repository contains source code for "Collaborative Watermarking for Adversarial Speech Synthesis" submitted to ICASSP 2024.

Listen to audio samples at the demo page https://ljuvela.github.io/CollaborativeWatermarkingDemo/

Environment setup

Create a conda environment and install dependencies

conda create -n adversarial-watermarking python=3.10
conda activate adversarial-watermarking
conda install -c pytorch -c conda-forge pytorch torchaudio pytest conda-build matplotlib scipy pysoundfile tensorboard

Install the submodule dependencies and link them to the conda Python environment. You could use PYTHONPATH instead, but we recommend conda to keep the environment contained.

git submodule update --init --recursive
conda develop third_party/hifi-gan
conda develop third_party/asvspoof-2021/LA/Baseline-LFCC-LCNN
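If you prefer the PYTHONPATH route, the value can be assembled from the same two submodule directories. The sketch below is one way to do that; the helper name `build_pythonpath` and its `repo_root` argument are illustrative assumptions and not part of this repository. Only the two `third_party` paths come from the commands above.

```python
import os
from pathlib import Path

# Submodule directories, taken from the `conda develop` commands above.
SUBMODULE_DIRS = (
    "third_party/hifi-gan",
    "third_party/asvspoof-2021/LA/Baseline-LFCC-LCNN",
)


def build_pythonpath(repo_root, existing=""):
    """Return a PYTHONPATH string with the submodule directories first.

    Hypothetical helper for illustration; not shipped with the repository.
    """
    root = Path(repo_root)
    parts = [str(root / d) for d in SUBMODULE_DIRS]
    if existing:
        # Keep whatever was already on PYTHONPATH, after the submodules.
        parts.append(existing)
    # os.pathsep is ':' on Linux/macOS and ';' on Windows.
    return os.pathsep.join(parts)


if __name__ == "__main__":
    # Run from the repository root and export the printed value as PYTHONPATH.
    print(build_pythonpath(Path.cwd(), os.environ.get("PYTHONPATH", "")))
```

Unlike `conda develop`, this only affects shells where the variable is exported, so it has to be repeated per session.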

Pretrained models

To replicate experiments, download pretrained ASVSpoof models

cd third_party/asvspoof-2021/LA/Baseline-LFCC-LCNN/project
sh 00_download.sh

We do not currently distribute pretrained HiFi-GAN models. In the paper, we trained a wide range of configurations for 100k iterations each, which still leaves a quality gap to the official HiFi-GAN trained for 2.5M iterations (https://github.com/jik876/hifi-gan). If there is demand for a production-quality watermarked HiFi-GAN, we will definitely consider training one.

Noise augmentation

The MUSAN dataset is available as a direct download from https://www.openslr.org/17/

Torchaudio has a convenient wrapper, but at the moment it is only available in the nightly prototype: https://pytorch.org/audio/master/generated/torchaudio.prototype.datasets.Musan.html

The nightly build provides a convenient way to download the dataset, but be careful not to break your environment when installing it.

conda install -c pytorch-nightly -c nvidia torchaudio

Then, in Python:

import torchaudio.prototype.datasets
dataset = torchaudio.prototype.datasets.Musan(root='/path/on/your/system/MUSAN', subset='noise', download=True)
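Noise augmentation generally means adding a noise clip to the clean signal at a chosen signal-to-noise ratio (SNR). The sketch below shows only that scaling step in plain Python. It is not the repository's implementation, and the names `mean_power` and `mix_at_snr` are made up for illustration.

```python
import math


def mean_power(x):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: assumes equal-length inputs and non-silent noise.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise is silent")
    # SNR_dB = 10 * log10(p_speech / (gain**2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

In practice the noise clip would be cropped or looped to the speech length first, and the SNR would typically be drawn at random per training example.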

Room impulse response augmentation

The repository also implements room impulse response augmentation. The paper didn't have room to evaluate this, but you're welcome to experiment

Download the MIT RIR dataset from http://mcdermottlab.mit.edu/Reverb/IR_Survey.html
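Room impulse response augmentation amounts to convolving the dry signal with a measured impulse response. A minimal pure-Python sketch under that assumption follows. The repository's own implementation may differ, and `apply_rir` is a hypothetical name used only here.

```python
def apply_rir(signal, rir):
    """Convolve `signal` with impulse response `rir` (direct form, O(N*M)).

    Returns the full convolution of length len(signal) + len(rir) - 1.
    Hypothetical helper for illustration; real audio calls for an FFT-based routine.
    """
    if not signal or not rir:
        raise ValueError("signal and rir must be non-empty")
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


# Toy response: direct path plus one echo two samples later at half amplitude.
dry = [1.0, 0.5, 0.25]
wet = apply_rir(dry, [1.0, 0.0, 0.5])
```

The output is longer than the input because of the reverberation tail; training pipelines often truncate it back to the original length.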

VSCode config

The following config snippet helps VSCode find the dependencies for development

{
    "terminal.integrated.env.linux" : {"PYTHONPATH": "${workspaceFolder}/third_party/hifi-gan:${workspaceFolder}/tests"},
    "terminal.integrated.env.osx" :   {"PYTHONPATH": "${workspaceFolder}/third_party/hifi-gan:${workspaceFolder}/tests"},

    "python.analysis.extraPaths": [
        "third_party/hifi-gan",
        "third_party/asvspoof-2021/LA/Baseline-LFCC-LCNN"]
}
