GoALSpeech: Goal-directed Articulatory Learning for Speech Acquisition

Source code for replicating the results of the following paper: Philippsen, A. (2021). Goal-directed exploration for learning vowels and syllables: a computational model of speech acquisition. KI-Künstliche Intelligenz, 35(1), 53-70.

This is the Python implementation of the original Matlab code used for the Ph.D. thesis "Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition": https://pub.uni-bielefeld.de/record/2921296

Installation

The code runs with Python 3.

Python packages

The following packages are required. They can be installed, e.g., via pip:

  • dtw (tested with version 1.4.0)
  • matplotlib (you might need to install the package python3-tk)
  • numpy
  • python-speech-features (https://github.com/StevenLOL/python_speech_features)
  • scipy
  • scikit-learn
  • sounddevice (on some Linux distributions you might first need to install the package for the PortAudio library, libportaudio2)
  • torch
  • oct2py (If GBFB features should be used, see below.)
  • tqdm (progress bar for sound production)
  • fastdtw (Comparison of sounds using the syllable weighting scheme)
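
Before running an experiment, it can be useful to check which of the packages listed above are available. The following is a minimal sketch, not part of the repository. The import names in the mapping (e.g. sklearn for scikit-learn) are assumptions based on the usual conventions of these packages, and the helper name missing_packages is hypothetical.

```python
import importlib.util

# Mapping from pip package name to import name.
# The import names are assumptions based on each package's usual convention.
REQUIRED = {
    "dtw": "dtw",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "python-speech-features": "python_speech_features",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "sounddevice": "sounddevice",
    "torch": "torch",
    "oct2py": "oct2py",  # only needed if GBFB features are used
    "tqdm": "tqdm",
    "fastdtw": "fastdtw",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of all packages whose module cannot be found."""
    # find_spec only locates the module and does not import it.
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All listed packages were found.")
```

Because the check only locates the modules without importing them, it does not detect a missing system library such as PortAudio.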

Articulatory system

Acoustics

How to run

1. Configure the parameters. The parameters for an experiment are defined in a cfg file. Examples can be found in goalspeech/config/, and the format is described in goalspeech/config/info.txt. A config file can also be generated with generateConfig.py: modify the script and execute it, and the resulting file is written to config/.

2. Initialize the experiment. In ipython, run one of the files initExperiment*.py (e.g. "%run -i initExperimentVowels.py"). If you changed the config and want to use your own, replace the path to the config file first. The script creates an instance of VTLSpeechProduction and loads all required parameters from the config into the ipython workspace. (When switching between *Vowels.py and *Syllables.py, a new ipython instance currently has to be started.)

3. Run the experiment. After the initialization, runExperiment*.py starts the babbling learning process. When the script is run for the first time, ambient speech data is generated and stored in data/ambientSpeech/. Subsequent runs reuse this file. To make the system regenerate it, delete the file from that directory. The following steps are performed:

  • Create the articulatory data set (temporary; it is discarded after the acoustics have been generated).
  • Create the corresponding acoustic data set and store it in data/ambientSpeech/ (used as ambient speech, i.e. the speech that the system hears in its environment).
  • Start babbling. Results will be saved into a folder named with the current date in data/results/.
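
The reuse of the ambient speech data described above follows a generate-once, reuse-afterwards pattern. The following is a minimal sketch of that pattern, not the code used in runExperiment*.py. The function names load_or_create and clear_cache, the file name in the usage note, and the use of pickle as storage format are assumptions made for illustration.

```python
import pickle
from pathlib import Path


def load_or_create(cache_file, generate):
    """Return the cached data if cache_file exists, else generate and store it."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        # Subsequent runs: reuse the stored data.
        with cache_file.open("rb") as f:
            return pickle.load(f)
    # First run: generate the data and store it for later runs.
    data = generate()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as f:
        pickle.dump(data, f)
    return data


def clear_cache(cache_file):
    """Delete the cached file so that the next run regenerates the data."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        cache_file.unlink()
```

For example, load_or_create("data/ambientSpeech/example.pickle", make_data) would call the hypothetical function make_data only if the file does not exist yet. Calling clear_cache on the same path corresponds to deleting the file by hand.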

4. Inspect results.

  • At the beginning of babbling, gs.png shows the generated goal space. Make sure that it looks meaningful before investing time in continuing the babbling. The config is stored as config.txt.
  • After each babbling run (the number of runs is defined in "runs"), the results of the corresponding run R are stored as "results-R.pickle".
  • The script evaluateResults.py can be used to evaluate and visualize the results from multiple runs of one experiment.
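
For a quick look at the stored results outside of evaluateResults.py, the pickle files of one experiment can be collected as follows. This is a minimal sketch under stated assumptions: the run index R is taken to be an integer, the function name load_results is hypothetical, and nothing is assumed about the structure of the stored objects.

```python
import pickle
import re
from pathlib import Path


def load_results(result_dir):
    """Load all results-R.pickle files in result_dir, keyed by run number R."""
    results = {}
    for path in Path(result_dir).glob("results-*.pickle"):
        # Assumption: R is an integer run index; other files are skipped.
        match = re.fullmatch(r"results-(\d+)\.pickle", path.name)
        if match is None:
            continue
        with path.open("rb") as f:
            results[int(match.group(1))] = pickle.load(f)
    # Return the runs in ascending order of R.
    return dict(sorted(results.items()))
```

If the pickled objects are instances of classes from this project, the corresponding modules must be importable when the files are loaded. As with any pickle file, only load results that come from a trusted source.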
