avutils's People

Contributors

gobbios

avutils's Issues

split_duration() returns nonsensical results

split_duration() returns nonsensical results when the second time value of an annotation is a duration rather than an end time stamp. This is the case for most .rttm output from the divime tools, but not for ELAN, which does indeed work with two time stamps.

To clarify, this only affects annotations that cross the boundary between two time periods.
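The distinction can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the package's split_duration(); the function name and the second_is_duration flag are hypothetical. The idea is to convert onset plus duration into onset plus end before splitting, so that both annotation styles go through the same logic.

```python
def split_annotation(start, end_or_duration, split_at, second_is_duration=True):
    """Split one annotation at the time point split_at (all values in seconds).

    second_is_duration is a hypothetical flag: True for rttm-style rows
    (onset + duration), False for ELAN-style rows (start + end).
    """
    # Normalise to start/end first; treating a duration as an end time stamp
    # is what produces nonsensical pieces.
    end = start + end_or_duration if second_is_duration else end_or_duration
    if end < start:
        raise ValueError("annotation ends before it starts")
    if not (start < split_at < end):
        # The annotation does not cross the boundary, so nothing is split.
        return [(start, end)]
    return [(start, split_at), (split_at, end)]


# rttm-style row: onset 58.0 s, duration 4.5 s, crossing a boundary at 60 s.
pieces = split_annotation(58.0, 4.5, split_at=60.0)
print(pieces)  # [(58.0, 60.0), (60.0, 62.5)]
```

If the same row were read as start and end (58.0 to 4.5), the end would lie before the start, which is the kind of input that leads to meaningless output.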

mp3 audio

audio_info() and convert_audio() fail if the input is an mp3 file. It seems that sox lacks out-of-the-box support for mp3, whereas ffmpeg seems to support it. Hence, both functions need to be modified, either to allow a choice between sox and ffmpeg or to switch everything to ffmpeg right away. The latter seems the more reasonable approach.
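As a rough sketch of the ffmpeg-only route (in Python for illustration, not the package's actual code), a conversion could be built around a plain ffmpeg call. The function names and the defaults of 16 kHz mono are assumptions; the flags used (-y, -i, -ar, -ac) are standard ffmpeg options.

```python
import shutil
import subprocess


def build_convert_command(infile, outfile, sample_rate=16000, channels=1):
    """Build an ffmpeg command line that converts infile (e.g. mp3) to outfile."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", infile,             # input file, format detected by ffmpeg
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed default)
        "-ac", str(channels),     # number of output channels (assumed default)
        outfile,
    ]


def convert_audio_ffmpeg(infile, outfile, **kwargs):
    """Run the conversion and return True if ffmpeg exited successfully."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    cmd = build_convert_command(infile, outfile, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0


command = build_convert_command("in.mp3", "out.wav")
print(command)
# ['ffmpeg', '-y', '-i', 'in.mp3', '-ar', '16000', '-ac', '1', 'out.wav']
```

Keeping the command construction separate from its execution makes it possible to check the command without ffmpeg being installed.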

Installation error

I am trying to install this package on Windows, but the following error stops the installation:

Warning: newline within quoted string at elan2rttm.Rd:40
Error in parse_Rd("Mydirectory/Rbuild218c55e07db0/avutils/man/elan2rttm.Rd", :
Unexpected end of input (in " quoted string opened at elan2rttm.Rd:79:39)
Execution halted

divime_diarization() fails for unknown reasons

The noisemes SAD file contains one row with one speech entry.
Here is the output:

vagrant ssh -c 'diartk.sh data/ noisemesSad'
wavs and transcriptions found !
Tests finished
treating testfile6
WARNING for /vagrant/data/tmp.rwYWRUqvBX/testfile6.fea: replacing HCopy htconfig with SMILExtract MFCC12_E_D_A is untested
(MSG) [2] in SMILExtract : openSMILE starting!
(MSG) [2] in SMILExtract : config file is: /home/vagrant/repos/opensmile-2.3.0/config/MFCC12_E_D_A.conf
(MSG) [2] in cComponentManager : successfully registered 96 component types.
(MSG) [2] in instance 'lldcsvsink' : No filename given, disabling this sink component.
(MSG) [2] in instance 'lldarffsink' : No filename given, disabling this sink component.
(MSG) [2] in cComponentManager : successfully finished createInstances
                                 (16 component instances were finalised, 1 data memories were finalised)
(MSG) [2] in cComponentManager : starting single thread processing loop
(MSG) [2] in cComponentManager : Processing finished! System ran for 549 ticks.
cp: cannot stat '/vagrant/data/tmp.rwYWRUqvBX/testfile6.rttm': No such file or directory
Connection to 127.0.0.1 closed.

unrecognized video formats

If ffmpeg does not recognize a video format, extract_audio() still reports the file as processed, even though no output wav is produced.
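One way to catch this, sketched here in Python under the assumption that the expected output paths are known in advance, is to check after the run that each output file exists and is not empty, instead of relying on the loop having completed. The function name check_outputs is hypothetical.

```python
import os
import tempfile


def check_outputs(expected_paths):
    """Return (path, status) pairs; status is 'ok', 'empty' or 'missing'."""
    report = []
    for path in expected_paths:
        if not os.path.isfile(path):
            status = "missing"
        elif os.path.getsize(path) == 0:
            status = "empty"
        else:
            status = "ok"
        report.append((path, status))
    return report


# Small demonstration with one good, one empty and one missing file.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "a.wav")
    with open(good, "wb") as handle:
        handle.write(b"RIFF")
    empty = os.path.join(tmp, "b.wav")
    open(empty, "wb").close()
    missing = os.path.join(tmp, "c.wav")
    statuses = [status for _, status in check_outputs([good, empty, missing])]

print(statuses)  # ['ok', 'empty', 'missing']
```

Only files with status 'ok' would then be reported as processed.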

divime_sad_noisemes() does not process all files

When using divime_sad_noisemes(), sometimes only one .rttm file is produced. The intermediate steps seem to work fine (.htk files are produced for all audio files).

detecting speech and non speech segments
/home/vagrant/launcher/noisemesSad.sh: line 68: 5030 Killed python yunified.py noisemes ${audio_dir} $chunksize
finished detecting speech and non speech segments

If the data folder contains only the file suspected of causing this behaviour, the .rttm file is still produced. However, its content seems to cover only roughly the first half of the recording (no records after 400 s, although the recording is about 800 s long).
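A truncated result like this could be flagged automatically. The following Python sketch assumes the usual RTTM column layout (onset in the fourth field, duration in the fifth); the function names, the example rows and the 0.9 threshold are hypothetical. It is only a heuristic, since a recording that ends in a long silence would be flagged as well.

```python
def last_annotated_second(rttm_lines):
    """Return the latest end time (onset + duration) found in the rttm rows."""
    latest = 0.0
    for line in rttm_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed rows
        onset, duration = float(fields[3]), float(fields[4])
        latest = max(latest, onset + duration)
    return latest


def looks_truncated(rttm_lines, recording_duration, min_coverage=0.9):
    """Flag output whose last annotation ends well before the recording does."""
    return last_annotated_second(rttm_lines) < min_coverage * recording_duration


# Made-up rows for illustration; the last one ends at 400 s.
example = [
    "SPEAKER testfile 1 12.0 3.5 <NA> <NA> speech <NA> <NA>",
    "SPEAKER testfile 1 396.0 4.0 <NA> <NA> speech <NA> <NA>",
]
latest = last_annotated_second(example)
truncated = looks_truncated(example, recording_duration=800.0)
print(latest, truncated)  # 400.0 True
```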

From the message above, the problem seems to be related to yunitator.

One potentially dangerous workaround would be to do the processing inside divime_sad_noisemes() in a loop, i.e. to handle the audio files one by one: create a temp folder with a single audio file, run noisemesSad, move the .rttm out, replace the audio with the next file, run noisemesSad again, and so on. The important thing is to catch and log this behaviour somehow.
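The loop described above might look roughly like the following Python sketch. It is not the package's implementation: run_tool stands in for whatever call runs noisemesSad on a folder and is a hypothetical callable, as is the assumption that the tool writes a .rttm file named after the audio file into the same folder.

```python
import logging
import os
import shutil
import tempfile


def process_one_by_one(audio_files, run_tool, out_dir):
    """Run run_tool on each audio file in isolation and collect the failures.

    run_tool(folder) is a hypothetical callable that processes the audio in
    folder and is assumed to write <name>.rttm next to it.
    """
    failed = []
    for audio in audio_files:
        stem = os.path.splitext(os.path.basename(audio))[0]
        # A fresh temp folder per file, so one bad file cannot affect the rest.
        with tempfile.TemporaryDirectory() as work:
            shutil.copy(audio, work)
            try:
                run_tool(work)
            except Exception as err:
                logging.warning("tool raised for %s: %s", audio, err)
            rttm = os.path.join(work, stem + ".rttm")
            if os.path.isfile(rttm):
                # Move the result out before the temp folder is deleted.
                shutil.move(rttm, os.path.join(out_dir, stem + ".rttm"))
            else:
                logging.warning("no .rttm produced for %s", audio)
                failed.append(audio)
    return failed
```

The returned list names the files for which no .rttm appeared, so the failure is recorded rather than passing silently. Note that this only detects a missing file, not a truncated one.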
