gobbios / avutils
DiViMe interface and utilities dealing with audio and video files
Some of the tools fail with audio files whose names contain spaces or parentheses.
divime_diarization() fails when the corresponding SAD file is empty.
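One way to guard against this failure is to check the SAD file before starting diarization. The following is a minimal sketch in Python, not the package's own code; the helper name sad_has_segments is hypothetical, and it assumes that an "empty" SAD file means a file with no non-blank lines.

```python
import os


def sad_has_segments(path):
    """Return True if the SAD file exists and has at least one non-blank line.

    Hypothetical helper: a caller could skip diarization (and log the file)
    whenever this returns False.
    """
    if not os.path.isfile(path):
        return False
    with open(path, encoding="utf-8") as handle:
        return any(line.strip() for line in handle)
```

A caller would then skip the file and record it in a log instead of passing it on to the diarization step.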
split_duration() returns nonsensical results when the end point of an annotation is in fact a duration, as is the case for most .rttm output formats from the divime tools (but not for ELAN, which does indeed work with two time stamps).
To clarify, this is only relevant for annotations that 'cross' over two time periods.
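The distinction between an end time and a duration can be illustrated with a small sketch. This is not the package's implementation; the function name split_annotation and its arguments are hypothetical, and it assumes a single boundary between two periods, with all values in seconds.

```python
def split_annotation(onset, duration, boundary):
    """Split one onset/duration annotation at a period boundary.

    Returns a list of (onset, duration) tuples. The duration is first
    converted to an absolute end time, which is the step that goes wrong
    if the second value is mistaken for an end time.
    """
    end = onset + duration
    if onset < boundary < end:
        # The annotation crosses the boundary: cut it into two pieces.
        return [(onset, boundary - onset), (boundary, end - boundary)]
    # The annotation lies entirely within one period: leave it unchanged.
    return [(onset, duration)]
```

For ELAN-style input with two time stamps, the same logic would apply after setting duration = end - onset.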
audio_info() and convert_audio() fail if the input is mp3. It seems that sox lacks out-of-the-box support for mp3, whereas ffmpeg does support it. Hence, both functions need to be modified either to allow a choice between sox and ffmpeg or to switch everything to ffmpeg right away. The latter seems the more reasonable approach.
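As a rough illustration of the ffmpeg-only route, the sketch below builds the command lines that a conversion function and an info function might run. The function names and the default values (16 kHz, mono) are assumptions made for this example, not settings taken from the package; the ffmpeg and ffprobe options themselves are standard.

```python
def ffmpeg_convert_cmd(src, dst, sample_rate=16000, channels=1):
    """Build an ffmpeg command that converts src (e.g. mp3) to dst (e.g. wav).

    -y overwrites an existing output file, -ar sets the sample rate
    and -ac sets the number of channels.
    """
    return ["ffmpeg", "-y", "-i", src,
            "-ar", str(sample_rate), "-ac", str(channels), dst]


def ffprobe_duration_cmd(src):
    """Build an ffprobe command that prints only the duration in seconds."""
    return ["ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1", src]
```

Each list can be handed to subprocess.run() as it is, so the same code path would serve mp3 and wav input alike.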
I am trying to install this package on Windows, but I am running into the following problem, which stops the installation.
Warning: newline within quoted string at elan2rttm.Rd:40
Error in parse_Rd("Mydirectory/Rbuild218c55e07db0/avutils/man/elan2rttm.Rd", :
Unexpected end of input (in " quoted string opened at elan2rttm.Rd:79:39)
Execution halted
extract_audio() fails with file names like 0103(2).mp4. This also applies to video_info() and apparently affects everything that relies on ffmpeg.
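A likely cause, though this is an assumption about how the commands are assembled, is that unquoted file names are passed through a shell, where parentheses and spaces have special meaning. The sketch below shows one way to quote such names when building a shell command string; the function name build_shell_command is hypothetical.

```python
import shlex


def build_shell_command(src, dst):
    """Build an ffmpeg shell command string with safely quoted file names.

    shlex.quote wraps names containing spaces, parentheses or other
    shell-special characters in single quotes.
    """
    return "ffmpeg -i {} {}".format(shlex.quote(src), shlex.quote(dst))


# Usage example with the kind of file name that triggers the failure.
print(build_shell_command("0103(2).mp4", "0103(2).wav"))
```

Passing the arguments as a list to the process-spawning function, without an intermediate shell, would avoid the need for quoting altogether.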
The noisemes SAD file contains one row with one speech entry
Here is the output:
vagrant ssh -c 'diartk.sh data/ noisemesSad'
wavs and transcriptions found !
Tests finished
treating testfile6
WARNING for /vagrant/data/tmp.rwYWRUqvBX/testfile6.fea: replacing HCopy htconfig with SMILExtract MFCC12_E_D_A is untested
(MSG) [2] in SMILExtract : openSMILE starting!
(MSG) [2] in SMILExtract : config file is: /home/vagrant/repos/opensmile-2.3.0/config/MFCC12_E_D_A.conf
(MSG) [2] in cComponentManager : successfully registered 96 component types.
(MSG) [2] in instance 'lldcsvsink' : No filename given, disabling this sink component.
(MSG) [2] in instance 'lldarffsink' : No filename given, disabling this sink component.
(MSG) [2] in cComponentManager : successfully finished createInstances
(16 component instances were finalised, 1 data memories were finalised)
(MSG) [2] in cComponentManager : starting single thread processing loop
(MSG) [2] in cComponentManager : Processing finished! System ran for 549 ticks.
cp: cannot stat '/vagrant/data/tmp.rwYWRUqvBX/testfile6.rttm': No such file or directory
Connection to 127.0.0.1 closed.
If a video format is not recognized by ffmpeg, extract_audio() will still report such files as processed without producing any output wav.
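One possible fix is to decide whether a file counts as processed by inspecting the expected output rather than by the fact that the command was issued. A minimal sketch, with hypothetical function names and the assumption that a valid wav file is never zero bytes long:

```python
import os


def output_ok(path):
    """Return True if the expected output file exists and is not empty."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


def summarise(outputs):
    """Split the expected output paths into produced and missing files."""
    done = [p for p in outputs if output_ok(p)]
    failed = [p for p in outputs if not output_ok(p)]
    return done, failed
```

Only the files in the first list would then be reported as processed; the second list could be turned into a warning.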
When using divime_sad_noisemes(), sometimes only one .rttm file is produced. The intermediate steps seem to work fine (.htk files are produced for all audio files).
detecting speech and non speech segments
/home/vagrant/launcher/noisemesSad.sh: line 68: 5030 Killed python yunified.py noisemes ${audio_dir} $chunksize
finished detecting speech and non speech segments
If the data folder contains only the file suspected of being the cause of this behaviour, the .rttm file is still produced. But it seems that the content of this file reflects only roughly the first half of the recording (no records after 400s, although the recording is about 800s long).
From the message above, the problem seems to be related to yunitator.
One potentially dangerous work-around would be to do the processing inside divime_sad_noisemes() with a loop, i.e. handling the audio files one by one: create a temp folder containing a single audio file, run noisemesSad, move the .rttm out, replace the audio with the next file, run noisemesSad again, and so on. The important thing is to catch and log this behaviour somehow.
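The one-by-one loop described above might look roughly like the following sketch. It is an illustration only: process_one_by_one and run_tool are hypothetical names, and run_tool stands in for whatever call actually launches noisemesSad on a folder. The sketch records which files yield no .rttm, but it would not detect an .rttm that covers only part of a recording.

```python
import os
import shutil
import tempfile


def process_one_by_one(audio_files, out_dir, run_tool):
    """Run a folder-based tool on each audio file in isolation.

    run_tool is a callable that takes a folder path and is expected to
    write .rttm files into it. Returns a dict mapping each audio file
    to a status string, which can serve as a log.
    """
    os.makedirs(out_dir, exist_ok=True)
    results = {}
    for audio in audio_files:
        with tempfile.TemporaryDirectory() as work:
            shutil.copy(audio, work)
            try:
                run_tool(work)
            except Exception as err:
                results[audio] = "error: {}".format(err)
                continue
            produced = [f for f in os.listdir(work) if f.endswith(".rttm")]
            if not produced:
                results[audio] = "no rttm produced"
                continue
            for name in produced:
                shutil.move(os.path.join(work, name),
                            os.path.join(out_dir, name))
            results[audio] = "ok"
    return results
```

Copying every file into a fresh folder costs extra disk I/O, which is part of why this work-around is better treated as a diagnostic aid than as a permanent solution.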
For some video formats, video_info() does not recognise the video's resolution and frame rate, which results in warnings.