So, there seem to be a few problems with the current handling of Audio Files.


Convert SoundPlayer to use QT functions about papagayo-ng HOT 29 OPEN

morevnaproject-org commented on September 3, 2024 1

Convert SoundPlayer to use QT functions

from papagayo-ng.

Comments (29)

steveway commented on September 3, 2024 1

Thanks, I'll take a look at that.
It seems that most people want to do that while playing the audio, so they are working with the current playback buffer.
I also found this, which seems to be a workaround to get to (live) audio data from qmediaplayer:
https://stackoverflow.com/questions/23339741/audio-visualization-with-qmediaplayer
But we kinda need the complete raw data directly, it would be quite inconvenient having to play back the audio first to generate the visuals, both from a usability and coding viewpoint.
I've already got qmediaplayer to load and play a wave file, which was a bit annoying because it loads asynchronously and I had to find out how to get a signal once loading has completed.
(I have to manually call processEvents while waiting, or else it will do nothing and hang.)
That part seems to work now, I'll need to do some more testing but it looks promising.
The more important part is getting to the data itself so we can use that to generate our waveform.
I'll do a few more tests and will post the small part for loading I have to my fork today.

from papagayo-ng.

blackwarthog commented on September 3, 2024 1

I see, I've started adding this. It seems promising. But constData and data seem to be not completely implemented, they return a shiboken2.libshiboken.VoidPtr . So, we have to somehow get the data from these pointers. Not exactly sure how to do that, possibly with shiboken2.wrapInstance or maybe something from ctypes.

You should use the 'asarray' method of shiboken2.libshiboken.VoidPtr with the byte count as its argument.

    data = mySoundBuffer.constData().asarray(mySoundBuffer.byteCount())
    byte1 = data[1]
    byte8011 = data[8011]

    # I think the following code will iterate through audio samples
    # in 16-bit signed integer little-endian format (two channels per frame).
    for i in range(0, len(data) - 3, 4):
        level1 = data[i + 1] * 256 + data[i + 0]
        if level1 >= 32768:
            level1 = 65536 - level1
        level2 = data[i + 3] * 256 + data[i + 2]
        if level2 >= 32768:
            level2 = 65536 - level2
        level = (level1 + level2) / 65536.0
        # do something
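The byte arithmetic above can also be left to the standard `struct` module. The following is only a sketch under assumptions: it takes a plain `bytes` object rather than a Qt buffer, it assumes interleaved stereo in 16-bit signed little-endian format, and the name `decode_s16le_stereo` and the test data are made up for illustration.

```python
import struct

def decode_s16le_stereo(raw: bytes):
    """Return one normalized level in [0.0, 1.0) per stereo frame."""
    frame_count = len(raw) // 4          # 2 channels x 2 bytes per sample
    levels = []
    for left, right in struct.iter_unpack("<hh", raw[:frame_count * 4]):
        # Average the absolute values of both channels, then normalize.
        levels.append((abs(left) + abs(right)) / 65536.0)
    return levels

# Hypothetical test data: two frames, (1000, -1000) and (-32768, 32767).
raw = struct.pack("<hhhh", 1000, -1000, -32768, 32767)
levels = decode_s16le_stereo(raw)
```

Letting `struct` handle the sign avoids the manual two's-complement step, which is easy to get wrong.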

from papagayo-ng.

steveway commented on September 3, 2024 1

If you look into my newest commit then you can see that we might have the data now.
The data doesn't look bad, could be correct, and the way to get it sounds logical.

from papagayo-ng.

blackwarthog commented on September 3, 2024 1

ffi.cast("intptr_t[{0}]".format(tempdata.sampleCount()), int(tempdata.constData()))

Is it valid to use an *array of* intptr_t? Maybe we need "int16_t[{0}]"?
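Why the element type matters can be shown without Qt. This is a sketch that swaps in `ctypes` for cffi and uses made-up sample values: the same bytes are viewed once as 16-bit samples and once as pointer-sized integers (`c_ssize_t` stands in for `intptr_t` here, which is an assumption that holds on common platforms).

```python
import ctypes
import struct

# Hypothetical data: eight 16-bit samples in native byte order (16 bytes).
raw = struct.pack("=8h", 100, -100, 200, -200, 300, -300, 400, -400)

# Correct view: one array element per 16-bit sample.
as_int16 = (ctypes.c_int16 * 8).from_buffer_copy(raw)

# Pointer-sized view: each element swallows several samples at once.
ptr_size = ctypes.sizeof(ctypes.c_void_p)
as_intptr = (ctypes.c_ssize_t * (len(raw) // ptr_size)).from_buffer_copy(raw)

int16_values = list(as_int16)
intptr_values = list(as_intptr)
```

With a pointer-sized element type, an array of `sampleCount()` elements would also cover more bytes than a 16-bit buffer actually holds, so the cast would read past the end of the data.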

from papagayo-ng.

steveway commented on September 3, 2024 1

After testing a bit more and comparing the waveform to what Audacity shows, it looks pretty good actually.
See also this picture:

But QAudioBuffer should still be changed so that one doesn't need to juggle around with buffers like this.

from papagayo-ng.

steveway commented on September 3, 2024 1

And I just realized why it does not always look correct.
Opening the Tutorial "vista.pgo" and looking at tempdata.format(), it tells us that this is in a different format:
<PySide2.QtMultimedia.QAudioFormat(11025Hz, 8bit, channelCount=1, sampleType=UnSignedInt, byteOrder=LittleEndian, codec="audio/pcm") at 0x0000016C01823E48>
So, we need to change the casting depending on the format of course!

For comparison, here is what I get from "lame.pgo":
<PySide2.QtMultimedia.QAudioFormat(16000Hz, 16bit, channelCount=1, sampleType=SignedInt, byteOrder=LittleEndian, codec="audio/pcm") at 0x0000016C016123C8>
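A minimal sketch of format-dependent decoding, assuming the sample size, signedness and byte order have already been read from the format object. It works on plain `bytes`; the table `STRUCT_CODES`, the function `decode_pcm` and the test data are hypothetical names and values, not code from the fork.

```python
import struct

# (bits per sample, signed?) -> struct format character.
# Only integer PCM layouts are covered in this sketch.
STRUCT_CODES = {
    (8, False): "B", (8, True): "b",
    (16, False): "H", (16, True): "h",
    (32, False): "I", (32, True): "i",
}

def decode_pcm(raw: bytes, bits: int, signed: bool, little_endian: bool = True):
    """Decode integer PCM bytes into floats in the range [-1.0, 1.0)."""
    code = STRUCT_CODES[(bits, signed)]
    size = bits // 8
    count = len(raw) // size
    order = "<" if little_endian else ">"
    values = struct.unpack(f"{order}{count}{code}", raw[:count * size])
    half = 1 << (bits - 1)               # 128 for 8-bit, 32768 for 16-bit
    if signed:
        return [v / half for v in values]
    # Unsigned PCM is centered on the midpoint, so shift it before scaling.
    return [(v - half) / half for v in values]

# Hypothetical data shaped like the two formats reported above.
vista_like = decode_pcm(bytes([0, 128, 255]), bits=8, signed=False)
lame_like = decode_pcm(struct.pack("<3h", -32768, 0, 16384), bits=16, signed=True)
```

The key difference for the 8-bit unsigned case is the midpoint shift: silence is stored as 128, not 0, so treating those bytes as signed values distorts the waveform.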

from papagayo-ng.

morevnaproject commented on September 3, 2024

What I have found so far (examples utilizing waveforms in QTMultimedia for C++):

https://blog.qt.io/blog/2010/05/18/qtmultimedia-in-action-a-spectrum-analyser/ (old tutorial)
https://www.youtube.com/watch?v=-QE2vQxhcf0 (demo video)
QT5 example - http://doc.qt.io/qt-5/qtmultimedia-multimedia-spectrum-example.html

from papagayo-ng.

steveway commented on September 3, 2024

Alright, in my fork we have working audio file loading and playback only using QT.
I tested it with two .wav files, one that previously worked correctly and one that did not, and with an .mp3 file.
It plays all 3 correctly.
The play marker also moves along correctly.
Also you can now again double click on a word or sentence and it will only play that part.
Now the bigger problem is getting the raw data to create our waveform from that.

from papagayo-ng.

morevnaproject commented on September 3, 2024

I have talked to @blackwarthog and he suggests to use https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioDecoder.html

He proposes to call QAudioDecoder.read() to read buffers one-by-one. It returns QAudioBuffer and you can access its data by calling QAudioBuffer.data() - https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioBuffer.html#PySide2.QtMultimedia.PySide2.QtMultimedia.QAudioBuffer.data

from papagayo-ng.

morevnaproject commented on September 3, 2024

As an alternative:
We can keep pydub/python-wave for reading waveform for now.
https://github.com/morevnaproject/papagayo-ng/blob/d9e711d888e4965a026072906c45137c14975bf3/SoundPlayer.py#L73-L82

sounddevice can be removed, as it is replaced by QMediaPlayer.
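For reference, reading raw samples with the standard `wave` module could look roughly like this. It is a sketch, not the code at the linked lines: `read_wave_samples` is a hypothetical helper, and the WAV is built in memory so the example runs without a file on disk.

```python
import io
import struct
import wave

def read_wave_samples(source):
    """Return (frame rate, channel count, sample width, raw frames) of a WAV."""
    with wave.open(source, "rb") as wav:
        frames = wav.readframes(wav.getnframes())
        return wav.getframerate(), wav.getnchannels(), wav.getsampwidth(), frames

# Build a tiny in-memory WAV so the sketch runs without any file on disk.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)                  # 16-bit samples
    wav.setframerate(16000)
    wav.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))
buffer.seek(0)

rate, channels, width, frames = read_wave_samples(buffer)
samples = struct.unpack("<4h", frames)
```

The `wave` module only handles uncompressed WAV data, so .mp3 files would still need pydub or another decoder.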

from papagayo-ng.

morevnaproject commented on September 3, 2024

For the QAudioDecoder solution, it is possible to get a byte array from the buffer:

length = QAudioBuffer.byteCount()
QAudioBuffer.constData().asarray(length)

...and then do all processing of bytes manually, depending on QAudioBuffer.format() - https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioBuffer.html#PySide2.QtMultimedia.PySide2.QtMultimedia.QAudioBuffer.format

from papagayo-ng.

morevnaproject commented on September 3, 2024

Here I pushed a quick-and-dirty solution for drawing waveform using old method - https://github.com/morevnaproject/papagayo-ng/tree/qmediaplayer

e28c91a

from papagayo-ng.

morevnaproject commented on September 3, 2024

...and then do all processing of bytes manually, depending on QAudioBuffer.format() - https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioBuffer.html#PySide2.QtMultimedia.PySide2.QtMultimedia.QAudioBuffer.format

Maybe it is possible to use audioop (http://code.i-harness.com/ru/docs/python~3.6/library/audioop) to manipulate that raw data.

from papagayo-ng.

steveway commented on September 3, 2024

I see, I've started adding this. It seems promising.
But constData and data seem to be not completely implemented, they return a shiboken2.libshiboken.VoidPtr .
So, we have to somehow get the data from these pointers.
Not exactly sure how to do that, possibly with shiboken2.wrapInstance or maybe something from ctypes.

from papagayo-ng.

morevnaproject commented on September 3, 2024

But constData and data seem to be not completely implemented, they return a shiboken2.libshiboken.VoidPtr .

Did you test this with audio data loaded?

from papagayo-ng.

morevnaproject commented on September 3, 2024

I think the following code will iterate through audio samples
in format 16-bit signed integer little endian.

...or use audioop? - https://docs.python.org/3/library/audioop.html

from papagayo-ng.

steveway commented on September 3, 2024

I tried using asarray like that at first, but that method does not seem to exist:
AttributeError: 'shiboken2.libshiboken.VoidPtr' object has no attribute 'asarray'

from papagayo-ng.

blackwarthog commented on September 3, 2024

Try to use this construction:

    import ctypes
    x = (ctypes.c_uint64 * 1000).from_address(0x12345678)

For our case:

    import ctypes
    ...
    data = myAudioBuffer.constData()
    count = myAudioBuffer.sampleCount()
    channels = myAudioBuffer.sampleCount() // myAudioBuffer.frameCount()
    x = (ctypes.c_int16 * count).from_address(data)  # also try: int(data)
    ...
    value1223 = abs(x[1223]) / 32768.0
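The `from_address` idea can be tried in isolation. In this sketch a `ctypes` array stands in for the decoder's memory, so the address is known to be valid. Whether an integer obtained from a shiboken VoidPtr behaves the same way is an assumption this example does not test.

```python
import ctypes

# Stand-in for the decoder's memory: a ctypes array we own, so the address
# stays valid for as long as `backing` is alive.
backing = (ctypes.c_int16 * 4)(0, 16384, -16384, 32767)
address = ctypes.addressof(backing)      # plain Python int

# Re-interpret the same memory through its address, as proposed above.
view = (ctypes.c_int16 * 4).from_address(address)
levels = [abs(v) / 32768.0 for v in view]

# Copy the bytes out so the data survives after the source buffer is freed.
copied = ctypes.string_at(address, ctypes.sizeof(backing))
```

`from_address` does not copy anything, so the view is only safe while the underlying buffer exists. Copying with `ctypes.string_at` is the more defensive choice when buffers are read one by one.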

from papagayo-ng.

steveway commented on September 3, 2024

Mhh, .from_adress() seems to not be available.
The cffi version I currently have does look good.
With some luck we "only" need to calculate the rms, possibly with the help of audioop, to get a good looking waveform.
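A pure-Python sketch of the RMS idea, assuming the samples have already been decoded into normalized floats. The function name `rms_envelope` and the chunk size are hypothetical. It avoids audioop, which was deprecated in Python 3.11 and removed in 3.13.

```python
import math

def rms_envelope(samples, chunk_size):
    """Return one RMS value per chunk of `chunk_size` samples."""
    envelope = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        mean_square = sum(s * s for s in chunk) / len(chunk)
        envelope.append(math.sqrt(mean_square))
    return envelope

# Hypothetical normalized samples: a loud chunk followed by a quiet one.
samples = [0.5, -0.5, 0.5, -0.5, 0.1, -0.1, 0.1, -0.1]
envelope = rms_envelope(samples, chunk_size=4)
```

One RMS value per chunk gives one bar of the waveform, so the chunk size would be chosen from the sample rate and the number of bars drawn per second.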

from papagayo-ng.

steveway commented on September 3, 2024

After experimenting some more, I'm pretty sure that the data I read using cffi is not correct.
I've opened a bug for the QT people here: https://bugreports.qt.io/browse/PYSIDE-934
Because receiving a shiboken2.libshiboken.VoidPtr from the data() and constData() of a QAudioBuffer seems like unintended behaviour.
Maybe they can find a quick solution.

from papagayo-ng.

steveway commented on September 3, 2024

I tested this yesterday with "int16_t[{0}]" and "uint16_t[{0}]" and the data does look different compared to intptr_t. I guess it should be more correct than intptr_t.
I think a few more tests with the data are needed.
For example we could try to play it back, if it sounds like the source or somehow similar then we are on a good track.
I hope we are even able to get the data this way; maybe ASLR is playing tricks on us here, or the shiboken VoidPtr addresses differ from cffi/ctypes pointers.
The best would be if the QT people could change QAudioBuffer to return a usable datatype.

from papagayo-ng.

morevnaproject commented on September 3, 2024

Awesome!

from papagayo-ng.

steveway commented on September 3, 2024

I've added some detection so it should automatically choose the correct casts for signed and unsigned Integers of different sizes.
But the results still don't look correct for vista.pgo and scared.pgo...

from papagayo-ng.

morevnaproject commented on September 3, 2024

@blackwarthog Link to "vista.wav" file, which is not decoded properly - https://github.com/LostMoho/Papagayo/blob/master/installer/Papagayo/Tutorial%20Files/vista.wav?raw=true

from papagayo-ng.

steveway commented on September 3, 2024

I've done a few small changes, the information for the number of bits and whether they are signed or unsigned can be gotten directly from the QAudioFormat object.
I was parsing the string representation of it before.
This still doesn't fix the bug with the vista test file.

from papagayo-ng.

steveway commented on September 3, 2024

It seems that Cristian Maureira-Fredes understands my explanation about QAudioBuffer and will try to add a method that allows getting the data from Python!
https://bugreports.qt.io/browse/PYSIDE-934?focusedCommentId=448389&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-448389

from papagayo-ng.

steveway commented on September 3, 2024

Since the QT version doesn't really seem to work on Linux (at least here in an Ubuntu 19.10 VM), I've modified the original version for that.
So with steveway@23b973e this seems to work on Linux too.
I'm able to load all the english test files correctly and I've also tested loading a mp3 file which worked too.

from papagayo-ng.

luzpaz commented on September 3, 2024

Any progress on this ?

from papagayo-ng.

steveway commented on September 3, 2024

Well, the current version I have in my Github master should work mostly.
For Windows it uses QT since that seems to work reliably, but for OSX and Linux it uses different solutions.
I haven't had time to test this for a long time now.
I just did a few tests on Windows and fixed a small-ish bug with the play marker.

from papagayo-ng.
