Comments (29)

steveway avatar steveway commented on September 3, 2024 1

Thanks, I'll take a look at that.
It seems that most people want to do this while the audio is playing, so they work with the current playback buffer.
I also found this, which seems to be a workaround for getting at (live) audio data from QMediaPlayer:
https://stackoverflow.com/questions/23339741/audio-visualization-with-qmediaplayer
But we really need the complete raw data directly. Having to play back the audio first in order to generate the visuals would be quite inconvenient, both for usability and for the code.
I've already got QMediaPlayer to load and play a WAV file. That was a bit annoying, because it loads asynchronously and I had to find out how to get a signal once loading has finished.
(I have to call processEvents manually while waiting, or else it does nothing and hangs.)
That part seems to work now. I'll need to do some more testing, but it looks promising.
The more important part is getting to the data itself so we can use it to generate our waveform.
I'll do a few more tests and will push the small loading part I have to my fork today.

from papagayo-ng.

blackwarthog avatar blackwarthog commented on September 3, 2024 1

steveway avatar steveway commented on September 3, 2024 1

If you look into my newest commit then you can see that we might have the data now.
The data doesn't look bad and could be correct, and the way to get it sounds logical.

blackwarthog avatar blackwarthog commented on September 3, 2024 1

steveway avatar steveway commented on September 3, 2024 1

After testing a bit more and comparing the waveform to what Audacity shows, it looks pretty good actually.
See also this picture:
[image: papagayo-comparision]
But QAudioBuffer should still be changed so that one doesn't need to juggle around with buffers like this.

steveway avatar steveway commented on September 3, 2024 1

And I just realized why it does not always look correct.
Opening the tutorial file "vista.pgo" and looking at tempdata.format() shows that it uses a different format:
<PySide2.QtMultimedia.QAudioFormat(11025Hz, 8bit, channelCount=1, sampleType=UnSignedInt, byteOrder=LittleEndian, codec="audio/pcm") at 0x0000016C01823E48>
So, of course, we need to change the casting depending on the format!

For comparison, here is what I get from "lame.pgo":
<PySide2.QtMultimedia.QAudioFormat(16000Hz, 16bit, channelCount=1, sampleType=SignedInt, byteOrder=LittleEndian, codec="audio/pcm") at 0x0000016C016123C8>
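
To illustrate what "casting depending on the format" could look like, here is a minimal sketch using only the standard library. The helper name decode_pcm and its parameters are hypothetical and not taken from the fork; it assumes single-channel integer PCM, as in both formats printed above. Note that unsigned samples rest at the midpoint (128 for 8-bit) rather than at 0, which may be one reason an 8-bit file looks wrong when it is treated like signed data.

```python
import struct


def decode_pcm(raw: bytes, bits: int, signed: bool, little_endian: bool = True) -> list:
    """Decode raw integer PCM bytes into a list of samples centred on zero."""
    # struct codes for the sample sizes handled by this sketch (hypothetical table).
    codes = {8: "b", 16: "h", 32: "i"}
    if bits not in codes:
        raise ValueError(f"unsupported sample size: {bits}")
    code = codes[bits] if signed else codes[bits].upper()
    width = bits // 8
    count = len(raw) // width
    fmt = ("<" if little_endian else ">") + str(count) + code
    # Ignore any trailing bytes that do not form a whole sample.
    samples = struct.unpack(fmt, raw[: count * width])
    if not signed:
        # Unsigned PCM is offset by half the range; shift it so silence is 0.
        midpoint = 1 << (bits - 1)
        samples = tuple(s - midpoint for s in samples)
    return list(samples)


# 8-bit unsigned, like the vista file: 0, 128 and 255 map to -128, 0 and 127.
print(decode_pcm(bytes([0, 128, 255]), 8, False))  # [-128, 0, 127]
```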

morevnaproject avatar morevnaproject commented on September 3, 2024

What I have found so far (C++ examples that use waveforms with QtMultimedia):

  1. https://blog.qt.io/blog/2010/05/18/qtmultimedia-in-action-a-spectrum-analyser/ (old tutorial)
  2. https://www.youtube.com/watch?v=-QE2vQxhcf0 (demo video)
  3. Qt 5 example - http://doc.qt.io/qt-5/qtmultimedia-multimedia-spectrum-example.html

steveway avatar steveway commented on September 3, 2024

Alright, in my fork we now have working audio file loading and playback using only Qt.
I tested it with a .wav file that previously worked, a .wav file that previously did not, and an .mp3 file.
It plays all 3 correctly.
The play marker also moves correctly along with the playback.
You can also double-click on a word or sentence again and it will play only that part.
Now the bigger problem is getting the raw data so we can create our waveform from it.

morevnaproject avatar morevnaproject commented on September 3, 2024

I have talked to @blackwarthog and he suggests using https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioDecoder.html

He proposes calling QAudioDecoder.read() to read the buffers one by one. It returns a QAudioBuffer, and you can access its data by calling QAudioBuffer.data() - https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioBuffer.html#PySide2.QtMultimedia.PySide2.QtMultimedia.QAudioBuffer.data

morevnaproject avatar morevnaproject commented on September 3, 2024

As an alternative:
We can keep pydub/python-wave for reading waveform for now.
https://github.com/morevnaproject/papagayo-ng/blob/d9e711d888e4965a026072906c45137c14975bf3/SoundPlayer.py#L73-L82

sounddevice can be removed, as it is replaced by QMediaPlayer.
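
For reference, reading waveform samples with the standard wave module might look roughly like the sketch below. This is not the code at the linked lines of SoundPlayer.py; the function name read_wav_samples is hypothetical, and the sketch assumes a mono, uncompressed 8- or 16-bit PCM file.

```python
import io
import struct
import wave


def read_wav_samples(source):
    """Return (frame_rate, samples) for a mono 8- or 16-bit PCM WAV file.

    source may be a file name or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("this sketch only handles mono files")
        width = wav.getsampwidth()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    if width == 1:
        # 8-bit WAV data is unsigned; shift it so that silence sits at 0.
        return rate, [b - 128 for b in raw]
    if width == 2:
        # 16-bit WAV data is signed little-endian.
        count = len(raw) // 2
        return rate, list(struct.unpack("<%dh" % count, raw[: count * 2]))
    raise ValueError("unsupported sample width: %d" % width)
```

Because wave.open() also accepts file-like objects, the same function can be tried on an in-memory io.BytesIO buffer without touching the disk.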

morevnaproject avatar morevnaproject commented on September 3, 2024

With the QAudioDecoder solution, it should be possible to get a byte array from the buffer (buffer is a QAudioBuffer instance here):

length = buffer.byteCount()
buffer.constData().asarray(length)

...and then do all processing of bytes manually, depending on QAudioBuffer.format() - https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioBuffer.html#PySide2.QtMultimedia.PySide2.QtMultimedia.QAudioBuffer.format

morevnaproject avatar morevnaproject commented on September 3, 2024

Here I pushed a quick-and-dirty solution for drawing waveform using old method - https://github.com/morevnaproject/papagayo-ng/tree/qmediaplayer

e28c91a

morevnaproject avatar morevnaproject commented on September 3, 2024

...and then do all processing of bytes manually, depending on QAudioBuffer.format() - https://doc.qt.io/qtforpython/PySide2/QtMultimedia/QAudioBuffer.html#PySide2.QtMultimedia.PySide2.QtMultimedia.QAudioBuffer.format

Maybe it is possible to use audioop (http://code.i-harness.com/ru/docs/python~3.6/library/audioop) to manipulate that raw data.

steveway avatar steveway commented on September 3, 2024

I see, I've started adding this. It seems promising.
But constData and data do not seem to be completely implemented; they return a shiboken2.libshiboken.VoidPtr.
So we have to get the data out of these pointers somehow.
I'm not exactly sure how to do that, possibly with shiboken2.wrapInstance or maybe something from ctypes.
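
As a rough idea of the ctypes route, the sketch below reads 16-bit samples from a raw memory address. It assumes that an integer address and a byte count can somehow be obtained for the buffer, which this thread does not confirm for VoidPtr; the helper name read_int16_at is hypothetical, and a plain ctypes array stands in for the Qt buffer so the example runs on its own.

```python
import ctypes


def read_int16_at(address: int, byte_count: int) -> list:
    """Copy byte_count bytes starting at address out as native-endian int16 values."""
    count = byte_count // 2
    array_type = ctypes.c_int16 * count
    # from_address() only creates a view on existing memory; list() copies the
    # values, so the result stays valid after the original buffer is freed.
    view = array_type.from_address(address)
    return list(view)


# Stand-in for the audio buffer; it must stay alive while it is being read.
source = (ctypes.c_int16 * 4)(0, 1000, -1000, 32767)
print(read_int16_at(ctypes.addressof(source), ctypes.sizeof(source)))
# [0, 1000, -1000, 32767]
```

Reading from a wrong address or past the end of the buffer is undefined behaviour and can crash the interpreter, so the address and length have to come from a trustworthy source.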

morevnaproject avatar morevnaproject commented on September 3, 2024

But constData and data seem to be not completely implemented, they return a shiboken2.libshiboken.VoidPtr .

Did you test this with audio data loaded?

morevnaproject avatar morevnaproject commented on September 3, 2024

I think the following code will iterate through the audio samples
in 16-bit signed integer, little-endian format.

...or use audioop? - https://docs.python.org/3/library/audioop.html
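
The snippet referred to in this comment did not survive in this copy of the thread. As a stand-in, and not a reconstruction of the original, here is a minimal sketch of iterating over 16-bit signed little-endian samples, assuming the raw data is already available as a Python bytes object. The function name iter_int16_le is hypothetical.

```python
import struct


def iter_int16_le(raw: bytes):
    """Yield each 16-bit signed little-endian sample contained in raw."""
    # iter_unpack() needs a whole number of samples, so drop a trailing odd byte.
    usable = len(raw) - (len(raw) % 2)
    for (sample,) in struct.iter_unpack("<h", raw[:usable]):
        yield sample


# Three samples: 0, the largest positive value and the most negative value.
print(list(iter_int16_le(b"\x00\x00\xff\x7f\x00\x80")))  # [0, 32767, -32768]
```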

steveway avatar steveway commented on September 3, 2024

I tried using asarray like that at first, but that method does not seem to exist:
AttributeError: 'shiboken2.libshiboken.VoidPtr' object has no attribute 'asarray'

blackwarthog avatar blackwarthog commented on September 3, 2024

steveway avatar steveway commented on September 3, 2024

Hmm, .from_address() does not seem to be available.
The cffi version I currently have does look good, though.
With some luck we "only" need to calculate the RMS, possibly with the help of audioop, to get a good-looking waveform.
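
The RMS step mentioned here can be sketched in plain Python, which also avoids depending on audioop (that module was removed from the standard library in Python 3.13). The function name rms_envelope and the window parameter are hypothetical; the sketch assumes the samples are already decoded into a sequence of integers centred on zero.

```python
import math


def rms_envelope(samples, window):
    """Return one RMS value per block of `window` consecutive samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    envelope = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        # Root mean square: square each sample, average, then take the root.
        mean_square = sum(s * s for s in chunk) / len(chunk)
        envelope.append(math.sqrt(mean_square))
    return envelope


print(rms_envelope([3, -3, 3, -3, 0, 0], 2))  # [3.0, 3.0, 0.0]
```

Each value could then be drawn as the height of one bar of the waveform; a larger window gives a coarser but smoother picture.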

steveway avatar steveway commented on September 3, 2024

After experimenting some more, I'm pretty sure that the data I read using cffi is not correct.
I've opened a bug report with the Qt people here: https://bugreports.qt.io/browse/PYSIDE-934
Receiving a shiboken2.libshiboken.VoidPtr from the data() and constData() methods of a QAudioBuffer seems like unintended behaviour.
Maybe they can find a quick solution.

steveway avatar steveway commented on September 3, 2024

I tested this yesterday with "int16_t[{0}]" and "uint16_t[{0}]", and the data does look different compared to intptr_t. I guess it should be more correct than intptr_t.
I think a few more tests with the data are needed.
For example, we could try to play it back. If it sounds like the source, or at least similar, then we are on the right track.
I hope we are even able to get the data this way. Maybe ASLR is playing tricks on us here, or the shiboken VoidPtr addresses are different from cffi/ctypes pointers.
It would be best if the Qt people could change QAudioBuffer to return a usable data type.

morevnaproject avatar morevnaproject commented on September 3, 2024

Awesome!

steveway avatar steveway commented on September 3, 2024

I've added some detection so it should automatically choose the correct casts for signed and unsigned integers of different sizes.
But the results still don't look correct for vista.pgo and scared.pgo...

morevnaproject avatar morevnaproject commented on September 3, 2024

@blackwarthog Link to "vista.wav" file, which is not decoded properly - https://github.com/LostMoho/Papagayo/blob/master/installer/Papagayo/Tutorial%20Files/vista.wav?raw=true

steveway avatar steveway commented on September 3, 2024

I've made a few small changes: the number of bits and whether the samples are signed or unsigned can be read directly from the QAudioFormat object.
Before, I was parsing its string representation.
This still doesn't fix the bug with the vista test file.
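
A sketch of how those two pieces of information could be turned into a cast string of the "int16_t[{0}]" kind used earlier in this thread is shown below. The function name cffi_array_type is hypothetical; in the application the arguments would presumably come from QAudioFormat.sampleSize() and QAudioFormat.sampleType(), but the sketch takes plain values so it runs without Qt. Float sample types are not handled.

```python
def cffi_array_type(bits: int, signed: bool, count: int) -> str:
    """Build a C array type string such as 'int16_t[10]' for a cast."""
    if bits not in (8, 16, 32):
        raise ValueError("unsupported sample size: %d" % bits)
    if count < 0:
        raise ValueError("count must not be negative")
    # Unsigned fixed-width types only differ by the leading 'u'.
    prefix = "" if signed else "u"
    return "{0}int{1}_t[{2}]".format(prefix, bits, count)


print(cffi_array_type(16, True, 10))  # int16_t[10]
print(cffi_array_type(8, False, 4))   # uint8_t[4]
```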

steveway avatar steveway commented on September 3, 2024

It seems that Cristian Maureira-Fredes understands my explanation about QAudioBuffer and will try to add a method that makes it possible to get the data from Python!
https://bugreports.qt.io/browse/PYSIDE-934?focusedCommentId=448389&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-448389

steveway avatar steveway commented on September 3, 2024

Since the Qt version doesn't really seem to work on Linux (at least here in an Ubuntu 19.10 VM), I've modified the original version for that platform.
So with steveway@23b973e this seems to work on Linux too.
I'm able to load all the English test files correctly, and I've also tested loading an .mp3 file, which worked too.

luzpaz avatar luzpaz commented on September 3, 2024

Any progress on this?

steveway avatar steveway commented on September 3, 2024

Well, the current version I have in my GitHub master should mostly work.
On Windows it uses Qt, since that seems to work reliably there, but on OSX and Linux it uses different solutions.
I haven't had time to test this for a long time now.
I just did a few tests on Windows and fixed a small-ish bug with the play marker.
