timmahrt / praatio

A python library for working with praat, textgrids, time aligned audio transcripts, and audio files. It is primarily used for extracting features from and making manipulations on audio files given hierarchical time-aligned transcriptions (utterance > word > syllable > phone, etc).

License: MIT License

Python 89.04% Jupyter Notebook 10.96%

praatio's Issues

Creating textgrid with data

Hi, I'm new to praat and after following your tutorial I've created a Textgrid that looks like this:

bobby.TextGrid
File type = "ooTextFile short"
Object class = "TextGrid"

0.0
1.194625

1
"IntervalTier"
"phoneme"
0.0
1.194625
1
0.0
1.194625
""

However, when I try to print data from this textgrid, it doesn't work. When I use bobby_phone.TextGrid, for example, it works. My question is: do I need praat.exe to create the textgrid manually?

Bobby_phone.Textgrid:
File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0.0
xmax = 1.194625
tiers? <exists>
size = 1
item []:
item [1]:
class = "IntervalTier"
name = "phone"
xmin = 0.0
xmax = 1.18979591837
intervals: size = 15
intervals [1]:
xmin = 0.0124716553288
xmax = 0.06469123242311078
text = ""
intervals [2]:
xmin = 0.06469123242311078
xmax = 0.08438971390281873
text = "B"
intervals [3]:
xmin = 0.08438971390281873
xmax = 0.23285789838876556
text = "AA1"
intervals [4]:
xmin = 0.23285789838876556
xmax = 0.2788210218414174
text = "B"
intervals [5]:
xmin = 0.2788210218414174
xmax = 0.41156462585
text = "IY0"
intervals [6]:
xmin = 0.41156462585
xmax = 0.47094510353588265
text = "R"
intervals [7]:
xmin = 0.47094510353588265
xmax = 0.521315192744
text = "IH1"
intervals [8]:
xmin = 0.521315192744
xmax = 0.658052967538796
text = "PT"
intervals [9]:
xmin = 0.658052967538796
xmax = 0.680952380952
text = "DH"
intervals [10]:
xmin = 0.680952380952
xmax = 0.740816326531
text = "AH0"
intervals [11]:
xmin = 0.740816326531
xmax = 0.807647261005538
text = "L"
intervals [12]:
xmin = 0.807647261005538
xmax = 0.910430839002
text = "EH1"
intervals [13]:
xmin = 0.910430839002
xmax = 0.980272108844
text = "JH"
intervals [14]:
xmin = 0.980272108844
xmax = 1.1171482864527198
text = "ER0"
intervals [15]:
xmin = 1.1171482864527198
xmax = 1.18979591837
text = ""

Change pitch and formant

Is it possible with praatIO? I didn't find this in the documentation.

xsampa.py lack of license

praatio/utilities/xsampa.py is listed as "does not carry any license". IANAL, but my understanding is that without any license, no one has permission to use it. It may be included in the source of the project with permission, but without an explicit license for the file no one else can know exactly how they are allowed to use it, if at all. Is there any way to contact the author and get a specific license for the file, such as MIT like the main project?

No tag for 3.8.1

Version 3.8.1 is released, but there is no tag on github. Would it be possible to add a tag for it?

Tags are useful for downstreams since they let them get the full, exact upstream source, unlike sdists which only have some of the files. In this case, in particular the github tags include the tests while sdists don't.

PraatIO on conda-forge

Hi Tim, I added praatio to conda-forge as that was the one dependency that wasn't on there, the feedstock is here: https://github.com/conda-forge/praatio-feedstock. Would you like me to add you as a maintainer? I'm happy to keep it up to date as necessary for MFA as well.

Suggestion: namedtuples for entries

Thanks for creating this library. I've a suggestion for improvement: use named tuples for entries rather than plain tuples. Here's an example:

from collections import namedtuple
Entry = namedtuple('Entry', ['start', 'end', 'label'])

# Example
entry = Entry(0.0, 0.30, 'sound')
print(entry)

Result: Entry(start=0.0, end=0.3, label='sound')

The nice thing about this is that you don't have to do any indexing anymore, but you can just do entry.start to get the starting time.

Issues parsing TextGrids from ELAN

I've had a couple of users reporting issues with loading TextGrids exported from ELAN. The issue seems to be that the "item [1]" lines are formatted without a space ("item[1]"), so the parsing in https://github.com/timmahrt/praatIO/blob/master/praatio/tgio.py#L1896 fails. I think a reasonable fix would be something like re.split(r'item ?\[', data, flags=re.MULTILINE)[1:].

Looks like you're working on a 5.0, so don't know if that would be the place to fix it or if it would be better for me to submit a PR for the main branch.
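A minimal sketch of the tolerant split suggested above. The `split_tiers` helper and the sample strings are hypothetical; the pattern here also requires a tier number so that the bare "item []:" header is not treated as a tier, which is an assumption of this sketch and not taken from praatio's source.

```python
import re

# Matches both "item [1]:" (Praat) and "item[1]:" (ELAN export).
# Requiring digits skips the bare "item []:" header (assumption of this sketch).
TIER_HEADER = re.compile(r"item ?\[\d+\]:")


def split_tiers(data):
    """Return the text of each tier block, without its 'item [n]:' header."""
    return [chunk.strip() for chunk in TIER_HEADER.split(data)[1:]]


praat_style = (
    "item []:\n"
    "    item [1]:\n"
    '        name = "words"\n'
    "    item [2]:\n"
    '        name = "phones"\n'
)
# Simulate the ELAN formatting by dropping the space before the bracket
elan_style = praat_style.replace("item [", "item[")

print(split_tiers(praat_style))
print(split_tiers(elan_style))
```

Both inputs should split into the same two tier blocks.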

Was there an incompatible change in praatio==6.0.0? 'Textgrid' object has no attribute 'tierDict'

This problem occurred with praatio==6.0.0:

'Textgrid' object has no attribute 'tierDict'

`intersection`: issue on consecutive duplicate words

Today I encountered an issue with the behavior of intersection.

Say I have a WORD tier that looks like this:

UPON | A | A | TIME

And I have a PHONE tier that looks like this:

AH0 | P | AA1 | N | AH0 | EY1 | T | AY1 | M

Assuming these are time-aligned correctly, when I call intersection, I get a list that looks something like this:

['UPON-AH0', 'UPON-P', 'UPON-AA1', 'UPON-N', 'A-AH0', 'A-EY1', 'TIME-T', 'TIME-AY1', 'TIME-M']

Because I have two intervals in the WORD tier which have the same label, from this intersection I can't really tell if I have two distinct words "A" that have the respective transcriptions "AH0" and "EY1", or if I have one distinct word "A" transcribed as "AH0 EY1".

Obviously, there is no right way to solve this, but I would suggest that since we do know that the word entries are distinct, that perhaps instead the label should be the WORD label plus a tuple of all the PHONE labels that coincide with it. Something like this:

['UPON-(AH0, P, AA1, N)', 'A-(AH0)', 'A-(EY1)', 'TIME-(T, AY1, M)']

This would also mean that the interval boundaries would be the boundaries of the left-hand side tier. So my example would be for

word_tier.intersection(phone_tier)

If you instead did

phone_tier.intersection(word_tier)

you would get

['AH0-UPON', 'P-UPON', 'AA1-UPON', 'N-UPON', 'AH0-A', 'EY1-A', 'T-TIME', 'AY1-TIME', 'M-TIME']

What do you think?
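A rough sketch of the grouping proposed above, written against plain namedtuples rather than praatio's tier classes. The `group_by_container` name, the `Interval` tuple, and the timestamps are all made up for illustration; this is not how praatio's intersection is implemented.

```python
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end", "label"])


def group_by_container(outer, inner):
    """Label each outer interval with the inner labels it fully contains.

    The boundaries of the result are those of the outer (left-hand) tier.
    """
    grouped = []
    for o in outer:
        labels = [i.label for i in inner if i.start >= o.start and i.end <= o.end]
        grouped.append(Interval(o.start, o.end, f"{o.label}-({', '.join(labels)})"))
    return grouped


# Hypothetical alignment for "UPON A A TIME"
words = [
    Interval(0.0, 0.4, "UPON"),
    Interval(0.4, 0.5, "A"),
    Interval(0.5, 0.6, "A"),
    Interval(0.6, 0.9, "TIME"),
]
phones = [
    Interval(0.0, 0.1, "AH0"), Interval(0.1, 0.2, "P"),
    Interval(0.2, 0.3, "AA1"), Interval(0.3, 0.4, "N"),
    Interval(0.4, 0.5, "AH0"), Interval(0.5, 0.6, "EY1"),
    Interval(0.6, 0.7, "T"), Interval(0.7, 0.8, "AY1"),
    Interval(0.8, 0.9, "M"),
]

grouped_labels = [entry.label for entry in group_by_container(words, phones)]
print(grouped_labels)
```

Because each word keeps its own interval, the two "A" entries stay distinct in the result.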

Editing timestamps to obtain new features

Hi,

I am new to Praatio and have been using it over the last few weeks for a project. I am trying to shift the boundaries of every interval in a particular interval tier so that each interval starts 1.5 seconds earlier and ends 1.5 seconds later, expanding each interval by up to 3 seconds. I then want to extract a set of prosodic features for the new intervals.

I see that there is a way to edit the timestamps, but I'm not entirely sure how to go about it. The tutorial only mentions it without giving a working example. Any suggestions as to how I can make this work?

n.b. I am a novice when it comes to Python (and coding in general) so sorry in advance if my question is too overgeneralized and/or easily solvable.

Thank you in advance :D
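One way to think about the padding step, sketched with plain namedtuples instead of praatio objects. The `pad_intervals` helper and the sample values are hypothetical. Padded intervals can overlap their neighbours, so this sketch treats them as separate analysis windows rather than writing them back into an interval tier.

```python
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end", "label"])


def pad_intervals(entries, pad, min_t, max_t):
    """Widen each interval by `pad` seconds on both sides, clamped to [min_t, max_t]."""
    return [
        Interval(max(min_t, e.start - pad), min(max_t, e.end + pad), e.label)
        for e in entries
    ]


# Made-up entries from a 10-second recording
entries = [Interval(1.0, 2.0, "a"), Interval(9.0, 9.5, "b")]
padded = pad_intervals(entries, pad=1.5, min_t=0.0, max_t=10.0)
print(padded)
```

The clamping is why the expansion is "up to" 3 seconds: intervals near the start or end of the file gain less.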

empty textgrid

Hello!

I'm trying to follow the PraatIO tutorial: my objective is to segment my sentences (one file.wav = one recorded FRENCH sentence) and to extract the onsets of words/syllables/phonemes.

The first example in the tutorial gives the word segmentation, but when I try it, I obtain an empty textgrid:

# I just want the labels from the entries
labelList = [entry.label for entry in wordTier.entries]
print(labelList)

# Get the duration of each interval
# (in this example, an interval is a word, so this outputs word duration)
durationList = []
for start, stop, _ in wordTier.entries:
    durationList.append(stop - start)

print(durationList)

output: [] []

and actually, this is what I have in my file.textgrid:

File type = "ooTextFile"
Object class = "TextGrid"

0
5.098503401360544

1
"IntervalTier"
"words"
0
5.098503401360544
1
0
5.098503401360544
""

Did I miss something? Maybe I misunderstood the tutorial?

Your help would be very valuable to me!!

Some API suggestions for next major release

As I'm going through to make MFA compatible with v6.0, one thing that sticks out to me is that several aspects of PraatIO could be more pythonic and make it easier for integrating with other packages.

Use PEP8 snake case rather than camel case (e.g. Textgrid.tier_names instead of TextGrid.tierNames). This might cause a bit of desync with the underlying TextGrid parameters, but it would also be nice to unify TextGrid.minTimestamp, TextGrid.maxTimestamp with IntervalTier.minT, IntervalTier.maxT. (I'll stick to using camel case for consistency in my other thoughts)
Add magic functions to TextGrid classes
- __len__ so len(textgrid_object) returns len(textgrid_object._tierDict), len(interval_tier) returns len(interval_tier.entries)
- __getitem__ so textgrid_object[tier_name] returns textgrid_object._tierDict[tier_name] and interval_tier[0] returns interval_tier.entries[0]
  - __setitem__ could call TextGrid.addTier for TextGrid and IntervalTier.insertEntry for IntervalTier with the defaults and, if there's an error, prompt the user to use the more fully featured function. Alternatively, have separate IntervalTier.append and IntervalTier.insert methods that mirror Python list functionality (an interval doesn't necessarily have to be added at the end, but that's usually how I think about TextGrid creation)
  - It might be good to split IntervalTier.insertEntry into three functions rather than have a string switch on its behavior, so something like IntervalTier.insertEntry, IntervalTier.mergeEntry, IntervalTier.replaceEntry, so it's easier for IDE autocompletes, easier debugging
- __iter__ so for tier in textgrid_object yields each tier in textgrid_object.tiers and for interval in interval_tier returns each interval in interval_tier.entries
- Use dataclasses for Intervals/Points rather than namedtuples so that they're mutable. I realize that changes to intervals have to be evaluated in the context of the tier and so that's likely why they're immutable, but I wonder if when adding intervals to a tier, you could store a reference to the tier in the interval, and then when changing any parameters of it, call the tier's validation logic? Or have a signal/slot style where every time an interval gets added, the tier connects a validation function to the interval's dataChanged signal.
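To make the magic-method suggestion concrete, here is a small sketch on a stand-in class. `SketchTier` is hypothetical and is not part of praatio; it only shows how __len__, __getitem__, and __iter__ could delegate to an entries list.

```python
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end", "label"])


class SketchTier:
    """Stand-in for an interval tier; not praatio's IntervalTier."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = list(entries)

    def __len__(self):
        # len(tier) reports the number of entries
        return len(self.entries)

    def __getitem__(self, index):
        # tier[0] returns the first entry; slices work too
        return self.entries[index]

    def __iter__(self):
        # "for interval in tier" walks the entries in order
        return iter(self.entries)


tier = SketchTier("words", [Interval(0.0, 0.5, "hello"), Interval(0.5, 1.0, "world")])
print(len(tier), tier[0].label, [interval.label for interval in tier])
```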

It also might be helpful to have includeBlankSpaces default to true for exporting to textgrids (and it might also be worth breaking up the TextGrid.save method into different methods for the output format), since it's a bit of a gotcha if you try to open up TextGrids in Praat later on. Not having blank intervals was the root cause of some issues for MFA in https://twitter.com/betsysneller/status/1621607813632892929, though I don't think they were using PraatIO to generate the textgrids.

use praat to segment a speech file

Hi, I am very new to Praat overall, not just the Python version.

Is Praat a good tool for segmenting a .wav file if I know the exact time marks where I want to chop up the file? I'd like to use just Praat, because afterwards I will be extracting pitch and tempo from the segments.

Potential bug in `audio.extractSubwav` and/or `audio.openAudioFile`

audio.extractSubwav calls audio.openAudioFile with a keepList containing startT and endT and no deleteList:

def extractSubwav(fn: str, outputFN: str, startT: float, endT: float) -> None:
    audioObj = openAudioFile(
        fn,
        [
            (startT, endT, ""),
        ],
        doShrink=True,
    )
    audioObj.save(outputFN)

In audio.openAudioFile L491, it calls utils.invertIntervalList with duration given as the min:

elif deleteList is None and keepList is not None:
        computedKeepList = []
        computedDeleteList = utils.invertIntervalList(
            [(start, end) for start, end, _ in keepList], duration
        )

In this case, the computedDeleteList will always be empty, because the min time provided is the end of the original wav file.

This means when you call audio.extractSubwav, you will always get an empty result.

To further complicate things, even if you changed L491 to specify min=0, max=duration, you'll still get an empty result for audio.extractSubwav.

This is because on L489, it sets computedKeepList = [], and then on L499, the original keepList is overridden with the now empty computedKeepList:

keepList = [(row[0], row[1], _KEEP) for row in computedKeepList]

Finally, when we actually do the operations, because audio.extractSubwav calls audio.openAudioFile with doShrink=True, it won't actually delete anything on LL519-521:

        elif label == _DELETE and doShrink is False:
            zeroPadding = [0] * int(framerate * diff)
            audioSampleList.extend(zeroPadding)

And it won't keep anything because keepList is empty.
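For reference, a minimal sketch of what inverting a keep list over an explicit [min, max] range could look like. `invert_intervals` is a hypothetical helper written for this illustration; it is not praatio's utils.invertIntervalList and may differ from it in signature and behavior.

```python
def invert_intervals(intervals, min_t, max_t):
    """Return the gaps between `intervals` within the range [min_t, max_t]."""
    inverted = []
    cursor = min_t
    for start, end in sorted(intervals):
        if start > cursor:
            inverted.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < max_t:
        inverted.append((cursor, max_t))
    return inverted


duration = 5.0
keep = [(1.0, 2.0)]

# Range starts at 0: everything outside the kept span is marked for deletion
full_range = invert_intervals(keep, 0.0, duration)

# Range starts at the file's end: nothing is left to mark
collapsed_range = invert_intervals(keep, duration, duration)

print(full_range, collapsed_range)
```

In this sketch, starting the range at the file's duration leaves an empty delete list, which matches the symptom described above.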

Json or csv format?

Is there a way to save textgrid file to csv or json format?
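Independently of whatever praatio offers, tier entries can be written out with the standard library once they are available as (start, end, label) tuples. A minimal sketch, with hypothetical helper names and made-up entries:

```python
import csv
import io
import json
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end", "label"])


def entries_to_csv(entries):
    """Serialize entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(Interval._fields)
    writer.writerows(entries)
    return buffer.getvalue()


def entries_to_json(entries):
    """Serialize entries as a JSON list of objects."""
    return json.dumps([entry._asdict() for entry in entries])


entries = [Interval(0.0, 0.5, "hello"), Interval(0.5, 1.0, "world")]
csv_text = entries_to_csv(entries)
json_text = entries_to_json(entries)
print(csv_text)
print(json_text)
```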

minimumIntervalLength in tgio.save not working as expected

I am saving textgrids of syllables of birdsong and noticed the tg.save function does not work as expected.

The save function

def save(self, fn, minimumIntervalLength=MIN_INTERVAL_LENGTH):
    
    for tier in self.tierDict.values():
        tier.sort()
    
    # Fill in the blank spaces for interval tiers
    for name in self.tierNameList:
        tier = self.tierDict[name]
        if isinstance(tier, IntervalTier):
            
            tier = _fillInBlanks(tier,
                                 "",
                                 self.minTimestamp,
                                 self.maxTimestamp)
            if minimumIntervalLength is not None:
                tier = _removeUltrashortIntervals(tier,
                                                  minimumIntervalLength)
            self.tierDict[name] = tier
    
    for tier in self.tierDict.values():
        tier.sort()
    
    # Header
    outputTxt = ""
    outputTxt += 'File type = "ooTextFile short"\n'
    outputTxt += 'Object class = "TextGrid"\n\n'
    outputTxt += "%s\n%s\n" % (repr(self.minTimestamp),
                               repr(self.maxTimestamp))
    outputTxt += "<exists>\n%d\n" % len(self.tierNameList)
    
    for tierName in self.tierNameList:
        outputTxt += self.tierDict[tierName].getAsText()
    
    with io.open(fn, "w", encoding="utf-8") as fd:
        fd.write(outputTxt)

takes as input to minimumIntervalLength by default a hardcoded parameter MIN_INTERVAL_LENGTH = 0.00000001.

When segments are longer than that number, it would be expected that they would be left as is. However, with my song, when I do not set that flag, the first syllable in each of my textgrids is changed from:
Interval(start=0.339, end=0.387, label='syll')
to:
Interval(start=0.0, end=0.387, label='syll')
Despite the segment length being longer than MIN_INTERVAL_LENGTH. Setting minimumIntervalLength to None fixes the problem in my case, but it looks like something is not working as intended in this function.

Thanks for an excellent toolset!
Tim

Can I save a normal textgrid rather than a short textgrid using praatIO?

Hi timmahrt,
I want to save a textgrid using praatio, but I found that it is saved as a short textgrid. Can I save a normal textgrid?

Montreal Forced Aligner compatibility

I have found that useShortForm=False must be set in tg.save() in order for the TextGrids to be readable by the Montreal Forced Aligner. This knowledge may be helpful for other users.

TextGrid.tierDict can be modified, corrupting the TextGrid

As far as I can tell from your examples, the canonical way to access a TextGridTier from a TextGrid is to access the internal tierDict (Example).

However, because tierDict is mutable, one can end up adding a new TextGridTier to this tierDict, e.g.,

new_textgrid_tier = new_textgrid.tierDict.setdefault(
    "new_tier",
    IntervalTier(
        "new_tier", [], textgrid_obj.minTimestamp, textgrid_obj.maxTimestamp
    )
)

If this happens, the TextGrid will essentially be in an Illegal State, because TextGrid.tierNameList will be missing the new tier. This obviously breaks functions like TextGridTier#replaceTier, among other things.

I think there are at least two possible solutions:

Add a getTier method and rename tierDict to _tierDict to indicate that it is a protected member that shouldn't be altered
Change tierDict to an OrderedDict and remove tierNameList entirely. You can then just use tierDict.keys() in place of tierNameList

I think it'd probably be best to implement both solutions, particularly because it's dangerous to have two parallel data structures that you need to keep in sync. However, simply implementing number 1 alone should solve the problem quickly and easily.

I'm also happy to submit a PR for this if you'd like.

Thanks again for all your work!

README code outdated and misleading

The code snippets provided in the README are outdated and misleading.

For example, it is written that the method "openTextGrid" should be used when the true name of the method is "openTextgrid".

Can I fill the blanks in the tier by extending the existing intervals?

I noticed that when saving a textgrid file, praatio fills the gaps in each tier with new blank intervals, which does not seem very friendly to automatic alignment. What would happen if the tier were left as is and not filled? Or can I use praatio to fill the blanks by extending the existing intervals (which would not change the total number of intervals, making the file much easier for machines to process)?
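A sketch of the "extend instead of fill" idea, done on plain tuples before the entries are handed to any tier class. `extend_to_fill` is a hypothetical helper, not a praatio function. It stretches each interval's end to the next interval's start, and leaves any gap before the first interval untouched.

```python
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end", "label"])


def extend_to_fill(entries, max_t):
    """Stretch each interval's end to the next start (or to max_t for the last one)."""
    ordered = sorted(entries)
    filled = []
    for i, entry in enumerate(ordered):
        next_start = ordered[i + 1].start if i + 1 < len(ordered) else max_t
        filled.append(Interval(entry.start, next_start, entry.label))
    return filled


# Made-up entries with a gap between them and after the last one
entries = [Interval(0.2, 0.5, "a"), Interval(0.8, 1.0, "b")]
filled = extend_to_fill(entries, max_t=1.5)
print(filled)
```

The number of intervals stays the same; only the end times change.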

openTextgrid() cannot correctly parse the file if there are '\n's within the label text of interval tiers

Files like the following:

item []:
	item [1]:
		class = "IntervalTier"
		name = "Tokens"
		xmin = 0.0
		xmax = 16.6671875
		intervals: size = 22
		intervals [1]:
			xmin = 0.0
			xmax = 0.32
			text = "#"
		intervals [2]:
			xmin = 0.32
			xmax = 1.165
			text = "zao
chen
liu
wan
er
ne"

Only the "zao part is recognized.
According to the Praat manual, string values are delimited by double quotes, not by newlines, and a double quote inside the text is written as two double quotes in the file (" becomes "").

It is not hard to fix, but I'm unfamiliar with git/GitHub, so I've pasted the changed code below (to be used in place of the original _fetchRow in tgio):

def _fetchRow_for_text(dataStr, searchStr, index):
    startIndex = dataStr.index(searchStr, index) + len(searchStr)
    first_quote_index = dataStr.index("\"", startIndex)
    
    looking = True
    next_quote_index = dataStr.index("\"", first_quote_index+1)
    while looking:
        try:
            neighbor_letter = dataStr[next_quote_index+1]
            if neighbor_letter == "\"":
                # Skip past both characters of the escaped quote pair
                next_quote_index = dataStr.index("\"", next_quote_index+2)
            else:
                looking = False
        except IndexError:
            looking = False
    final_quote_index = next_quote_index
    
    word = dataStr[first_quote_index+1:final_quote_index]
    word = word.replace("\"\"", "\"")
    
    return word, final_quote_index + 1

I suppose it might be possible that in other places, like textgrid short version reading and writing, there are also problems due to this issue.

`pkg_resources` deprecation error

Running tests on a project that imports praatio, I get the following deprecation warning:

DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html

I think the more useful link might be here: https://importlib-resources.readthedocs.io/en/latest/migration.html#pkg-resources-resource-filename

praatIO/praatio/utilities/utils.py

Line 9 in 0f0544e

from pkg_resources import resource_filename
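A sketch of the replacement pattern using the standard library's importlib.resources (Python 3.9+). The package and resource names praatio would pass are not shown here, so this example reads a file from the stdlib "json" package as a stand-in; `read_resource_text` is a hypothetical helper.

```python
from importlib.resources import as_file, files


def read_resource_text(package, resource_name):
    """Read a packaged file without pkg_resources.resource_filename."""
    resource = files(package).joinpath(resource_name)
    # as_file yields a real filesystem path, even for zipped packages
    with as_file(resource) as path:
        return path.read_text(encoding="utf-8")


# "json" / "__init__.py" are placeholders for a real package and data file
source = read_resource_text("json", "__init__.py")
print(len(source) > 0)
```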

Validate support for Klattgrids

I had a project that needed support for Klattgrids, so I added it to Praatio. However that was a long time ago and I'm not sure the code still is functioning.

It was refactored a bit in the move from Praatio 4 -> Praatio 5.

We should add some more robust tests for it maybe?

Incorrect Error Message /praatio/data_classes/interval_tier.py line 99

Hi there,

Very nice package. I just noticed one of the error messages is not informative for when intervals overlap. It prints out only one of the intervals twice:

for entry, nextEntry in zip(self.entries[0::], self.entries[1::]):
    if entry.end > nextEntry.start:
        raise errors.TextgridStateError(
            "Two intervals in the same tier overlap in time:\n"
            f"({entry.start}, {entry.end}, {entry.label}) and ({entry.start}, {entry.end}, {entry.label})"
        )

Should read:
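The corrected text did not survive in this copy of the issue. Going by the description, the second tuple presumably should report the following entry. A self-contained sketch of that message, using ValueError and a hypothetical `overlap_message` helper in place of praatio's own error class:

```python
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end", "label"])


def overlap_message(entry, next_entry):
    """Build an error message that names both overlapping intervals."""
    return (
        "Two intervals in the same tier overlap in time:\n"
        f"({entry.start}, {entry.end}, {entry.label}) and "
        f"({next_entry.start}, {next_entry.end}, {next_entry.label})"
    )


def check_no_overlap(entries):
    """Raise ValueError if any interval ends after the next one starts."""
    for entry, next_entry in zip(entries, entries[1:]):
        if entry.end > next_entry.start:
            raise ValueError(overlap_message(entry, next_entry))


message = overlap_message(Interval(0.0, 1.0, "a"), Interval(0.5, 2.0, "b"))
print(message)
```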

Make openTextgrid() and save() parameters mandatory

New features are added in a backwards compatible way--they are given default values--but this has obscured some important functionality.

These parameters should be changed to be required so that users are forced to make mindful decisions about how praatio manipulates their data.

Textgrids with non-unique tier names cannot be opened

I received an email request for this feature, so I'm documenting it here as an issue.

Praat will allow tiers in the same textgrid to have the same name. Praatio references tiers by name using a dictionary, so the tier names must be unique. If an affected textgrid is opened in Praatio, an exception is thrown.

It would be nice if somehow, these textgrids could be opened in Praatio.
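One possible approach, sketched here only as an idea: rename duplicate tier names with a numeric suffix on load so that they can still serve as dictionary keys. `make_unique` is hypothetical, and a fuller version would also need to avoid clashing with names that already carry such a suffix.

```python
def make_unique(names):
    """Append _2, _3, ... to repeated names, keeping the original order."""
    seen = {}
    unique = []
    for name in names:
        seen[name] = seen.get(name, 0) + 1
        count = seen[name]
        unique.append(name if count == 1 else f"{name}_{count}")
    return unique


unique_names = make_unique(["words", "words", "phones", "words"])
print(unique_names)
```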

Why filter out empty labels from Intervals?

        if tierType == INTERVAL_TIER:
            while True:
                try:
                    timeStart, timeStartI = _fetchRow(tierData,
                                                      "xmin = ", labelI)
                    timeEnd, timeEndI = _fetchRow(tierData,
                                                  "xmax = ", timeStartI)
                    label, labelI = _fetchRow(tierData, "text =", timeEndI)
                except (ValueError, IndexError):
                    break
                
                label = label.strip()
                if label == "":
                    continue
                tierEntryList.append((timeStart, timeEnd, label))
            tier = IntervalTier(tierName, tierEntryList, tierStart, tierEnd)

Why wouldn't I want the intervals exactly as they appear in the file?

Wrong parameters when calling get_pitch_and_intensity.praat script from extractPI function

I'm trying to use the pitch_and_intensity.extractPI function. It gives me a PraatExecutionFailed error.

When I paste the command that Python attempted to run, Praat complains that the optional argument “Unit” cannot have the value “True”. The output I'm getting is:

/Applications/Praat.app/Contents/MacOS/Praat --run /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/praatio/praatScripts/get_pitch_and_intensity.praat /Users/calua/google_drive/pessoal/estudo/mestrado/MSc_CaluaPataca/cc/extraction/praatio/piecewise_output/stand-up_002.wav /Users/calua/google_drive/pessoal/estudo/mestrado/MSc_CaluaPataca/cc/extraction/praatio/piecewise_output/stand-up_002.txt 0.01 123 450 0.03 True -1 -1 0 0

It doesn't seem to matter what unit I pass the function, although I got the feeling that in the pitch_and_intensity.py file, line 295, the pitchUnit parameter was left out!

GitHub tag for 3.7.1 is missing

It would also be helpful to upload an sdist to PyPI, so it can be used for packaging.

Forced alignment?

Hi,

I've studied forced alignment a little. I currently have a wav file in which someone says "hello" and a .txt file that contains the word "hello". Can I use some sort of forced alignment to find out where the sentence starts and ends, along with its pronunciation?
If so, is it possible to do this on Windows 10?

Thank you.

visualize textGrid

Hi, I have been reading through the tutorial for a while now. There are two very nice displays showing these textGrid tiers, but I don't see any sample code for creating these PNGs or displaying them by any other means. How are the visualization images generated?

Textgrid validation

A user might make certain assumptions about their data and praatio will silently break those expectations. It can make calculation errors go unnoticed.

Scenario A

Textgrid A has a maxTimestamp of 10.12345
Textgrid B has a maxTimestamp of 10.123456
both textgrids have an intervalTier with a final entry that runs to the end of the file
if those Textgrids are combined (using append) the interval in Textgrid A will no longer run to the new end of the file, although that may have been the original intention
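A check for Scenario A could look roughly like the following. `final_entry_reaches_end` and its tolerance are hypothetical, and the entries are plain namedtuples rather than praatio objects.

```python
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end", "label"])


def final_entry_reaches_end(entries, max_t, tolerance=1e-9):
    """True if the last entry ends at max_t, within a small tolerance."""
    return bool(entries) and abs(entries[-1].end - max_t) <= tolerance


# Textgrid A's final interval, checked against A's and then B's maxTimestamp
tier_a = [Interval(9.0, 10.12345, "last")]
reaches_own_end = final_entry_reaches_end(tier_a, 10.12345)
reaches_combined_end = final_entry_reaches_end(tier_a, 10.123456)
print(reaches_own_end, reaches_combined_end)
```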

Scenario B?

Solution?

validate on save will probably be too late for Scenario A but can be caught when adding the tier to the Textgrid
validation could potentially be desired anywhere (will this pollute function signatures more? Is there a cleaner way to do this?)
only offer validation in "risky" functions? Some functions already have the option to throw exceptions if unexpected data is encountered
to what extent is it the user's responsibility to validate and to what extent is it praatio's?

Tier entries that have blank labels are not read

I have many textgrid files with some or all the labels in specific tiers being deliberately set to be blank. Praatio just skips over them, ignoring the time information (which is what I need to retrieve). I have tried both a prebuilt version from "pip install" and the latest version from github, installed using setup.py

The attached file has three entries in its only tier, and two of them are empty. Only one is retrieved by praatio.

EmptyLabelBug.Txt

Extracting out the phoneme / phone from audio

Hi Tim,

You've explained how to create a blank textgrid from audio file in your tutorial. I was wondering how do you extract out the (phoneme / phone) and words from audio and insert into textgrid.

Thank you!

TextGrids with a -0 xmin fail to parse

I had an MFA user run into some issues with TextGrid parsing where their TextGrids have the minimum time set to "-0" for some reason; I think the source of the TextGrids was Praat itself. It's probably not too hard for PraatIO to handle this, just a matter of adding "-?" to the xmin parsing line.
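A minimal sketch of the "-?" idea. The pattern and the `parse_xmin` helper are written for this illustration and are not copied from praatio's parser.

```python
import re

# Optional leading minus sign, then digits with an optional fractional part
XMIN_PATTERN = re.compile(r"xmin = (-?\d+(?:\.\d+)?)")


def parse_xmin(line):
    """Extract the xmin value from a long-format TextGrid line."""
    match = XMIN_PATTERN.search(line)
    if match is None:
        raise ValueError(f"No xmin value found in: {line!r}")
    return float(match.group(1))


negative_zero = parse_xmin("xmin = -0")
ordinary = parse_xmin("xmin = 0.0124716553288")
print(negative_zero, ordinary)
```

Python parses "-0" as negative zero, which compares equal to 0.0.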

More idiomatic json format

The json format exported by praatio largely mirrors the textgrids, in terms of the data that is output.

As requested on a different github project, (MontrealCorpusTools/Montreal-Forced-Aligner#453) there isn't really any need to be bound by the textgrids. We should structure the json files to be their own thing.

Details can be found in the above link.
