english-to-ipa's People

Contributors

benjaminbenetti, mitchellpkt, mphilli, t-cool, valerionerigit

english-to-ipa's Issues

Import issue, please help

Hello,

I am trying to set up English-to-IPA, but I get many error messages.
I would really appreciate your help with the installation.

I tried the following, each in the English-to-IPA-master folder:

  1. sudo python setup.py install
  2. sudo python setup.py build, followed by sudo python setup.py install
  3. No build or install at all, just running Python from the English-to-IPA-master folder

All of these give the same error message.

(In the English-to-IPA folder, I start python and do this:)
Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import eng_to_ipa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "eng_to_ipa/__init__.py", line 1, in <module>
    from .transcribe import *
  File "eng_to_ipa/transcribe.py", line 58
    c.execute(f"SELECT word, phonemes FROM dictionary WHERE word IN ({quest[:-2]})", words_in)
    ^
SyntaxError: invalid syntax


I'm very new to Python, so I just commented that line out and put
c.execute(words_in)
in its place, then tried the same process.

Then I got this error message.

>>> import eng_to_ipa as es
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "eng_to_ipa/__init__.py", line 1, in <module>
    from .transcribe import *
  File "eng_to_ipa/transcribe.py", line 4, in <module>
    import eng_to_ipa.stress as stress
  File "eng_to_ipa/stress.py", line 25
SyntaxError: Non-ASCII character '\xcb' in file eng_to_ipa/stress.py on line 25, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details


The symbols look like " , " and " ' ", so I just replaced them with a comma and quotes.

Then I got this error message.

>>> import eng_to_ipa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "eng_to_ipa/__init__.py", line 1, in <module>
    from .transcribe import *
  File "eng_to_ipa/transcribe.py", line 4, in <module>
    import eng_to_ipa.stress as stress
  File "eng_to_ipa/stress.py", line 4, in <module>
    import eng_to_ipa.syllables as syllables
  File "eng_to_ipa/syllables.py", line 4, in <module>
    from eng_to_ipa import transcribe
ImportError: cannot import name transcribe
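
A note for readers hitting the same first error: f-strings were added in Python 3.6, so Python 2.7 and 3.5 reject that line with a SyntaxError before any code runs. Upgrading the interpreter is the simplest way out. If that is not possible, the same kind of query can be built without an f-string. The sketch below is only an illustration: the table and column names come from the traceback, while fetch_phonemes and the sample rows are hypothetical and not part of the library.

```python
import sqlite3


def fetch_phonemes(conn, words):
    """Look up words with an IN (...) query built without f-strings.

    Hypothetical helper; only the table and column names are taken
    from the traceback above.
    """
    if not words:
        return []
    # One "?" placeholder per word, so the values are bound safely.
    placeholders = ", ".join("?" for _ in words)
    query = "SELECT word, phonemes FROM dictionary WHERE word IN ({})".format(placeholders)
    return conn.execute(query, list(words)).fetchall()


# Tiny in-memory stand-in for the real dictionary database (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dictionary (word TEXT, phonemes TEXT)")
conn.executemany(
    "INSERT INTO dictionary VALUES (?, ?)",
    [("cat", "k ae1 t"), ("dog", "d ao1 g")],
)

rows = sorted(fetch_phonemes(conn, ["cat", "dog", "zebra"]))
print(rows)
```

Note that this only addresses the f-string error. The later errors (the non-ASCII source file and the import failure) suggest the package expects Python 3 throughout.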


US phoneset to IPA

I see that you use a strategy to convert ARPABET to IPA symbols. Is there something similar for the US phoneset, as described here for Festival?

I created a project called Transcriber Wrapper and implemented a way to convert what Festival returns to IPA symbols, but I'm sure it is incorrect! I created a class called InternationalPhoneticAlphabet that holds all the logic. As my project is based on Phonemizer, I described it in an issue here.

@mphilli do you think there is a way for that? I'm asking you because I can use your project in order to assert many things, like the transcription! By the way, thank you for your hard work and dedication!

Nothing happens

Good day. I had no problems with the installation and see no errors. However, when I run the code, nothing happens: there is no output at all. What could cause that?

Differences between here and ARPABET

According to http://www.speech.cs.cmu.edu/cgi-bin/cmudict#about, the CMU dictionary uses 2-letter ARPABET notation for representing sounds.
According to the Wikipedia page for ARPABET (pointed to as the reference for translating to IPA on the above CMU page), the IPA correspondences would be (lowercased):

CMU_TO_IPA = {
    'aa'       : 'ɑ',   # balm, bot
    'ae'       : 'æ',   # bat
    'ah'       : 'ʌ',   # butt
    'ao'       : 'ɔ',   # story
    'aw'       : 'aʊ',  # bout
    'ax'       : 'ə',   # comma
    'axr'      : 'ɚ',   # letter
    'ay'       : 'aɪ',  # bite
    'eh'       : 'ɛ',   # bet
    'er'       : 'ɝ',   # bird
    'ey'       : 'eɪ',  # bait
    'ih'       : 'ɪ',   # bit
    'ix'       : 'ɨ',   # roses, rabbit
    'iy'       : 'i',   # beat
    'ow'       : 'oʊ',  # boat
    'oy'       : 'ɔɪ',  # boy
    'uh'       : 'ʊ',   # book
    'uw'       : 'u',   # boot
    'ux'       : 'ʉ',   # dude

    'b'        : 'b',   # buy
    'ch'       : 'tʃ',  # China
    'd'        : 'd',   # die
    'dh'       : 'ð',   # thy
    'dx'       : 'ɾ',   # butter
    'el'       : 'l̩',   # bottle
    'em'       : 'm̩',   # rhythm
    'en'       : 'n̩',   # button
    'f'        : 'f',   # fight
    'g'        : 'ɡ',   # guy
    'hh'       : 'h',   # high
    'h'        : 'h',   # high
    'jh'       : 'dʒ',  # jive
    'k'        : 'k',   # kite
    'l'        : 'l',   # lie
    'm'        : 'm',   # my
    'n'        : 'n',   # nigh
    'ng'       : 'ŋ',   # sing
    'nx'       : 'ɾ̃',   # winner
    'p'        : 'p',   # pie
    'q'        : 'ʔ',   # uh-oh
    'r'        : 'ɹ',   # rye
    's'        : 's',   # sigh
    'sh'       : 'ʃ',   # shy
    't'        : 't',   # tie
    'th'       : 'θ',   # thigh
    'v'        : 'v',   # vie
    'w'        : 'w',   # wise
    'wh'       : 'ʍ',   # why
    'y'        : 'j',   # yacht
    'z'        : 'z',   # zoo
    'zh'       : 'ʒ',   # pleasure
}

Whereas (slightly reorganised) I see on https://github.com/mphilli/English-to-IPA/blob/master/eng_to_ipa/transcribe.py#L98:

{
    "a": "ə",  #
    "ey": "e", #
    "aa": "ɑ",
    "ae": "æ",
    "ah": "ə", #
    "ao": "ɔ",
    "aw": "aʊ",
    "ay": "aɪ",
    "eh": "ɛ",
    "er": "ər", #
    "ih": "ɪ",
    "iy": "i",
    "ow": "oʊ",
    "oy": "ɔɪ",
    "uh": "ʊ",
    "uw": "u",
    "ch": "ʧ",
    "dh": "ð",
    "hh": "h",
    "jh": "ʤ",
    "ng": "ŋ",
    "sh": "ʃ",
    "th": "θ",
    "y": "j",
    "zh": "ʒ",
}

I have put a # next to the differences (the initial list also includes identical graphs). Is there a reason for the differences? I can understand 'er' but most of the others strike me as curious choices.
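
A comparison like this can also be done mechanically. The sketch below is a hypothetical helper (mapping_differences is not part of the library) run on a small excerpt of the two tables above. Note that it also flags entries that differ only in notation, such as the ligature 'ʧ' versus the digraph 'tʃ'.

```python
def mapping_differences(reference, other):
    """Return {key: (reference_value, other_value)} for every key whose
    values differ, including keys present in only one mapping."""
    diffs = {}
    for key in sorted(set(reference) | set(other)):
        ref_value = reference.get(key)
        other_value = other.get(key)
        if ref_value != other_value:
            diffs[key] = (ref_value, other_value)
    return diffs


# Excerpts of the two tables quoted above (not the full mappings).
wikipedia = {"aa": "ɑ", "ah": "ʌ", "er": "ɝ", "ey": "eɪ", "ch": "tʃ"}
project = {"a": "ə", "aa": "ɑ", "ah": "ə", "er": "ər", "ey": "e", "ch": "ʧ"}

diffs = mapping_differences(wikipedia, project)
for key, (ref_value, other_value) in diffs.items():
    print(key, ref_value, other_value)
```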

syllable mistakes

There are some mistakes in the syllable boundaries.
For example, for "amusement" the result is "əmˈjuzmənt", but the dictionary gives "əˈmjuːzmənt".

Pronunciation and Accents

I was wondering if there were a way to change the "accent" of the IPA produced? I would like my output to render a British accent, but also render different pronunciations of words from different accents in the same country. A lot of accents are due to vowel shifts, so a way to change all instances of one vowel sound to another would help.

If you point me in the right direction, I can try and figure it out. I'm really only a beginner, but I'm getting there with python.

Thanks for all this code. It's just what I was looking for :)

Edit: What I mean by British accent is RP or NRP. I'm referring to TRAP-BATH differences, etc. A summary can be found here: https://notendur.hi.is/peturk/KENNSLA/02/TOP/AmvowelsSum.html
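
As a starting point, a blanket vowel shift can be applied to the IPA string after transcription. The sketch below is an assumption-laden illustration, not a feature of the library: shift_vowels and the shift table are hypothetical, and the two example shifts are only rough. A blanket replacement cannot handle splits such as TRAP-BATH, where the same source vowel goes to different targets depending on the word, so those would need per-word data.

```python
def shift_vowels(ipa, shifts):
    """Apply vowel substitutions in a single left-to-right pass.

    Longer keys are tried first so that a diphthong such as 'oʊ' is
    replaced as a unit, and replaced text is never re-scanned, so one
    shift cannot feed into another.
    """
    keys = sorted(shifts, key=len, reverse=True)
    out = []
    i = 0
    while i < len(ipa):
        for key in keys:
            if ipa.startswith(key, i):
                out.append(shifts[key])
                i += len(key)
                break
        else:
            # No shift applies at this position; keep the character.
            out.append(ipa[i])
            i += 1
    return "".join(out)


# Illustrative table only: rough GOAT and LOT correspondences.
rough_shifts = {"oʊ": "əʊ", "ɑ": "ɒ"}

print(shift_vowels("goʊt", rough_shifts))
print(shift_vowels("lɑt", rough_shifts))
```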

Misconversion of words like C.O.D.

Hi!

Thanks for your work. I am using the convert function in transcribe.py.

It seems that the word C.O.D. exists in CMUdict (around line 16930: C.O.D. S IY1 OW1 D IY1), but the convert function cannot convert it, because the preserve_punc function splits C.O.D. into ["", "C.O.D", "."].

Obviously, C.O.D (without the final period) is not in CMUdict, so the result is c.o.d*.

We could add some code to the preserve_punc function to handle this special case.
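
One possible shape for such a fix is to check the whole token against the dictionary before stripping any punctuation. The sketch below is hypothetical: split_punct is a stand-in written for this note, not the library's preserve_punc, and known_words stands in for a CMUdict lookup.

```python
import string


def split_punct(token, known_words):
    """Split a token into (leading punctuation, core, trailing punctuation).

    If the whole token is itself a dictionary entry (e.g. 'C.O.D.'),
    it is returned intact instead of losing its final period.
    """
    if token.lower() in known_words:
        return "", token, ""
    start, end = 0, len(token)
    while start < end and token[start] in string.punctuation:
        start += 1
    while end > start and token[end - 1] in string.punctuation:
        end -= 1
    return token[:start], token[start:end], token[end:]


# Stand-in for the dictionary lookup (lowercased entries).
known_words = {"c.o.d.", "cod", "hello"}

print(split_punct("C.O.D.", known_words))
print(split_punct("cod.", known_words))
print(split_punct('"hello!"', known_words))
```

A sentence that ends with C.O.D. remains ambiguous under this approach, since the final period then serves as both part of the word and the sentence terminator.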

Syllables

Great library.

Is it also possible to use the library for dividing words into syllables?

cmudict_preparer.py Error

Hello!
Thank you for your work!
I am trying to run the code from this repo and have just run into a problem with the dictionary preparer script.
I downloaded the dictionary from http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b and when I start the script I get a KeyError. Do you know how to solve this problem?

scripts username$ python cmudict_preparer.py

INFO:root:running cmudict_preparer...
INFO:root:reading source file...
Traceback (most recent call last):
  File "cmudict_preparer.py", line 31, in <module>
    unique_dict[re.findall(pattern, word)[0]] += "%" + ' '.join(line.replace("\n", "").split(" ")[1:])
KeyError: 'SEMI-COLON'
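
Judging only from the traceback, line 31 appends to a dictionary key that was never created, which is what raises the KeyError. A merge that does not depend on the base entry having been seen first can be sketched as follows. Everything here is an assumption made for illustration: merge_pronunciations is hypothetical, the sample lines are invented, and the real script may parse the file differently.

```python
import re
from collections import defaultdict

# Matches a trailing variant marker such as "(1)" on a headword.
VARIANT_SUFFIX = re.compile(r"\(\d+\)$")


def merge_pronunciations(lines):
    """Group pronunciations by headword, joining variants with '%'.

    defaultdict(list) creates missing keys on demand, so a variant
    line without a preceding base entry does not raise KeyError.
    """
    merged = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comment lines
        word, _, phonemes = line.partition(" ")
        key = VARIANT_SUFFIX.sub("", word)
        merged[key].append(phonemes.strip())
    return {word: "%".join(prons) for word, prons in merged.items()}


# Invented sample lines in the general "WORD  PHONEMES" layout.
sample = [
    ";;; a comment line",
    "READ  R EH1 D",
    "READ(1)  R IY1 D",
    "ORPHAN(1)  AO1 R F AH0 N",  # variant with no base entry before it
]

result = merge_pronunciations(sample)
print(result)
```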

Support for non-sqlite dbs

I'm very interested in using this but my app is multi-threaded so can't really use sqlite. I am going to add support for Postgres for my own purposes but if you would be interested in merging generic DB support then I can try and make it work for other flavours too. If you are interested I'll submit a PR.

how can i get all the IPA symbols?

Hello, thanks for your great work! I'm a little confused about the DJ, KK, and IPA phonetic symbols. I found that not all IPA symbols are used in English, so could you please list all the symbols used in this tool?

AH should be ʌ not ə

The short u in cup, which ARPABET renders as AH, should be ʌ in IPA. You have it as ə.

How to process out of bound words?

Thanks for your work.
I am not a native English speaker. I found that some out-of-vocabulary words get no IPA output; instead, the word is returned with a star. For example:

e2ipa.convert("Strathclyde Police declined to comment") outputs:

Stranthclyde* pə¹lis dɪklaɪnd ...

I do not know how to handle this kind of exception.
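
One way to work with this behaviour is to collect the starred words from the output and handle them separately, for example by adding them to a custom lookup. The sketch below assumes only what the output above shows, namely that an unconverted word is followed by a star. unknown_words is a hypothetical helper and the sample string is made up.

```python
import re

# A run of non-space, non-star characters immediately followed by "*".
STARRED = re.compile(r"([^\s*]+)\*")


def unknown_words(transcription):
    """Return the words that were left untranscribed (marked with '*')."""
    return STARRED.findall(transcription)


sample_output = "foo* bar baz*."
print(unknown_words(sample_output))
```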

The codes cannot be used!

Hi,

Your code cannot be installed; it fails with the following errors:

File "/usr/local/lib/python3.5/dist-packages/English_to_IPA-0.0.1-py3.5.egg/eng_to_ipa/transcribe.py", line 58
c.execute(f"SELECT word, phonemes FROM dictionary WHERE word IN ({quest[:-2]})", words_in)
^
SyntaxError: invalid syntax

File "/usr/local/lib/python3.5/dist-packages/English_to_IPA-0.0.1-py3.5.egg/eng_to_ipa/rhymes.py", line 17
c.execute(f"SELECT word, phonemes FROM dictionary WHERE phonemes "
^
SyntaxError: invalid syntax

Conversion "ʌ"

The ARPABET "AH" (as in "love", "cut", "hut") should be converted to "ʌ".
But it seems to be mistakenly converted to "ə".
Could you check that?
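
CMUdict attaches a stress digit to each vowel, and a common convention is to render unstressed AH0 as "ə" and stressed AH1 or AH2 as "ʌ". The sketch below illustrates that convention only. convert_ah is a hypothetical helper, not the library's code, and the phoneme string used for "above" is assumed rather than looked up.

```python
def convert_ah(phonemes):
    """Map AH by stress: AH0 -> 'ə', AH1/AH2 -> 'ʌ'.

    All other phonemes are passed through unchanged, since this
    sketch only demonstrates the stress-dependent choice for AH.
    """
    out = []
    for phoneme in phonemes.split():
        code = phoneme.upper()
        if code == "AH0":
            out.append("ə")
        elif code in ("AH1", "AH2"):
            out.append("ʌ")
        else:
            out.append(phoneme)
    return out


# Assumed CMUdict-style entry for "above": unstressed AH, then stressed AH.
print(convert_ah("AH0 B AH1 V"))
```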

error

You did awesome work, but when I run the import, the following error occurs. Do you know why? Thanks.

>>> import eng_to_ipa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "eng_to_ipa\__init__.py", line 1, in <module>
    from .transcribe import *
  File "eng_to_ipa\transcribe.py", line 69
    asset.execute(f"SELECT word, phonemes FROM dictionary WHERE word IN ({quest[:-2]})", words_in)
    ^
SyntaxError: invalid syntax

What's the license for this code?

Hey there. I'd like to use this code as a library, but am unsure of the license? MIT or Apache2 licenses would be great, if possible :)

Get homophones?

Hi, is there a function similar to the rhyming function but for homophones?
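
In principle, homophones can be found by grouping words that share an identical phoneme string. The sketch below shows the idea on a tiny made-up table. find_homophones and the sample data are hypothetical, and a real implementation would query the dictionary database instead.

```python
from collections import defaultdict


def find_homophones(word, pronunciations):
    """Return the other words that share this word's phoneme string."""
    by_sound = defaultdict(set)
    for entry, phonemes in pronunciations.items():
        by_sound[phonemes].add(entry)
    target = pronunciations.get(word)
    if target is None:
        return []
    return sorted(by_sound[target] - {word})


# Made-up sample in a CMUdict-like notation.
sample = {
    "their": "dh eh1 r",
    "there": "dh eh1 r",
    "the": "dh ah0",
}

print(find_homophones("their", sample))
```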

new function (ruby-rt-rp-html) request

Could we add a new function that produces HTML using the ruby, rt, and rp tags? Then the output could be inserted into an HTML page directly.
The ruby tags align the phonetic transcription with each word (https://github.com/dohliam/rubify).
For example:

Let's listen to  Fox  tell the story. I love you. I   am eating a peach. Sheep is here. Ship is here. Once   upon   a time. 
 lɛts ˈlɪsən  tu  fɑks tɛl   ðə ˈstɔri. aɪ lʌv   ju.  aɪ æm  ˈitɪŋ   ə   piʧ.        ʃip   ɪz  hir.    ʃɪp  ɪz   hir.   wʌns  əˈpɑn  ə taɪm.
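
A rough sketch of what such a function might look like is given below. It pairs each word with its transcription and wraps the pair in ruby markup. to_ruby is a hypothetical name, and the word-by-word pairing assumes both inputs contain the same number of space-separated tokens.

```python
from html import escape


def to_ruby(words, transcriptions):
    """Wrap each word and its IPA in <ruby>/<rt> tags, with <rp>
    fallback parentheses for browsers without ruby support."""
    if len(words) != len(transcriptions):
        raise ValueError("words and transcriptions must have the same length")
    parts = [
        "<ruby>{}<rp>(</rp><rt>{}</rt><rp>)</rp></ruby>".format(escape(w), escape(t))
        for w, t in zip(words, transcriptions)
    ]
    return " ".join(parts)


html_snippet = to_ruby("I love you.".split(), "aɪ lʌv ju.".split())
print(html_snippet)
```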

Should the English stress mark be placed before or after the IPA syllable?

Thanks for the good work. I am not a native English speaker and know little about IPA. I saw that in some TTS front-ends the stress mark ˈ is placed after the start of the syllable, e.g. "ðə kwɪk braʊn fɑks ʤəmpt oˈʊvər ðə lˈeɪzi dɔg.", while in this project it is placed before the syllable. Which is better, or which is the IPA standard?
