pymarkovchain's People

Contributors

asmeurer, tehmillhouse, tswicegood

pymarkovchain's Issues

If the input data has extremely long lines, generateString can cause a RuntimeError

I used a book from Project Gutenberg as the input file. Each line in the input file corresponds to a paragraph in the book, and because PyMarkovChain uses lines as its delimiting unit, calling generateString either returns very long text or raises a RuntimeError when the recursion depth is exceeded in _accumulateWithSeed:

>>> print a.generateString()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "MarkovChain.py", line 76, in generateString
    return self._accumulateWithSeed("", "")
  File "MarkovChain.py", line 103, in _accumulateWithSeed
    return self._accumulateWithSeed(sentence + sep + lastWord, nextWord)
[...]
  File "MarkovChain.py", line 103, in _accumulateWithSeed
    return self._accumulateWithSeed(sentence + sep + lastWord, nextWord)
  File "MarkovChain.py", line 96, in _accumulateWithSeed
    nextWord = self._nextWord(lastWord)
  File "MarkovChain.py", line 114, in _nextWord
    if probmap[candidate] > maxprob:
RuntimeError: maximum recursion depth exceeded in cmp
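
One way to avoid the recursion limit is to accumulate words in a loop instead of recursing once per word. The following is a minimal, self-contained sketch in Python 3, not PyMarkovChain's actual code: build_chain, generate, and the max_words cap are hypothetical names introduced here for illustration, and the empty string is assumed to mark the start and end of a line.

```python
import random
from collections import defaultdict


def build_chain(lines):
    """Map each word to the list of words observed right after it.

    The empty string marks both the start and the end of a line
    (an assumption of this sketch).
    """
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        if not words:
            continue
        previous = ""
        for word in words:
            chain[previous].append(word)
            previous = word
        chain[previous].append("")  # end-of-line marker
    return chain


def generate(chain, max_words=200, rng=random):
    """Walk the chain with a loop, so long lines cannot exhaust the stack."""
    words = []
    current = ""
    while len(words) < max_words:
        followers = chain.get(current)
        if not followers:
            break  # unknown word or empty chain
        current = rng.choice(followers)
        if current == "":
            break  # reached an end-of-line marker
        words.append(current)
    return " ".join(words)
```

The max_words cap also bounds the output length, which addresses the "really long text" half of the report when a single input line is a whole paragraph.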

Ability to add text samples to existing database

This would require the Markov chain to defer calculating occurrence probabilities until text generation, but it should be quite doable.

Also, switching the _nextWord function over to integer math would do away with rounding errors and should improve performance. Yay!
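
Both ideas can be combined by storing raw occurrence counts rather than probabilities: adding a sample then only increments counters, and the next word is drawn with integer arithmetic at generation time. This is a rough sketch under those assumptions, written in Python 3; the CountingChain class and its add_text and next_word methods are hypothetical and do not reflect the library's real database layout.

```python
import random
from collections import Counter, defaultdict


class CountingChain:
    """Hypothetical chain that stores integer counts, not probabilities."""

    def __init__(self):
        # counts[previous_word][next_word] -> number of times observed
        self.counts = defaultdict(Counter)

    def add_text(self, text):
        """Add a sample; can be called repeatedly on the same instance."""
        for line in text.splitlines():
            previous = ""  # empty string marks the start of a line
            for word in line.split():
                self.counts[previous][word] += 1
                previous = word
            if previous:
                self.counts[previous][""] += 1  # end-of-line marker

    def next_word(self, last_word, rng=random):
        """Pick a follower with probability count / total, using only integers."""
        followers = self.counts.get(last_word)
        if not followers:
            return ""
        total = sum(followers.values())
        pick = rng.randrange(total)  # integer in [0, total)
        for word, count in followers.items():
            if pick < count:
                return word
            pick -= count
```

Because probabilities are never stored, there is nothing to recompute when a new sample arrives, and no floating-point sums that could drift away from 1.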

Example not working

When I run your example code:

from pymarkovchain import MarkovChain
MarkovChain().generateDatabase("This is some language to analyze")
MarkovChain().generateString()

I get the following error:

Database file not found, using empty database

Database file could not be writtenDatabase file not found, using empty database

Traceback (most recent call last):
  File "markov.py", line 8, in <module>
    print MarkovChain().generateString()
  File "/Library/Python/2.7/site-packages/pymarkovchain/MarkovChain.py", line 98, in generateString
    return self._accumulateWithSeed('')
  File "/Library/Python/2.7/site-packages/pymarkovchain/MarkovChain.py", line 122, in _accumulateWithSeed
    nextWord = self._nextWord(seed)
  File "/Library/Python/2.7/site-packages/pymarkovchain/MarkovChain.py", line 130, in _nextWord
    probmap = self.db[lastword]
KeyError: ''

Add a way to make words compare equal

I've got source text with some word variations that I'd like to ignore. There are the basic case-sensitivity issues. If it were just that, I could str.lower everything, but I'm also dealing with source text that has a lot of misspellings. I think I might be able to get around this by using a ratio function from difflib. It would be useful to be able to supply a function that compares two strings and, if it returns True, causes them to be treated as equal.
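
To make the request concrete, here is a sketch of what such a comparison function could look like, using difflib.SequenceMatcher from the standard library. The similar and canonicalize helpers and the 0.8 threshold are assumptions made for illustration; PyMarkovChain does not necessarily offer a hook for them.

```python
import difflib


def similar(a, b, threshold=0.8):
    """Return True when two words should be treated as the same token.

    Comparison is case-insensitive; the 0.8 threshold is an arbitrary
    starting point and would need tuning for real data.
    """
    a, b = a.lower(), b.lower()
    return a == b or difflib.SequenceMatcher(None, a, b).ratio() >= threshold


def canonicalize(words, same=similar):
    """Replace each word with the first earlier word it compares equal to."""
    representatives = []
    result = []
    for word in words:
        for rep in representatives:
            if same(word, rep):
                result.append(rep)
                break
        else:
            # No earlier word matched, so this word becomes a representative.
            representatives.append(word)
            result.append(word)
    return result
```

Running the input through canonicalize before building the database would be one way to get the effect without changing the chain itself, at the cost of comparing each word against every representative seen so far.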

wheel file

Hi,

I need a wheel file for this package. Where can I find it, please?

Thanks & Regards,
Siva

pypi behind the times

Is the current master of PyMarkovChain stable enough to publish to PyPI? I found the n argument useful when initializing MarkovChain. Thanks!

Add ability to do an nth order Markov chain

If I understand the code correctly, this is just a first-order Markov chain. So for instance, if you have the sentence "The sky is blue", it maps "The" -> "sky", "sky" -> "is", and "is" -> "blue". A higher-order one would also consider "The sky" -> "is", "sky is" -> "blue", and so on. A keyword argument to specify the order would probably be best, as there are probably disadvantages to making the order too high.
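
The mapping described above can be sketched as follows. This is an illustrative Python 3 example only: build_transitions and its order keyword are hypothetical names, and it is not how the library stores its data.

```python
from collections import defaultdict


def build_transitions(text, order=1):
    """Map each tuple of `order` consecutive words to the words that follow it."""
    if order < 1:
        raise ValueError("order must be at least 1")
    words = text.split()
    transitions = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        transitions[state].append(words[i + order])
    return dict(transitions)


first = build_transitions("The sky is blue", order=1)
second = build_transitions("The sky is blue", order=2)
```

Here first maps ("The",) to ["sky"], ("sky",) to ["is"], and ("is",) to ["blue"], while second maps ("The", "sky") to ["is"] and ("sky", "is") to ["blue"]. The higher the order, the fewer distinct followers each state has, so output tends to reproduce the source text verbatim.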
