Comments (2)
Guangming Xing Algorithm
For the minimization algorithm, see the paper Minimized Thompson NFA by Guangming Xing, which provides an improved approach to the classic method, resulting in smaller trees, less auxiliary states, and it's faster.
The problem of converting a regular expression to NFA is a fundamental problem that has been well studied. However, the two basic construction algorithms: (1) Thompson, (2) McNaughton-Yamada and Glushkov, both have disadvantages.
In this paper: First, a "smart" parsing algorithm is developed which constructs a parse tree with at most (3l − 1) nodes form a regular expression with l literals; Second, we propose an algorithm that works on the resulting NFA from Thompson's construction, eliminating as many auxiliary states as possible while maintaining Thompson's properties.
It is shown that the resulting NFA is minimized. This means that no auxiliary states can be eliminated without violating the defining properties of Thompson NFA. The time and space requirement for the above algorithms are linear with respect to the length of the regular expression. To the author's knowledge, this is the first linear time algorithm minimizing an NFA in a precise technical sense.
from regex-engine.
Yes, it would be good if there was also minimization for the NFA in the RegEx engine. The DFA would also benefit from this.
In the paper, the first algorithm requires a parse tree as input. In the RegEx engine, no parse tree is created. The second algorithm needs an NFA as input, so it could work in my RegEx engine.
Thanks, will have a look after the final version 1.0 is released.
from regex-engine.
Related Issues (20)
- Add support of limited repetitions
- Add escape sequence `\Q`...`\E`
- Add predefined character class `\d` (digit)
- Add predefined character class `\s` (whitespace)
- Add feature to match multiple regexes simultaneously
- Add escape sequence `\xhh` (character with hex code `hh`)
- Add escape sequence `\uhhhh` (character with hex code `hhhh`)
- Add feature to export NFA/DFA table as state diagram HOT 1
- Add support for custom character classes `[`...`]`
- Add escape sequence `\f` (form feed)
- Add predefined character class `\D` (no digit)
- Add predefined character class `\S` (no whitespace)
- Add predefined character class `\w` (word character)
- Add predefined character class `\W` (no word character)
- Add Case-Insensitive Mode
- Add full support for UTF-16 / Unicode HOT 3
- Create reduced module that contains only a DFA matcher
- Add ASCII mode
- Add a `CHANGELOG.md` file to the project HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from regex-engine.