drore's Introduction

DroRE: A better regular expression library for Python

This library is largely compatible with the Python built-in re module, but with some improvements.

Multiple captures of the same group

Regular expressions allow capturing sub-groups by putting them in parentesis. If such a sub-group appears in a repeating expression, the built-in re module only saves one value, whereas drore keeps all of them.

For example,

>>> m = re.match(r'(\d+)(,(\d+))*', '12,34,56,78,90')
>>> m.group(1)
'12'
>>> m.group(3)
'90'

>>> m = drore.match(r'(\d+)(,(\d+))*', '12,34,56,78,90')
>>> m.get(1), list(m.get_all(3))
('12', ['34', '56', '78', '90'])

See also examples/parsing.py for a multi-line example.

No slow edge cases

The built-in re module uses a backtracking algorithm that results in exponential complexity on some inputs. For example,

re.match(r'(a+)+b', 'a'*27 + 'c')

takes several seconds to reject. In contrast, drore compiles the expression to an NFA program which implements the regular expression, and then runs it. The NFA program is efficient to generate and always runs quickly. So the corresponding code

drore.match(r'(a+)+b', 'a'*27 + 'c')

takes less than a millisecond.

Features

TBD

Recommend Projects

amirlb / drore Goto Github PK

drore's Introduction

DroRE: A better regular expression library for Python

Multiple captures of the same group

No slow edge cases

Features

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent