Comments (6)
Good find! Even linking to the RegEx standard showing that it works using the documented reference 👍
from regex.
Hope we can get this fixed soon 🔜🤞
from regex.
Your regex is kinda messed up. Specifically, this part (which is repeated):
[_[\w\d]*]?
Python regexes don't support nested character classes, unlike the regex crate. And because Python's regex engine follows the tradition of context-dependent escaping rules, meta characters like ] are treated literally when used in a context in which they cannot possibly have any special significance. But, as can be seen in this case, it makes the regex quite deceptive. Here's a better way to write the same part of the pattern:
[_\[\w\d]*\]?
And indeed, using that with the regex crate produces the desired result:
use regex::Regex;

fn main() {
    let pattern = r"(?:private|group)[_\[\w\d]*\]?_abc1d2345678ef90ab3c4567890defab[_\[\w\d]*\]?";
    let compiled = Regex::new(pattern).unwrap();
    let test_haystacks = vec![
        "private_x9z45678abc12345d6e7890f123ghijk_abc1d2345678ef90ab3c4567890defab",
        "private_x9z45678abc12345d6e7890f123ghijk_abc1d2345678ef90ab3c4567890defab___[[[aaa111]",
        "private[_0f4f790_abc1d2345678ef90ab3c4567890defab",
    ];
    for test_haystack in &test_haystacks {
        match compiled.is_match(test_haystack) {
            true => println!("PASS: {}", test_haystack),
            false => eprintln!("FAIL: {}", test_haystack),
        }
    }
}
(I also switched to using raw strings via r"..." so that you don't need to do double escaping.)
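For anyone who wants to check how Python reads that fragment, here's a minimal sketch using only the standard re module. It compares the fragment as originally written against the explicit spelling above. The sample strings and the agree helper are made up for illustration and are not taken from the issue.

```python
import re

# The fragment as written in the issue, and the explicit spelling suggested above.
ambiguous = re.compile(r"[_[\w\d]*]?")
explicit = re.compile(r"[_\[\w\d]*\]?")

# Hypothetical sample inputs, chosen only for illustration.
samples = ["___[[[aaa111]", "[_0f4f790", "abc", "]", "]]", "a]b"]


def agree(text):
    """Return True when both spellings give the same full-match verdict."""
    return bool(ambiguous.fullmatch(text)) == bool(explicit.fullmatch(text))


for text in samples:
    # Print the input, whether the original spelling fully matches it,
    # and whether the two spellings agree on that verdict.
    print(repr(text), bool(ambiguous.fullmatch(text)), agree(text))
```

If the two spellings agree on every sample, that's consistent with Python treating the inner [ as a literal member of the class and the trailing ] as a literal outside it.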
from regex.
> Good find! Even linking to the RegEx standard showing that it works using the documented reference 👍
This isn't a bug and there is no requirement that this crate matches Python's regex engine in all cases. There's also no regex standard at play here (governing either Python's or Rust's regex engine).
from regex.
Hi @BurntSushi 👋
I don't consider this issue invalid.
I'm not in a position to change the un-compiled regular expressions as they are provided by end users, and if they're compilable, which they are, they are expected to be searchable.
Do you have any particular guidance toward a solution for compatibility?
from regex.
I don't know what you mean by your assertion that they are "compatible."
There is literally an unbounded number of ways in which Python regexes are different from Rust regexes. And this generally applies to all pairs of regex engines unless they very strictly follow a standard. (Of which, generally speaking, only two are prevalent: POSIX and ECMA. Neither Python's regex engine nor Rust's regex engine follows either one.)
> I don't consider this issue invalid.
I want to be clear here that this issue is definitively invalid within the scope of this project. That doesn't mean you don't have a problem. You might have a problem on your end where you have a pile of regexes that worked with one regex engine and need to use them, unchanged, with some other regex engine. But that isn't really a problem I can help with and is in general not a problem that can be easily solved for any two regex engines. (Unless your patterns happen to incidentally behave the same, or as I mentioned above, the regex engines strictly adhere to an existing standard.)
> Do you have any particular guidance toward a solution for compatibility?
Well... of course not. Because I don't really know the structure of the problem you're trying to solve. All that's been presented to me here is a regex that works one way in Python and a seeming request to have it work the same way in Rust. But that will definitively not happen. As far as solving your problem in a different way, I don't know because I don't know what problem you're trying to solve. If, for example, these regexes are provided by end users and you've promised that the regex syntax is equivalent to whatever Python supports, then you need to use a regex engine with the goal of compatibility with Python's regex engine. (Of which, I believe only one exists: the re module in Python's standard library. The third-party regex Python package on PyPI might also have enough compatibility to work for you.)
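As a rough sketch of that route: if the patterns are promised to be Python syntax, they can be compiled and searched with Python's own re module. The pattern below is an assumption (the corrected pattern from earlier in the thread with the unescaped fragment put back), the haystacks are the three from the Rust example, and compile_user_pattern is a hypothetical helper name.

```python
import re

# Assumption: the end user's pattern, rebuilt by putting the unescaped
# fragment back into the corrected pattern shown earlier in the thread.
USER_PATTERN = (
    r"(?:private|group)[_[\w\d]*]?"
    r"_abc1d2345678ef90ab3c4567890defab[_[\w\d]*]?"
)

# The three haystacks from the Rust example earlier in the thread.
HAYSTACKS = [
    "private_x9z45678abc12345d6e7890f123ghijk_abc1d2345678ef90ab3c4567890defab",
    "private_x9z45678abc12345d6e7890f123ghijk_abc1d2345678ef90ab3c4567890defab___[[[aaa111]",
    "private[_0f4f790_abc1d2345678ef90ab3c4567890defab",
]


def compile_user_pattern(pattern):
    """Hypothetical helper: compile with Python's re, or return None if rejected."""
    try:
        return re.compile(pattern)
    except re.error:
        return None


compiled = compile_user_pattern(USER_PATTERN)
# Unanchored search, analogous to is_match in the Rust example.
results = [bool(compiled.search(h)) for h in HAYSTACKS] if compiled else []

for haystack, matched in zip(HAYSTACKS, results):
    print("PASS" if matched else "FAIL", haystack)
```

This keeps the user-supplied patterns unchanged, at the cost of running them through a Python regex engine rather than the Rust crate.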
from regex.