Comments (5)

GoogleCodeExporter avatar GoogleCodeExporter commented on June 12, 2024
The caret anchors the pattern to the start of the string. Only "abadalab" 
starts at the start of the string.

"findall" normally performs a series of searches, each search starting from 
where the previous one ended, so the substrings found won't overlap. But 
if the "overlapped" flag is turned on, each search starts one character 
beyond where the previous one _started_, allowing you to find overlapping 
substrings.
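
To make the two search strategies concrete, here is a minimal sketch that mimics the behaviour described above using only the standard "re" module. It is not how the regex module is implemented; the helper name findall_from and the sample strings are made up for illustration, and unlike the real findall it always returns the whole match rather than groups.

```python
import re

def findall_from(pattern, text, overlapped=False):
    """Hypothetical helper: repeated searches, as described in the comment above."""
    compiled = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(text):
        # '^' still refers to the real start of the string, not to pos.
        m = compiled.search(text, pos)
        if m is None:
            break
        results.append(m.group(0))
        if overlapped:
            # Resume one character beyond where this match started.
            pos = m.start() + 1
        else:
            # Resume where this match ended (step once past an empty match).
            pos = m.end() if m.end() > m.start() else m.end() + 1
    return results

print(findall_from(r"aba", "ababa"))                    # ['aba']
print(findall_from(r"aba", "ababa", overlapped=True))   # ['aba', 'aba']
print(findall_from(r"^ab", "abab", overlapped=True))    # ['ab']
```

The last call shows why an anchored pattern yields a single result even with overlapping enabled: only the first search can start at the start of the string.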

Original comment by [email protected] on 21 May 2011 at 12:19

  • Changed state: Invalid

from mrab-regex-hg.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 12, 2024
You're right, what a douche I am, the example that I provided is useless.
Let me try to make my point again. I don't know whether any regex engine 
supports this kind of matching. I hope you can clarify this for me.

Is there any reason why you don't include overlapping matches that start on 
_the same_ letter? Let me try with a new example below:

Input string: ' x one something and another something'
I want to get all the 'something's that have an 'x' before them, with whatever 
other stuff in between. Here, I would like to match: 'x one something' and 
'x one something and another something'
I would have hoped that
regexp.findall(r"x.*something"," x one something and another something",overlapped=True)
would produce that result. But like you said, after the greedy x.*something 
match is found, you advance one place and the second match is not found. I can 
find the other match if I do regexp.findall(r"x.*?something", ...), but I am 
toast if there is a third match in the middle.

Is this achievable with regular expressions at all? Why are the two results 
above not considered an overlap?

Thanks for your patience

Original comment by [email protected] on 22 May 2011 at 4:48

GoogleCodeExporter avatar GoogleCodeExporter commented on June 12, 2024
I guess one solution, which works with regex.0.1.20110514 but not with the 
default Python re module (or with Perl v5.10.1, for that matter), is to use a 
variable-length lookbehind pattern:

regex.findall(r"(?<=x.*)something", ...)
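
For anyone stuck with the standard re module, which rejects variable-length lookbehind with a "look-behind requires fixed-width pattern" error, the same check can be approximated by hand. This is only a sketch under my own assumptions: the helper name something_after_x is made up, and it treats "x.*" as "an 'x' somewhere earlier on the same line", since "." does not match a newline by default.

```python
import re

def something_after_x(text):
    """Hypothetical stand-in for (?<=x.*)something using only the re module."""
    hits = []
    for m in re.finditer(r"something", text):
        # Start of the line containing this match ('.' stops at newlines).
        line_start = text.rfind("\n", 0, m.start()) + 1
        # Keep the match only if an 'x' occurs earlier on that line.
        if "x" in text[line_start:m.start()]:
            hits.append((m.start(), m.group(0)))
    return hits

text = " x one something and another something"
print(something_after_x(text))   # [(7, 'something'), (29, 'something')]
```

Note that, like the lookbehind version, this returns each 'something' rather than the full 'x ... something' span.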

Original comment by [email protected] on 22 May 2011 at 5:03

GoogleCodeExporter avatar GoogleCodeExporter commented on June 12, 2024
A regex supports greedy match ".*" and lazy match ".*?" (lazy match was a later 
addition). I don't know of a regex implementation which supports what you're 
asking for. There are also the implementation details to work out...

How much demand would there be for it, anyway?

Although it's a form of pattern matching, and regex is pattern matching, it's 
not really a regex kind of thing.

Original comment by [email protected] on 22 May 2011 at 5:20

GoogleCodeExporter avatar GoogleCodeExporter commented on June 12, 2024
Yeah, I don't know how much demand there would be for this. And I already 
solved what I needed with the variable-length lookbehind, which seems to be 
working fine.

I also understand about the additional complexity of the implementation. 
Without knowing how it's currently implemented, I can imagine moving forward 
one step after every match must simplify the implementation.

Just to be clear, my only problem was that when I saw the availability of the 
'overlapped=True' flag, I thought it was reasonable to assume it would also 
find overlapping matches that start on the same character. Here is a much 
simpler example: take the string 'abb' and the pattern 'a.*b'. 
'ab' and 'abb' are both valid, overlapping matches imho.
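
To show what I mean by "all matches", here is a brute-force sketch with the standard re module. It is not a proposed implementation, just an illustration: the name all_matches is made up, and it simply tries every (start, end) pair, so it is quadratic in the length of the string.

```python
import re

def all_matches(pattern, text):
    """Hypothetical helper: every substring that the pattern matches in full."""
    compiled = re.compile(pattern)
    found = []
    for start in range(len(text) + 1):
        for end in range(start, len(text) + 1):
            # fullmatch with pos/endpos tests exactly the slice text[start:end].
            if compiled.fullmatch(text, start, end):
                found.append(text[start:end])
    return found

print(all_matches(r"a.*b", "abb"))
# ['ab', 'abb']
print(all_matches(r"x.*something", " x one something and another something"))
# ['x one something', 'x one something and another something']
```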

I'm not pushing hard for any change or implying demand here, just trying to 
clarify what my confusion was, in case it helps with other potential confused 
users :-)

Original comment by [email protected] on 22 May 2011 at 5:27