
bad URL recognition about lilyterm HOT 12 OPEN

tetralet avatar tetralet commented on August 30, 2024
bad URL recognition

from lilyterm.

Comments (12)

Tetralet avatar Tetralet commented on August 30, 2024

Although a dot is a legal character in a URL, I think we can assume that URLs will not end with a dot... XD

Thanks for reporting this bug. I'll fix it ASAP.


slavkoja avatar slavkoja commented on August 30, 2024

Thanks

This URL recognition is a feature that makes lilyterm a better choice than other terminal emulators ;-)

Recognizing "sftp" would be nice too, but it is not in the "must have" category.


slavkoja avatar slavkoja commented on August 30, 2024

Hi,

I found another problem. The text::

... 'http://raspi.skk/gitlist/rm-hull_pcd8544/': ...

is recognized as::

http://raspi.skk/gitlist/rm-hull_pcd8544/':

regards


eevee avatar eevee commented on August 30, 2024

Something similar happens with question marks and trailing parentheses ()).

Also, URLs whose domains don't contain a dot are not recognized, so I can't click on http://bugtracker/ticket/1234 in work IRC.


Profpatsch avatar Profpatsch commented on August 30, 2024

r Emacs <http://emacswiki.org/wiki/EMMS>, tries to open http://emacswiki.org/wiki/EMMS>,

What is a good rule for which chars should be stripped at the end?

Oh, I found it: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
I just knew there had to be someone who already solved this.


eevee avatar eevee commented on August 30, 2024

DO NOT use that regex; it's susceptible to DoS via catastrophic backtracking.

I'm not sure a pure regex is appropriate for this; more likely you want a dead simple regex with some manual inspection to weed out false positives.


Profpatsch avatar Profpatsch commented on August 30, 2024

Can you further elaborate why this should be dangerous?
You just ordered us not to use that regex without explanation. I think I’m not the only one who doesn’t want to be ordered around by a complete stranger.

As far as security goes, there are no bad comments here: https://news.ycombinator.com/item?id=1552766


eevee avatar eevee commented on August 30, 2024

Try matching that regex against http://www.foo.com/!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.

Depends on the engine, of course, but most engines handle nested +/* operators very poorly. Perl has special protection against this, so it works; PCRE (or at least pcregrep(1)) hits some backtracking limit and aborts; Python obediently carries on for more time than I cared to finish measuring, appearing frozen in the meantime.

It's an artifact of naïve backtracking that makes pathological input take exponential time. And you'd never notice on normal input, just like the author and the HN commenters didn't. But I'd rather not have every single terminal window potentially lock up because some jerk posted a malformed URL on IRC. :)
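
To make the exponential growth concrete, here is a minimal sketch in Python. The pattern `(a+)+$` is a stand-in chosen for illustration, not the regex from the linked article; it only shares the nested-quantifier shape. The function name and the input sizes are likewise assumptions. On a backtracking engine the time to reject the input should grow roughly twofold for each added character.

```python
import re
import time

# Stand-in pattern (an assumption for illustration, NOT the regex from the
# linked article): a quantified group that is itself quantified.
NESTED = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return the seconds spent failing to match n 'a's followed by '!'."""
    text = "a" * n + "!"
    start = time.perf_counter()
    result = NESTED.match(text)
    elapsed = time.perf_counter() - start
    # The trailing '!' can never be consumed, so the match must fail,
    # but only after the engine has tried every way to split the 'a's.
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Sizes kept small on purpose; larger values can appear to hang.
    for n in (14, 16, 18, 20):
        print(n, f"{time_failed_match(n):.4f}s")
```

The absolute timings depend on the machine and the Python version, so none are quoted here; the point is the trend between successive rows.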

Don't use hairy regexes, and particularly not ones you just got off some guy's website, unless you really really understand what you're doing. Just use something broad and touch it up with postprocessing. Or use a real parser, I guess.
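
One way the "broad regex plus postprocessing" idea could look, sketched in Python rather than in lilyterm's own code. Everything here is an assumption for illustration: the scheme list, the set of trailing characters to strip, and the names `CANDIDATE`, `trim`, and `find_urls`. The regex has a single unnested quantifier, and the cleanup is an ordinary loop over the end of the match.

```python
import re

# Deliberately broad: a scheme, "://", then a run of characters that are not
# whitespace, angle brackets, or double quotes. The host needs no dot and may
# carry a port. There is only one quantifier and it is not nested.
CANDIDATE = re.compile(r"\b(?:https?|ftp|sftp)://[^\s<>\"]+")

# Characters assumed to be sentence punctuation when they end a match.
TRAILING = ".,;:!?'\""
# Closing brackets are stripped only when they have no opening partner.
PAIRS = {")": "(", "]": "[", "}": "{"}


def trim(url):
    """Strip trailing punctuation and unbalanced closing brackets."""
    while url:
        last = url[-1]
        if last in TRAILING:
            url = url[:-1]
        elif last in PAIRS and url.count(last) > url.count(PAIRS[last]):
            url = url[:-1]
        else:
            break
    return url


def find_urls(text):
    """Return the cleaned-up URL candidates found in text."""
    return [trim(m.group()) for m in CANDIDATE.finditer(text)]


if __name__ == "__main__":
    samples = [
        "... 'http://raspi.skk/gitlist/rm-hull_pcd8544/': ...",
        "(https://something.xyz)",
        "click http://bugtracker/ticket/1234 in work IRC",
        'the URL "http://localhost:8000/" is not recognized',
    ]
    for line in samples:
        print(find_urls(line))
```

The bracket check keeps a URL whose parentheses are balanced inside the path while still dropping a stray closing one. A URL that genuinely ends in a dot or a question mark would be cut short, which is the trade-off accepted earlier in this thread.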


Profpatsch avatar Profpatsch commented on August 30, 2024

Hm, I think I understand.

How about something like http://uriparser.sourceforge.net/.


slavkoja avatar slavkoja commented on August 30, 2024

Hi again :-)

Here is the opposite problem: the URL "http://localhost:8000/" is not recognized as a URL. Perhaps due to the port number?


Shougo avatar Shougo commented on August 30, 2024

@Tetralet You should close this issue. It is not closed.


Profpatsch avatar Profpatsch commented on August 30, 2024

Hm, I still get (https://something.xyz) recognized as https://something.xyz).

It would be cool if URLs with \n in the middle could be recognized, too. Something like: if it hits \n, try to match until the next space nonetheless and strip out every \n.
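
A rough sketch of that newline rule in Python, only to show what it would and would not catch. The pattern, the name `find_wrapped_urls`, and the example.org addresses are assumptions, not anything lilyterm does today.

```python
import re

# Like a broad URL candidate, except that "\n" is allowed inside the match.
# Spaces, tabs, carriage returns, angle brackets and double quotes end it.
WRAPPED = re.compile(r"\bhttps?://[^ \t\r<>\"]+")


def find_wrapped_urls(text):
    """Match across newlines up to the next space, then drop every newline."""
    return [m.group().replace("\n", "") for m in WRAPPED.finditer(text)]


if __name__ == "__main__":
    wrapped = "see http://example.org/a/very/\nlong/path for details"
    print(find_wrapped_urls(wrapped))

    # Known weakness of the rule: a URL that really ends at the line break
    # absorbs the first word of the following line.
    unlucky = "docs at http://example.org/\nnext line starts here"
    print(find_wrapped_urls(unlucky))
```

The second sample is the case to weigh before adopting the rule: with nothing but a newline between the URL and the next word, the two are glued together.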
