Comments (3)
Hey, thanks for the report. So, RFC 3986 allows "." in the scheme:
scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
Having said that, the only use of "." I can find in IANA's registry is "z39.50". So maybe we could restrict it to only allow a single "." instead of multiple dots.
That would mean something like foo.http://i.imgur.com
would still be recognized as a URL, but not your example.
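To make the difference concrete, here is a minimal sketch of the two rules as plain `java.util.regex` patterns. It is not autolink-java's actual implementation; the class name `SchemeGrammar`, both method names, and the multi-dot input `foo.bar.http` are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of autolink-java.
final class SchemeGrammar {
    // RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    private static final Pattern RFC_3986 =
            Pattern.compile("[A-Za-z][A-Za-z0-9+.-]*");

    // Assumed stricter variant: same alphabet, but at most one "." overall.
    private static final Pattern AT_MOST_ONE_DOT =
            Pattern.compile("[A-Za-z][A-Za-z0-9+-]*(?:\\.[A-Za-z0-9+-]*)?");

    // True if the whole string is a valid scheme under the RFC 3986 grammar.
    static boolean isRfc3986Scheme(String scheme) {
        return RFC_3986.matcher(scheme).matches();
    }

    // True if the whole string is a valid scheme containing zero or one ".".
    static boolean hasAtMostOneDot(String scheme) {
        return AT_MOST_ONE_DOT.matcher(scheme).matches();
    }
}
```

With these patterns, `z39.50` and `foo.http` pass both checks, while a scheme with two dots such as `foo.bar.http` passes only the RFC 3986 check.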
What do you think?
from autolink-java.
Thanks for the reply! I see the problem with RFC 3986. In that case, I think it's better to leave this as documented behavior that programmers have to handle themselves, since I feel that protocol adherence > convenience.
Especially since your proposed solution does not fully fix the problem. What would solve it is an extractor that only looks for HTTP[s] schemes and strips everything else from the scheme, but I feel like that's outside the scope of this library.
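For anyone who wants that behavior on their side, here is a rough sketch of such a post-processing step, applied to a link string after extraction. It uses only the JDK and is not part of autolink-java; the class name `HttpSchemeTrimmer` and the method `trimToHttp` are hypothetical.

```java
import java.util.Optional;

// Hypothetical post-processing helper, not part of autolink-java.
final class HttpSchemeTrimmer {
    // If the text before the first "://" ends with "http" or "https"
    // (case-insensitive), return the URL starting at that scheme.
    // Otherwise return an empty Optional.
    static Optional<String> trimToHttp(String url) {
        int sep = url.indexOf("://");
        if (sep < 0) {
            return Optional.empty();
        }
        // Try the longer scheme first so "https" is not cut down to "http".
        for (String scheme : new String[] {"https", "http"}) {
            int start = sep - scheme.length();
            // regionMatches returns false when start is negative.
            if (url.regionMatches(true, start, scheme, 0, scheme.length())) {
                return Optional.of(url.substring(start));
            }
        }
        return Optional.empty();
    }
}
```

Under these assumptions, `trimToHttp("foo.bar.http://i.imgur.com")` yields `http://i.imgur.com`, and a non-HTTP link such as `ftp://example.com` yields an empty result. Note that it also trims a prefix that is not separated by a dot, so `xhttp://example.com` becomes `http://example.com`.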
Funnily enough, Twitter doesn't have that problem on their conformance list.
Sounds good! I've added a note to the README now, thanks.
(And yeah, Twitter's list is missing some interesting edge cases :).)