ypcrts / fqdn

RFC-compliant FQDN validation and manipulation for Python.

Home Page: http://fqdn.readthedocs.io/

License: Mozilla Public License 2.0

Python 100.00%
fqdn python python3 python2 validation rfc-3986 rfc-1035 rfc-1123 rfc-2181

fqdn's People

Contributors

cory-watson, gdubicki, jalseth, milahu, tednology, wakemaster39, ypcrts

fqdn's Issues

TLD must not be all numeric

So I understand if you want to say no to this, but-

https://tools.ietf.org/html/rfc3696#section-2

If we read this in the context of FQDNs, TLDs are expected not to be all-numeric. This means that .k12 might become a TLD in the far-off future, but not, say, .123. This may be something to test when validating FQDNs, and I could have sworn that an older version (1.1.0) treated TLDs this way. Was this an intentional change?
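
A minimal sketch of the check being proposed, written with the standard library only rather than against fqdn's internals. The helper name tld_is_all_numeric is hypothetical and not part of the library:

```python
def tld_is_all_numeric(name: str) -> bool:
    """Return True if the last label of `name` consists only of ASCII digits."""
    # Drop a single trailing dot so absolute names are handled too.
    if name.endswith("."):
        name = name[:-1]
    tld = name.split(".")[-1]
    # str.isdigit() also accepts non-ASCII digits, so require ASCII explicitly.
    return tld.isascii() and tld.isdigit()


print(tld_is_all_numeric("school.k12"))   # mixed letters and digits
print(tld_is_all_numeric("example.123"))  # digits only
```

A validator following RFC 3696 section 2 could reject any name for which this returns True.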

Max label length is 63 characters

https://github.com/guyhughes/fqdn/blob/1ad687d6cd1d74c5781f673194a744ff105e345c/fqdn/__init__.py#L26-L28

RFC1035 specifies that 63 octets are available besides the 64th octet used for length. A length octet can only have a max value of 0011 1111 == 63.

The regex is limiting a label to 62 characters, but it can be 63 characters.

Take for example this working web site domain name:

http://www.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijk.com/

The second label has a length of 63.

See also https://webmasters.stackexchange.com/questions/16996/maximum-domain-name-length
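
To make the off-by-one concrete, here is a rough sketch. The LABEL pattern below is an assumption written for illustration, not the regex from fqdn/__init__.py:

```python
import re
import string

# A label's length octet starts with two zero bits, leaving six bits for the length.
MAX_LABEL_LENGTH = 0b0011_1111  # 63

# Hypothetical label pattern: 1 to 63 letters, digits, or hyphens,
# with no hyphen at either end.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,%d}(?<!-)" % MAX_LABEL_LENGTH)

alphabet = string.ascii_lowercase
label_63 = alphabet * 2 + alphabet[:11]  # the second label of the site above
label_64 = label_63 + "l"                # one character too long

print(len(label_63), bool(LABEL.fullmatch(label_63)))
print(len(label_64), bool(LABEL.fullmatch(label_64)))
```

A pattern with an upper bound of 62 would reject label_63 even though RFC 1035 permits it.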

RFC1035 and RFC1912 are incompatible

As with the original underscore change, the world is not compliant: removing the ability for hostnames to begin with numbers has broken a lot of my URLs.

Need the capability to allow hostnames to begin with digits.

No Equality between FQDNs

There is no direct way to determine whether two FQDNs are equal; you have to compare properties such as relative or absolute instead.
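
One possible shape for such a comparison, as a sketch only. The helpers normalize and same_fqdn are hypothetical names, and treating the relative and absolute spellings of a name as equal is an assumption that may not suit every caller:

```python
def normalize(name: str) -> str:
    """Lower-case the name and make sure it ends with exactly one trailing dot."""
    # DNS names compare case-insensitively, so fold case first.
    name = name.lower()
    return name if name.endswith(".") else name + "."


def same_fqdn(a: str, b: str) -> bool:
    """Compare two names by their normalized absolute form."""
    return normalize(a) == normalize(b)


print(same_fqdn("Example.COM", "example.com."))
print(same_fqdn("a.example.com", "b.example.com"))
```

An __eq__ method on the FQDN class could apply the same idea to its stored string.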

TLD validation is not _quite_ correct

https://github.com/guyhughes/fqdn/blob/1ad687d6cd1d74c5781f673194a744ff105e345c/fqdn/__init__.py#L28

The current regex precludes hyphens and digits in the TLD entirely, while the actual restriction in RFC 1035 is that all labels must start with a letter, and also cannot end with a hyphen. This restriction was relaxed in RFC 1123 to allow labels to start with numbers as well, but not for TLDs (somewhat clarified by RFC 3696).

RFC 1123 does kind of imply that TLDs should be all-alphabetic with the following:

However, a valid host name can never
have the dotted-decimal form #.#.#.#, since at least the
highest-level component label will be alphabetic

But it does not actually state this outright. If you look at how internationalized domain names are encoded into ASCII, TLDs included, it becomes evident that the restriction on TLD labels must instead be the original one: they must start with a letter, may include numbers and hyphens, and cannot end with a hyphen.
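
The rule described above can be sketched as a regex. The TLD pattern and the tld_looks_valid helper are illustrative assumptions, not the library's current code:

```python
import re

# Hypothetical TLD rule: starts with a letter, may contain letters, digits,
# and hyphens, does not end with a hyphen, and is at most 63 characters long.
TLD = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def tld_looks_valid(label: str) -> bool:
    """Return True if `label` satisfies the hypothetical TLD rule above."""
    return TLD.fullmatch(label) is not None


for candidate in ("com", "k12", "xn--p1ai", "123", "com-", "1com"):
    print(candidate, tld_looks_valid(candidate))
```

Under this rule an ASCII-encoded IDN TLD such as xn--p1ai passes, while an all-numeric label fails because it does not start with a letter.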

should IP-Address-alikes be considered valid?

Is this intended behaviour? I haven't dug too hard into the referenced RFCs so it's possible these are valid, but I seem to recall there was previously a 'cannot start with a number' rule that was subsequently relaxed to allow some, but not exclusively, leading digits.

I think the following examples should probably all(?) be False:

In [38]: !pip freeze | grep fqdn
fqdn==1.5.1

In [39]: fqdn.FQDN('1').is_valid
Out[39]: False

In [40]: fqdn.FQDN('1', min_labels=1).is_valid
Out[40]: True

In [41]: fqdn.FQDN('127.0.0.1').is_valid
Out[41]: True

In [42]: fqdn.FQDN('127.thing.0.1').is_valid
Out[42]: True

In [43]: fqdn.FQDN('127.0.0.0.1').is_valid
Out[43]: True

I could convert those into tests in a PR if it does seem like a bug?

For my use it would ideally either accept a valid IP address (without a trailing FQ dot), or a valid hostname, but not something like 1.2.3.4.5, although separating valid IPs from invalid 'dotted numeric' inputs is something I can handle if it doesn't make sense here.

A quick poke around the docs turned up this in RFC 1123, section 2.1:

If a dotted-decimal number can be entered without such
identifying delimiters, then a full syntactic check must be
made, because a segment of a host domain name is now allowed
to begin with a digit and could legally be entirely numeric
(see Section 6.1.2.4). However, a valid host name can never
have the dotted-decimal form #.#.#.#, since at least the
highest-level component label will be alphabetic.
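
For the use case described above, one way to separate real IPv4 addresses from other dotted-numeric input before handing anything to a hostname validator is the standard ipaddress module. This is a sketch; the classify function and its return labels are made up for illustration:

```python
import ipaddress


def classify(value: str) -> str:
    """Label `value` as 'ipv4', 'dotted-numeric', or 'hostname-candidate'."""
    try:
        ipaddress.IPv4Address(value)
        return "ipv4"
    except ValueError:
        # AddressValueError is a ValueError subclass; fall through to other checks.
        pass
    labels = value.split(".")
    if all(label.isascii() and label.isdigit() for label in labels):
        # Every label is numeric, but the whole is not a valid IPv4 address.
        return "dotted-numeric"
    return "hostname-candidate"


for value in ("127.0.0.1", "127.0.0.0.1", "1.2.3.4.5", "127.thing.0.1", "1"):
    print(value, classify(value))
```

Only the 'hostname-candidate' values would then need FQDN validation; whether names like 127.thing.0.1 should pass is the open question in this issue.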

Empty string should not raise ValueError on construction

A ValueError is raised here if passing the empty string '' since it is a false value.

if not (fqdn and isinstance(fqdn, str)):

The correct line should read:

        if not isinstance(fqdn, str):
            raise ValueError("fqdn must be str")

Since isinstance(None, str) is valid python and returns False.

Which gives the following behaviour that I think is correct:

Python 3.12.2 (tags/v3.12.2:6abddd9, Feb  6 2024, 21:26:36) [MSC v.1937 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from fqdn import FQDN
>>> a = FQDN('')
>>> a.is_valid
False
>>> b = FQDN()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: FQDN.__init__() missing 1 required positional argument: 'fqdn'
>>> c = FQDN(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\shane.kearns\Documents\git\prappspec\tests\.virtualenv\Lib\site-packages\fqdn\__init__.py", line 44, in __init__
    raise ValueError("fqdn must be str")
ValueError: fqdn must be str

the real world isn't RFC compliant

Unfortunately the real world does not listen to RFCs: _ can appear in hostnames, DNS respects it, and Chrome respects it, but we fail here.

We should introduce a strict mode and a loose mode to allow better real-world compatibility where required.
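
A rough sketch of what the two modes could look like. The function labels_are_valid, its strict parameter, and both patterns are hypothetical and only illustrate the idea; they are not the library's API:

```python
import re

# Strict: letters, digits, and hyphens only. Loose: underscores are also allowed.
STRICT_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")
LOOSE_LABEL = re.compile(r"(?!-)[A-Za-z0-9_-]{1,63}(?<!-)")


def labels_are_valid(name: str, strict: bool = True) -> bool:
    """Check every label of `name` against the strict or the loose pattern."""
    pattern = STRICT_LABEL if strict else LOOSE_LABEL
    # Ignore a single trailing dot so absolute names are accepted.
    if name.endswith("."):
        name = name[:-1]
    return all(pattern.fullmatch(label) is not None for label in name.split("."))


print(labels_are_valid("my_host.example.com"))
print(labels_are_valid("my_host.example.com", strict=False))
```

Defaulting to strict keeps the RFC-compliant behaviour unless a caller opts out.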

different code is called the same version 1.5.1 ?

Hi,

Via https://pypi.org/project/fqdn/ I ended up downloading https://files.pythonhosted.org/packages/30/3e/a80a8c077fd798951169626cde3e239adeba7dab75deb3555716415bd9b0/fqdn-1.5.1.tar.gz

The same pypi page points to this github project as the "home page"

When compared to this git repository at tag v1.5.1, the "fqdn" subdir is identical but the rest is very different. For example, one has a LICENSE file and the other does not, and one contains a tests/ dir and the other does not. This is kind of a nightmare for packagers, as I now don't know which is the real version, and I have to diff everything to check for malicious stuff.

Perhaps you can do a 1.5.2 release that brings these two sources back into sync?

[Feature Request] Internationalized domain name (IDN) support

Hello,

I just found this useful Python tool, and it looks great in most cases! I'm not sure whether IDN support is something you'd like to have, but it would be awesome if it were supported.

Currently, IDN will be considered as an invalid FQDN:

$ python
Python 3.6.9 (default, Jul 17 2020, 12:50:27) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from fqdn import FQDN
>>> FQDN('Bücher.example').is_valid
False

Thank you!
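
As a possible workaround until then, the name can be converted to its ASCII form before validation. This sketch uses Python's built-in idna codec, which implements the older IDNA 2003 rules; the to_ascii helper name is made up for illustration:

```python
def to_ascii(name: str) -> str:
    """Encode an internationalized name to its ASCII (Punycode) form."""
    # The built-in "idna" codec follows IDNA 2003 and lower-cases the input.
    return name.encode("idna").decode("ascii")


ascii_name = to_ascii("Bücher.example")
print(ascii_name)
# Decoding restores the (lower-cased) Unicode form.
print(ascii_name.encode("ascii").decode("idna"))
```

The ASCII form contains only letters, digits, and hyphens, so it can be passed to an ASCII-only validator.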
