pauldix / domainatrix

A cruel mistress that uses the public suffix domain list to dominate URLs by canonicalizing, finding the public suffix, and breaking them into their domain parts.

Ruby 100.00%

domainatrix's People

Contributors

dj2, enricob, f1sherman, joelvh, leereilly, mtodd, pauldix, pcasaretto

domainatrix's Issues

Blows up on .dev (for Pow web server)

Obviously this isn't a legitimate TLD, but it's in use thanks to the Pow Rack server by 37signals. It might make sense to add support for it, or at least not raise an error when it is passed to parse:

>> Domainatrix.parse('http://google.com/')
=> #<Domainatrix::Url:0x00000104a196b8 @scheme="http", @host="google.com", @url="http://google.com/", @public_suffix="com", @domain="google", @subdomain="", @path="/">
>> Domainatrix.parse('http://google.dev/')
NoMethodError: undefined method `has_key?' for nil:NilClass
  from [...]/whiny_nil.rb:48:in `method_missing'
  from [...]/domainatrix-0.0.10/lib/domainatrix/domain_parser.rb:59:in `block in parse_domains_from_host'
  from [...]/domain_parser.rb:54:in `each_index'
  from [...]/domain_parser.rb:54:in `parse_domains_from_host'
  from [...]/domain_parser.rb:40:in `parse'
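
The backtrace points at a `has_key?` call on nil while the host is being matched against the suffix list. A minimal sketch of a lookup that returns nil for an unknown suffix instead of raising is shown below. It is not the gem's code: `SUFFIXES` is a tiny hypothetical stand-in for the public suffix list, `split_host` is a made-up name, and wildcard and exception rules are ignored.

```ruby
# Hypothetical, tiny stand-in for the public suffix list: nested hashes keyed
# by label, read from the rightmost label inward.
SUFFIXES = { "com" => {}, "net" => {}, "uk" => { "co" => {} } }.freeze

# Splits a host into public suffix, domain, and subdomain.
# Returns nil when the rightmost label is not a known suffix.
def split_host(host, suffixes = SUFFIXES)
  labels = host.split(".")
  node = suffixes
  suffix = []

  # Walk right to left for as long as the current label is a known suffix part.
  while (label = labels.last) && node.key?(label)
    node = node[label]
    suffix.unshift(labels.pop)
  end

  return nil if suffix.empty?

  {
    public_suffix: suffix.join("."),
    domain: labels.pop.to_s,
    subdomain: labels.join(".")
  }
end

split_host("google.com") # => {:public_suffix=>"com", :domain=>"google", :subdomain=>""}
split_host("google.dev") # => nil
```

A caller could then decide whether an unknown suffix such as dev should be an error or be treated as a bare host.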

Blows up when URL doesn't contain HTTP:// (Sinatra)

Domainatrix.parse blows up when the URL doesn't start with http://. It would be nice to make the scheme optional.

The code was tested under Sinatra.

Error:

  • undefined method `split' for nil:NilClass
    • file: domain_parser.rb
    • location: parse_domains_from_host
    • line: 49

Prepending "http://" is a workaround, but it would be nice to have this handled within the gem itself.
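
The workaround described above could be wrapped in a small helper. This is only a sketch: `ensure_scheme` and `SCHEME_PATTERN` are hypothetical names, not part of Domainatrix, and the default of "http" is an assumption.

```ruby
# Matches a leading URI scheme such as "http://" or "ftp://".
SCHEME_PATTERN = %r{\A[a-z][a-z0-9+.\-]*://}i

# Hypothetical helper: prepend a default scheme when the input has none.
def ensure_scheme(url, default_scheme = "http")
  url = url.to_s.strip
  return url if url.match?(SCHEME_PATTERN)

  "#{default_scheme}://#{url}"
end

ensure_scheme("example.com/test?foo=bar") # => "http://example.com/test?foo=bar"
ensure_scheme("https://example.com/")     # => "https://example.com/"
```

The result would then be passed to Domainatrix.parse. Note that a host-less input such as "/test?foo=bar" still has no domain to parse after this step.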

Canonical name

Issue

On the gem's home page, you write:

url = Domainatrix.parse("http://www.pauldix.net")
url.canonical # => "net.pauldix"

However, in IRB, I get the following behavior:

irb> url = Domainatrix.parse('http://www.pauldix.net')
=> #<Domainatrix::Url:0x007fd0409d5310 @scheme="http", @host="www.pauldix.net", @url="http://www.pauldix.net", @public_suffix="net", @domain="pauldix", @subdomain="www", @path="">
irb> url.canonical
=> "net.pauldix.www"

Is the www supposed to be a part of the canonical name?

Ruby

$ ruby -v
ruby 2.0.0p247 (2013-06-27 revision 41674) [x86_64-darwin14.0.0]

Gem

$ gem list domainatrix -d

*** LOCAL GEMS ***

domainatrix (0.0.11)
    Authors: Paul Dix, Brian John
    Homepage: http://github.com/pauldix/domainatrix
    Installed at: /Users/craibuc/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0

    A cruel mistress that uses the public suffix domain list to dominate
    URLs by canonicalizing, finding the public suffix, and breaking them
    into their domain parts.
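
The output reported in this issue is consistent with reversing every host label, subdomain included. A minimal sketch of the two readings in plain Ruby follows; both helper names are hypothetical and neither is the gem's implementation.

```ruby
# Reverse all labels of the host, subdomain included.
def reversed_labels(host)
  host.split(".").reverse.join(".")
end

# Same, but drop a leading "www." first (the behaviour the home page implies).
def reversed_without_www(host)
  reversed_labels(host.sub(/\Awww\./, ""))
end

reversed_labels("www.pauldix.net")      # => "net.pauldix.www"
reversed_without_www("www.pauldix.net") # => "net.pauldix"
```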

Pass IP Address Causes Exception

It would be nice to be able to pass in IP addresses, as a website will often run under an IP address during testing, e.g. url = Domainatrix.parse(request.url), where request.url may be 192.168.0.1 when testing on the local network.
At the moment it throws 'You have a nil object when you didn't expect it!'
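
Until the gem handles this case, a caller could check for an IP host before parsing. The sketch below uses only Ruby's standard `uri` and `ipaddr` libraries; `ip_host?` is a hypothetical helper name, not a Domainatrix method.

```ruby
require "ipaddr"
require "uri"

# Returns true when the URL's host is a literal IP address.
def ip_host?(url)
  host = URI.parse(url).host
  return false if host.nil?

  IPAddr.new(host) # raises for anything that is not an IP address
  true
rescue IPAddr::Error, URI::InvalidURIError
  false
end

ip_host?("http://192.168.0.1/") # => true
ip_host?("http://google.com/")  # => false
```

A bare "192.168.0.1" without a scheme yields no host from URI.parse, so this check returns false for it.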

Blows up when domain has no suffix eg, 'http://www.foo/'

Hi,
It seems that the Domainatrix.parse() method fails when the domain has no suffix, e.g. 'http://www.foo/'

$ irb

require 'domainatrix'
Domainatrix.parse('http://www.foo/')

NoMethodError: undefined method `has_key?' for nil:NilClass
  from /Users/ami/.rvm/gems/ruby-1.9.2-head@rails3beta/gems/domainatrix-0.0.10/lib/domainatrix/domain_parser.rb:59:in `block in parse_domains_from_host'

Thanks,
Ami

url.full_domain ?

Suggestion:
At the moment, in order to get the full domain (minus subdomain) I have to:
url.domain + '.' + url.public_suffix

It would be nice to have one method that combines these :)
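
A sketch of what such a method might look like is below. `full_domain` is the name proposed in this issue, not a confirmed Domainatrix method, and `ParsedUrl` is a hypothetical stand-in for the parsed URL object so the example runs on its own.

```ruby
# Hypothetical stand-in for the object returned by Domainatrix.parse.
ParsedUrl = Struct.new(:subdomain, :domain, :public_suffix)

# Joins the domain and public suffix, skipping any empty part.
def full_domain(url)
  [url.domain, url.public_suffix].reject { |part| part.to_s.empty? }.join(".")
end

full_domain(ParsedUrl.new("www", "pauldix", "net")) # => "pauldix.net"
```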

No scheme gives error

p Domainatrix.parse("/test?foo=bar")
# => NoMethodError: undefined method `split' for nil:NilClass

p Domainatrix.parse("example.com/test?foo=bar")
# => NoMethodError: undefined method `split' for nil:NilClass

p Domainatrix.parse("www.example.com/test?foo=bar")
# => NoMethodError: undefined method `split' for nil:NilClass

p Domainatrix.parse("http://www.example.com/test?foo=bar")
#=> #<Domainatrix::Url:0x007fa4064d7810 @scheme="http", @host="www.example.com", @url="http://www.example.com/test?foo=bar", @public_suffix="com", @domain="example", @subdomain="www", @path="/test?foo=bar">
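
For illustration, Ruby's standard `uri` library shows the same pattern for these inputs: without a scheme, the whole string is treated as a path and the host is nil, which would explain a later `split` on nil. The gem may use a different parser, so treat this as an analogy rather than a diagnosis; `host_or_nil` is a hypothetical helper name.

```ruby
require "uri"

# Returns the host component, or nil when the input has none.
def host_or_nil(url)
  URI.parse(url).host
end

host_or_nil("/test?foo=bar")                        # => nil
host_or_nil("example.com/test?foo=bar")             # => nil
host_or_nil("http://www.example.com/test?foo=bar")  # => "www.example.com"
```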

Exception on IP address in host string

ruby-1.9.2-p0 > Domainatrix.parse('http://74.205.88.194/article/news/microsoft_ballmer_envious_ipads_success_insists_windows_tablets_are_priority')
NoMethodError: undefined method `has_key?' for nil:NilClass
    from /Users/igrigorik/.rvm/gems/ruby-1.9.2-p0/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:52:in `block in parse_domains_from_host'
    from /Users/igrigorik/.rvm/gems/ruby-1.9.2-p0/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:47:in `each_index'
    from /Users/igrigorik/.rvm/gems/ruby-1.9.2-p0/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:47:in `parse_domains_from_host'
    from /Users/igrigorik/.rvm/gems/ruby-1.9.2-p0/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:33:in `parse'
    from /Users/igrigorik/.rvm/gems/ruby-1.9.2-p0/gems/domainatrix-0.0.7/lib/domainatrix.rb:12:in `parse'
    from (irb):2
    from /Users/igrigorik/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in `<main>'
ruby-1.9.2-p0 > 

Failure to parse DAT file with Ruby 1.9.1

When using Domainatrix with Ruby 1.9.1 (p378 on OSX 10.6 i386) the following error occurs when calling Domainatrix.parse:
ArgumentError: invalid byte sequence in US-ASCII
from /opt/lib/ruby/gems/1.9.1/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:14:in `strip'
from /opt/lib/ruby/gems/1.9.1/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:14:in `block in read_dat_file'
from /opt/lib/ruby/gems/1.9.1/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:13:in `each'
from /opt/lib/ruby/gems/1.9.1/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:13:in `read_dat_file'
from /opt/lib/ruby/gems/1.9.1/gems/domainatrix-0.0.7/lib/domainatrix/domain_parser.rb:9:in `initialize'
from /opt/lib/ruby/gems/1.9.1/gems/domainatrix-0.0.7/lib/domainatrix.rb:11:in `new'
from /opt/lib/ruby/gems/1.9.1/gems/domainatrix-0.0.7/lib/domainatrix.rb:11:in `parse'
from (irb):3
from /opt/bin/irb:12:in `<main>'

FIX:

Change domainatrix/domain_parser.rb:14 from:

line = line.strip

to:

line = line.force_encoding('utf-8').strip
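
An alternative to forcing the encoding on each line is to declare it when the file is opened. The sketch below assumes the suffix file is UTF-8 and that comment lines start with "//", as in the public suffix list; `read_suffix_lines` is a hypothetical helper, not the gem's `read_dat_file`.

```ruby
# Reads a suffix list as UTF-8, dropping blank lines and "//" comments.
def read_suffix_lines(path)
  File.foreach(path, encoding: "UTF-8")
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?("//") }
end
```

With the encoding set at open time, the lines no longer depend on the process's default external encoding, which is what produced the US-ASCII error above.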
