Twingly::URL

Twingly URL tools.

  • twingly/url - Parse and validate URLs
    • Twingly::URL.parse - Returns one or more Twingly::URL instances
  • twingly/url/hasher - Generate URL hashes suitable for primary keys
    • Twingly::URL::Hasher.taskdb_hash(url) - MD5 hexdigest
    • Twingly::URL::Hasher.documentdb_hash(url) - SHA256 unsigned long, native endian digest
    • Twingly::URL::Hasher.autopingdb_hash(url) - SHA256 64-bit signed, native endian digest
  • twingly/url/utilities - Utilities to work with URLs
    • Twingly::URL::Utilities.extract_valid_urls - Returns Array of valid Twingly::URL
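
The hash descriptions above can be illustrated with Ruby's standard library alone. This is a minimal sketch, not the gem's own code: the `HashSketch` module and its method names are hypothetical, and the choice to read the leading bytes of the SHA256 digest is an assumption based on the descriptions "unsigned long, native endian" and "64-bit signed, native endian".

```ruby
require "digest"

# Hypothetical stand-ins for illustration; not part of twingly-url.
module HashSketch
  module_function

  # MD5 of the URL string as 32 lowercase hex characters.
  def md5_hex(url)
    Digest::MD5.hexdigest(url)
  end

  # Leading bytes of the SHA256 digest read as a native-endian
  # unsigned long ("L_"). The width of a long is platform dependent.
  def sha256_unsigned_long(url)
    Digest::SHA256.digest(url).unpack1("L_")
  end

  # Leading 8 bytes of the SHA256 digest read as a native-endian
  # signed 64-bit integer ("q").
  def sha256_signed_64(url)
    Digest::SHA256.digest(url).unpack1("q")
  end
end

url = "http://www.twingly.co.uk/search"
puts HashSketch.md5_hex(url)
puts HashSketch.sha256_unsigned_long(url)
puts HashSketch.sha256_signed_64(url)
```

Because the integer variants use native byte order, the same URL can produce different values on machines with different endianness or `long` width, which matters if the values are used as keys shared across systems.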

Getting Started

Install the gem:

gem install twingly-url

Usage (this output was created with examples/url.rb):

require "twingly/url"

url = Twingly::URL.parse("http://www.twingly.co.uk/search")
url.scheme                    # => "http"
url.normalized.scheme         # => "http"
url.trd                       # => "www"
url.normalized.trd            # => "www"
url.sld                       # => "twingly"
url.normalized.sld            # => "twingly"
url.tld                       # => "co.uk"
url.normalized.tld            # => "co.uk"
url.ttld                      # => "uk"
url.normalized.ttld           # => "uk"
url.domain                    # => "twingly.co.uk"
url.normalized.domain         # => "twingly.co.uk"
url.host                      # => "www.twingly.co.uk"
url.normalized.host           # => "www.twingly.co.uk"
url.origin                    # => "http://www.twingly.co.uk"
url.normalized.origin         # => "http://www.twingly.co.uk"
url.path                      # => "/search"
url.normalized.path           # => "/search"
url.without_scheme            # => "//www.twingly.co.uk/search"
url.normalized.without_scheme # => "//www.twingly.co.uk/search"
url.userinfo                  # => ""
url.normalized.userinfo       # => ""
url.user                      # => ""
url.normalized.user           # => ""
url.password                  # => ""
url.normalized.password       # => ""
url.valid?                    # => "true"
url.normalized.valid?         # => "true"
url.to_s                      # => "http://www.twingly.co.uk/search"
url.normalized.to_s           # => "http://www.twingly.co.uk/search"

url = Twingly::URL.parse("http://räksmörgås.макдональдс.рф/foo")
url.scheme                    # => "http"
url.normalized.scheme         # => "http"
url.trd                       # => "räksmörgås"
url.normalized.trd            # => "xn--rksmrgs-5wao1o"
url.sld                       # => "макдональдс"
url.normalized.sld            # => "xn--80aalb1aicli8a5i"
url.tld                       # => "рф"
url.normalized.tld            # => "xn--p1ai"
url.ttld                      # => "рф"
url.normalized.ttld           # => "xn--p1ai"
url.domain                    # => "макдональдс.рф"
url.normalized.domain         # => "xn--80aalb1aicli8a5i.xn--p1ai"
url.host                      # => "räksmörgås.макдональдс.рф"
url.normalized.host           # => "xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai"
url.origin                    # => "http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai"
url.normalized.origin         # => "http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai"
url.path                      # => "/foo"
url.normalized.path           # => "/foo"
url.without_scheme            # => "//räksmörgås.макдональдс.рф/foo"
url.normalized.without_scheme # => "//xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai/foo"
url.userinfo                  # => ""
url.normalized.userinfo       # => ""
url.user                      # => ""
url.normalized.user           # => ""
url.password                  # => ""
url.normalized.password       # => ""
url.valid?                    # => "true"
url.normalized.valid?         # => "true"
url.to_s                      # => "http://räksmörgås.макдональдс.рф/foo"
url.normalized.to_s           # => "http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai/foo"

url = Twingly::URL.parse("http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai/foo")
url.scheme                    # => "http"
url.normalized.scheme         # => "http"
url.trd                       # => "xn--rksmrgs-5wao1o"
url.normalized.trd            # => "xn--rksmrgs-5wao1o"
url.sld                       # => "xn--80aalb1aicli8a5i"
url.normalized.sld            # => "xn--80aalb1aicli8a5i"
url.tld                       # => "xn--p1ai"
url.normalized.tld            # => "xn--p1ai"
url.ttld                      # => "xn--p1ai"
url.normalized.ttld           # => "xn--p1ai"
url.domain                    # => "xn--80aalb1aicli8a5i.xn--p1ai"
url.normalized.domain         # => "xn--80aalb1aicli8a5i.xn--p1ai"
url.host                      # => "xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai"
url.normalized.host           # => "xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai"
url.origin                    # => "http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai"
url.normalized.origin         # => "http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai"
url.path                      # => "/foo"
url.normalized.path           # => "/foo"
url.without_scheme            # => "//xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai/foo"
url.normalized.without_scheme # => "//xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai/foo"
url.userinfo                  # => ""
url.normalized.userinfo       # => ""
url.user                      # => ""
url.normalized.user           # => ""
url.password                  # => ""
url.normalized.password       # => ""
url.valid?                    # => "true"
url.normalized.valid?         # => "true"
url.to_s                      # => "http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai/foo"
url.normalized.to_s           # => "http://xn--rksmrgs-5wao1o.xn--80aalb1aicli8a5i.xn--p1ai/foo"

url = Twingly::URL.parse("https://admin:correcthorsebatterystaple@example.com/")
url.scheme                    # => "https"
url.normalized.scheme         # => "https"
url.trd                       # => ""
url.normalized.trd            # => "www"
url.sld                       # => "example"
url.normalized.sld            # => "example"
url.tld                       # => "com"
url.normalized.tld            # => "com"
url.ttld                      # => "com"
url.normalized.ttld           # => "com"
url.domain                    # => "example.com"
url.normalized.domain         # => "example.com"
url.host                      # => "example.com"
url.normalized.host           # => "www.example.com"
url.origin                    # => "https://example.com"
url.normalized.origin         # => "https://www.example.com"
url.path                      # => "/"
url.normalized.path           # => "/"
url.without_scheme            # => "//admin:correcthorsebatterystaple@example.com/"
url.normalized.without_scheme # => "//admin:correcthorsebatterystaple@www.example.com/"
url.userinfo                  # => "admin:correcthorsebatterystaple"
url.normalized.userinfo       # => "admin:correcthorsebatterystaple"
url.user                      # => "admin"
url.normalized.user           # => "admin"
url.password                  # => "correcthorsebatterystaple"
url.normalized.password       # => "correcthorsebatterystaple"
url.valid?                    # => "true"
url.normalized.valid?         # => "true"
url.to_s                      # => "https://admin:correcthorsebatterystaple@example.com/"
url.normalized.to_s           # => "https://admin:correcthorsebatterystaple@www.example.com/"

Dependencies

Only the gems listed in the Gem Specification.

Development

To inspect the Public Suffix List, use this command (it also works in projects that use twingly-url as a dependency):

open $(bundle show public_suffix)/data/list.txt
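
When reading the list, it helps to know that lines starting with `//` are comments and that every other non-blank line is a suffix rule. The sketch below filters a small inline sample down to its rules; `suffix_rules` is a hypothetical helper, not part of twingly-url or public_suffix, and the sample text only imitates the layout of the real file.

```ruby
# Hypothetical helper for illustration: returns the suffix rules from
# Public Suffix List text, skipping blank lines and "//" comments.
def suffix_rules(list_text)
  list_text.each_line
           .map(&:strip)
           .reject { |line| line.empty? || line.start_with?("//") }
end

# Small sample imitating the layout of data/list.txt.
sample = <<~LIST
  // ===BEGIN ICANN DOMAINS===
  com

  // uk
  uk
  co.uk
  *.ck
  !www.ck
LIST

rules = suffix_rules(sample)
puts rules.grep(/uk\z/) # rules ending in "uk"

# To run this against the installed list instead (assumes the
# public_suffix gem is available in the current environment):
#   dir = Gem::Specification.find_by_name("public_suffix").gem_dir
#   rules = suffix_rules(File.read(File.join(dir, "data", "list.txt")))
```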

Tests

Run tests with

bundle exec rake

Profiling

Some profiling tasks are available through Rake:

cd profile/
bundle # Install dependencies
bundle exec rake -T # Show available tasks

Note that this isn't a benchmark: we're using ruby-prof and memory_profiler, which slow things down.

Release workflow

  • Update the examples in this README if needed. Generate the output with:

      ruby examples/url.rb
    
  • Bump the version in lib/twingly/version.rb in a commit, no need to push (the release task does that).

  • Ensure you are signed in to RubyGems.org as twingly with gem signin.

  • Build and publish the gem. This will create the proper tag in git, push the commit and tag and upload to RubyGems.

      bundle exec rake release
    
  • Update the changelog with GitHub Changelog Generator (gem install github_changelog_generator if you don't have it, set CHANGELOG_GITHUB_TOKEN to a personal access token to avoid rate limiting by GitHub). This command will update CHANGELOG.md. You need to commit and push manually.

      github_changelog_generator
    
