Giter Site home page Giter Site logo

cidranger's Introduction

cidranger

Fast IP to CIDR block(s) lookup using trie in Golang, inspired by IPv4 route lookup linux. Possible use cases include detecting if a IP address is from published cloud provider CIDR blocks (e.g. 52.95.110.1 is contained in published AWS Route53 CIDR 52.95.110.0/24), IP routing rules, etc.

GoDoc Reference Build Status Coverage Status Go Report Card

This is visualization of a trie storing CIDR blocks 128.0.0.0/2 192.0.0.0/2 200.0.0.0/5 without path compression, the 0/1 number on the path indicates the bit value of the IP address at specified bit position, hence the path from root node to a child node represents a CIDR block that contains all IP ranges of its children, and children's children.

Visualization of trie storing same CIDR blocks with path compression, improving both lookup speed and memory footprint.

Getting Started

Configure imports.

import (
  "net"

  "github.com/yl2chen/cidranger"
)

Create a new ranger implemented using Path-Compressed prefix trie.

ranger := NewPCTrieRanger()

Inserts CIDR blocks.

_, network1, _ := net.ParseCIDR("192.168.1.0/24")
_, network2, _ := net.ParseCIDR("128.168.1.0/24")
ranger.Insert(NewBasicRangerEntry(*network1))
ranger.Insert(NewBasicRangerEntry(*network2))

To attach any additional value(s) to the entry, simply create custom struct storing the desired value(s) that implements the RangerEntry interface:

type RangerEntry interface {
	Network() net.IPNet
}

The prefix trie can be visualized as:

0.0.0.0/0 (target_pos:31:has_entry:false)
| 1--> 128.0.0.0/1 (target_pos:30:has_entry:false)
| | 0--> 128.168.1.0/24 (target_pos:7:has_entry:true)
| | 1--> 192.168.1.0/24 (target_pos:7:has_entry:true)

To test if given IP is contained in constructed ranger,

contains, err = ranger.Contains(net.ParseIP("128.168.1.0")) // returns true, nil
contains, err = ranger.Contains(net.ParseIP("192.168.2.0")) // returns false, nil

To get all the networks given is contained in,

containingNetworks, err = ranger.ContainingNetworks(net.ParseIP("128.168.1.0"))

To get all networks in ranger,

entries, err := ranger.CoveredNetworks(*AllIPv4) // for IPv4
entries, err := ranger.CoveredNetworks(*AllIPv6) // for IPv6

Benchmark

Compare hit/miss case for IPv4/IPv6 using PC trie vs brute force implementation, Ranger is initialized with published AWS ip ranges (889 IPv4 CIDR blocks and 360 IPv6)

// Ipv4 lookup hit scenario
BenchmarkPCTrieHitIPv4UsingAWSRanges-4         	 5000000	       353   ns/op
BenchmarkBruteRangerHitIPv4UsingAWSRanges-4    	  100000	     13719   ns/op

// Ipv6 lookup hit scenario, counter-intuitively faster then IPv4 due to less IPv6 CIDR
// blocks in the AWS dataset, hence the constructed trie has less path splits and depth.
BenchmarkPCTrieHitIPv6UsingAWSRanges-4         	10000000	       143   ns/op
BenchmarkBruteRangerHitIPv6UsingAWSRanges-4    	  300000	      5178   ns/op

// Ipv4 lookup miss scenario
BenchmarkPCTrieMissIPv4UsingAWSRanges-4        	20000000	        96.5 ns/op
BenchmarkBruteRangerMissIPv4UsingAWSRanges-4   	   50000	     24781   ns/op

// Ipv6 lookup miss scenario
BenchmarkPCTrieHMissIPv6UsingAWSRanges-4       	10000000	       115   ns/op
BenchmarkBruteRangerMissIPv6UsingAWSRanges-4   	  100000	     10824   ns/op

Example of IPv6 trie:

::/0 (target_pos:127:has_entry:false)
| 0--> 2400::/14 (target_pos:113:has_entry:false)
| | 0--> 2400:6400::/22 (target_pos:105:has_entry:false)
| | | 0--> 2400:6500::/32 (target_pos:95:has_entry:false)
| | | | 0--> 2400:6500::/39 (target_pos:88:has_entry:false)
| | | | | 0--> 2400:6500:0:7000::/53 (target_pos:74:has_entry:false)
| | | | | | 0--> 2400:6500:0:7000::/54 (target_pos:73:has_entry:false)
| | | | | | | 0--> 2400:6500:0:7000::/55 (target_pos:72:has_entry:false)
| | | | | | | | 0--> 2400:6500:0:7000::/56 (target_pos:71:has_entry:true)
| | | | | | | | 1--> 2400:6500:0:7100::/56 (target_pos:71:has_entry:true)
| | | | | | | 1--> 2400:6500:0:7200::/56 (target_pos:71:has_entry:true)
| | | | | | 1--> 2400:6500:0:7400::/55 (target_pos:72:has_entry:false)
| | | | | | | 0--> 2400:6500:0:7400::/56 (target_pos:71:has_entry:true)
| | | | | | | 1--> 2400:6500:0:7500::/56 (target_pos:71:has_entry:true)
| | | | | 1--> 2400:6500:100:7000::/54 (target_pos:73:has_entry:false)
| | | | | | 0--> 2400:6500:100:7100::/56 (target_pos:71:has_entry:true)
| | | | | | 1--> 2400:6500:100:7200::/56 (target_pos:71:has_entry:true)
| | | | 1--> 2400:6500:ff00::/64 (target_pos:63:has_entry:true)
| | | 1--> 2400:6700:ff00::/64 (target_pos:63:has_entry:true)
| | 1--> 2403:b300:ff00::/64 (target_pos:63:has_entry:true)

cidranger's People

Contributors

dvrkps avatar ldkingvivi avatar maciej avatar mkorolyov avatar mudream4869 avatar readams avatar satta avatar securityguy avatar steenzout avatar yl2chen avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

cidranger's Issues

Custom Ranger Insert/Remove Hooks

Having the ability to supply custom insertion/removal function hooks would be really useful if the data being added was a complex object like a map/slice/struct where specific merge behavior could be specified. This is primarily for the use case where I'm inserting multiple times for the same network with different data as this prevents the need to buffer and merge before inserting into the trie.

Custom RangerEntry with Overlapping Networks

As posted below, you indicate that there is a way to handle overlapping CIDRs with a custom RangerEntry type. However, it doesn't seem to be possible from what I can tell.

We have a use case where we have multiple subnets that overlap, but are in different VRFs. We want to be able to load all these networks, enriched with the VRF in a custom RangerEntry type, and then retreive this data based on the VRF information.

Please review the code here to see if there is a way to do what we are looking for. - https://play.golang.org/p/9_1PKzEixkr
Thanks!


Yes, deduplication were intentional, the idea behind it is that this will any metadata that you would like to attach to a CIDR block could be done through implementing your own custom type that implements the RangerEntry interface

type RangerEntry interface {

Does that fit your use case?

Originally posted by @yl2chen in #14 (comment)

trie.go:insert - fix proposal for Ipv4\Ipv6 in same trie

func (p *prefixTrie) insert(network rnet.Network, entry RangerEntry) (bool, error) {
        
            *                *                   *

	// No existing child, insert new leaf trie.
	if existingChild == nil || existingChild.network.Number == nil {

          *                   *                   *

	}

          *                  *                  *

	// Check whether it is necessary to insert additional path prefix between current trie and existing child,
	// in the case that inserted network diverges on its path to existing child.
	if lcb, err := network.LeastCommonBitPosition(existingChild.network); err == nil {

          *                   *                  *

	} else {
		return false, err
	}

	return existingChild.insert(network, entry)
}

out of memory error when adding many random ip nets

Following code will produce an out of memory fatal error:

        trie := cidranger.NewPCTrieRanger()
        rand.Seed(1)
        addr := make([]byte, net.IPv4len)
        for n := 0; n < 1000; n += 1 {
                rand.Read(addr)
                cidr := int(rand.Uint32() % uint32(net.IPv4len*8))
                mask := net.CIDRMask(cidr, net.IPv4len*8)
                trie.Insert(cidranger.NewBasicRangerEntry(net.IPNet{
                        IP:   addr,
                        Mask: mask,
                }))
        }

Log shows that it's likely due to too many (*prefixTrie).insert recursive calls.

Slow performance populating tree

I was looking at various modules that can provide a fast lookup for whether an IP is in a list, and while testing this module, it turned out to have significantly slower load time compared to the other alternatives I looked at. The lookup is indeed faster, but not enough to justify the increased load time, which for me is prohibitively slow (20x slower than the fastest alternative, and takes over a minute to load a list of about 5 million IPs). I just wanted to create this issue so you're aware, and in case there's something which can be done to increase performance.

Benchmark code
package main

import (
	"encoding/binary"
	"math/rand"
	"net"
	"strconv"
	"testing"

	"github.com/asergeyev/nradix"
	"github.com/infobloxopen/go-trees/iptree"
	"github.com/yl2chen/cidranger"
)


var rng = rand.New(rand.NewSource(0))

func randIP(bits int) net.IP {
	var ipa [4]byte
	ip := net.IP(ipa[:])
	binary.BigEndian.PutUint32(ip[:], uint32(rng.Intn(1<<bits)<<(32-bits)))
	return ip
}

var IPs []string
func init() {
	for len(IPs) < 100000 {
		ip := randIP(24)
		mask := strconv.Itoa(rand.Intn(25) + 8)
		IPs = append(IPs, ip.String() + "/" + mask)
	}
}

func BenchmarkRangerLoad(b *testing.B) {
	for n := 0; n < b.N; n++ {
		ranger := cidranger.NewPCTrieRanger()
		for _, ipStr := range IPs {
			_, ipnet, _ := net.ParseCIDR(ipStr)
			ranger.Insert(cidranger.NewBasicRangerEntry(*ipnet))
		}
	}
}

func BenchmarkInfobloxLoad(b *testing.B) {
	for n := 0; n < b.N; n++ {
		ipt := iptree.NewTree()
		for _, ipStr := range IPs {
			_, ipnet, _ := net.ParseCIDR(ipStr)
			ipt.InplaceInsertNet(ipnet, nil)
		}
	}
}

func BenchmarkNRadixLoad(b *testing.B) {
	for n := 0; n < b.N; n++ {
		tree := nradix.NewTree(0)
		for _, ipStr := range IPs {
			tree.AddCIDR(ipStr, nil)
		}
	}
}

func BenchmarkRangerRead(b *testing.B) {
	b.StopTimer()
	ranger := cidranger.NewPCTrieRanger()
	for _, ipStr := range IPs {
		_, ipnet, _ := net.ParseCIDR(ipStr)
		ranger.Insert(cidranger.NewBasicRangerEntry(*ipnet))
	}
	b.StartTimer()

	for n := 0; n < b.N; n++ {
		ranger.Contains(randIP(32))
	}
}

func BenchmarkInfobloxRead(b *testing.B) {
	b.StopTimer()
	ipt := iptree.NewTree()
	for _, ipStr := range IPs {
		_, ipnet, _ := net.ParseCIDR(ipStr)
		ipt.InplaceInsertNet(ipnet, nil)
	}
	b.StartTimer()

	for n := 0; n < b.N; n++ {
		ipt.GetByIP(randIP(32))
	}
}

func BenchmarkNRadixRead(b *testing.B) {
	b.StopTimer()
	tree := nradix.NewTree(0)
	for _, ipStr := range IPs {
		tree.AddCIDR(ipStr, nil)
	}
	b.StartTimer()

	for n := 0; n < b.N; n++ {
		ip := randIP(32)
		tree.FindCIDR(ip.String() + "/32")
	}
}
BenchmarkRangerLoad-16      	       2	 797361462 ns/op
BenchmarkInfobloxLoad-16    	      14	  88521051 ns/op
BenchmarkNRadixLoad-16      	      27	  40973037 ns/op
BenchmarkRangerRead-16      	 8325171	       156.6 ns/op
BenchmarkInfobloxRead-16    	 2999112	       450.3 ns/op
BenchmarkNRadixRead-16      	 3116860	       362.2 ns/op

ย 

Edit: Update: In the end I ended up with a private fork that strips out a ton of code from cidranger to get the performance up. Mostly I removed support for all the different address types and used only netip types, as well as adding a loader which uses the last insert point as the starting point to search for a location for the new node. This got load times over 20x faster, and lookups 1.5x faster.
https://gist.github.com/phemmer/6231b12d5207ea93a1690ddc44a2c811

Trying to insert next IP (/32 subnet) causes index out of range

Example code

func main() {
	ranger = cidranger.NewPCTrieRanger()
	_, network1, _ := net.ParseCIDR("1.2.3.4/32")
	ranger.Insert(cidranger.NewBasicRangerEntry(*network1))

	_, network2, _ := net.ParseCIDR("1.2.3.5/32") // next IP
	ranger.Insert(cidranger.NewBasicRangerEntry(*network2))
}

causes panic: runtime error: index out of range

ip route get

could i get gateway over this by a specific host

Releasing problem : v1.0.0 differs between Github and pkg.go.dev

Thanks for the wonderful package, @yl2chen .

Go's go mod tooling is pulling the wrong v1.0.0 of cidranger from :
https://pkg.go.dev/github.com/yl2chen/[email protected]?tab=doc

Notice the release date of 21-December-2019 above, and that cidranger.AllIPv4 and cidranger.AllIPv6 including the iterator implementation is missing from both sources and the documentation.

This is in variance from the Github releases area, where v1.0.0 with date 24-December-2019, has all the above enhancements: https://github.com/yl2chen/cidranger/releases

sum.golang.org tag error

Hello!

I am getting an error trying to go get this code.

sum.golang.org: SECURITY ERROR
This download does NOT match the one reported by the checksum server.
The bits may have been replaced on the origin server, or an attacker may\nhave intercepted the download attempt.

For more information, see 'go help module-auth'.
" http-method=GET http-path=/github.com/yl2chen/cidranger/@v/v1.0.0.info http-url=/github.com/yl2chen/cidranger/@v/v1.0.0.info kind="Internal Server Error" module=github.com/yl2chen/cidranger operation=download.InfoHandler ops="[download.InfoHandler pool.Info protocol.Info protocol.processDownload redis.Stash stash.Pool stasher.Stash stasher.fetchModule goGetFetcher.Fetch module.downloadModule]" version=v1.0.0

Did you overwrite the 1.0.0 tag when you updated the coverage recently? I think bumping the tag to 1.0.1 might fix the problem.

Finding exact ip from /32 subnet causes index out of range

Similar to #3 - when you try to search for exact IP from "one ip subnet" code panics with runtime error: index out of range .

Test to replicate the issue:

func TestToReplicateIssue(t *testing.T) {
	cases := []struct {
		version  rnet.IPVersion
		inserts  []string
		ip       net.IP
		networks []string
		name     string
	}{
		{
			rnet.IPv4,
			[]string{"192.168.0.1/32"},
			net.ParseIP("192.168.0.1"),
			[]string{"192.168.0.1/32"},
			"basic containing network for /32 mask",
		},
		{
			rnet.IPv6,
			[]string{"a::1/128"},
			net.ParseIP("a::1"),
			[]string{"a::1/128"},
			"basic containing network for /128 mask",
		},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			trie := newPrefixTree(tc.version)
			for _, insert := range tc.inserts {
				_, network, _ := net.ParseCIDR(insert)
				err := trie.Insert(NewBasicRangerEntry(*network))
				assert.NoError(t, err)
			}
			expectedEntries := []RangerEntry{}
			for _, network := range tc.networks {
				_, net, _ := net.ParseCIDR(network)
				expectedEntries = append(expectedEntries, NewBasicRangerEntry(*net))
			}
			networks, err := trie.ContainingNetworks(tc.ip)
			assert.NoError(t, err)
			assert.Equal(t, expectedEntries, networks)
		})
	}
}

I am looking for a range to CIDR function code So I can use this library

I have an ipfilter.dat and guardian.p2p list which I want to load into this library instance.

The format is something like the next: first math IP(v4/6) comma last match IP(V4/6) comma score integer comma description string

# This is a comment.  These ranges are blocked:
001.000.000.000 , 001.255.255.255 , 100 , Some organization
008.000.000.000 , 008.255.255.255 , 100 , Another organization

I have the library to parse these files into a Ranger which works fine.
However I want to be able to convert the ranges into CIDR and add them into the library.

The next example is being used to find the right CIDR match for IPv4:

Is there any example out there for IPV6?

Thread safe?

Can a global ranger be used across many goroutines for read only search operations?

Iterating over the cidrs

Hi!

Two questions:

I created a ranger and inserted a few entries. Is there a way to iterate over the list of CIDR blocks?

Also, is there something like ranger.Contains() that returns the data instead of a bool? I'm using something that implements the RangerEntry interface and I'd like to get back the entire thing.

Thanks!

Duplicates are not maintained

Hi,

I love the library - thanks for sharing it.

I initially assumed that if I inserted duplicate range entries, then both would be returned by ContainingNetworks(), but it appears that duplicates are eliminated. Looking at this code, I guess that's intentional?

cidranger/trie_test.go

Lines 38 to 41 in 928b519

rnet.IPv4,
[]string{"192.168.0.1/32", "192.168.0.1/32"},
[]string{"192.168.0.1/32"},
"duplicate network insert",

memory leak - removes do not get garbage collected

I'm noticing a behavior where deleted entries seem to not get reaped by garbage collection(runnable example below). For example if I add 1,000,000 entries and then Remove() them all and force a garbage collection, the ranger hangs onto the majority of the memory. If runs to 5 or so, you will see that it just grows and grows. Is this known? I took a look at the code but it's a bit hard for me to follow (as is most code I didn't write ;)

Usage

./cidrtest -ips 1000000 -runs 5

Sample Output

STARTING RUN 0
Starting Inserts
1000000 Inserts Complete
Observe Memory START Alloc = 615 MiB	TotalAlloc = 1828 MiB	Sys = 823 MiB	NumGC = 23
Starting Removes
ENDING RUN 0
Observe Memory END Alloc = 546 MiB	TotalAlloc = 2688 MiB	Sys = 3624 MiB	NumGC = 26
...
STARTING RUN 4
Starting Inserts
1000000 Inserts Complete
Observe Memory START Alloc = 1164 MiB	TotalAlloc = 11699 MiB	Sys = 4585 MiB	NumGC = 39
Starting Removes
ENDING RUN 4
Observe Memory END Alloc = 898 MiB	TotalAlloc = 12350 MiB	Sys = 4654 MiB	NumGC = 40

Code

package main

import (
	"encoding/binary"
	"flag"
	"fmt"
	"math/rand"
	"net"
	"runtime"
	"time"

	"github.com/yl2chen/cidranger"
)

func main() {

	ips := flag.Int("ips", 1000, "Number of IPs.  Default 1000")
	runs := flag.Int("runs", 1000, "Number of runs.  Default 2")
	flag.Parse()

	mask := net.CIDRMask(32, 32)
	trie := cidranger.NewPCTrieRanger()

	// Main loop
	for i := 0; i < *runs; i++ {
		fmt.Printf("STARTING RUN %d\n", i)
		fmt.Printf("Starting Inserts\n")
		for n := 0; n < *ips; n++ {
			trie.Insert(cidranger.NewBasicRangerEntry(net.IPNet{
				IP:   GenIPV4(),
				Mask: mask,
			}))
		}

		fmt.Printf("%d Inserts Complete\n", *ips)
		time.Sleep(5 * time.Second)
		fmt.Printf("Observe Memory START ")
		PrintMemUsage()
		fmt.Printf("Starting Removes\n")
		_, all, _ := net.ParseCIDR("0.0.0.0/0")
		ll, _ := trie.CoveredNetworks(*all)
		for i := 0; i < len(ll); i++ {
			trie.Remove(ll[i].Network())
		}
		ll = nil
		fmt.Printf("ENDING RUN %d\n", i)
		fmt.Printf("Observe Memory END ")
		runtime.GC()
		PrintMemUsage()
		time.Sleep(5 * time.Second)
	}
}

// GenIPV4 generates an IPV4 address
func GenIPV4() net.IP {
	rand.Seed(time.Now().UnixNano())
	var min, max int
	min = 1
	max = 4294967295
	nn := rand.Intn(max-min) + min
	ip := make(net.IP, 4)
	binary.BigEndian.PutUint32(ip, uint32(nn))
	return ip
}

// PrintMemUsage dumps memory stats for the process
func PrintMemUsage() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("Alloc = %v MiB", bToMb(m.Alloc))
	fmt.Printf("\tTotalAlloc = %v MiB", bToMb(m.TotalAlloc))
	fmt.Printf("\tSys = %v MiB", bToMb(m.Sys))
	fmt.Printf("\tNumGC = %v\n", m.NumGC)
}

func bToMb(b uint64) uint64 {
	return b / 1024 / 1024
}

Key-value database with search by cidr

Hi Yulin Chen!

Do you have any ideas how to create some key-value database where key is a cidr or some algorithm to fast find occurrence of an address in network?

For example: You have a lot of network with various masks and some data about this networks. You try to find in which network this address exist and return data. I don't want iterate over in a loop all key-cidr and try to find solution with your cidrranger decision

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.