hrbrmstr / iptrie

𐂷 Efficiently Store and Query 'IPv4' Internet Addresses with Associated Data

License: Apache License 2.0

R 37.67% C 62.33%
r rstats ip-address cidr trie r-cyber ipv4-trie ipv4-address internet-address ip-trie

iptrie

Efficiently Store and Query ‘IPv4’ Internet Addresses with Associated Data

Description

Tries are a great way to store data that has obvious hierarchical properties, such as ‘IPv4’ addresses. Methods are provided to create ‘IPv4’ tries and to store, retrieve and delete ‘IPv4’ address keys with values. Functions are based on the ‘zmap’ ‘iptree’ ‘C’ library.
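
To make the idea concrete, here is a minimal pure-R sketch of a binary trie keyed on address bits, with a longest-prefix lookup. It is only an illustration under stated assumptions: the helper names (ip_bits, trie_insert, trie_lookup) are hypothetical, it uses R environments as nodes, and it is not how the package or the ‘zmap’ C code is implemented.

```r
# Hypothetical helper: dotted-quad string -> 32 bits, most significant first.
ip_bits <- function(ip) {
  octets <- as.integer(strsplit(ip, ".", fixed = TRUE)[[1]])
  # intToBits() is least-significant-bit first, so reverse each octet's low 8 bits
  unlist(lapply(octets, function(o) rev(as.integer(intToBits(o))[1:8])))
}

# Walk (and create) one node per prefix bit, then store the value at the end.
# Nodes are environments; children live under the names "0" and "1".
trie_insert <- function(root, cidr, value) {
  parts <- strsplit(cidr, "/", fixed = TRUE)[[1]]
  bits  <- ip_bits(parts[1])
  mask  <- as.integer(parts[2])
  node  <- root
  for (b in bits[seq_len(mask)]) {
    key <- as.character(b)
    if (is.null(node[[key]])) node[[key]] <- new.env()
    node <- node[[key]]
  }
  node$value <- value
  invisible(root)
}

# Follow the address bits down the trie, remembering the deepest value seen.
# That deepest value belongs to the most specific (longest) matching prefix.
trie_lookup <- function(root, ip) {
  node <- root
  best <- node$value
  for (b in ip_bits(ip)) {
    node <- node[[as.character(b)]]
    if (is.null(node)) break
    if (!is.null(node$value)) best <- node$value
  }
  best
}

root <- new.env()
trie_insert(root, "10.0.0.0/8", "CORP")
trie_insert(root, "10.1.10.0/24", "HOME")

trie_lookup(root, "10.1.10.1")   # "HOME": the /24 is more specific than the /8
trie_lookup(root, "10.2.0.1")    # "CORP": only the /8 covers this address
trie_lookup(root, "192.168.1.1") # NULL: no stored prefix covers it
```

Because every lookup walks at most 32 nodes, the cost does not depend on how many prefixes are stored, which is the property that makes tries attractive for this kind of data.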

NOTE

This is an experiment that is unlikely to become a CRAN package or to be migrated into either the iptools or astools package.

I initially wanted to feel the pain (again) of using R’s own C interface (i.e. no Rcpp crutch) since it’s easy to forget just how handy Rcpp is. (It turns out that my ill memories of playing at the R C level are also unjustified).

This is orders of magnitude slower than the method used in astools::as_asntrie() for an astools::routeviews_latest() data set.

What’s Inside The Tin

The following functions are implemented:

  • as_iptrie: Convert a data frame to an IP trie
  • iptrie: Efficiently Store and Query ‘IPv4’ Internet Addresses with Associated Data
  • iptrie_create: Create a new IPv4 Trie
  • iptrie_destroy: Destroy an IP trie
  • iptrie_insert: Insert a value for an IPv4 Address+CIDR combo into an IPv4 Trie
  • iptrie_ip_in: Look up a value for an IPv4 Address+CIDR combo in an IPv4 Trie or Test for Existence
  • iptrie_lookup: Look up a value for an IPv4 Address+CIDR combo in an IPv4 Trie or Test for Existence
  • iptrie_remove: Remove a trie entry for an IPv4 Address+CIDR combo from an IPv4 Trie
  • is_iptrie: Test whether an object is an IP trie

Installation

devtools::install_git("https://sr.ht.com/~hrbrmstr/iptrie.git")
# or
devtools::install_git("https://gitlab.com/hrbrmstr/iptrie.git")
# or (if you must)
devtools::install_github("hrbrmstr/iptrie")

Usage

library(iptrie)
library(tidyverse)

# current version
packageVersion("iptrie")
## [1] '0.1.0'

Basic Usage

x <- iptrie_create()

iptrie_insert(x, "10.1.10.0/24", "HOME")

iptrie_ip_in(x,"10.1.10.1/32")
## [1] TRUE

iptrie_ip_in(x,"10.1.11.1/32")
## [1] FALSE

iptrie_lookup(x, "10.1.10.1/32")
## [1] "HOME"
## attr(,"ip")
## [1] "10.1.10.0"
## attr(,"ipn")
## [1] 167840256
## attr(,"mask")
## [1] 24

iptrie_lookup(x, "10.1.10.1/32", "exact")
## NULL
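
The attributes on the lookup result can be checked by hand. The base-R sketch below (hypothetical helpers ip_to_num and in_cidr, not part of the package) shows where the "ipn" value comes from and why 10.1.10.1 falls inside the stored /24 while 10.1.11.1 does not. The "exact" lookup above presumably returns NULL because only the /24 was inserted, not a /32.

```r
# Hypothetical helper: dotted-quad string -> numeric value of the 32-bit address.
# Doubles are used on purpose: R integers are signed 32-bit and overflow above 127.x.x.x.
ip_to_num <- function(ip) {
  octets <- as.numeric(strsplit(ip, ".", fixed = TRUE)[[1]])
  sum(octets * 256^(3:0))
}

# An address is inside network/mask when both share the same leading `mask` bits,
# i.e. when dropping the low (32 - mask) bits leaves the same number.
in_cidr <- function(ip, network, mask) {
  block <- 2^(32 - mask)
  floor(ip_to_num(ip) / block) == floor(ip_to_num(network) / block)
}

ip_to_num("10.1.10.0")                 # 167840256, the "ipn" attribute shown above
in_cidr("10.1.10.1", "10.1.10.0", 24)  # TRUE
in_cidr("10.1.11.1", "10.1.10.0", 24)  # FALSE
```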

Data frame to iptrie

xdf <- data.frame(a = "10.1.10.0/24", b = "HOME", stringsAsFactors = FALSE)

(xt <- as_iptrie(xdf))
## <iptrie>

is_iptrie(xt)
## [1] TRUE

(xt <- as_iptrie(xdf, "a", "b"))
## <iptrie>

iptrie_ip_in(xt, "10.1.10.6")
## [1] TRUE

Bigger example (kinda what as.data.frame.iptrie does)

# Make a trie from autonomous system CIDRs
xdf <- astools::routeviews_latest()

cat(scales::comma(nrow(xdf)), "\n")
## 790,647

asntrie <- iptrie_create()

system.time(for (i in seq_len(nrow(xdf))) {
  iptrie_insert(asntrie, xdf[["cidr"]][i], xdf[["asn"]][i])
})
##    user  system elapsed 
##   8.835   0.178   9.114

# Get a block list (picked at random)
blklst <- url("https://iplists.firehol.org/files/botscout_1d.ipset")
y <- readLines(blklst)
close(blklst)

y <- y[!grepl("^#", y)] # comments at the top

cat(scales::comma(length(y)), "\n")
## 1,079

system.time(do.call(
  rbind.data.frame,
  lapply(y, function(.x) {
    r <- iptrie_lookup(asntrie, .x, "best")
    if (is.null(r)) {
      data.frame(
        ip = .x, 
        asn = NA_character_, 
        cidr = NA_character_,
        stringsAsFactors = FALSE
      )
    } else {
      data.frame(
        ip = .x, 
        asn = r, 
        cidr = sprintf("%s/%d", attr(r, "ip"), attr(r, "mask")),
        stringsAsFactors = FALSE
      )
    }
  })
) -> cdf)
##    user  system elapsed 
##   0.303   0.005   0.314

as_tibble(cdf)
## # A tibble: 1,079 x 3
##    ip            asn    cidr          
##    <chr>         <chr>  <chr>         
##  1 1.10.186.158  23969  1.10.186.0/24 
##  2 1.20.100.251  23969  1.20.100.0/24 
##  3 1.20.253.128  23969  1.20.253.0/24 
##  4 1.179.157.237 131293 1.179.157.0/24
##  5 1.179.198.37  131293 1.179.198.0/24
##  6 2.61.150.231  12389  2.61.0.0/16   
##  7 2.61.173.36   12389  2.61.0.0/16   
##  8 2.95.6.233    3216   2.95.6.0/24   
##  9 2.188.164.58  42337  2.188.164.0/22
## 10 5.8.37.214    50896  5.8.37.0/24   
## # … with 1,069 more rows

count(cdf, cidr, sort=TRUE)
## # A tibble: 786 x 2
##    cidr                 n
##    <chr>            <int>
##  1 185.220.101.0/24    19
##  2 5.188.210.0/24      11
##  3 51.15.0.0/17        10
##  4 183.198.0.0/16       8
##  5 199.249.230.0/24     8
##  6 172.68.244.0/22      7
##  7 173.44.224.0/22      7
##  8 31.184.238.0/24      7
##  9 192.42.116.0/22      6
## 10 104.223.0.0/17       5
## # … with 776 more rows

iptrie Metrics

Lang          # Files  (%)   LoC  (%)   Blank lines  (%)   # Lines  (%)
C                   3  0.23  346  0.65           37  0.32        7  0.03
R                   8  0.62  103  0.19           25  0.22      165  0.66
Rmd                 1  0.08   54  0.10           45  0.39       62  0.25
C/C++ Header        1  0.08   30  0.06            9  0.08       15  0.06

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
