
cre2's Introduction

This is a fork of http://github.com/marcomaggi/cre2/ with the following changes:

  • use CMake to replace autoconf; the build only generates a static library.
  • add a shortcut in DEFINE_MATCH_REX_FUN for calls that do not need any matches returned, which speeds up my own usage.
  • include Go wrappers adapted from https://github.com/wordijp/golang-re2, which patched cre2's header for better cgo integration.
  • modify the Go wrapper to compile with newer Go compilers (see the note below) and to accommodate the DEFINE_MATCH_REX_FUN optimization in the C binding.

to use the C binding

  1. install the re2 package provided by your Linux distribution, or build re2 yourself; either way, make sure the static library libre2.a is available.
  2. mkdir build && cd build
  3. run cmake .. which will look for re2 in the system paths. If re2 is not found, build it yourself and point cmake at it, e.g. cmake -Dre2_DIR=/<re2-src>/build/install/lib/cmake/re2 ..
    • re2_DIR should contain a usable cmake import script such as re2Config.cmake
    • use -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-g" to build the most optimized (-O3) version with debugging symbols
    • use -DCMAKE_BUILD_TYPE=RelWithDebInfo to build the regular optimized (-O2) version with debugging symbols
  4. run make install to put the installed result in build/install/

#include <chrono>
#include <iostream>
#include <string>
#include <cre2.h>

// Compile regstr once, then time 100000 partial matches against textstr.
bool cre2_demo(std::string regstr, std::string textstr) {
    bool ret = false;
    auto opt = cre2_opt_new();
    cre2_opt_set_log_errors(opt, 0);          // keep RE2 from logging errors
    cre2_opt_set_max_mem(opt, 1024*1024*10);  // 10 MiB memory budget
    auto regexp = cre2_new(regstr.c_str(), regstr.length(), opt);
    const cre2_string_t text = {textstr.c_str(), int(textstr.length())};
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < 100000; ++i) {
      // NULL, 0: no match array is requested, only the match/no-match result.
      ret = cre2_partial_match_re(regexp, &text, NULL, 0);
    }
    auto finish = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::nanoseconds>(finish-start).count() << "ns\n";
    cre2_delete(regexp);
    cre2_opt_delete(opt);
    return ret;
}

to use the go binding

  1. make sure the C binding is properly compiled.
  2. copy or symlink libre2.a and libstdc++.a next to the static C binding libcre2.a.
  3. go get -v github.com/tsingakbar/cre2
  4. import this package and use it in your code. Currently the Go wrapper is 1.5-2.0x slower than the C binding in my tests, but it is still much faster than Go's standard regexp package, which claims to implement the same algorithm as RE2 but is written entirely in Go.
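
// regexpStr (string) and textbytes ([]byte) are assumed to be defined by the caller.
var (
	regexpFilter *re2.Regexp
	closer       *re2.Closer
	err          error
)
if regexpFilter, closer, err = re2.Compile(regexpStr); err != nil {
	panic(err)
}
var result bool
var bEpoch = time.Now()
// Time 100000 matches, mirroring the loop in the C binding example.
for i := 0; i < 100000; i++ {
	result = regexpFilter.Match(textbytes)
}
fmt.Printf("cre2 %v\n", time.Since(bEpoch))
fmt.Println(result)
// Release the compiled regexp through its closer.
closer.Close(regexpFilter)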
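
To reproduce the comparison against the standard library on your own workload, a baseline along the following lines may help. It is only a sketch: the helper name timeStdlibMatch, the pattern and the input text are placeholders that do not come from this repository, and the loop simply mirrors the one above using Go's regexp package.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// timeStdlibMatch (hypothetical helper) compiles pattern with the standard
// regexp package, runs Match n times and returns the last result together
// with the elapsed wall-clock time.
func timeStdlibMatch(pattern string, text []byte, n int) (bool, time.Duration, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, 0, err
	}
	var result bool
	start := time.Now()
	for i := 0; i < n; i++ {
		result = re.Match(text)
	}
	return result, time.Since(start), nil
}

func main() {
	// Placeholder pattern and input; substitute the ones you benchmark cre2 with.
	matched, elapsed, err := timeStdlibMatch(`h.llo\s+w[a-z]+d`, []byte("say hello   world"), 100000)
	if err != nil {
		panic(err)
	}
	fmt.Printf("regexp %v\n", elapsed)
	fmt.Println(matched)
}
```

Running both loops with the same pattern and input on the same machine gives numbers that can be compared directly; the ratio will depend on the pattern.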

NOTE: Since Go 1.6, you can no longer create a C struct such as cre2_string_t in Go memory and set one of its fields to another pointer into Go memory, for example pointing cre2_string_t::data at a slice's backing array. To work around this, you have to write wrappers that allocate such C structs in C memory (stack space is preferred). So far I have only covered cre2_partial_match_re() to fulfill my own needs. You will probably need to implement the others in re2.go, so pull requests are welcome.
