Boost vs GSL (vinecopulib issue, closed)

vinecopulib commented on June 3, 2024
Boost vs GSL

from vinecopulib.

Comments (11)

tnagler avatar tnagler commented on June 3, 2024

Could you post some numbers on

  • how the speed ratio is affected by the sample size,
  • how the speed compares for fit() and select()?
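
One way to gather such numbers is a small harness that repeats each measurement and sweeps over sample sizes. The sketch below is only an illustration: `pdf_a` and `pdf_b` are hypothetical stand-ins for the Boost and GSL calls (they are not the library functions), and `make_input`, `best_seconds` and `report_scaling` are names invented here. Timing fit() and select() would need the library itself and is not attempted.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical stand-ins for the two implementations being compared.
inline double pdf_a(double x) { return 0.3989422804014327 * std::exp(-0.5 * x * x); }
inline double pdf_b(double x) { return std::exp(-0.5 * x * x) / std::sqrt(2.0 * 3.141592653589793); }

// Deterministic, fully initialised inputs in [-3, 3).
std::vector<double> make_input(std::size_t n)
{
    std::vector<double> x(n);
    for (std::size_t i = 0; i < n; ++i)
        x[i] = -3.0 + 6.0 * static_cast<double>(i) / static_cast<double>(n);
    return x;
}

// Time one pass of f over x, in seconds. The sum is added to `sink` so the
// compiler cannot discard the loop as dead code.
template <typename F>
double seconds_for(F f, const std::vector<double>& x, double& sink)
{
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (double v : x) sum += f(v);
    const auto stop = std::chrono::steady_clock::now();
    sink += sum;
    return std::chrono::duration<double>(stop - start).count();
}

// Best of `reps` runs, which reduces noise from other processes.
template <typename F>
double best_seconds(F f, const std::vector<double>& x, int reps, double& sink)
{
    double best = seconds_for(f, x, sink);
    for (int i = 1; i < reps; ++i) {
        const double t = seconds_for(f, x, sink);
        if (t < best) best = t;
    }
    return best;
}

// Print both timings and their ratio for each sample size; returns the sink.
double report_scaling(const std::vector<std::size_t>& sizes, int reps)
{
    double sink = 0.0;
    for (std::size_t n : sizes) {
        const std::vector<double> x = make_input(n);
        const double ta = best_seconds(pdf_a, x, reps, sink);
        const double tb = best_seconds(pdf_b, x, reps, sink);
        std::cout << "n = " << n << "  a: " << ta << " s  b: " << tb
                  << " s  ratio a/b: " << ta / tb << "\n";
    }
    return sink;
}
```

Calling `report_scaling({1000, 100000, 10000000}, 5)` from `main` prints one line per sample size. For very small sizes the clock resolution may dominate, so those ratios should be read with care.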

tnagler commented on June 3, 2024

Just for reference: with the current version (a783e0d) I get

  • dnorm(): 0.014298 (boost) vs 0.005876 (gsl)
  • pnorm(): 0.017728 (boost) vs 0.004371 (gsl)
  • qnorm(): 0.019062 (boost) vs 0.005081 (gsl)

slayoo commented on June 3, 2024

Have you compiled the Boost version with the same optimisations that the GSL .so was compiled with? With "-Ofast -march=native" I see Boost being almost twice as fast as GSL (dnorm). Without optimisations, I see results similar to those reported above.

#include <functional>
#include <gsl/gsl_cdf.h>
#include <gsl/gsl_randist.h>

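// Element-wise standard normal density via GSL.
// Note: std::ptr_fun is deprecated since C++11 and removed in C++17.
template<typename T> T dnorm_gsl(const T& x)
{
    return x.unaryExpr(std::ptr_fun(gsl_ran_ugaussian_pdf));
}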


#include <boost/bind.hpp>
#include <boost/math/distributions.hpp>
#include <boost/function.hpp>

// Element-wise standard normal density via Boost.Math.
template<typename T> T dnorm(const T& x)
{
    boost::math::normal std_normal;
    return x.unaryExpr(boost::bind<double>(boost::math::pdf<boost::math::normal,double>, std_normal, _1));
}


#include <chrono>
#include <iostream>
#include <string>

// Run `it` once and print the wall-clock time it took.
template <typename T>
void time(const std::string& label, const T &it)
{
    auto start = std::chrono::high_resolution_clock::now();
    it();
    auto finish = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> elapsed = finish - start;
    std::cout << label << "Elapsed time: " << elapsed.count() << " s\n";
}


#include <Eigen/Dense>

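int main()
{
    // Fill the matrix so the benchmark does not read uninitialised memory.
    Eigen::MatrixXd m = Eigen::MatrixXd::Random(10000, 10000);

    // Capture by reference to avoid copying the matrix (about 800 MB) into each lambda.
    time("Boost:", [&m]{ auto a = dnorm(m); });
    time("GSL:  ", [&m]{ auto b = dnorm_gsl(m); });
}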

Usage:

$ clang++-3.6 -Ofast -march=native -std=c++11 -lgsl -lblas test_boost_vs_gsl.cpp
$ ./a.out
Boost:Elapsed time: 1.10801 s
GSL:  Elapsed time: 1.74596 s

$ clang++-3.6 -std=c++11 -lgsl -lblas test_boost_vs_gsl.cpp           
$ ./a.out
Boost:Elapsed time: 11.6608 s
GSL:  Elapsed time: 4.20882 s
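
Since `std::ptr_fun` is removed in C++17, a lambda is one possible replacement for both the `std::ptr_fun` and the `boost::bind` wrappers. The sketch below illustrates the pattern on a plain `std::vector` so that it builds without Eigen, GSL or Boost. `std_normal_pdf` is a hypothetical stand-in for the library density functions, and `map_elements` and `dnorm_vec` are names made up for this example. With Eigen, the same kind of lambda should be usable as the argument to `unaryExpr`, though that is not tested here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for gsl_ran_ugaussian_pdf or boost::math::pdf.
inline double std_normal_pdf(double x)
{
    const double inv_sqrt_2pi = 0.3989422804014327;  // 1 / sqrt(2 * pi)
    return inv_sqrt_2pi * std::exp(-0.5 * x * x);
}

// Apply a scalar function to every element and return the results.
template <typename F>
std::vector<double> map_elements(const std::vector<double>& x, F f)
{
    std::vector<double> out(x.size());
    std::transform(x.begin(), x.end(), out.begin(), f);
    return out;
}

// Element-wise density; the lambda replaces std::ptr_fun / boost::bind.
std::vector<double> dnorm_vec(const std::vector<double>& x)
{
    return map_elements(x, [](double v) { return std_normal_pdf(v); });
}
```

A lambda also gives the compiler a concrete callable type to inline, whereas a wrapped function pointer may or may not be inlined depending on the optimisation level.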

tvatter commented on June 3, 2024

OK, so I changed the release flags (line 8 of compilerDefOpt.cmake) to use -Ofast -march=native instead of -O3:

Boost:Elapsed time: 1.16187 s
GSL:  Elapsed time: 0.929747 s

When using the debug flags (line 7 of compilerDefOpt.cmake), namely -g -O0 -DDEBUG -fsanitize=address -fno-omit-frame-pointer, I get:

Boost:Elapsed time: 21.7116 s
GSL:  Elapsed time: 4.15891 s

Turns out that:

  • I ran all my tests with the debug version.
  • Boost is still slower than GSL but it's not that bad.

slayoo commented on June 3, 2024

Re GSL vs. Boost in debug mode: are you linking with a debug version of GSL for the benchmark? Otherwise the comparison is unfair, as the number crunching might actually happen in optimised GSL code from the precompiled shared library.

Re flags, it may be worth adding -DNDEBUG to CMAKE_CXX_FLAGS_RELEASE as well.
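
For context, defining NDEBUG turns the standard `assert` macro into a no-op, so the checked expression is not even evaluated. Eigen's documentation states that its internal assertions are likewise disabled when NDEBUG is defined. The sketch below shows the standard-library behaviour only; the function names are invented for the illustration, and NDEBUG is defined inside the file to mimic what -DNDEBUG does on the command line.

```cpp
// Mimic passing -DNDEBUG on the command line for the first part of the file.
#ifndef NDEBUG
#define NDEBUG
#endif
#include <cassert>

static int evaluations = 0;

// A check with a visible side effect, so we can tell whether it ran.
static bool counted_check()
{
    ++evaluations;
    return true;
}

// Compiled with NDEBUG defined: assert(expr) expands to ((void)0),
// so counted_check() is never called and the result is 0.
int checks_evaluated_release()
{
    evaluations = 0;
    assert(counted_check());
    return evaluations;
}

// <cassert> may be included repeatedly; each inclusion redefines assert
// according to the current state of NDEBUG. Re-enable assertions here.
#undef NDEBUG
#include <cassert>

// Compiled without NDEBUG: the check runs once and the result is 1.
int checks_evaluated_debug()
{
    evaluations = 0;
    assert(counted_check());
    return evaluations;
}
```

The practical consequence for benchmarks is that a release build without -DNDEBUG still pays for every assertion in header-only code.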

HTH,
Sylwester

tvatter commented on June 3, 2024

No, GSL is the release version, so it is indeed an unfair comparison. Actually, I was mostly interested in the numbers for our release version (compiled with -O3).

By the way, I tried -Ofast and -march=native, but they do not improve much. Furthermore, when using -march=native, I get the following error in test_bicop_select:

test_bicop_select(1439,0x7fff7a92f000) malloc: *** error for object 0x110dc1020: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

Thomas also tried it and it runs fine on his computer, with speed-ups similar to yours. I have a MacBook Pro running OS X; could that be related?

slayoo commented on June 3, 2024

I suggest following the hint in the error message, i.e.:

$ gdb the_failing_binary
(gdb) break malloc_error_break
(gdb) run
(gdb) bt

This should give information on what's wrong with the deallocation.

HTH

tvatter commented on June 3, 2024

I get:

#0  0x00007fff8a3d1f32 in malloc_error_break () from /usr/lib/system/libsystem_malloc.dylib
#1  0x00007fff8a3c2fd2 in free () from /usr/lib/system/libsystem_malloc.dylib
#2  0x000000010000471a in (anonymous namespace)::ParBicopTest_bicop_select_mle_bic_is_correct_Test<IndepBicop>::TestBody() ()
#3  0x00000001000337ae in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
#4  0x000000010001d27e in testing::Test::Run() ()
#5  0x000000010001dd62 in testing::TestInfo::Run() ()
#6  0x000000010001e323 in testing::TestCase::Run() ()
#7  0x00000001000261bb in testing::internal::UnitTestImpl::RunAllTests() ()
#8  0x0000000100033f70 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
#9  0x0000000100025d2e in testing::UnitTest::Run() ()
#10 0x0000000100003ab1 in main ()

I noticed that test_bicop_class passes, but the same error arises in test_bicop_parametric when it calls pdf_is_correct (par_to_tau_is_correct, by contrast, passes).

slayoo commented on June 3, 2024

A Google search turned up a somewhat similar bug report, google/sanitizers#70, where LLVM/Clang's sanitizer was blamed (and reported to have been fixed later). I have no idea how relevant it is.

tnagler commented on June 3, 2024

Not sure if -march=native is a good default option for the release version anyway. Binaries built with it may use instructions that only the build machine's CPU supports, so executables built on one machine cannot safely be run on another.
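
To see what -march=native changes on a given machine, one option is to print the SIMD extensions the compiler was allowed to use. The sketch below relies on macros that GCC and Clang predefine on x86 according to the -march / -m flags; the function name `enabled_simd` is made up for this example, and on other architectures or compilers it simply reports "baseline".

```cpp
#include <string>

// List the x86 SIMD extensions enabled at compile time, based on macros
// that GCC and Clang predefine from the -march / -m flags.
std::string enabled_simd()
{
    std::string s;
#if defined(__SSE2__)
    s += "sse2 ";
#endif
#if defined(__SSE4_2__)
    s += "sse4.2 ";
#endif
#if defined(__AVX__)
    s += "avx ";
#endif
#if defined(__AVX2__)
    s += "avx2 ";
#endif
#if defined(__FMA__)
    s += "fma ";
#endif
#if defined(__AVX512F__)
    s += "avx512f ";
#endif
    if (s.empty()) s = "baseline";
    return s;
}
```

Building once with -O2 and once with -O2 -march=native and comparing the two outputs shows which extensions the second binary assumes. Any machine that runs it must support all of them.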

Also, I found that -O2 -DNDEBUG gives us the same speed as -O3 -DNDEBUG, but smaller executables. I think we should go with -O2 -DNDEBUG for now and revisit the compiler options before our first release (see #24).

tnagler commented on June 3, 2024

Since I think we're happy with boost now, I'll close this issue.
