I appreciate the speed this library brings: for basic arithmetic it is very close to Boost's multiprecision library.
I am trying to speed up uint256_t and uint512_t arithmetic in any possible way. I see that uint256_t and uint512_t are defined using 32-bit limbs. Is there anything prohibiting the use of 64-bit limbs instead?
I get a bunch of compile errors when I manually set the limb type to uint64_t. Most of them appear to be failed template checks (static assertions), along with shift-count overflow warnings, but I don't know the code well enough to tell whether they can be worked around:
include/math/wide_integer/uintwide_t.h: In instantiation of ‘struct math::wide_integer::detail::uint_type_helper<128, void>’:
include/math/wide_integer/uintwide_t.h:637:125: required from ‘class math::wide_integer::uintwide_t<256, long unsigned int>’
test_main.cpp:13:39: required from here
include/math/wide_integer/uintwide_t.h:527:5: error: static assertion failed: Error: uint_type_helper is not intended to be used for this BitCount
static_assert(( ((BitCount >= 8U) && (BitCount <= 64U))
^~~~~~~~~~~~~
include/math/wide_integer/uintwide_t.h: In instantiation of ‘class math::wide_integer::uintwide_t<256, long unsigned int>’:
test_main.cpp:13:39: required from here
include/math/wide_integer/uintwide_t.h:645:5: error: static assertion failed: Error: Please check the characteristics of the template parameters ST and LT
static_assert(( (std::numeric_limits<limb_type>::is_integer == true)
^~~~~~~~~~~~~
include/math/wide_integer/uintwide_t.h: In instantiation of ‘class math::wide_integer::uintwide_t<512, long unsigned int, void>’:
test_main.cpp:15:56: recursively required by substitution of ‘template<class UnknownUnsignedWideIntegralType, class> math::wide_integer::uintwide_t<256, long unsigned int>::operator math::wide_integer::uintwide_t<256, long unsigned int>::double_width_type<UnknownUnsignedWideIntegralType, <template-parameter-1-2> >() const [with UnknownUnsignedWideIntegralType = <missing>; <template-parameter-1-2> = <missing>]’
test_main.cpp:15:56: required from here
include/math/wide_integer/uintwide_t.h:645:5: error: static assertion failed: Error: Please check the characteristics of the template parameters ST and LT
include/math/wide_integer/uintwide_t.h: In instantiation of ‘void math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::eval_divide_knuth(const math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>&, math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>*) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void]’:
include/math/wide_integer/uintwide_t.h:984:26: required from ‘math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>& math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::operator/=(const math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>&) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void]’
test_main.cpp:16:58: required from here
include/math/wide_integer/uintwide_t.h:2029:80: warning: left shift count >= width of type [-Wshift-count-overflow]
limb_type(double_limb_type( double_limb_type(double_limb_type(1U) << std::numeric_limits<limb_type>::digits)
~~~~~~~~~~~~~~~~~~~~~^~~~~~
include/math/wide_integer/uintwide_t.h:2075:76: warning: left shift count >= width of type [-Wshift-count-overflow]
const double_limb_type u_j_j1 = (double_limb_type(uu[uj]) << std::numeric_limits<limb_type>::digits) + uu[uj - 1U];
~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/math/wide_integer/uintwide_t.h:2089:61: warning: left shift count >= width of type [-Wshift-count-overflow]
<= double_limb_type(double_limb_type(t << std::numeric_limits<limb_type>::digits) + uu[uj - 2U])))
~~^~~~~~
include/math/wide_integer/uintwide_t.h:2145:84: warning: left shift count >= width of type [-Wshift-count-overflow]
+ double_limb_type(double_limb_type(previous_u) << std::numeric_limits<limb_type>::digits));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
include/math/wide_integer/uintwide_t.h: In instantiation of ‘void math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::eval_divide_by_single_limb(math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::limb_type, uint_fast32_t, math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>*) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void; math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::limb_type = long unsigned int; uint_fast32_t = long unsigned int]’:
include/math/wide_integer/uintwide_t.h:2021:37: required from ‘void math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::eval_divide_knuth(const math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>&, math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>*) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void]’
include/math/wide_integer/uintwide_t.h:984:26: required from ‘math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>& math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::operator/=(const math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>&) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void]’
test_main.cpp:16:58: required from here
include/math/wide_integer/uintwide_t.h:1432:97: warning: left shift count >= width of type [-Wshift-count-overflow]
- double_limb_type(double_limb_type(short_denominator) * hi_part)) << std::numeric_limits<limb_type>::digits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/math/wide_integer/uintwide_t.h:1442:141: warning: left shift count >= width of type [-Wshift-count-overflow]
- double_limb_type(double_limb_type(short_denominator) * hi_part)) << std::numeric_limits<limb_type>::digits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/math/wide_integer/uintwide_t.h:1444:47: warning: right shift count >= width of type [-Wshift-count-overflow]
*remainder = limb_type(long_numerator >> std::numeric_limits<limb_type>::digits);
~~~~~~~~~~~~~~~^~~~~~
include/math/wide_integer/uintwide_t.h: In instantiation of ‘ST math::wide_integer::detail::make_hi(const LT&) [with ST = long unsigned int; LT = long unsigned int]’:
include/math/wide_integer/uintwide_t.h:2087:48: required from ‘void math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::eval_divide_knuth(const math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>&, math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>*) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void]’
include/math/wide_integer/uintwide_t.h:984:26: required from ‘math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>& math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::operator/=(const math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>&) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void]’
test_main.cpp:16:58: required from here
include/math/wide_integer/uintwide_t.h:591:5: error: static assertion failed: Error: Please check the characteristics of the template parameters ST and LT
static_assert(( (std::numeric_limits<local_ushort_type>::is_integer == true)
^~~~~~~~~~~~~
include/math/wide_integer/uintwide_t.h:598:45: warning: right shift count >= width of type [-Wshift-count-overflow]
return static_cast<local_ushort_type>(u >> std::numeric_limits<local_ushort_type>::digits);
~~^~~~~~
include/math/wide_integer/uintwide_t.h: In instantiation of ‘ST math::wide_integer::detail::make_lo(const LT&) [with ST = long unsigned int; LT = long unsigned int]’:
include/math/wide_integer/uintwide_t.h:1617:60: required from ‘static void math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::eval_multiply_n_by_n_to_lo_part(math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::limb_type*, const limb_type*, const limb_type*, uint_fast32_t) [with long unsigned int RePhraseDigits2 = 256; const typename std::enable_if<((std::numeric_limits<LT>::digits * 4) == RePhraseDigits2)>::type* <anonymous> = 0; long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void; math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::limb_type = long unsigned int; uint_fast32_t = long unsigned int]’
include/math/wide_integer/uintwide_t.h:1495:38: required from ‘static void math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::eval_mul_unary(math::wide_integer::uintwide_t<OtherDigits2, LimbType, AllocatorType>&, const math::wide_integer::uintwide_t<OtherDigits2, LimbType, AllocatorType>&, typename std::enable_if<((OtherDigits2 / std::numeric_limits<LT>::digits) < math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::number_of_limbs_karatsuba_threshold)>::type*) [with long unsigned int OtherDigits2 = 256; long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void; typename std::enable_if<((OtherDigits2 / std::numeric_limits<LT>::digits) < math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::number_of_limbs_karatsuba_threshold)>::type = void]’
include/math/wide_integer/uintwide_t.h:946:23: required from ‘math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>& math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>::operator*=(const math::wide_integer::uintwide_t<Digits2, LimbType, AllocatorType>&) [with long unsigned int Digits2 = 256; LimbType = long unsigned int; AllocatorType = void]’
test_main.cpp:15:56: required from here
include/math/wide_integer/uintwide_t.h:569:5: error: static assertion failed: Error: Please check the characteristics of the template parameters ST and LT
static_assert(( (std::numeric_limits<local_ushort_type>::is_integer == true)
^~~~~~~~~~~~~
The reason I ask is that one of the goals of the library is to run on embedded systems, many of which I presume are 32-bit. I am on a 64-bit system and would like to know whether full 64-bit limbs could yield speed gains. It looks like Boost uses 32-bit limbs, too, so maybe there's a constraint I don't understand.
Other speedup tips would be very much appreciated, too.
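
To make the question concrete: as far as I can tell from the first static assertion, a 64-bit limb needs a 128-bit double_limb_type, and uint_type_helper only covers 8 to 64 bits. Below is a minimal sketch of the kind of building block I have in mind, assuming GCC or Clang on a 64-bit target where the non-standard `unsigned __int128` extension is available. The names `wide_product` and `mul_limbs` are hypothetical and not part of the library; the aliases only mirror the names that appear in the errors above.

```cpp
#include <cstdint>

// Assumption: GCC or Clang on a 64-bit target. unsigned __int128 is a
// compiler extension, not standard C++, so this is not portable.
using limb_type        = std::uint64_t;
using double_limb_type = unsigned __int128;

// Hypothetical helper type (not from the library): both halves of a
// 64 x 64 -> 128 bit product.
struct wide_product
{
  limb_type lo;
  limb_type hi;
};

// Hypothetical helper (not from the library): multiply two 64-bit limbs
// and split the 128-bit result into low and high limbs.
inline wide_product mul_limbs(limb_type a, limb_type b)
{
  // Widen one operand first so the multiplication is done in 128 bits.
  const double_limb_type p = static_cast<double_limb_type>(a) * b;

  // Shifting by 64 is well-defined here because p is 128 bits wide.
  // With a 64-bit double_limb_type the same shift would trigger the
  // -Wshift-count-overflow warnings shown in the log.
  return { static_cast<limb_type>(p), static_cast<limb_type>(p >> 64) };
}
```

For example, `mul_limbs(UINT64_MAX, UINT64_MAX)` should give `hi == 0xFFFFFFFFFFFFFFFE` and `lo == 1`, since (2^64 - 1)^2 = 2^128 - 2^65 + 1. Whether something like this can be wired into uint_type_helper without breaking the other checks is exactly what I can't judge.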