Is this product range of int8*int8 in comment document expected? (google/gemmlowp)

zhenhuaw-me commented on May 17, 2024

I looked into the tflite/gemmlowp stack. For the quantized conv at
https://github.com/tensorflow/tensorflow/blob/c865ec5621c013a7f8a4a26d380782e63117224f/tensorflow/lite/kernels/internal/optimized/optimized_ops.h#L2082-L2085
which loads lhs (filter, value range $(0, 255]$) and rhs (input, value range $[0, 255]$), the int8*int8 + int8*int8 sum fits in int16.

However, I am not sure how the input uint8 data

gemmlowp/internal/kernel_neon.h

Lines 942 to 945 in 58825b1

    
    void Run(std::int32_t* dst_ptr, std::size_t dst_row_stride,
             std::size_t dst_col_stride, const std::uint8_t* lhs_ptr,
             const std::uint8_t* rhs_ptr, std::size_t start_depth,
             std::size_t run_depth) const override {

which is loaded by

gemmlowp/internal/kernel_neon.h

Line 958 in 58825b1

"ld1 {v4.16b}, [%[lhs_ptr]], #16\n"

can be computed by signed instructions like

gemmlowp/internal/kernel_neon.h

Lines 988 to 989 in 58825b1

    
    "smull    v8.8h,  v0.8b,  v4.8b\n"
    "smull    v9.8h,  v1.8b,  v4.8b\n"

Would you please give a hint?


bjacob commented on May 17, 2024

You are right that the comment at kernel_neon.h:708 is incorrect. It fails to mention that in order to avoid overflow in int16 := int8*int8 + int8*int8, it is necessary to require the int8 values to avoid the value -128.
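To make the overflow condition concrete, here is a minimal sketch that brute-forces the range of int8*int8 + int8*int8 for a few operand ranges. The helper names `PairSumFitsInt16` and `CheckRanges` are made up for this illustration and are not part of gemmlowp. It also checks one case beyond what is stated above: arithmetically, keeping -128 out of just one of the two operands (as the filter range $(0, 255]$ mentioned earlier would do after the shift to signed) already appears to be enough.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper, not a gemmlowp function.
// Returns true if a*b + c*d fits in int16 for every a, c in [lhs_min, 127]
// and every b, d in [rhs_min, 127]. The two products vary independently,
// so the extreme sums are twice the extreme single products.
inline bool PairSumFitsInt16(int lhs_min, int rhs_min) {
  int min_prod = lhs_min * rhs_min;
  int max_prod = min_prod;
  for (int a = lhs_min; a <= 127; ++a) {
    for (int b = rhs_min; b <= 127; ++b) {
      const int p = a * b;
      if (p < min_prod) min_prod = p;
      if (p > max_prod) max_prod = p;
    }
  }
  return 2 * min_prod >= std::numeric_limits<std::int16_t>::min() &&
         2 * max_prod <= std::numeric_limits<std::int16_t>::max();
}

// Hypothetical sanity checks for the three cases discussed in the text.
inline void CheckRanges() {
  // Full int8 range: (-128)*(-128) + (-128)*(-128) = 32768 > 32767.
  assert(!PairSumFitsInt16(-128, -128));
  // -128 excluded on both sides: at most 2 * 127 * 127 = 32258.
  assert(PairSumFitsInt16(-127, -127));
  // -128 excluded on one side only: at most 2 * 127 * 128 = 32512.
  assert(PairSumFitsInt16(-127, -128));
}
```

Calling `CheckRanges()` from any `main` exercises all three cases; only the full $[-128, 127]$ range on both operands overflows.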

As you found, this has been amended in the paper and in the way that TFLite uses this. There is also a signedness discrepancy: the 8-bit buffers in TFLite and at the API surface of gemmlowp are all unsigned uint8, while the kernels inside gemmlowp work entirely on signed int8. The switch from unsigned to signed is implemented in the 'packing' phase of gemmlowp.

For the pack/compute/unpack phases of gemmlowp, refer to this doc:
https://github.com/google/gemmlowp/blob/master/doc/design.md
The portable (not NEON) implementation of the packing phase is this file:
https://github.com/google/gemmlowp/blob/master/internal/pack.h
Inside it, here is where the unsigned->signed conversion occurs:

gemmlowp/internal/pack.h

Lines 272 to 273 in 58825b1

    
    const std::int16_t kernel_val_unwrapped =
        src_val - kZeroPointInputValue;
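As a rough illustration of what that subtraction does to the value ranges, here is a small standalone sketch. It assumes the zero point is 128 for the uint8 to int8 case; `kAssumedZeroPoint`, `ToSignedKernelValue` and `CheckAllInputs` are names invented for this example and do not appear in pack.h.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: the zero point used when uint8 inputs feed
// an int8 kernel is 128. This constant stands in for kZeroPointInputValue.
constexpr int kAssumedZeroPoint = 128;

// Hypothetical helper mirroring the shape of `src_val - kZeroPointInputValue`:
// widen to int16 first, subtract, then narrow to the signed kernel type.
constexpr std::int8_t ToSignedKernelValue(std::uint8_t src_val) {
  return static_cast<std::int8_t>(static_cast<std::int16_t>(src_val) -
                                  kAssumedZeroPoint);
}

// Input range [0, 255] maps onto [-128, 127].
static_assert(ToSignedKernelValue(0) == -128, "0 maps to -128");
static_assert(ToSignedKernelValue(255) == 127, "255 maps to 127");
// A filter restricted to (0, 255] starts at 1, which maps to -127,
// so the value -128 never reaches the kernel on that side.
static_assert(ToSignedKernelValue(1) == -127, "1 maps to -127");

// Runtime check over every uint8 input: the shift is exact for all values.
inline void CheckAllInputs() {
  for (int v = 0; v <= 255; ++v) {
    const int shifted = ToSignedKernelValue(static_cast<std::uint8_t>(v));
    assert(shifted == v - kAssumedZeroPoint);
  }
}
```

Under that assumption, the signed `smull` instructions in the kernel operate on the shifted values, not on the raw uint8 bytes from the caller's buffers.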


zhenhuaw-me commented on May 17, 2024

Thank you @bjacob for the detailed knowledge sharing! That's really helpful! I didn't notice that there is a packing phase in gemmlowp; I should have read the docs more carefully.

