blog's Introduction

Hi 👋, I'm lartpang

🧑‍🤝‍🧑 Me

$$ \textbf{life} = \int_{birth}^{now} \left[ \mathbf{happy}(time) + \mathbf{sad}(time) \right] \, d(time) $$

A Python and PyTorch developer, deep-learning worker and open-source activist.

Created by Bing Image Creator 😊

📝 Recent Writing

View the archives @ csdn@p_lart.

📽️ Some Projects

Name | Description
Hands-on-Docker (Chinese) | A detailed guide to using Docker.
Awesome-Class-Activation-Map | An awesome list of papers and tools about the class activation map (CAM) technology.
PyTorchTricks | Some tricks of pytorch…
MethodsCmp | A simple toolkit for counting the FLOPs/MACs, parameters and FPS of PyTorch-based methods.
PySODEvalToolkit | A Python-based salient object detection and video object segmentation evaluation toolbox.
PySODMetrics | A simple and efficient implementation of SOD metrics.
PyLoss | Some loss functions for deep learning.
OpticalFlowBasedVOS | A simple and efficient codebase for optical-flow-based video object segmentation.
CoSaliencyProj | A project for co-saliency detection. Some code is borrowed from ICNet. Thanks to ICNet: Intra-saliency Correlation Network for Co-Saliency Detection (NIPS 2020).
RunIt | A simple program scheduler for your code on different devices.
RegisterIt | Register it: a more flexible register for deep learning projects.
mssim.pytorch | A better PyTorch-based implementation of the mean structural similarity. Differentiable, simpler SSIM and MS-SSIM.
tta.pytorch | Test-time augmentation library for PyTorch.
YuQueTools | A simple tool to download your own articles from Yuque.
ManageMyAttachments | Manage the attachments of your own Obsidian vault.

blog's Issues

Snippets of OpenVINO-CPP for Model Inference

Header File

#include <openvino/openvino.hpp>

Create Infer Request

// Take the pointer by reference so the caller sees the model returned by build().
void preprocessing(std::shared_ptr<ov::Model>& model) {
  ov::preprocess::PrePostProcessor ppp(model);
  ppp.input().tensor().set_layout("NHWC"); // input data is NHWC from OpenCV Mat
  ppp.input().model().set_layout("NCHW"); // In the model, the layout is NCHW
  model = ppp.build();
}

ov::Core core;

auto model = core.read_model(model_path); // can use an ONNX file or OpenVINO's xml file
preprocessing(model);

auto compiled_model = core.compile_model(model, "CPU");  // Or without `"CPU"`
auto input_port = compiled_model.input();
auto infer_request = compiled_model.create_infer_request();

Input and Output

  • single input
infer_request.set_input_tensor(blob);
infer_request.infer();
  • single output
ov::Tensor single_output = infer_request.get_output_tensor(0);
  • multiple outputs
ov::Tensor multi_outputs0 = infer_request.get_output_tensor(0);
ov::Tensor multi_outputs1 = infer_request.get_output_tensor(1);

OpenCV cv::Mat <-> OpenVINO ov::Tensor

The key to these steps is the alignment of the data layout.

cv::Mat -> ov::Tensor

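// Convert the uint8 3-channel image mat to a float32 mat scaled to [0, 1].
image.convertTo(image, CV_32FC3, 1.0 / 255);
// NHWC layout as mentioned above (N=1, C=3). The tensor wraps image.data without copying,
// so image must be continuous and must stay alive until inference has finished.
ov::Tensor blob(input_port.get_element_type(), input_port.get_shape(), reinterpret_cast<float*>(image.data));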

ov::Tensor -> cv::Mat

// The tensor follows the NCHW layout, so tensor_shape is (N, C, H, W).
ov::Shape tensor_shape = tensor.get_shape();
// With N=1 and C=1 the data is a single H x W plane, so a mat can wrap it directly.
// The mat shares the tensor's memory; call mat.clone() if it must outlive the tensor.
cv::Mat mat(tensor_shape[2], tensor_shape[3], CV_32F, tensor.data());
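
To make the layout alignment concrete, here is a minimal sketch in plain Python (not OpenVINO or OpenCV code) of how the same element is addressed in an NHWC buffer and in an NCHW buffer. The function names and the toy pixel values are made up for illustration.

```python
def nhwc_offset(n, h, w, c, shape):
    """Flat index of element (n, h, w, c) in an NHWC buffer; shape is (N, H, W, C)."""
    _, H, W, C = shape
    return ((n * H + h) * W + w) * C + c


def nchw_offset(n, c, h, w, shape):
    """Flat index of element (n, c, h, w) in an NCHW buffer; shape is (N, C, H, W)."""
    _, C, H, W = shape
    return ((n * C + c) * H + h) * W + w


def nhwc_to_nchw(flat, shape):
    """Reorder a flat NHWC list into NCHW order; shape is (N, H, W, C)."""
    N, H, W, C = shape
    out = [0] * len(flat)
    for n in range(N):
        for h in range(H):
            for w in range(W):
                for c in range(C):
                    src = nhwc_offset(n, h, w, c, shape)
                    dst = nchw_offset(n, c, h, w, (N, C, H, W))
                    out[dst] = flat[src]
    return out


# A 1x2x2x3 toy image: each pixel stores its three channels next to each other,
# which is how a 3-channel cv::Mat is laid out.
pixels = [10, 20, 30, 11, 21, 31, 12, 22, 32, 13, 23, 33]
planar = nhwc_to_nchw(pixels, (1, 2, 2, 3))
print(planar)  # [10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33]
```

In the interleaved (NHWC) buffer the channels of one pixel are adjacent, while in the planar (NCHW) buffer each channel forms its own contiguous H x W plane. This is why a single-channel NCHW output can be wrapped by a mat directly, whereas a multi-channel one would need reordering first.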

[Linux] File Packaging and Compression

.tar

# Archive only; no compression
tar -xvf FileName.tar         # Unpack
tar -cvf FileName.tar DirName # Pack DirName and all files (and folders) under it

.gz

gunzip FileName.gz  # Decompress (option 1)
gzip -d FileName.gz # Decompress (option 2)
gzip FileName       # Compress; works on files only, not directories

.tar.gz/.tgz

tar -zxvf FileName.tar.gz               # Decompress
tar -zcvf FileName.tar.gz DirName       # Compress DirName and all files (and folders) under it
tar -C DesDirName -zxvf FileName.tar.gz # Decompress into the target path
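
When the same packing has to happen inside a script, Python's standard tarfile module can do roughly what the commands above do. This is a minimal sketch; the helper names pack_tar_gz and unpack_tar_gz are made up here, and the file names simply mirror the placeholders used above.

```python
import os
import tarfile
import tempfile


def pack_tar_gz(src_dir, archive_path):
    """Rough equivalent of `tar -zcvf archive_path src_dir`."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the directory name instead of the full path.
        tar.add(src_dir, arcname=os.path.basename(src_dir))


def unpack_tar_gz(archive_path, dest_dir):
    """Rough equivalent of `tar -C dest_dir -zxvf archive_path`."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)


with tempfile.TemporaryDirectory() as tmp:
    # Build a small DirName/sub/a.txt tree to pack.
    src = os.path.join(tmp, "DirName")
    os.makedirs(os.path.join(src, "sub"))
    with open(os.path.join(src, "sub", "a.txt"), "w") as f:
        f.write("hello")

    archive = os.path.join(tmp, "FileName.tar.gz")
    pack_tar_gz(src, archive)

    dest = os.path.join(tmp, "DesDirName")
    os.makedirs(dest)
    unpack_tar_gz(archive, dest)
    with open(os.path.join(dest, "DirName", "sub", "a.txt")) as f:
        restored = f.read()

print(restored)  # hello
```

Only extract archives you trust: extractall writes whatever paths the archive contains.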

.zip

unzip FileName.zip          # Decompress
zip FileName.zip DirName    # Adds only the DirName entry itself, not its contents
zip -r FileName.zip DirName # Compress recursively: all files and subdirectories under DirName are included
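
The difference between the last two commands is the recursion. A rough scripted counterpart of the recursive form, using Python's standard zipfile module, could look like the sketch below. The helper name zip_dir_recursive is made up, and unlike zip -r this sketch stores only files, not separate directory entries.

```python
import os
import tempfile
import zipfile


def zip_dir_recursive(src_dir, archive_path):
    """Rough equivalent of `zip -r archive_path src_dir`, run from src_dir's parent."""
    parent = os.path.dirname(os.path.abspath(src_dir))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                path = os.path.join(root, name)
                # Store paths relative to the parent so entries start with DirName/.
                zf.write(path, arcname=os.path.relpath(path, parent))


with tempfile.TemporaryDirectory() as tmp:
    # Build DirName/a.txt and DirName/sub/b.txt.
    src = os.path.join(tmp, "DirName")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(src, rel), "w") as f:
            f.write(rel)

    archive = os.path.join(tmp, "FileName.zip")
    zip_dir_recursive(src, archive)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(names)  # ['DirName/a.txt', 'DirName/sub/b.txt']
```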

.rar

rar x FileName.rar         # Decompress
rar a FileName.rar DirName # Compress

[Deep Learning] Using CAM/Grad-CAM/Grad-CAM++ to understand CNN

CAM: Class Activation Maps

import numpy as np
import keras
from scipy.ndimage import zoom

def generate_cam(input_model, image, layer_name='block5_conv3', H=224, W=224):
    cls = np.argmax(input_model.predict(image)) # Predicted class (not used below)
    conv_output = input_model.get_layer(layer_name).output # Output tensor of the chosen conv layer

    last_conv_layer_model = keras.Model(input_model.inputs, conv_output) # Sub-model that ends at that layer
    # Note: the original CAM uses the weights of the dense layer that follows global average pooling.
    # This snippet instead averages the conv kernel of `layer_name` to get one weight per channel.
    class_weights = input_model.get_layer(layer_name).get_weights()[0] # Kernel, shape (kh, kw, in, out)
    class_weights = class_weights[0,:,:,:]
    class_weights = np.mean(class_weights, axis=(0, 1)) # Shape (out,)

    last_conv_output = last_conv_layer_model.predict(image) # Feature maps, shape (1, h, w, out)
    last_conv_output = last_conv_output[0, :]
    cam = np.dot(last_conv_output, class_weights) # Weighted sum over channels, shape (h, w)

    cam = zoom(cam, H/cam.shape[0]) # Interpolate to the image size (same factor on both axes)
    cam = cam / np.max(cam) # Normalize the CAM
    return cam

Grad-CAM

import numpy as np
from keras import backend as K
from scipy.ndimage import zoom

# K.gradients needs graph mode (TF1-style Keras); it does not work under eager execution.
def grad_cam(input_model, image, layer_name='block5_conv3', H=224, W=224):
    cls = np.argmax(input_model.predict(image)) # Predicted class
    y_c = input_model.output[0, cls] # Score of the predicted class
    conv_output = input_model.get_layer(layer_name).output # Output tensor of the chosen conv layer
    grads = K.gradients(y_c, conv_output)[0] # Gradient of the class score w.r.t. the feature maps

    get_output = K.function([input_model.input], [conv_output, grads])
    output, grads_val = get_output([image]) # Feature maps and gradient values for this image
    output, grads_val = output[0, :], grads_val[0, :, :, :]

    weights = np.mean(grads_val, axis=(0, 1)) # Spatially averaged gradients act as the channel weights
    cam = np.dot(output, weights) # Weighted sum of the feature maps

    cam = np.maximum(cam, 0) # ReLU
    cam = zoom(cam, H/cam.shape[0]) # Interpolate to the image size (same factor on both axes)
    cam = cam / cam.max() # Normalize the Grad-CAM map
    return cam
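
To see the arithmetic of Grad-CAM without a network, here is a minimal sketch in plain Python that takes already-computed feature maps and gradients as nested lists. The function name and the toy numbers are made up; only the weighting, ReLU and normalization steps correspond to the snippet above.

```python
def grad_cam_from_arrays(feature_maps, gradients):
    """feature_maps and gradients are nested lists shaped [H][W][K]."""
    H, W, K = len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0])

    # Channel weights: spatial mean of the gradients (global average pooling).
    weights = [
        sum(gradients[i][j][k] for i in range(H) for j in range(W)) / (H * W)
        for k in range(K)
    ]

    # Weighted sum over channels at every location, followed by ReLU.
    cam = [
        [max(sum(weights[k] * feature_maps[i][j][k] for k in range(K)), 0.0)
         for j in range(W)]
        for i in range(H)
    ]

    # Scale to [0, 1]; skip the division when the map is all zeros.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return weights, cam


# Toy 2x2 feature maps with K=2 channels (made-up numbers).
A = [[[1.0, 0.0], [2.0, 1.0]],
     [[0.0, 3.0], [4.0, 0.0]]]
G = [[[1.0, -1.0], [1.0, -1.0]],
     [[1.0, -1.0], [1.0, -1.0]]]
weights, cam = grad_cam_from_arrays(A, G)
print(weights)  # [1.0, -1.0]
print(cam)      # [[0.25, 0.25], [0.0, 1.0]]
```

The location where the negatively weighted channel dominates is clipped to zero by the ReLU, so only evidence in favour of the class survives in the map.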

Grad-CAM++

import numpy as np
from keras import backend as K
from scipy.ndimage import zoom

# K.gradients needs graph mode (TF1-style Keras); it does not work under eager execution.
def grad_cam_plus(input_model, image, layer_name='block5_conv3', H=224, W=224):
    cls = np.argmax(input_model.predict(image)) # Predicted class
    y_c = input_model.output[0, cls] # Score of the predicted class
    conv_output = input_model.get_layer(layer_name).output # Output tensor of the chosen conv layer
    grads = K.gradients(y_c, conv_output)[0] # Gradient of the class score w.r.t. the feature maps

    # First-, second- and third-order terms built from the gradients
    first = K.exp(y_c)*grads
    second = K.exp(y_c)*grads*grads
    third = K.exp(y_c)*grads*grads*grads

    # Evaluate all tensors for the given image
    get_output = K.function([input_model.input], [y_c, first, second, third, conv_output, grads])
    y_c, conv_first_grad, conv_second_grad, conv_third_grad, conv_output, grads_val = get_output([image])
    global_sum = np.sum(conv_output[0].reshape((-1, conv_first_grad[0].shape[2])), axis=0)

    # Alpha values for each spatial location
    alpha_num = conv_second_grad[0]
    alpha_denom = conv_second_grad[0]*2.0 + conv_third_grad[0]*global_sum.reshape((1, 1, conv_first_grad[0].shape[2]))
    alpha_denom = np.where(alpha_denom != 0.0, alpha_denom, np.ones(alpha_denom.shape))
    alphas = alpha_num/alpha_denom

    # Positive gradients are the weights; the alphas scale them per location
    weights = np.maximum(conv_first_grad[0], 0.0)
    alpha_normalization_constant = np.sum(np.sum(alphas, axis=0), axis=0)
    # Guard against division by zero for channels whose alphas sum to zero
    alpha_normalization_constant = np.where(alpha_normalization_constant != 0.0, alpha_normalization_constant, 1.0)
    alphas /= alpha_normalization_constant.reshape((1, 1, conv_first_grad[0].shape[2])) # Normalize the alphas

    # Sum the alpha-scaled weights over all locations to get one weight per channel
    deep_linearization_weights = np.sum((weights*alphas).reshape((-1, conv_first_grad[0].shape[2])), axis=0)

    grad_CAM_map = np.sum(deep_linearization_weights*conv_output[0], axis=2) # Grad-CAM++ map
    cam = np.maximum(grad_CAM_map, 0) # ReLU
    cam = zoom(cam, H/cam.shape[0]) # Interpolate to the image size (same factor on both axes)
    cam = cam / np.max(cam) # Normalize the Grad-CAM++ map
    return cam

Build OpenCV and OpenVINO for Windows 10 with VS 2022.

In this guide, I build two powerful open-source libraries, OpenCV and OpenVINO, to run my deep learning model on Windows 10.
Interestingly, both libraries are closely associated with Intel 🖥️.

OpenCV 😮

First of all, download the source code of the two related projects: opencv, and opencv_contrib, which contains extra modules for OpenCV.

Make sure the selected versions of the two libraries are the same.
Here, I choose 4.7.0, the latest version at the time of writing.
Because we will compile them ourselves, downloading the source code zip files is enough.
Put the two unpacked libraries into the same parent folder opencv_dir as follows:

-opencv_dir
  -opencv-4.7.0
    -...
  -opencv_contrib-4.7.0
    -modules
    -...

NOTE: To avoid network issues that CMake may run into while downloading dependencies, add the URL proxy prefix https://ghproxy.com/ in front of the download URLs in the settings of the relevant modules, e.g. https://ghproxy.com/https://raw.github***. The files to edit are:

  • .cmake in opencv-4.7.0/3rdparty/ippicv
  • .cmake in opencv-4.7.0/3rdparty/ffmpeg
  • CMakeLists.txt in opencv_contrib-4.7.0/modules/face
  • Files in cmake of opencv_contrib-4.7.0/modules/xfeatures2d
  • CMakeLists.txt in opencv_contrib-4.7.0/modules/wechat_qrcode
  • CMakeLists.txt in opencv_contrib-4.7.0/modules/cudaoptflow
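
Editing these files by hand is tedious, so the prefixing could be scripted. The following is a minimal sketch that rewrites URLs inside a piece of text. It assumes the download URLs start with https://raw.githubusercontent.com/ (the note above elides the exact host), and the helper name add_proxy_prefix and the sample line are made up.

```python
import re

PROXY_PREFIX = "https://ghproxy.com/"  # the prefix named in the note above

# Assumption: the URLs to rewrite start with this host. The lookbehind skips
# URLs that already carry the proxy prefix, so the rewrite is idempotent.
RAW_URL = re.compile(r"(?<!ghproxy\.com/)https://raw\.githubusercontent\.com/")


def add_proxy_prefix(text):
    """Put the proxy prefix in front of every matching URL in text."""
    return RAW_URL.sub(lambda m: PROXY_PREFIX + m.group(0), text)


line = 'set(URL "https://raw.githubusercontent.com/opencv/some_file")'
patched = add_proxy_prefix(line)
print(patched)
# set(URL "https://ghproxy.com/https://raw.githubusercontent.com/opencv/some_file")
```

Applying the function to the content of each file listed above and writing the result back would cover all of them in one pass. Keep a backup of the originals, since the proxy is only needed on networks where the direct download fails.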

Next, start compiling OpenCV.

  1. Create the build folder: cd opencv_dir && mkdir opencv-build-vs2022
  2. Configure and generate the VS solution with CMake, using these config items:
    • General:
      • source folder: <opencv-4.7.0>
      • build folder: <opencv-build-vs2022>
      • BUILD_OPENCV_WORLD=ON
      • CMAKE_BUILD_TYPE=RELEASE
      • OPENCV_ENABLE_NONFREE=ON
      • BUILD_opencv_dnn=ON
      • OPENCV_EXTRA_MODULES_PATH=<opencv_contrib-4.7.0/modules>
    • CUDA:
      • WITH_CUDA=ON
      • WITH_CUDNN=ON
      • WITH_CUBLAS=ON
      • WITH_CUFFT=ON
      • CUDA_FAST_MATH=ON
      • CUDA_ARCH_BIN=7.5 (Setting only the value that matches your actual GPU shortens the compilation.)
      • OPENCV_DNN_CUDA=ON
  3. Go to the build directory: cd <opencv-build-vs2022>
  4. Build with CMake and the MSVC compiler: cmake --build . --config Release --verbose -j8
  5. Install the built OpenCV into the install folder under the current path: cmake --install . --prefix install
  6. Add the bin directory to the user PATH environment variable: <path>\install\x64\vc17\bin
  7. In VS:
    • add the <path>\install\include directory to "Solution Explorer -> right-click -> Properties -> VC++ Directories -> External Include Directories"
    • add the <path>\install\x64\vc17\lib directory to "Solution Explorer -> right-click -> Properties -> VC++ Directories -> Library Directories"
    • add opencv_world470.lib to "Solution Explorer -> right-click -> Properties -> Linker -> Input -> Additional Dependencies"
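
The CMake config items listed above can also be passed on the command line instead of being ticked in cmake-gui. The sketch below only assembles such a command as a list of arguments. The option names and values are the ones listed above, while the relative paths, the generator name "Visual Studio 17 2022" and the helper name cmake_configure_command are assumptions to adapt to your setup.

```python
def cmake_configure_command(source_dir, build_dir, options, generator=None):
    """Build the argument list for `cmake -S <source> -B <build> -D<KEY>=<VALUE> ...`."""
    cmd = ["cmake", "-S", source_dir, "-B", build_dir]
    if generator:
        cmd += ["-G", generator]
    cmd += [f"-D{key}={value}" for key, value in options.items()]
    return cmd


# Option names and values come from the list above; the paths are placeholders.
opencv_options = {
    "BUILD_OPENCV_WORLD": "ON",
    "CMAKE_BUILD_TYPE": "RELEASE",
    "OPENCV_ENABLE_NONFREE": "ON",
    "BUILD_opencv_dnn": "ON",
    "OPENCV_EXTRA_MODULES_PATH": "opencv_contrib-4.7.0/modules",
    "WITH_CUDA": "ON",
    "WITH_CUDNN": "ON",
    "WITH_CUBLAS": "ON",
    "WITH_CUFFT": "ON",
    "CUDA_FAST_MATH": "ON",
    "CUDA_ARCH_BIN": "7.5",
    "OPENCV_DNN_CUDA": "ON",
}
cmd = cmake_configure_command(
    "opencv-4.7.0",
    "opencv-build-vs2022",
    opencv_options,
    generator="Visual Studio 17 2022",  # assumed generator for VS 2022
)
print(" ".join(cmd))
```

Running it from opencv_dir with subprocess.run(cmd, check=True) would perform the configure step. Keeping the options in one dictionary also makes it easy to record exactly which configuration a given build used.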

OpenVINO 🍰

OpenVINO's documentation is intuitive and more readable than OpenCV's.
The official documentation covers how to build and install the library.

After building and installing the OpenCV library, it's time to move on to OpenVINO.

  1. Clone the project and its submodules:
    git clone https://github.com/openvinotoolkit/openvino.git
    cd openvino
    git submodule update --init --recursive
    
  2. Create the build folder: mkdir build && cd build
  3. Configure and generate the VS solution by CMake:
    • ENABLE_INTEL_GPU=OFF (We only use the Intel CPU.)
    • Disable some frontend items:
      • ENABLE_OV_PDPD_FRONTEND=OFF
      • ENABLE_OV_TF_FRONTEND=OFF
      • ENABLE_OV_TF_LITE_FRONTEND=OFF
      • ENABLE_OV_PYTORCH_FRONTEND=OFF
    • For Python:
      • ENABLE_PYTHON=ON (It seems that openvino-dev must already be installed in the detected Python environment; otherwise a warning is shown in the cmake-gui window.)
      • PYTHON_EXECUTABLE=<python.exe>
      • PYTHON_INCLUDE_DIR=<include directory>
      • PYTHON_LIBRARY=<pythonxx.lib in libs directory>
    • For OpenCV:
      • ENABLE_OPENCV=ON
      • OpenCV_DIR=<opencv-build-vs2022/install>
  4. Build the library: cmake --build . --config Release --verbose -j8
  5. Install the library into the install directory: cmake --install . --prefix install
  6. Add the bin directory into the environment:
    • <path>\install\runtime\bin\intel64\Release
    • <path>\install\runtime\3rdparty\tbb\bin
  7. In VS:
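    • add the <path>\install\runtime\include directory to "Solution Explorer -> right-click -> Properties -> VC++ Directories -> External Include Directories"
    • add the <path>\install\runtime\lib\intel64\Release directory to "Solution Explorer -> right-click -> Properties -> VC++ Directories -> Library Directories"
    • add 🌟 openvino.lib, 🌟 openvino_onnx_frontend.lib and openvino_c.lib to "Solution Explorer -> right-click -> Properties -> Linker -> Input -> Additional Dependencies"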

Set DLL path in IDE

  • VS: "right click on solution -> Properties -> Debugging -> Environment -> PATH=<path>\install\x64\vc17\bin;%PATH%"
  • Qt Creator: "Projects -> Build & Run -> Build/Run -> Environment -> Details -> Edit %PATH% -> Add <path>\install\x64\vc17\bin"

[OpenCV] Six methods of indexing pixels in Mat

.at<>()

// modify the pixel directly
for (int h = 0; h < image.rows; ++h) {
    for (int w = 0; w < image.cols; ++w) {
        image.at<Vec3b>(h, w)[0] = 255;
        image.at<Vec3b>(h, w)[1] = 0;
        image.at<Vec3b>(h, w)[2] = 0;
    }
}

// modify the pixel by the reference
for (int h = 0; h < image.rows; ++h) {
    for (int w = 0; w < image.cols; ++w) {
        Vec3b& bgr = image.at<Vec3b>(h, w);
        bgr.val[0] = 0;
        bgr.val[1] = 255;
        bgr.val[2] = 0;
    }
}

// the image has one channel
for (int h = 0; h < image.rows; ++h) {
    for (int w = 0; w < image.cols / 2; ++w) {
        image.at<uchar>(h, w) = 128;
    }
}

.ptr<>()

// use uchar type
for (int h = 0; h < image.rows; ++h) {
    for (int w = 0; w < image.cols / 2; ++w) {
        uchar* ptr = image.ptr<uchar>(h, w);
        ptr[0] = 255;
        ptr[1] = 0;
        ptr[2] = 0;
    }
}
// use cv::Vec3b type
for (int h = 0; h < image.rows; ++h) {
    for (int w = 0; w < image.cols / 2; ++w) {
        Vec3b* ptr = image.ptr<Vec3b>(h, w);
        ptr->val[0] = 0;
        ptr->val[1] = 255;
        ptr->val[2] = 0;
    }
}

// use the row pointer and the image has one channel
for (int h = 0; h < image.rows; ++h) {
    uchar* ptr = image.ptr(h);
    for (int w = 0; w < image.cols / 2; ++w) {
        ptr[w] = 128;
    }
}

// use the pixel pointer and the image has one channel
for (int h = 0; h < image.rows; ++h) {
    for (int w = 0; w < image.cols / 2; ++w) {
        uchar* ptr = image.ptr<uchar>(h, w);
        *ptr = 255;
    }
}

iterator

// the image has three channels
Mat_<Vec3b>::iterator it = image.begin<Vec3b>();
Mat_<Vec3b>::iterator itend = image.end<Vec3b>();
for (; it != itend; ++it) {
    (*it)[0] = 255;
    (*it)[1] = 0;
    (*it)[2] = 0;
}

// the image has one channel
Mat_<uchar>::iterator it1 = image.begin<uchar>();
Mat_<uchar>::iterator itend1 = image.end<uchar>();
for (; it1 != itend1; ++it1) {
    (*it1) = 128;
}

.data pointer

// 3 channels
for (int h = 0; h < image.rows; ++h) {
    // Restart at the beginning of each row; step[0] is the row size in bytes, padding included.
    uchar* data = image.data + h * image.step[0];
    for (int w = 0; w < image.cols / 2; ++w) {
        *data++ = 128;
        *data++ = 128;
        *data++ = 128;
    }
}

// 1 channel
for (int h = 0; h < image.rows; ++h) {
    uchar* data = image.data + h * image.step[0];
    for (int w = 0; w < image.cols / 2; ++w) {
        *data++ = 128;
    }
}

.row() and .col()

for (int i = 0; i < 100; ++i) {
    image.row(i).setTo(Scalar(0, 0, 0)); // modify the i th row data
    image.col(i).setTo(Scalar(0, 0, 0)); // modify the i th column data
}

when isContinuous() is true

Mat image = imread("...");
int nRows = image.rows;
int nCols = image.cols * image.channels();

if (image.isContinuous()) {
    nCols = nRows * nCols;
    nRows = 1;
}

for (int h = 0; h < nRows; ++h) {
    uchar* ptr = image.ptr<uchar>(h);
    for (int w = 0; w < nCols; ++w) {
        // ptr[w] = 128 ;
        *ptr++ = 128;
    }
}
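
The reason the continuous case can be handled as one long row is that no padding bytes sit between consecutive rows. Here is a minimal sketch in plain Python (not OpenCV code) that models a mat as a flat byte buffer with a row stride; the function name fill_rows and the toy sizes are made up.

```python
def fill_rows(buffer, rows, row_bytes, step, value):
    """Fill rows of row_bytes bytes each; consecutive rows start step bytes apart."""
    if step == row_bytes:
        # Continuous storage: no padding between rows, so treat it as one long row.
        row_bytes *= rows
        rows = 1
    for h in range(rows):
        start = h * step
        for w in range(row_bytes):
            buffer[start + w] = value
    return rows  # number of outer-loop iterations actually used


# 2 rows x 3 bytes, tightly packed (continuous).
packed = bytearray(6)
outer_packed = fill_rows(packed, rows=2, row_bytes=3, step=3, value=128)

# The same 2 x 3 image, but each row is padded to 4 bytes (not continuous).
padded = bytearray(8)
outer_padded = fill_rows(padded, rows=2, row_bytes=3, step=4, value=128)

print(outer_packed, list(packed))  # 1 [128, 128, 128, 128, 128, 128]
print(outer_padded, list(padded))  # 2 [128, 128, 128, 0, 128, 128, 128, 0]
```

In the padded case the padding bytes stay untouched, which is why the per-row pointer is still needed whenever isContinuous() is false, for example for a region of interest cut out of a larger mat.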

New Attempt

This is a new attempt.

Let me try writing some articles only in English, which is my second language in daily life and my first one in work and research.
