packtpublishing / hands-on-gpu-accelerated-computer-vision-with-opencv-and-cuda

Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt

License: MIT License

Cuda 40.92% C++ 42.61% Python 16.46%

hands-on-gpu-accelerated-computer-vision-with-opencv-and-cuda's Introduction

Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA

This is the code repository for Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA, published by Packt.

**Effective techniques for processing complex image data in real time using GPUs**

What is this book about?

Computer vision has been revolutionizing a wide range of industries, and OpenCV is the most widely chosen tool for computer vision with its ability to work in multiple programming languages. Nowadays, in computer vision, there is a need to process large images in real time, which is difficult for OpenCV to handle on its own. This is where CUDA comes into the picture, allowing OpenCV to leverage powerful NVIDIA GPUs. This book provides a detailed overview of integrating OpenCV with CUDA for practical applications.

This book covers the following exciting features:

Understand how to access GPU device properties and capabilities from CUDA programs
Learn how to accelerate searching and sorting algorithms
Detect shapes such as lines and circles in images
Explore object tracking and detection with algorithms
Process videos using different video analysis techniques on Jetson TX1
Access GPU device properties from PyCUDA programs
Understand how kernel execution works

If you feel this book is for you, get your copy today!

Instructions and Navigations

All of the code is organized into folders. For example, Chapter02.

The code will look like the following:

while (tid < N)
{
    d_c[tid] = d_a[tid] + d_b[tid];
    tid += blockDim.x * gridDim.x;
}
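
The loop above follows the grid-stride pattern: each thread handles one element and then jumps ahead by the total number of threads in the grid (blockDim.x * gridDim.x), so a fixed-size launch can cover any N. As a rough illustration only, the plain-Python sketch below emulates that index arithmetic on the CPU. The function name `grid_stride_add`, the launch sizes, and the starting index `thread_idx + block_idx * block_dim` are assumptions made for this example and are not taken from the repository.

```python
def grid_stride_add(a, b, grid_dim, block_dim):
    """Emulate on the CPU how a grid-stride loop assigns elements to threads."""
    n = len(a)
    c = [None] * n
    stride = block_dim * grid_dim            # blockDim.x * gridDim.x
    for block_idx in range(grid_dim):        # plays the role of blockIdx.x
        for thread_idx in range(block_dim):  # plays the role of threadIdx.x
            # Assumed starting index, as is usual for this pattern
            tid = thread_idx + block_idx * block_dim
            while tid < n:
                c[tid] = a[tid] + b[tid]
                tid += stride
    return c


a = list(range(10))
b = [10 * x for x in a]
# 2 blocks of 2 threads: only 4 emulated threads cover all 10 elements
result = grid_stride_add(a, b, grid_dim=2, block_dim=2)
print(result)
```

On a GPU the emulated threads would run in parallel; the two outer loops here only stand in for the launch configuration.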

Following is what you need for this book: This book is a go-to guide for you if you are a developer working with OpenCV and want to learn how to process more complex image data by exploiting GPU processing. A thorough understanding of computer vision concepts and programming languages such as C++ or Python is expected.

With the following software and hardware list you can run all code files present in the book (Chapter 1-12).

Software and Hardware List

Chapter	Software required	OS required
1-4	CUDA Toolkit X.X, Microsoft Visual Studio Community Edition, Nsight	Windows, Mac OS X, and Linux (Any)
5-8	OpenCV Library	Windows, Mac OS X, and Linux (Any)
10-12	Anaconda Python, PyCUDA	Windows, Mac OS X, and Linux (Any)

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Visit the following link to check out videos of the code being run: http://bit.ly/2PZOYcH

Get to Know the Author

Bhaumik Vaidya is an experienced computer vision engineer and mentor. He has worked extensively with the OpenCV library to solve computer vision problems. He is a university gold medalist in his master's degree and is now doing a PhD in the acceleration of computer vision algorithms built using OpenCV and deep learning libraries on GPUs. He has a background in teaching and has guided many projects in computer vision and VLSI (very-large-scale integration). He previously worked in the VLSI domain as an ASIC verification engineer, so he also has very good knowledge of hardware architectures. He has many research papers published in reputable journals to his credit. He, along with his PhD mentor, has also received an NVIDIA Jetson TX1 embedded development platform as a research grant from NVIDIA.

Suggestions and Feedback

Click here if you have any feedback or suggestions.

Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.

https://packt.link/free-ebook/9781789348293

hands-on-gpu-accelerated-computer-vision-with-opencv-and-cuda's Issues

Questions about 02_variable_addition_reference.cu

I have tested this code. When I remove blockIdx.x from the printf call, as in the following code:

__global__ void myfirstkernel(void) {
	//blockIdx.x gives the block number of current kernel
	//printf("Hello!!!I'm thread in block: %d\n", blockIdx.x);
	printf("Hello!!!I'm thread in block: %d\n");
}

int main(void) {
	//A kernel call with 16 blocks and 1 thread per block
	myfirstkernel << <16, 1 >> > ();
	//Function used for waiting for all kernels to finish
	cudaDeviceSynchronize();
	printf("All threads are finished!\n");
	return 0;
}

I do not get the kernel output that the book shows.

If I add blockIdx.x back, as in printf("Hello!!!I'm thread in block: %d\n", blockIdx.x), then I get the right results.
Why?
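
One likely explanation, offered as an assumption rather than a confirmed diagnosis: the modified format string still contains %d but no argument is passed for it. C-style printf leaves the result undefined in that case, so the output is not guaranteed to match the book. Python's % formatting checks the same mismatch at run time and makes it visible, as this small sketch shows (the helper name `try_format` is made up for the example).

```python
def try_format(fmt, args):
    """Return the formatted string, or the error text if arguments do not match."""
    try:
        return fmt % args
    except TypeError as exc:
        return "error: " + str(exc)


fmt = "Hello!!!I'm thread in block: %d"
print(try_format(fmt, (3,)))  # one argument for the one %d
print(try_format(fmt, ()))    # %d with nothing to print
```

In the kernel, the corresponding fix would be either to pass blockIdx.x for the %d or to remove the %d from the string.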

OpenCV CUDA detects many more Hough circles than the CPU version

Hello,

System information (version)
OpenCV 4.5.1, Windows 10 64-bit, Python 3.7, CUDA 10.2

Detailed description
I have found that the CUDA HoughCirclesDetector returns many more circles than the CPU HoughCircles with the same parameters.
Also, the minDist parameter seems to have no effect: however large minDist is, the detector still returns circles whose centers are closer together than minDist.

For the image "eight.tif", it returns an incredible 24 circles. The code is as follows:

import cv2
import numpy as np
gpuImg = cv2.cuda_GpuMat()
def cv_show(name, image):
    cv2.namedWindow(name, cv2.WINDOW_NORMAL)
    cv2.imshow(name, image)

    cv2.waitKey(0)
    cv2.destroyAllWindows()
def getGpuResize(src):
    global gpuImg
    basePixSize = 1280
    height = src.shape[0]
    width = src.shape[1]
    print('height:', height)
    print('width:', width)
    largeSize = max(height, width)
    resizeRate = basePixSize/largeSize
    gpuImg.upload(src)
    gpuDst = cv2.cuda.resize(gpuImg, (int(width*resizeRate),int(height*resizeRate)))
    return gpuDst

def getGpuCircles(gpuSrc,src):
    ksize = (5,5)
    gpuFilter = cv2.cuda.createGaussianFilter(srcType=cv2.CV_8UC1, dstType=cv2.CV_8UC1, ksize=ksize, sigma1=0, sigma2=0)
    gpuSrc = cv2.cuda_Filter.apply(gpuFilter, gpuSrc)
    detector = cv2.cuda.createHoughCirclesDetector(dp=1, minDist=100, cannyThreshold=122, votesThreshold=50,
                                                   minRadius=1,
                                                   maxRadius=308)  # 100/110/50/445/510 or 150/110/50/430/510/MINRADIUS260
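    cuCircles = detector.detect(gpuSrc)
    circles = cuCircles.download()
    print("circles0:", circles)
    dst = src  # returned unchanged if nothing is detected
    if circles is not None:  # at least one circle was detected
        print(circles.shape)
        # Round to integer pixel values (np.uint0 is deprecated in recent NumPy releases)
        circles = np.around(circles).astype(np.int32)
        print("circles1:", circles)
        for circle in circles[0, :]:
            dst = cv2.circle(src, (int(circle[0]), int(circle[1])), int(circle[2]), (255, 0, 255), 2)
    return dst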
if __name__ == '__main__':
    img = cv2.imread('./eight.tif', cv2.IMREAD_GRAYSCALE)
    cuImg = getGpuResize(img)
    img = cuImg.download()
    print(img.shape)
    # cv_show('img', img)
    dst = getGpuCircles(cuImg, img)
    cv_show('res', dst)

Appreciate it.
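
Until the cause is known, one possible workaround is to enforce the minimum distance on the host after downloading the results. The sketch below is not part of OpenCV: `filter_by_min_dist` is a hypothetical helper that keeps a circle only if its center is at least `min_dist` away from every circle already kept, and it assumes each circle is an (x, y, radius) triple. The sample detections are made up.

```python
import math


def filter_by_min_dist(circles, min_dist):
    """Greedily keep circles whose centers are at least min_dist apart."""
    kept = []
    for x, y, r in circles:
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky, _ in kept):
            kept.append((x, y, r))
    return kept


# Made-up detections: the second circle is only 5 px from the first
detections = [(100.0, 100.0, 40.0), (103.0, 104.0, 41.0), (300.0, 120.0, 38.0)]
print(filter_by_min_dist(detections, min_dist=100))
```

This sketch keeps whichever circle comes first, so the result depends on the order in which the detections are listed.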

OpenCV Error: Gpu API call (no kernel image is available for execution on the device) in call

OpenCV Error: Gpu API call (no kernel image is available for execution on the device) in call, file /home/idriver/data/opencv-3.2.0/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp, line 340
terminate called after throwing an instance of 'cv::Exception'
what(): /home/data/opencv-3.2.0/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp:340: error: (-217) no kernel image is available for execution on the device in function call

Aborted (core dumped)
After compiling Chapter5/07_image_addition.cpp, I executed 07_image_addition and the error occurred.

Using OpenCV CUDA library functions on TX1 is very slow (1 FPS)

I first tested the book's addition code with 50,000 elements on the TX1, and it worked just fine.

But when I tested the image addition code, I only got 1FPS.

book:

my test:

Does the TX1 need any NVIDIA drivers to be installed? I only installed the system and CUDA as follows:

Chapter 2 GPU vs CPU runtime

Chapter 2 of this book shows the GPU and CPU execution times for N = 10000000, measured using clock_t. But when I set N = 10000000, the code prints nothing; maybe the stack overflows. I want to know how you solved this, but the repo does not contain the corresponding code. Could you add the source code?
Thanks!
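
The stack-overflow guess is plausible if the host arrays are declared as local fixed-size arrays. As a back-of-the-envelope check, the sketch below assumes three int arrays (two inputs and one output) at 4 bytes per element, and typical default stack limits of about 1 MB on Windows and 8 MB on Linux. All of these figures are assumptions; the actual limits depend on the toolchain and system settings.

```python
N = 10_000_000
BYTES_PER_INT = 4   # assumed sizeof(int)
NUM_ARRAYS = 3      # assumed: two inputs and one output

total_bytes = N * BYTES_PER_INT * NUM_ARRAYS
total_mib = total_bytes / (1024 * 1024)
print(f"{total_mib:.1f} MiB needed for the arrays")

# Typical default stack sizes (assumptions, configurable per system)
for name, limit_mib in (("Windows", 1), ("Linux", 8)):
    print(name, "fits on stack:", total_mib <= limit_mib)
```

Under those assumptions the arrays need roughly 114 MiB, far beyond either limit, so allocating them on the heap (for example with malloc or new) or declaring them as global arrays would be the usual way around it.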

Memory corruption

Following your book, I tried to run the code for adding two images.
Compilation is successful, but every time I try to run it I get:
malloc(): memory corruption
Backtrace
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libc.so.6
/usr/lib/nvidia-384/tls/libnvidia-tls.so.384.130
Memory map

Core dump()

I am using a 6th-gen Intel i5 with an NVIDIA GeForce 820M.