packtpublishing / hands-on-gpu-accelerated-computer-vision-with-opencv-and-cuda

Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt

License: MIT License

Languages: CUDA 40.92%, C++ 42.61%, Python 16.46%

hands-on-gpu-accelerated-computer-vision-with-opencv-and-cuda's Introduction

Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA

This is the code repository for Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA, published by Packt.

Effective techniques for processing complex image data in real time using GPUs

What is this book about?

Computer vision has been revolutionizing a wide range of industries, and OpenCV is the most widely chosen tool for computer vision with its ability to work in multiple programming languages. Nowadays, in computer vision, there is a need to process large images in real time, which is difficult for OpenCV to handle on its own. This is where CUDA comes into the picture, allowing OpenCV to leverage powerful NVIDIA GPUs. This book provides a detailed overview of integrating OpenCV with CUDA for practical applications.

This book covers the following exciting features:

  • Understand how to access GPU device properties and capabilities from CUDA programs
  • Learn how to accelerate searching and sorting algorithms
  • Detect shapes such as lines and circles in images
  • Explore object tracking and detection with algorithms
  • Process videos using different video analysis techniques on Jetson TX1
  • Access GPU device properties from PyCUDA programs
  • Understand how kernel execution works

If you feel this book is for you, get your copy today!

    https://www.packtpub.com/

    Instructions and Navigations

    All of the code is organized into folders. For example, Chapter02.

    The code will look like the following:

    while (tid < N)
    {
        d_c[tid] = d_a[tid] + d_b[tid];
        tid += blockDim.x * gridDim.x;
    }
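
The loop above is the body of a grid-stride vector addition: each thread starts at its own global index and then jumps ahead by the total number of launched threads until it passes N. As a rough illustration only, the pure-Python sketch below emulates that indexing on the CPU. The function name `grid_stride_add` and its parameters are hypothetical, and the starting index `thread_idx + block_idx * block_dim` is an assumption, since the snippet does not show how `tid` is initialised.

```python
def grid_stride_add(a, b, grid_dim, block_dim):
    """CPU emulation of a grid-stride loop (illustration only, not GPU code).

    Returns the element-wise sum and, for each index, the starting id of the
    emulated thread that processed it.
    """
    n = len(a)
    c = [0] * n
    owner = [None] * n
    stride = block_dim * grid_dim  # total number of emulated threads
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            # Assumed initial value of tid (not shown in the snippet above).
            start = thread_idx + block_idx * block_dim
            tid = start
            while tid < n:
                c[tid] = a[tid] + b[tid]
                owner[tid] = start
                tid += stride
    return c, owner


if __name__ == "__main__":
    total, owner = grid_stride_add(list(range(7)), [10] * 7, grid_dim=2, block_dim=2)
    print(total)
    print(owner)
```

With `grid_dim=2` and `block_dim=2`, four emulated threads cover seven elements, so the thread that starts at index 0 also handles index 4. On a GPU the threads run concurrently rather than in nested loops.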
    

    Following is what you need for this book: This book is a go-to guide for you if you are a developer working with OpenCV and want to learn how to process more complex image data by exploiting GPU processing. A thorough understanding of computer vision concepts and programming languages such as C++ or Python is expected.

    With the following software and hardware list you can run all code files present in the book (Chapter 1-12).

    Software and Hardware List

    | Chapter | Software required | OS required |
    |---------|-------------------|-------------|
    | 1-4 | CUDA Toolkit X.X, Microsoft Visual Studio Community Edition, Nsight | Windows, Mac OS X, and Linux (Any) |
    | 5-8 | OpenCV Library | Windows, Mac OS X, and Linux (Any) |
    | 10-12 | Anaconda Python, PyCUDA | Windows, Mac OS X, and Linux (Any) |

    We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

    Visit the following link to check out videos of the code being run: http://bit.ly/2PZOYcH

    Get to Know the Author

    Bhaumik Vaidya is an experienced computer vision engineer and mentor. He has worked extensively with the OpenCV library to solve computer vision problems. He was a university gold medalist in his master's degree and is now doing a PhD in the acceleration of computer vision algorithms built using OpenCV and deep learning libraries on GPUs. He has a background in teaching and has guided many projects in computer vision and VLSI (very-large-scale integration). He previously worked in the VLSI domain as an ASIC verification engineer, so he also has very good knowledge of hardware architectures. He has many research papers published in reputable journals to his credit. He, along with his PhD mentor, has also received an NVIDIA Jetson TX1 embedded development platform as a research grant from NVIDIA.

    Suggestions and Feedback

    Click here if you have any feedback or suggestions.

    Download a free PDF

    If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
    Simply click on the link to claim your free PDF.

    https://packt.link/free-ebook/9781789348293

hands-on-gpu-accelerated-computer-vision-with-opencv-and-cuda's People

Contributors

bhaumik2450, divyavadhyar, githubce, packt-itservice, packtutkarshr, poojaparvatkar, zyxrrr

hands-on-gpu-accelerated-computer-vision-with-opencv-and-cuda's Issues

Questions about 02_variable_addition_reference.cu

I have tested this code. When I remove blockIdx.x from the printf call, as in the following code:

__global__ void myfirstkernel(void) {
	//blockIdx.x gives the block number of current kernel
	//printf("Hello!!!I'm thread in block: %d\n", blockIdx.x);
	printf("Hello!!!I'm thread in block: %d\n");
}

int main(void) {
	//A kernel call with 16 blocks and 1 thread per block
	myfirstkernel<<<16, 1>>>();
	//Function used for waiting for all kernels to finish
	cudaDeviceSynchronize();
	printf("All threads are finished!\n");
	return 0;
}

I cannot reproduce the kernel output that the book shows. It looks like this:
1111
If I add blockIdx.x back, as in printf("Hello!!!I'm thread in block: %d\n", blockIdx.x), I get the right results.
Why?

OpenCV CUDA detects many more Hough circles than the CPU

Hello,

System information (version)
OpenCV 4.5.1, Windows 10 64-bit, Python 3.7, CUDA 10.2

Detailed description
I have found that the CUDA HoughCirclesDetector returns many more circles than the CPU HoughCircles with the same parameters.
Also, the minDist parameter seems to have no effect: however large minDist is, the detector still returns circles that are less than minDist apart.

For the picture "eight.tif", it returns an incredible 24 circles. The code is as follows:

import cv2
import numpy as np
gpuImg = cv2.cuda_GpuMat()
def cv_show(name, image):
    cv2.namedWindow(name, cv2.WINDOW_NORMAL)
    cv2.imshow(name, image)

    cv2.waitKey(0)
    cv2.destroyAllWindows()
def getGpuResize(src):
    global gpuImg
    basePixSize = 1280
    height = src.shape[0]
    width = src.shape[1]
    print('height:', height)
    print('width:', width)
    largeSize = max(height, width)
    resizeRate = basePixSize/largeSize
    gpuImg.upload(src)
    gpuDst = cv2.cuda.resize(gpuImg, (int(width*resizeRate),int(height*resizeRate)))
    return gpuDst

def getGpuCircles(gpuSrc,src):
    ksize = (5,5)
    gpuFilter = cv2.cuda.createGaussianFilter(srcType=cv2.CV_8UC1, dstType=cv2.CV_8UC1, ksize=ksize, sigma1=0, sigma2=0)
    gpuSrc = cv2.cuda_Filter.apply(gpuFilter, gpuSrc)
    detector = cv2.cuda.createHoughCirclesDetector(dp=1, minDist=100, cannyThreshold=122, votesThreshold=50,
                                                   minRadius=1,
                                                   maxRadius=308)  # 100/110/50/445/510 or 150/110/50/430/510/MINRADIUS260
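    cuCircles = detector.detect(gpuSrc)
    circles = cuCircles.download()
    print("circles0:", circles)
    dst = src  # returned unchanged if no circle is found
    if circles is not None:  # circles were detected
        print(circles.shape)
        # np.uint0 is no longer available in recent NumPy releases
        circles = np.around(circles).astype(np.int32)
        print("circles1:", circles)
        for circle in circles[0, :]:
            # pass plain Python ints to cv2.circle
            dst = cv2.circle(src, (int(circle[0]), int(circle[1])), int(circle[2]), (255, 0, 255), 2)
    return dst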
if __name__ == '__main__':
    img = cv2.imread('./eight.tif', cv2.IMREAD_GRAYSCALE)
    cuImg = getGpuResize(img)
    img = cuImg.download()
    print(img.shape)
    # cv_show('img', img)
    dst = getGpuCircles(cuImg, img)
    cv_show('res', dst)

Appreciate it.
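
One possible workaround, sketched here under stated assumptions, is to enforce the minimum centre distance on the CPU after downloading the result. The helper `filter_by_min_dist` is hypothetical and not part of OpenCV. It assumes the circles are given as (x, y, radius) triples and keeps them greedily in input order, which may differ from how the CPU HoughCircles ranks its candidates.

```python
import math


def filter_by_min_dist(circles, min_dist):
    """Greedily keep circles whose centres are at least min_dist apart.

    circles: iterable of (x, y, radius) triples, processed in input order.
    Hypothetical post-processing step, not the detector's own logic.
    """
    kept = []
    for x, y, r in circles:
        # Keep the circle only if it is far enough from every kept centre.
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky, _ in kept):
            kept.append((x, y, r))
    return kept


if __name__ == "__main__":
    demo = [(0.0, 0.0, 5.0), (3.0, 4.0, 5.0), (100.0, 0.0, 5.0)]
    print(filter_by_min_dist(demo, min_dist=10))
```

With the downloaded array from the code above, which is indexed as `circles[0, :]`, a call might look like `filter_by_min_dist(circles[0].tolist(), 100)`. Because the filter is greedy, the result depends on the order in which the detector returns the circles.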

OpenCV Error: Gpu API call (no kernel image is available for execution on the device) in call

OpenCV Error: Gpu API call (no kernel image is available for execution on the device) in call, file /home/idriver/data/opencv-3.2.0/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp, line 340
terminate called after throwing an instance of 'cv::Exception'
what(): /home/data/opencv-3.2.0/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp:340: error: (-217) no kernel image is available for execution on the device in function call

Aborted (core dumped)
After compiling Chapter5/07_image_addition.cpp, I ran 07_image_addition and the error above occurred.

Using OpenCV CUDA library functions on the TX1 is very slow (1 FPS)

I first tested the book's addition code with 50,000 elements on the TX1, and it worked just fine.
[image]
But when I tested the image addition code, I only got 1 FPS.
[image]
Book:
[image]
My test:
[image]
Does the TX1 need any NVIDIA drivers installed? I only installed the system and CUDA, as follows:
[image]

Chapter 2: GPU vs. CPU runtime

Chapter 2 of this book shows the GPU and CPU run times for N = 10000000, measured using clock_t. But when I set N = 10000000, the code prints nothing; maybe the stack overflows. I would like to know how you solved this, but the repo does not contain the corresponding code. Could you add the source code?
Thanks!

Memory corruption

I am following your book and trying to run the code for adding two images.
Compilation succeeds, but every time I try to run it I get:
malloc(): memory corruption
Backtrace
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libc.so.6
/usr/lib/nvidia-384/tls/libnvidia-tls.so.384.130
Memory map



Core dump()

I am using a 6th-generation Intel Core i5 with an NVIDIA GeForce 820M.
