ihhub / penguinv

Computer vision library with focus on heterogeneous systems

License: Other

C++ 85.95% Makefile 0.83% Cuda 10.31% C 0.66% Shell 0.09% CMake 1.75% Python 0.04% Python 0.05% SWIG 0.32%

image-processing cuda simd computer-vision avx cpp sse thread-pool opencl python

penguinv's People

arcrode tbelaire bryonglodencissp prashcr nfm-8 yunchih seancyw lakshmanaram wwb203 thinkng filipaog krutonium kanaderu dreamplayerzhang schlenkibus faisalkholid subhahu123 ssaahhaajj daviev ngetahun tonyarkles arjunann satye1809 0x72d0 lockdrew ajaidubey yannlabou captain-darko thirdknife lionelfw andre8359 lrm25 ya1gaurav nontana kumarvishalgupta dianarg majorgrits kgorecki jvce92 shobhitsrivastava18th aathmikam xeonray mekot femiagbabiaka marias210 matthewmcgonagle aasimk2000 belial2010 bascodlowell reshmapd vrozin scrounchtike frankjwh theoniko 0x24d fullset dylanirlbeck lnhieuvn josephnicholas skn123 phoffmeister gkrls rahulravishankar akhilgeothom manekenpix sukhbeersingh tanyachauhan9 gumeo y2s82 curioustauseef emmakimo ajunlonglive swift24 plinythemid ducbx dongyanchaotj yuechengyin roitangene chmarp grommers00 jigneshbadrakhiya b51ak47 dreamplayer-zhang dubiao1986144783 mfkiwl tlalarus webstorage119 matthewroberthenderson

penguinv's Issues

Template matching code

Does the template matching code need to be complicated? How many methods do we need to cover? Do we need to use FFT for all of them?

Blob detection possible speedup

Speed up the part of blob detection that creates the map before any blobs are found. How fast could it be? How can we keep the code readable without changing much of its structure?

Multi GPU support

Nowadays many systems contain multiple GPUs, so it would be good to have multi-GPU support.

Raspberry Pi camera usage example

The library has been developed for Raspberry Pi, so we need some code for camera support, or a third-party library, as example code.

Add MakeFile files for all example projects

Using the command line is inconvenient when a project contains a lot of files, and end users don't like extra steps such as copying, pasting and running commands.

Project logo

As a project evolves it is important to make it recognizable.

It would be great if someone could help us with a logo for the project.

Dilate and erode functions

These are commonly used functions. One possible solution is to use FFT.

Add examples with GUI on different OS

This is an image processing library, so people would like to see examples with a GUI. We need a simple resizable window that can draw an image [gray-scale and/or color] and simple drawings for the following systems:

Windows (Win32 Api is preferable)
Linux
Android
iOS
~~QT framework (OS independent)~~

Example code must be as simple as possible, and at the same time it is good to have an option to build without a framework or specific tools (not everyone has QT, for example).

Example code should follow this structure: load image from drive --> show --> wait for user response --> do something on image --> show image

This is a simple sketch of how a GUI object (window) could look:

#include <cstdint>
#include <string>

class GuiWindow
{
public:
    explicit GuiWindow( const std::string & caption )
        : _caption( caption ) // store the caption instead of dropping it
    {
    }
    virtual ~GuiWindow() { }

    virtual void show() = 0;

    virtual void resize( uint32_t width, uint32_t height ) = 0; // resize window
    virtual void draw( const Bitmap_Image::Image & image ) = 0; // draw an image
    // Wait for an event. If timeMs is not 0, wait for the specified time;
    // otherwise wait for any user response such as a mouse click or key press.
    virtual void wait( uint32_t timeMs ) = 0;

private:
    std::string _caption;
};

Correct image type support for png files

Currently we support only RGBA images, while there are other image types. In addition, we allocate memory twice during image loading when we actually need to do it only once.

Rework library structure

As stated in #57, the project structure should follow a standard project layout. It is also necessary to create headers that expose the API.
Reference

EDIT: include

Installation of library for Linux/MacOS

To simplify developers' lives it would be useful to investigate installing the library within the OS so that it is stored in one place.

Performance tests for AVX/SSE/NEON/function pool/basic function

It is important to have real numbers showing the difference between these technologies. It is also a good tool for verifying implementation quality: compare the expected speed with the measured one.

Add ARM Neon instructions support

SIMD instructions on ARM are very similar to SSE/AVX. It would be great to have such support. The structure of the functions could be taken from the SSE namespace.

Of course it is necessary to have unit tests for such functions.

Simplification of Visual Studio compilation

How can we make it easier for users to compile the library? Should we use CMake, even though it requires extra steps? Should we just provide instructions for setting up projects? What should we do with third-party libraries?

Add an example project on Android NDK

As the library is cross-platform it would be great to have an example project for Android system.

Sample code for filter usage

It would be useful to add examples for each (or multiple) filter to show what they do and which results they give.

CUDA example

Currently we don't have any example of CUDA usage within the library. It would be very good to show the capabilities of the library.

Solve the problem with image borders for median filter

This is a well-known problem and there are multiple solutions for it. At the moment the library just copies the part of the image that is close to the borders. The proper way is to apply the median filter on the borders too, possibly with a smaller or center-shifted window.
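One of the options mentioned above, a smaller window near the borders, could be sketched as follows. This is a minimal illustration and not the library's implementation: the function name `medianFilter`, the tightly packed gray-scale layout and the choice of the upper median for even-sized windows are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Median filter for a gray-scale image stored row by row without padding.
// Near the borders the window is clipped to the image, so border pixels are
// filtered with a smaller window instead of being copied. Hypothetical sketch.
std::vector<uint8_t> medianFilter( const std::vector<uint8_t> & in, uint32_t width, uint32_t height, uint32_t kernelSize )
{
    assert( in.size() == static_cast<std::size_t>( width ) * height );
    assert( kernelSize % 2 == 1 );

    const uint32_t radius = kernelSize / 2;
    std::vector<uint8_t> out( in.size() );
    std::vector<uint8_t> window;
    window.reserve( static_cast<std::size_t>( kernelSize ) * kernelSize );

    for ( uint32_t y = 0; y < height; ++y ) {
        const uint32_t yStart = ( y > radius ) ? y - radius : 0;
        const uint32_t yEnd = std::min( y + radius + 1, height );
        for ( uint32_t x = 0; x < width; ++x ) {
            const uint32_t xStart = ( x > radius ) ? x - radius : 0;
            const uint32_t xEnd = std::min( x + radius + 1, width );

            // Collect only the pixels of the window that lie inside the image.
            window.clear();
            for ( uint32_t wy = yStart; wy < yEnd; ++wy )
                for ( uint32_t wx = xStart; wx < xEnd; ++wx )
                    window.push_back( in[static_cast<std::size_t>( wy ) * width + wx] );

            // nth_element puts the median (upper median for even sizes) in the middle.
            auto middle = window.begin() + window.size() / 2;
            std::nth_element( window.begin(), middle, window.end() );
            out[static_cast<std::size_t>( y ) * width + x] = *middle;
        }
    }
    return out;
}
```

With a 3 by 3 kernel a corner pixel is filtered over a 2 by 2 window and an edge pixel over a 2 by 3 window, so a single bright outlier in the middle of a flat image is removed everywhere, including on the borders.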

CPU architecture code verification and instruction set support validation

We need code that checks which CPU architecture it is compiled on, plus all supported instruction sets such as SSE, AVX and NEON. Please refer to the folder PenguinV/Library/penguinv for more information.

The cpu_identification.h file should contain all the necessary checks with the right logic. Right now it is empty :(

The idea is that, once compiled, an application/library can choose among the supported instruction sets and use them without extra magic. The first step is to identify the CPU architecture the code is compiled for. This can be done with macros. The second step is to verify which instruction sets are supported on the current platform/architecture.
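The first step described above could be sketched with compiler-predefined macros. The `SKETCH_CPU_*` macro names and the `compiledInstructionSets` function are hypothetical and are not the contents of cpu_identification.h; the predefined macros themselves are the usual GCC/Clang and MSVC spellings.

```cpp
#include <cassert>
#include <string>

// Step 1: compile-time identification of the target architecture using
// compiler-predefined macros (GCC/Clang and MSVC spellings).
#if defined( __x86_64__ ) || defined( _M_X64 ) || defined( __i386__ ) || defined( _M_IX86 )
#define SKETCH_CPU_X86 // hypothetical macro name
#elif defined( __aarch64__ ) || defined( _M_ARM64 ) || defined( __arm__ ) || defined( _M_ARM )
#define SKETCH_CPU_ARM // hypothetical macro name
#endif

// Lists the instruction sets the compiler was allowed to emit for this build.
// MSVC does not define __SSE2__, so _M_X64 is used as a stand-in there.
inline std::string compiledInstructionSets()
{
    std::string result;
#if defined( __SSE2__ ) || defined( _M_X64 )
    result += "SSE2 ";
#endif
#if defined( __AVX__ )
    result += "AVX ";
#endif
#if defined( __AVX2__ )
    result += "AVX2 ";
#endif
#if defined( __ARM_NEON ) || defined( __ARM_NEON__ )
    result += "NEON ";
#endif
    return result.empty() ? "none" : result;
}
```

This only reports what the build was compiled for. The second step, checking what the running CPU actually supports, needs a runtime query (on x86 typically the CPUID instruction) and is not shown here.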

MMX support for basic functions

It looks weird, but it could be useful in situations such as:

1. The width of the image ROI (region of interest) is less than 16 but more than 7, so MMX will be faster than normal code.

2. Old CPU support.

Please refer to image_function_sse.h and image_function_sse.cpp for an example of how to write proper code. To begin with, it is enough to have Bitwise And, Or and Xor functions.

More basic functions by SSE/AVX

Some basic functions could be done with SSE/AVX. This is a short list of them:

ConvertToGrayScale
ConvertToRgb

~~Sum~~
~~ProjectionProfile~~
~~RgbToBgr~~

FFT (DFT) integration?

FFT is a must-have for image processing. At the same time, does FFTW have a suitable licence for this project?

GPU (OpenCL) Support

Are there any plans to add OpenCL enhancements for operations that are easy to parallelise and would benefit from GPU support (such as Gaussian blurs)?

Unit tests fail to compile, missing `rand`

On OSX, after cloning the library and trying to build the unit tests, compilation fails:

$ g++ -std=c++11 -Wall unit_tests.cpp ../Library/image_function.cpp unit_test_framework.cpp unit_test_helper.cpp unit_test_image_buffer.cpp unit_test_image_function.cpp -o application
In file included from unit_test_helper.cpp:1:
./unit_test_helper.h:70:30: error: use of undeclared identifier 'rand'
                        return static_cast<data>( rand() ) % maximum;
                                                  ^
./unit_test_helper.h:80:36: error: use of undeclared identifier 'rand'
                        data value = static_cast<data>( rand() ) % maximum;
                                                        ^
2 errors generated.
In file included from unit_test_image_buffer.cpp:2:
./unit_test_helper.h:70:30: error: use of undeclared identifier 'rand'
                        return static_cast<data>( rand() ) % maximum;
                                                  ^
./unit_test_helper.h:80:36: error: use of undeclared identifier 'rand'
                        data value = static_cast<data>( rand() ) % maximum;
                                                        ^
unit_test_image_buffer.cpp:278:25: error: use of undeclared identifier 'rand'
                                uint8_t fakeValue = rand() % 2;
                                                    ^
3 errors generated.
In file included from unit_test_image_function.cpp:3:
./unit_test_helper.h:70:30: error: use of undeclared identifier 'rand'
                        return static_cast<data>( rand() ) % maximum;
                                                  ^
./unit_test_helper.h:80:36: error: use of undeclared identifier 'rand'
                        data value = static_cast<data>( rand() ) % maximum;
                                                        ^
2 errors generated.
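The errors above suggest that `rand` is used without including the header that declares it. In C++ `std::rand` is declared in `<cstdlib>`, so adding that include to the affected files should resolve the errors. A minimal sketch follows; the helper name `randomValue` is hypothetical and only mirrors the lines quoted in the compiler output.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib> // declares std::rand; the header missing in the log above

// Hypothetical helper modelled on the lines quoted in the compiler output.
template <typename data>
data randomValue( data maximum )
{
    assert( maximum > 0 ); // modulo by zero is undefined
    return static_cast<data>( std::rand() ) % maximum;
}
```

Some standard library implementations pull in `<cstdlib>` indirectly through other headers, which would explain why the same code can compile on one platform and fail on another.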

Memory pool support

A memory pool is very useful when RAM is not so fast and memory allocation/deallocation takes time. A general-purpose memory allocator needs some time to execute its allocation/deallocation code, while a memory pool runs at ~O(1) because it contains data chunks of fixed size. Evaluating a memory pool against regular memory allocation could give good performance results.
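The ~O(1) behaviour of fixed-size chunks can be illustrated with a small sketch. The class `FixedChunkPool` and all of its members are hypothetical and are not part of the library; the sketch is single-threaded and assumes no chunk is released twice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-size chunk pool: allocate() and release() are O(1) because free
// chunks are tracked on a stack of indices. All names are hypothetical.
class FixedChunkPool
{
public:
    FixedChunkPool( std::size_t chunkSize, std::size_t chunkCount )
        : _chunkSize( chunkSize )
        , _storage( chunkSize * chunkCount )
    {
        assert( chunkSize > 0 && chunkCount > 0 );
        _free.reserve( chunkCount );
        for ( std::size_t i = chunkCount; i > 0; --i )
            _free.push_back( i - 1 ); // chunk 0 ends up on top of the stack
    }

    // Returns a pointer to a free chunk or nullptr when the pool is exhausted.
    uint8_t * allocate()
    {
        if ( _free.empty() )
            return nullptr;
        const std::size_t index = _free.back();
        _free.pop_back();
        return _storage.data() + index * _chunkSize;
    }

    // Returns a chunk previously obtained from allocate() back to the pool.
    void release( uint8_t * chunk )
    {
        const std::size_t offset = static_cast<std::size_t>( chunk - _storage.data() );
        assert( offset % _chunkSize == 0 && offset < _storage.size() );
        _free.push_back( offset / _chunkSize );
    }

    std::size_t freeChunks() const { return _free.size(); }

private:
    std::size_t _chunkSize;
    std::vector<uint8_t> _storage;  // one upfront allocation for all chunks
    std::vector<std::size_t> _free; // stack of free chunk indices
};
```

The only heap allocations happen in the constructor, so the cost of `allocate()` and `release()` does not depend on how fragmented the heap is or how many chunks are in use.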

libjpeg support

We have libpng support but not libjpeg. Many images are stored in this format, so it would be irrational not to support it.

Correlates with #7

Implement functions in function pool

Some basic functions could be done with multithreading. Some of them are:

Flip
Transpose

~~ConvertToGrayScale~~
~~ConvertToRgb~~
~~ExtractChannel~~
~~IsEqual~~
~~GammaCorrection~~
~~Resize~~
~~Accumulate~~
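For functions that process an image row by row, one building block a multithreaded version needs is a way to split the rows into independent tasks. The sketch below covers only that partitioning step; the helper name `splitRows` is hypothetical and the function pool itself is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Splits [0, height) into at most taskCount contiguous row ranges of nearly
// equal size. Each pair is (first row, row count). Hypothetical helper.
std::vector<std::pair<uint32_t, uint32_t>> splitRows( uint32_t height, uint32_t taskCount )
{
    assert( taskCount > 0 );
    const uint32_t count = std::min( height, taskCount );
    std::vector<std::pair<uint32_t, uint32_t>> ranges;
    if ( count == 0 )
        return ranges;

    const uint32_t base = height / count;
    const uint32_t extra = height % count; // the first 'extra' ranges get one more row
    uint32_t start = 0;
    for ( uint32_t i = 0; i < count; ++i ) {
        const uint32_t rows = base + ( i < extra ? 1u : 0u );
        ranges.emplace_back( start, rows );
        start += rows;
    }
    return ranges;
}
```

Each range can then be handed to a separate task that calls the single-threaded function on the matching sub-region. For Flip and Transpose the ranges would most naturally describe output rows, since the input rows each task reads are located elsewhere in the image.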

Rename Bitmap_Image namespace

The Bitmap_Image namespace name confuses some people because, logically, it is not a bitmap image itself. Also remove the default 4 pixel alignment of rows.

Bitmap code simplification

The current code is bulky and not as obvious as it should be.

Ideally the code must look like this:

Image image = loadBitmap("someplace.bmp");

But there is no guarantee that the loaded image is gray-scale, so the code should look at least like this:

TemplateImage<uint8_t> image = loadBitmap("someplace.bmp");

But then how do we convert it to the normal BitmapImage standard without branching code?

Streams in CUDA

Currently all CUDA code is based on the default stream, which could reduce performance.

Test-issue-hacktoberfest

Output file for performance tests

Some people find the raw values from performance tests hard to understand, while a graph makes performance much easier to grasp. We need file output that third-party software can use to show results in a more user-friendly way.
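One simple format that most spreadsheet and plotting tools can import is CSV. The sketch below is only an illustration: the function name `toCsv`, the column names and the choice of milliseconds are assumptions, not an existing part of the performance test framework.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Formats (test name, average time in milliseconds) pairs as CSV text which
// spreadsheet or plotting tools can import. Column names are hypothetical.
std::string toCsv( const std::vector<std::pair<std::string, double>> & results )
{
    std::ostringstream stream;
    stream << "test,time_ms\n";
    for ( const auto & result : results )
        stream << result.first << ',' << result.second << '\n';
    return stream.str();
}
```

The returned string can be written to a file with `std::ofstream`. Test names that contain commas would need quoting, which this sketch does not handle.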

Add a makefile for performance tests

We need to add a makefile for the performance tests (see the performance_tests folder) to support make. Please refer to the existing makefile structure for examples.

More unit tests for the rest of basic functions

Not all basic functions are covered by unit tests, so adding them is a must.

It is very difficult to create correct unit tests for some of the functions. This is a list of functions which currently do not have unit tests:

GetThreshold
SetPixel

~~Flip~~
~~Convert~~
~~Fill~~
~~Histogram~~
~~IsEqual~~
~~Normalize~~
~~Resize~~
~~ProjectionProfile~~
~~Transpose~~

CUDA: correct calculations for thread count and blocks per grid

We need a proper way to calculate the optimal number of threads and blocks for device functions. Please take a look at the function void getKernelParameters(int & threadsPerBlock, int & blocksPerGrid, uint32_t size) in Library/cuda/image_function_cuda.cu.
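As a baseline for discussion, the usual host-side arithmetic is one thread per element with the block count rounded up. The sketch below reuses the signature quoted above but is not the library's implementation, and the fixed block size of 256 is an assumption rather than a tuned value.

```cpp
#include <cassert>
#include <cstdint>

// Host-side launch arithmetic: one thread per element, rounded up to whole
// blocks. The fixed block size of 256 is an assumption, not a tuned value.
void getKernelParameters( int & threadsPerBlock, int & blocksPerGrid, uint32_t size )
{
    assert( size > 0 );
    threadsPerBlock = 256;
    const uint32_t threads = static_cast<uint32_t>( threadsPerBlock );
    // Ceiling division written so that it cannot overflow for any 32-bit size.
    blocksPerGrid = static_cast<int>( size / threads + ( size % threads != 0 ? 1u : 0u ) );
}
```

Because the last block may be only partially used, each kernel has to check that its global index is below `size` before touching memory. For a device-specific block size, the CUDA runtime's `cudaOccupancyMaxPotentialBlockSize` may be worth evaluating instead of a constant.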

CUDA code for basic functions

We need to add OS-independent CUDA code, at least for some basic functions. Many PCs contain NVidia video cards, so it is not rational to leave CUDA unsupported. At the same time it would be better to have example code or an example project.

Please remember: the code must be as simple as possible with proper comments :)

Support of libpng for Windows

Windows doesn't have third-party support of libpng (and zlib). We need to write clean code which would be easy to use.

Correlates with #7

Missing virtual destructor in BitmapDibHeader struct

FFT from Gaussian is Gaussian

There is no need to convert a Gaussian filter from the original domain into the frequency domain, because we can calculate it directly in the frequency domain.
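For reference, the continuous-domain identity behind this statement is shown below. It assumes the Fourier transform convention $G(f) = \int g(x)\, e^{-2\pi i f x}\, dx$; other conventions change the constants, and for a sampled, finite-size filter the relation holds only approximately.

```latex
g(x) = e^{-\frac{x^2}{2\sigma^2}}
\quad\Longrightarrow\quad
G(f) = \sigma\sqrt{2\pi}\; e^{-2\pi^2 \sigma^2 f^2}
```

In other words, a Gaussian with standard deviation $\sigma$ in the original domain corresponds to a Gaussian with standard deviation $1/(2\pi\sigma)$ in the frequency domain, so the filter can be generated there directly from $\sigma$.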

RGBA conversion functions

Currently the code supports only Gray and RGB images, although we have introduced the RGBA type. We need to support RGBA images too.

Add support of 32-bit images for Bitmap_Operation namespace

The solution is simply to ignore the alpha channel of the image, because it is used only for the representation of the image.
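Ignoring the alpha channel could look roughly like the sketch below. The function name `dropAlpha` is hypothetical, and the sketch assumes tightly packed pixels with alpha stored as the fourth byte of each pixel; the order of the three color channels is preserved as-is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts tightly packed 4-channel pixels to 3-channel pixels by copying
// the first three channels and skipping the fourth (alpha). Hypothetical helper.
std::vector<uint8_t> dropAlpha( const std::vector<uint8_t> & rgba )
{
    assert( rgba.size() % 4 == 0 );
    std::vector<uint8_t> rgb;
    rgb.reserve( rgba.size() / 4 * 3 );
    for ( std::size_t i = 0; i < rgba.size(); i += 4 ) {
        rgb.push_back( rgba[i] );
        rgb.push_back( rgba[i + 1] );
        rgb.push_back( rgba[i + 2] );
    }
    return rgb;
}
```

Real bitmap rows may carry padding, in which case the loop would have to run per row rather than over the whole buffer.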

Add multithreading support for blob detection

Blob detection is one of the slowest operations in image processing. Doing blob detection in parallel is not an obvious task, as it requires an additional step: merging results. I have one parallel implementation of blob detection but it needs a review. I will add the code soon.

Image borders in CUDA

We use only one dimension in CUDA kernels to execute operations on an image, but according to the CUDA specification, for devices with compute capability lower than 3.0 the maximum grid size along the X axis is only 65535 blocks. We use 256 threads per block, so this limits our image size to 65535 x 256 = 16,776,960 bytes, or around 16 MB.
The maximum resolution in our unit tests is only 2048 by 2048 pixels, which is less than 16 MB, so the unit tests do not fail for us.
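One way around this limit is to spread the blocks over a second grid dimension. The sketch below shows only the host-side arithmetic; the type `GridSize` and the function `make2dGrid` are hypothetical names, and the default limit of 65535 is the figure quoted above.

```cpp
#include <cassert>
#include <cstdint>

// The limit quoted above: 65535 blocks of 256 threads, one byte per thread.
static_assert( 65535u * 256u == 16776960u, "1D grid limit in bytes" );

struct GridSize // hypothetical type
{
    uint32_t x;
    uint32_t y;
};

// Spreads blockCount blocks over a 2D grid so that neither dimension exceeds
// maxDim. The grid may contain a few surplus blocks (x * y >= blockCount).
GridSize make2dGrid( uint32_t blockCount, uint32_t maxDim = 65535u )
{
    assert( blockCount > 0 && maxDim > 0 );
    GridSize grid;
    grid.y = ( blockCount - 1 ) / maxDim + 1; // ceil( blockCount / maxDim )
    grid.x = ( blockCount - 1 ) / grid.y + 1; // ceil( blockCount / grid.y )
    assert( grid.y <= maxDim ); // holds while blockCount <= maxDim * maxDim
    return grid;
}
```

Inside a kernel the linear block index would then be `blockIdx.y * gridDim.x + blockIdx.x`, and because the grid can contain surplus blocks, the bounds check against the image size remains necessary.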