dcv's People

Contributors

9il, aferust, carun, dmitryolshansky, gitter-badger, henrygouk, ljubobratovicrelja, seagetch, timotheecour

dcv's Issues

Global Goal

Possible variants:

  • Library for D. Disadvantage: the D community is really small. Advantage: it can be implemented in any programming style, including GC + OOP. A project to play with D and have fun.
  • Library written in D. Disadvantage: a nothrow @nogc layer must be provided, and the OOP API should be optional or removed. Advantage: the number of users and contributors is limited only by language bindings such as Python, Julia, Ruby, Rust, and Go (yes, we can build libraries for Rust and Go 😄 I think it is a better way to move forward with D). A professional project that can live for many years.

A library for D is, in 99% of cases, useless as a general-purpose library. A library written in D is still a library for D, but it can also be used from other languages like a common C library.

imwrite should accept lazy slice

  • lazy image / image with stride!1 != 1 -> buffering (4 KB buffer) -> writer stream
  • image with stride!1 == 1 -> writer stream
  • image with ideal strides -> writer stream (single call).

For example, void write_png(Writer stream, long w, long h, in ubyte[] data, long tgt_chans = 0) from the imageformats library can be used.

norm and asType can be deprecated.

Both are single-line functions. See #33.
normalized is a single line too.
scaled is not used, if I am not wrong. The user can do slice[] *= scalar / slice[] += scalar, or use ndEach if both * and + are required.

Proof of concept for ndslice

dcv is an amazing proof of concept for ndslice. Please add a note that it is based on ndslice to the future forum announcement and the dcv readme.

examples/video

Won't work

# ./video -f ../data/centaur_1.mpg

[mpegvideo @ 0xd9d320] Estimating duration from bitrate, this may be inaccurate
core.exception.AssertError@../../source/dcv/core/image.d(171): Assertion failure

Iteration performance

aSlice[i] requires 1 addition and 1 multiplication
aSlice[i, j] requires 2 additions and 2 multiplications
90% of algorithms can be iterated using ndslice.algorithm (ndMap, ndReduce). The other 10% can use the front!d and popFront!d methods.

~= concatenation should be removed

From RHT module:

    /// Run RHT using non-zero points in image as edge points.
    auto opCall(T)(Slice!(2, T*) image)
    {
        Point[] points;
        foreach (y; 0 .. image.length!0)
            foreach (x; 0 .. image.length!1)
            {
                if (image[y, x] > 0)
                {
                    points ~= Point(cast(int)x, cast(int)y);
                }
            }
        return this.opCall(image, points);
    }

This example has two issues. The first is slow indexing. The second is ~= concatenation, which changes complexity from O(n) to O(n^2).
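One possible way to address both points is sketched below. It works on a plain row-major array rather than the Slice type from the snippet, and the Point struct and the collectNonZero name are hypothetical stand-ins, not dcv API. The buffer is walked with a running linear index, and results are collected through std.array.appender with capacity reserved up front.

```d
import std.array : appender;

// Hypothetical stand-in for the Point type used by the RHT module.
struct Point
{
    int x, y;
}

/// Collect coordinates of non-zero pixels from a row-major buffer.
Point[] collectNonZero(T)(const(T)[] pixels, size_t rows, size_t cols)
{
    assert(pixels.length == rows * cols);
    auto points = appender!(Point[])();
    points.reserve(pixels.length / 8); // rough guess; tune for your data
    size_t i = 0; // running linear index, so no per-pixel multiplication
    foreach (y; 0 .. rows)
        foreach (x; 0 .. cols)
        {
            if (pixels[i] > 0)
                points.put(Point(cast(int) x, cast(int) y));
            ++i;
        }
    return points.data;
}
```

The reserve size is only a guess; the appender still grows on demand if more points are found.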

Note about LDC

  1. That it is required.
  2. ldmd2 should be used with DUB, not ldc2

Double allocation without reason

as!aType.slice.asImage allocates without reason, because asImage allocates data anyway. asImage uses data too. In addition, it should be named something like toImage.

Is it possible to remove a background with dcv?

I have a lot of images in JPG format with a black background. Is it possible to find the background and remove it (make it transparent) with the current version of DCV? PNG as output would be enough for me (ideally WebP, but D does not have a native encoder/decoder :( )

If not DCV, maybe there are better tools for this task?

[Attached image: 101_005329_2_0_03]

Go forward with LLVM and drop DMD BE?

According to libmir/dcompute#7, the upcoming OpenCL support can be used for thread and synchronization management instead of druntime on CPUs (not only on GPUs), and kernels can be optimized as well as common CPU code. So the idea is to drop support for DMD. The benefits:

  1. No need to optimise the DMD BE; currently it is 20 times slower for matrix multiplication compared with LDC. In addition, @WalterBright has a lot of other work on the DMD FE.
  2. Simple, fast, nothrow @nogc parallelism using the upcoming dcompute.
  3. Code will be simplified.
  4. Less maintenance effort will be required.
  5. Less uncertainty for users.

ranged should be deprecated (again)

As soon as mir.image is implemented, we will have good conversions between all formats we use. ranged is a function that stretches the color domain of each channel. If you apply it to a low-contrast image, it will increase the contrast. This implicit filter functionality is bad practice.

Filters with fixed size

Many separable filters are used with a fixed size, e.g. 2, 3, 4, or 5. These can be unrolled with static foreach.
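A minimal sketch of the idea, assuming a kernel whose size N is known at compile time. The names applyKernel and filterRow are hypothetical and not part of dcv. The kernel is applied without flipping, which is equivalent to convolution for symmetric kernels.

```d
/// Apply a 1D kernel of compile-time size N at position `i` of `src`.
/// The caller guarantees i + N <= src.length.
T applyKernel(size_t N, T)(const(T)[] src, size_t i, ref const T[N] kernel)
{
    T sum = 0;
    // Expanded at compile time into N multiply-adds with constant offsets.
    static foreach (k; 0 .. N)
        sum += src[i + k] * kernel[k];
    return sum;
}

/// "Valid" 1D filtering: the output has src.length - N + 1 elements.
T[] filterRow(size_t N, T)(const(T)[] src, ref const T[N] kernel)
{
    assert(src.length >= N);
    auto dst = new T[src.length - N + 1];
    foreach (i; 0 .. dst.length)
        dst[i] = applyKernel!(N, T)(src, i, kernel);
    return dst;
}
```

A separable 2D filter would run such a pass over the rows and then over the columns, with one instantiation per kernel size in use.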

Naming conventions

It seems your naming conventions and abbreviations do not give a good experience, IMO. Longer names might be better in some cases.

Like:

Image image = imread("/path/to/image.png"); // read an image from filesystem.

auto slice = image.sliced; // slice image data (calls std.experimental.ndslice.slice.sliced on image data)

slice
    .asType!float[0..$, 0..$, 1] // convert slice data to float, and take the green channel only.
    .conv!symmetric(sobel!float(GradientDirection.DIR_X)) // convolve image with horizontal Sobel kernel.
    .byElement
    .ranged(0, 255).array.sliced(slice.shape[0..2]) // scale values to fit the range between the 0 and 255
    .imshow("Sobel derivatives"); // preview changes on screen.

waitKey();

Something like readImage(), showImage().

You tend to mix them: for example, byElement() is not consistent with imread(), imshow(), etc.

[Docs] Move examples to gh-pages

Remove README.md content from examples in the project - not to have doubled example content that requires updating and syncing.

Remove dcv.core.image.Image

Proposition by @9il, started in #62.

Here are the relevant copy/pasted messages:

9il:

Do we really need Image type? Why?

ljubobratovicrelja:

As said in the description of the module in the docs, it is designed mainly to help with image I/O, but also to hold additional image metadata. Since its data type is defined at runtime, it allows reading an unknown image format. Since the Slice format is statically defined, we would have to expect a certain image format when reading it, and if the read image is not of the expected format, we'd have to convert it. Also, Image contains additional metadata, e.g. color format (HSV, YUV, RGB, etc.). And, in the future, Image should hold EXIF metadata.

Pipeline in DCV should be:

Image = dcv.core.image.Image
Slice = mir.ndslice.slice.Slice

LoadImage(path) --> dcv.core.image.Image
InspectAndAdoptImageFormat(Image) --> Slice
Processing(Slice) --> Slice
PackSliceToImage(Slice) --> Image
SaveImage(Image, path)

Long story short, we need image container with runtime defined data type, and additional image related metadata.

9il:

These are scripting-language idioms. They are not good for D.

If you have processing, then you work with one, two, or at most three formats for processing.
They should have their own CT instantiations for performance reasons.
Then, when you want to save something, you can just call a function which accepts a Slice, Metadata, and optionally an RT/CT format.

The last issue is reading. Yes, when we read something, the image format is unknown. But, as was said above, only image types known beforehand are interesting. So, a user or library should define a mapping, for example:

  • RT image type1 -> Alg1
  • RT image type2 -> Alg1
  • RT image type3 -> Alg2
  • Other RT image -> Error

It is not possible to eliminate this mapping. But rather than hiding it in different class implementations, it is better to have an explicit way to do it, with library helpers if required.
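The mapping above could be written explicitly along these lines. This is only a sketch: the PixelFormat enum, alg1, alg2, and process are hypothetical names, and the placeholder algorithms merely return the element count.

```d
// Hypothetical run-time format tag, e.g. decoded from a file header.
enum PixelFormat { gray8, rgb8, rgba8, gray16 }

// Hypothetical algorithms, each compiled for its own element type.
// As placeholders they only return the number of elements.
size_t alg1(const(ubyte)[] data) { return data.length; }
size_t alg2(const(ushort)[] data) { return data.length; }

/// Explicit RT format -> CT instantiation mapping.
/// Formats without a mapping are reported as an error.
size_t process(PixelFormat fmt, const(void)[] raw)
{
    switch (fmt)
    {
    case PixelFormat.gray8:
    case PixelFormat.rgb8:
        return alg1(cast(const(ubyte)[]) raw);
    case PixelFormat.gray16:
        return alg2(cast(const(ushort)[]) raw);
    default:
        throw new Exception("unsupported image format");
    }
}
```

The switch is the only place where run-time format information is inspected; everything behind it is statically typed and needs no classes.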

Please avoid any usage of classes (except in already existing D libs, which can be replaced in the future). Even async I/O can be performed without classes. D users like classes because they are familiar with OOP. But this is bad practice for D. Structured programming is the proper way to move forward with D.

Image I/O

As discussed in #48, maybe we should start planning on how to enrich the image I/O package of the library. Imageformats library is good (especially because it is purely written in D), but format coverage is poor (especially for encoders).

Use C libraries

So first idea is to build minimal bindings (or use existing ones) to popular C libraries:

Pros:

  • industry-wide used and tested libraries
  • minimal pain for maximal gain :)

Cons:

  • more C dependencies

Translate libraries to D

Some people already translated some of the popular encoder/decoders to D. I feel that's not that easy to do, and I'd personally much more like to focus on the DCV's core, but if we decide to take this step, no problem.

Using FFmpeg

@henrygouk suggested we could use ffmpeg to encode/decode image formats. This also seems like a great choice, since a lot of formats are supported and ffmpeg-d is already a dependency.

Custom image I/O library's synergy with dcv:core

There was also discussion that users should be free to use 3rd party libraries for image I/O with DCV. I believe this is already achievable in DCV - e.g. if a user is working with gtkd and dcv:core, he/she can load a pixbuf from a file, then slice its data and work with dcv algorithms. So, I believe we're OK here, except maybe we should make an example on this topic to show it to people.

Separation of Image I/O from Video

We also discussed whether image I/O should be separated from video - in #48 we defined the dcv:io sub-package, where we could have defined dcv:ioimage and dcv:iovideo. If we decide to go with the first option (bind C libs), I really think we should do this, since it would be heavily loaded with C libs.

Any comment is welcome.

Library separation

A user may want to use this library only for CV algorithms. Library should not force a user to install any C libraries.
It looks like DCV may be split into

  • algorithms / image manipulations (DCV)
  • decoding / codecs
  • visualization

The current library looks like it is oriented towards the end user. Compared with Python, it is better practice for D to have an API which can be used to build extended functionality, e.g. to be used in other libraries and cross-platform products.

How is linear resizing supposed to work?

When I resize an image in GIMP, I get a completely different result.
For example, in DCV I do this:

// w == 3; The image is a 2MP RGBA photo.

auto slice = image.sliced;
writeln(slice[0, 0, 0 .. $]);
auto thumbnail = slice.resize!linear([w, w]);
auto rgb = thumbnail[0 .. $, 0 .. $, 0 .. 3];
writeln(rgb);

and I got this:

[196, 198, 249, 255]
[[[196, 198, 249], [156, 166, 189], [82, 59, 100]], [[231, 163, 178], [35, 39, 34], [104, 98, 122]], [[141, 120, 119], [26, 36, 44], [169, 141, 130]]]

The first pixel of the downscaled image is the same as the first pixel of original image.
This is not what I expect.
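One possible explanation, stated as an assumption and not checked against DCV's source: if linear resizing samples the source at interpolated positions without any pre-filtering, destination index 0 maps exactly to source index 0, so the first pixel is copied unchanged. Editors such as GIMP may additionally average or low-pass filter when downscaling, which would account for the difference. The 1D sketch below (resizeLinear is a hypothetical name, using one common coordinate mapping) shows the effect:

```d
/// 1D linear resampling with plain point sampling (no pre-filtering).
/// Destination index i maps to source coordinate i * (n - 1) / (m - 1),
/// where n is the source length and m the destination length.
double[] resizeLinear(const(double)[] src, size_t m)
{
    assert(src.length >= 2 && m >= 2);
    auto dst = new double[m];
    immutable scale = cast(double)(src.length - 1) / (m - 1);
    foreach (i; 0 .. m)
    {
        immutable pos = i * scale;
        auto lo = cast(size_t) pos;
        if (lo > src.length - 2)
            lo = src.length - 2; // keep lo + 1 inside the array
        immutable t = pos - lo;
        // Blend the two nearest source samples.
        dst[i] = src[lo] * (1 - t) + src[lo + 1] * t;
    }
    return dst;
}
```

Downscaling seven samples to three this way reads only three source values; the samples in between do not contribute at all, unlike in an area-averaging resize.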

migration to ndslice.algorithm and Mir

The common pattern in dcv is aSlice.byElement.rangedFunction. It has a few performance issues:

  1. It cannot be vectorized.
  2. It requires additional computations because of the range interface.

So, you may want to switch from dcv.core.algorithm to ndslice.algorithm.

The documentation for ndslice.algorithm can be found here. ndslice.algorithm is currently available only in Mir.

So, switching to Mir is a good option. It provides recent and upcoming ndslice changes for both DMD and LDC. Also, Mir will migrate to Phobos's ndslice after 2-3 DMD releases and will provide deprecation imports with aliasing. ndslice.algorithm is not the last module in the ndslice package; ndslice.concatenation will also be added this year.

Please ask me questions if you have any.

debug

Hi again. A CV library is always complex, so I want to join the development, but some internals are hard to understand. I think we need to provide some debug output (to help with bug reports) in a "debug build" or behind some version definition
like USE_DCV_DEBUG.
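A minimal sketch of what such a switch could look like, assuming the USE_DCV_DEBUG identifier proposed above. The dcvTrace helper is hypothetical and not part of dcv.

```d
import std.stdio : stderr;

/// Writes a trace line to stderr only when the library is compiled with
/// the USE_DCV_DEBUG version identifier (e.g. "versions": ["USE_DCV_DEBUG"]
/// in dub.json). Otherwise the body is empty, and the lazy argument is
/// never evaluated.
void dcvTrace(lazy string msg)
{
    version (USE_DCV_DEBUG)
        stderr.writeln("[dcv] ", msg);
}
```

Because the message is a lazy parameter, building an expensive diagnostic string costs nothing in builds where the version identifier is not set.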

GPU acceleration

Nicholas Wilson merged dcompute into the libmir organization. It is not ready yet, but we can figure out which DCV algorithms can be expressed as GPU kernels, and which GPU subroutines required by DCV should be implemented in Mir.

Use std.color based image types in the form of `Slice!(2, XXX*)`

Benefits:

  1. Faster iteration (the color dimension becomes a CT loop).
  2. An explicit image type system is less bug-prone for devs and users.
  3. A single conversion shell based on std.color can be used instead of a set of conversions.

std.color can be improved to support DCV if required. In addition, we can add fastmath to std.color if we want.

Code clean and optimization: remove std.range

I have reviewed a set of files. std.range and std.array are still used frequently. The reasons to remove them:

  1. Less template bloat with iota - iotaSlice uses size_t only and is faster.
  2. Many use cases can be improved with ndslice primitives.

ndslice does not require std.range and incorporates its functionality.
Maybe a few cases with std.array can still be useful. But most of them are used for slice allocation.

Dub review

dub.json should not import examples, except in the unittest configuration.
Also, according to #25, dcv should not depend on C libraries. Mir will have matrix inversion, which will allow dropping the scid (C LAPACK) dependency.

Support 32 bit compilation

  • As noted in #19, ulong is used for the size type, which should be fixed.
  • Surely there's other stuff to be discovered that breaks 32-bit compilation...
