
arrayfire-ml's Introduction

ArrayFire is a general-purpose tensor library that simplifies the software development process for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market.

Several of ArrayFire's benefits include:

  • Hundreds of accelerated tensor computing functions, in the following areas:
    • Array handling
    • Computer vision
    • Image processing
    • Linear algebra
    • Machine learning
    • Standard math
    • Signal Processing
    • Statistics
    • Vector algorithms
  • Easy to use, stable, well-documented API
  • Rigorous benchmarks and tests ensuring top performance and numerical accuracy
  • Cross-platform compatibility with support for CUDA, oneAPI, OpenCL, and native CPU on Windows, Mac, and Linux
  • Built-in visualization functions through Forge
  • Commercially friendly open-source licensing
  • Enterprise support from ArrayFire

ArrayFire provides software developers with a high-level abstraction of data that resides on the accelerator, the af::array object. Developers write code that performs operations on ArrayFire arrays, which, in turn, are automatically translated into near-optimal kernels that execute on the computational device.

ArrayFire runs on devices ranging from low-power mobile phones to high-power GPU-enabled supercomputers. ArrayFire runs on CPUs from all major vendors (Intel, AMD, ARM), GPUs from the prominent manufacturers (AMD, Intel, NVIDIA, and Qualcomm), as well as a variety of other accelerator devices on Windows, Mac, and Linux.

Getting ArrayFire

Instructions to install or to build ArrayFire from source can be found on the wiki.

Conway's Game of Life Using ArrayFire

Visit the Wikipedia page for a description of Conway's Game of Life.


static const float h_kernel[] = { 1, 1, 1, 1, 0, 1, 1, 1, 1 };
static const array kernel(3, 3, h_kernel, afHost);

array state = (randu(128, 128, f32) > 0.5).as(f32); // Init state
Window myWindow(256, 256);
while(!myWindow.close()) {
    array nHood = convolve(state, kernel); // Obtain neighbors
    array C0 = (nHood == 2);  // Generate conditions for life
    array C1 = (nHood == 3);
    state = state * C0 + C1;  // Update state
    myWindow.image(state);    // Display
}

The complete source code can be found here.

Perceptron


array predict(const array &X, const array &W) {
    return sigmoid(matmul(X, W));
}

array train(const array &X, const array &Y,
        double alpha = 0.1, double maxerr = 0.05,
        int maxiter = 1000, bool verbose = false) {
    array Weights = constant(0, X.dims(1), Y.dims(1));

    for (int i = 0; i < maxiter; i++) {
        array P   = predict(X, Weights);
        array err = Y - P;
        if (mean<float>(abs(err)) < maxerr) break;
        Weights += alpha * matmulTN(X, err);
    }
    return Weights;
}
...

array Weights = train(train_feats, train_targets);
array test_outputs  = predict(test_feats, Weights);
display_results<true>(test_images, test_outputs,
                      test_targets, 20);

The complete source code can be found here.

For more code examples, visit the examples/ directory.

Documentation

You can find the complete documentation here.


Language support

ArrayFire has several official and community-maintained language APIs:

  • C++, Python, Rust, Julia, Nim

In-Progress Wrappers

  • .NET, Fortran, Go, Java, Lua, NodeJS, R, Ruby

Contributing

The community of ArrayFire developers invites you to build with us if you are interested and able to write top-performing tensor functions. Together we can fulfill The ArrayFire Mission for fast scientific computing for all.

Contributions of any kind are welcome! Please refer to the wiki and our Code of Conduct to learn more about how you can get involved with the ArrayFire Community through Sponsorship, Developer Commits, or Governance.

Citations and Acknowledgements

If you redistribute ArrayFire, please follow the terms established in the license. If you wish to cite ArrayFire in an academic publication, please use the following citation document.

ArrayFire development is funded by AccelerEyes LLC and several third parties, please see the list of acknowledgements for an expression of our gratitude.

Support and Contact Info

Trademark Policy

The literal mark "ArrayFire" and ArrayFire logos are trademarks of AccelerEyes LLC (dba ArrayFire). If you wish to use either of these marks in your own project, please consult ArrayFire's Trademark Policy

arrayfire-ml's People

Contributors

9prady9, fohx13, pavanky, plavin


arrayfire-ml's Issues

Convolution Functions

The following functions and modules need to be implemented:

  • Basic Convolutions: No strides, No pooling
  • Strided Convolutions without pooling
  • Strided Convolutions with pooling

Merging strided convolutions with pooling could potentially allow wrap and unwrap to be called only once instead of twice.
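The saving can be sketched in plain C++ (a 1-D toy, not ArrayFire's actual wrap/unwrap API): computing each pooling window's convolution values on the fly fuses the two passes, so the intermediate convolution output is never materialized.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Naive 1-D strided convolution followed by max pooling, fused into one
// pass so the intermediate convolution output is never materialized.
std::vector<float> convPoolFused(const std::vector<float>& x,
                                 const std::vector<float>& k,
                                 int stride, int poolSize) {
    const int convLen = static_cast<int>((x.size() - k.size()) / stride) + 1;
    std::vector<float> out;
    for (int p = 0; p + poolSize <= convLen; p += poolSize) {
        float best = -1e30f;
        for (int i = p; i < p + poolSize; ++i) {   // conv values for this window
            float acc = 0.f;
            for (size_t j = 0; j < k.size(); ++j)
                acc += x[i * stride + j] * k[j];
            best = std::max(best, acc);
        }
        out.push_back(best);
    }
    return out;
}
```

In the 2-D ArrayFire setting, the analogous fusion would let the unwrap of the input serve both the convolution and the pooling step.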

Remove bias from Weights.hpp

Weights currently initialize biases.
This is not always desirable, e.g. in an LSTM:

            /* Standard LSTM Implementation
             * Reference: http://deeplearning.net/tutorial/lstm.html
             * Reference: http://www.cs.toronto.edu/~graves/phd.pdf */
            Weights mWi, mUi, mIDiffs;   // input gate weights + recurrence
            Weights mWf, mUf, mFDiffs;   // forget gate weights + recurrence
            Weights mWc, mUc, mCDiffs;   // memory cell weights + recurrence
            Weights mWo, mUo, mODiffs;   // output gate weights + recurrence

Here, each gate needs only one bias, i.e. one bias for the input gate (as opposed to the two we would now have for mWi and mUi).
Either we should:

  1. refactor the above file into two files, or
  2. provide a way to NOT initialize the bias in the CTOR

To me, 1. makes more sense.

Autodiff

Hi guys,

I've seen a lot of things in terms of ML frameworks, however for my work as a research I find autodiff very usefull (e.g. symbolic graph manipulation). On the other hand arrayfire seems like the perfect the backend for such framework, as it seemsly can be adopted for GPUs and so on (I'm still interested if there is any chance of adding somehow fallback to cuDNN where possible).

Since I've atempted such things, and however lack the experience of well written performance code, was wondering how open are you guys to such proposal and would anyone be interested in giving help for making better code generation to arrayfire.

I understand this is not your main goal, but I find arrayfire probably as best candidate for a computational backend, while autodiff gives a few orders of magnititude productivity. If I happen to this I do intend to do it in c++.

RNN Models

Once we have an implementation of the Layer class (#17), the Optimizer class, and the DataSet class, we can go about creating RNN flavors. There are three models that should be implemented:

  • Vanilla RNN
  • LSTM
  • GRU

These will require the implementation of their derivatives and their forward prop values.
Certain details to consider:

  • RNNs have a stack of weight matrices and biases (not just one per Layer), so the Layer class needs to be general enough to handle this
  • The optimization needs to be handled via two methods:
    • RTRL (real time recurrent learning) &
    • BPTT (backprop through time)

To enable the above two methods of learning we should consider inheriting from Layer and implementing a Recurrent Layer.
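A possible shape for that hierarchy, with hypothetical names and plain float vectors standing in for af::array so the sketch stays self-contained:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using ArrayVector = std::vector<std::vector<float>>;

struct Layer {
    virtual ~Layer() = default;
    virtual ArrayVector forward(const ArrayVector& input) = 0;
};

// A recurrent layer additionally carries hidden state across time steps
// and exposes the two training modes discussed above.
struct RecurrentLayer : Layer {
    std::vector<float> hidden;
    virtual void stepRTRL(const ArrayVector& input) = 0;      // online update
    virtual void backwardBPTT(const ArrayVector& grads) = 0;  // unrolled update
    void resetState() { std::fill(hidden.begin(), hidden.end(), 0.f); }
};

// Trivial concrete subclass, just to show the interface being satisfied.
struct EchoRecurrent : RecurrentLayer {
    ArrayVector forward(const ArrayVector& in) override { return in; }
    void stepRTRL(const ArrayVector&) override {}
    void backwardBPTT(const ArrayVector&) override {}
};
```

Vanilla RNN, LSTM, and GRU would each subclass RecurrentLayer and supply their own gate weights and derivatives.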

Static Activations Class

The class should provide a str2enum lookup that returns a function pointer or something similar.
This will be helpful for things like:

    private:

        Weights mWeights;
        Weights mDiffs;
        Activation mActivation;

    public:

        LinearNode(const int inputSize, const int outputSize,
                   const std::string &activation = "tanh",
                   float spread = 0.05,
                   const char *name = "none") :
            Node(1, &inputSize, 1, &outputSize, name),
            mWeights(inputSize, outputSize, spread),
            mActivation(Activation::get(activation)),
            mDiffs()
        {}

        ArrayVector forward(const ArrayVector &input) {
            return {mActivation(af::matmul(mWeights.getWeights(), input[0])) +
                    af::tile(mWeights.getBias(), 1, input[0].dims(1))};
        }
Add Initializers

Currently Nodes are defined to have spreads:

            LinearNode(const int inputSize, const int outputSize,
                       float spread = 0.05,
                       const char *name="none") :

This needs to be refactored to provide an initializations class.
Examples of initialization schemes are:

  1. Le-Cun Normal
  2. Glorot Uniform
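Both schemes reduce to drawing from a distribution parameterized by the layer's fan-in and fan-out (Le-Cun normal: stddev sqrt(1/fan_in); Glorot uniform: limit sqrt(6/(fan_in+fan_out))). A self-contained sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Le-Cun normal: N(0, 1/fanIn).
std::vector<float> lecunNormal(int fanIn, int n, std::mt19937& rng) {
    std::normal_distribution<float> d(0.f, std::sqrt(1.f / fanIn));
    std::vector<float> w(n);
    for (auto& v : w) v = d(rng);
    return w;
}

// Glorot uniform: U(-l, l) with l = sqrt(6 / (fanIn + fanOut)).
std::vector<float> glorotUniform(int fanIn, int fanOut, int n, std::mt19937& rng) {
    const float limit = std::sqrt(6.f / (fanIn + fanOut));
    std::uniform_real_distribution<float> d(-limit, limit);
    std::vector<float> w(n);
    for (auto& v : w) v = d(rng);
    return w;
}
```

An initializer class could dispatch between these by name, replacing the single `spread` parameter.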

OpenCL Error OSX Radeon HD6750

jason@tesla ~/p/a/build> ./test/Activations_opencl
in[0] [5 1 1 1]
-8.9262
32.2371
45.1800
-32.0635
-8.0176

libc++abi.dylib: terminating with uncaught exception of type af::exception: ArrayFire Exception(401): Double precision not supported for this device
In /var/lib/jenkins-slave/workspace/arrayfire-osx-graphics-installer/src/api/cpp/data.cpp:27
fish: Job 1, './test/Activations_opencl ' terminated by signal SIGABRT (Abort)

Base classifier class

All classifier types inherit from this class. It should provide methods for training and for performing the classify operation.
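One possible shape for that interface, sketched with hypothetical names in plain C++ (af::array would replace the float matrices in the real library):

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<float>>;
using Labels = std::vector<int>;

// Abstract base: every classifier trains on (X, y) and classifies new rows.
struct Classifier {
    virtual ~Classifier() = default;
    virtual void train(const Matrix& X, const Labels& y) = 0;
    virtual Labels classify(const Matrix& X) const = 0;
};

// Trivial concrete example: always predicts the most frequent training label.
struct MajorityClassifier : Classifier {
    int label = 0;
    void train(const Matrix&, const Labels& y) override {
        int ones = 0;
        for (int v : y) ones += (v == 1);
        label = (2 * ones > static_cast<int>(y.size())) ? 1 : 0;
    }
    Labels classify(const Matrix& X) const override {
        return Labels(X.size(), label);
    }
};
```

Perceptron, logistic regression, etc. would plug into the same two-method interface.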

Perceptron

Specialization of MultiLayerPerceptron

Loss functions

  • mean squared error
  • mean absolute error
  • binary cross entropy
  • negative log likelihood
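Self-contained sketches of these losses over float vectors (an af::array version would replace the loops with reductions); note that for 0/1 targets, binary cross entropy coincides with the negative log likelihood of a Bernoulli model:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean squared error.
float mse(const std::vector<float>& y, const std::vector<float>& p) {
    float s = 0.f;
    for (size_t i = 0; i < y.size(); ++i) s += (y[i] - p[i]) * (y[i] - p[i]);
    return s / y.size();
}

// Mean absolute error.
float mae(const std::vector<float>& y, const std::vector<float>& p) {
    float s = 0.f;
    for (size_t i = 0; i < y.size(); ++i) s += std::fabs(y[i] - p[i]);
    return s / y.size();
}

// Binary cross entropy; for 0/1 targets this is the Bernoulli negative
// log likelihood, so one routine covers both bullets.
float bce(const std::vector<float>& y, const std::vector<float>& p) {
    float s = 0.f;
    for (size_t i = 0; i < y.size(); ++i)
        s += -(y[i] * std::log(p[i]) + (1.f - y[i]) * std::log(1.f - p[i]));
    return s / y.size();
}
```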

Proposal for the initial API

This library will sit on top of ArrayFire. The default implementation will be C++, but C wrappers will be added on top so that the library can be used from other languages.

The optimization functions and distance metrics need to be abstracted out so the user can swap them whenever possible.

Base classes

Classifiers

Dimensionality Reduction

Regression Analysis

Clustering

Optimization functions

Needs research

Is ML project dead?

I haven't seen any updates in a while, so I'm wondering if the ArrayFire ML project is still ongoing or abandoned.

Ideally it would be great to have a "drop-in" replacement for the cuDNN backend used by Torch.

Degraded performance for variable input size

I have extended arrayfire-ml with cuDNN bindings and was running benchmarks to compare with the convnet benchmarks from https://github.com/soumith/convnet-benchmarks/.

The benchmarks are run in the following way

// Define the network (the model definition was attached as a screenshot in the original issue)

// Benchmark code
for (int i = 0; i < ntimes; ++i) {
    af::sync();
    auto s = af::timer::start();
    input = <INPUT_INITIALIZATION USING AF::RANDU>
    auto out = model.forward(input);
    out.backward();
    af::sync();
    auto e = af::timer::stop(s);
    std::cout << std::setprecision(5) << e * 1000.0 << std::endl;
}

I was able to match the performance of the torch7 cuDNN bindings if I keep the input size constant. However, if I pass random inputs with sizes in the range [lo, hi], the average performance is actually worse than always sending input of size hi.

(benchmark plot from the original issue: per-iteration times showing periodic spikes)

You can notice the spikes at regular intervals, which increase the average time taken.

Note that all the buffers and arrays are initialized using the af::array(..) constructor (no cudaMallocs used), and all cuDNN operations are placed on ArrayFire's CUDA stream on the device.

I was wondering if the spikes at regular intervals suggest anything to you. Could the continuous memory allocations (of different sizes) be optimized by ArrayFire's memory manager?

Thanks in advance!
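One common mitigation, and roughly the strategy a caching device-memory manager uses, is to round allocation requests up to size buckets so that a slightly different input size can reuse a previously freed buffer instead of triggering a fresh device allocation. A plain host-side C++ sketch of the idea (illustrative only, not ArrayFire's actual manager):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Bucketed caching allocator: sizes are rounded up to the next power of
// two, and freed buffers are kept per bucket for reuse.
class CachingAllocator {
    std::map<std::size_t, std::vector<char*>> free_;
    static std::size_t bucket(std::size_t n) {
        std::size_t b = 1;
        while (b < n) b <<= 1;
        return b;
    }
public:
    char* alloc(std::size_t n) {
        std::size_t b = bucket(n);
        auto& list = free_[b];
        if (!list.empty()) {           // cache hit: no real allocation
            char* p = list.back();
            list.pop_back();
            return p;
        }
        return new char[b];
    }
    void release(std::size_t n, char* p) { free_[bucket(n)].push_back(p); }
};
```

With such bucketing, inputs varying between [lo, hi] map to a handful of buckets, so the periodic allocation spikes would only occur the first time each bucket is seen.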

TODO List for 0.1 release
