
arrayfire-ml's Introduction

ArrayFire is a general-purpose tensor library that simplifies the software development process for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market.

Several of ArrayFire's benefits include:

  • Hundreds of accelerated tensor computing functions, in the following areas:
    • Array handling
    • Computer vision
    • Image processing
    • Linear algebra
    • Machine learning
    • Standard math
    • Signal Processing
    • Statistics
    • Vector algorithms
  • Easy to use, stable, well-documented API
  • Rigorous benchmarks and tests ensuring top performance and numerical accuracy
  • Cross-platform compatibility with support for CUDA, oneAPI, OpenCL, and native CPU on Windows, Mac, and Linux
  • Built-in visualization functions through Forge
  • Commercially friendly open-source licensing
  • Enterprise support from ArrayFire

ArrayFire provides software developers with a high-level abstraction of data that resides on the accelerator, the af::array object. Developers write code that performs operations on ArrayFire arrays, which, in turn, are automatically translated into near-optimal kernels that execute on the computational device.

ArrayFire runs on devices ranging from low-power mobile phones to high-power GPU-enabled supercomputers. ArrayFire runs on CPUs from all major vendors (Intel, AMD, ARM), GPUs from the prominent manufacturers (AMD, Intel, NVIDIA, and Qualcomm), as well as a variety of other accelerator devices on Windows, Mac, and Linux.

Getting ArrayFire

Instructions to install or to build ArrayFire from source can be found on the wiki.

Conway's Game of Life Using ArrayFire

Visit the Wikipedia page for a description of Conway's Game of Life.


static const float h_kernel[] = { 1, 1, 1, 1, 0, 1, 1, 1, 1 };
static const array kernel(3, 3, h_kernel, afHost);

array state = (randu(128, 128, f32) > 0.5).as(f32); // Init state
Window myWindow(256, 256);
while(!myWindow.close()) {
    array nHood = convolve(state, kernel); // Obtain neighbors
    array C0 = (nHood == 2);  // Generate conditions for life
    array C1 = (nHood == 3);
    state = state * C0 + C1;  // Update state
    myWindow.image(state);    // Display
}

The complete source code can be found here.

Perceptron


array predict(const array &X, const array &W) {
    return sigmoid(matmul(X, W));
}

array train(const array &X, const array &Y,
        double alpha = 0.1, double maxerr = 0.05,
        int maxiter = 1000, bool verbose = false) {
    array Weights = constant(0, X.dims(1), Y.dims(1));

    for (int i = 0; i < maxiter; i++) {
        array P   = predict(X, Weights);
        array err = Y - P;
        if (mean<float>(abs(err)) < maxerr) break;
        Weights += alpha * matmulTN(X, err);
    }
    return Weights;
}
...

array Weights = train(train_feats, train_targets);
array test_outputs  = predict(test_feats, Weights);
display_results<true>(test_images, test_outputs,
                      test_targets, 20);

The complete source code can be found here.

For more code examples, visit the examples/ directory.

Documentation

You can find the complete documentation here.


Language support

ArrayFire has several official and community-maintained language APIs:

  • C++, Python, Rust, Julia, Nim

In-Progress Wrappers

  • .NET, Fortran, Go, Java, Lua, NodeJS, R, Ruby

Contributing

The community of ArrayFire developers invites you to build with us if you are interested and able to write top-performing tensor functions. Together we can fulfill The ArrayFire Mission for fast scientific computing for all.

Contributions of any kind are welcome! Please refer to the wiki and our Code of Conduct to learn more about how you can get involved with the ArrayFire Community through Sponsorship, Developer Commits, or Governance.

Citations and Acknowledgements

If you redistribute ArrayFire, please follow the terms established in the license. If you wish to cite ArrayFire in an academic publication, please use the following citation document.

ArrayFire development is funded by AccelerEyes LLC and several third parties, please see the list of acknowledgements for an expression of our gratitude.

Support and Contact Info

Trademark Policy

The literal mark "ArrayFire" and ArrayFire logos are trademarks of AccelerEyes LLC (dba ArrayFire). If you wish to use either of these marks in your own project, please consult ArrayFire's Trademark Policy

arrayfire-ml's People

Contributors

9prady9, fohx13, pavanky, plavin


arrayfire-ml's Issues

Convolution Functions

The following functions and modules need to be implemented:

  • Basic Convolutions: No strides, No pooling
  • Strided Convolutions without pooling
  • Strided Convolutions with pooling

Merging strided convolutions with pooling could potentially allow wrap and unwrap to be called only once instead of twice.
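The saving can be sketched in plain C++ (a 1-D toy, not ArrayFire's actual wrap/unwrap API): computing each pooling window's convolution values on the fly fuses the two passes, so the intermediate convolution output is never materialized.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Naive 1-D strided convolution followed by max pooling, fused into one
// pass so the intermediate convolution output is never materialized.
std::vector<float> convPoolFused(const std::vector<float>& x,
                                 const std::vector<float>& k,
                                 int stride, int poolSize) {
    const int convLen = static_cast<int>((x.size() - k.size()) / stride) + 1;
    std::vector<float> out;
    for (int p = 0; p + poolSize <= convLen; p += poolSize) {
        float best = -1e30f;
        for (int i = p; i < p + poolSize; ++i) {   // conv values for this window
            float acc = 0.f;
            for (size_t j = 0; j < k.size(); ++j)
                acc += x[i * stride + j] * k[j];
            best = std::max(best, acc);
        }
        out.push_back(best);
    }
    return out;
}
```

In the 2-D ArrayFire setting, the analogous fusion would let the unwrap of the input serve both the convolution and the pooling step.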

Remove bias from Weights.hpp

Weights currently initialize biases.
This is not always desirable, e.g. in an LSTM:

            /* Standard LSTM Implementation
             * Reference: http://deeplearning.net/tutorial/lstm.html
             * Reference: http://www.cs.toronto.edu/~graves/phd.pdf */
            Weights mWi, mUi, mIDiffs;   // input gate weights + recurrence
            Weights mWf, mUf, mFDiffs;   // forget gate weights + recurrence
            Weights mWc, mUc, mCDiffs;   // memory cell weights + recurrence
            Weights mWo, mUo, mODiffs;   // output gate weights + recurrence

Here, each gate needs only one bias, i.e. one bias for the input gate (as opposed to the two we would now have for mWi and mUi).
Either we should:

  1. refactor the above file into two files, or
  2. provide a way to NOT initialize the bias in the CTOR

To me, 1. makes more sense.

Autodiff

Hi guys,

I've seen a lot of things in terms of ML frameworks, however for my work as a research I find autodiff very usefull (e.g. symbolic graph manipulation). On the other hand arrayfire seems like the perfect the backend for such framework, as it seemsly can be adopted for GPUs and so on (I'm still interested if there is any chance of adding somehow fallback to cuDNN where possible).

Since I've atempted such things, and however lack the experience of well written performance code, was wondering how open are you guys to such proposal and would anyone be interested in giving help for making better code generation to arrayfire.

I understand this is not your main goal, but I find arrayfire probably as best candidate for a computational backend, while autodiff gives a few orders of magnititude productivity. If I happen to this I do intend to do it in c++.

RNN Models

Once we have an implementation of the Layer class (#17), the Optimizer class, and the DataSet class, we can go about creating RNN flavors. There are three models that should be implemented:

  • Vanilla RNN
  • LSTM
  • GRU

These will require the implementation of their derivatives and their forward prop values.
Certain details to consider:

  • RNNs have a stack of weight matrices and biases (not just one per Layer), so the Layer class needs to be general enough to handle this
  • The optimization needs to be handled via two methods:
    • RTRL (real time recurrent learning) &
    • BPTT (backprop through time)

To enable the above two methods of learning we should consider inheriting from Layer and implementing a Recurrent Layer.
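A possible shape for that hierarchy, with hypothetical names and plain float vectors standing in for af::array so the sketch stays self-contained:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using ArrayVector = std::vector<std::vector<float>>;

struct Layer {
    virtual ~Layer() = default;
    virtual ArrayVector forward(const ArrayVector& input) = 0;
};

// A recurrent layer additionally carries hidden state across time steps
// and exposes the two training modes discussed above.
struct RecurrentLayer : Layer {
    std::vector<float> hidden;
    virtual void stepRTRL(const ArrayVector& input) = 0;      // online update
    virtual void backwardBPTT(const ArrayVector& grads) = 0;  // unrolled update
    void resetState() { std::fill(hidden.begin(), hidden.end(), 0.f); }
};

// Trivial concrete subclass, just to show the interface being satisfied.
struct EchoRecurrent : RecurrentLayer {
    ArrayVector forward(const ArrayVector& in) override { return in; }
    void stepRTRL(const ArrayVector&) override {}
    void backwardBPTT(const ArrayVector&) override {}
};
```

Vanilla RNN, LSTM, and GRU would each subclass RecurrentLayer and supply their own gate weights and derivatives.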

Static Activations Class

The class should provide a str2enum lookup that returns a function pointer or something similar.
This will be helpful for things like:

    private:

        Weights mWeights;
        Weights mDiffs;
        Activation mActivation;

    public:

        LinearNode(const int inputSize, const int outputSize,
                   const std::string &activation = "tanh",
                   float spread = 0.05,
                   const char *name = "none") :
            Node(1, &inputSize, 1, &outputSize, name),
            mWeights(inputSize, outputSize, spread),
            mActivation(Activation::get(activation)),
            mDiffs()
        {}

        ArrayVector forward(const ArrayVector &input) {
            return {mActivation(af::matmul(mWeights.getWeights(), input[0])) +
                    af::tile(mWeights.getBias(), 1, input[0].dims(1))};
        }
Add Initializers

Currently Nodes are defined to have spreads:

            LinearNode(const int inputSize, const int outputSize,
                       float spread = 0.05,
                       const char *name="none") :

This needs to be refactored to provide an initializations class.
Examples of initialization schemes are:

  1. Le-Cun Normal
  2. Glorot Uniform
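Both schemes reduce to drawing from a distribution parameterized by the layer's fan-in and fan-out (Le-Cun normal: stddev sqrt(1/fan_in); Glorot uniform: limit sqrt(6/(fan_in+fan_out))). A self-contained sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Le-Cun normal: N(0, 1/fanIn).
std::vector<float> lecunNormal(int fanIn, int n, std::mt19937& rng) {
    std::normal_distribution<float> d(0.f, std::sqrt(1.f / fanIn));
    std::vector<float> w(n);
    for (auto& v : w) v = d(rng);
    return w;
}

// Glorot uniform: U(-l, l) with l = sqrt(6 / (fanIn + fanOut)).
std::vector<float> glorotUniform(int fanIn, int fanOut, int n, std::mt19937& rng) {
    const float limit = std::sqrt(6.f / (fanIn + fanOut));
    std::uniform_real_distribution<float> d(-limit, limit);
    std::vector<float> w(n);
    for (auto& v : w) v = d(rng);
    return w;
}
```

An initializer class could dispatch between these by name, replacing the single `spread` parameter.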

OpenCL Error OSX Radeon HD6750

jason@tesla ~/p/a/build> ./test/Activations_opencl
in[0] [5 1 1 1]
-8.9262
32.2371
45.1800
-32.0635
-8.0176

libc++abi.dylib: terminating with uncaught exception of type af::exception: ArrayFire Exception(401): Double precision not supported for this device
In /var/lib/jenkins-slave/workspace/arrayfire-osx-graphics-installer/src/api/cpp/data.cpp:27
fish: Job 1, './test/Activations_opencl ' terminated by signal SIGABRT (Abort)

Base classifier class

All classifier types inherit from this class. It should provide methods for training and for performing the classify operation.
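One possible shape for that interface, sketched with hypothetical names in plain C++ (af::array would replace the float matrices in the real library):

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<float>>;
using Labels = std::vector<int>;

// Abstract base: every classifier trains on (X, y) and classifies new rows.
struct Classifier {
    virtual ~Classifier() = default;
    virtual void train(const Matrix& X, const Labels& y) = 0;
    virtual Labels classify(const Matrix& X) const = 0;
};

// Trivial concrete example: always predicts the most frequent training label.
struct MajorityClassifier : Classifier {
    int label = 0;
    void train(const Matrix&, const Labels& y) override {
        int ones = 0;
        for (int v : y) ones += (v == 1);
        label = (2 * ones > static_cast<int>(y.size())) ? 1 : 0;
    }
    Labels classify(const Matrix& X) const override {
        return Labels(X.size(), label);
    }
};
```

Perceptron, logistic regression, etc. would plug into the same two-method interface.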

Perceptron

Specialization of MultiLayerPerceptron

Loss functions

  • mean squared error
  • mean absolute error
  • binary cross entropy
  • negative log likelihood
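Self-contained sketches of these losses over float vectors (an af::array version would replace the loops with reductions); note that for 0/1 targets, binary cross entropy coincides with the negative log likelihood of a Bernoulli model:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean squared error.
float mse(const std::vector<float>& y, const std::vector<float>& p) {
    float s = 0.f;
    for (size_t i = 0; i < y.size(); ++i) s += (y[i] - p[i]) * (y[i] - p[i]);
    return s / y.size();
}

// Mean absolute error.
float mae(const std::vector<float>& y, const std::vector<float>& p) {
    float s = 0.f;
    for (size_t i = 0; i < y.size(); ++i) s += std::fabs(y[i] - p[i]);
    return s / y.size();
}

// Binary cross entropy; for 0/1 targets this is the Bernoulli negative
// log likelihood, so one routine covers both bullets.
float bce(const std::vector<float>& y, const std::vector<float>& p) {
    float s = 0.f;
    for (size_t i = 0; i < y.size(); ++i)
        s += -(y[i] * std::log(p[i]) + (1.f - y[i]) * std::log(1.f - p[i]));
    return s / y.size();
}
```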

Proposal for the initial API

This library will sit on top of ArrayFire. The default implementation will be C++, but C wrappers will be added on top so that the library can be used from other languages.

The optimization functions and distance metrics need to be abstracted out so the user can swap them whenever possible.

Base classes

Classifiers

Dimensionality Reduction

Regression Analysis

Clustering

Optimization functions

Needs research

Is ML project dead?

I haven't seen any updates in a while, so I'm wondering if the ArrayFire ML project is still ongoing or abandoned.

Ideally it would be great to have a "drop-in" replacement for the cuDNN backend used by Torch.

Degraded performance for variable input size

I have extended arrayfire-ml with cuDNN bindings and was running benchmarks to compare with the convnet benchmarks from https://github.com/soumith/convnet-benchmarks/.

The benchmarks are run in the following way

// Define the network (the model definition was attached as a screenshot in the original issue)

// Benchmark code
for (int i = 0; i < ntimes; ++i) {
    af::sync();
    auto s = af::timer::start();
    input = <INPUT_INITIALIZATION USING AF::RANDU>
    auto out = model.forward(input);
    out.backward();
    af::sync();
    auto e = af::timer::stop(s);
    std::cout << std::setprecision(5) << e * 1000.0 << std::endl;
}

I was able to match the performance of the torch7 cuDNN bindings if I keep the input size constant. However, if I pass random inputs with sizes in the range [lo, hi], the average performance is actually worse than always sending input of size hi.

(benchmark plot from the original issue: per-iteration times showing periodic spikes)

You can notice the spikes at regular intervals, which increase the average time taken.

Note that all the buffers and arrays are initialized using the af::array(..) constructor (no cudaMallocs used), and all cuDNN operations are placed on ArrayFire's CUDA stream on the device.

I was wondering if the spikes at regular intervals suggest anything to you. Could the continuous memory allocations (of different sizes) be optimized by ArrayFire's memory manager?

Thanks in advance!
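One common mitigation, and roughly the strategy a caching device-memory manager uses, is to round allocation requests up to size buckets so that a slightly different input size can reuse a previously freed buffer instead of triggering a fresh device allocation. A plain host-side C++ sketch of the idea (illustrative only, not ArrayFire's actual manager):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Bucketed caching allocator: sizes are rounded up to the next power of
// two, and freed buffers are kept per bucket for reuse.
class CachingAllocator {
    std::map<std::size_t, std::vector<char*>> free_;
    static std::size_t bucket(std::size_t n) {
        std::size_t b = 1;
        while (b < n) b <<= 1;
        return b;
    }
public:
    char* alloc(std::size_t n) {
        std::size_t b = bucket(n);
        auto& list = free_[b];
        if (!list.empty()) {           // cache hit: no real allocation
            char* p = list.back();
            list.pop_back();
            return p;
        }
        return new char[b];
    }
    void release(std::size_t n, char* p) { free_[bucket(n)].push_back(p); }
};
```

With such bucketing, inputs varying between [lo, hi] map to a handful of buckets, so the periodic allocation spikes would only occur the first time each bucket is seen.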

TODO List for 0.1 release
