Tensors and neural networks in Haskell

Home Page: http://hasktorch.org

License: Other


Hasktorch

Hasktorch is a library for tensors and neural networks in Haskell. It is an independent open source community project which leverages the core C++ libraries shared by PyTorch.

This project is in active development, so expect changes to the library API as it evolves. We would like to invite new users to join our Hasktorch Slack space for questions and discussions. Contributions and PRs are encouraged.

We are currently developing the second major release of Hasktorch (0.2). Note that the first release, Hasktorch 0.1, on Hackage is outdated and should not be used.

Documentation

The documentation is divided into several sections:

Introductory Videos

Getting Started

The following steps will get you started. They assume the hasktorch repository has just been cloned. Once setup is complete, see the online tutorials and API documentation.

linux+cabal+cpu

Starting from the top-level directory of the project, run:

$ pushd deps       # Change to the deps directory and save the current directory.
$ ./get-deps.sh    # Run the shell script to retrieve the libtorch dependencies.
$ popd             # Go back to the root directory of the project.
$ source setenv    # Set the shell environment to reference the shared library locations.
$ ./setup-cabal.sh # Create a cabal project file.

To build and test the Hasktorch library, run:

$ cabal build hasktorch  # Build the Hasktorch library.
$ cabal test hasktorch   # Build and run the Hasktorch library test suite.

To build and test the example executables shipped with hasktorch, run:

$ cabal build examples  # Build the Hasktorch examples.
$ cabal test examples   # Build and run the Hasktorch example test suites.

To run the MNIST CNN example, run:

$ cd examples                   # Change to the examples directory.
$ ./datasets/download-mnist.sh  # Download the MNIST dataset.
$ mv mnist data                 # Move the MNIST dataset to the data directory.
$ export DEVICE=cpu             # Set device to CPU for the MNIST CNN example.
$ cabal run static-mnist-cnn    # Run the MNIST CNN example.
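
Once the build succeeds, a small program can confirm that tensors work end to end. This is a minimal sketch assuming hasktorch's top-level Torch module with asTensor, shape, and sumAll in scope; check the API documentation for your version:

```haskell
import Torch

main :: IO ()
main = do
  -- Build a 2x2 tensor from a nested Haskell list.
  let t = asTensor ([[1, 2], [3, 4]] :: [[Float]])
  print (shape t)   -- the tensor's dimensions
  print (sumAll t)  -- reduction over all elements
```

Run it from within the environment set up above (for example via cabal repl hasktorch), since the libtorch shared libraries must be on the loader path.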

linux+cabal+cuda11

Starting from the top-level directory of the project, run:

$ pushd deps              # Change to the deps directory and save the current directory.
$ ./get-deps.sh -a cu118  # Run the shell script to retrieve the libtorch dependencies.
$ popd                    # Go back to the root directory of the project.
$ source setenv           # Set the shell environment to reference the shared library locations.
$ ./setup-cabal.sh        # Create a cabal project file.

To build and test the Hasktorch library, run:

$ cabal build hasktorch  # Build the Hasktorch library.
$ cabal test hasktorch   # Build and run the Hasktorch library test suite.

To build and test the example executables shipped with hasktorch, run:

$ cabal build examples  # Build the Hasktorch examples.
$ cabal test examples   # Build and run the Hasktorch example test suites.

To run the MNIST CNN example, run:

$ cd examples                   # Change to the examples directory.
$ ./datasets/download-mnist.sh  # Download the MNIST dataset.
$ mv mnist data                 # Move the MNIST dataset to the data directory.
$ export DEVICE="cuda:0"        # Set device to CUDA for the MNIST CNN example.
$ cabal run static-mnist-cnn    # Run the MNIST CNN example.

macos+cabal+cpu

Starting from the top-level directory of the project, run:

$ pushd deps       # Change to the deps directory and save the current directory.
$ ./get-deps.sh    # Run the shell script to retrieve the libtorch dependencies.
$ popd             # Go back to the root directory of the project.
$ source setenv    # Set the shell environment to reference the shared library locations.
$ ./setup-cabal.sh # Create a cabal project file.

To build and test the Hasktorch library, run:

$ cabal build hasktorch  # Build the Hasktorch library.
$ cabal test hasktorch   # Build and run the Hasktorch library test suite.

To build and test the example executables shipped with hasktorch, run:

$ cabal build examples  # Build the Hasktorch examples.
$ cabal test examples   # Build and run the Hasktorch example test suites.

To run the MNIST CNN example, run:

$ cd examples                   # Change to the examples directory.
$ ./datasets/download-mnist.sh  # Download the MNIST dataset.
$ mv mnist data                 # Move the MNIST dataset to the data directory.
$ export DEVICE=cpu             # Set device to CPU for the MNIST CNN example.
$ cabal run static-mnist-cnn    # Run the MNIST CNN example.

linux+stack+cpu

Install the Haskell Tool Stack if you haven't already, following the instructions here.

Starting from the top-level directory of the project, run:

$ pushd deps     # Change to the deps directory and save the current directory.
$ ./get-deps.sh  # Run the shell script to retrieve the libtorch dependencies.
$ popd           # Go back to the root directory of the project.
$ source setenv  # Set the shell environment to reference the shared library locations.

To build and test the Hasktorch library, run:

$ stack build hasktorch  # Build the Hasktorch library.
$ stack test hasktorch   # Build and run the Hasktorch library test suite.

To build and test the example executables shipped with hasktorch, run:

$ stack build examples  # Build the Hasktorch examples.
$ stack test examples   # Build and run the Hasktorch example test suites.

To run the MNIST CNN example, run:

$ cd examples                   # Change to the examples directory.
$ ./datasets/download-mnist.sh  # Download the MNIST dataset.
$ mv mnist data                 # Move the MNIST dataset to the data directory.
$ export DEVICE=cpu             # Set device to CPU for the MNIST CNN example.
$ stack run static-mnist-cnn    # Run the MNIST CNN example.

macos+stack+cpu

Install the Haskell Tool Stack if you haven't already, following the instructions here.

Starting from the top-level directory of the project, run:

$ pushd deps     # Change to the deps directory and save the current directory.
$ ./get-deps.sh  # Run the shell script to retrieve the libtorch dependencies.
$ popd           # Go back to the root directory of the project.
$ source setenv  # Set the shell environment to reference the shared library locations.

To build and test the Hasktorch library, run:

$ stack build hasktorch  # Build the Hasktorch library.
$ stack test hasktorch   # Build and run the Hasktorch library test suite.

To build and test the example executables shipped with hasktorch, run:

$ stack build examples  # Build the Hasktorch examples.
$ stack test examples   # Build and run the Hasktorch example test suites.

To run the MNIST CNN example, run:

$ cd examples                   # Change to the examples directory.
$ ./datasets/download-mnist.sh  # Download the MNIST dataset.
$ mv mnist data                 # Move the MNIST dataset to the data directory.
$ export DEVICE=cpu             # Set device to CPU for the MNIST CNN example.
$ stack run static-mnist-cnn    # Run the MNIST CNN example.

nixos+cabal+cpu

(Optional) Install and set up Cachix:

$ nix-env -iA cachix -f https://cachix.org/api/v1/install  # (Optional) Install Cachix.
# (Optional) Use IOHK's cache. See https://input-output-hk.github.io/haskell.nix/tutorials/getting-started/#setting-up-the-binary-cache
$ cachix use hasktorch                                     # (Optional) Use hasktorch's cache.

Starting from the top-level directory of the project, run:

$ nix develop  # Enter the nix shell environment for Hasktorch.

To build and test the Hasktorch library, run:

$ cabal build hasktorch  # Build the Hasktorch library.
$ cabal test hasktorch   # Build and run the Hasktorch library test suite.

To build and test the example executables shipped with hasktorch, run:

$ cabal build examples  # Build the Hasktorch examples.
$ cabal test examples   # Build and run the Hasktorch example test suites.

To run the MNIST CNN example, run:

$ cd examples                   # Change to the examples directory.
$ ./datasets/download-mnist.sh  # Download the MNIST dataset.
$ mv mnist data                 # Move the MNIST dataset to the data directory.
$ export DEVICE=cpu             # Set device to CPU for the MNIST CNN example.
$ cabal run static-mnist-cnn    # Run the MNIST CNN example.

nixos+cabal+cuda11

(Optional) Install and set up Cachix:

$ nix-env -iA cachix -f https://cachix.org/api/v1/install  # (Optional) Install Cachix.
# (Optional) Use IOHK's cache. See https://input-output-hk.github.io/haskell.nix/tutorials/getting-started/#setting-up-the-binary-cache
$ cachix use hasktorch                                     # (Optional) Use hasktorch's cache.

Starting from the top-level directory of the project, run:

$ cat > nix/dev-config.nix <<EOF  # Write the dev configuration (heredoc ends at EOF).
{
  profiling = true;
  cudaSupport = true;
  cudaMajorVersion = "11";
}
EOF
$ nix develop  # Enter the nix shell environment for Hasktorch.

To build and test the Hasktorch library, run:

$ cabal build hasktorch  # Build the Hasktorch library.
$ cabal test hasktorch   # Build and run the Hasktorch library test suite.

To build and test the example executables shipped with hasktorch, run:

$ cabal build examples  # Build the Hasktorch examples.
$ cabal test examples   # Build and run the Hasktorch example test suites.

To run the MNIST CNN example, run:

$ cd examples                   # Change to the examples directory.
$ ./datasets/download-mnist.sh  # Download the MNIST dataset.
$ mv mnist data                 # Move the MNIST dataset to the data directory.
$ export DEVICE="cuda:0"        # Set device to CUDA for the MNIST CNN example.
$ cabal run static-mnist-cnn    # Run the MNIST CNN example.

docker+jupyterlab+cuda11

This Docker Hub repository provides a Docker image with JupyterLab preinstalled. Images are available for CUDA 11, CUDA 10, and CPU-only setups. To use JupyterLab with hasktorch, run one of the following commands, then click the URL printed in the console.

$ docker run --gpus all -it --rm -p 8888:8888 htorch/hasktorch-jupyter
or
$ docker run --gpus all -it --rm -p 8888:8888 htorch/hasktorch-jupyter:latest-cu11

Known Issues

Tensors Cannot Be Moved to CUDA

In rare cases, you may see errors like

cannot move tensor to "CUDA:0"

although you have CUDA capable hardware in your machine and have followed the getting-started instructions for CUDA support.

If that happens, check if /run/opengl-driver/lib exists. If not, make sure your CUDA drivers are installed correctly.

Weird Behaviour When Switching from CPU-Only to CUDA-Enabled Nix Shell

If you have run cabal in a CPU-only Hasktorch Nix shell before, you may need to:

  • Clean the dist-newstyle folder using cabal clean.
  • Delete the .ghc.environment* file in the Hasktorch root folder.

Otherwise, at best, you will not be able to move tensors to CUDA, and, at worst, you will see weird linker errors like

gcc: error: hasktorch/dist-newstyle/build/x86_64-linux/ghc-8.8.3/libtorch-ffi-1.5.0.0/build/Torch/Internal/Unmanaged/Autograd.dyn_o: No such file or directory
`cc' failed in phase `Linker'. (Exit code: 1)

Contributing

We welcome new contributors.

Contact us for access to the hasktorch Slack channel. You can send an email to [email protected] or reach out on Twitter to @austinvhuang, @SamStites, @tscholak, or @junjihashimoto3.

Notes for library developers

See the wiki for developer notes.

Project Folder Structure

Basic functionality:

  • deps/ - submodules and downloads for build dependencies (libtorch, mklml, pytorch) -- you can ignore this if you are on Nix
  • examples/ - high level example models (xor mlp, typed cnn, etc.)
  • experimental/ - experimental projects or tips
  • hasktorch/ - higher level user-facing library, calls into ffi/, used by examples/

Internals (for contributing developers):

  • codegen/ - code generation, parses Declarations.yaml spec from pytorch and produces ffi/ contents
  • inline-c/ - submodule to inline-cpp fork used for C++ FFI
  • libtorch-ffi/ - low-level FFI bindings to libtorch
  • spec/ - specification files used for codegen/

hasktorch's People

Contributors

adlucem, alejandrocatalina, alexander-myltsev, andreasabel, andredaprato, apaszke, austinvhuang, ayush1999, cdepillabout, championballer, chetank99, cmdv, dataopt, jul1u5, junjihashimoto, kiaragrouwstra, mcwitt, minimario, mitchellwrosen, mmesch, momohatt, o1lo01ol1o, reep236, saeedhk, sami-badawi, stites, surajk7, tscholak, tstat, wavewave


hasktorch's Issues

instances of Functor & Comonad for `Tensor d`

It looks like this is isomorphic to the current implementation:

type Tensor d = Tagged d Dynamic

from tagged. Tagged is also a Comonad, which makes me think that perhaps we can get a comonad out of the dimensions in Tensor if we can get a functor out of Dynamic. This might require something fancy. Opening and closing this issue, since it's not actionable.
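
For reference, the structure in question is small enough to sketch directly. The code below defines a local stand-in for Data.Tagged (rather than importing the tagged package) and spells out its Functor and comonad operations:

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Local stand-in for Data.Tagged from the tagged package.
-- d is a phantom type, so Functor only touches the payload.
newtype Tagged d a = Tagged a deriving (Functor, Show)

-- The comonadic operations are trivial for Tagged:
extract :: Tagged d a -> a
extract (Tagged a) = a

duplicate :: Tagged d a -> Tagged d (Tagged d a)
duplicate t = Tagged t

-- Under this encoding, a statically indexed tensor would be:
-- type Tensor d = Tagged d Dynamic
```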

TensorRawSpec does not play well with QuickCheck

I was writing a few property checks for TensorRaw and got the following error:

$ Error: $ Torch: not enough memory: you tried to allocate 92488408GB. Buy new RAM! at path/to/hasktorch/vendor/TH/THGeneral.c:270

This might not be a problem for other tensors.

Codegen THHalf* function to use CTHHalf arguments

I'm not sure if this is actually the correct thing to do (I haven't dug too deeply into codegen yet), but some sed replacements let this compile, and, semantically, using CTHHalf at this stage might be correct.

This would also let us use hpack in raw without too much pain

Reduce number of packages in signatures/

Currently, tree outputs:

signatures
├── floating
│   ├── exe
│   │   └── Main.hs
│   ├── hasktorch-signatures-floating.cabal
│   └── src
│       └── Torch
│           └── Sig
│               ├── NN.hsig
│               ├── Tensor
│               │   ├── Math
│               │   │   ├── Blas.hsig
│               │   │   ├── Floating.hsig
│               │   │   ├── Lapack.hsig
│               │   │   ├── Pointwise
│               │   │   │   └── Floating.hsig
│               │   │   ├── Random.hsig
│               │   │   └── Reduce
│               │   │       └── Floating.hsig
│               │   └── RandomFloating.hsig
│               └── Types
│                   └── NN.hsig
├── random-th
│   ├── exe
│   │   └── Main.hs
│   ├── hasktorch-signatures-random-th.cabal
│   └── src
│       └── Torch
│           └── Sig
│               └── TH
│                   └── Tensor
│                       ├── Math
│                       │   └── Random.hsig
│                       └── Random.hsig
├── random-thc
│   ├── exe
│   │   └── Main.hs
│   ├── hasktorch-signatures-random-thc.cabal
│   └── src
│       └── Torch
│           └── Sig
│               └── THC
│                   └── Tensor
│                       └── Random.hsig
├── signed
│   ├── exe
│   │   └── Main.hs
│   ├── hasktorch-signatures-signed.cabal
│   └── src
│       └── Torch
│           └── Sig
│               └── Tensor
│                   └── Math
│                       └── Pointwise
│                           └── Signed.hsig
├── types
│   ├── hasktorch-signatures-types.cabal
│   └── src
│       └── Torch
│           └── Sig
│               ├── State.hsig
│               ├── Storage
│               │   └── Memory.hsig
│               ├── Tensor
│               │   └── Memory.hsig
│               ├── Types
│               │   └── Global.hsig
│               └── Types.hsig
└── unsigned
    ├── exe
    │   └── Main.hs
    ├── hasktorch-signatures-unsigned.cabal
    └── src
        └── Torch
            └── Sig
                ├── Blas.hsig
                ├── Storage
                │   └── Copy.hsig
                ├── Storage.hsig
                ├── Tensor
                │   ├── Conv.hsig
                │   ├── Copy.hsig
                │   ├── Index.hsig
                │   ├── Masked.hsig
                │   ├── Math
                │   │   ├── Compare.hsig
                │   │   ├── CompareT.hsig
                │   │   ├── Pairwise.hsig
                │   │   ├── Pointwise.hsig
                │   │   ├── Reduce.hsig
                │   │   └── Scan.hsig
                │   ├── Math.hsig
                │   ├── Mode.hsig
                │   ├── Random.hsig
                │   ├── ScatterGather.hsig
                │   ├── Sort.hsig
                │   └── TopK.hsig
                ├── Tensor.hsig
                └── Vector.hsig

@sam @stites, I wonder, could all of these .hsig files live in the same cabal library for simplicity?

Break dependency on GHC-8.4 for release

I believe this dependency comes from singletons-2.4.0, which we actually use very little of. There was a soft attempt at backwards compatibility in Torch/Indef/Static/Tensor.hs -- this needs to be more pervasive if we want to release hasktorch to Hackage (which only does builds up to 8.2.2) and Stackage (if we want it in an LTS like 11).

Breaking the dependency on ghc-8.4 stops the dependency on the mstksg/type-combinators#ghc-8.4 git submodule and frees us from laurmcarter/type-combinators#9.

Read through Backprop as Functor, report how it fits into hasktorch

I'm imagining that "Normalization layer" can be expressed in a neural network as some permutation of the following api change:

data ConvLayer h w c = ConvLayer
  { conv :: Tensor '[h, w, c]
  }

-- to the following pseudo-haskell:
data ConvLayer h w c = ConvLayer
  { conv :: Normalize (Batch (Tensor '[h, w, c]))
  }

This would be akin to taking current deep learning patterns and codifying them into monoidal categories.

Should hasktorch 2.0 be pure?

Until things stabilize, I don't think it's a good idea to do this, but in some future we can treat the IO calls as pure and unsafePerformIO them. This would be similar to what hmatrix (and others) do. I haven't seen enough of this in the wild to say if this is a good or bad idea for the FFI calls that happen in this library, and I am very nervous about it given how early on hasktorch is (and all of the future changes which have to happen). This issue will probably be a long-standing one to entertain the idea.
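
The hmatrix-style approach alluded to above could look roughly like this. It is a hypothetical sketch: addScalarIO stands in for a real FFI binding that is observably pure.

```haskell
import System.IO.Unsafe (unsafePerformIO)

-- Stand-in for an FFI call with no observable side effects.
addScalarIO :: Double -> Double -> IO Double
addScalarIO x y = pure (x + y)

-- Expose it as a pure function; NOINLINE keeps GHC from
-- duplicating or floating out the unsafePerformIO call.
addScalar :: Double -> Double -> Double
addScalar x y = unsafePerformIO (addScalarIO x y)
{-# NOINLINE addScalar #-}
```

The safety of this pattern hinges on the wrapped call being referentially transparent, which is exactly what is hard to guarantee while the FFI layer is still in flux.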

Allow scalars to be 0-rank

We currently dismiss 0-rank tensors and have a few places where we throw pure exceptions when trying to access them. Without needing to move into a MonadThrow, we can fix this by allowing scalars to be represented as 0-rank tensors.

Ref: ATen readme
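
One way to picture the change, using a hypothetical flat representation rather than hasktorch's actual internals: a scalar is simply a tensor whose dimension list is empty.

```haskell
-- Hypothetical flat tensor representation, for illustration only.
data Tensor = Tensor { dims :: [Int], elems :: [Double] }
  deriving Show

-- A rank-0 tensor: no dimensions, exactly one element.
scalar :: Double -> Tensor
scalar x = Tensor { dims = [], elems = [x] }

-- Total accessor instead of a pure exception:
fromScalar :: Tensor -> Maybe Double
fromScalar (Tensor [] [x]) = Just x
fromScalar _               = Nothing
```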

Make fromList take an infinite list

Might be a bit tricky, but it would be nice to say something like:

main :: IO ()
main = do
  t :: DoubleTensor '[1,2,3] <- fromList [1..]
  printTensor t

Currently the types take precedence, so it actually doesn't matter what you put into the list: if the list is too short, it will be filled with (I assume) whatever was last in memory, whereas if the list is too long it will be truncated.

Edit: fromList is now pure and returns a Maybe type depending on the runtime length of the list.
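
A minimal sketch of that length-checked behaviour, using plain lists in place of tensors (the function name is hypothetical):

```haskell
-- Hypothetical: accept a (possibly infinite) list, succeeding only if it
-- can supply exactly the n elements the static shape demands.
fromListChecked :: Int -> [a] -> Maybe [a]
fromListChecked n xs =
  case splitAt n xs of
    (ys, _) | length ys == n -> Just ys  -- enough elements; extras truncated
    _                        -> Nothing  -- list ran out early

-- fromListChecked 6 [1..] succeeds on an infinite list because
-- splitAt only forces the first n elements.
```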

Split mutable code into newtypes like vector

with freeze = copy and thaw = newtypeUnwrap, to ensure that we are using unsafePerformIO safely.

On that note, maybe we should consider the entire vector design for dealing with the static-dynamic code duplication problem.
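
In the style of Data.Vector's mutable/immutable split, the idea might look like this (an IORef serves as stand-in storage; all names are hypothetical):

```haskell
import Data.IORef

newtype MTensor = MTensor (IORef [Double])  -- hypothetical mutable wrapper
newtype Tensor  = Tensor [Double]           -- hypothetical immutable view

-- freeze copies out of the mutable cell, so later mutation
-- cannot be observed through the frozen value.
freeze :: MTensor -> IO Tensor
freeze (MTensor ref) = Tensor <$> readIORef ref

-- thaw copies into a fresh mutable cell for in-place updates.
thaw :: Tensor -> IO MTensor
thaw (Tensor xs) = MTensor <$> newIORef xs
```

Copying on both freeze and thaw is what makes wrapping the mutable operations in unsafePerformIO defensible: no mutable state is ever shared with an immutable value.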

Show instance for tensors

Two approaches to a Show instance for tensors:

  • Better solution: use PyTorch's string-formatting functions as a template.
  • Easy solution: convert to hmatrix and reuse the hmatrix Show instances.

add raw bindings for THCUNN

Next after #63. Note that this will require a cabal flag, and the CUDA bindings might have to be moved to a separate subpackage (hasktorch-cuda) so that people can choose not to compile them if they don't have the right CUDA library files.

Tools for visualizing models

Starting points for this include:

  • servant client for visdom
  • use crayon-hs, and/or use haskell tensorflow API to tensorboard

generate cabal files with hpack and package.yaml

Since it looks like this project is developed with stack in mind, I'm wondering if you've considered using hpack to generate the cabal files. stack runs hpack on every build, so it adds no extra dependencies, but it does require checking both package.yaml and the .cabal files into git.

Some of that boilerplate includes autogeneration of the other-modules and exposed-modules fields.
