softwareunderground / geo-docker


A docker image fully loaded with Geo* & ML related packages

License: Other

Makefile 18.69% Batchfile 8.37% Dockerfile 72.94%

geo-docker's Introduction

DEPRECATED: please use https://github.com/softwareunderground/subsurface-ml-docker instead.

geo-docker

A docker image fully loaded with Geo* & ML related packages.

Clone this repo into a folder alongside the other repos or folders containing the code you want to execute.
Build and launch a container with a Python notebook attached using make notebook.
In the notebook, browse to the /workspace folder. You should see the contents of the parent directory mounted as a volume, which means the container has read/write access to it.
A folder ~/Datasets is also mounted in the container at /data.

To change any of the mounted paths, or to add more, edit the Makefile.
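As a rough illustration of what the two mounts amount to, the sketch below prints a docker command with equivalent -v options. It is not copied from the Makefile: the image tag is a hypothetical placeholder, the helper name print_run_cmd is made up for this example, and port 8888 is borrowed from the legacy docker run command further down this README.

```shell
# Sketch only: the real command lives in the Makefile and may differ.
# IMAGE is a hypothetical tag, not the name the Makefile actually uses.
IMAGE="geo-docker-example"

# Print (rather than run) a docker command that mounts two host folders.
#   $1 = host folder mounted at /workspace
#   $2 = host folder mounted at /data
#   $3 = image tag
print_run_cmd() {
    printf 'docker run -it -p 8888:8888 -v %s:/workspace -v %s:/data %s\n' "$1" "$2" "$3"
}

# Parent of the current (repo) directory and ~/Datasets, as described above.
print_run_cmd "$(dirname "$PWD")" "$HOME/Datasets" "$IMAGE"
```

Printing the command first makes it easy to check the host paths before anything is actually mounted.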

At the moment, the configuration has first-class setup for Keras, as that is where it started out.

What's in the Box

A full Anaconda install is huge, and we are adding to that with common ML and geo packages. To stop the image getting too bloated we have stuck with a Miniconda base image, which means we need to be explicit about what we add, but we only get what we want.

Some attempt has been made to split the Dockerfile into sections, in the hope that this makes it easier for people to customise it to their needs.

The container currently holds:

loads of packages
that probably should be listed
somewhere, maybe here
.....

Missing stuff

Here is a list of packages that were not included initially, maybe these should be turned into issues! :)

GPRMax package
noddy executable for pynoddy
torch & pytorch
pygimli
fatiando
cupy - slow to install
devito

There are some other things it would be nice to do too:

add a dead simple example notebook that at least exercises the GPU via tensorflow, maybe even just an XOR or MNIST example or something.
set up tensorboard properly with jupyter-tensorflow and provide an example of how to use it

Installing Docker

General installation instructions are on the Docker site, but we give some quick links here:

Running the container

We use a Makefile to wrap the docker commands in simpler make commands.

Build the container and start a Jupyter Notebook

$ make notebook

Build the container and start an IPython shell

$ make ipython

Build the container and start a bash shell

$ make bash

For GPU support, install the NVIDIA drivers (ideally the latest) and nvidia-docker, then run:

$ make notebook GPU=0 # or [ipython, bash]

Switch keras between Theano and TensorFlow

$ make notebook BACKEND=theano
$ make notebook BACKEND=tensorflow

Mount a volume for external data sets

$ make DATA=~/mydata

Print all make tasks

$ make help

You can change Theano parameters by editing /docker/theanorc.

Note: If you have problems running nvidia-docker, you can fall back to the older approach shown below, although it is not recommended. If you find a bug in nvidia-docker, please report it to that project and keep trying to use nvidia-docker as described above.

$ export CUDA_SO=$(\ls /usr/lib/x86_64-linux-gnu/libcuda.* | xargs -I{} echo '-v {}:{}')
$ export DEVICES=$(\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')
$ docker run -it -p 8888:8888 $CUDA_SO $DEVICES gcr.io/tensorflow/tensorflow:latest-gpu

License

MIT

Credits

This Docker and Makefile layout was originally based on the Docker starter example in the keras repo. The Dockerfile in particular has been customised to make it easier to see groups of related packages and to add or remove them as necessary. The Makefile and the instructions in this README are pretty much as-is, and lovely. The original repository is available under the MIT license here

geo-docker's People

Contributors

Stargazers

Watchers

Forkers

justingosses jesperdramsch dfcastap aashish24

geo-docker's Issues

Geostatistics with gslib

If you can get pygslib working.

https://github.com/opengeostat/pygslib

CUDNN

This repository gets CUDA and cuDNN 7 from the official combined image and then downgrades cuDNN 7, because NVIDIA's image ships their latest version.

Instead of pulling the combined image, which is here:
https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/devel/cudnn7/Dockerfile

FROM ${repository}:9.0-devel-ubuntu16.04
ENV CUDNN_VERSION 7.1.2.21
RUN apt-get update && apt-get install -y --no-install-recommends \
    libcudnn7=$CUDNN_VERSION-1+cuda9.0 \
    libcudnn7-dev=$CUDNN_VERSION-1+cuda9.0 && \
    rm -rf /var/lib/apt/lists/*

Our Dockerfile then essentially repeats that RUN step to downgrade cuDNN 7. We could save some build time by changing the FROM to the CUDA-only image and installing just the cuDNN 7 version we need.

Bokeh toolchain for plotting

Add the following packages:

Bokeh
Datashader
Holoviews
Geoviews

Jupyter lab

This/Notebook + some script to autostart on the current folder (sharing it to Docker) would be amazing.

Check pip installs

Hi,

The Dockerfile uses conda. Does running pip fail alongside conda? I may be wrong.

https://fmgdata.kinja.com/using-docker-with-conda-environments-1790901398

Should we replace every RUN pip install ... with an explicit RUN /bin/bash -c "pip install ..."? (The command needs quoting, otherwise bash -c only receives pip.)
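One way to investigate is to check, inside the container, whether pip and python resolve to the same bin directory. The sketch below assumes that sharing a directory (for example /opt/conda/bin) is a reasonable sign that pip installs into the conda Python; the helper name same_prefix is made up for this example.

```shell
# Succeeds when two executables live in the same directory.
#   $1, $2 = full paths to the executables
same_prefix() {
    [ "$(dirname "$1")" = "$(dirname "$2")" ]
}

# Resolve pip and python from the current PATH (empty if not found).
pip_path="$(command -v pip || true)"
python_path="$(command -v python || true)"

if [ -n "$pip_path" ] && [ -n "$python_path" ] && same_prefix "$pip_path" "$python_path"; then
    echo "pip and python share a bin directory: $(dirname "$pip_path")"
else
    echo "pip ($pip_path) and python ($python_path) do not share a bin directory"
fi
```

If the two paths differ, packages installed by pip may land outside the conda environment, which would be worth fixing in the Dockerfile's PATH rather than in each RUN line.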

PyTorch

PyTorch would be awesome. I'm so sick of trying to get caffe to compile on my mac.

Add Devito

We have a wish list item to get Devito added to the image.

https://github.com/opesci/devito

Make getting started on windows really easy

Users should not necessarily have to install make to get access to the convenience commands on Windows. Let's add some .bat files to make that easy.

The README should also be updated to give some per-platform instructions.

fix documentation

The docs are terrible; the README doesn't even explain properly how to get started.
Add / improve

quick start section for getting running from dockerhub - with key commands
getting started from this repo section - with key startup commands
a section on customising the dockerfile
how to contribute / add a package
license

Use conda-forge channel

If possible. Usually they are more up-to-date.

Misleading commands and vars in makefile

I borrowed these couple of files from the keras repo, which was a good starting point, but I think some of the Makefile vars are unused and misleading, and the same goes for some of the info in the README. For example, I think GPU support is on all the time, so is the GPU flag needed at all?

I have not tested this yet, so I am raising an issue. We should simplify the commands in the Makefile and clean up the README.

I also think the CUDA and cuDNN versions should be managed in the Dockerfile and not in the Makefile. Since we have to do special things like downgrading cuDNN to get this to work, we might as well keep all of that in the Dockerfile.

This issue should be about cleaning this up: removing redundant commands, vars and README items, and making things consistent.

Should we rename / move to a new repo?

It turns out geo-docker is a very active organisation creating a geospatial / GIS type server thingy.

So we have a name conflict. We could fix this early, before lots of people start using this repo, by moving to a new repo and an associated slot on dockerhub. If we are going to do this, the earlier the better.

A possible new name is gng-ml-docker, but I am open to any suggestions.

Add GPRMax package

http://www.gprmax.com/

Cannot run tensorboard

I have been trying to run tensorboard on startup via the dockerfile. I get problems there and the same problems when trying directly via bash.

ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

Tensorflow works and can find the GPU fine, so CUDA is installed OK, but tensorboard can't find it.

Doing a which tensorboard reveals /opt/conda/bin/tensorboard

I am not sure if that is right. We install tensorflow via pip.

As tensorflow works, this seems like a path or configuration problem, or maybe we have multiple tensorboards?

Has anyone hit this issue with tensorboard elsewhere?
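A quick way to narrow this down from a bash session in the container is to check which directories actually contain libcuda.so.1. The sketch below searches the entries of LD_LIBRARY_PATH plus two extra locations: /usr/lib/x86_64-linux-gnu appears in the legacy docker run command in the README, while /usr/local/nvidia/lib64 is an assumption about where nvidia-docker mounts the driver libraries. The helper name find_lib is made up for this example.

```shell
# Print the first path where a library file exists and succeed, else fail.
#   $1 = library file name, remaining arguments = directories to search
find_lib() {
    name="$1"
    shift
    for dir in "$@"; do
        if [ -e "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Split LD_LIBRARY_PATH on colons and append the two extra directories.
old_ifs="$IFS"
IFS=":"
set -- $LD_LIBRARY_PATH /usr/lib/x86_64-linux-gnu /usr/local/nvidia/lib64
IFS="$old_ifs"

find_lib libcuda.so.1 "$@" || echo "libcuda.so.1 not found in the searched directories"
```

If the library only turns up in a directory that is missing from LD_LIBRARY_PATH, that would support the path or configuration theory above.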

Support for parquet

Support for parquet would be great. It works really well with pandas/dask.

https://github.com/dask/fastparquet

or(/and)

https://arrow.apache.org/docs/python/

Add TPOT plus its dependencies

conda install numpy scipy scikit-learn pandas
pip install deap update_checker tqdm stopit
pip install xgboost
pip install scikit-mdr skrebate
pip install tpot