williamfalcon / tensorflow-gpu-install-ubuntu-16.04

Tensorflow GPU install instructions for ubuntu 16.04 - Deep learning

ubuntu tensorflow nvidia-driver nouveau tensorflow-gpu deep-learning machine-learning

tensorflow-gpu-install-ubuntu-16.04's Introduction

Tensorflow GPU install on ubuntu 16.04

These instructions are intended to set up a deep learning environment for GPU-powered tensorflow.
See here for pytorch GPU install instructions

After following these instructions you'll have:

  1. Ubuntu 16.04.
  2. Cuda 9.0 drivers installed.
  3. A conda environment with python 3.6.
  4. The latest tensorflow version with gpu support.

Step 0: Nouveau drivers

Before you begin, you may need to disable nouveau, the open-source NVIDIA driver that ships with Ubuntu.

Option 1: Modify modprobe file

  1. After you boot the linux system and are sitting at a login prompt, press ctrl+alt+F1 to get to a terminal screen. Login via this terminal screen.

  2. Create a file: /etc/modprobe.d/nouveau-blacklist.conf e.g. by

sudo touch /etc/modprobe.d/nouveau-blacklist.conf
  3. Put the following in the above file...
blacklist nouveau
options nouveau modeset=0
  4. Regenerate the kernel initramfs
sudo update-initramfs -u
  5. Reboot the system
sudo reboot
  6. On reboot, verify that the nouveau drivers are not loaded (this command should print nothing)
lsmod | grep nouveau

If nouveau driver(s) are still loaded do not proceed with the installation guide and troubleshoot why it's still loaded.
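
The check in the last step can also be scripted. The following is a minimal sketch, not part of the original instructions: the function names `loaded_modules`, `nouveau_is_loaded` and `check_nouveau` are hypothetical, and it assumes the usual `lsmod` layout of one header line followed by one module per line. Unlike `grep`, it only matches the module name column, so modules that merely list nouveau under "Used by" are not counted.

```python
import subprocess


def loaded_modules(lsmod_output):
    """Return the set of module names found in the text printed by `lsmod`."""
    lines = lsmod_output.strip().splitlines()
    # The first line is the header ("Module  Size  Used by"), so skip it.
    return {line.split()[0] for line in lines[1:] if line.strip()}


def nouveau_is_loaded(lsmod_output):
    """True if a module named exactly 'nouveau' appears in the lsmod output."""
    return "nouveau" in loaded_modules(lsmod_output)


def check_nouveau():
    """Run lsmod on this machine and report whether nouveau is still loaded."""
    output = subprocess.run(
        ["lsmod"], stdout=subprocess.PIPE, universal_newlines=True, check=True
    ).stdout
    if nouveau_is_loaded(output):
        print("nouveau is still loaded; troubleshoot before continuing")
        return False
    print("nouveau is not loaded")
    return True
```

Calling `check_nouveau()` after the reboot should print "nouveau is not loaded" if the blacklist took effect.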

Option 2: Modify Grub load command
From this stackoverflow solution

  1. When the GRUB boot menu appears: highlight the Ubuntu menu entry and press the E key. Add the nouveau.modeset=0 parameter to the end of the linux line ... Then press F10 to boot.
  2. When the login page appears, press Ctrl+Alt+F1.
  3. Enter your username and password.
  4. Uninstall all NVIDIA-related software (the pattern is quoted so the shell does not expand it):
sudo apt-get purge 'nvidia*'
sudo reboot

Installation steps

  0. Update apt-get
sudo apt-get update
  1. Install apt-get deps
sudo apt-get install openjdk-8-jdk git python-dev python3-dev python-numpy python3-numpy build-essential python-pip python3-pip python-virtualenv swig python-wheel libcurl3-dev curl
  2. Install nvidia drivers
# The 16.04 installer works with 16.10.
# download drivers
curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.0.176-1_amd64.deb

# download key to allow installation
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub

# install actual package
sudo dpkg -i ./cuda-repo-ubuntu1604_9.0.176-1_amd64.deb

# install cuda (it'll prompt to install other deps, so we install in two stages with a dep update in between)
sudo apt-get update
sudo apt-get install cuda-9-0   

2a. reboot Ubuntu

sudo reboot

2b. check nvidia driver install

nvidia-smi   

# you should see a list of gpus printed    
# if not, the previous steps failed.   
  3. Install cudnn
wget https://s3.amazonaws.com/open-source-william-falcon/cudnn-9.0-linux-x64-v7.3.1.20.tgz
sudo tar -xzvf cudnn-9.0-linux-x64-v7.3.1.20.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
  4. Add these lines to the end of ~/.bashrc:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
export PATH="$PATH:/usr/local/cuda/bin"

4a. Reload bashrc

source ~/.bashrc
  5. Install miniconda
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh   

# press s to skip terms   

# Do you approve the license terms? [yes|no]
# yes

# Miniconda3 will now be installed into this location:
# accept the location

# Do you wish the installer to prepend the Miniconda3 install location
# to PATH in your /home/ghost/.bashrc ? [yes|no]
# yes    

5a. Reload bashrc

source ~/.bashrc
  6. Create python 3.6 conda env to install tf
conda create -n tensorflow python=3.6

# press y a few times 
  7. Activate env
source activate tensorflow   
  8. Update pip (might already be up to date, but just in case...)
pip install --upgrade pip
  9. Install stable tensorflow with GPU support for python 3.6
pip install --upgrade tensorflow-gpu

# If the above fails, try the part below
# pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.0-cp36-cp36m-linux_x86_64.whl
  10. Test tf install
# start python shell   
python

# run test script   
import tensorflow as tf   

hello = tf.constant('Hello, TensorFlow!')

# when you run sess, you should see a bunch of lines with the word gpu in them (if install worked)
# otherwise, not running on gpu
sess = tf.Session()
print(sess.run(hello))

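or alternatively, in a fresh python shell (eager execution has to be enabled before any other tensorflow operations are created):

import tensorflow as tf
tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))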
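
If the test above does not report a GPU, one thing worth checking is whether the ~/.bashrc exports from step 4 are visible in the current shell. The sketch below is an illustration under that assumption only. The names `REQUIRED_DIRS`, `missing_cuda_dirs` and `report` are hypothetical helpers, and the two directories are the ones appended to LD_LIBRARY_PATH earlier in this guide.

```python
import os

# Directories that the ~/.bashrc step in this guide appends to LD_LIBRARY_PATH.
REQUIRED_DIRS = (
    "/usr/local/cuda/lib64",
    "/usr/local/cuda/extras/CUPTI/lib64",
)


def missing_cuda_dirs(ld_library_path):
    """Return the required directories that are absent from a ':'-separated path string."""
    # Normalise entries so a trailing slash does not count as a mismatch.
    entries = {os.path.normpath(p) for p in ld_library_path.split(":") if p}
    return [d for d in REQUIRED_DIRS if os.path.normpath(d) not in entries]


def report():
    """Print which directories, if any, are missing from the current environment."""
    missing = missing_cuda_dirs(os.environ.get("LD_LIBRARY_PATH", ""))
    if missing:
        print("missing from LD_LIBRARY_PATH:", ", ".join(missing))
    else:
        print("LD_LIBRARY_PATH contains the CUDA library directories")
    return missing
```

Run `report()` inside the activated conda env. A non-empty result usually means ~/.bashrc was not reloaded in that shell.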

tensorflow-gpu-install-ubuntu-16.04's People

Contributors

dylanbstorey, fractalbass, mhaghighat, msmdev, ogail, williamfalcon, wydwww

tensorflow-gpu-install-ubuntu-16.04's Issues

How to resolve "ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory" when importing tensorflow

Here is the error trace:

$ python 
Python 3.6.3 (default, May 20 2018, 18:46:07) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
Traceback (most recent call last):
  File "/home/yubrshen/.pyenv/versions/udacity_workspace/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/yubrshen/.pyenv/versions/udacity_workspace/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/yubrshen/.pyenv/versions/udacity_workspace/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/yubrshen/.pyenv/versions/3.6.3/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/yubrshen/.pyenv/versions/3.6.3/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
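
The last line of the trace says the dynamic loader could not find libcublas.so.9.0, a library that ships with the CUDA 9.0 toolkit. Note that the trace comes from a pyenv environment rather than the conda env described in the README. A way to test this outside of tensorflow is to ask the loader for the library directly. This is only a diagnostic sketch. `can_load` and `check_cuda_libraries` are hypothetical names, and it does not say why the library is missing (CUDA 9.0 not installed and LD_LIBRARY_PATH not set are the two common suspects).

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file cannot be found or cannot be opened.
        return False
    return True


def check_cuda_libraries(names=("libcublas.so.9.0",)):
    """Map each library name to whether it could be loaded in this environment."""
    return {name: can_load(name) for name in names}
```

If `check_cuda_libraries()` returns `{'libcublas.so.9.0': False}` in the same shell where the import fails, the problem is with the CUDA install or the library path, not with the tensorflow package.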

Disabling nouveau on ubuntu 16.04.3

The supplied instructions for disabling nouveau drivers on ubuntu 16.04.3 did not work for me. I had to follow the NVIDIA instructions to achieve that (plus a restart after finishing), check instructions here. Additionally, I suggest that the README add a check to verify that the nouveau drivers are unloaded by running this cmd, which should not return anything:

lsmod | grep nouveau

Happy to submit a PR with the suggestions

Add a reboot step before running nvidia-smi

I encountered an issue when running nvidia-smi. I needed to reboot my system after installing CUDA before the command would list my GPUs. It worked after rebooting, and I continued with the process.
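
For anyone scripting this check after the reboot, here is a small sketch that lists the GPUs the driver reports. It relies on `nvidia-smi -L`, which prints one line per GPU. The line format assumed by the regular expression ("GPU 0: <name> (UUID: ...)") and the helper names `parse_gpu_list` and `list_gpus` are assumptions of this example, not part of the README.

```python
import re
import subprocess

# Assumed shape of one `nvidia-smi -L` line: "GPU 0: GeForce GTX 1080 (UUID: GPU-...)"
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: [^)]+\)\s*$")


def parse_gpu_list(text):
    """Return (index, name) pairs for every GPU line found in the given text."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2)))
    return gpus


def list_gpus():
    """Run `nvidia-smi -L`; return [] if the tool is missing or exits with an error."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except FileNotFoundError:
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)
```

An empty list from `list_gpus()` before the reboot and a non-empty one afterwards would match the behaviour described in this issue.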

Cuda compute capability

Ignoring visible gpu device (device: 0, name: GeForce GTX 660M, pci bus id: 0000:01:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
How can we solve this problem?
Thank you in advance.
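
The message itself states the cause: the GTX 660M reports compute capability 3.0, and this tensorflow build ignores anything below 3.5, so it falls back to the CPU. The sketch below only makes that comparison explicit by parsing the log line. `parse_capability`, `is_supported` and `MIN_CAPABILITY` are hypothetical names, and the minimum is taken from the message above rather than from any tensorflow API.

```python
import re

# Minimum capability quoted in the log message above.
MIN_CAPABILITY = (3, 5)


def parse_capability(log_line):
    """Extract (major, minor) from a 'compute capability: X.Y' fragment, or None."""
    match = re.search(r"compute capability: (\d+)\.(\d+)", log_line)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def is_supported(capability, minimum=MIN_CAPABILITY):
    """Compare as integer tuples so the major and minor parts are compared separately."""
    return capability >= minimum
```

With the line from this issue, `parse_capability` yields `(3, 0)` and `is_supported((3, 0))` is `False`. No step in this install guide changes that threshold.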

Beginning installing Nvidia 390 or 396

You skipped the 1st part of installing Nvidia drivers... And to check with the "nvidia-smi" command (2b)... Please tell what you'll get for cuda toolkit 9.0?

tf-nightly-gpu build

First, thanks for this, it got me up and running with minimal fuss.

While running some test models, I was running into a runtime/compile-time version mismatch error.

pip install tf-nightly-gpu

solved it for me.

cuDNN 7.1 incompatible

Thank you for the great tutorial! I followed the exact same steps and everything seems to be working fine. However, when I tried to run keras projects in PyCharm, I got an error similar to the one reported here. I am now trying to downgrade to cudnn 7.0.4 to see how it goes.

Reboot required after step 2

Hey ! It worked like a charm. Many thanks.
I just noticed that reboot is required after step 2, at least for me.
Maybe you want to update ;)
Cheers,
Edouard

Ubuntu 16.04 display is gone

I tried this on Ubuntu 16.04 and followed the steps. The display has not been active since the reboot in 2a. I am stuck now. I can still SSH to the machine, but it seems the boot configs are damaged and the system no longer recognizes the GPUs. The output of "sudo update-initramfs -u" is:

update-initramfs: Generating /boot/initrd.img-4.13.0-32-generic
/usr/bin/objcopy:/boot/efi/EFI/ubuntu/shimx64.efi: No such file or directory
/usr/bin/objcopy: --change-section-vma .initrd=0x0000000003000000 never used
/usr/bin/objcopy: --change-section-vma .linux=0x0000000000040000 never used
/usr/bin/objcopy: --change-section-vma .cmdline=0x0000000000030000 never used
/usr/bin/objcopy: --change-section-vma .osrel=0x0000000000020000 never used
update-initramfs: Failed to generate Linuxium bootscript.

cudnn needed upgrading

cudnn 7.0 is no longer compatible with the current release of tensorflow-gpu, so some jobs fail with a segmentation fault. Switching to the newest version of cudnn fixed this.
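
When comparing cudnn versions like this, it helps to know which one is actually installed. A rough sketch, assuming the header was copied to /usr/local/cuda/include/cudnn.h as in step 3 and that it defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL the way cuDNN 7 headers do. The function names are hypothetical.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h contents, or None if a macro is missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^#define\s+%s\s+(\d+)" % macro, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version(path="/usr/local/cuda/include/cudnn.h"):
    """Read the header at `path`; return None if it cannot be read or parsed."""
    try:
        with open(path, errors="replace") as header:
            return parse_cudnn_version(header.read())
    except OSError:
        return None
```

`installed_cudnn_version()` returns a tuple such as `(7, 3, 1)` for the archive used in this guide, or `None` if the header is not where step 3 put it.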
