How to install CUDA 8.0 for the latest TensorFlow (1.0) on an AWS p2.xlarge instance (AMI ami-edb11e8d) with the latest NVIDIA driver (375.39)

I upgraded to TensorFlow 1.0, installed CUDA 8.0 with cuDNN 5.1, and updated the NVIDIA driver to the latest version, 375.39. My NVIDIA hardware is a Tesla K80 on Amazon Web Services, using a p2.xlarge instance. My OS is 64-bit Linux. The session below shows what happens when I call tf.Session():

[[email protected] CUDA]$ python 
Python 2.7.12 (default, Sep 1 2016, 22:14:00) 
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import tensorflow as tf 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally 
>>> sess = tf.Session() 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. 
E tensorflow/stream_executor/cuda/cuda_driver.cc:509] failed call to cuInit: CUDA_ERROR_NO_DEVICE 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: ip-172-31-7-96 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: ip-172-31-7-96 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Invalid argument: expected %d.%d or %d.%d.%d form for driver version; got "1" 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:363] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.39 Tue Jan 31 20:47:00 PST 2017 
GCC version: gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) 
""" 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 375.39.0

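I am completely clueless about how to fix this issue. I get the error message above every time I run the command. I have tried different versions of the NVIDIA driver and CUDA, but it still does not work.

Thank you for your understanding.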
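
For readers comparing against the log above: the diagnostics print the kernel module version (375.39) but fail to parse a version for libcuda, reporting "1". The sketch below shows one way to extract both values for a side-by-side check. It assumes the driver version file lives at /proc/driver/nvidia/version and that the "1" came from the bare library name libcuda.so.1; the second point is a guess about how the diagnostic works, not something the log confirms. The function names are made up for this example.

```python
import re

# Assumed location of the NVIDIA kernel module version file on Linux.
DRIVER_VERSION_FILE = "/proc/driver/nvidia/version"


def kernel_module_version(text):
    """Return the driver version from an 'NVRM version:' line, or None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def libcuda_version(filename):
    """Return the version suffix of a name like 'libcuda.so.375.39', or None.

    A bare name such as 'libcuda.so.1' carries no driver version.
    """
    match = re.match(r"libcuda\.so\.(\d+\.\d+(?:\.\d+)?)$", filename)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        with open(DRIVER_VERSION_FILE) as handle:
            print("kernel module: %s" % kernel_module_version(handle.read()))
    except IOError:  # the file is absent when the kernel module is not loaded
        print("no NVIDIA kernel module version file found")
```

If the two versions differ, or libcuda has no versioned file at all, a leftover or partial driver install is one possible explanation.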

Source

2017-02-23 basuam

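Your GPU driver is probably not installed correctly. What is the result of running 'nvidia-smi'? Did you run the verification of the CUDA install as described in the [cuda linux install guide](http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#abstract)? –

Thank you for the quick reply. nvidia-smi works, and I did not follow the "verification" steps described on the website. I decided to start from scratch on a Red Hat 7.3 system, and it worked on the first try, so I did not need any further help. – basuam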

Uninstall the driver and CUDA, then reinstall them by following the official guide.

Run deviceQuery to check that the device is installed correctly.
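
As a rough sketch, both checks mentioned in this thread (nvidia-smi and deviceQuery) can be scripted so that a missing binary is reported instead of raising. The helper name `run_check` and the `DEVICE_QUERY` environment variable are made up for this example, and the path to a built deviceQuery binary is an assumption that depends on where the CUDA samples were compiled.

```python
import os
import subprocess


def run_check(command):
    """Run a diagnostic command; return (ok, output) without raising."""
    try:
        proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output, _ = proc.communicate()
    except OSError as exc:  # binary is missing or not executable
        return False, str(exc)
    return proc.returncode == 0, output.decode("utf-8", "replace")


if __name__ == "__main__":
    # DEVICE_QUERY is a hypothetical variable pointing at a built deviceQuery binary.
    device_query = os.environ.get("DEVICE_QUERY", "./deviceQuery")
    for command in (["nvidia-smi"], [device_query]):
        ok, output = run_check(command)
        print("%s: %s" % (command[0], "OK" if ok else "FAILED"))
        if not ok:
            print(output)
```

A zero exit status is treated as success here; reading the printed output is still worthwhile, since a tool can exit cleanly while reporting no devices.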

Source

2017-02-23 19:18:00 tblue

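Thank you for your reply. As you suggested, I ran deviceQuery after installing everything from scratch. I created another instance using Red Hat 7.3 and spent some time updating all the packages. In the end, it is working fine. – basuam

Glad it worked! – tblue

You need to install the NVIDIA driver and run the CUDA 8.0 installer.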

# Requirements
# - NVIDIA Driver - NVIDIA-Linux-x86_64-375.39.run - http://www.nvidia.fr/Download/index.aspx
# - CUDA runfile (local) - cuda_8.0.61_375.26_linux.run - https://developer.nvidia.com/cuda-downloads
# - cudnn-8.0-linux-x64-v5.0-ga.tgz

# Update the system, then install the build tools and the extra kernel modules package
sudo apt update -y && sudo apt upgrade -y
sudo apt install build-essential linux-image-extra-`uname -r` -y

# Install the NVIDIA driver
chmod +x NVIDIA-Linux-x86_64-375.39.run
sudo ./NVIDIA-Linux-x86_64-375.39.run

# Extract the CUDA runfile and run the extracted toolkit installer
chmod +x cuda_8.0.61_375.26_linux.run
./cuda_8.0.61_375.26_linux.run --extract=`pwd`/extracts
sudo ./extracts/cuda-linux64-rel-8.0.61-21551265.run

# Add CUDA to the shell environment
echo -e "export CUDA_HOME=/usr/local/cuda\nexport PATH=\$PATH:\$CUDA_HOME/bin\nexport LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:\$CUDA_HOME/lib64" >> ~/.bashrc
source ~/.bashrc

# Unpack cuDNN and copy it into the CUDA directory
tar xf cudnn-8.0-linux-x64-v5.0-ga.tgz
cd cuda
sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/cudnn.h /usr/local/cuda/include/
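
After the copy step, it can be useful to confirm which cuDNN version actually ended up in the CUDA directory. The following is a minimal sketch that reads the version macros from the header; it assumes that cudnn.h defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` and that the header sits at the path used above. The function name `cudnn_version` is made up for this example.

```python
import re

# Path used by the copy step above; adjust if CUDA is installed elsewhere.
CUDNN_HEADER = "/usr/local/cuda/include/cudnn.h"


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn.h text, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % name,
                          header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    try:
        with open(CUDNN_HEADER) as handle:
            print("cuDNN version: %s" % (cudnn_version(handle.read()),))
    except IOError:  # header not found at the assumed path
        print("cudnn.h not found at %s" % CUDNN_HEADER)
```

The result can then be compared with the cuDNN version that the installed TensorFlow build expects.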

Source

2017-03-22 14:34:05 Kmaschta

I thought cudnn 8 was not supported by TensorFlow yet – Goddard

You can also try the "NVIDIA Volta Deep Learning AMI" on a p3 (V100 GPU) instance.

Sign up at https://www.nvidia.com/en-us/gpu-cloud/?ncid=van-gpu-cloud and get an "API key" to use the AMI for free.

EC2/GPU configuration information: https://aws.amazon.com/blogs/aws/new-amazon-ec2-instances-with-up-to-8-nvidia-tesla-v100-gpus-p3/

Source

2017-10-28 15:45:44 Setogit
