python – Multiple GPUs (1080Ti) do not speed up training in TensorFlow, tested with the cifar10_estimator code

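I tried to measure the performance of the multi-GPU version of cifar10_estimator on 2 or 3 1080Ti cards, but I did not get a speedup.

I found some useful information about the hardware here, but I am still confused about how to fix it.

My environment: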

> Ubuntu VERSION = 16.04.5 LTS(Xenial Xerus)
> Python3
> CUDA_VERSION = 9.0.176
> tensorflow-gpu = 1.11.0

GPU information:

nvidia-smi topo -m

        GPU0  GPU1  GPU2  GPU3  GPU4  GPU5  GPU6  GPU7  CPU Affinity
GPU0     X    PIX   PHB   PHB   SYS   SYS   SYS   SYS   0-7
GPU1    PIX    X    PHB   PHB   SYS   SYS   SYS   SYS   0-7
GPU2    PHB   PHB    X    PIX   SYS   SYS   SYS   SYS   0-7
GPU3    PHB   PHB   PIX    X    SYS   SYS   SYS   SYS   0-7
GPU4    SYS   SYS   SYS   SYS    X    PIX   PHB   PHB   8-15
GPU5    SYS   SYS   SYS   SYS   PIX    X    PHB   PHB   8-15
GPU6    SYS   SYS   SYS   SYS   PHB   PHB    X    PIX   8-15
GPU7    SYS   SYS   SYS   SYS   PHB   PHB   PIX    X    8-15

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing a single PCIe switch
  NV#  = Connection traversing a bonded set of # NVLinks
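
One way to read the matrix above is to group GPU pairs by link type and look for the set of GPUs whose worst link is still the closest one. The sketch below is only illustrative: `parse_topology`, `pairs_by_link`, `closest_group` and the `RANK` ordering are hypothetical helpers written for this post, not part of nvidia-smi or TensorFlow, and the ranking simply follows the legend from closest (PIX) to farthest (SYS).

```python
from itertools import combinations

# GPU-to-GPU part of the `nvidia-smi topo -m` matrix shown above
# (the CPU Affinity column is left out).
TOPO = """
GPU0 X PIX PHB PHB SYS SYS SYS SYS
GPU1 PIX X PHB PHB SYS SYS SYS SYS
GPU2 PHB PHB X PIX SYS SYS SYS SYS
GPU3 PHB PHB PIX X SYS SYS SYS SYS
GPU4 SYS SYS SYS SYS X PIX PHB PHB
GPU5 SYS SYS SYS SYS PIX X PHB PHB
GPU6 SYS SYS SYS SYS PHB PHB X PIX
GPU7 SYS SYS SYS SYS PHB PHB PIX X
"""

# Assumed ordering, closest link first, taken from the legend.
RANK = {"PIX": 0, "PXB": 1, "PHB": 2, "NODE": 3, "SYS": 4}


def parse_topology(text):
    """Return {(i, j): link_type} for every GPU pair with i < j."""
    links = {}
    for row, line in enumerate(text.strip().splitlines()):
        cells = line.split()[1:]  # drop the leading "GPUn" row label
        for col, cell in enumerate(cells):
            if row < col:
                links[(row, col)] = cell
    return links


def pairs_by_link(links, kind):
    """List the GPU pairs connected by the given link type."""
    return sorted(pair for pair, link in links.items() if link == kind)


def closest_group(links, num_gpus, size):
    """Pick `size` GPUs (size >= 2) whose worst pairwise link is the closest."""
    def worst(group):
        return max(RANK[links[pair]] for pair in combinations(group, 2))
    return min(combinations(range(num_gpus), size),
               key=lambda group: (worst(group), group))


links = parse_topology(TOPO)
print(pairs_by_link(links, "PIX"))   # [(0, 1), (2, 3), (4, 5), (6, 7)]
print(closest_group(links, 8, 2))    # (0, 1)
print(closest_group(links, 8, 3))    # (0, 1, 2)
```

For this machine, any 3-GPU run has to cross at least one PHB link, and mixing GPUs 0-3 with GPUs 4-7 would cross the SYS link between the two NUMA nodes. If the selection is applied through `CUDA_VISIBLE_DEVICES`, note that CUDA's default device order is not guaranteed to match the nvidia-smi numbering unless `CUDA_DEVICE_ORDER=PCI_BUS_ID` is also set.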

1 GPU, batch_size = 128

INFO:tensorflow:loss = 2.2576141, step = 200 (3.729 sec)
INFO:tensorflow:learning_rate = 0.1, loss = 2.2576141 (3.729 sec)
INFO:tensorflow:Average examples/sec: 2821.06 (2858.65), step = 200
INFO:tensorflow:Average examples/sec: 2847.23 (3496.06), step = 210
INFO:tensorflow:Average examples/sec: 2857.91 (3102.29), step = 220
INFO:tensorflow:Average examples/sec: 2867.04 (3083.62), step = 230
INFO:tensorflow:Average examples/sec: 2889.21 (3514.15), step = 240
INFO:tensorflow:Average examples/sec: 2913.15 (3636.28), step = 250
INFO:tensorflow:Average examples/sec: 2915.99 (2988.94), step = 260
INFO:tensorflow:Average examples/sec: 2901.94 (2578.95), step = 270
INFO:tensorflow:Average examples/sec: 2888.87 (2575.46), step = 280
INFO:tensorflow:Average examples/sec: 2892.13 (2986.66), step = 290
INFO:tensorflow:global_step/sec: 24.25

2 GPUs, batch_size = 256

INFO:tensorflow:loss = 2.4630964, step = 200 (5.971 sec)
INFO:tensorflow:learning_rate = 0.1, loss = 2.4630964 (5.971 sec)
INFO:tensorflow:Average examples/sec: 3255.68 (4296.71), step = 200
INFO:tensorflow:Average examples/sec: 3297.51 (4437.93), step = 210
INFO:tensorflow:Average examples/sec: 3332.15 (4275.33), step = 220
INFO:tensorflow:Average examples/sec: 3363.86 (4254.65), step = 230
INFO:tensorflow:Average examples/sec: 3395.09 (4316.94), step = 240
INFO:tensorflow:Average examples/sec: 3418.44 (4094.23), step = 250
INFO:tensorflow:Average examples/sec: 3447.17 (4364.24), step = 260
INFO:tensorflow:Average examples/sec: 3474.56 (4379.02), step = 270
INFO:tensorflow:Average examples/sec: 3492.73 (4067.13), step = 280
INFO:tensorflow:Average examples/sec: 3514.19 (4244.23), step = 290
INFO:tensorflow:global_step/sec: 16.6026

3 GPUs, batch_size = 384

INFO:tensorflow:loss = 2.0980535, step = 200 (9.329 sec)
INFO:tensorflow:learning_rate = 0.1, loss = 2.0980535 (9.329 sec)
INFO:tensorflow:Average examples/sec: 3214.65 (4165.7), step = 200
INFO:tensorflow:Average examples/sec: 3272.85 (5130.99), step = 210
INFO:tensorflow:Average examples/sec: 3324.15 (4955.13), step = 220
INFO:tensorflow:Average examples/sec: 3376.65 (5174.76), step = 230
INFO:tensorflow:Average examples/sec: 3425.48 (5132.15), step = 240
INFO:tensorflow:Average examples/sec: 3468.29 (4954.35), step = 250
INFO:tensorflow:Average examples/sec: 3509.91 (5014.23), step = 260
INFO:tensorflow:Average examples/sec: 3544.29 (4755.56), step = 270
INFO:tensorflow:Average examples/sec: 3579.69 (4901.39), step = 280
INFO:tensorflow:Average examples/sec: 3617.84 (5156.66), step = 290
INFO:tensorflow:global_step/sec: 13.1009
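
Because the batch size changes between runs, `global_step/sec` alone is hard to compare across the three logs. A rough sketch of the arithmetic, assuming the logged batch size is the global batch (split across all GPUs) and that throughput is simply batch size times steps per second; the function names are made up for illustration:

```python
# (num_gpus, global batch size, global_step/sec) copied from the three logs above.
RUNS = [(1, 128, 24.25), (2, 256, 16.6026), (3, 384, 13.1009)]


def throughput(batch_size, steps_per_sec):
    """Examples processed per second for one run."""
    return batch_size * steps_per_sec


def scaling_report(runs):
    """Return (num_gpus, examples/sec, speedup, efficiency) per run.

    Speedup is relative to the first run; efficiency is speedup / num_gpus.
    """
    base = throughput(runs[0][1], runs[0][2])
    report = []
    for num_gpus, batch_size, steps_per_sec in runs:
        eps = throughput(batch_size, steps_per_sec)
        speedup = eps / base
        report.append((num_gpus, eps, speedup, speedup / num_gpus))
    return report


for num_gpus, eps, speedup, efficiency in scaling_report(RUNS):
    print(f"{num_gpus} GPU: {eps:7.1f} examples/sec, "
          f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Under these assumptions the runs work out to roughly 3104, 4250 and 5031 examples/sec, that is, about 1.37x on 2 GPUs and 1.62x on 3 GPUs. So there is some gain, but it is far from linear, and each individual step gets slower as GPUs are added. These figures are also close to the values in parentheses in the `Average examples/sec` lines, which appear to be the current rate rather than the running average.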

Solution:

I think I can answer my own question now. If I want better performance on multiple GPUs, I should look at https://github.com/tensorflow/benchmarks/. For my test results with tf_cnn_benchmarks, see this issue.
