
We can see that the GPUs are significantly faster than the CPU, with about a 4X speedup. Comparing the two GPUs, the P100 outperforms the 1080 Ti, though only with a 1.3X speedup, i.e. the time taken for training is reduced by approximately 20%.

Evaluation Metrics and Validation Loss Comparison

Since the P100 has a larger memory (5 GB more than the 1080 Ti), it is able to fit a larger batch size for the given dataset. While the largest batch size for the 1080 Ti is 32, the largest batch size for the P100 is 85. Training with these batch sizes for 2000 training steps gives the following results for the evaluation metrics and the validation loss:

As we can see from the bar chart, there is not much difference in the metrics even though we used a larger batch size for the P100. In fact, the accuracy and the other metrics are around 2% lower in the case of the P100. So we cannot conclude that a GPU with more memory and a larger batch size will give better validation accuracy.
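The link between GPU memory and maximum batch size can be sketched with a simple linear memory model. The fixed and per-sample costs below are hypothetical numbers chosen purely for illustration (they happen to reproduce the batch sizes of 32 and 85 reported above); they are not measured values from the whitepaper.

```python
def largest_batch_size(total_mem_mb, fixed_cost_mb, per_sample_mb):
    """Largest b such that fixed_cost_mb + b * per_sample_mb fits in memory."""
    return int((total_mem_mb - fixed_cost_mb) // per_sample_mb)

# Hypothetical costs: ~8 GB of model/optimizer/framework overhead,
# ~96 MB of activations per sample.
print(largest_batch_size(11264, 8192, 96))  # GTX 1080 Ti (11 GB) -> 32
print(largest_batch_size(16384, 8192, 96))  # Tesla P100 (16 GB)  -> 85
```

The point of the model is that batch size grows with the memory left over after fixed costs, which is why 5 GB of extra memory more than doubles the feasible batch size here.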

The hardware specifications of both devices are given below; the Tesla P100 has more memory than the GTX 1080 Ti.
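The headline memory figures can be captured in a small lookup, which makes the 5 GB gap mentioned above explicit. The numbers are the publicly listed memory sizes for these cards; other specifications are omitted to keep the sketch minimal.

```python
# Memory specs for the two cards under comparison.
specs = {
    "GeForce GTX 1080 Ti": {"memory_gb": 11, "memory_type": "GDDR5X"},
    "Tesla P100":          {"memory_gb": 16, "memory_type": "HBM2"},
}

extra = specs["Tesla P100"]["memory_gb"] - specs["GeForce GTX 1080 Ti"]["memory_gb"]
print(f"The P100 has {extra} GB more memory than the 1080 Ti")
```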

Performance Comparison between NVIDIA's GeForce GTX 1080 and Tesla P100 for Deep Learning

Introduction

This whitepaper aims at comparing two different pieces of hardware that are often used for Deep Learning tasks. The first is a GTX 1080 Ti GPU, a gaming device. The second is a Tesla P100 GPU, a high-end device designed for data centers, which provides high-performance computing for Deep Learning.

Comparison of Nvidia GeForce GPUs and Nvidia Tesla GPUs

"Every GPU with SM 1.3 (Tesla/GTX2xx) or better has hardware double-precision support. Starting with the Fermi architecture, Quadro and Tesla variants have better double-precision support than consumer GeForce models." The P100 is best at double precision (FP64), the RTX 6000 is modest, and the T4 actually has no published specs regarding FP64; Nvidia does not publish any data on FP64 for the T4 and certain RTX models. But running a colloid example in LAMMPS compiled for these GPUs with DOUBLE_DOUBLE, all three models obtain the same result in 500,000 loops. So I'm utterly confused by this outcome. The explanation was found in T4 benchmarks of FP64 and FP32.
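The practical difference between single precision (FP32) and double precision (FP64) can be seen without any GPU at all. The sketch below rounds a value through IEEE-754 single precision using only the standard library: an increment of 1e-8 is smaller than FP32's machine epsilon (~1.19e-7) and is silently lost, while FP64 retains it.

```python
import struct

def f32(x):
    """Round a Python float (FP64) to the nearest IEEE-754 single (FP32)."""
    return struct.unpack('f', struct.pack('f', x))[0]

eps = 1e-8
print(f32(f32(1.0) + f32(eps)))  # 1.0 -- the small term is lost in FP32
print(1.0 + eps)                 # slightly above 1.0 -- preserved in FP64
```

This is why DOUBLE_DOUBLE builds matter for accumulation-heavy simulations like the LAMMPS run above: whether the hardware executes FP64 quickly or slowly, the arithmetic itself carries roughly twice the significand width.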
