NVIDIA announced its ultra-high-end Quadro GP100 graphics card in February. Compared with previous generations of Quadro cards, the new card is much faster and more power efficient. The GP100 GPU has 3,584 CUDA cores, which deliver 10.6 and 5.3 teraflops of floating-point performance in single and double precision, respectively.
The GPU is also equipped with 16 GB of HBM2 (second-generation high-bandwidth memory), which allows data to be transferred at up to 720 GB/s. Both the compute throughput and the memory bandwidth improve performance when running the most demanding transient electromagnetic simulations.
I had a chance to benchmark ANSYS 18 HFSS Transient on a Dell Precision T5600 workstation and measured Quadro GP100’s speedup over NVIDIA Tesla K40 and Intel Xeon E5-2687W. The speedup of Quadro GP100 over Tesla K40 can be as high as 1.8x, based on the entire simulation time (not just the calculations that are accelerated on the GPU). In the best case, the speedup of Quadro GP100 over eight cores of Xeon E5-2687W is 8.3x.
While testing Tesla K40, I had to install another card (Quadro K5000) for graphics. For testing Quadro GP100, I needed only one card for both graphics and computing, so Quadro GP100 has the advantages of reducing costs and saving space. However, users who want to launch jobs from Remote Desktop or run multiple HFSS Transient tasks on multiple GPUs have to use a dedicated graphics card for display. In those scenarios, Quadro GP100 can serve as a compute-only card after its driver model is set to TCC and its compute mode to EXCLUSIVE_PROCESS. To get the best performance from Quadro GP100, its graphics and memory clocks should be set to 1556 and 715 MHz, respectively.
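The compute-only settings above are normally applied with NVIDIA's nvidia-smi utility. The following is a minimal sketch, not the exact procedure used for the benchmark: it only builds the command lines and does not run them. The helper name `compute_only_commands` and the GPU index 0 are assumptions made for illustration; check the index of your own card with `nvidia-smi -L`.

```python
def compute_only_commands(gpu_index=0, mem_mhz=715, graphics_mhz=1556):
    """Build nvidia-smi command lines for a compute-only GPU.

    Hypothetical helper: the commands are returned, not executed.
    gpu_index=0 is an assumed device index.
    """
    gpu = str(gpu_index)
    return [
        # Driver model: 0 = WDDM, 1 = TCC (Windows only, needs a reboot).
        ["nvidia-smi", "-i", gpu, "-dm", "1"],
        # Let only one process at a time create a compute context on this GPU.
        ["nvidia-smi", "-i", gpu, "-c", "EXCLUSIVE_PROCESS"],
        # Application clocks are given as <memory,graphics> in MHz.
        ["nvidia-smi", "-i", gpu, "-ac", f"{mem_mhz},{graphics_mhz}"],
    ]


if __name__ == "__main__":
    for cmd in compute_only_commands():
        print(" ".join(cmd))
```

Note that nvidia-smi expects the memory clock first and the graphics clock second in the `-ac` argument, and that these commands generally need to be run from an administrator prompt.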
I will be happy to answer your questions about running ANSYS HFSS Transient on NVIDIA’s Pascal graphics cards.