If you’ve ever waited hours—or even days—for a full‑wave FDTD simulation to finish, you know how quickly those delays can bottleneck a photonics project. Whether you’re tuning a waveguide or sweeping parameters for a metasurface, speed matters. That’s why more engineers and researchers are turning to GPU acceleration.
This post shows how much faster—and more cost‑effective—your simulations can run with GPUs in Ansys Lumerical. You’ll see benchmark results, learn why GPUs are a natural fit for FDTD, and pick up optimization tips for both on‑prem and cloud hardware.
FDTD solves Maxwell’s equations directly in the time domain, making it highly flexible for broadband problems and complex 3‑D geometries. The trade‑off is computational cost: fine spatial resolution, long propagation times, and large meshes quickly balloon to billions of Yee cells—especially in integrated photonics and nanophotonics design.
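A quick back-of-envelope shows why cell counts dominate: with single precision and six field components per Yee cell, a mesh of ~0.85 billion cells (like the benchmark below) already needs on the order of 20 GB just for field storage. This is a rough estimate; real solvers also store material coefficients, monitors, and boundary data.

```python
# Rough field-storage estimate for an FDTD mesh (single precision).
cells = 0.85e9        # Yee cells, matching the metalens benchmark below
components = 6        # Ex, Ey, Ez, Hx, Hy, Hz
bytes_per_value = 4   # float32
print(f"{cells * components * bytes_per_value / 1e9:.0f} GB")  # ~20 GB
```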
GPUs contain thousands of lightweight cores that can update Yee cells concurrently, whereas CPUs rely on a few heavyweight cores. Because each cell update is independent, FDTD maps almost perfectly onto GPU hardware, yielding order‑of‑magnitude speed‑ups and superior energy efficiency.
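To make the mapping concrete, here is a minimal NumPy sketch of a single Yee-grid update for one E-field component (normalized units, uniform permittivity, no boundaries or dispersion, and the half-cell staggering glossed over). Each cell reads only its own previous value and a few neighboring H values, which is exactly the independence a GPU exploits with one lightweight thread per cell.

```python
import numpy as np

def update_ex(Ex, Hy, Hz, dt, dy, dz, eps):
    """One leapfrog update of the Ex field on a uniform grid.

    Minimal sketch: normalized units, uniform permittivity, no
    dispersion, no absorbing boundaries. The point is the data
    dependency: each cell reads only its own previous value and
    neighboring H values, so all cells can be updated concurrently.
    """
    Ex[:, 1:-1, 1:-1] += (dt / eps) * (
        (Hz[:, 1:-1, 1:-1] - Hz[:, :-2, 1:-1]) / dy    # dHz/dy
        - (Hy[:, 1:-1, 1:-1] - Hy[:, 1:-1, :-2]) / dz  # dHy/dz
    )

# Example: a 100^3 block of cells updated in one vectorized call.
shape = (100, 100, 100)
Ex = np.zeros(shape)
Hy, Hz = np.random.rand(*shape), np.random.rand(*shape)
update_ex(Ex, Hy, Hz, dt=1e-17, dy=2e-8, dz=2e-8, eps=8.85e-12)
```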
Whether you start with a workstation‑class RTX A6000, scale up to a server‑grade A100, or burst to multi‑GPU L40S nodes in the cloud, GPU acceleration can slash turnaround time from hours to minutes.
For supported features and best practices, see the Ansys KB article “Getting started with running FDTD on GPU.”
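If you drive simulations from Python, moving a job onto the GPU is a resource-configuration change, not a model change. Below is a hedged lumapi sketch: the session API and the setresource call are standard, but the resource property name "GPU" and the file name are our assumptions, so verify them against the KB article above for your release.

```python
import lumapi  # bundled with the Ansys Lumerical installation

# Open the project in an FDTD session (file name is illustrative).
fdtd = lumapi.FDTD(filename="metalens_benchmark.fsp")

# Route the solver onto the GPU. NOTE: the property name "GPU" is an
# assumption; the exact resource setting can vary by release, so
# confirm it in the Ansys KB article linked above.
fdtd.setresource("FDTD", 1, "GPU", True)

fdtd.run()
```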
Benchmarking Methodology
To provide actionable insights, we benchmarked performance using the Ansys metalens Lumerical file (~0.85 billion Yee cells), which is attached to this page for reference. This model represents a realistic, computationally demanding scenario. Settings were held constant across all simulations, with the auto-shutoff criterion terminating each run once the field energy decayed to 10⁻⁵ of its peak. This consistency ensures the performance comparisons reflect hardware capability alone.
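In the same lumapi session (before calling run()), the shutoff criterion can be pinned explicitly so every configuration terminates on the same condition. We believe "auto shutoff min" is the relevant FDTD solver property, but treat the names here as assumptions to verify in your release.

```python
# Pin the convergence criterion used in the benchmark: stop once the
# remaining field energy decays to 1e-5 of its peak value.
# (Property names are assumptions; verify in your Lumerical release.)
fdtd.select("FDTD")
fdtd.set("use early shutoff", True)
fdtd.set("auto shutoff min", 1e-5)
```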
We evaluated multiple hardware platforms reflecting diverse user scenarios, ranging from local workstations and enterprise HPC clusters to scalable cloud solutions.
Hardware Configurations Tested
| Configuration | Wall Time | Speed‑Up vs CPU | Throughput (Mnodes/s) | Peak GPU Memory |
|---|---|---|---|---|
| 63× CPU Cluster | 62 min | — | 1 985 | — |
| 1× RTX A6000 | 25 min | 2.5× | 7 277 | 32 GiB |
| 1× A100 80 GB | 12 min | 5.2× | 13 200 | 32 GiB |
| 4× L40S (Burst) | 9 min | 7.1× | 29 074 | 8 GiB / GPU |
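The speed-up column is simply the CPU wall time divided by each GPU wall time; a quick check in Python reproduces it (the L40S row differs slightly because the wall times above are rounded to whole minutes):

```python
# Sanity-check the speed-up column: CPU wall time / GPU wall time.
cpu_minutes = 62
for name, minutes in [("1x RTX A6000", 25), ("1x A100 80 GB", 12), ("4x L40S", 9)]:
    print(f"{name}: {cpu_minutes / minutes:.1f}x faster")
# 1x RTX A6000: 2.5x faster
# 1x A100 80 GB: 5.2x faster
# 4x L40S: 6.9x faster  (the table's 7.1x reflects minute-level rounding)
```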
Ready to try GPU acceleration yourself?
The benchmark file is attached to this page for download.
Ozen Engineering Inc. leverages its extensive consulting expertise in CFD, FEA, optics, photonics, and electromagnetic simulation to achieve exceptional results across engineering projects, addressing complex challenges such as antenna design, signal integrity, electromagnetic interference (EMI), and electric motor analysis using Ansys software.
We offer support, mentoring, and consulting services to enhance the performance and reliability of your electronics systems. Trust our proven track record to accelerate projects, optimize performance, and deliver high-quality, cost-effective results. For more information, please visit https://ozeninc.com.
If you want to learn more about our consulting services, please visit: https://www.ozeninc.com/consulting/
CFD: https://www.ozeninc.com/consulting/cfd-consulting/
FEA: https://www.ozeninc.com/consulting/fea-consulting/
Optics: https://www.ozeninc.com/consulting/optics-photonics/
Photonics: https://www.ozeninc.com/consulting/optics-photonics/
Electromagnetic Simulations: https://www.ozeninc.com/consulting/electromagnetic-consulting/
Thermal Analysis & Electronics Cooling: https://www.ozeninc.com/consulting/thermal-engineering-electronics-cooling/