Rightsizing HPC Infrastructure Deployments with Altair Simulators and 3rd Gen AMD EPYC™ Processors

As design engineers know, CAE organizations have an insatiable demand for performance. Manufacturers compete based on the effectiveness of their HPC for everything from validating new designs to selecting materials to optimizing product performance and durability. These pressures are becoming more intense with ever more complex designs, increased competition, and shorter product cycles.

While the appetite for compute cycles may be limitless, capacity and IT budgets are finite. Manufacturers need to achieve the best possible results while operating within limited budgets for software tools and HPC infrastructure. In short, they need to do more with less. 

Fortunately, the latest generation of AMD EPYC™ processors coupled with simulation and HPC management tools from Altair can help. This article discusses six ways that the combination of AMD EPYC processors and Altair software can help organizations boost efficiency, reduce cost, and rightsize HPC infrastructure deployments. 

Meet the Money Savers

Processor speed matters when it comes to productivity, but raw performance is only part of the puzzle. Organizations need to balance throughput with factors that include capital and operational costs, facilities, sustainability, administrative considerations, and preserving existing investments.  

Altair simulators and HPC middleware combined with the latest AMD EPYC™ processors provide organizations with multiple opportunities to boost productivity and reduce cost. These include:

  • Leveraging new processor technologies to reduce data center footprint and speed throughput
  • Boosting efficiency with power-efficient CPUs and flexible cloud bursting
  • Working smarter with state-of-the-art simulators and advanced workload scheduling solutions

Read on and meet the money savers.

1. Accelerate Workloads with AMD EPYC with 3D V-Cache™ Technology

For CAE users, performance is critical. Turning around simulations faster means that engineers can more thoroughly explore design parameters in less time. This results in higher-quality products, fewer warranty issues, reduced physical prototyping, and faster time to market – all impacting the bottom line. Key metrics for HPC data center managers include job turnaround time and simulations per rack.

In benchmarks conducted by Altair, standard finite-element analysis (FEA) and computational fluid dynamics (CFD) workloads showed peak speedups ranging from 1.5x to 1.8x using Altair® AcuSolve® and Altair® Radioss® on the latest AMD EPYC™ processors with AMD 3D V-Cache™ technology.1


This means that CAE users are not only more productive, but they can also meet performance goals with a smaller server footprint.

2. Reduce Data Center Footprint Using Fewer, Higher-Throughput Processors

Productivity is critical, but so is reducing cost and data center real estate. With the performance gains described above, design organizations can not only achieve higher productivity but also potentially reduce the number of servers required by roughly 33% to 44%.2 By running Altair solvers on the latest AMD EPYC™ processors, organizations can deploy fewer servers while achieving the same overall throughput.

The savings can be considerable. Fewer servers translate into fewer server racks, fewer network drops, reduced power consumption, less administration, and a smaller data center footprint.
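The relationship between speedup and server count (the arithmetic given in footnote 2) can be sketched in a few lines of Python. The function names and the 90- and 100-server baselines are hypothetical, chosen only for illustration; the sketch assumes throughput scales linearly with server count.

```python
import math

def server_reduction(speedup: float) -> float:
    """Fraction of servers no longer needed to hold aggregate throughput constant."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup

def servers_needed(baseline_servers: int, speedup: float) -> int:
    """Servers required after the speedup, rounded up to whole machines."""
    return math.ceil(baseline_servers / speedup)

# The two speedups quoted in the benchmarks above.
for s in (1.5, 1.8):
    print(f"{s}x faster -> {server_reduction(s):.0%} fewer servers")

# Hypothetical baselines, for illustration only.
print(servers_needed(90, 1.5))
print(servers_needed(100, 1.8))
```

Rounding up matters in practice: a fractional server still occupies a full rack slot, so the realized saving on a small cluster can be slightly lower than the percentage suggests.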

3. Help Reduce Carbon Emissions, Power, and Cooling with Energy-Efficient EPYC Processors

According to the International Energy Agency, modern data centers consume approximately 200 terawatt-hours (TWh) of electricity per year, accounting for nearly 1% of global electricity demand.3 HPC data centers are major energy consumers. Reducing a data center’s energy use is not just good for business – it can be good for the planet. Organizations can reduce power and cooling costs and, depending on their jurisdiction, mitigate costs and risks related to carbon taxes.

Today, AMD EPYC processor-powered systems deliver the industry's best throughput per watt, holding the top spots in the industry-standard SPECpower_ssj® 2008 benchmark.4 Moreover, as of mid-2022, AMD is on track to achieve its ambitious goal of delivering a 30x increase in energy efficiency for AMD processors and accelerators powering servers for HPC and AI training from 2020 to 2025.

Altair workload managers also play an essential role in further improving energy efficiency. Green provisioning features in Altair® PBS Professional® and energy-aware scheduling place jobs to reduce power consumption while automatically shutting down nodes when they are not in use.5


4. Address Peak Utilization with Flexible Cloud Bursting

Another way to rightsize infrastructure is to practice “peak-shaving,” leveraging cloud-based compute resources to offload workloads during peak periods. While most CAE users operate their own infrastructure, some see value in bursting simulations to the cloud during busy periods.

Traditionally, complexity has been a barrier to expanding workloads to the cloud. Customers needed to worry about various issues, including provisioning cloud resources, synchronizing data between on-prem and cloud storage environments, providing access to on-prem license servers, and avoiding unexpected cost overruns.

Altair HPC solutions virtually eliminate the complexity of tapping the latest AMD EPYC™ processors in the cloud with tools such as Altair® Control® and Altair® Navops®. CAE users can also access AMD EPYC™-based cloud instances using an intuitive cloud-bursting GUI built into PBS Professional.6
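The peak-shaving idea in this section can be illustrated with a minimal Python sketch. It assumes demand is expressed as cores requested in each hour and that on-premises capacity is a fixed core count. The function name and the sample figures are hypothetical, and the sketch does not represent the API or behavior of any Altair product.

```python
def peak_shave(hourly_demand, on_prem_cores):
    """Split each hour's core demand into an on-prem share and a cloud-burst share."""
    plan = []
    for demand in hourly_demand:
        on_prem = min(demand, on_prem_cores)  # fill local capacity first
        burst = demand - on_prem              # overflow goes to the cloud
        plan.append((on_prem, burst))
    return plan

# Hypothetical demand profile: cores requested in four consecutive hours.
demand = [400, 900, 1300, 700]
plan = peak_shave(demand, on_prem_cores=1000)
burst_core_hours = sum(burst for _, burst in plan)

print(plan)
print(burst_core_hours)
```

Sizing the on-premises cluster for typical rather than peak demand, and paying for cloud capacity only for the overflow, is the trade-off this calculation makes visible.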

5. Work Smarter with State-of-the-Art Altair Simulators

Often overlooked in efficiency discussions is the quality of the simulation tools themselves. High-performance processors and scalable server platforms are of limited use if the simulation software cannot fully utilize them.

Altair simulators deliver outstanding scalability, exploiting the inherent parallelism of the latest AMD EPYC™ processors. Radioss employs a hybrid massively parallel processing (HMPP) model – combining both shared memory processing (SMP) and single program multiple data (SPMD) parallelism techniques implemented via MPI.7 By employing HMPP, simulations can scale across 512 cores, keeping processors busy and minimizing simulation runtimes. 
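To make the hybrid model concrete, the sketch below enumerates ways to split a fixed core count into MPI ranks and shared-memory threads per rank. The 64-core node size is an assumption for illustration, the function name is hypothetical, and the sketch does not show Radioss command-line options or claim which layout performs best.

```python
def hybrid_layouts(total_cores, cores_per_node):
    """Return (mpi_ranks, threads_per_rank) pairs that use total_cores exactly.

    Thread counts are limited to divisors of cores_per_node so that the
    threads of one rank never span two nodes.
    """
    layouts = []
    for threads in range(1, cores_per_node + 1):
        if cores_per_node % threads == 0 and total_cores % threads == 0:
            layouts.append((total_cores // threads, threads))
    return layouts

# 512 cores (the scale mentioned above) on assumed 64-core nodes.
for ranks, threads in hybrid_layouts(512, 64):
    print(f"{ranks:4d} MPI ranks x {threads:2d} threads")
```

Fewer, larger ranks reduce inter-process communication, while more, smaller ranks reduce contention inside each shared-memory domain; the best balance depends on the model and the hardware.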

Altair solvers also allow users to employ different numerical methods depending on the simulation and the required fidelity, giving engineers the “knobs and dials” they need to maximize simulation efficiency. By choosing the best simulation technique for the job, engineers can achieve results faster, with only minimal impact on simulation accuracy.

6. Rightsize HPC Investments with Smarter Scheduling

In a field where even single-digit performance gains are considered dramatic, the role of effective workload management cannot be overstated. It’s often said that if you can’t measure it, you can’t manage it. Tools such as Altair Breeze™ and Altair Mistral™ can help customers understand how simulations use infrastructure. HPC schedulers such as Altair® PBS Professional®, Altair Grid Engine™, and Altair® Accelerator™ can help customers automate scheduling policies to maximize throughput and utilization. Workload monitoring and scheduling should be at the top of every HPC cluster administrator’s “to do” list, given the outsized impact of scheduling on performance.  

Altair schedulers complement and extend the capabilities of both Altair simulators and AMD EPYC™ hardware. AMD EPYC™ supports configurable NUMA modes that partition processors into multiple virtual processors, each with dedicated processor cores, cache, and memory channels.8 With topology-aware scheduling and core-affinity features in Altair PBS Professional, engineers can wring every ounce of performance out of their HPC investments. In some cases, throughput could potentially be improved simply by using scheduling policies to place workloads optimally across available server resources.  
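As a rough illustration of why topology-aware placement matters, the sketch below splits a socket into NUMA domains and places each job inside a single domain where possible. It assumes a 64-core socket configured as four NUMA nodes with contiguous core numbering; real core numbering should be read from the operating system. The first-fit policy and all names are hypothetical and are not PBS Professional's placement algorithm.

```python
def numa_domains(cores_per_socket, nodes_per_socket):
    """Split a socket's core IDs into equal NUMA domains (contiguous IDs assumed)."""
    if cores_per_socket % nodes_per_socket != 0:
        raise ValueError("cores must divide evenly across NUMA nodes")
    size = cores_per_socket // nodes_per_socket
    return [list(range(i * size, (i + 1) * size)) for i in range(nodes_per_socket)]

def place_jobs(job_cores, domains):
    """First-fit placement that keeps each job inside one NUMA domain.

    Returns {job: (domain_index, [core_ids])}, or None for a job that
    does not fit in any single domain.
    """
    free = [list(d) for d in domains]
    placement = {}
    for job, needed in job_cores.items():
        placement[job] = None
        for idx, cores in enumerate(free):
            if len(cores) >= needed:
                placement[job] = (idx, cores[:needed])
                free[idx] = cores[needed:]
                break
    return placement

domains = numa_domains(64, 4)  # assumed: 64 cores, four NUMA nodes per socket
placement = place_jobs({"job_a": 16, "job_b": 8, "job_c": 16}, domains)
for job, slot in placement.items():
    print(job, slot)
```

Keeping a job's cores within one domain means its threads share the same cache and memory channels, which is the locality that core-affinity features aim to preserve.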

Hardware is not the only data center asset that needs to be optimized. In electronic design automation (EDA), the cost of tools and talent is often much greater than investments in infrastructure. License-first scheduling in Altair Accelerator can help semiconductor designers minimize their spending on software tools and infrastructure and maximize throughput for verification jobs.
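The general idea of treating licenses as the first scheduling constraint can be sketched as follows. This is an illustrative toy under stated assumptions, not Altair Accelerator's actual algorithm: the job tuples, license feature names, and function name are all hypothetical.

```python
from collections import Counter

def license_first_dispatch(queue, licenses, free_cores):
    """Start queued jobs only when a license is free, then check cores.

    queue:    list of (job_name, license_feature, cores) in priority order
    licenses: mapping of license_feature -> number of free seats
    Returns the names of the jobs started in this scheduling pass.
    """
    available = Counter(licenses)
    started = []
    for name, feature, cores in queue:
        if available[feature] < 1:
            continue  # no license seat: skip without reserving any cores
        if cores > free_cores:
            continue  # license is free but hardware is not
        available[feature] -= 1
        free_cores -= cores
        started.append(name)
    return started

# Hypothetical queue and license pool.
queue = [("sim1", "simulator", 4), ("sim2", "simulator", 4), ("lint1", "linter", 2)]
started = license_first_dispatch(queue, {"simulator": 1, "linter": 1}, free_cores=8)
print(started)
```

Checking the license before the hardware avoids holding cores idle for a job that cannot start, which matters when license seats are the scarcer and more expensive resource.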


Multi-Dimensional Efficiency Gains

For HPC users, the benefits described above are cumulative. By layering these approaches, the whole becomes greater than the sum of the parts. The combination of high-performance hardware, smart HPC middleware, and sophisticated solvers provides CAE data center managers with the tools and flexibility to reduce runtimes, boost throughput, optimize workload placement, and minimize their HPC data center footprint.

To learn more about third-gen AMD EPYC processors, visit amd.com/en/processors/epyc-7003-series. To learn more about Altair HPC solutions, visit altair.com/hpc-cloud-applications.

 


 

1. These results compared the latest AMD EPYC 7003 series processors with AMD 3D V-Cache to earlier AMD EPYC 7003 series processors. See the article "Breakthrough Computing Performance with Altair and 3rd Gen AMD EPYC™ Processors with AMD 3D V-Cache™ Technology" for details.
2. A throughput increase of 1.5x corresponds to a 33% reduction in required infrastructure (1-1/1.5) to achieve the same aggregate throughput. Similarly, a 1.8x throughput gain corresponds to roughly a 44% reduction (1-1/1.8). 
3. See November 2021 IEA.org report – Data Centres and Data Transmission Networks. 
4. See EPYC-028 – AMD EPYC provides leading results on SPECpower_ssj® 2008 benchmark.
5. Altair Unveils PBS Professional® 10, Adds Green Provisioning Feature, Improves Performance and Administrator Controls.
6. See Cloud Bursting with Altair PBS Professional.
7. Learn about Shared Memory Parallelism (SMP) and Hybrid MPP in Altair Radioss in the Altair Forum.
8. See the AMD EPYC Tuning Guide for 7003 series processors, March 2022 – Section 2.2.2. NUMA Nodes per Socket (NPS).