Modern AI Compute workloads have pushed legacy processor architectures to their limits. Higher performance, greater scalability, and efficient communication across compute elements have become top priorities for enterprises and cloud providers alike as they advance their AI ambitions.
Ampere® Computing is addressing this challenge with high-performance, high-efficiency AI Compute solutions, powered by its vast portfolio of IP innovations. In this Q&A, Jeff Wittich, Chief Product Officer at Ampere, provides insights into how Ampere's innovative approach and custom IP are enabling breakthroughs in performance, supporting future products, and shaping the next generation of AI Compute systems.
At the heart of the AmpereOne® product family, including the highly anticipated AmpereOne Aurora, are Ampere’s custom cores. What’s different about these cores? And how are they equipped to handle AI workloads?
Ampere’s custom Arm-based cores offer several advantages for AI workloads, particularly AI inference. A few of their key benefits include:
Ampere recently revealed that it has created its own custom mesh. With AI scaling rapidly and becoming more data-intensive, how does Ampere’s custom mesh architecture scale for higher bandwidth demand and improve performance for AI-driven applications?
Ampere’s custom mesh architecture was designed to handle the increasing demands of AI, which require both high bandwidth and low latency. Because Ampere has been pushing the boundaries of CPU density and efficiency, we quickly outpaced the capabilities of off-the-shelf mesh technologies. By reducing transaction overhead, the mesh moves data efficiently across cores and lowers latency, a necessity for real-time AI tasks.
Key benefits of the Ampere custom mesh include:
This architecture delivers the data throughput AI applications need without sacrificing latency, keeping pace with the growing complexity of AI workloads.
What role does the mesh architecture play in supporting upcoming AI-specific products like AmpereOne Aurora?
AmpereOne Aurora is the next generation of AmpereOne, combining general-purpose compute cores with AI acceleration.
Ampere’s custom mesh architecture supports the scalability and performance needs of these future products. It connects a large number of compute elements while maintaining efficient communication between cores and memory, so data transfer stays efficient as systems grow.
Combined with increased memory capacity and new memory technologies, the mesh provides each core with the bandwidth needed for demanding workloads, ensuring that Ampere’s upcoming products meet the challenges of modern cloud and AI applications.
How does software play into Ampere’s ambitions in AI? And what is the company doing to set itself apart on the software side?
In 2021, Ampere acquired OnSpecta, a leading provider of AI deployment and acceleration software, to bring its customers significantly higher AI performance. Ampere has leveraged the OnSpecta IP to create a number of custom AI software libraries designed to optimize machine learning and AI inference workloads.
One key tool, for example, is Ampere AI Optimizer (AIO). This acceleration engine is fully integrated within popular AI frameworks, so developers can run AI inference workloads at peak performance on Ampere processors, without the need for GPUs.
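To illustrate what "fully integrated within popular AI frameworks" means in practice, the sketch below shows ordinary CPU-side PyTorch inference code. The model and shapes here are invented for illustration; the point is that because AIO plugs in at the framework level, standard inference code like this needs no GPU-specific changes or custom APIs to benefit from the acceleration when run on an AIO-enabled framework build.

```python
# Ordinary framework-level CPU inference; nothing here is Ampere-specific.
# (Hypothetical toy model for illustration, not an Ampere API.)
import torch
import torch.nn as nn

# A small example classifier head.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)
model.eval()  # switch to inference behavior

x = torch.randn(1, 128)          # one sample with 128 features
with torch.inference_mode():     # disable autograd bookkeeping for inference
    logits = model(x)

print(logits.shape)  # torch.Size([1, 10])
```

The same unmodified script runs on any CPU; on an Ampere processor with an AIO-optimized framework build, the framework's backend does the acceleration transparently.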
Just as Ampere continues to iterate on its hardware offerings, we will also continue to advance our software tools to enable even more efficient and scalable AI.
How do these IP innovations come together to set Ampere apart from its competition in the AI space?
AmpereOne Aurora, our upcoming 512-core CPU with integrated AI acceleration, is a great example of how all of Ampere’s investments in IP are coming together to bring disruptive AI products to meet the needs of the industry. When you combine Ampere’s innovations (our custom cores, proprietary mesh, and die-to-die interconnect across chiplets) with Ampere's own AI acceleration, the result is the revolutionary AmpereOne Aurora.
This product will offer more than 3x the performance of our current AmpereOne processors and deliver the leading performance per rack for AI Compute. Even more importantly, it can be air-cooled without the need for exotic data center upgrades. This means it can be deployed in any existing data center around the world, from public cloud to enterprises, and from hyperscale data centers to the edge – the key to solving our global AI power crisis.
This is all made possible by the convergence of innovative technologies that Ampere has been working on for the last several years.