Innovation continues at a relentless pace in cloud computing. More than a decade ago, features like hardware virtualization and high-speed networking led to the birth of the public cloud. In the process of lifting and shifting workloads into the cloud, software was rearchitected to make better use of features the cloud had to offer, similar to rearranging furniture after moving into a new home.
For many years, however, the general-purpose compute hardware available to the cloud did not adapt. While some legacy processors have recently made incremental advances in performance and energy efficiency, these were merely band-aids, akin to adding extra bedrooms or weatherproofing an existing house when it no longer meets your needs.
In 2020, we launched the 80-core Ampere Altra, the first Cloud Native Processor, designed specifically for this new era of cloud computing. Rather than trying to extend and retrofit existing processor architectures, Ampere built this CPU from the ground up to satisfy the high performance and power efficiency needs of scale-out cloud native workloads. This is a new approach to design, specifically architected for diverse multi-tenant environments.
In 2021, we further extended our Cloud Native Processor family with the 128-core Ampere Altra Max. It was reviewed extensively by the technical press and led across various performance and performance/watt metrics. Previously, I discussed the performance and power efficiency of Ampere Altra and Altra Max on key cloud workloads.
Following up on the growing buzz around Ampere® Altra® and Ampere® Altra® Max Cloud Native Processors, this blog highlights detailed comparisons between Ampere Altra Max and best-in-class legacy x86 CPUs from Intel and AMD on an expanded list of cloud native workloads. All the workloads discussed here are popular in the cloud (and most of them were “born there”). They are ideal proxies for diverse solution domains such as front-end web serving and in-memory caching.
NGINX – Web Serving
NGINX is a high-performance HTTP web server that is prevalent in the cloud. It uses a sophisticated event-driven architecture that allows it to scale to hundreds of thousands of concurrent connections on modern hardware. NGINX performance is determined by how well network, storage, and compute are balanced. Our tests showed a 3.2x performance advantage for Ampere Altra Max over Intel Icelake and a 1.8x advantage over AMD Milan. The energy efficiency advantage ranges from 2.1x to 3.8x!
Redis – In-Memory Caching
Redis is a popular in-memory cache used in applications that require high throughput under stringent Service Level Agreements (SLAs). Our benchmarking team measured the aggregate throughput of multiple Redis instances while ensuring that p99 latencies stayed under 1 millisecond. We ran multiple instances because Redis is mostly single-threaded by design. Our testing shows a 1.3x advantage for Ampere Altra Max over AMD Milan and close to twice the performance of Intel Icelake. Performance/Watt was up to 2.8x higher!
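To make the measurement idea concrete, here is a minimal sketch of how aggregate throughput under a p99 latency limit could be computed. It is an illustration only, not the harness our benchmarking team used: the function names, the nearest-rank percentile method, the rule that every instance must meet the limit, and all of the sample numbers are assumptions made up for this example.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a list of latency samples (ms)."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # integer ceil(0.99 * n), 1-based rank
    return ordered[rank - 1]


def aggregate_throughput(instances, sla_ms=1.0):
    """Sum ops/sec over all instances.

    Returns None if any instance has a p99 latency at or above sla_ms,
    i.e. the run does not count (an assumed rule for this sketch).
    """
    total = 0.0
    for inst in instances:
        if p99(inst["latencies_ms"]) >= sla_ms:
            return None
        total += inst["ops_per_sec"]
    return total


# Hypothetical data for illustration only; these are not measurements.
instances = [
    {"ops_per_sec": 100_000.0, "latencies_ms": [0.2] * 99 + [0.9]},
    {"ops_per_sec": 120_000.0, "latencies_ms": [0.3] * 99 + [0.8]},
]
print(aggregate_throughput(instances))
```

Because each Redis instance is mostly single-threaded, scaling the number of instances with the core count while checking the latency limit per instance is what lets a high-core-count CPU show its aggregate throughput.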
Memcached – In-memory Caching
Memcached is another in-memory key-value store used in the cloud that predates Redis and has been deployed for close to two decades. Our results show that Ampere Altra Max outperforms Intel Icelake and AMD Milan by 74% and 23% respectively. Again, Performance/Watt was a compelling 2.6x higher than the Intel Icelake CPU and 1.9x higher than AMD Milan.
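The relationship between the two kinds of figures quoted here can be worked through with simple arithmetic. The sketch below converts "N% faster" into a ratio and, assuming the performance and Performance/Watt figures come from the same run (an assumption, since the text does not say so), derives the relative power draw they would imply. The function names are hypothetical and chosen only for this example.

```python
def speedup_from_percent(percent_faster):
    """Convert 'N% higher throughput' into a ratio, e.g. 74% -> 1.74x."""
    return 1.0 + percent_faster / 100.0


def implied_power_ratio(perf_ratio, perf_per_watt_ratio):
    """Power of system A relative to system B implied by the two ratios.

    perf_per_watt_ratio = perf_ratio / power_ratio, therefore
    power_ratio = perf_ratio / perf_per_watt_ratio.
    """
    return perf_ratio / perf_per_watt_ratio


# Memcached figures quoted in the text: (percent faster, perf/watt ratio).
quoted = {"Intel Icelake": (74, 2.6), "AMD Milan": (23, 1.9)}

for cpu, (pct, ppw) in quoted.items():
    perf = speedup_from_percent(pct)
    power = implied_power_ratio(perf, ppw)
    print(f"vs {cpu}: {perf:.2f}x performance, ~{power:.2f}x power (implied)")
```

A ratio below 1.0 in the last column would mean the higher throughput was delivered at lower power, which is how a modest performance lead can coexist with a much larger Performance/Watt lead.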
Cassandra – Distributed Database
Apache Cassandra is a popular distributed NoSQL database focused on high availability and fault tolerance. We ran multiple instances of Cassandra in a scale-out configuration with cassandra-stress as the load generator. On a write-dominant workload, fairly typical for Cassandra, Ampere Altra Max generated 16% and 33% higher throughput than the x86 CPUs with p99 latencies under 10 ms. Performance/Watt was close to twice that of the Intel and AMD CPUs! The pattern of Altra Max’s energy efficiency leadership at compelling performance levels should be obvious by now.
MySQL – Relational Database
MySQL continues to be the most popular open-source relational database in the cloud and is a component of the LAMP (Linux, Apache, MySQL, PHP/Perl/Python) web application software stack, a term that was coined more than twenty years ago. Even though MySQL technically is not a cloud native stack, over the years, it has adapted to the cloud paradigm. Most public clouds today offer MySQL as a fully managed web service. Using the popular sysbench benchmark utility, we measured a 36% performance advantage for Ampere Altra Max compared to Intel Icelake and a 29% advantage over AMD Milan. On energy efficiency, Ampere Altra Max was close to 2x better than both Intel and AMD.
Media Encoding with H.264
The H.264 video compression standard is used for encoding and distributing video and audio. The standard was first published in 2004 and continues to lead in overall popularity. It is mature and has been extensively optimized on legacy x86 CPUs. Despite that, the large core count and the 2x128-bit SIMD vector units per core give the Ampere Altra Max a 1.2x to 2.2x encoding performance lead over the x86 contenders. Performance/Watt is 1.4x to 2.4x better than x86.
Conclusion
At Ampere, we are inventing the future of the cloud with our products. To that end, we have shipped two processor families in two years that target a variety of workloads and uses, bringing an entirely new set of capabilities to the data center. Ampere Altra Max is the newest cloud performance champion, as demonstrated in this blog. It not only excels at raw performance against the best legacy x86 processors on the market but also has an overwhelming advantage in energy efficiency! Raw performance and energy efficiency have always been at odds with each other, but you can now achieve leadership in both. To deliver the highest performance without compromising energy efficiency, a new type of innovative hardware is needed: Ampere Cloud Native Processors.
Our next blog series will address the role of Cloud Native Processors in reducing the carbon footprint of data centers. Stay tuned!