Cassandra on Azure Brief
Dpsv5 Virtual Machines Powered by Ampere Altra Processors
Ampere® Altra® processors are designed from the ground up to deliver exceptional performance for Cloud Native applications such as Apache Cassandra. With an innovative architecture that delivers high performance, linear scalability, and amazing energy efficiency, Ampere Altra allows workloads to run in a predictable manner with minimal variance under increasing loads. This enables industry leading performance/watt and a smaller carbon footprint for real-world workloads such as Cassandra.
Microsoft offers a comprehensive line of Azure Virtual Machines featuring the Ampere Altra Cloud Native processor that can run a diverse and broad set of scale-out workloads such as web servers, open-source databases, in-memory applications, big data analytics, gaming, media, and more. The Dpsv5 VMs powered by Ampere Altra processors are general-purpose VMs that provide 2 GB of memory per vCPU and a combination of vCPUs, memory, and local storage to cost-effectively run workloads that do not require larger amounts of RAM per vCPU. The Epsv5 VMs are memory-optimized VMs that provide 4 GB of memory per vCPU, which can benefit memory-intensive workloads, including open-source databases, in-memory caching applications, gaming, and data analytics engines.
Apache Cassandra is an open-source NoSQL distributed database trusted by thousands of companies for its scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it a good choice for storing mission-critical data.
In this workload brief, we compare the Ampere Altra-based Microsoft Azure Dpsv5 instances to the Intel® Xeon® Ice Lake-based Dsv5 and AMD EPYC™ Milan-based Dasv5 instances in Azure running Cassandra, examining the throughput and latency on each of these processors.
As seen in Fig. 1, we measured up to a 5% performance advantage for the Dpsv5 VMs over the Dsv5 VMs and up to a 7% advantage over the Dasv5 VMs, all at a p.99 SLA of 3 ms.
In addition, the Dpsv5 VMs showed up to 31% better price-performance than the legacy x86 Dsv5 and Dasv5 VMs.
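As an illustration of how a price-performance comparison of this kind can be computed, the sketch below divides throughput by the hourly VM price and reports the relative advantage. The function names and the input numbers are hypothetical placeholders, not measurements or prices from this brief, and the exact formula used for the published figure is an assumption.

```python
def price_performance(throughput_ops_per_sec: float, price_usd_per_hour: float) -> float:
    """Throughput delivered per dollar of hourly VM cost."""
    if price_usd_per_hour <= 0:
        raise ValueError("price_usd_per_hour must be positive")
    return throughput_ops_per_sec / price_usd_per_hour


def advantage_pct(candidate: float, baseline: float) -> float:
    """Relative advantage of candidate over baseline, in percent."""
    return (candidate / baseline - 1.0) * 100.0


# Hypothetical placeholder inputs, NOT data from this brief.
baseline = price_performance(throughput_ops_per_sec=100_000, price_usd_per_hour=1.00)
candidate = price_performance(throughput_ops_per_sec=120_000, price_usd_per_hour=0.75)
print(f"price-performance advantage: {advantage_pct(candidate, baseline):.1f}%")
```

Because the metric is a ratio, a VM can lead on price-performance through higher throughput, a lower hourly price, or both.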
Our tests were performed using the Cassandra-stress tool as a load generator for Cassandra. Each test was configured to run for 3 minutes with multiple threads and clients.
It is recommended to compile Cassandra with JDK-15 (compiled with GCC 10.2 with the architecture-appropriate flags) or newer, as recent JDKs have made significant progress towards generating optimized code for AArch64 applications.
The G1 garbage collector was used with appropriate memory and number of threads. Cassandra data was stored on an NVMe disk drive, while the commit log was stored on tmpfs.
Ubuntu 20.04 was used with Cassandra 4.0.1. For each of the tests, a similar number of Cassandra-stress clients was used to generate requests.
Since throughput is customarily measured under a specified Service Level Agreement (SLA), a 99th percentile latency (p.99) limit of 3 ms was used. This ensured that 99% of the requests completed within 3 ms.
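The p.99 check can be sketched as follows. This is a minimal illustration using the nearest-rank percentile definition, which is an assumption: the stress tool reports its own percentiles and may compute them differently. The function names and sample latencies are hypothetical.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def meets_sla(latencies_ms: Sequence[float], sla_ms: float = 3.0, pct: float = 99.0) -> bool:
    """True if the pct-th percentile latency is within the SLA limit."""
    return percentile_nearest_rank(latencies_ms, pct) <= sla_ms


# Hypothetical latencies: 99 fast requests and one slow outlier.
sample = [1.0] * 99 + [50.0]
print(meets_sla(sample))
```

A single outlier among 100 requests does not break a p.99 SLA, while two slow requests out of 100 would.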
Each test ran for 3 minutes with warmup, using a 90% write/10% read ratio. This write-heavy mix is a common usage pattern for Cassandra, which is optimized for write operations. Initially, a single Cassandra instance was loaded with an appropriate number of clients and threads, while ensuring that its p.99 latency stayed within 3 ms.
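A load-generator invocation matching the 3-minute, 90% write/10% read run described above might be assembled as shown below. This is a sketch only: the brief does not publish its exact command line, so the node address and thread count are hypothetical, and the options shown (the mixed command with ratio and duration, plus -rate threads= and -node) are simply the standard cassandra-stress options that correspond to the description.

```python
from typing import List


def build_stress_command(node: str, threads: int, duration: str = "3m",
                         write_parts: int = 9, read_parts: int = 1) -> List[str]:
    """Build an argv list for a mixed cassandra-stress run (illustrative only)."""
    return [
        "cassandra-stress", "mixed",
        f"ratio(write={write_parts},read={read_parts})",  # 9:1 gives 90% writes, 10% reads
        f"duration={duration}",
        "-rate", f"threads={threads}",
        "-node", node,
    ]


# Hypothetical target and thread count, not values from this brief.
cmd = build_stress_command(node="10.0.0.4", threads=32)
print(" ".join(cmd))
```

Passing the list to subprocess.run avoids shell quoting of the parentheses in the ratio argument. A mixed run reads existing rows, so the table would normally be populated with a write run first.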
Next, the number of Cassandra instances was increased step by step until one or more instances violated the latency SLA. At that point, the aggregate throughput of all instances was used as the primary performance metric. The test was run three times, and minimal run-to-run variation was observed.
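The scale-up procedure can be sketched as a simple loop. This sketch assumes that the reported figure is the aggregate throughput of the largest instance count at which every instance still meets the SLA. The run_test callback, its return shape, and the fake measurements are hypothetical stand-ins for a real benchmark harness.

```python
from typing import Callable, List, Tuple

# One entry per Cassandra instance: (throughput in ops/s, p.99 latency in ms).
InstanceResult = Tuple[float, float]


def find_peak_aggregate_throughput(
    run_test: Callable[[int], List[InstanceResult]],
    sla_ms: float = 3.0,
    max_instances: int = 64,
) -> Tuple[int, float]:
    """Add instances until any one violates the SLA; return the last passing
    instance count and its aggregate throughput."""
    best_count, best_throughput = 0, 0.0
    for count in range(1, max_instances + 1):
        results = run_test(count)
        if any(p99 > sla_ms for _, p99 in results):
            break  # at least one instance violated the latency SLA
        best_count = count
        best_throughput = sum(tput for tput, _ in results)
    return best_count, best_throughput


def fake_run(count: int) -> List[InstanceResult]:
    """Hypothetical harness: latency grows as more instances share the VM."""
    return [(1000.0, 1.0 + 0.5 * count)] * count


print(find_peak_aggregate_throughput(fake_run))
```

Repeating the whole search several times, as the brief does with three runs, gives a view of run-to-run variation.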
Distributed NoSQL databases such as Cassandra manage a large volume of data with great ease and scalability and are popular in cloud deployments. Our tests showed up to a 5% performance advantage and up to 31% price-performance advantage for Ampere Altra-based Azure Dpsv5 VMs compared to the leading x86 VMs. For cloud application developers, choosing Ampere Altra-based VMs on Azure means better performance and price-performance while reducing your carbon footprint.
For more information about Azure Virtual Machines with Ampere Altra Arm-based processors, visit the Azure blog.
All data and information contained herein is for informational purposes only and Ampere reserves the right to change it without notice. This document may contain technical inaccuracies, omissions and typographical errors, and Ampere is under no obligation to update or correct this information. Ampere makes no representations or warranties of any kind, including but not limited to express or implied guarantees of noninfringement, merchantability, or fitness for a particular purpose, and assumes no liability of any kind. All information is provided “AS IS.” This document is not an offer or a binding commitment by Ampere. Use of the products contemplated herein requires the subsequent negotiation and execution of a definitive agreement or is subject to Ampere’s Terms and Conditions for the Sale of Goods.
System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may result in different measurements than those obtained by Ampere.
Price performance was calculated using Microsoft's Virtual Machines Pricing, in September of 2022. Refer to individual tests for more information.
©2022 Ampere Computing. All Rights Reserved. Ampere, Ampere Computing, Altra and the ‘A’ logo are all registered trademarks or trademarks of Ampere Computing. Arm is a registered trademark of Arm Limited (or its subsidiaries). All other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
Ampere Computing® / 4655 Great America Parkway, Suite 601 / Santa Clara, CA 95054 / amperecomputing.com