This white paper aims to provide the best-known practices for using open-source cryptography libraries on Ampere processors, including the Ampere® Altra® family and the AmpereOne® family of processors.
Cryptography is the science of securing communication and data through mathematical techniques, ensuring confidentiality, integrity, and authenticity. It is widely used in web services, load-balancing proxies, databases, and similar workloads.
Cryptography can be divided into three categories: symmetric encryption, asymmetric (public-key) cryptography, and hash functions.
Cipher suites are sets of algorithms that enable secure network connections through Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL). These suites provide the algorithms and protocols required to secure communications between clients and servers. During connection initiation, the web server and the client perform a TLS handshake, in which the two parties agree on a mutual cipher suite that is then used to secure the HTTPS connection.
Cipher suites contain four components: a key exchange algorithm, an authentication algorithm, a bulk encryption algorithm, and a message authentication algorithm. For example, the “ECDHE-ECDSA-AES256-GCM-SHA384” cipher suite indicates ECDHE for key exchange, ECDSA for authentication, AES256-GCM for encryption, and SHA384 for message integrity.
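The four-part naming described above can be illustrated with a minimal sketch. The `CipherSuite` type and the `parse_cipher_suite` helper are hypothetical names introduced here for illustration. The sketch assumes an OpenSSL-style name with the layout key exchange, authentication, cipher, hash; many real suite names (for example TLS 1.3 suites, or suites that omit the key exchange field) do not follow this layout and would need extra handling.

```python
from typing import NamedTuple


class CipherSuite(NamedTuple):
    """Hypothetical container for the four components of a suite name."""
    key_exchange: str
    authentication: str
    encryption: str
    mac: str


def parse_cipher_suite(name: str) -> CipherSuite:
    """Split a name of the assumed form KX-AUTH-CIPHER[-MODE]-HASH."""
    parts = name.split("-")
    if len(parts) < 4:
        raise ValueError(f"expected at least four dash-separated fields: {name!r}")
    # First field: key exchange. Second field: authentication.
    # Last field: hash used for message integrity.
    # Everything in between is the bulk cipher, e.g. AES256-GCM.
    return CipherSuite(parts[0], parts[1], "-".join(parts[2:-1]), parts[-1])


suite = parse_cipher_suite("ECDHE-ECDSA-AES256-GCM-SHA384")
print(suite)
```

The parser is purely textual: it does not check whether the named algorithms are supported by the installed library.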
In this section, we list the crypto libraries most widely used in data center workloads.
In some packet processing use cases, these crypto libraries are used in conjunction with the DPDK Crypto Poll Mode Driver (PMD). Please refer to the DPDK cryptography build and tuning guide listed in the references for details.
In this section, we will compare the performance of different libraries on Ampere processors.
Ampere® Altra® family
Ampere Altra family products are designed to meet the requirements of modern Cloud Native Computing environments, offering predictable performance and core counts ranging from 32 to 128. These processors are Armv8.2+ ISA compatible.
All the open-source libraries listed above provide good support for the Ampere Altra family of processors.
Let’s start by comparing OpenSSL 3.3.0 and AWS-LC (master code, commit id 9921cd9) on a single core of Altra Max M128-30 using the bssl tool provided in AWS-LC.
Figure 1 and Figure 2 show AWS-LC can provide better performance for both RSA and ECDSA sign and verify.
ECDSA is preferred for asymmetric signing.
Figure 3 and Figure 4 show that AWS-LC is better than OpenSSL 3.3.0 for AES-CTR and for AES-GCM with small blocks (16 bytes, 256 bytes, 1350 bytes); OpenSSL 3.3.0 is better for ChaCha20-Poly1305 and for AES-GCM with large blocks (8192 bytes, 16384 bytes).
AES is preferred for symmetric encryption and decryption.
AmpereOne® family of Processors
AmpereOne processors provide up to 192 cores, which are Armv8.6+ ISA compatible. Support for the SHA3, SHA512, and RNG features improves cryptographic performance.
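One way to confirm that a Linux system exposes these features is to look at the `Features` line of `/proc/cpuinfo`. The sketch below is illustrative only: `crypto_features` is a hypothetical helper, the flag names `sha3`, `sha512`, and `rng` are the names the arm64 Linux kernel reports, and `SAMPLE` is a made-up excerpt rather than output captured from an Ampere system.

```python
def crypto_features(cpuinfo_text: str) -> dict:
    """Report whether the sha3, sha512 and rng flags appear in cpuinfo text."""
    wanted = ("sha3", "sha512", "rng")
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # On arm64 Linux each core has a "Features" line listing its flags.
        if key.strip().lower() == "features":
            flags.update(value.split())
    return {name: name in flags for name in wanted}


# Hypothetical excerpt used only to exercise the parser.
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd aes sha2 sha3 sha512 rng\n"
print(crypto_features(SAMPLE))
```

On a live system, the same function can be called with the contents of `/proc/cpuinfo`, for example `crypto_features(open("/proc/cpuinfo").read())`.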
OpenSSL added performance optimizations for AmpereOne in this commit, which is included starting with version 3.4.0. To get better performance with OpenSSL 3.2.x or 3.3.x, a backport is needed. Please refer to this repository for patches: https://github.com/AmpereComputing/openssl
The AWS-LC library does not include AmpereOne-specific optimizations. To get better performance on AmpereOne, please refer to this fork: https://github.com/AmpereComputing/aws-lc/tree/dev-ampereone
We will compare three library builds:
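As a rough illustration of the version threshold above, the sketch below checks whether the OpenSSL library that the Python interpreter is linked against is at least 3.4.0. `has_upstream_ampereone_tuning` is a hypothetical helper name. The check has two limitations: the library Python links against may differ from the system `openssl` binary, and a 3.2.x or 3.3.x build carrying the backported patches still reports its original version, so a `False` result does not prove the optimizations are missing.

```python
import ssl


def has_upstream_ampereone_tuning(version_info=ssl.OPENSSL_VERSION_INFO) -> bool:
    """Return True when the (major, minor, patch) version is 3.4.0 or later."""
    return tuple(version_info[:3]) >= (3, 4, 0)


# Reports the library linked into this Python build, not the CLI binary.
print(ssl.OPENSSL_VERSION, has_upstream_ampereone_tuning())
```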
OpenSSL 3.3.0-opt: OpenSSL 3.3.0 with the optimization for AmpereOne
AWS-LC: The AWS-LC library with commit id 9921cd9
AWS-LC-opt: AWS-LC library with AmpereOne optimizations
Figure 5 and Figure 6 show that AWS-LC-opt provides performance equivalent to OpenSSL 3.3.0 on RSA signing, and better performance for RSA verification and the ECDSA algorithms.
Figure 7 and Figure 8 show that AWS-LC-opt provides better performance for AES-GCM and AES-CTR than OpenSSL 3.3.0-opt, while OpenSSL 3.3.0-opt performs better with ChaCha20-Poly1305.
Per-core cipher performance is critical to faster TLS handshake performance in latency-sensitive usages like web servers. Similarly, performance and scalability of ciphers is important at the processor-level, especially as you scale out with cores.
In this section, we use the speed test provided by the OpenSSL library to show how performance scales with core count on the AmpereOne A192-32X processor, and we compare its scalability with a similar class of processor.
The following command tests AES-128-GCM throughput for 1024-byte blocks with different thread counts. “numactl” pins the test to as many cores as there are openssl threads, so N in the core range 0-N is the thread count minus one.
numactl -C 0-N openssl speed -multi $threads --bytes 1024 --seconds 10 -evp aes-128-gcm
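To sweep several thread counts, the command above can be generated programmatically. The following is a minimal sketch, not the harness used for the published results: `speed_command` is a hypothetical helper, the thread counts in the loop are illustrative, and the sketch assumes that cores are numbered contiguously from 0. It only prints the command lines; running them requires `numactl` and `openssl` to be installed.

```python
import shlex


def speed_command(threads: int, cipher: str = "aes-128-gcm",
                  block_bytes: int = 1024, seconds: int = 10) -> list:
    """Build the numactl + openssl speed argument list for one thread count."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Pin to cores 0..threads-1 so each openssl worker gets its own core.
    cpus = "0" if threads == 1 else f"0-{threads - 1}"
    return ["numactl", "-C", cpus,
            "openssl", "speed", "-multi", str(threads),
            "--bytes", str(block_bytes), "--seconds", str(seconds),
            "-evp", cipher]


for n in (1, 2, 4, 8):  # illustrative thread counts
    print(shlex.join(speed_command(n)))
```

Each argument list can be passed to `subprocess.run` to execute the test and capture its output.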
For reference, we compared the AmpereOne A192-32X with the AMD Genoa 9654; both processors expose 192 hardware threads. On AmpereOne, each thread is a physical core, whereas AMD Genoa has 96 physical cores, each running 2 threads through Simultaneous Multithreading (SMT). With SMT, scaling can degrade once more than 50% of the hardware threads are in use, because the two threads on a core share that core's execution resources.
Figure 9 shows that performance scales linearly with core count on the AmpereOne processor. As Figure 10 illustrates, this linear scaling gives the AmpereOne A192-32X 1.37x the throughput of the AMD Genoa 9654.
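Linear scaling can be expressed as a scaling efficiency: the throughput at n threads divided by n times the per-thread throughput of the smallest run, where a value of 1.0 means perfectly linear scaling. The sketch below shows the arithmetic only. `scaling_efficiency` is a hypothetical helper, and the numbers in `PLACEHOLDER` are made-up values in arbitrary units, not measurements from this paper.

```python
def scaling_efficiency(throughput_by_threads: dict) -> dict:
    """Map each thread count to throughput / (threads * per-thread baseline)."""
    base_threads = min(throughput_by_threads)
    # Per-thread throughput of the smallest run serves as the baseline.
    baseline = throughput_by_threads[base_threads] / base_threads
    return {n: t / (n * baseline)
            for n, t in sorted(throughput_by_threads.items())}


# Placeholder values in arbitrary units; replace with measured throughput.
PLACEHOLDER = {1: 100.0, 2: 200.0, 4: 380.0}
for threads, eff in scaling_efficiency(PLACEHOLDER).items():
    print(f"{threads} threads: efficiency {eff:.2f}")
```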
From a performance perspective, it is recommended to prefer ECDSA over RSA for digital signatures. ECDSA generally offers better performance and security efficiency, especially with smaller key sizes, which can lead to faster signature generation compared to RSA;
For symmetric encryption and decryption, AES-GCM should be preferred over ChaCha20-Poly1305;
When using cryptographic libraries, it is advisable to use the version of OpenSSL or AWS-LC that includes optimizations for the AmpereOne architecture. These optimizations leverage the new features on AmpereOne processors to enhance performance;
Furthermore, AWS-LC, which offers additional performance benefits with different implementations, should be preferred over OpenSSL;
Finally, the cryptographic performance on AmpereOne scales linearly with the core count. That means AmpereOne can bring more benefits when more cores are utilized.
https://www.geeksforgeeks.org/what-is-a-symmetric-encryption/
https://www.geeksforgeeks.org/asymmetric-key-cryptography/
https://www.geeksforgeeks.org/cryptography-hash-functions/
https://amperecomputing.com/tuning-guides/dpdk-cryptography-build-and-tuning-guide
https://amperecomputing.com/products/processors
https://github.com/openssl/openssl
https://github.com/aws/aws-lc/
https://github.com/ARM-software/AArch64cryptolib
https://gitlab.arm.com/arm-reference-solutions/ipsec-mb
https://github.com/AmpereComputing/openssl
https://github.com/AmpereComputing/aws-lc/tree/dev-ampereone
All data and information contained herein is for informational purposes only and Ampere reserves the right to change it without notice. This document may contain technical inaccuracies, omissions and typographical errors, and Ampere is under no obligation to update or correct this information. Ampere makes no representations or warranties of any kind, including but not limited to express or implied guarantees of noninfringement, merchantability, or fitness for a particular purpose, and assumes no liability of any kind. All information is provided “AS IS.” This document is not an offer or a binding commitment by Ampere. Use of the products contemplated herein requires the subsequent negotiation and execution of a definitive agreement or is subject to Ampere’s Terms and Conditions for the Sale of Goods.
System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may result in different measurements than those obtained by Ampere.
©2025 Ampere Computing. All Rights Reserved. Ampere, Ampere Computing, Altra and the ‘A’ logo are all registered trademarks or trademarks of Ampere Computing. Arm is a registered trademark of Arm Limited (or its subsidiaries). All other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.