Cryptography Libraries on Ampere

Overview

This white paper aims to provide the best-known practices for using open-source cryptography libraries on Ampere processors, including the Ampere® Altra® family and the AmpereOne® family of processors.

Background

Cryptography is the science of securing communication and data through mathematical techniques, ensuring confidentiality, integrity, and authenticity. It is widely used in web services, load-balancing proxies, databases, and similar workloads.


Cryptography can be divided into 3 categories:

  • Symmetric key cryptography
    Also known as secret key encryption, this method uses the same key for encryption and decryption. Popular symmetric key algorithms include Advanced Encryption Standard (AES), Data Encryption Standard (DES), ChaCha20, SM1, and SM4.
  • Asymmetric key cryptography
    This method uses a key pair: a public key and a corresponding private key. The public key is publicly distributed and can be used by anyone to encrypt messages, but only the recipient, who holds the corresponding private key, can decrypt those messages. Popular algorithms used in asymmetric key cryptography are Rivest–Shamir–Adleman (RSA), Digital Signature Algorithm (DSA), Elliptic Curve Cryptography (ECC), Diffie-Hellman (DH), and SM2.
  • Hash functions
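    Hash functions transform input data of any size into a fixed-size digest or ‘fingerprint’. The transformation is one-way: the input cannot practically be recovered from the digest, and it is computationally infeasible to find two inputs with the same digest. Hash functions are commonly used in message authentication, data integrity checks, and digital signatures. Examples include MD5, SHA-1, SHA-2, SHA-3, and SM3.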
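
The fixed-size property of hash functions can be illustrated with a short Python sketch. It uses only the standard hashlib module; the choice of algorithms and the sample inputs are arbitrary and serve only as an illustration.

```python
import hashlib


def digest_sizes(message: bytes) -> dict[str, int]:
    """Return the digest length in bytes for a few hash functions."""
    algorithms = ("sha256", "sha384", "sha512", "sha3_256")
    return {name: len(hashlib.new(name, message).digest()) for name in algorithms}


# Inputs of very different sizes produce digests of the same length.
short_input = b"a"
long_input = b"a" * 1_000_000

print(digest_sizes(short_input))
print(digest_sizes(long_input))
```

Both calls report the same lengths because the digest size depends on the algorithm, not on the input.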

Cipher Suites

Cipher suites are sets of algorithms that enable secure network connections through Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL). These suites provide the algorithms and protocols required to secure communications between clients and servers. During connection initiation, the web server and the client perform a TLS handshake, in which the two parties agree on a mutually supported cipher suite that is then used to secure the HTTPS connection.
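
As a rough illustration of how one side of the handshake limits what it is willing to negotiate, the following Python sketch builds a server-side context with the standard ssl module. The suite list and the helper name make_server_context are assumptions chosen for this example, and a real server would also need to load a certificate and private key.

```python
import ssl

# Cipher suites (OpenSSL naming) that this hypothetical server accepts for TLS 1.2.
PREFERRED_TLS12_SUITES = (
    "ECDHE-ECDSA-AES256-GCM-SHA384",
    "ECDHE-ECDSA-AES128-GCM-SHA256",
)


def make_server_context() -> ssl.SSLContext:
    """Create a server-side context restricted to the preferred TLS 1.2 suites."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # set_ciphers() takes an OpenSSL cipher list string. It does not change
    # the TLS 1.3 suites, which are configured separately by OpenSSL.
    context.set_ciphers(":".join(PREFERRED_TLS12_SUITES))
    return context


context = make_server_context()
for suite in context.get_ciphers():
    print(suite["protocol"], suite["name"])
```

The printed list is what the server can offer; the suite actually used for a connection is whichever entry the client also supports.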


Cipher suites contain four components: a key exchange algorithm, an authentication algorithm, a bulk encryption algorithm, and a message authentication algorithm. For example, the “ECDHE-ECDSA-AES256-GCM-SHA384” cipher suite indicates ECDHE for key exchange, ECDSA for authentication, AES256-GCM for encryption, and SHA384 for message integrity.
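
The naming scheme can be made concrete with a small parser. This is a minimal sketch that assumes an OpenSSL-style TLS 1.2 name with all four components present, as in the example above; names that omit a component (for example RSA key exchange suites, ChaCha20-Poly1305 suites, or TLS 1.3 suites) are not handled correctly. The CipherSuite type and parse_cipher_suite function are hypothetical helpers, not part of any library.

```python
from typing import NamedTuple


class CipherSuite(NamedTuple):
    key_exchange: str
    authentication: str
    encryption: str
    mac: str


def parse_cipher_suite(name: str) -> CipherSuite:
    """Split a four-component OpenSSL-style suite name into its parts."""
    parts = name.split("-")
    if len(parts) < 4:
        raise ValueError(f"expected at least four components in {name!r}")
    # The first two fields and the last field are single tokens; everything
    # in between (for example "AES256-GCM") is the bulk encryption algorithm.
    return CipherSuite(parts[0], parts[1], "-".join(parts[2:-1]), parts[-1])


print(parse_cipher_suite("ECDHE-ECDSA-AES256-GCM-SHA384"))
```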

Open-Source Crypto libraries

In this section, we list the crypto libraries most widely used in data center workloads.


  • OpenSSL
    Repository: https://github.com/openssl/openssl/
    OpenSSL is a general-purpose cryptographic library. It is the most used crypto library and is pre-installed on most OS distributions. It implements basic crypto algorithms and supports different hardware architectures.
    The OpenSSL version provided with different OS distros may vary, including 1.1.1, 3.0.x, 3.1.x, and 3.2.x. The current LTS version is 3.0.
  • BoringSSL
    Repository: https://github.com/google/boringssl
    BoringSSL is a fork of OpenSSL that is designed to meet Google's needs. It is not intended for general use because there are no guarantees of API or ABI stability.
  • AWS-LC
    Repository: https://github.com/aws/aws-lc
    AWS-LC is a general-purpose cryptographic library maintained by the AWS Cryptography team for AWS and their customers. It is based on code from the Google BoringSSL project and the OpenSSL project. AWS-LC adds several optimizations for both x86 and Arm processors.
  • AArch64cryptolib
    Repository: https://github.com/ARM-software/AArch64cryptolib
    AArch64cryptolib is a ‘from scratch’ implementation of cryptographic primitives aiming for optimal performance on Arm A-class cores. This library currently supports AES-GCM and AES-CBC optimized code.
    OpenSSL also provides AES-GCM and AES-CBC implementations; AArch64cryptolib performs just as well on Arm Neoverse N1, Arm Neoverse V1, Ampere Altra, and AmpereOne processors.
  • IPSec MB
    Repository: https://gitlab.arm.com/arm-reference-solutions/ipsec-mb
    This is the Multi-Buffer Crypto library for IPSec on Arm64 processors. It is based on the intel-ipsec-mb library and provides the SNOW3G and ZUC algorithms on Arm64. These algorithms are widely used in Telco and 5G workloads.

In some use cases for packet processing workloads, these crypto libraries are used in conjunction with the DPDK Crypto Poll Mode Driver (PMD). Please refer to this tuning guide for details.

Performance of Crypto libraries on Ampere processors

In this section, we will compare the performance of different libraries on Ampere processors.


Ampere® Altra® family
Ampere Altra family products are designed to meet the requirements of modern Cloud Native Computing environments, with features such as predictable performance and core counts ranging from 32 to 128. These processors are Armv8.2+ ISA compatible.


All the open-source libraries listed above provide good support for the Ampere Altra family of processors.


Let’s start by comparing OpenSSL 3.3.0 and AWS-LC (master code, commit id 9921cd9) on a single core of Altra Max M128-30 using the bssl tool provided in AWS-LC.


Figure 1 and Figure 2 show AWS-LC can provide better performance for both RSA and ECDSA sign and verify.


ECDSA is preferred for asymmetric signing.

[Figure 1]

[Figure 2]

[Figure 3]

[Figure 4]


Figure 3 and Figure 4 show that AWS-LC is better than OpenSSL 3.3.0 for AES-CTR and for AES-GCM with small blocks (16 bytes, 256 bytes, 1350 bytes); OpenSSL 3.3.0 is better for ChaCha20-Poly1305 and for AES-GCM with large blocks (8192 bytes, 16384 bytes).


AES is preferred for symmetric encryption and decryption.

AmpereOne® family of Processors
AmpereOne processors provide up to 192 cores and are Armv8.6+ ISA compatible. Support for the SHA3, SHA512, and RNG extensions improves cryptographic performance.
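
On Linux, one way to confirm that a system exposes these extensions is to look at the "Features" line of /proc/cpuinfo. The sketch below assumes the feature names sha3, sha512, and rng that the arm64 Linux kernel reports; the helper crypto_features is hypothetical and written only for this example.

```python
from pathlib import Path

# Feature names as reported by the arm64 Linux kernel (assumed for this sketch).
CRYPTO_FEATURES = ("sha3", "sha512", "rng")


def crypto_features(cpuinfo_text: str) -> dict[str, bool]:
    """Report which of CRYPTO_FEATURES appear on the first 'Features' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Features":
            flags = set(value.split())
            return {name: name in flags for name in CRYPTO_FEATURES}
    # No 'Features' line found (for example on a non-Arm system).
    return {name: False for name in CRYPTO_FEATURES}


cpuinfo_path = Path("/proc/cpuinfo")
if cpuinfo_path.exists():
    print(crypto_features(cpuinfo_path.read_text()))
```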


OpenSSL optimized performance for AmpereOne in this commit, which is included starting from version 3.4.0. To get better performance with OpenSSL 3.2.x or 3.3.x, a backport is needed. Please refer to this repository for patches: https://github.com/AmpereComputing/openssl
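
A quick version check can indicate whether a given OpenSSL build predates the 3.4.0 release mentioned above. The following Python sketch is only a heuristic: it inspects the OpenSSL that the Python interpreter is linked against (which may differ from the system openssl binary), and it cannot detect patches that a distribution or the Ampere repository has backported. The helper needs_backport is a hypothetical name.

```python
import ssl

# First upstream release that includes the AmpereOne optimizations, per the text above.
FIRST_OPTIMIZED_RELEASE = (3, 4, 0)


def needs_backport(version_info: tuple) -> bool:
    """Return True if the (major, minor, patch) version is older than 3.4.0."""
    return tuple(version_info[:3]) < FIRST_OPTIMIZED_RELEASE


print(ssl.OPENSSL_VERSION)
print("backport needed:", needs_backport(ssl.OPENSSL_VERSION_INFO))
```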


The AWS-LC library does not include AmpereOne-specific optimizations. To get better performance on AmpereOne, please refer to this fork: https://github.com/AmpereComputing/aws-lc/tree/dev-ampereone


We will compare three library builds:

  • OpenSSL 3.3.0-opt: OpenSSL 3.3.0 with the optimizations for AmpereOne
  • AWS-LC: the AWS-LC library at commit id 9921cd9
  • AWS-LC-opt: the AWS-LC library with AmpereOne optimizations


Figure 5 and Figure 6 show that AWS-LC-opt provides performance equivalent to OpenSSL 3.3.0 for RSA signing, and better performance for RSA verification and ECDSA.


Figure 7 and Figure 8 show that AWS-LC-opt provides better performance for AES-GCM and AES-CTR than OpenSSL 3.3.0-opt, while OpenSSL 3.3.0-opt performs better with ChaCha20-Poly1305.


[Figures 5 and 6]

[Figure 7]

[Figure 8]

Performance Scaling with core count

Per-core cipher performance is critical to faster TLS handshake performance in latency-sensitive usages like web servers. Similarly, performance and scalability of ciphers is important at the processor-level, especially as you scale out with cores.


In this section, we use the speed test provided by the OpenSSL library to show how performance scales with core count on the AmpereOne A192-32X processor, and we compare its scalability with that of a processor in a similar class.


The following command is used to test AES-128-GCM throughput for 1024-byte blocks with different thread counts. “numactl” pins the test to a number of cores equal to the number of openssl threads, so N is the index of the last core used (the thread count minus one).

numactl -C 0-N openssl speed -multi $threads --bytes 1024 --seconds 10 -evp aes-128-gcm
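
To sweep over several thread counts, the command line can be generated programmatically. The Python sketch below only builds and prints the commands using the same flags as the command above; it assumes cores are numbered contiguously from 0, and both the helper build_speed_command and the thread counts in the loop are illustrative choices rather than the exact procedure used for the figures.

```python
import shlex


def build_speed_command(threads: int, block_bytes: int = 1024, seconds: int = 10) -> list[str]:
    """Build the numactl/openssl command for a given thread count."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Pin to cores 0..threads-1 so the core count matches the thread count.
    cores = "0" if threads == 1 else f"0-{threads - 1}"
    return [
        "numactl", "-C", cores,
        "openssl", "speed",
        "-multi", str(threads),
        "--bytes", str(block_bytes),
        "--seconds", str(seconds),
        "-evp", "aes-128-gcm",
    ]


# Illustrative thread counts; adjust to the core count of the system under test.
for threads in (1, 2, 4, 8):
    print(shlex.join(build_speed_command(threads)))
```

Each printed line can be run directly in a shell, or the argument list can be passed to subprocess.run to collect the output.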

For reference, we compared the AmpereOne A192-32X with the AMD Genoa 9654; both processors have 192 threads. On AmpereOne, each thread is a physical core, whereas AMD Genoa has 96 physical cores, each running 2 threads through Simultaneous Multithreading (SMT). With SMT, scaling can degrade beyond 50% of the thread count because the two threads on a core share its execution resources.

Figure 9 shows linear performance scaling with core count on the AmpereOne processor. This linear scaling results in 1.37x the throughput on AmpereOne A192-32X compared to AMD Genoa 9654, as Figure 10 illustrates.
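
One way to quantify how close a result is to linear scaling is to divide the measured throughput by the throughput that perfect scaling from the smallest run would give. The sketch below shows that calculation; the helper scaling_efficiency is hypothetical and the sample numbers are made up for illustration, not taken from Figure 9 or Figure 10.

```python
def scaling_efficiency(throughput_by_threads: dict[int, float]) -> dict[int, float]:
    """Return measured throughput as a fraction of ideal linear scaling."""
    base_threads = min(throughput_by_threads)
    per_thread_base = throughput_by_threads[base_threads] / base_threads
    return {
        threads: throughput / (threads * per_thread_base)
        for threads, throughput in sorted(throughput_by_threads.items())
    }


# Hypothetical throughput values in arbitrary units, NOT measurements from this paper.
sample = {1: 10.0, 2: 20.0, 4: 39.0, 8: 76.0}
for threads, efficiency in scaling_efficiency(sample).items():
    print(f"{threads:>3} threads: {efficiency:.1%} of linear")
```

A value of 100% at every thread count corresponds to perfectly linear scaling.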

[Figures 9 and 10]

Summary

From a performance perspective, ECDSA should be preferred over RSA for digital signatures. ECDSA generally offers better performance and security efficiency, especially with its smaller key sizes, which lead to faster signature generation than RSA.

For symmetric encryption and decryption, AES-GCM should be preferred over ChaCha20-Poly1305.

When using cryptographic libraries, it is advisable to use a version of OpenSSL or AWS-LC that includes optimizations for the AmpereOne architecture. These optimizations leverage the new features on AmpereOne processors to enhance performance.

Furthermore, AWS-LC offers additional performance benefits for most of the algorithms tested and should generally be preferred over OpenSSL; OpenSSL remained faster for ChaCha20-Poly1305 in these tests.

Finally, cryptographic performance on AmpereOne scales linearly with core count, so AmpereOne delivers more benefit as more cores are utilized.


Footnotes

All data and information contained herein is for informational purposes only and Ampere reserves the right to change it without notice. This document may contain technical inaccuracies, omissions and typographical errors, and Ampere is under no obligation to update or correct this information. Ampere makes no representations or warranties of any kind, including but not limited to express or implied guarantees of noninfringement, merchantability, or fitness for a particular purpose, and assumes no liability of any kind. All information is provided “AS IS.” This document is not an offer or a binding commitment by Ampere. Use of the products contemplated herein requires the subsequent negotiation and execution of a definitive agreement or is subject to Ampere’s Terms and Conditions for the Sale of Goods.


System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may result in different measurements than those obtained by Ampere.


©2025 Ampere Computing. All Rights Reserved. Ampere, Ampere Computing, Altra and the ‘A’ logo are all registered trademarks or trademarks of Ampere Computing. Arm is a registered trademark of Arm Limited (or its subsidiaries). All other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Created At : May 28th 2025, 10:24:47 pm
Last Updated At : June 17th 2025, 1:09:37 pm