DPDK Cryptography Build and Tuning Guide

Build and Run DPDK with Cryptography PMD

Updated 6/1/2025

One of the many use cases customers run on Ampere-powered systems is packet processing built on DPDK. Ampere has published a setup and tuning guide for DPDK to help customers get the best performance from these workloads. Because many customers rely heavily on encryption and decryption in their DPDK applications, this guide supplements the existing DPDK tuning guide with information on crypto library support and on how to build DPDK with these crypto libraries.

NOTE: Complete these steps before building the DPDK library.

Summary of Poll Mode Drivers for Crypto on Ampere Processors

ARMv8 Crypto Driver

The ARMv8 crypto poll mode driver enables use of crypto extensions to ARMv8 that optimize chained operations. The core functions of this driver are written in assembly. It is published by Arm at https://github.com/ARM-software/AArch64cryptolib.git.
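Because the driver depends on the ARMv8 crypto extensions, it can be worth confirming that the CPU advertises them before building. The sketch below is only an illustration: the helper name missing_cpu_features is hypothetical, and it assumes the feature names that the Linux kernel prints on the Features line of /proc/cpuinfo (aes, pmull, sha1, sha2, sha3).

```shell
# Hypothetical helper: print the requested CPU features that are absent.
# $1 is a feature string such as the "Features" line of /proc/cpuinfo;
# the remaining arguments are the feature names to look for.
missing_cpu_features() {
    features=" $1 "
    shift
    missing=""
    for f in "$@"; do
        case "$features" in
            *" $f "*) ;;                      # feature is present
            *) missing="$missing $f" ;;       # remember what is absent
        esac
    done
    echo "${missing# }"
}
```

On a target system it could be called as missing_cpu_features "$(grep -m1 '^Features' /proc/cpuinfo)" aes pmull sha1 sha2, where an empty result means every listed feature was found.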


ARMv8 Crypto PMD supports the following algorithm pairs:

Cipher algorithms:

  • RTE_CRYPTO_CIPHER_AES_CBC

Authentication algorithms:

  • RTE_CRYPTO_AUTH_SHA1_HMAC
  • RTE_CRYPTO_AUTH_SHA256_HMAC

Build DPDK with ARMv8 crypto PMD:

Download and build AArch64 crypto library source code (Assumes current directory is /home/ampere/):


On Ampere Altra family:

git clone https://github.com/ARM-software/AArch64cryptolib.git
cd AArch64cryptolib
make OPT=big EXTRA_CFLAGS="-march=armv8.2-a+crypto"
echo "/home/ampere/AArch64cryptolib" | sudo tee /etc/ld.so.conf.d/armcrypto.conf
sudo ldconfig

On Ampere AmpereOne family:

git clone https://github.com/ARM-software/AArch64cryptolib.git
cd AArch64cryptolib
make OPT=biggereor3 EXTRA_CFLAGS="-march=armv8.6-a+crc+fp16+aes+sha3"
echo "/home/ampere/AArch64cryptolib" | sudo tee /etc/ld.so.conf.d/armcrypto.conf
sudo ldconfig

Reference: https://doc.dpdk.org/guides/cryptodevs/armv8.html

OpenSSL Crypto Driver

For the best performance, use OpenSSL 3.2 or 1.1.1 on the Ampere Altra family of processors and OpenSSL 3.4.0 on the AmpereOne family. These versions performed best in our testing. Avoid versions 3.0.x and 3.1.x, which show significant performance regressions.
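The version guidance above can be expressed as a small check. This is a minimal sketch rather than an official tool: the function name openssl_advice and the labels recommended, avoid, and untested are made up for illustration, and the version thresholds are taken only from the recommendations in this guide.

```shell
# Hypothetical helper: classify an OpenSSL version for a processor family.
# $1 = altra | ampereone, $2 = version string such as 3.4.0 or 1.1.1w
openssl_advice() {
    family=$1
    ver=$2
    case "$ver" in
        3.0.*|3.1.*) echo avoid; return ;;    # known performance regressions
    esac
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    case "$family" in
        altra)
            case "$ver" in
                1.1.1*) echo recommended; return ;;
            esac
            if [ "$major" -eq 3 ] && [ "$minor" -ge 2 ]; then
                echo recommended
            else
                echo untested
            fi ;;
        ampereone)
            if [ "$major" -eq 3 ] && [ "$minor" -ge 4 ]; then
                echo recommended
            else
                echo untested
            fi ;;
        *) echo unknown-family ;;
    esac
}
```

A typical call would be openssl_advice altra "$(openssl version | awk '{print $2}')", which feeds in the version reported by the installed openssl binary.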


OpenSSL Crypto PMD supports the following algorithm pairs:

Cipher algorithms:

  • RTE_CRYPTO_CIPHER_3DES_CBC
  • RTE_CRYPTO_CIPHER_AES_CBC
  • RTE_CRYPTO_CIPHER_AES_CTR
  • RTE_CRYPTO_CIPHER_3DES_CTR
  • RTE_CRYPTO_CIPHER_DES_DOCSISBPI

Authentication algorithms:

  • RTE_CRYPTO_AUTH_AES_GMAC
  • RTE_CRYPTO_AUTH_MD5
  • RTE_CRYPTO_AUTH_SHA1
  • RTE_CRYPTO_AUTH_SHA224
  • RTE_CRYPTO_AUTH_SHA256
  • RTE_CRYPTO_AUTH_SHA384
  • RTE_CRYPTO_AUTH_SHA512
  • RTE_CRYPTO_AUTH_MD5_HMAC
  • RTE_CRYPTO_AUTH_SHA1_HMAC
  • RTE_CRYPTO_AUTH_SHA224_HMAC
  • RTE_CRYPTO_AUTH_SHA256_HMAC
  • RTE_CRYPTO_AUTH_SHA384_HMAC
  • RTE_CRYPTO_AUTH_SHA512_HMAC

AEAD algorithms:

  • RTE_CRYPTO_AEAD_AES_GCM
  • RTE_CRYPTO_AEAD_AES_CCM

Asymmetric Crypto algorithms:

  • RTE_CRYPTO_ASYM_XFORM_RSA
  • RTE_CRYPTO_ASYM_XFORM_DSA
  • RTE_CRYPTO_ASYM_XFORM_DH
  • RTE_CRYPTO_ASYM_XFORM_MODINV
  • RTE_CRYPTO_ASYM_XFORM_MODEX

Download and Install OpenSSL 3.4.0:
The OpenSSL libraries shipped with each OS distribution differ considerably, which causes performance to vary across distributions. For consistent performance, download and install OpenSSL 3.4.0.


On Ampere Altra family:

wget https://github.com/openssl/openssl/archive/refs/tags/openssl-3.4.0.tar.gz
tar zxf openssl-3.4.0.tar.gz
cd openssl-openssl-3.4.0
./Configure -mcpu=neoverse-n1
make -j`nproc`
sudo make -j `nproc` install
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/openssl.conf
sudo ldconfig

On Ampere AmpereOne family:

wget https://github.com/openssl/openssl/archive/refs/tags/openssl-3.4.0.tar.gz
tar zxf openssl-3.4.0.tar.gz
cd openssl-openssl-3.4.0
./Configure -mcpu=ampere1a
make -j`nproc`
sudo make -j `nproc` install
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/openssl.conf
sudo ldconfig

Reference: https://doc.dpdk.org/guides/cryptodevs/openssl.html

IPSec Multi-buffer library for AArch64

The IPSec Multi-buffer library for AArch64 supports the following algorithm pairs:

Cipher algorithm:

  • SNOW3G-UEA2
  • ZUC-EEA3
  • ZUC-EEA3-256

Authentication algorithm:

  • SNOW3G-UIA2
  • ZUC-EIA3
  • ZUC-EIA3-256

Download and build ipsec-mb library:

git clone https://gitlab.arm.com/arm-reference-solutions/ipsec-mb
cd ipsec-mb
make
make install PREFIX=/usr/local/

Reference: https://doc.dpdk.org/guides/cryptodevs/snow3g.html

Build DPDK with Crypto Support

On CentOS

export LD_LIBRARY_PATH=/home/ampere/AArch64cryptolib:/usr/local/lib:/lib64
export PKG_CONFIG_PATH=/home/ampere/AArch64cryptolib/pkgconfig:/usr/local/lib/pkgconfig:/lib64/pkgconfig

On Ubuntu

export LD_LIBRARY_PATH=/home/ampere/AArch64cryptolib:/usr/local/lib:/usr/local/lib/aarch64-linux-gnu:/lib/aarch64-linux-gnu
export PKG_CONFIG_PATH=/home/ampere/AArch64cryptolib/pkgconfig:/usr/local/lib/pkgconfig:/usr/local/lib/aarch64-linux-gnu/pkgconfig:/lib/aarch64-linux-gnu/pkgconfig

Build DPDK

wget https://fast.dpdk.org/rel/dpdk-24.07.tar.gz
tar zxf dpdk-24.07.tar.gz
cd dpdk-24.07
meson build
ninja -C build
ninja -C build install

Check that the list of supported crypto devices in the build configuration includes armv8, openssl, and ipsec_mb:

  • armv8, bcmfs, caam_jr, ccp, cnxk, dpaa_sec, dpaa2_sec, ipsec_mb, mlx5, nitrox, null, octeontx, openssl, scheduler, virtio
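As a complement to reading the configuration list, the build directory can be inspected after ninja finishes. The following is a rough sketch under an assumption: that each crypto PMD is built as drivers/librte_crypto_<name>.so or drivers/librte_crypto_<name>.a inside the build directory. The helper name missing_crypto_pmds is hypothetical, and the library naming should be confirmed against your own build tree.

```shell
# Hypothetical helper: print the crypto PMDs with no built library.
# $1 = DPDK build directory, remaining arguments = PMD names to check.
# Assumes libraries are named drivers/librte_crypto_<name>.so or .a.
missing_crypto_pmds() {
    build_dir=$1
    shift
    missing=""
    for pmd in "$@"; do
        if [ ! -e "$build_dir/drivers/librte_crypto_$pmd.so" ] &&
           [ ! -e "$build_dir/drivers/librte_crypto_$pmd.a" ]; then
            missing="$missing $pmd"
        fi
    done
    echo "${missing# }"
}
```

Running missing_crypto_pmds build armv8 openssl ipsec_mb from the dpdk-24.07 directory would print nothing if all three drivers were built. A non-empty result usually points at a library that meson could not find through PKG_CONFIG_PATH or the linker path.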

Crypto Performance Test on Ampere Altra Q80-30

The following performance tests were run on an Ampere Altra Q80-30. Results will differ on other SKUs. Refer to the “Tuning Guide” section below for hardware, BIOS, and OS settings before performance testing.


Test AES-CBC-128/SHA1-HMAC Performance with single core using crypto_armv8 on Ampere Altra Q80-30:

sudo usertools/dpdk-hugepages.py --setup 10G
cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_armv8 -l 0,1 -n 8 -- \
  --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput \
  --auth-key-sz 64 --cipher-key-sz 16 --devtype crypto_armv8 --cipher-iv-sz 16 \
  --auth-op generate --burst-sz 32 --total-ops 10000000 --silent --digest-sz 12 \
  --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps    Cycles/Buf
1         64        32          10000000  10000000  0           0           6.4784  3.3169  3.86
1         128       32          10000000  10000000  0           0           4.6469  4.7584  5.38
1         256       32          10000000  10000000  0           0           2.9786  6.1002  8.39
1         512       32          10000000  10000000  0           0           1.7654  7.2312  14.16
1         1024      32          10000000  10000000  0           0           0.9730  7.9705  25.69
1         2048      32          10000000  10000000  0           0           0.5129  8.4039  48.74
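The Gbps column in the output above can be reproduced from the MOps column and the buffer size: millions of operations per second, times bytes per buffer, times 8 bits, divided by 1000. The short sketch below shows that arithmetic. The function name mops_to_gbps is hypothetical and is not part of DPDK.

```shell
# Hypothetical helper: derive Gbps from MOps and the buffer size in bytes.
# Gbps = MOps * bytes * 8 / 1000
mops_to_gbps() {
    awk -v mops="$1" -v bytes="$2" \
        'BEGIN { printf "%.4f\n", mops * bytes * 8 / 1000 }'
}
```

For example, mops_to_gbps 6.4784 64 gives 3.3169, matching the first row of the table. The same relation is handy for comparing runs made with different buffer sizes.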

Test AES-CBC-128/SHA2-256-HMAC Performance with single core using crypto_armv8:

cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_armv8 -l 0,1 -n 8 -- \
  --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput \
  --auth-key-sz 64 --cipher-key-sz 16 --devtype crypto_armv8 --cipher-iv-sz 16 \
  --auth-op generate --burst-sz 32 --total-ops 10000000 --silent --digest-sz 12 \
  --auth-algo sha2-256-hmac --cipher-algo aes-cbc --cipher-op encrypt

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps    Cycles/Buf
1         64        32          10000000  10000000  0           0           6.7249  3.4432  3.72
1         128       32          10000000  10000000  0           0           4.8760  4.9930  5.13
1         256       32          10000000  10000000  0           0           3.0952  6.3389  8.08
1         512       32          10000000  10000000  0           0           1.8318  7.5031  13.65
1         1024      32          10000000  10000000  0           0           1.0093  8.2681  24.77
1         2048      32          10000000  10000000  0           0           0.5316  8.7102  47.03

Test AES-GCM-128 Performance with single core using crypto_openssl:

cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_openssl -l 0,1 -n 8 -- \
  --aead-key-sz 16 --buffer-sz 64,128,256,512,1024,2048 --optype aead --ptest throughput \
  --aead-aad-sz 16 --devtype crypto_openssl --aead-op encrypt --burst-sz 32 \
  --total-ops 10000000 --silent --digest-sz 16 --aead-algo aes-gcm --aead-iv-sz 12

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps     Cycles/Buf
1         64        32          10000000  10000000  0           0           5.0681  2.5949   4.93
1         128       32          10000000  10000000  0           0           4.5814  4.6914   5.46
1         256       32          10000000  10000000  0           0           3.6966  7.5706   6.76
1         512       32          10000000  10000000  0           0           2.7922  11.4367  8.95
1         1024      32          10000000  10000000  0           0           1.8881  15.4671  13.24
1         2048      32          10000000  10000000  0           0           1.1478  18.8056  21.78

Test AES-CTR/AES-CMAC Performance with single core using crypto_openssl:

cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_openssl -l 0,1 -n 8 -- \
  --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput \
  --auth-key-sz 32 --cipher-key-sz 16 --devtype crypto_openssl --cipher-iv-sz 16 \
  --auth-op generate --burst-sz 32 --total-ops 10000000 --digest-sz 12 \
  --auth-algo aes-cmac --cipher-algo aes-ctr --cipher-op encrypt

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps    Cycles/Buf
1         64        32          10000000  10000000  0           0           3.0675  1.5706  8.15
1         128       32          10000000  10000000  0           0           2.6728  2.7370  9.35
1         256       32          10000000  10000000  0           0           2.0764  4.2524  12.04
1         512       32          10000000  10000000  0           0           1.4550  5.9599  17.18
1         1024      32          10000000  10000000  0           0           0.8887  7.2800  28.13
1         2048      32          10000000  10000000  0           0           0.5055  8.2824  49.45

Test snow3g-uea2 cipher-only with single core using crypto_snow3g:

cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_snow3g -l 0,1 -n 8 -- \
  --devtype crypto_snow3g --ptest throughput --pool-sz 16384 --total-ops 10000000 \
  --burst-sz 32 --optype cipher-only --cipher-algo snow3g-uea2 --cipher-iv-sz 16 \
  --auth-op generate --cipher-key-sz 16 --buffer-sz 64,128,256,512,1024,2048 \
  --cipher-op encrypt

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps    Cycles/Buf
1         64        32          10000000  10000000  0           0           3.7096  1.8993  6.74
1         128       32          10000000  10000000  0           0           2.8556  2.9242  8.75
1         256       32          10000000  10000000  0           0           1.9718  4.0383  12.68
1         512       32          10000000  10000000  0           0           1.2173  4.9859  20.54
1         1024      32          10000000  10000000  0           0           0.6901  5.6535  36.23
1         2048      32          10000000  10000000  0           0           0.3693  6.0503  67.70

Crypto Performance Test on AmpereOne A192-32X

The following performance test was performed on AmpereOne A192-32X. The performance data will be different if a different processor model is used.

Please refer to the section “Tuning Guide” for hardware, BIOS, OS settings before performance testing.


Test AES-CBC-128/SHA1-HMAC performance with single core using crypto_armv8:

sudo usertools/dpdk-hugepages.py --setup 10G
cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_armv8 -l 0,1 -n 8 -- \
  --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput \
  --auth-key-sz 64 --cipher-key-sz 16 --devtype crypto_armv8 --cipher-iv-sz 16 \
  --auth-op generate --burst-sz 32 --total-ops 10000000 --silent --digest-sz 12 \
  --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps    Cycles/Buf
1         64        32          10000000  10000000  0           0           8.1328  4.1640  122.96
1         128       32          10000000  10000000  0           0           5.7694  5.9079  173.33
1         256       32          10000000  10000000  0           0           3.4485  7.0625  289.98
1         512       32          10000000  10000000  0           0           1.9994  8.1894  500.16
1         1024      32          10000000  10000000  0           0           1.0866  8.9013  920.31
1         2048      32          10000000  10000000  0           0           0.5679  9.3045  1760.87
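The Cycles/Buf values here are far larger than the Altra figures even though throughput is higher, so the column should not be compared across the two platforms. One reading that fits the published numbers is that the counter ticks at a fixed timer frequency rather than at the core clock. The sketch below tests that reading. The 25 MHz and 1 GHz counter frequencies are assumptions inferred from the tables, not values stated by Ampere or DPDK, and the helper name ticks_per_buf is hypothetical.

```shell
# Hypothetical helper: ticks per buffer for an assumed counter frequency.
# $1 = counter frequency in Hz (an assumption), $2 = MOps from the table.
ticks_per_buf() {
    awk -v hz="$1" -v mops="$2" \
        'BEGIN { printf "%.2f\n", hz / (mops * 1000000) }'
}
```

With these assumed frequencies, ticks_per_buf 25000000 6.4784 yields 3.86 (the first Altra row) and ticks_per_buf 1000000000 8.1328 yields 122.96 (the first AmpereOne row). If the assumption holds, MOps and Gbps are the columns to use when comparing processors.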

Test AES-GCM-128 Performance with single core using crypto_openssl:

cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_openssl -l 0,1 -n 8 -- \
  --aead-key-sz 16 --buffer-sz 64,128,256,512,1024,2048 --optype aead --ptest throughput \
  --aead-aad-sz 16 --devtype crypto_openssl --aead-op encrypt --burst-sz 32 \
  --total-ops 10000000 --silent --digest-sz 16 --aead-algo aes-gcm --aead-iv-sz 12

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps     Cycles/Buf
1         64        32          10000000  10000000  0           0           5.5482  2.8407   180.24
1         128       32          10000000  10000000  0           0           5.0311  5.1518   198.76
1         256       32          10000000  10000000  0           0           4.1310  8.4603   242.07
1         512       32          10000000  10000000  0           0           3.5078  14.3677  285.08
1         1024      32          10000000  10000000  0           0           2.5100  20.5618  398.41
1         2048      32          10000000  10000000  0           0           1.5984  26.1889  625.61

Test AES-CTR/AES-CMAC Performance with single core using crypto_openssl:

cd build/app
./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_openssl -l 0,1 -n 8 -- \
  --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput \
  --auth-key-sz 32 --cipher-key-sz 16 --devtype crypto_openssl --cipher-iv-sz 16 \
  --auth-op generate --burst-sz 32 --total-ops 10000000 --digest-sz 12 \
  --auth-algo aes-cmac --cipher-algo aes-ctr --cipher-op encrypt

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq  MOps    Gbps    Cycles/Buf
1         64        32          10000000  10000000  0           0           3.7295  1.9095  268.14
1         128       32          10000000  10000000  0           0           3.1749  3.2511  314.97
1         256       32          10000000  10000000  0           0           2.4479  5.0133  408.51
1         512       32          10000000  10000000  0           0           1.7072  6.9927  585.75
1         1024      32          10000000  10000000  0           0           1.0658  8.7312  938.24
1         2048      32          10000000  10000000  0           0           0.6075  9.9540  1645.97

Performance Scaling with Core Counts

Crypto throughput on Ampere processors scales approximately linearly with core count. The following table shows AES-GCM-128 throughput with a buffer size of 1024 bytes at different core counts on Altra Q80-30:

Core Count  Throughput (Gbps)
1           15.41
2           31.12
4           61.91
8           124.18
16          248.10

And snow3g-uea2 cipher-only throughput with Buffer size=1024 at different core counts:

Core Count  Throughput (Gbps)
1           5.65
2           11.31
4           22.60
8           45.28
16          90.31
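One way to quantify the scaling in the two tables above is to divide the measured throughput by the core count times the single-core throughput. The sketch below does that with awk. The helper name scaling_efficiency is hypothetical, and the figures passed to it come straight from the tables.

```shell
# Hypothetical helper: scaling efficiency in percent.
# $1 = single-core Gbps, $2 = core count, $3 = measured Gbps at that count.
scaling_efficiency() {
    awk -v base="$1" -v n="$2" -v tput="$3" \
        'BEGIN { printf "%.1f\n", 100 * tput / (n * base) }'
}
```

For AES-GCM-128 at 16 cores, scaling_efficiency 15.41 16 248.10 gives 100.6, and for snow3g-uea2, scaling_efficiency 5.65 16 90.31 gives 99.9. Values slightly above 100 are consistent with rounding and run-to-run variation in the single-core figure.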

Run l2fwd with Crypto

DPDK provides an example application, l2fwd-crypto, which performs L2 forwarding with crypto operations. To run this test, follow the DPDK setup and tuning guide and set up Pktgen-dpdk as the packet generator.


Forwarding with AES-GCM-128bit crypto, 1 port, 1 core, pktsize=1024B on Altra Q80-30:

./build/l2fwd-crypto -l 10-15 -n 8 -a 0000:01:00.0 --vdev crypto_openssl -- \
  -p 0x1 --chain AEAD --aead_op ENCRYPT --aead_algo aes-gcm -T 1

Statistics for port 0 ------------------------------
Packets sent: 1339751
Packets received: 1339780
Packets dropped: 0
Crypto statistics ==================================
Statistics for cryptodev 0 -------------------------
Packets enqueued: 1339780
Packets dequeued: 1339751
Packets errors: 0

Forwarding with AES-CBC/SHA1-HMAC crypto, 1 port, 1 core, pktsize=1024B:

./build/l2fwd-crypto -l 10-15 -n 8 -a 0000:01:00.0 --vdev crypto_armv8 -- \
  -p 0x1 --chain CIPHER_HASH --cipher_op ENCRYPT --cipher_algo aes-cbc \
  --cipher_key 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f \
  --auth_op GENERATE --auth_algo sha1-hmac \
  --auth_key 10:11:12:13:14:15:16:17:18:19:1a:1b:1c:1d:1e:1f -T 1

Statistics for port 0 ------------------------------
Packets sent: 869828
Packets received: 869856
Packets dropped: 0
Crypto statistics ==================================
Statistics for cryptodev 0 -------------------------
Packets enqueued: 869856
Packets dequeued: 869828
Packets errors: 0

Tuning Guide

Hardware Configuration

  • A memory population of 1 DIMM per channel is recommended.

BIOS Settings

  • Advanced->ACPI Settings->Enable CPPC [Disabled]
  • Advanced->ACPI Settings->Enable LPI [Disabled]
  • Chipset->CPU Configuration->ANC mode [Monolithic]
  • Chipset->CPU Configuration-> SLC Replacement Policy [Enhanced Least Recently Used]
  • Chipset->CPU Configuration->L1/L2 Prefetch [Enabled]
  • Chipset->CPU Configuration->SLC as L3$ [Disabled]

OS Settings

  • Set frequency governor to performance mode
  • Use a GCC version that supports Altra or AmpereOne, together with the recommended build options.
  • Configure hugepages. Example on CentOS with a 64K kernel page size:
    • echo 100 > /sys/devices/system/node/node0/hugepages/hugepages-524288kB/nr_hugepages
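The hugepage count depends on how much memory the tests need and on the hugepage size, which is 524288 kB in the example above. The sketch below computes a page count and builds the matching sysfs path. Both helper names are hypothetical, and the NUMA node and page size are parameters you would adapt to your system.

```shell
# Hypothetical helper: hugepages needed to cover a memory target.
# $1 = memory in MiB, $2 = hugepage size in kB (as shown in sysfs).
hugepages_needed() {
    awk -v total="$1" -v page="$2" 'BEGIN {
        pages = total * 1024 / page
        n = int(pages)
        if (n < pages) n++        # round up to a whole page
        print n
    }'
}

# Hypothetical helper: sysfs file that holds the per-node hugepage count.
# $1 = NUMA node number, $2 = hugepage size in kB.
hugepage_sysfs_path() {
    printf '/sys/devices/system/node/node%s/hugepages/hugepages-%skB/nr_hugepages\n' "$1" "$2"
}
```

For the 10G reserved in the performance tests, hugepages_needed 10240 524288 returns 20. The value could then be written with something like hugepages_needed 10240 524288 | sudo tee "$(hugepage_sysfs_path 0 524288)", which also avoids the problem of a shell redirect not running with root privileges.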

Library Version Selection

  • Check out the latest library code for AArch64cryptolib and ipsec-mb.
  • Use OpenSSL version >= 3.2.0 or 1.1.1 on Ampere Altra, and version >= 3.4.0 on AmpereOne.
  • DPDK version >= v24.07 is recommended for the AmpereOne family.
Created At : April 29th 2024, 4:25:17 pm
Last Updated At : May 30th 2025, 7:20:24 pm
© 2025 Ampere Computing LLC. All rights reserved. Ampere, Altra and the A and Ampere logos are registered trademarks or trademarks of Ampere Computing.