
Samsung FIO Performance

Solution Brief

Ampere - Empowering What’s Next

A growing number of use cases require high-capacity, high-performance data storage. Today, such storage is deployed alongside HPC, AI/ML, CDN and media streaming, and analytics solutions.

When designing large-scale, high-performance storage, solution architects often have to trade off performance (the throughput required for a given capacity) against TCO, where storage servers are a major operational expense. To date, performance at a given capacity has been limited by the server, not by the storage drives. The usual workaround, which is not ideal, is to use smaller-capacity drives. This increases the number of servers and the TCO, especially when solution architects choose x86 platforms for their designs.

A better approach to high-capacity, high-performance storage at large scale is to use Ampere Altra family processors, for the following reasons:

The Ampere Altra (80 cores) and Ampere Altra Max (128 cores) AArch64 processors are complete system-on-chip (SoC) solutions built for large-scale, high-performance storage. In addition to incorporating many high-performance cores, Ampere’s innovative architecture delivers predictable high performance, linear scaling, and high energy efficiency. More importantly, the high I/O bandwidth provides direct connections to multiple PCIe devices such as NVMe SSDs, which are essential for heavy workloads of all sizes, from the edge to the cloud.
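As a rough illustration of the I/O bandwidth point, the sketch below estimates the PCIe lanes and raw link bandwidth that 24 U.2 drives would need (24 is the drive count used in the tests later in this brief). It assumes each drive uses a PCIe Gen4 x4 link and applies the Gen4 signaling rate of 16 GT/s with 128b/130b encoding. The x4 link width and all constant names are assumptions made for illustration, and the results are raw link rates before protocol overhead, not measurements.

```python
# Back-of-the-envelope PCIe budget for a 24-drive NVMe configuration.
# Assumption: each U.2 drive connects over a PCIe Gen4 x4 link.

GEN4_TRANSFERS_PER_SEC = 16e9      # 16 GT/s per lane (PCIe Gen4)
ENCODING_EFFICIENCY = 128 / 130    # 128b/130b line encoding
LANES_PER_DRIVE = 4                # assumed x4 link per drive
DRIVE_COUNT = 24                   # drive count used in this brief
LANES_PER_PROCESSOR = 128          # from the processor specifications


def lane_bandwidth_gb_per_s() -> float:
    """Raw one-direction bandwidth of a single Gen4 lane, in GB/s."""
    return GEN4_TRANSFERS_PER_SEC * ENCODING_EFFICIENCY / 8 / 1e9


def drive_link_bandwidth_gb_per_s() -> float:
    """Raw one-direction bandwidth of one drive's assumed x4 link, in GB/s."""
    return lane_bandwidth_gb_per_s() * LANES_PER_DRIVE


lanes_needed = LANES_PER_DRIVE * DRIVE_COUNT
aggregate_gb_per_s = drive_link_bandwidth_gb_per_s() * DRIVE_COUNT

if __name__ == "__main__":
    print(f"Lanes needed: {lanes_needed} (each processor lists {LANES_PER_PROCESSOR})")
    print(f"Per-lane raw bandwidth: {lane_bandwidth_gb_per_s():.3f} GB/s")
    print(f"Per-drive raw link bandwidth: {drive_link_bandwidth_gb_per_s():.2f} GB/s")
    print(f"Aggregate raw link bandwidth: {aggregate_gb_per_s:.1f} GB/s")
```

Under these assumptions, the 24 drives need 96 lanes in total. How the Mt. Jade platform actually routes lanes between the two sockets and the drive bays is not described in this brief.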

In this solution brief, we present Flexible I/O (fio) performance data for Samsung PM1733a SSDs running on the dual-socket (2P) Mt. Jade platform with Ampere Altra and Ampere Altra Max processors.

Key Benefits

Ampere Altra Family Delivers Disruptive Value for Large-Scale, High-Performance Storage Solutions

  • Higher core counts than competitors.
  • Predictable and linear workload performance.
  • Low power consumption, which reduces carbon footprint.
  • Higher IOPS than competitors: ~30M IOPS with Ampere Altra Max.

Flexible I/O Test Setup

[Figure: Flexible I/O test setup]

Samsung PM1733a NVMe SSD Flexible I/O Configuration

Random read:

[global]
name=random
rw=randread
bs=4K
direct=1
numjobs=16
runtime=600
ioengine=libaio
iodepth=64
norandommap
group_reporting
randrepeat=1
random_generator=tausworthe64

Random write:

[global]
name=randomwrite
rw=randwrite
bs=4K
direct=1
numjobs=16
ramp_time=20
runtime=600
ioengine=libaio
iodepth=64
norandommap
group_reporting
randrepeat=1
random_generator=tausworthe64

Sequential read:

[global]
name=sequence
rw=read
bs=128K
direct=1
numjobs=4
runtime=600
ioengine=libaio
iodepth=64
norandommap
group_reporting
randrepeat=1
random_generator=tausworthe64

Sequential write:

[global]
name=sequence
rw=write
bs=128K
direct=1
numjobs=4
runtime=600
ioengine=libaio
iodepth=64
norandommap
group_reporting
randrepeat=1
random_generator=tausworthe64

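The configuration above lists only the [global] options for each workload. The brief does not show how the drives were assigned to jobs. The following minimal sketch shows one way such a job file could be generated for several drives, using one job section per drive with a filename= entry. The section names (drive0, drive1, ...), the device paths, and the helper render_job_file are hypothetical and are not taken from the tests described here.

```python
# Sketch: render the four workloads from this brief as fio job-file text.
# The per-drive sections and device paths are illustrative assumptions.
# Caution: write workloads against a raw device destroy the data on it.

COMMON = {
    "direct": 1,
    "runtime": 600,
    "ioengine": "libaio",
    "iodepth": 64,
    "randrepeat": 1,
    "random_generator": "tausworthe64",
}
FLAGS = ["norandommap", "group_reporting"]

# Workload-specific options, copied from the configuration in this brief.
WORKLOADS = [
    {"name": "random", "rw": "randread", "bs": "4K", "numjobs": 16},
    {"name": "randomwrite", "rw": "randwrite", "bs": "4K", "numjobs": 16,
     "ramp_time": 20},
    {"name": "sequence", "rw": "read", "bs": "128K", "numjobs": 4},
    {"name": "sequence", "rw": "write", "bs": "128K", "numjobs": 4},
]


def render_job_file(workload: dict, devices: list) -> str:
    """Return job-file text: one [global] section plus one section per device."""
    lines = ["[global]"]
    for key, value in {**workload, **COMMON}.items():
        lines.append(f"{key}={value}")
    lines.extend(FLAGS)
    for index, device in enumerate(devices):
        lines.append("")
        lines.append(f"[drive{index}]")       # hypothetical section name
        lines.append(f"filename={device}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical device paths; substitute the drives under test.
    print(render_job_file(WORKLOADS[0], ["/dev/nvme0n1", "/dev/nvme1n1"]))
```

With this layout, numjobs applies to each drive section, so the number of fio jobs grows with the drive count. Whether the original tests were structured this way is not stated.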
Ampere Altra Processor 
  • 80 64-bit CPU cores up to 3.30 GHz 
  • 64 KB L1 I-cache, 64 KB L1 D-cache per core 
  • 1 MB L2 cache per core 
  • 32 MB System Level Cache (SLC) 
  • 2x full-width (128b) SIMD 
  • Coherent mesh-based interconnect 

Memory

  • 8x 72-bit DDR4-3200 channels 
  • ECC and DDR4 RAS 
  • Up to 16 DIMMs and 4 TB addressable memory 

Connectivity

  • 128 lanes of PCIe Gen4 
  • Coherent multi-socket support 
  • 4 x16 CCIX lanes 

Technology & Functionality

  • Arm v8.2+, SBSA Level 4 
  • Advanced Power Management 

Performance

SPECrate® 2017 Integer Estimated: 300

[Chart: Random Read/Write (MBps), bs=4k, jobs=16, iodepth=64, 10 min]

Note: our random write results are well above Samsung's specification for this drive. Over a longer run, they could fall toward the specified ~170k IOPS for random write.
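Because the random results are charted as throughput while the drive specification is quoted in IOPS, it can help to convert between the two. The sketch below shows the arithmetic for a 4K block size, assuming bs=4K means 4096 bytes (fio's default base of 1024) and decimal megabytes. The function names are illustrative only.

```python
# Convert between IOPS and throughput for a fixed block size.
# Assumptions: bs=4K is 4096 bytes; 1 MB = 1,000,000 bytes.

BLOCK_4K_BYTES = 4 * 1024
BYTES_PER_MB = 1_000_000


def iops_to_mb_per_s(iops: float, block_size_bytes: int) -> float:
    """Throughput in MB/s for a given IOPS rate and block size."""
    return iops * block_size_bytes / BYTES_PER_MB


def mb_per_s_to_iops(mb_per_s: float, block_size_bytes: int) -> float:
    """IOPS for a given throughput in MB/s and block size."""
    return mb_per_s * BYTES_PER_MB / block_size_bytes


if __name__ == "__main__":
    spec_write_iops = 170_000  # ~170k IOPS random write, as quoted in the note
    mbps = iops_to_mb_per_s(spec_write_iops, BLOCK_4K_BYTES)
    print(f"{spec_write_iops} IOPS at 4K is about {mbps:.2f} MB/s")
```

Under these assumptions, ~170k IOPS at 4K corresponds to roughly 696 MB/s per drive, which gives a reference point when reading a random write chart plotted in MBps.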

[Chart: Sequential Read/Write (MBps), bs=128k, jobs=4, iodepth=64, 10 min]

[Chart: CPU Utilization (%)]

Ampere Altra Max Processor
  • 128 Arm v8.2+ 64-bit CPU cores up to 3.0 GHz
  • 64 KB L1 I-cache, 64 KB L1 D-cache per core
  • 1 MB L2 cache per core
  • 16 MB System Level Cache (SLC)
  • 2x full-width (128b) SIMD
  • Coherent mesh-based interconnect – Distributed snoop filtering

Memory

  • 8x 72-bit DDR4-3200 channels
  • ECC, Symbol-based ECC, and DDR4 RAS features
  • Up to 16 DIMMs and 4 TB/socket

Connectivity

  • 128 lanes of PCIe Gen4 
  • Coherent multi-socket support 
  • 4 x16 CCIX lanes 

Technology & Functionality

  • Arm v8.2+, SBSA Level 4 
  • Advanced Power Management 

Performance

SPECrate® 2017_int_base: 359

[Chart: Random Read/Write (kIOPS), Altra Max]

Note: our random write results are well above Samsung's specification for this drive. Over a longer run, they could fall toward the specified ~170k IOPS for random write.

[Chart: Sequential Read/Write (MBps), Altra Max]

[Chart: CPU Utilization (%)]

Altra Max vs. Altra FIO Benchmark

[Chart: Random Read (kIOPS), Altra vs. Altra Max]

[Chart: CPU Utilization of Random Read, Altra vs. Altra Max]

Benchmarking Results and Conclusions

We ran a series of Flexible I/O (fio) tests to characterize the performance of Samsung drives on Ampere Altra and Ampere Altra Max processors. We used the Mt. Jade reference platform, which is equipped with two Ampere Altra family processors and supports up to 24 U.2 form-factor drives. The tests used the Ampere Altra Q80-30 processor and the Ampere Altra Max M128-30 processor, both of which support a 3.0 GHz operating frequency.

With a full complement of 24 Samsung NVMe SSDs, we can drive sustained peak load into the drives. The fio tests showed that the Ampere Altra Max processor can saturate 24 high-performance, high-capacity Samsung drives at ~30M IOPS for reads, and that the Ampere Altra processor can saturate 24 such drives at ~24M IOPS for reads.
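To put the headline figures in context, the sketch below normalizes the approximate read IOPS by drive count and by core count on the 2P platform. It assumes the read figures refer to 4K random reads (bs=4K taken as 4096 bytes) and simply divides by the total number of cores, without accounting for actual CPU utilization. The dictionary layout and function name are illustrative, and the inputs are the rounded values quoted above.

```python
# Normalize the approximate read IOPS quoted in this brief.
# Assumptions: 4K (4096-byte) random reads; per-core figures divide by all
# cores in the 2P system and ignore measured CPU utilization.

SOCKETS = 2
DRIVES = 24
BLOCK_SIZE_BYTES = 4096

RESULTS = {
    "Altra Max M128-30": {"read_iops": 30e6, "cores_per_socket": 128},
    "Altra Q80-30": {"read_iops": 24e6, "cores_per_socket": 80},
}


def normalize(read_iops: float, cores_per_socket: int) -> dict:
    """Return per-drive IOPS, per-core IOPS, and aggregate GB/s."""
    total_cores = cores_per_socket * SOCKETS
    return {
        "iops_per_drive": read_iops / DRIVES,
        "iops_per_core": read_iops / total_cores,
        "aggregate_gb_per_s": read_iops * BLOCK_SIZE_BYTES / 1e9,
    }


if __name__ == "__main__":
    for name, result in RESULTS.items():
        stats = normalize(**result)
        print(
            f"{name}: {stats['iops_per_drive']:,.0f} IOPS/drive, "
            f"{stats['iops_per_core']:,.0f} IOPS/core, "
            f"{stats['aggregate_gb_per_s']:.2f} GB/s aggregate"
        )
```

With these inputs, the arithmetic gives roughly 1.25M read IOPS per drive for Altra Max and 1.0M per drive for Altra, a ratio of 1.25 between the two systems.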

These are very impressive numbers!

Footnotes

All data and information contained herein is for informational purposes only and Ampere reserves the right to change it without notice. This document may contain technical inaccuracies, omissions and typographical errors, and Ampere is under no obligation to update or correct this information. Ampere makes no representations or warranties of any kind, including but not limited to express or implied guarantees of noninfringement, merchantability, or fitness for a particular purpose, and assumes no liability of any kind. All information is provided “AS IS.” This document is not an offer or a binding commitment by Ampere. Use of the products contemplated herein requires the subsequent negotiation and execution of a definitive agreement or is subject to Ampere’s Terms and Conditions for the Sale of Goods.

System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may result in different measurements than those obtained by Ampere.

©2022 Ampere Computing. All Rights Reserved. Ampere, Ampere Computing, Altra and the ‘A’ logo are all registered trademarks or trademarks of Ampere Computing. Arm is a registered trademark of Arm Limited (or its subsidiaries). All other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Ampere Computing® / 4655 Great America Parkway, Suite 601 / Santa Clara, CA 95054 / amperecomputing.com

Created At : April 7th 2023, 10:11:01 am
Last Updated At : July 31st 2023, 5:07:32 pm