Best performance for your AI workloads
Get the best price-performance in the cloud and better value for AI inference compute
"Using Ampere A1 instances on OCI with integrated Ampere Optimized AI library, we managed to right-size compute providing price-performance advantage on deep learning inferencing relative to GPUs and to other CPUs. We found an order of magnitude or more reduction in cloud resource costs, measured at 4 operating points for 2 cloud vendors, while avoiding operational complexity for changes in model serving resource needs and cloud offerings."
-Madhuri Yechuri, CEO, Elotl
"Switching to Ampere-optimized Tensorflow running on OCI A1 instances has enabled us to achieve a 75 percent cost saving for the training of the algorithms for our plastics and fabrics identification machines, while lowering our CO2 emissions - thanks to Ampere Altra’s high energy efficiency."
-Martin Holicky, CEO, Matoha
Ampere Optimized AI Frameworks deliver a significant inference performance improvement to applications developed on all major AI frameworks. Ampere AI currently supports PyTorch, TensorFlow, and ONNX Runtime.
All Docker images can be conveniently downloaded from the Ampere Computing AI Docker Hub. The software is free and runs seamlessly on all Ampere products.
High Performance: Up to 4x performance advantage compared to CPUs built on x86 architecture
Energy Efficiency: 2.8x lower power consumption than x86 processors and a 3x smaller footprint
Scalability: Optimized architecture with core counts that surpass those of AMD and Intel x86 processors and improved memory bandwidth, further accelerated by Ampere Optimized AI Frameworks.
Compatibility: A robust ecosystem with extensive support for leading AI frameworks, libraries, and software tools, enabling effortless integration.
System configurations, components, software versions, and testing environments that differ from those used in Ampere's tests may result in different measurements than those obtained by Ampere. The system configurations and components used in our testing are detailed here.
This seamless integration with any AI framework accelerates inference without any accuracy loss, conversions, or model retraining. The architecture is diagrammed in the figure below.
Ampere helps customers achieve superior performance for AI workloads by integrating optimized inference layers into common AI frameworks.
Framework Integration Layer: Provides full compatibility with popular developer frameworks. The software works with trained networks "as is"; no conversions or retraining are required (see the sketch following this list).
Model Optimization Layer: Implements techniques such as structural network enhancements, changes to the processing order for efficiency, and data flow optimizations, without accuracy degradation.
Hardware Acceleration Layer: Includes a "just-in-time" optimization compiler that utilizes a small number of microkernels optimized for Ampere processors. This approach allows the inference engine to deliver high performance on all supported frameworks.
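To make the "as is" claim concrete, here is a minimal sketch of what a drop-in workload looks like: ordinary PyTorch inference code of the kind an Ampere Optimized PyTorch build would accelerate transparently. The model choice and input shape are illustrative assumptions, not taken from Ampere's documentation.

# Minimal sketch (assumption: torchvision's ResNet-50 as a stand-in model).
# Standard PyTorch inference code like this runs unchanged; inside an
# Ampere Optimized PyTorch container the optimized backend accelerates it.
import torch
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.eval()  # inference mode; no conversion or retraining of the network

x = torch.rand(1, 3, 224, 224)  # dummy image batch
with torch.no_grad():
    logits = model(x)
print(logits.argmax(dim=1))  # predicted class index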
Up to 5x performance acceleration
Seamless “out-of-the-box” deployment
All model types are supported
Native support of FP16 data format
Easily accessible, publicly hosted Docker images allowing for instantaneous deployment
Free of charge
The Ampere Model Library (AML) is a collection of optimized AI models pretrained on standard datasets. The library contains scripts for running the most common AI tasks. The models are available for Ampere customers to quickly and seamlessly build into their applications.
AML Benefits Include:
Benchmarking AI architectures with different frameworks (see the sketch below)
Testing the accuracy of AI models on application-specific data
Comparing AI architectures
Conducting tests on AI architectures
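As an illustration of the benchmarking use case, the sketch below times batched inference for a pretrained torchvision model. AML's actual scripts, model names, and command-line interface are not reproduced here; the timing loop and model choice are assumptions for illustration only.

# Hypothetical benchmark sketch (not an actual AML script): measures
# inference throughput for a pretrained ResNet-50.
import time
import torch
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
x = torch.rand(8, 3, 224, 224)  # batch of 8 dummy images

with torch.no_grad():
    for _ in range(3):  # warm-up iterations
        model(x)
    runs = 20
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    elapsed = time.perf_counter() - start

print(f"{runs * x.shape[0] / elapsed:.1f} images/sec")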
Ampere hardware uniquely offers native support for the FP16 data format, providing nearly 2x speedup over FP32 with almost no accuracy loss for most AI models.
FP16, or "half-precision floating point," represents numbers using 16 bits, making computations faster and requiring less memory compared to the FP32 (single precision) data format. The FP16 data format is widely adopted in AI applications, specifically for AI inference workloads. It offers distinct advantages, especially in tasks like neural network inference, which require intensive computations and real-time responsiveness. Utilizing FP16 enables accelerated processing of AI models, resulting in enhanced performance, optimized memory usage, and improved energy efficiency without compromising on accuracy.
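For illustration, the sketch below shows the memory saving and the small cast error in PyTorch; the same one-line cast (model.half()) converts an entire network's weights for FP16 inference. The tensor size and value range are illustrative assumptions.

# Sketch: FP16 uses 2 bytes per element versus 4 for FP32.
import torch

x32 = torch.rand(1000)   # single-precision (FP32) tensor
x16 = x32.half()         # cast to half precision (FP16)

print(x32.element_size())  # 4 bytes per element
print(x16.element_size())  # 2 bytes per element

# Round-trip cast error is tiny for values in a typical activation range,
# which is why accuracy loss is usually negligible.
err = (x32 - x16.float()).abs().max().item()
print(f"max cast error: {err:.2e}")  # on the order of 1e-4

# For a whole model, model.half() casts all weights the same way.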
Ampere Altra and Ampere Altra Max, with high-performance Ampere optimized frameworks, offer best-in-class AI inference performance for all AI applications developed in the most popular frameworks, including PyTorch, TensorFlow, and ONNX Runtime. The Ampere Model Library (AML) offers pretrained models to help accelerate AI development.
Ampere Altra Systems
Systems featuring Ampere Altra and Ampere Altra Max are flexible enough to meet the needs of any cloud deployment and come packed with Ampere's 80-core Altra or 128-core Altra Max processors.
Microsoft offers a comprehensive line of Azure Virtual Machines that can run a diverse and broad set of Linux workloads such as web servers, open-source databases, in-memory applications, big data analytics, gaming, media, and more.
Equinix Metal, an on-demand digital infrastructure platform, has created Gen3 configurations with Ampere Altra for common workloads, available on bare metal in minutes.
Whether your business is early in its journey or well on its way to digital transformation, Google Cloud can help you solve your toughest challenges.
Hewlett Packard Enterprise
The new HPE ProLiant RL300 Gen11 server is the first in a series of HPE ProLiant RL Gen11 servers that deliver next-generation compute performance with higher power efficiency using Ampere® Altra® and Ampere® Altra® Max cloud-native processors.
OCI Ampere A1
Ampere Altra and Oracle Cloud combine predictable performance, near-linear scaling, and secure architecture with the best price-performance in the market in the following shapes: