Ampere AI
AI Solutions

Solutions for AI on Ampere Altra

Ampere AI delivers world-class AI inference solutions! Ampere Optimized AI delivers a significant out-of-the-box inference performance boost to any existing model that runs on our supported frameworks. Ampere AI currently supports the following frameworks, available for free download here or from our supporting partners:

  • TensorFlow
  • PyTorch
  • ONNX Runtime

Ampere hardware natively supports the FP16 data format, providing nearly a 2X speedup over FP32 with almost no accuracy loss for most AI models.
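To see what the FP16 path looks like at the framework level, here is a minimal timing sketch in PyTorch. It is illustrative only: it assumes a PyTorch build with FP16 CPU kernels (such as Ampere Optimized PyTorch), and the measured speedup will vary with the build, the operation, and the model.

    # Minimal sketch: compare FP32 vs. FP16 matrix-multiply latency.
    # Assumes a PyTorch build with FP16 CPU kernels (e.g., Ampere
    # Optimized PyTorch); results on other builds will differ.
    import time
    import torch

    a32, b32 = torch.randn(2048, 2048), torch.randn(2048, 2048)
    a16, b16 = a32.half(), b32.half()  # same data, cast to FP16

    def bench(fn, iters=20):
        fn()  # warm-up
        start = time.perf_counter()
        for _ in range(iters):
            fn()
        return (time.perf_counter() - start) / iters

    t32 = bench(lambda: a32 @ b32)
    t16 = bench(lambda: a16 @ b16)
    print(f"FP32 {t32*1e3:.1f} ms, FP16 {t16*1e3:.1f} ms, "
          f"speedup {t32/t16:.2f}x")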

Ampere provides easy-to-use Docker containers that include Computer Vision and Natural Language Processing model examples and benchmarks, enabling developers to get started quickly. Download one of our Docker containers today to experience our best-in-class performance. Read more about our solutions below.
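If you want to try a container, the pull command looks roughly like docker pull amperecomputingai/pytorch; the organization and image names here are illustrative assumptions, so confirm them on the download page before use.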

AI Brief
AI Inference on Azure Brief
[Figure: Throughput Performance on TensorFlow 2.7]

Key Benefits

Ampere AI optimized frameworks + Ampere Altra Max deliver disruptive value on MLPerf benchmarks and more:

  • Predictable performance: up to 5X higher throughput than AWS Graviton using FP16
  • Predictable performance: more than 2X higher throughput than the x86 competition using FP16

System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may yield different measurements. The system configurations and components used in our testing are detailed here.

Downloads

Ampere Optimized AI Software

Ampere Altra and Ampere Altra Max, with high-performance Ampere optimized frameworks, offer best-in-class artificial intelligence inference performance for frameworks including TensorFlow, PyTorch, and ONNX Runtime. The Ampere Model Library (AML) offers pretrained models to help accelerate AI development.

Ampere Optimized PyTorch
Ampere's inference acceleration engine is fully integrated with the PyTorch framework. PyTorch models and software written with the PyTorch API run as-is, without any modifications.
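As a sketch of the “as-is” claim, the snippet below is ordinary PyTorch/torchvision inference code with nothing Ampere-specific in it; on Ampere Optimized PyTorch it runs unchanged (assuming torchvision is installed).

    # Ordinary PyTorch inference; nothing Ampere-specific, so it runs
    # unchanged on Ampere Optimized PyTorch.
    import torch
    from torchvision import models

    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    model.eval()  # switch to inference mode

    x = torch.randn(1, 3, 224, 224)  # dummy image batch
    with torch.no_grad():
        logits = model(x)
    print(logits.argmax(dim=1))  # predicted class index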
Ampere Optimized TensorFlow
Ampere's inference acceleration engine is fully integrated with the TensorFlow framework. TensorFlow models and software written with the TensorFlow API run as-is, without any modifications.
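The same holds for TensorFlow. A minimal Keras inference sketch (the ImageNet weights are downloaded on first run):

    # Ordinary Keras/TensorFlow inference; runs unchanged on Ampere
    # Optimized TensorFlow.
    import numpy as np
    import tensorflow as tf

    model = tf.keras.applications.ResNet50(weights="imagenet")
    x = np.random.rand(1, 224, 224, 3).astype("float32")  # dummy input
    preds = model.predict(x)
    print(tf.keras.applications.resnet50.decode_predictions(preds, top=1))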
Ampere Optimized ONNX Runtime
Ampere's inference acceleration engine is fully integrated with the ONNX Runtime framework. ONNX models and software written with the ONNX Runtime API run as-is, without any modifications.
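Likewise for ONNX Runtime. In the sketch below, "model.onnx" is a placeholder path to any exported ONNX model, and the dummy input shape must match what that model expects:

    # Ordinary ONNX Runtime inference; runs unchanged on Ampere
    # Optimized ONNX Runtime. "model.onnx" is a placeholder path.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("model.onnx")
    input_name = session.get_inputs()[0].name
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # match your model
    outputs = session.run(None, {input_name: x})
    print(outputs[0].shape)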
Ampere Model Library (AML)
Ampere Model Library (AML) is a collection of AI model architectures that handle the industry's most demanding workloads. Access the open AML GitHub repository to validate the performance of Ampere AI optimized frameworks on the Ampere Altra family of cloud-native processors.
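To get started, clone the repository and follow its README to run the benchmarks; the URL below assumes the AmpereComputingAI GitHub organization, so verify it against the repository link on this page: git clone https://github.com/AmpereComputingAI/ampere_model_library.git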
How It Works

Ampere Optimized Framework Components

Built around an inference acceleration engine, Ampere optimized frameworks offer significant benefits. Click here to view the demo.

[Figure: Ampere AI optimized framework architecture]

Ampere helps customers achieve superior performance for AI workloads by integrating optimized inference layers into common AI frameworks.

This seamless integration with the AI framework accelerates inference without any accuracy loss, conversions, or model retraining. The architecture is diagrammed in the figure above. The main components are as follows:

  • Framework Integration Layer: Provides full compatibility with popular developer frameworks. Software works with trained networks “as is”; no conversions or approximations are required.
  • Model Optimization Layer: Implements techniques such as structural network enhancements, changes to processing order for efficiency, and data flow optimizations, all without accuracy degradation.
  • Hardware Acceleration Layer: Includes a just-in-time optimization compiler that uses a small number of microkernels tuned for Ampere processors. This approach lets the inference engine deliver high performance while supporting multiple frameworks.
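In practice these layers are configured through environment variables rather than code changes. The sketch below uses AIO_NUM_THREADS, a variable that appears in Ampere's documentation for its optimized frameworks; treat the exact names as assumptions to verify against the release you download.

    # Illustrative configuration of the Ampere inference engine.
    # AIO_NUM_THREADS appears in Ampere's documentation; verify the
    # variable names against your release. Set them before importing
    # the framework so the acceleration layer picks them up.
    import os
    os.environ["AIO_NUM_THREADS"] = "16"  # threads for the acceleration layer

    import torch  # imported after configuration
    # ...build and run the model exactly as with stock PyTorch...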
FAQs

Ampere AI FAQ

Resources

Tutorials

Documentation

Publications

Testing and Regression

Solutions and Regression Testing

Frameworks

Regression testing status of the currently available Ampere Optimized AI images:

Ampere AI: 100% Verified / 0% Unverified
Recommended Systems
