NGINX on Google Cloud Workload Brief
Tau T2A Virtual Machines Powered by Ampere Altra Processors
Ampere® Altra® processors are designed from the ground up to deliver exceptional performance for Cloud Native applications such as NGINX. With an innovative architecture that delivers high performance, linear scalability, and amazing energy efficiency, Ampere Altra allows workloads to run in a predictable manner with minimal variance under increasing loads. This enables industry-leading performance per watt and a smaller carbon footprint for real-world workloads such as NGINX.
Google Cloud offers the cost-optimized Tau T2A VMs powered by Ampere Altra processors for scale-out Cloud Native workloads in multiple predetermined VM shapes – up to 48 vCPUs per VM, 4 GB of memory per vCPU, up to 32 Gbps networking bandwidth, and a wide range of network-attached storage options. These VMs are suitable for scale-out workloads such as web servers, containerized microservices, data-logging processing, media transcoding, and Java applications.
NGINX is an open-source, high-performance HTTP server that can also be used as a reverse proxy, load balancer, mail proxy, and HTTP cache. It uses a sophisticated event-driven architecture that allows it to scale to hundreds of thousands of concurrent connections on modern hardware. As of 2021, NGINX is the most popular web server among high-traffic websites, with a 33.8% market share according to W3Techs.
The Google Cloud T2A VMs powered by Ampere Altra processors deliver compelling performance across a variety of NGINX web server configurations, including the TLS-enabled server configuration used in our tests. The web server returns a static file for every request the load generator sends over HTTPS. Our performance metric is throughput (requests per second) with the p.99 latency kept under 5 ms.
The Ampere Altra-based Google Cloud T2A VMs outperform their legacy x86 VM counterparts. The performance results in Figure 1 show that the T2A VMs deliver 15% higher throughput than the N2 VMs and are roughly on par with the N2D VMs.
Price-performance represents a large portion of the Total Cost of Ownership (TCO) for cloud developers and is an important consideration for large-scale deployments in the cloud. The T2A VMs offer 45% better price-performance than the N2 VMs and 11% better price-performance than the N2D VMs.
In our tests, NGINX was hosted on 32 vCPU VMs. We compared the Ampere Altra-based GCP T2A VMs to the Intel® Xeon® Ice Lake-based N2 VMs and the AMD EPYC™ Milan-based N2D VMs. Like most open-source software, NGINX is natively supported on AArch64, and we used the package manager provided by the OS to install it on all the server VMs. For each client request, the NGINX server returns a static file compressed with gzip over HTTPS. wrk is used as the load generator, running on a separate 32 vCPU VM. Performance is measured in requests per second under a Service Level Agreement (SLA) of no more than 5 ms for the 99th percentile latency (p.99). We configured the thread count and concurrency levels to achieve maximum throughput while keeping the p.99 latency under 5 ms. The client and server VMs were placed in the same subnet to achieve the best network throughput.
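For readers who want to reproduce a comparable setup, the sketch below outlines one way to prepare a Debian 11 server VM and the client-side load generator. It is a minimal, illustrative sketch rather than Ampere's exact provisioning steps; the static file name test53k.txt is an assumed placeholder, while the certificate paths match the NGINX configuration shown later in this brief.
# Illustrative setup for a Debian 11 server VM (arm64 or amd64); not Ampere's exact scripts.
sudo apt-get update
sudo apt-get install -y nginx openssl              # Debian 11 ships NGINX 1.18.0
# Self-signed certificate at the paths referenced by the configuration below
sudo mkdir -p /etc/pki/tls/certs /etc/pki/tls/private
sudo openssl req -x509 -newkey rsa:2048 -nodes -days 365 -subj "/CN=nginx-test" \
    -keyout /etc/pki/tls/private/NGINX_TEST_SSL.key \
    -out /etc/pki/tls/certs/NGINX_TEST_SSL.crt
# A ~53 KB compressible text file for NGINX to serve (placeholder name)
base64 /dev/urandom | head -c 53000 | sudo tee /usr/share/nginx/html/test53k.txt >/dev/null
# On the client VM: build the wrk 4.2.0 load generator from source
sudo apt-get install -y build-essential libssl-dev git
git clone --branch 4.2.0 https://github.com/wg/wrk.git wrk-4.2.0
make -C wrk-4.2.0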
| | T2A Standard 32 | N2 Standard 32 | N2D Standard 32 |
|---|---|---|---|
| Number of vCPUs | 32 | 32 | 32 |
| Hourly cost | $1.232 | $1.553888 | $1.351872 |
| Operating system | Debian 11 | Debian 11 | Debian 11 |
| Kernel | 5.15.0-0.bpo.3-cloud-arm64 | 5.10.0-14-cloud-amd64 | 5.10.0-14-cloud-amd64 |
| Memory | 64 GB | 64 GB | 64 GB |
| Disk | 300 GB | 300 GB | 300 GB |
| NGINX version | 1.18.0 | 1.18.0 | 1.18.0 |
| wrk version | 4.2.0 | 4.2.0 | 4.2.0 |
| Static file size | 53 KB | 53 KB | 53 KB |
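As a sanity check, the price-performance figures quoted earlier follow directly from the hourly costs in the table above and the measured throughput ratios (roughly 1.15x the N2 and parity with the N2D). The short sketch below is illustrative only and is not part of Ampere's test harness.
# Illustrative derivation of relative price-performance (throughput per dollar).
# The throughput ratios are assumptions taken from the results discussed above.
awk 'BEGIN {
    t2a = 1.232; n2 = 1.553888; n2d = 1.351872;   # hourly costs from the table above
    perf_vs_n2  = 1.15;   # T2A throughput relative to N2
    perf_vs_n2d = 1.00;   # roughly on par with N2D
    printf "vs N2:  %.1f%% better price-performance\n", (perf_vs_n2  / (t2a / n2)  - 1) * 100;   # ~45%
    printf "vs N2D: %.1f%% better price-performance\n", (perf_vs_n2d / (t2a / n2d) - 1) * 100;   # ~10%; the 11% in this brief reflects measured T2A throughput slightly above parity
}'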
NGINX configuration used for our testing:
# For more information on configuration, see:
# * Official English Documentation: http://nginx.org/en/docs/
# * Official Russian Documentation: http://nginx.org/ru/docs/
user nginx;
worker_processes auto;
worker_rlimit_nofile 104857600;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;
events {
    use epoll;                       # epoll event method on Linux
    accept_mutex off;                # every worker accepts new connections directly
    worker_connections 10240;
}

http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log /var/log/nginx/access.log main;
    access_log off;                  # access logging disabled for the benchmark

    # Cache open file descriptors for the static file being served
    open_file_cache max=10240000 inactive=60s;
    open_file_cache_valid 80s;
    open_file_cache_min_uses 1;

    keepalive_requests 100000000000; # effectively unlimited requests per keep-alive connection
    keepalive_timeout 300s;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    types_hash_max_size 4096;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
        listen 443 ssl http2;
        listen [::]:443 ssl http2;
        server_name _;
        root /usr/share/nginx/html;

        ssl_certificate "/etc/pki/tls/certs/NGINX_TEST_SSL.crt";
        ssl_certificate_key "/etc/pki/tls/private/NGINX_TEST_SSL.key";
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2 TLSv1.3;
        ssl_ciphers "AES128+SHA256 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS !MEDIUM !RC4";
        ssl_prefer_server_ciphers on;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        error_page 404 /404.html;
        location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }
    }
}
The configuration snippet below shows the gzip compression settings we used with NGINX.
gzip on;
gzip_min_length 100;
gzip_buffers 8 32k;
gzip_types text/plain text/css application/x-javascript text/xml application/xml text/javascript;
gzip_vary on;
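To confirm that compression is actually applied with these settings, one can request the static file with an Accept-Encoding: gzip header and inspect the response headers, as in the illustrative check below (the host variable and file name are placeholders, not values from Ampere's tests).
# Illustrative check that responses are gzip-compressed; NGINX_HOST and the file name are placeholders.
curl -sk -H 'Accept-Encoding: gzip' -D - -o /dev/null \
    https://${NGINX_HOST}/test53k.txt | grep -i '^content-encoding'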
On the client side, the wrk command line used is shown below. We varied $thread and $connection to arrive at the best throughput under the 5 ms p.99 latency SLA.
./wrk-4.2.0/wrk -t$thread -c$connection -H 'Accept-Encoding: gzip' -d10s https://${NGINX_HOST}:443/${filename} --latency
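One way to script that sweep is sketched below; the thread and connection values, host address, and file name are illustrative examples rather than the exact values used in Ampere's runs.
# Example sweep over thread/connection counts; all values are placeholders.
NGINX_HOST=10.0.0.2            # internal IP of the server VM (example)
filename=test53k.txt           # static file served by NGINX (example)
for thread in 8 16 32; do
  for connection in 64 128 256 512; do
    echo "== threads=$thread connections=$connection =="
    ./wrk-4.2.0/wrk -t$thread -c$connection -H 'Accept-Encoding: gzip' \
        -d10s https://${NGINX_HOST}:443/${filename} --latency \
        | grep -E 'Requests/sec|99%'    # keep the throughput and p.99 latency lines
  done
done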
The Google Cloud Tau T2A VMs powered by Ampere Altra processors are an excellent choice for Cloud Native workloads such as NGINX due to their innovative Cloud Native design and fantastic price-performance. For cloud application developers, transitioning NGINX applications from legacy x86 VMs to Ampere Altra VMs is seamless today thanks to the maturity of the AArch64 software ecosystem. Overall, the T2A VMs deliver great performance and compelling price-performance, all while reducing your carbon footprint. For more information about the Google Cloud Tau T2A Virtual Machines with Ampere Altra processors, visit the Google Cloud blog.
All data and information contained herein is for informational purposes only and Ampere reserves the right to change it without notice. This document may contain technical inaccuracies, omissions and typographical errors, and Ampere is under no obligation to update or correct this information. Ampere makes no representations or warranties of any kind, including but not limited to express or implied guarantees of noninfringement, merchantability, or fitness for a particular purpose, and assumes no liability of any kind. All information is provided “AS IS.” This document is not an offer or a binding commitment by Ampere. Use of the products contemplated herein requires the subsequent negotiation and execution of a definitive agreement or is subject to Ampere’s Terms and Conditions for the Sale of Goods.
System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may result in different measurements than those obtained by Ampere.
©2022 Ampere Computing. All Rights Reserved. Ampere, Ampere Computing, Altra and the ‘A’ logo are all registered trademarks or trademarks of Ampere Computing. Arm is a registered trademark of Arm Limited (or its subsidiaries). All other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
Ampere Computing® / 4655 Great America Parkway, Suite 601 / Santa Clara, CA 95054 / amperecomputing.com