MongoDB Tuning Guide
For Ampere Altra Processors
MongoDB is a popular source-available, cross-platform, document-oriented NoSQL database program. Its flexible data model enables the storage of unstructured data with full indexing support and replication. According to DB-Engines, MongoDB was the 5th most popular database as of January 2023. It is written in C++ and designed to provide scalable, high-performance data storage solutions for web applications.
The purpose of this guide is to describe techniques to run MongoDB in an optimal manner on Ampere® Altra® processors.
Running an application in a performant manner starts with building it correctly and using the appropriate compiler flags. When running on Ampere® Altra® processors, we recommend building from source with the GCC compiler version 10 or newer. Newer compilers tend to have better support for new processor features and incorporate more advanced code generation techniques.
CentOS Stream 8 is used as the operating system for our testing.
Download and install GCC 11 from the SCL repository:
yum -y install scl-utils scl-utils-build
yum -y install gcc-toolset-11.aarch64
scl enable gcc-toolset-11 bash
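After enabling the toolset, confirm that the expected compiler is on the PATH:
gcc --version   # should report gcc (GCC) 11.x from gcc-toolset-11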
For other operating systems like Ubuntu 22.04 LTS and Debian, GCC 11 is available and can be installed directly from the respective repositories.
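For instance, on Ubuntu 22.04 LTS the compiler can be installed with apt (package names as in the Ubuntu archive; Debian package names may differ by release):
# Ubuntu 22.04 LTS
apt-get install -y gcc-11 g++-11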
MongoDB can be installed from repositories that the OS package manager offers or can be built directly from source. A comprehensive MongoDB installation guide can be found in the official documentation. We recommend installing from source for flexibility, control, and the ability to configure specific modules.
To build MongoDB optimized for the Ampere® Altra® processor family, additional compiler flags that leverage hardware features can be added at the compilation stage. The MongoDB source code can be obtained from the MongoDB download page. The stable version MongoDB 6.0.3 is used in this guide. Installation from source requires certain libraries and additional modules that will be compiled into the binary.
Execute the following steps to install the dependencies.
yum -y install libcurl-devel python39 python39-devel openssl-devel
yum -y install zlib-devel git wget xz-devel
yum -y groupinstall "Development Tools"
Next, clone the MongoDB source code from the official git repository and check out the release tag:
git clone https://github.com/mongodb/mongo
cd mongo
git checkout -b myr6.0.3.rc2 r6.0.3-rc2
Python 3.7+ is required, and several Python modules must be installed:
python3 -m pip install -r etc/pip/compile-requirements.txt
Compilation
Before building, apply the following patch to src/mongo/db/stats/counters.h. It adds a compile-time check that the counter fields grouped together for cache locality do not spill across a cache line:
diff a/src/mongo/db/stats/counters.h b/src/mongo/db/stats/counters.h
224a225,226
>     static_assert(sizeof(decltype(_together)) <= stdx::hardware_constructive_interference_size,
>                   "cache line spill");
The build script python3 buildscripts/scons.py supports many compile options, such as CC and CFLAGS.
# get help from scons, e.g. how to define CXX=<g++ path> and CC=<gcc path>
python3 buildscripts/scons.py -h
# Note: configure the g++ and gcc paths
# --force-jobs is the CPU core number (80 on an 80-core Ampere Altra)
python3 buildscripts/scons.py --force-jobs=80 DESTDIR=<MongoDB_Install_Dir>
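As a sketch of a full invocation (the gcc-toolset-11 paths and the install-mongod target are assumptions based on a default SCL layout and the MongoDB 6.0 build scripts; adjust for your environment), hardware-specific flags such as -mcpu=neoverse-n1, which targets the Neoverse N1 cores in Ampere Altra, can be passed through CCFLAGS:
# hypothetical optimized build on an 80-core Ampere Altra
python3 buildscripts/scons.py --force-jobs=80 \
    CC=/opt/rh/gcc-toolset-11/root/usr/bin/gcc \
    CXX=/opt/rh/gcc-toolset-11/root/usr/bin/g++ \
    CCFLAGS="-mcpu=neoverse-n1" \
    DESTDIR=<MongoDB_Install_Dir> \
    install-mongod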
MongoDB Configuration
In this guide, MongoDB is configured to use the WiredTiger storage engine and snappy as the block and journal compressor. Please refer to the mongodb.conf file shown in the appendix to configure the server.
# start the server
$MongoDB_Install_Dir/bin/mongod --config mongod_conf --storageEngine wiredTiger
# stop the server
$MongoDB_Install_Dir/bin/mongod --config mongod_conf --shutdown
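To confirm the server accepts connections, a quick ping helps; this assumes mongosh is installed separately, since the legacy mongo shell is not bundled with MongoDB 6.0:
# verify the server responds (placeholders as in the conf file in the appendix)
mongosh --host %SERVER% --port %PORT% --eval 'db.runCommand({ping: 1})'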
There are hundreds of settings that can alter the functionality and performance of MongoDB. Listed below are some of the more common knobs. The MongoDB documentation is the recommended resource for understanding all the settings.
cacheSizeGB
Defines the maximum size of the internal cache that WiredTiger will use for all data.
Increasing cacheSizeGB can reduce the impact of disk I/O and improve read and write performance.
Use the “db.serverStatus().wiredTiger.cache” command and check “maximum bytes configured”, the maximum cache size set by cacheSizeGB (or the default), and “bytes currently in the cache”, the size of the data currently held in the cache.
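For example, the two fields can be read directly in the shell:
// read just the two cache statistics (mongosh)
db.serverStatus().wiredTiger.cache["maximum bytes configured"]
db.serverStatus().wiredTiger.cache["bytes currently in the cache"]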
Eviction tuning
When an application approaches the maximum cache size, WiredTiger begins eviction to stop memory use from growing too large, approximating a least-recently-used algorithm. “eviction=(threads_min=X)” is the minimum number of WiredTiger eviction worker threads running. Must be a value between 1 and 20.
“eviction=(threads_max=X)” is the maximum number of WiredTiger eviction worker threads running. Must be a value between 1 and 20, and greater than or equal to threads_min.
#get
db.adminCommand({getParameter: 1, wiredTigerEngineRuntimeConfig: "eviction"})
{ wiredTigerEngineRuntimeConfig: 'eviction=(threads_min=4,threads_max=8)', ok: 1 }
#set
db.adminCommand({setParameter: 1, wiredTigerEngineRuntimeConfig: "eviction=(threads_min=4,threads_max=8)"})
concurrentTransactions
WiredTiger uses tickets to control the number of read/write operations simultaneously processed by the storage engine. The default value is 128 and works well for most cases. If the number of tickets falls to 0, all subsequent operations are queued, waiting for tickets. Long-running operations might cause the number of available tickets to decrease, reducing the concurrency of your system. For example, increasing this setting can increase concurrency.
#Read current value
db.serverStatus().wiredTiger.concurrentTransactions
{
  write: { out: 0, available: 128, totalTickets: 128 },
  read: { out: 0, available: 128, totalTickets: 128 }
}
#change value
db.adminCommand({setParameter: 1, wiredTigerConcurrentWriteTransactions: 256})
{ was: 0, ok: 1 }
db.adminCommand({setParameter: 1, wiredTigerConcurrentReadTransactions: 256})
{ was: 0, ok: 1 }
journalCompressor
Specifies the type of compression to use to compress WiredTiger journal data. Compression minimizes storage use at the expense of additional CPU.
blockCompressor
Specifies the default compression for collection data. You can override this on a per-collection basis when creating collections. Compression minimizes storage use at the expense of additional CPU.
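As a sketch of the per-collection override mentioned above (the collection name here is hypothetical), a collection can be created with its own compressor through a WiredTiger configuration string:
// create a collection that overrides the default block compressor
db.createCollection("mycollection", {
  storageEngine: { wiredTiger: { configString: "block_compressor=zlib" } }
})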
64K PAGESIZE
A kernel PAGESIZE of 64K is recommended. The current value can be determined with the command “getconf PAGESIZE”. PAGESIZE is the size of a memory page in bytes and is fixed when the kernel is compiled. Using a larger page size can decrease the hardware latency of translating a virtual page address to a physical page address. This decrease in latency comes from improving the efficiency of hardware translation caches such as the processor’s translation lookaside buffer (TLB). Because a hardware translation cache has only a limited number of entries, using larger page sizes increases the amount of virtual memory that can be translated by each entry in the cache. This increases the amount of memory that can be accessed by an application without incurring hardware translation delays.
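For example, on a kernel built with 64K pages the command reports the page size in bytes:
$ getconf PAGESIZE
65536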
Transparent Huge Pages
Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages. However, database workloads often perform poorly with THP enabled, because they tend to have sparse rather than contiguous memory access patterns. When running MongoDB on Linux, THP should be disabled for best performance.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
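The change can be verified by reading the same file back; the active setting appears in brackets:
$ cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]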
Most UNIX-like operating systems, including Linux and macOS, provide ways to limit and control the usage of system resources such as threads, files, and network connections on a per-process and per-user basis. These "ulimits" prevent single users from using too many system resources. Sometimes, these limits have low default values that can cause a number of issues in the course of normal MongoDB operation.
To raise these limits, create a file named /etc/security/limits.d/99-mongodb-nproc.conf with new values to increase the process limit, or append the recommended values to /etc/security/limits.conf:
echo "* soft fsize unlimited" | sudo tee -a /etc/security/limits.conf echo "* hard fsize unlimited" | sudo tee -a /etc/security/limits.conf echo "* soft cpu unlimited" | sudo tee -a /etc/security/limits.conf echo "* hard cpu unlimited" | sudo tee -a /etc/security/limits.conf echo "* soft as unlimited" | sudo tee -a /etc/security/limits.conf echo "* hard as unlimited" | sudo tee -a /etc/security/limits.conf echo "* soft memlock unlimited" | sudo tee -a /etc/security/limits.conf echo "* hard memlock unlimited" | sudo tee -a /etc/security/limits.conf echo "* soft nofile 64000" | sudo tee -a /etc/security/limits.conf echo "* hard nofile 64000" | sudo tee -a /etc/security/limits.conf echo "* soft nproc 64000" | sudo tee -a /etc/security/limits.conf echo "* hard nproc 64000" | sudo tee -a /etc/security/limits.conf
Configure sufficient file handles (fs.file-max), kernel pid limit (kernel.pid_max), maximum threads per process (kernel.threads-max), and maximum number of memory map areas per process (vm.max_map_count) for your deployment. For large systems, the following values provide a good starting point:
sysctl -w fs.file-max=98000
sysctl -w kernel.pid_max=64000
sysctl -w kernel.threads-max=64000
sysctl -w vm.max_map_count=128000
sysctl -w net.core.somaxconn=65535
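Settings applied with sysctl -w do not survive a reboot. To persist them, the same keys can be written to a drop-in file (the file name below is an assumption):
# persist the settings across reboots
cat <<'EOF' > /etc/sysctl.d/99-mongodb.conf
fs.file-max = 98000
kernel.pid_max = 64000
kernel.threads-max = 64000
vm.max_map_count = 128000
net.core.somaxconn = 65535
EOF
sysctl --system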
Start tuned and use the throughput-performance profile:
tuned-adm profile throughput-performance
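The active profile can be confirmed with:
tuned-adm active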
Appendix: MongoDB conf file
processManagement:
  fork: true
net:
  bindIp: %SERVER%
  port: %PORT%
storage:
  dbPath: %DATA_ROOT%/%PORT%
  engine: wiredTiger
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      journalCompressor: snappy
      cacheSizeGB: 30
    collectionConfig:
      blockCompressor: snappy
systemLog:
  destination: file
  path: "%DATA_ROOT%/%PORT%/mongod.log"
  logAppend: true