This document provides answers to common diagnostic questions and issues.
If you don’t find the answer you’re looking for, check the complete list of FAQs or post your question to the MongoDB Community.
Where can I find information about a mongod process that stopped running unexpectedly?
If mongod shuts down unexpectedly on a UNIX or UNIX-based platform, and if mongod fails to log a shutdown or error message, then check your system logs for messages pertaining to MongoDB. For example, for logs located in /var/log/messages, use the following commands:
Does TCP keepalive time affect MongoDB Deployments?
If you experience network timeouts or socket errors in communication between clients and servers, or between members of a sharded cluster or replica set, check the TCP keepalive value for the affected systems.
Many operating systems set this value to 7200 seconds (two hours) by default. For MongoDB, you will generally experience better results with a shorter keepalive value, on the order of 120 seconds (two minutes).
If your MongoDB deployment experiences keepalive-related issues, you must alter the keepalive value on all affected systems. This includes all machines running mongod or mongos processes and all machines hosting client processes that connect to MongoDB.
Or:
On Linux, the value is measured in seconds.
Note
Although the setting name includes ipv4, the tcp_keepalive_time value applies to both IPv4 and IPv6.
To change the tcp_keepalive_time value, you can use one of the following commands, supplying a <value> in seconds:
Or:
These operations do not persist across system reboots. To persist the setting, add the following line to /etc/sysctl.conf, supplying a <value> in seconds, and reboot the machine:
Keepalive values greater than 300 seconds (5 minutes) will be overridden on mongod and mongos sockets and set to 300 seconds.
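The Linux behavior described above can be illustrated with a minimal Python sketch. It assumes the setting is exposed at the standard procfs path /proc/sys/net/ipv4/tcp_keepalive_time, which is not given in the text above. The function names are hypothetical and chosen only for this illustration.

```python
from pathlib import Path

# Assumption: standard Linux procfs location of the setting (not stated in the text).
KEEPALIVE_PATH = Path("/proc/sys/net/ipv4/tcp_keepalive_time")

# Cap described above for mongod and mongos sockets on Linux.
MONGODB_LINUX_CAP_SECONDS = 300


def read_keepalive_seconds(path: Path = KEEPALIVE_PATH) -> int:
    """Return the system-wide TCP keepalive time in seconds."""
    return int(path.read_text().strip())


def effective_keepalive_seconds(system_value: int) -> int:
    """Value applied to mongod/mongos sockets: anything above 300 s becomes 300 s."""
    return min(system_value, MONGODB_LINUX_CAP_SECONDS)


if __name__ == "__main__":
    if KEEPALIVE_PATH.exists():
        current = read_keepalive_seconds()
        print(f"system keepalive: {current} s")
        print(f"effective on mongod/mongos sockets: {effective_keepalive_seconds(current)} s")
    else:
        print("tcp_keepalive_time not found; this sketch only applies to Linux")
```

With the common default of 7200 seconds, the effective value on mongod and mongos sockets would be 300 seconds, which is still longer than the roughly 120 seconds suggested earlier.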
The registry value is not present by default. The system default, used if the value is absent, is 7200000 milliseconds or 0x6ddd00 in hexadecimal.
To change the KeepAliveTime value, use the following command in an Administrator Command Prompt, where <value> is expressed in hexadecimal (e.g. 120000 is 0x1d4c0):
Windows users should consider the Windows Server Technet Article on KeepAliveTime for more information on setting keepalive for MongoDB deployments on Windows systems. Keepalive values greater than or equal to 600000 milliseconds (10 minutes) will be ignored by mongod and mongos.
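Because the Windows value is given in hexadecimal milliseconds, the conversion is easy to get wrong by hand. The following is a small Python sketch of the arithmetic only; the helper names are hypothetical, and the sketch does not touch the registry.

```python
# Threshold at or above which mongod and mongos ignore the value (from the text above).
IGNORED_FROM_MS = 600_000


def keepalive_registry_hex(seconds: int) -> str:
    """Convert a keepalive time in seconds to hexadecimal milliseconds."""
    if seconds <= 0:
        raise ValueError("keepalive time must be positive")
    return hex(seconds * 1000)


def ignored_by_mongodb(seconds: int) -> bool:
    """True if the value is 10 minutes or more and would therefore be ignored."""
    return seconds * 1000 >= IGNORED_FROM_MS


print(keepalive_registry_hex(120))    # 0x1d4c0
print(keepalive_registry_hex(7200))   # 0x6ddd00
print(ignored_by_mongodb(7200))       # True
```

The first two results match the figures quoted above: 120000 milliseconds is 0x1d4c0, and the system default of 7200000 milliseconds is 0x6ddd00.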
On macOS, the value is measured in milliseconds.
To change the net.inet.tcp.keepidle value, you can use the following command, supplying a <value> in milliseconds:
This operation does not persist across system reboots, and must be set each time your system reboots. See your operating system’s documentation for instructions on setting this value persistently. Keepalive values greater than or equal to 600000 milliseconds (10 minutes) will be ignored by mongod and mongos.
Note
In macOS 10.15 Catalina, Apple no longer allows for configuration of the net.inet.tcp.keepidle option.
You will need to restart mongod and mongos processes for new system-wide keepalive settings to take effect.
If you see a very large number of connection and re-connection messages in your MongoDB log, then clients are frequently connecting to and disconnecting from the MongoDB server. This is normal behavior for applications that do not use request pooling, such as CGI. Consider using FastCGI, an Apache Module, or some other kind of persistent application server to decrease the connection overhead.
If these connections do not impact your performance, you can use the run-time quiet option or the command-line option --quiet to suppress these messages from the log.
Starting in version 4.0, MongoDB offers free Cloud monitoring for standalones and replica sets. Free monitoring provides information about your deployment, including:
For more information, see Free Monitoring.
MongoDB Cloud Manager and Ops Manager, an on-premise solution available in MongoDB Enterprise Advanced, include monitoring functionality, which collects data from running MongoDB deployments and provides visualization and alerts based on that data.
For more information, see also the MongoDB Cloud Manager documentation and Ops Manager documentation.
A full list of third-party tools is available as part of the Monitoring for MongoDB documentation.
No.
If the cache does not have enough space to load additional data, WiredTiger evicts pages from the cache to free up space.
Note
The storage.wiredTiger.engineConfig.cacheSizeGB setting limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache.
To accommodate the additional consumers of RAM, you may have to decrease WiredTiger internal cache size.
The default WiredTiger internal cache size value assumes that there is a single mongod instance per machine. If a single machine contains multiple MongoDB instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container. See memLimitMB.
To see statistics on the cache and eviction, use the serverStatus command. The wiredTiger.cache field holds the information on the cache and eviction.
For an explanation of some key cache and eviction statistics, such as wiredTiger.cache.bytes currently in the cache and wiredTiger.cache.tracked dirty bytes in the cache, see wiredTiger.cache.
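As a rough illustration of how these statistics might be read, the Python sketch below summarizes the wiredTiger.cache section of a serverStatus result that has already been loaded into a dictionary. The helper name and the sample numbers are made up for the example. It also assumes the section contains a "maximum bytes configured" entry, which is not mentioned in the text above.

```python
def cache_summary(server_status: dict) -> dict:
    """Summarize WiredTiger cache usage from a serverStatus result."""
    cache = server_status["wiredTiger"]["cache"]
    used = cache["bytes currently in the cache"]
    dirty = cache["tracked dirty bytes in the cache"]
    # Assumption: this key is present alongside the two named in the text.
    maximum = cache["maximum bytes configured"]
    return {
        "used_fraction": used / maximum,
        "dirty_fraction": dirty / maximum,
    }


MIB = 1024 ** 2

# Hypothetical sample data, not taken from a real deployment.
sample_status = {
    "wiredTiger": {
        "cache": {
            "bytes currently in the cache": 768 * MIB,
            "tracked dirty bytes in the cache": 64 * MIB,
            "maximum bytes configured": 1024 * MIB,
        }
    }
}

summary = cache_summary(sample_status)
print(summary)  # {'used_fraction': 0.75, 'dirty_fraction': 0.0625}
```

In practice the dictionary would come from running the serverStatus command through a driver or shell rather than from a literal.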
To adjust the size of the WiredTiger internal cache, see storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB. Avoid increasing the WiredTiger internal cache size above its default value.
With WiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache.
Starting in MongoDB 3.4, the default WiredTiger internal cache size is the larger of either 50% of (RAM - 1 GB) or 256 MB.
For example, on a system with a total of 4 GB of RAM, the WiredTiger cache will use 1.5 GB of RAM (0.5 * (4 GB - 1 GB) = 1.5 GB). Conversely, a system with a total of 1.25 GB of RAM will allocate 256 MB to the WiredTiger cache because that is more than half of the total RAM minus one gigabyte (0.5 * (1.25 GB - 1 GB) = 128 MB < 256 MB).
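The arithmetic in these examples can be reproduced with a short Python sketch. The function name is hypothetical, and the sketch assumes that GB and MB are binary units (1 GB = 1024 MB), which matches the 128 MB figure in the second example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Larger of half of (RAM - 1 GB) and 256 MB, as in the examples above."""
    return max((total_ram_bytes - GIB) // 2, 256 * MIB)


# 4 GB of RAM: 0.5 * (4 GB - 1 GB) = 1.5 GB
print(default_wiredtiger_cache_bytes(4 * GIB) / GIB)       # 1.5

# 1.25 GB of RAM: 0.5 * 0.25 GB = 128 MB, so the 256 MB floor applies
print(default_wiredtiger_cache_bytes(1280 * MIB) / MIB)    # 256.0
```

When mongod runs under a memory limit lower than the total system memory, the limit would be passed in place of the total RAM.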
Note
In some instances, such as when running in a container, the database can have memory constraints that are lower than the total system memory. In such instances, this memory limit, rather than the total system memory, is used as the maximum RAM available.
To see the memory limit, see hostInfo.system.memLimitMB.
By default, WiredTiger uses Snappy block compression for all collections and prefix compression for all indexes. Compression defaults are configurable at a global level and can also be set on a per-collection and per-index basis during collection and index creation.
Different representations are used for data in the WiredTiger internal cache versus the on-disk format:
Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes.
To adjust the size of the WiredTiger internal cache, see storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB. Avoid increasing the WiredTiger internal cache size above its default value.
Note
The storage.wiredTiger.engineConfig.cacheSizeGB setting limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache.
To accommodate the additional consumers of RAM, you may have to decrease WiredTiger internal cache size.
The default WiredTiger internal cache size value assumes that there is a single mongod instance per machine. If a single machine contains multiple MongoDB instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container. See memLimitMB.
To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the serverStatus command.
The two most important factors in maintaining a successful sharded cluster are:
You can prevent most issues encountered with sharding by choosing the best possible shard key for your deployment and by adding capacity to your cluster well before the current resources become saturated. Continue reading for specific issues you may encounter in a production environment.
Your cluster must have sufficient data for sharding to make sense. Sharding works by migrating chunks between the shards until each shard has roughly the same number of chunks.
The default chunk size is 64 megabytes. MongoDB will not begin migrations until the imbalance of chunks in the cluster exceeds the migration threshold. This behavior helps prevent unnecessary chunk migrations, which can degrade the performance of your cluster as a whole.
If you have just deployed a sharded cluster, make sure that you have enough data to make sharding effective. If you do not have sufficient data to create more than eight 64-megabyte chunks, then all data will remain on one shard. Either lower the chunk size setting, or add more data to the cluster.
As a related problem, the system will split chunks only on inserts or updates, which means that if you configure sharding and do not continue to issue insert and update operations, the database will not create any chunks. You can either wait until your application inserts data or split chunks manually.
Finally, if your shard key has a low cardinality, MongoDB may not be able to create sufficient splits among the data.
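The data-volume condition described above can be approximated with a short Python sketch. This is only a back-of-the-envelope estimate: it assumes every chunk is filled to the configured chunk size, which real chunks need not be, and the function names are hypothetical.

```python
import math


def estimated_chunks(data_size_mb: float, chunk_size_mb: int = 64) -> int:
    """Rough chunk count if every chunk were filled to the chunk size."""
    return math.ceil(data_size_mb / chunk_size_mb)


def enough_data_to_distribute(data_size_mb: float, chunk_size_mb: int = 64,
                              min_chunks: int = 8) -> bool:
    """True if the data would form more than `min_chunks` chunks."""
    return estimated_chunks(data_size_mb, chunk_size_mb) > min_chunks


print(enough_data_to_distribute(400))                     # False: about 7 chunks
print(enough_data_to_distribute(1024))                    # True: 16 chunks
print(enough_data_to_distribute(400, chunk_size_mb=32))   # True: about 13 chunks
```

The last call shows the effect of lowering the chunk size: the same 400 MB of data yields enough chunks to exceed the threshold of eight.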
In some situations, a single shard or a subset of the cluster will receive a disproportionate portion of the traffic and workload. In almost all cases this is the result of a shard key that does not effectively allow write scaling.
It’s also possible that you have “hot chunks.” In this case, you may be able to solve the problem by splitting and then migrating parts of these chunks.
In the worst case, you may have to consider re-sharding your data and choosing a different shard key to correct this pattern.
If you have just deployed your sharded cluster, you may want to consider the troubleshooting suggestions for a new cluster where data remains on a single shard.
If the cluster was initially balanced, but later developed an uneven distribution of data, consider the following possible causes:
If migrations impact your cluster or application’s performance, consider the following options, depending on the nature of the impact:
It’s also possible that your shard key causes your application to direct all writes to a single shard. This kind of activity pattern can require the balancer to migrate most data soon after writing it. Consider redeploying your cluster with a shard key that provides better write scaling.