MongoDB uses the shard key associated with the collection to partition the data into chunks. A chunk consists of a subset of sharded data. Each chunk has an inclusive lower bound and an exclusive upper bound based on the shard key.
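The inclusive-lower, exclusive-upper convention can be illustrated with a small lookup. This is only a sketch under stated assumptions: the numeric boundaries and the helper name `chunk_for` are hypothetical, and it does not reflect how MongoDB stores or routes chunk metadata internally.

```python
from bisect import bisect_right

# Hypothetical chunk boundaries on a numeric shard key. Chunk i covers
# [bounds[i], bounds[i + 1]): lower bound inclusive, upper bound exclusive.
MIN_KEY, MAX_KEY = float("-inf"), float("inf")
bounds = [MIN_KEY, 100, 200, MAX_KEY]

def chunk_for(key, bounds):
    """Return the (lower, upper) range of the chunk that owns a finite `key`."""
    # bisect_right places a key equal to a boundary to the right of it,
    # so the boundary value belongs to the chunk that starts there.
    i = bisect_right(bounds, key) - 1
    return bounds[i], bounds[i + 1]

print(chunk_for(99, bounds))   # falls in the first chunk
print(chunk_for(100, bounds))  # 100 is the inclusive lower bound of the second chunk
print(chunk_for(200, bounds))  # 200 is excluded from the second chunk
```

A key equal to a boundary always belongs to the chunk that begins at that boundary, so every key maps to exactly one chunk.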
MongoDB splits chunks when they grow beyond the configured chunk size. Both inserts and updates can trigger a chunk split.
The smallest range a chunk can represent is a single unique shard key value. A chunk that only contains documents with a single shard key value cannot be split.
Use the numInitialChunks option to specify a different number of initial chunks. This initial creation and distribution of chunks allows for faster setup of sharding.
The default chunk size in MongoDB is 64 megabytes. You can increase or reduce the chunk size. Consider the implications of changing the default chunk size:
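Small chunks lead to a more even distribution of data at the expense of more frequent migrations. This creates expense at the query routing (mongos) layer.
Chunk size affects the Maximum Number of Documents Per Chunk to Migrate.
Chunk size affects the maximum collection size when sharding an existing collection. Post-sharding, chunk size does not constrain collection size.
For many deployments, it makes sense to avoid frequent and potentially spurious migrations at the expense of a slightly less evenly distributed data set.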
Changing the chunk size affects when chunks split, but there are some limitations to its effects.
Splitting is a process that keeps chunks from growing too large. When a chunk grows beyond the specified chunk size, or if the number of documents in the chunk exceeds the Maximum Number of Documents Per Chunk to Migrate, MongoDB splits the chunk based on the shard key values the chunk represents. A chunk may be split into multiple chunks where necessary. Inserts and updates may trigger splits. Splits are an efficient metadata change. To create splits, MongoDB does not migrate any data or affect the shards.
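The following sketch illustrates why a split is only a metadata change and why a chunk holding a single shard key value cannot be split. It is a simplified illustration, not MongoDB's actual split-point selection: the median-based rule and the function names `find_split_point` and `split_chunk` are assumptions made for this example.

```python
def find_split_point(keys):
    """Pick a shard key value to become the lower bound of the new right chunk.

    Returns None when the chunk is empty or holds only one distinct key
    value, because no boundary can separate identical keys.
    """
    keys = sorted(keys)
    if not keys or keys[0] == keys[-1]:
        return None
    point = keys[len(keys) // 2]  # assumed rule: split near the median
    if point == keys[0]:
        # The split point must exceed the smallest key, otherwise the
        # left chunk [lower, point) would contain no documents.
        point = next(k for k in keys if k > keys[0])
    return point

def split_chunk(chunk_range, keys):
    """Return the new chunk ranges. Only boundaries change; no documents move."""
    lower, upper = chunk_range
    point = find_split_point(keys)
    if point is None:
        return [chunk_range]  # unsplittable: the range stays as it was
    return [(lower, point), (point, upper)]

print(split_chunk((0, 100), [5, 10, 20, 40]))
print(split_chunk((0, 100), [7, 7, 7]))
```

In the second call every document shares the key value 7, so the function returns the original range unchanged, which mirrors the case where a chunk keeps growing without being split.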
Splits may lead to an uneven distribution of the chunks for a collection across the shards. In such cases, the balancer redistributes chunks across shards. See Cluster Balancer for more details on balancing chunks across shards.
MongoDB migrates chunks in a sharded cluster to distribute the chunks of a sharded collection evenly among shards. Migrations may be either manual or automatic.
For more information on the sharded cluster balancer, see Sharded Cluster Balancer.
The balancer is a background process that manages chunk migrations. If the difference in the number of chunks between the largest and smallest shard exceeds the migration threshold, the balancer begins migrating chunks across the cluster to ensure an even distribution of data.
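The threshold check described above can be sketched as a small simulation. This is an illustration under assumptions, not the balancer's real algorithm: the `threshold` parameter, the shard names, and the rule of moving one chunk at a time from the fullest shard to the emptiest are all hypothetical simplifications.

```python
def plan_migrations(chunk_counts, threshold):
    """Simulate chunk moves until the largest gap no longer exceeds `threshold`.

    `chunk_counts` maps a shard name to its number of chunks. Returns the
    list of (donor, recipient) moves and the resulting chunk counts.
    """
    if threshold < 1:
        # With a threshold below 1, a gap of one chunk would bounce
        # back and forth between two shards forever.
        raise ValueError("threshold must be at least 1")
    counts = dict(chunk_counts)  # work on a copy, leave the input untouched
    moves = []
    while True:
        donor = max(counts, key=counts.get)      # shard with the most chunks
        recipient = min(counts, key=counts.get)  # shard with the fewest chunks
        if counts[donor] - counts[recipient] <= threshold:
            break  # the difference no longer exceeds the threshold
        counts[donor] -= 1
        counts[recipient] += 1
        moves.append((donor, recipient))
    return moves, counts

moves, final = plan_migrations({"shardA": 10, "shardB": 2}, threshold=2)
print(len(moves), final)
```

The sketch stops as soon as the gap is within the threshold; it ignores zones, chunk sizes, and concurrent migrations, all of which affect real balancing decisions.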
You can manage certain aspects of the balancer. The balancer also respects any zones created as a part of configuring zones in a sharded cluster.
See Sharded Cluster Balancer for more information on the balancer.
In some cases, chunks can grow beyond the specified chunk size but cannot undergo a split. The most common scenario is when a chunk represents a single shard key value. Since the chunk cannot split, it continues to grow beyond the chunk size, becoming a jumbo chunk. These jumbo chunks can become a performance bottleneck as they continue to grow, especially if the shard key value occurs with high frequency.
Starting in version 4.4, MongoDB provides the refineCollectionShardKey command. Refining a collection's shard key allows for a more fine-grained data distribution and can address situations where the existing key's insufficient cardinality leads to jumbo chunks.
For more information, see:
moveChunk directory
In MongoDB 2.6 and MongoDB 3.0, sharding.archiveMovedChunks is enabled by default. All other MongoDB versions have this disabled by default. With sharding.archiveMovedChunks enabled, the source shard archives the documents in the migrated chunks in a directory named after the collection namespace under the moveChunk directory in the storage.dbPath.
If an error occurs during a migration, these files may be helpful in recovering documents affected during the migration.
Once the migration has completed successfully and there is no need to recover documents from these files, you may safely delete these files. Or, if you have an existing backup of the database that you can use for recovery, you may also delete these files after migration.
To determine if all migrations are complete, run sh.isBalancerRunning() while connected to a mongos instance.