Data Modeling Introduction

The key challenge in data modeling is balancing the needs of the application, the performance characteristics of the database engine, and the data retrieval patterns. When designing data models, always consider how the application uses the data (i.e. queries, updates, and processing of the data) as well as the inherent structure of the data itself.

Flexible Schema

Unlike SQL databases, where you must determine and declare a table’s schema before inserting data, MongoDB’s collections, by default, do not require their documents to have the same schema.

This flexibility facilitates the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity, even if the document has substantial variation from other documents in the collection.
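As a rough illustration of this flexibility, the sketch below models a collection as a plain JavaScript array rather than a live MongoDB collection. The documents and field names (`name`, `email`, `phones`, `age`) are hypothetical; the point is only that documents side by side in one collection may carry different fields.

```javascript
// A collection modeled as a plain array (assumption: illustrative data only).
const people = [
  { _id: 1, name: "Ada", email: "ada@example.com" },
  { _id: 2, name: "Lin", phones: ["555-0100", "555-0101"], age: 41 },
];

// Collect the sorted field names used by each document.
const fieldSets = people.map((doc) => Object.keys(doc).sort());

// Keep only the fields that appear in every document.
const sharedFields = fieldSets.reduce((common, keys) =>
  common.filter((key) => keys.includes(key))
);

console.log(sharedFields);
```

Here the two documents overlap only partially, yet both are acceptable members of the same collection.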

In practice, however, the documents in a collection share a similar structure, and you can enforce document validation rules for a collection during update and insert operations. See Schema Validation for details.
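A minimal sketch of such a validation rule follows. It assumes a database handle `db` that exposes `createCollection` (as the mongosh `db` object and driver database objects do); the collection name `users` and the fields `username` and `email` are hypothetical, chosen only for illustration.

```javascript
// Sketch: create a collection whose inserts and updates are checked
// against a JSON Schema validator. Names are illustrative assumptions.
function createValidatedUsers(db) {
  return db.createCollection("users", {
    validator: {
      $jsonSchema: {
        bsonType: "object",
        // Every document must carry a username ...
        required: ["username"],
        properties: {
          username: { bsonType: "string", description: "must be a string" },
          // ... while email is optional, but must be a string when present.
          email: { bsonType: "string", description: "must be a string if present" },
        },
      },
    },
  });
}
```

Because only `username` is required, documents can still vary in their other fields while sharing a common core.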

Document Structure

The key decision in designing data models for MongoDB applications revolves around the structure of documents and how the application represents relationships between data. MongoDB allows related data to be embedded within a single document.

Embedded Data

Embedded documents capture relationships between data by storing related data in a single document structure. MongoDB documents make it possible to embed document structures in a field or array within a document. These denormalized data models allow applications to retrieve and manipulate related data in a single database operation.

Data model with embedded fields that contain all related information.
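The embedded model can be sketched with a plain JavaScript object. The `contact` and `access` sub-documents and all field values below are assumptions made for illustration, and the array lookup merely stands in for a single database read.

```javascript
// A user document with related data embedded as sub-documents
// (assumption: field names and values are illustrative).
const user = {
  _id: "user-1",
  username: "123xyz",
  contact: { phone: "123-456-7890", email: "xyz@example.com" },
  access: { level: 5, group: "dev" },
};

// Stand-in for one database read: a single lookup by _id.
function findUser(collection, id) {
  return collection.find((doc) => doc._id === id);
}

// One lookup returns the user together with its contact and access data.
const found = findUser([user], "user-1");
console.log(found.contact.email, found.access.group);
```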

For many use cases in MongoDB, the denormalized data model is optimal.

See Embedded Data Models for the strengths and weaknesses of embedding documents.

References

References store the relationships between data by including links or references from one document to another. Applications can resolve these references to access the related data. Broadly, these are normalized data models.

Data model using references to link documents. Both the ``contact`` document and the ``access`` document contain a reference to the ``user`` document.
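A comparable sketch of the referenced model is shown below, again with plain arrays standing in for collections. The `user_id` field name and the sample values are assumptions; each `find` call represents a separate query that the application would issue to resolve a reference.

```javascript
// Three "collections" linked by a user_id reference
// (assumption: names and values are illustrative).
const users = [{ _id: "user-1", username: "123xyz" }];
const contacts = [
  { _id: "contact-1", user_id: "user-1", phone: "123-456-7890", email: "xyz@example.com" },
];
const accessRecords = [{ _id: "access-1", user_id: "user-1", level: 5, group: "dev" }];

// The application resolves the references itself: one lookup per collection.
function resolveUser(userId) {
  const user = users.find((doc) => doc._id === userId);
  if (!user) return null;
  return {
    ...user,
    contact: contacts.find((doc) => doc.user_id === userId) || null,
    access: accessRecords.find((doc) => doc.user_id === userId) || null,
  };
}

console.log(resolveUser("user-1"));
```

Compared with embedding, the related data takes several lookups to assemble, but each piece is stored once and can be updated independently.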

See Normalized Data Models for the strengths and weaknesses of using references.

Atomicity of Write Operations

Single Document Atomicity

In MongoDB, a write operation is atomic on the level of a single document, even if the operation modifies multiple embedded documents within a single document.

A denormalized data model with embedded data combines all related data in a single document instead of normalizing across multiple documents and collections. This data model facilitates atomic operations.
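To make this concrete, the sketch below issues one `updateOne` call that changes two embedded documents of the same user. It assumes a collection handle exposing `updateOne(filter, update)`, as mongosh and driver collection objects do; the function name `promoteUser` and the field paths are hypothetical.

```javascript
// Sketch: one write that touches two embedded documents of one user.
// Because all changes target a single document, they apply atomically.
function promoteUser(usersCollection, userId) {
  return usersCollection.updateOne(
    { _id: userId },
    {
      // Change fields inside the embedded contact and access documents.
      $set: { "contact.email": "new@example.com", "access.group": "admin" },
      // Increment a counter inside the embedded access document.
      $inc: { "access.level": 1 },
    }
  );
}
```

Had `contact` and `access` lived in separate collections, the same change would have needed several writes, and the atomicity guarantee would no longer cover them as a unit.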

When a single write operation (e.g. db.collection.updateMany()) modifies multiple documents, the modification of each document is atomic, but the operation as a whole is not atomic.

When performing multi-document write operations, whether through a single write operation or multiple write operations, other operations may interleave.

Multi-Document Transactions

For situations that require atomicity of reads and writes to multiple documents (in a single or multiple collections), MongoDB supports multi-document transactions:

  • In version 4.0, MongoDB supports multi-document transactions on replica sets.
  • In version 4.2, MongoDB introduces distributed transactions, which add support for multi-document transactions on sharded clusters and incorporate the existing support for multi-document transactions on replica sets.

For details regarding transactions in MongoDB, see the Transactions page.

Important

In most cases, a multi-document transaction incurs a greater performance cost than single-document writes, and the availability of multi-document transactions should not be a replacement for effective schema design. For many scenarios, the denormalized data model (embedded documents and arrays) will continue to be optimal for your data and use cases. That is, for many scenarios, modeling your data appropriately will minimize the need for multi-document transactions.

For additional transaction usage considerations (such as the runtime limit and oplog size limit), see also Production Considerations.

Data Use and Performance

When designing a data model, consider how applications will use your database. For instance, if your application only uses recently inserted documents, consider using Capped Collections. Or, if your application mainly performs read operations on a collection, adding indexes to support common queries can improve performance.
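Both suggestions can be sketched as small helpers. They assume a database handle with `createCollection` and a collection handle with `createIndex`, as provided by mongosh and the drivers; the collection name `recentEvents`, the 1 MiB size, and the `username` field are hypothetical values chosen for illustration.

```javascript
// Sketch: a capped collection for workloads that only read recent documents.
// `size` is the maximum size in bytes; the value here is an arbitrary example.
function createRecentEventsCollection(db) {
  return db.createCollection("recentEvents", { capped: true, size: 1048576 });
}

// Sketch: an ascending index on a field that common read queries filter by.
function indexUsernames(usersCollection) {
  return usersCollection.createIndex({ username: 1 });
}
```

Which fields to index depends on the queries the application actually runs, so the index above is only a placeholder for that analysis.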

See Operational Factors and Data Models for more information on these and other operational considerations that affect data model designs.

Learn More

MongoDB.live 2020 Presentations

To learn how to incorporate the flexible data model into your schema, see the following presentations from MongoDB.live 2020:

Application Modernization Guide

For more information on data modeling with MongoDB, download the MongoDB Application Modernization Guide.

The download includes the following resources:

  • Presentation on the methodology of data modeling with MongoDB
  • White paper covering best practices and considerations for migrating to MongoDB from an RDBMS data model
  • Reference MongoDB schema with its RDBMS equivalent
  • Application Modernization scorecard