Analyze Your Data Schema

The Schema tab provides an overview of the data type and shape of the fields in a particular collection. Databases and collections are visible in the left-side navigation.

The overview is based on sampling the documents in the collection. The schema overview may include additional data about the contents of the fields, such as the minimum and maximum values of dates and integers, the frequency of occurrence of particular values, and the cardinality of the data.

MongoDB has a flexible schema model, which means that some fields may contain different types of data from one document to the next.

Example
A field named address may contain strings and integers in some documents, objects in others, or some combination of all three.

In the case of heterogeneous fields, the Schema tab shows a breakdown of the various data types contained within the field, with the percentage of documents that hold each data type.
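The percentage breakdown described above can be approximated outside Compass. The following is a minimal sketch in plain Python, not Compass's own implementation: the function names, the type labels, and the sample documents are all assumptions made for illustration. A document that lacks the field is counted as undefined.

```python
from collections import Counter


def type_label(value):
    """Map a Python value to a rough type label (labels are assumed, not Compass's)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return type(value).__name__


def type_breakdown(documents, field):
    """Return {type label: percentage of documents} for one field."""
    counts = Counter(
        type_label(doc[field]) if field in doc else "undefined"
        for doc in documents
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100 * n / total for label, n in counts.items()}


# Hypothetical sample of five documents with a heterogeneous address field.
sample = [
    {"address": "12 Main St"},
    {"address": 42},
    {"address": {"street": "Main St", "zip": "10001"}},
    {"address": "7 Elm Ave"},
    {},  # the field is missing in this document
]
breakdown = type_breakdown(sample, "address")
```

With this sample, two of the five documents hold a string, and one each holds a number, an object, or no value at all.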

Example

The Schema tab shows size information about the test.restaurants collection at the top, including the total number of documents in the collection, the average document size, and the total disk space occupied by the collection.

The following fields are shown with details:

  • The _id field is an ObjectId. Each ObjectId contains a timestamp, so Compass displays the range of creation times for the sampled documents.
  • The address field contains four nested fields. You can expand the field panel to see analyses of each of the nested fields.
  • The borough field contains a string indicating the borough in which the restaurant is located. The cardinality is low enough that Compass can provide a graded bar of the field contents, with the most frequently occurring string on the left.
  • The categories field contains arrays of strings. The analysis shows the minimum, maximum, and average array lengths.
Example of a collection's schema
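The range of creation times shown for the _id field comes from the timestamp inside each ObjectId: the first four bytes of the 12-byte value store the creation time as seconds since the Unix epoch. The sketch below decodes that timestamp from an ObjectId's 24-character hexadecimal form using only the Python standard library. The function names and the two ObjectId strings are hypothetical.

```python
from datetime import datetime, timezone


def objectid_creation_time(oid_hex):
    """Decode the creation time embedded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, written as 24 hex characters")
    seconds = int(oid_hex[:8], 16)  # first 4 bytes: seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def creation_range(oid_hexes):
    """Return the earliest and latest creation times among the given ObjectIds."""
    times = [objectid_creation_time(h) for h in oid_hexes]
    return min(times), max(times)


# Hypothetical ObjectId values; only the first 8 hex characters matter here.
ids = ["5f5e10000000000000000000", "5f5e10640000000000000000"]
first, last = creation_range(ids)
```

In this example the two identifiers were created 100 seconds apart, because their leading four bytes differ by 0x64.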

Using the query bar in the Schema tab, you can create a query filter to limit your result set. Click the Options button to specify query options, such as the particular fields to display and the number of results to return.

Query bar schema view
Tip

In the Schema tab, you can also use the Query Builder to enter a query into the query bar.

For each field, Compass displays summary information about the data type or types the field contains and the range of values. Depending on the data type and the level of cardinality, Compass displays histograms, graded bars, geographical maps, and sample data to provide a sense of the shape and scope of the data contained in each field.

Below is an example of the data type summary for a field called last_login, which contains data of type date.

Example of a field with a single data type

For fields that contain multiple data types, Compass displays a percentage breakdown of the various data types across documents. In the example below, the chart shows the contents of a field called phone_no, which is of type string in 81% of documents and of type number in the remaining 19%.

Example of percentage breakdown for data types

If a collection contains documents in which not all fields contain a value, the missing values display as undefined. In the example below, the field age has no recorded value in 40% of the sampled documents.

Example of sparsely applied data type

Strings can appear in three different ways. If every string in a field is unique, Compass shows a random selection of string values from that field. Click the circular refresh icon to see a new set of randomly selected values from the field.

Example of string data types

If there are only a few different string values, Compass shows the strings in a single graded bar that indicates the percentage of documents holding each string value.

Example of few string data types

If there are multiple string values with some duplicates, Compass shows a histogram indicating the frequency of each string found within the field.

Example of string data types as a histogram
Note

Move the mouse over each bar to display a tooltip that shows the value of the string.
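The graded bar and the histogram both rest on counting how often each string occurs. The sketch below reproduces that count in plain Python under stated assumptions: the function name and the borough values are illustrative, and the rule Compass uses to choose between a random sample, a graded bar, and a histogram is not modeled.

```python
from collections import Counter


def string_value_shares(values):
    """Return (value, percentage) pairs, most frequent value first."""
    counts = Counter(values)
    total = len(values)
    if total == 0:
        return []
    return [(value, 100 * n / total) for value, n in counts.most_common()]


# Hypothetical sampled values of a low-cardinality string field.
boroughs = ["Manhattan", "Brooklyn", "Manhattan", "Queens", "Manhattan", "Brooklyn"]
shares = string_value_shares(boroughs)
all_unique = len(set(boroughs)) == len(boroughs)
```

Here Manhattan accounts for half of the sampled values and comes first in the result, which matches the ordering of a graded bar with the most frequent string on the left.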

Numbers are similar to strings in their representation. Unique numbers are shown in the following manner:

Example of number data type

Duplicate numbers are shown in a histogram that indicates their frequency:

Example of duplicate number data types

Fields that represent dates (and fields that contain the ObjectId data type, which includes a timestamp) are shown across multiple bar charts. The two charts on the top row represent the day of the week and the time of day of the timestamp values.

The single chart on the bottom shows the first and last timestamp values, and the vertical lines represent the distribution of the timestamps across the range from first to last.

Example of Date data types
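The three date charts correspond to three simple aggregations: a count per day of the week, a count per hour of the day, and the earliest and latest values. The following sketch computes them in plain Python. The function name, the returned dictionary keys, and the last_login sample values are assumptions for illustration and do not describe how Compass bins its charts.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def date_distributions(timestamps):
    """Summarize datetimes by weekday, by hour of day, and by overall range."""
    by_weekday = Counter(WEEKDAYS[ts.weekday()] for ts in timestamps)
    by_hour = Counter(ts.hour for ts in timestamps)
    return {
        "by_weekday": by_weekday,
        "by_hour": by_hour,
        "first": min(timestamps),
        "last": max(timestamps),
    }


# Hypothetical last_login values from three sampled documents.
logins = [
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 2, 9, 45),
    datetime(2024, 1, 8, 17, 0),
]
summary = date_distributions(logins)
```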

Fields that contain a sub-document or an array are displayed with a small triangle next to them and a visual representation of the data contained within the sub-document or array.

Example of fields with embedded documents or arrays

Click on the triangle to expand the field and view the embedded documents:

Expanding the embedded documents

Fields that contain GeoJSON data or [longitude,latitude] arrays are displayed with interactive maps. For more information on interacting with location data in Compass, see Analyze Location Data.

Example of GeoJSON data types
Note

Third-party mapping services are not available in Compass Isolated Edition.

If a field has mixed types, you can view a separate chart for each type by clicking on the type name. In the example below, the age field shows the values that are strings:

Example of a field with mixed types

Clicking on the number type causes the chart to show its numeric data:

Example that shows numeric data for number type

In the Schema tab, you can type the filter manually into the query bar or generate the filter with the Compass query builder. The query builder allows you to select data elements from one or more fields in your schema and construct a query matching the selected elements.

Tip

You can compose the initial query filter by using the clickable query builder and then manually edit the generated filter to your exact requirements.

The following procedure describes the steps involved in building a complex query with the query bar.

1

In the Schema view, you can click on a chart value to build a query. For example, the following image shows the query filter built by clicking the EWR value for the departureAirportFsCode field.

Example of a created filter
2

To select multiple values for a field, click and drag the cursor over a selection of values, or hold Shift and click the desired values.

Example of selecting multiple values for a field
3

You can select values in additional fields to build a compound query. For example, the following image shows the compound query built by selecting a value in the flightId field.

Example of a compound query
4

To deselect a previously selected value, hold Shift and click the selected value:

Example of removing a value from a filter
5

To run the query, click Analyze. Click Reset to clear your query.
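The filter that results from the steps above is an ordinary MongoDB query document, so it can be pictured as a mapping from field names to conditions. The sketch below is an illustration, not Compass's implementation. It assumes that one selected value becomes an equality match and that several selected values become an $in condition. The function name and the JFK and 116 values are hypothetical.

```python
def build_filter(selections):
    """Turn {field: [selected values]} into a MongoDB query filter document.

    Assumption: a single selected value is matched by equality and several
    values are matched with $in. Fields with no selection are left out.
    """
    query = {}
    for field, values in selections.items():
        if not values:
            continue
        if len(values) == 1:
            query[field] = values[0]
        else:
            query[field] = {"$in": list(values)}
    return query


# Step 1: a single value clicked in one field.
single = build_filter({"departureAirportFsCode": ["EWR"]})

# Steps 2 and 3: several values in one field plus a value in a second field.
compound = build_filter({
    "departureAirportFsCode": ["EWR", "JFK"],
    "flightId": [116],
})
```

A filter built this way can be edited by hand afterwards, in the same way as the filter text in the query bar.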

In the Schema tab, you can use interactive maps to filter and analyze location data. If your field contains GeoJSON data or [longitude,latitude] arrays, the Schema tab displays a map containing the points from the field. The data type for location fields is coordinates.

Image showing example field with location data

You can apply a filter to the map to analyze only a specific range of points. To define a location filter:

  1. Click the Circle button at the top-right of the map.
  2. Click and drag on the map to draw a circle containing the area of the map you want to analyze.
  3. Repeat this process as desired to include additional areas of the map in the schema analysis.
Image showing map with filter circles drawn

The query bar updates as you draw location filters to show the exact coordinates used in the $geoWithin query applied to the schema analysis.

If you specify multiple location filters, the query becomes an $or query with multiple $geoWithin operators.
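The shape of such a query can be sketched as follows. This is a minimal illustration that assumes each drawn circle is expressed with the $centerSphere shape operator, which takes a [longitude, latitude] center and a radius in radians (a distance divided by the Earth's radius). The function names, the example coordinates, and the choice of kilometers are assumptions.

```python
EARTH_RADIUS_KM = 6378.1  # equatorial radius, used to convert km to radians


def circle_filter(field, longitude, latitude, radius_km):
    """Build a $geoWithin condition for one circle on the given field."""
    return {
        field: {
            "$geoWithin": {
                "$centerSphere": [[longitude, latitude], radius_km / EARTH_RADIUS_KM]
            }
        }
    }


def location_filter(field, circles):
    """Combine circles given as (longitude, latitude, radius_km) tuples."""
    clauses = [circle_filter(field, lng, lat, r) for lng, lat, r in circles]
    if not clauses:
        return {}
    if len(clauses) == 1:
        return clauses[0]
    return {"$or": clauses}  # several circles: match points inside any of them


# Hypothetical circles drawn over two areas of the map.
one_circle = location_filter("coordinates", [(-73.98, 40.75, 5)])
two_circles = location_filter("coordinates", [(-73.98, 40.75, 5), (-0.12, 51.5, 10)])
```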

To move or resize a location filter, click the edit button on the right side of the map. You will enter the filter editing mode, which looks like this:

Image showing map filter editing
To move a filter
Click and drag the square in the center of the circle.
To resize a filter
Click and drag the square at the edge of the circle.

After modifying your filters, click Save.

To delete a location filter from the map:

  1. Click the delete button on the right side of the map.
  2. Click either:

    • A location filter to delete that filter.
    • Clear All to delete all location filters.
  3. Click Save.

If the analysis of your schema times out, it might be because the collection you are analyzing is very large, causing MongoDB to stop the operation before the analysis is complete. Increase the value of MAX TIME MS to give the operation enough time to complete.

To increase the value of MAX TIME MS:

  1. In the query bar, expand Options.

    The Options button is on the right side of the query bar, next to the Analyze button.
  2. Increase the value of MAX TIME MS to accommodate your collection. MAX TIME MS defaults to 60000 milliseconds, or 60 seconds, but large collections might take longer than that to analyze.

Once you have increased the value of MAX TIME MS, retry your schema analysis by clicking Analyze.
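The same kind of time limit can be set when sampling a collection from a driver instead of from Compass. The sketch below assumes the PyMongo driver, whose Collection.aggregate method accepts a maxTimeMS option, and uses the $sample aggregation stage. The function name and the sample size of 1000 are hypothetical, and the code does not describe how Compass itself performs its analysis.

```python
def sample_documents(collection, size=1000, max_time_ms=60000):
    """Sample documents from a collection with a server-side time limit.

    `collection` is assumed to be a pymongo Collection (or any object with a
    compatible aggregate method). The server aborts the operation if it runs
    longer than max_time_ms milliseconds.
    """
    pipeline = [{"$sample": {"size": size}}]
    return list(collection.aggregate(pipeline, maxTimeMS=max_time_ms))


# Usage, assuming a reachable deployment and the pymongo package:
#   from pymongo import MongoClient
#   restaurants = MongoClient()["test"]["restaurants"]
#   docs = sample_documents(restaurants, max_time_ms=120000)
```

Raising max_time_ms here plays the same role as raising MAX TIME MS in the query bar: it gives a slow operation on a large collection more time before the server stops it.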