3.8 Metadata for the Document Using <meta>
The meta element allows you to write additional information or data about the HTML document in the head section between <head> and </head>. These can be instructions for the web browser, the web server, or a web crawler (also called a spider, searchbot, bot, or search robot). Even though the use of meta elements is optional, they are often specified. It’s quite difficult, especially for beginners, to keep track of the many existing HTML attributes and the possible attribute values you can use with the meta element. Many of these additional details aren’t standardized at all.
Web Crawler
A web crawler is an application that searches the internet and analyzes entire websites. Different types of web crawlers are in use, and they collect different types of information. Search engines also use a web crawler to analyze websites. The principle is quite similar to web browsing, where hyperlinks take you from one web page to other URLs. A web crawler stores these URLs and visits the pages one by one. The websites are then indexed to make searching for the relevant data possible.
3.8.1 The Most Commonly Used Metadata
A meta element is usually composed of at least two attributes: either a name/content combination or an http-equiv/content combination. In addition, a special version exists for character encoding.
“name/content” Combinations: Freely Definable Metadata
A meta element with the HTML attribute name can basically contain any information in the HTML attribute content. Theoretically, you could assign any value to name yourself. Nevertheless, some standard values for the name attribute have been defined in HTML. However, these name/content combinations aren’t intended for personal information, but should only contain information about the HTML document. A simple example might look as follows:
...
<head>
<title>Freely definable metadata</title>
<meta name="author" content="John Doe">
<meta name="keywords" content="metadata, meta, html">
<meta charset="UTF-8">
</head>
...
Here, you can see two typical name/content combinations. The first defines the author of the web page, while the second defines keywords for search engines. You could use any number of other meta elements here.
“http-equiv/content” Combinations: HTTP Equivalents
The specifications with http-equiv (also called pragma directives) were intended for communicating with the web server. The web server was supposed to read this information, take it into account when responding to the client (web browser), and use it in the HTTP response header. However, web servers don’t actually parse HTML documents, so again it’s up to the browser how this information gets processed. Let’s look at a simple example:
<!doctype html>
<html lang="en">
<head>
<title>HTTP equivalents</title>
<meta http-equiv="refresh" content="5">
<meta charset="UTF-8">
</head>
<body>
<p>Page gets refreshed every 5 seconds.</p>
</body>
</html>
Listing 3.7 /examples/chapter003/3_8_1/index.html
The refresh value for the http-equiv attribute and the value 5 for the content attribute make the web browser refresh the web page every five seconds.
Setting the Character Encoding for the HTML Document
In addition to the name/content and http-equiv/content pairs, there’s a third option that allows you to specify the character encoding (more easily). Generally, you should use this information when creating a web page that’s written in a language other than English. This is the line with the meta element that you use in every example of the book:
<meta charset="UTF-8">
This will ensure that special characters such as German umlauts are also displayed correctly, thanks to the UTF-8 character encoding standard. Besides the internet, modern operating systems also use UTF-8, and unless you have a reason to use a different character set, you should always work with UTF-8.
3.8.2 Setting the Viewport
Let’s jump ahead to the viewport now, as a correct setting will prevent a responsive website from being displayed in a small view on the mobile device. The viewport is the area of the browser window where the web content gets displayed. Without any special precautions, web pages on a smartphone’s mobile browser would be scaled down until they fit completely on the screen. This allows visitors to keep an overview and zoom into the page.
If you want to create modern websites today, then taking into account the different device sizes and a responsive web design is part of the development process. When creating responsive web pages, you must prevent this automatic downsizing. You can do this via a meta element like the following:
<meta name="viewport" content="width=device-width">
It tells the browser to use the actual width of the device rather than an imaginary width. You can see the result of this line in a responsive web page in Figure 3.7, where the automatic resizing took effect on the left-hand side and the viewport with the meta tag was used on the right.
Figure 3.7
A Responsive Website: (Left) without a Meta Viewport and (Right) with a Meta Viewport
I’ll describe the viewport and responsive web design separately in Chapter 13. Without explaining it in more detail here, the following meta element has become the accepted form in the meantime:
<meta name="viewport"
content="width=device-width, initial-scale=1.0, shrink-to-fit=no">
By using initial-scale=1.0, you can make sure that the browser displays the page with the normal zoom level, and with shrink-to-fit=no, you instruct the Safari browser on the iPad not to shrink the page even in split view.
3.8.3 Specifying Useful Metadata for a Web Crawler
This section provides a brief description of some metadata for search engine robots (web crawlers). However, you must be aware that this information is only a recommendation for the web crawlers. Whether the search bots adhere to it is out of your hands. At least these attribute values were partly (co)designed by Google, Yahoo, and Microsoft, so these providers will probably stick to them. If you want to include information for the web crawler as metadata, you must assign the robots value to the name attribute. In the content attribute, you write (or suggest) what the web crawler should do when it visits the web page, for example:
<meta name="robots" content="index,follow">
This allows the search robot to include the web page in the search engine index and to follow the hyperlinks on the page. However, you can usually omit this information because this is the default behavior of a web crawler.
If you don’t want the page to be indexed or the hyperlinks to be followed, you can use the attribute values noindex and/or nofollow in content:
<meta name="robots" content="noindex">
Here, you indicate that your web page shouldn’t be included in the search engine index (noindex), so that the page can’t be found via a search engine. If you want the page to be included in the search engine index, but don’t want the hyperlinks to be followed, you merely need to use the attribute value nofollow in content.
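The remaining combinations described above might be written as in the following sketch. It only illustrates the possible content values; in a real document you would keep just one of these lines, and whether a crawler honors it is still up to the crawler.

```html
<!-- Include the page in the index, but don't follow its hyperlinks -->
<meta name="robots" content="nofollow">

<!-- Neither index the page nor follow its hyperlinks -->
<meta name="robots" content="noindex,nofollow">
```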
3.8.4 Useful Metadata for Search Engines
Especially for search engines, two name values are important, namely keywords and description. However, the keywords value has lost importance because it was misused in the past to feed search engines with many misleading keywords (keyword stuffing) in order to be listed as close to the top of the search results as possible. In the meantime, search engines index the content of a website in a more targeted manner and tend to ignore the keywords (or give them less weight). If you still want to specify keywords, you must separate the individual keywords in content with commas, as the following example shows:
<meta name="keywords" content="html, meta, keywords">
Here, for example, html, meta, and keywords were used as keywords for the website.
What’s more interesting, however, is the description text of the website. Although this text probably isn’t taken into account directly for the ranking in search results, the description is, in addition to the title, the first thing a user sees listed in the search engine as information from your website. You should keep the description as short and precise as possible and use a maximum of 150 to 250 characters (depending on the search engine). A text that’s too long will be shortened.
Here’s an example of such a description:
...
<head>
<title>Description text for search engines</title>
<meta charset="UTF-8">
<meta name="description"
content="A description should be as
short and precise as possible. Here
you should summarize in 2-3 sentences
what this page is about. Characters
exceeding the limit will be shortened.">
</head>
<body>
...
</body>
...
In Google, for example, this description text is usually listed as shown in Figure 3.8.
Figure 3.8
Along with the <title> Element, the Description Text Is Often One of the First Features to Appear in a Search Engine
If you don’t specify a description with a meta element, this text will get generated from parts of the page content. However, it isn’t possible to predict exactly what this description will look like and what kind of text will be used for it. For this reason, you should definitely take the description into your own hands instead of leaving it up to the algorithm of a search engine.
The First Impression Is Important
Although it isn’t as important as it was in the early days of the internet, metadata still plays a significant role in search engine coverage. You should therefore always pay attention to the title element and the description (name="description") because these elements are often the first things that visitors see of your website when the page is listed in a search.
3.8.5 Useful Metadata for the Web Browser
If you want to refresh the content of a web page after a certain time or redirect it to another URL, you can use the http-equiv attribute with the refresh value for this purpose. The content attribute enables you to set the time after which the update or redirection should take place.
You can force a refresh of the web page as follows:
<meta http-equiv="refresh" content="30">
This would refresh the currently loaded web page every 30 seconds.
The redirection to another website can be set up in a similar manner:
<meta http-equiv="refresh"
content="5; URL=http://domain.com/">
This causes the browser to switch to the domain.com URL after five seconds. You could also use zero seconds here, but with a delay, you can at least let the user know in the HTML document body why they are being redirected and where.
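A delayed redirect with a notice in the document body might look like the following sketch. The page title and the wording of the notice are assumptions made for illustration; the hyperlink is added so that visitors can still reach the new address if their browser ignores the refresh instruction.

```html
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>This page has moved</title>
<!-- Redirect to the new address after five seconds -->
<meta http-equiv="refresh" content="5; URL=http://domain.com/">
</head>
<body>
<!-- Explain why and where the visitor is being redirected -->
<p>This page has moved. You will be redirected to
<a href="http://domain.com/">domain.com</a> in five seconds.</p>
</body>
</html>
```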
Stop Using the Automatic Redirection Feature
Automatic redirection can be helpful if the address of the web project has changed. However, some browsers ignore this redirection depending on their settings. In addition, you should also note that search engines ignore this redirection. In this context, it’s often better to place a hyperlink to the new URL, together with an explanatory note, in the HTML document body. In addition, when you use the time 0, it could be difficult for the visitor of the page to use the browser’s back button because this would throw them forward again and again. Alternatively, a redirection can also be created on the server. For example, if you have access to the configuration file .htaccess (for the Apache web server) or web.config (for IIS), you can configure the redirect there. Automatic redirection has been classified as deprecated by the W3C anyway, which is why you should refrain from using it for future web projects. But because redirects are still commonly used, I included the topic here.
As mentioned previously, you can also use the old character encoding specification:
<meta http-equiv="content-type"
content="text/html; charset=utf-8" />
This specification corresponds to the more recent specification introduced with HTML5:
<meta charset="UTF-8">
The old specification has the advantage that it will also be understood by older browsers that don’t know <meta charset="UTF-8">. Note that the current HTML standard allows only one of the two declarations per document, so choose one rather than combining them.
3.8.6 Using General Metadata
In addition, there’s a considerable amount of general metadata, such as the author of the HTML document or the date and time the document was edited. This is helpful, for example, when several people work on one HTML project. You can specify all this information as a name/content combination. Let’s look at some examples:
<meta name="author" content="John Doe">
<meta name="date" content="2021-01-15T12:00:00+01:00">
Here, the author of the web page (author) and the date of the last change (date) were indicated. If you want to provide personal information for the readers about the current HTML document, you shouldn’t do that via metadata, but directly in the HTML document in a readable manner. The metadata is only useful when someone looks at the source code of the document or when it’s read by software. There’s also other general metadata such as generator, which provides information on the software that was used to create the website. Additionally, you can use application-name to name the application if the web page belongs to a specific web platform or if a specific web application is running in the web page.
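As a brief sketch, these two values could be specified as follows. The content values are hypothetical placeholders and don’t refer to any real software or application.

```html
<!-- Software that generated the page (hypothetical name) -->
<meta name="generator" content="Example Site Builder 1.0">

<!-- Name of the web application the page belongs to (hypothetical name) -->
<meta name="application-name" content="Example Web App">
```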
Further Research on the Internet
An overview of the standard metadata can be found on the following website: www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#standard-metadata-names. Proposed or future metadata, if any, can be found on this website: http://wiki.whatwg.org/wiki/MetaExtensions.
3.8.7 My Recommendation: This Metadata Belongs in the Basic HTML Framework
As you’ve now been introduced to a number of different types of metadata, you’ll probably wonder which type of metadata will be useful for your own website. This is ultimately up to you, but personally I always use at least the character encoding for UTF-8, a page description, and the viewport in the head element:
...
<head>
<title>German umlauts</title>
...
<meta charset="UTF-8" />
<meta name="description" content="A description should preferably be as
short and precise as possible. Here
you should summarize in 2-3 sentences
what this page is about. Characters
exceeding the limit will be shortened."/>
<meta name="viewport"
content="width=device-width, initial-scale=1.0, shrink-to-fit=no" />
</head>
...
Examples of the Book Remain Shortened
In the examples for the book, I mostly used only the character encoding to keep the source code clearer. The viewport is useful if you create a responsive website, which is what you usually want. The page description is important when you publish the website and want search engines to use the text as a short summary of the contents.
3.8.8 HTML Attributes for the <meta> Element
Table 3.7 provides an overview of the HTML attributes for the meta element.
Table 3.7
Attributes for the <meta> Element