Puppeteer

Guides | API | FAQ | Contributing | Troubleshooting

Puppeteer is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full (non-headless) Chrome/Chromium.

What can I do?

Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:
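For instance, capturing a screenshot of a page takes only a few calls. The sketch below is a minimal illustration rather than an official recipe: the helper name `captureScreenshot`, the URL, and the output path are hypothetical, and the Puppeteer module is passed in as an argument so the snippet carries no import of its own.

```javascript
// Hypothetical helper: opens `url` in a new page and saves a screenshot to `path`.
// `puppeteer` is the module imported from 'puppeteer'.
async function captureScreenshot(puppeteer, url, path) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url);
  // page.screenshot writes the image to `path` when that option is given.
  await page.screenshot({path});
  await browser.close();
}

// Example call (assumes Puppeteer is installed and imported as `puppeteer`):
// await captureScreenshot(puppeteer, 'https://example.com', 'example.png');
```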

Getting Started

Installation

To use Puppeteer in your project, run:

npm i puppeteer
# or `yarn add puppeteer`
# or `pnpm i puppeteer`

When you install Puppeteer, it automatically downloads a recent version of Chromium (~170MB macOS, ~282MB Linux, ~280MB Windows) that is guaranteed to work with Puppeteer. For a version of Puppeteer that does not download a browser on install, see puppeteer-core.

Configuring Puppeteer

Puppeteer uses several defaults that can be customized through configuration files.

For example, to change the default cache directory Puppeteer uses to install browsers, you can add a .puppeteerrc.cjs (or puppeteer.config.cjs) at the root of your application with the contents:

const {join} = require('path');

/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  // Changes the cache location for Puppeteer.
  cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};

After adding the configuration file, you will need to remove and reinstall puppeteer for it to take effect.

See Configuring Puppeteer for more information.

puppeteer-core

Since v1.7.0, every release publishes two packages:

puppeteer is a product for browser automation. When installed, it downloads a version of Chromium, which it then drives using puppeteer-core. Being an end-user product, puppeteer automates several workflows using reasonable defaults that can be customized.

puppeteer-core is a library to help drive anything that supports the DevTools Protocol. Being a library, puppeteer-core is driven entirely through its programmatic interface: it assumes no defaults and does not download Chromium when installed.

You should use puppeteer-core if you are connecting to a remote browser or managing browsers yourself. If you are managing browsers yourself, you will need to call puppeteer.launch with an explicit executablePath (or channel if the browser is installed in a standard location).

When using puppeteer-core, remember to change the import:

import puppeteer from 'puppeteer-core';
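As a rough sketch of the two situations described above, the helper below either connects to a browser that is already running or launches one you manage yourself. The helper name `openBrowser` and its `wsEndpoint` parameter are hypothetical; `connect`, `launch`, `browserWSEndpoint`, `executablePath`, and `channel` are Puppeteer's own names. The module is passed in as a parameter so the snippet carries no import of its own.

```javascript
// Hypothetical helper for puppeteer-core, which never downloads a browser.
// `puppeteerCore` is the module imported from 'puppeteer-core'.
async function openBrowser(puppeteerCore, {wsEndpoint, executablePath, channel} = {}) {
  if (wsEndpoint) {
    // Remote browser: attach over its DevTools WebSocket endpoint.
    return puppeteerCore.connect({browserWSEndpoint: wsEndpoint});
  }
  if (executablePath) {
    // Self-managed browser at an explicit location on disk.
    return puppeteerCore.launch({executablePath});
  }
  if (channel) {
    // Browser installed in a standard location, for example channel 'chrome'.
    return puppeteerCore.launch({channel});
  }
  throw new Error('Provide wsEndpoint, executablePath, or channel');
}

// Example call (assumes Chrome is installed in its standard location):
// const browser = await openBrowser(puppeteer, {channel: 'chrome'});
```

Failing loudly when nothing is supplied mirrors the point that puppeteer-core assumes no defaults.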

Usage

Puppeteer follows the latest maintenance LTS version of Node.

Puppeteer will be familiar to people using other browser testing frameworks. You launch/connect a browser, create some pages, and then manipulate them with Puppeteer's API.

For more in-depth usage, check our guides and examples.

Example

The following example searches developers.google.com/web for articles tagged "Headless Chrome" and scrapes results from the results page.

import puppeteer from 'puppeteer';

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://developers.google.com/web/');

  // Type into search box.
  await page.type('.devsite-search-field', 'Headless Chrome');

  // Wait for suggest overlay to appear and click "show all results".
  const allResultsSelector = '.devsite-suggest-all-results';
  await page.waitForSelector(allResultsSelector);
  await page.click(allResultsSelector);

  // Wait for the results page to load and display the results.
  const resultsSelector = '.gsc-results .gs-title';
  await page.waitForSelector(resultsSelector);

  // Extract the results from the page.
  const links = await page.evaluate(resultsSelector => {
    return [...document.querySelectorAll(resultsSelector)].map(anchor => {
      const title = anchor.textContent.split('|')[0].trim();
      return `${title} - ${anchor.href}`;
    });
  }, resultsSelector);

  // Print all the results.
  console.log(links.join('\n'));

  await browser.close();
})();
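One thing the example above leaves open is cleanup: if any step throws, browser.close() is never reached and the browser process may linger. A possible pattern, sketched here with a hypothetical `withBrowser` helper that receives the Puppeteer module as an argument, is to wrap the work in `try`/`finally`.

```javascript
// Hypothetical helper: launches a browser, hands it to `task`,
// and closes the browser whether `task` succeeds or throws.
async function withBrowser(puppeteer, task, launchOptions = {}) {
  const browser = await puppeteer.launch(launchOptions);
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Example call (assumes `puppeteer` is imported as in the example above):
// const title = await withBrowser(puppeteer, async browser => {
//   const page = await browser.newPage();
//   await page.goto('https://developers.google.com/web/');
//   return page.title();
// });
```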

Default runtime settings

1. Uses headless mode

Puppeteer launches Chromium in headless mode. To launch a full version of Chromium, set the headless option when launching a browser:

const browser = await puppeteer.launch({headless: false}); // default is true
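When debugging, it can be handy to switch modes without editing code. The sketch below derives launch options from an environment variable; the variable name `HEADFUL` and the helper `launchOptionsFromEnv` are made up for this illustration, while `headless` and `slowMo` are launch options Puppeteer accepts.

```javascript
// Hypothetical helper: HEADFUL=1 opens a visible browser window and slows
// Puppeteer operations down so the run is easier to follow by eye.
function launchOptionsFromEnv(env = process.env) {
  const headful = env.HEADFUL === '1';
  return headful
    ? {headless: false, slowMo: 100} // slowMo is a delay in milliseconds
    : {headless: true};
}

// Example call (assumes `puppeteer` is imported):
// const browser = await puppeteer.launch(launchOptionsFromEnv());
```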

2. Runs a bundled version of Chromium

By default, Puppeteer downloads and uses a specific version of Chromium so its API is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium, pass in the executable's path when creating a Browser instance:

const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});

You can also use Puppeteer with Firefox Nightly (experimental support). See Puppeteer.launch for more information.

See this article for a description of the differences between Chromium and Chrome. This article describes some differences for Linux users.

3. Creates a fresh user profile

Puppeteer creates its own browser user profile which it cleans up on every run.
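If you would rather keep state between runs, such as persistent cookies or local storage, Puppeteer's `userDataDir` launch option points the browser at a profile directory of your choosing instead of a temporary one. A minimal sketch, assuming a hypothetical helper name and directory, with the Puppeteer module passed in as an argument:

```javascript
// Hypothetical helper: launches with a persistent profile directory so that
// browser state survives across runs instead of being discarded.
async function launchWithProfile(puppeteer, profileDir) {
  return puppeteer.launch({userDataDir: profileDir});
}

// Example call (assumes `puppeteer` is imported; the path is illustrative):
// const browser = await launchWithProfile(puppeteer, './.profile');
```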

Using Docker

See our guide on using Docker.

Using Chrome Extensions

See our guide on using Chrome extensions.

Resources

Contributing

Check out our contributing guide to get an overview of Puppeteer development.

FAQ

Our FAQ has migrated to our site.