@Andream 2017-12-10T19:48:56.000000Z
Released API: v0.12.0 | v0.11.0 | v0.10.2 | v0.10.1 | v0.10.0 | v0.9.0

Puppeteer API v0.13.0

Overview

Puppeteer is a Node library which provides a high-level API to control Chromium over the DevTools Protocol.

Puppeteer API is hierarchical and mirrors browser structure. On the following diagram, faded entities are not currently represented in Puppeteer.

[Figure: Puppeteer overview diagram]

(Diagram source: link)

Environment Variables

Puppeteer looks for certain environment variables to aid its operations. These variables can either be set in the environment or in the npm config.

class: Puppeteer

Puppeteer module provides a method to launch a Chromium instance.
The following is a typical example of using Puppeteer to drive automation:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      await page.goto('https://www.google.com');
      // other actions...
      await browser.close();
    });

puppeteer.connect(options)

This method attaches Puppeteer to an existing Chromium instance.

puppeteer.executablePath()

puppeteer.launch([options])

The method launches a browser instance with the given arguments. The browser will be closed when the parent Node.js process is closed.

NOTE Puppeteer works best with the version of Chromium it is bundled with. There is no guarantee it will work with any other version. Use executablePath option with extreme caution. If Google Chrome (rather than Chromium) is preferred, a Chrome Canary or Dev Channel build is suggested.

class: Browser

A Browser is created when Puppeteer connects to a Chromium instance, either through puppeteer.launch or puppeteer.connect.

An example of using a Browser to create a Page:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      await page.goto('https://example.com');
      await browser.close();
    });

An example of disconnecting from and reconnecting to a Browser:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      // Store the endpoint to be able to reconnect to Chromium
      const browserWSEndpoint = browser.wsEndpoint();
      // Disconnect puppeteer from Chromium
      browser.disconnect();

      // Use the endpoint to reestablish a connection
      const browser2 = await puppeteer.connect({browserWSEndpoint});
      // Close Chromium
      await browser2.close();
    });

event: 'disconnected'

Emitted when Puppeteer gets disconnected from the browser instance. This might happen for one of the following reasons:
- browser closed or crashed
- browser.disconnect method was called

event: 'targetchanged'

Emitted when the url of a target changes.

event: 'targetcreated'

Emitted when a target is created, for example when a new page is opened by window.open or browser.newPage.

event: 'targetdestroyed'

Emitted when a target is destroyed, for example when a page is closed.

browser.close()

Closes Chromium and all of its pages (if any were opened). The browser object itself is considered disposed and cannot be used anymore.

browser.disconnect()

Disconnects Puppeteer from the browser, but leaves the Chromium process running. After calling disconnect, the browser object is considered disposed and cannot be used anymore.

browser.newPage()

browser.pages()

browser.targets()

browser.version()

NOTE the format of browser.version() might change with future releases of Chromium.

browser.wsEndpoint()

Browser websocket endpoint which can be used as an argument to puppeteer.connect. The format is ws://${host}:${port}/devtools/browser/<id>

You can find the webSocketDebuggerUrl from http://${host}:${port}/json/version. Learn more about the devtools protocol and the browser endpoint.
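
As a rough illustration of the endpoint format, the sketch below splits an endpoint string into its parts with the URL class that Node.js exposes globally. The helper name parseWSEndpoint and the sample endpoint value are invented for this example and are not part of Puppeteer.

```javascript
// Hypothetical helper (not part of Puppeteer): splits an endpoint of the form
// ws://${host}:${port}/devtools/browser/<id> into its components.
function parseWSEndpoint(endpoint) {
  const url = new URL(endpoint);
  const match = /^\/devtools\/browser\/(.+)$/.exec(url.pathname);
  if (url.protocol !== 'ws:' || !match)
    throw new Error('Unexpected endpoint format: ' + endpoint);
  return {host: url.hostname, port: Number(url.port), id: match[1]};
}

// The endpoint below is an invented sample value.
const parts = parseWSEndpoint('ws://127.0.0.1:9222/devtools/browser/abc-123');
console.log(parts.host, parts.port, parts.id);
```

In a real script the string would come from browser.wsEndpoint() rather than a literal.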

class: Page

Page provides methods to interact with a single tab in Chromium. One Browser instance might have multiple Page instances.

This example creates a page, navigates it to a URL, and then saves a screenshot:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      await page.goto('https://example.com');
      await page.screenshot({path: 'screenshot.png'});
      await browser.close();
    });

event: 'console'

Emitted when JavaScript within the page calls one of the console API methods, e.g. console.log or console.dir. Also emitted if the page throws an error or a warning.

The arguments passed into console.log appear as arguments on the event handler.

An example of handling the console event:

    page.on('console', msg => {
      for (let i = 0; i < msg.args.length; ++i)
        console.log(`${i}: ${msg.args[i]}`);
    });
    page.evaluate(() => console.log('hello', 5, {foo: 'bar'}));

event: 'dialog'

Emitted when a JavaScript dialog appears, such as alert, prompt, confirm or beforeunload. Puppeteer can respond to the dialog via Dialog's accept or dismiss methods.

event: 'error'

Emitted when the page crashes.

NOTE error event has a special meaning in Node, see error events for details.

event: 'frameattached'

Emitted when a frame is attached.

event: 'framedetached'

Emitted when a frame is detached.

event: 'framenavigated'

Emitted when a frame is navigated to a new url.

event: 'load'

Emitted when the JavaScript load event is dispatched.

event: 'metrics'

Emitted when the JavaScript code makes a call to console.timeStamp. For the list
of metrics see page.metrics.

event: 'pageerror'

Emitted when an uncaught exception happens within the page.

event: 'request'

Emitted when a page issues a request. The request object is read-only.
In order to intercept and mutate requests, see page.setRequestInterception.

event: 'requestfailed'

Emitted when a request fails, for example by timing out.

event: 'requestfinished'

Emitted when a request finishes successfully.

event: 'response'

Emitted when a response is received.

page.$(selector)

The method runs document.querySelector within the page. If no element matches the selector, the return value resolves to null.

Shortcut for page.mainFrame().$(selector).

page.$$(selector)

The method runs document.querySelectorAll within the page. If no elements match the selector, the return value resolves to [].

Shortcut for page.mainFrame().$$(selector).

page.$$eval(selector, pageFunction[, ...args])

This method runs document.querySelectorAll within the page and passes the result as the first argument to pageFunction.

If pageFunction returns a Promise, then page.$$eval would wait for the promise to resolve and return its value.

Examples:

    const divsCounts = await page.$$eval('div', divs => divs.length);

page.$eval(selector, pageFunction[, ...args])

This method runs document.querySelector within the page and passes it as the first argument to pageFunction. If there's no element matching selector, the method throws an error.

If pageFunction returns a Promise, then page.$eval would wait for the promise to resolve and return its value.

Examples:

    const searchValue = await page.$eval('#search', el => el.value);
    const preloadHref = await page.$eval('link[rel=preload]', el => el.href);
    const html = await page.$eval('.main-container', e => e.outerHTML);

Shortcut for page.mainFrame().$eval(selector, pageFunction).

page.addScriptTag(options)

Adds a <script> tag into the page with the desired url or content.

Shortcut for page.mainFrame().addScriptTag(options).

page.addStyleTag(options)

Adds a <link rel="stylesheet"> tag into the page with the desired url or a <style type="text/css"> tag with the content.

Shortcut for page.mainFrame().addStyleTag(options).

page.authenticate(credentials)

Provide credentials for http authentication.

To disable authentication, pass null.

page.bringToFront()

Brings page to front (activates tab).

page.click(selector[, options])

This method fetches an element with selector, scrolls it into view if needed, and then uses page.mouse to click in the center of the element.
If there's no element matching selector, the method throws an error.

page.close()

page.content()

Gets the full HTML contents of the page, including the doctype.

page.cookies(...urls)

If no URLs are specified, this method returns cookies for the current page URL.
If URLs are specified, only cookies for those URLs are returned.

page.deleteCookie(...cookies)

page.emulate(options)

Emulates given device metrics and user agent. This method is a shortcut for calling two methods:
- page.setUserAgent(userAgent)
- page.setViewport(viewport)
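
The equivalence above can be sketched as a small helper. This is only an illustration: the function name emulateManually is hypothetical, and calling the two methods one after the other is an assumption of this sketch, not a statement about how Puppeteer orders them internally.

```javascript
// Sketch of the shortcut described above. `page` is expected to be a
// Puppeteer Page, and `options` an object with `userAgent` and `viewport`
// fields. The name emulateManually is made up for this example.
async function emulateManually(page, options) {
  await page.setUserAgent(options.userAgent);
  await page.setViewport(options.viewport);
}
```

Calling emulateManually(page, options) should then have the same visible effect as page.emulate(options).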

To aid emulation, puppeteer provides a list of device descriptors which can be obtained via the require('puppeteer/DeviceDescriptors') command.
Below is an example of emulating an iPhone 6 in puppeteer:

    const puppeteer = require('puppeteer');
    const devices = require('puppeteer/DeviceDescriptors');
    const iPhone = devices['iPhone 6'];

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      await page.emulate(iPhone);
      await page.goto('https://www.google.com');
      // other actions...
      await browser.close();
    });

The list of all available devices can be found in the source code: DeviceDescriptors.js.

page.emulateMedia(mediaType)

page.evaluate(pageFunction, ...args)

If the function passed to page.evaluate returns a Promise, then page.evaluate waits for the promise to resolve and returns its value.

If the function passed into page.evaluate returns a non-Serializable value, then page.evaluate resolves to undefined.

    const result = await page.evaluate(() => {
      return Promise.resolve(8 * 7);
    });
    console.log(result); // prints "56"

A string can also be passed in instead of a function.

    console.log(await page.evaluate('1 + 2')); // prints "3"

ElementHandle instances can be passed as arguments to the page.evaluate:

    const bodyHandle = await page.$('body');
    const html = await page.evaluate(body => body.innerHTML, bodyHandle);
    await bodyHandle.dispose();

Shortcut for page.mainFrame().evaluate(pageFunction, ...args).

page.evaluateHandle(pageFunction, ...args)

If the function passed to page.evaluateHandle returns a Promise, then page.evaluateHandle waits for the promise to resolve and returns its value.

    const aWindowHandle = await page.evaluateHandle(() => Promise.resolve(window));
    aWindowHandle; // Handle for the window object.

A string can also be passed in instead of a function.

    const aHandle = await page.evaluateHandle('document'); // Handle for the 'document'.

JSHandle instances can be passed as arguments to the page.evaluateHandle:

    const aHandle = await page.evaluateHandle(() => document.body);
    const resultHandle = await page.evaluateHandle(body => body.innerHTML, aHandle);
    console.log(await resultHandle.jsonValue());
    await resultHandle.dispose();

Shortcut for page.mainFrame().executionContext().evaluateHandle(pageFunction, ...args).

page.evaluateOnNewDocument(pageFunction, ...args)

Adds a function which would be invoked in one of the following scenarios:
- whenever the page is navigated
- whenever the child frame is attached or navigated. In this case, the function is invoked in the context of the newly attached frame

The function is invoked after the document was created but before any of its scripts were run. This is useful for amending the JavaScript environment, e.g. to seed Math.random.

page.exposeFunction(name, puppeteerFunction)

The method adds a function called name on the page's window object.
When called, the function executes puppeteerFunction in Node.js and returns a Promise which resolves to the return value of puppeteerFunction.

If the puppeteerFunction returns a Promise, it will be awaited.

NOTE Functions installed via page.exposeFunction survive navigations.

An example of adding an md5 function into the page:

    const puppeteer = require('puppeteer');
    const crypto = require('crypto');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      page.on('console', msg => console.log(msg.text));
      await page.exposeFunction('md5', text =>
        crypto.createHash('md5').update(text).digest('hex')
      );
      await page.evaluate(async () => {
        // use window.md5 to compute hashes
        const myString = 'PUPPETEER';
        const myHash = await window.md5(myString);
        console.log(`md5 of ${myString} is ${myHash}`);
      });
      await browser.close();
    });

An example of adding a window.readfile function into the page:

    const puppeteer = require('puppeteer');
    const fs = require('fs');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      page.on('console', msg => console.log(msg.text));
      await page.exposeFunction('readfile', async filePath => {
        return new Promise((resolve, reject) => {
          fs.readFile(filePath, 'utf8', (err, text) => {
            if (err)
              reject(err);
            else
              resolve(text);
          });
        });
      });
      await page.evaluate(async () => {
        // use window.readfile to read contents of a file
        const content = await window.readfile('/etc/hosts');
        console.log(content);
      });
      await browser.close();
    });

page.focus(selector)

This method fetches an element with selector and focuses it.
If there's no element matching selector, the method throws an error.

page.frames()

page.goBack(options)

Navigate to the previous page in history.

page.goForward(options)

Navigate to the next page in history.

page.goto(url, options)

The page.goto will throw an error if:
- there's an SSL error (e.g. in case of self-signed certificates).
- target URL is invalid.
- the timeout is exceeded during navigation.
- the main resource failed to load.

NOTE page.goto either throws or returns the main resource response. The only exception is navigation to about:blank, which succeeds and returns null.

NOTE Headless mode doesn't support navigating to a PDF document. See the upstream issue.

page.hover(selector)

This method fetches an element with selector, scrolls it into view if needed, and then uses page.mouse to hover over the center of the element.
If there's no element matching selector, the method throws an error.

page.keyboard

page.mainFrame()

Page is guaranteed to have a main frame which persists during navigations.

page.metrics()

NOTE All timestamps are in monotonic time: monotonically increasing time in seconds since an arbitrary point in the past.

page.mouse

page.pdf(options)

NOTE Generating a pdf is currently only supported in Chrome headless.

page.pdf() generates a pdf of the page with print css media. To generate a pdf with screen media, call page.emulateMedia('screen') before calling page.pdf():

    // Generates a PDF with 'screen' media type.
    await page.emulateMedia('screen');
    await page.pdf({path: 'page.pdf'});

The width, height, and margin options accept values labeled with units. Unlabeled values are treated as pixels.

A few examples:
- page.pdf({width: 100}) - prints with width set to 100 pixels
- page.pdf({width: '100px'}) - prints with width set to 100 pixels
- page.pdf({width: '10cm'}) - prints with width set to 10 centimeters.

All possible units are:
- px - pixel
- in - inch
- cm - centimeter
- mm - millimeter
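
To make the unit handling concrete, here is a minimal sketch of converting such values to pixels. The helper name toPixels is hypothetical, and the conversion factors assume the CSS convention of 96 pixels per inch; the text above does not state which factors Puppeteer itself applies.

```javascript
// Assumed conversion factors, based on the CSS convention of 96 pixels per inch.
const PIXELS_PER_UNIT = {px: 1, in: 96, cm: 96 / 2.54, mm: 96 / 25.4};

// Hypothetical helper: converts a value such as 100, '100px' or '10cm' to pixels.
function toPixels(value) {
  if (typeof value === 'number')
    return value; // unlabeled values are treated as pixels
  const match = /^(\d+(?:\.\d+)?)(px|in|cm|mm)?$/.exec(String(value).trim());
  if (!match)
    throw new Error('Unsupported size: ' + value);
  const unit = match[2] || 'px'; // a bare numeric string is also treated as pixels
  return parseFloat(match[1]) * PIXELS_PER_UNIT[unit];
}

console.log(toPixels(100));     // 100
console.log(toPixels('100px')); // 100
console.log(toPixels('1in'));   // 96
```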

The format options are:
- Letter: 8.5in x 11in
- Legal: 8.5in x 14in
- Tabloid: 11in x 17in
- Ledger: 17in x 11in
- A0: 33.1in x 46.8in
- A1: 23.4in x 33.1in
- A2: 16.5in x 23.4in
- A3: 11.7in x 16.5in
- A4: 8.27in x 11.7in
- A5: 5.83in x 8.27in
- A6: 4.13in x 5.83in

page.queryObjects(prototypeHandle)

The method iterates JavaScript heap and finds all the objects with the given prototype.

    // Create a Map object
    await page.evaluate(() => window.map = new Map());
    // Get a handle to the Map object prototype
    const mapPrototype = await page.evaluateHandle(() => Map.prototype);
    // Query all map instances into an array
    const mapInstances = await page.queryObjects(mapPrototype);
    // Count amount of map objects in heap
    const count = await page.evaluate(maps => maps.length, mapInstances);
    await mapInstances.dispose();
    await mapPrototype.dispose();

Shortcut for page.mainFrame().executionContext().queryObjects(prototypeHandle).

page.reload(options)

page.screenshot([options])

page.select(selector, ...values)

Triggers a change and input event once all the provided options have been selected.
If there's no <select> element matching selector, the method throws an error.

    page.select('select#colors', 'blue'); // single selection
    page.select('select#colors', 'red', 'green', 'blue'); // multiple selections

Shortcut for page.mainFrame().select().

page.setContent(html)

page.setCookie(...cookies)

page.setExtraHTTPHeaders(headers)

The extra HTTP headers will be sent with every request the page initiates.

NOTE page.setExtraHTTPHeaders does not guarantee the order of headers in the outgoing requests.

page.setJavaScriptEnabled(enabled)

NOTE changing this value won't affect scripts that have already been run. It will take full effect on the next navigation.

page.setOfflineMode(enabled)

page.setRequestInterception(value)

Activating request interception enables request.abort, request.continue and
request.respond methods.

An example of a naïve request interceptor that aborts all image requests:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      await page.setRequestInterception(true);
      page.on('request', interceptedRequest => {
        if (interceptedRequest.url.endsWith('.png') || interceptedRequest.url.endsWith('.jpg'))
          interceptedRequest.abort();
        else
          interceptedRequest.continue();
      });
      await page.goto('https://example.com');
      await browser.close();
    });

NOTE Enabling request interception disables page caching.
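
The endsWith check in the interceptor above misses image URLs that carry a query string, such as logo.png?v=2. One possible refinement, shown here as a sketch only, is to test the path component of the URL. The helper name looksLikeImage and the extension list are made up for this example.

```javascript
// Hypothetical list of extensions to treat as images.
const IMAGE_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif', '.webp'];

// Hypothetical helper: decides whether a request URL points to an image,
// judging only by the file extension of the URL path.
function looksLikeImage(requestUrl) {
  let pathname;
  try {
    pathname = new URL(requestUrl).pathname.toLowerCase();
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  return IMAGE_EXTENSIONS.some(ext => pathname.endsWith(ext));
}

console.log(looksLikeImage('https://example.com/logo.png?v=2')); // true
console.log(looksLikeImage('https://example.com/index.html'));   // false
```

Inside the 'request' handler, the condition would become looksLikeImage(interceptedRequest.url). Matching on extensions is still a heuristic, since servers can return images from URLs without one.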

page.setUserAgent(userAgent)

page.setViewport(viewport)

NOTE in certain cases, setting viewport will reload the page in order to set the isMobile or hasTouch properties.

In the case of multiple pages in a single browser, each page can have its own viewport size.

page.tap(selector)

This method fetches an element with selector, scrolls it into view if needed, and then uses page.touchscreen to tap in the center of the element.
If there's no element matching selector, the method throws an error.

page.title()

Shortcut for page.mainFrame().title().

page.touchscreen

page.tracing

page.type(selector, text[, options])

Sends a keydown, keypress/input, and keyup event for each character in the text.

To press a special key, like Control or ArrowDown, use keyboard.press.

    page.type('#mytextarea', 'Hello'); // Types instantly
    page.type('#mytextarea', 'World', {delay: 100}); // Types slower, like a user

page.url()

This is a shortcut for page.mainFrame().url()

page.viewport()

page.waitFor(selectorOrFunctionOrTimeout[, options[, ...args]])

This method behaves differently with respect to the type of the first parameter:
- if selectorOrFunctionOrTimeout is a string, then the first argument is treated as a selector to wait for and the method is a shortcut for page.waitForSelector
- if selectorOrFunctionOrTimeout is a function, then the first argument is treated as a predicate to wait for and the method is a shortcut for page.waitForFunction().
- if selectorOrFunctionOrTimeout is a number, then the first argument is treated as a timeout in milliseconds and the method returns a promise which resolves after the timeout
- otherwise, an exception is thrown

Shortcut for page.mainFrame().waitFor(selectorOrFunctionOrTimeout[, options[, ...args]]).
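
The type-based dispatch listed above can be sketched as follows. This only classifies the first argument and does not wait for anything; the function name and the returned labels are invented for illustration and are not Puppeteer's implementation.

```javascript
// Sketch of the dispatch rules for the first argument of waitFor.
// The function name and the returned labels are hypothetical.
function classifyWaitForArgument(selectorOrFunctionOrTimeout) {
  if (typeof selectorOrFunctionOrTimeout === 'string')
    return 'selector'; // would be handled like waitForSelector
  if (typeof selectorOrFunctionOrTimeout === 'function')
    return 'function'; // would be handled like waitForFunction
  if (typeof selectorOrFunctionOrTimeout === 'number')
    return 'timeout';  // would resolve after that many milliseconds
  throw new Error('Unsupported target type: ' + typeof selectorOrFunctionOrTimeout);
}

console.log(classifyWaitForArgument('.item'));     // selector
console.log(classifyWaitForArgument(() => true));  // function
console.log(classifyWaitForArgument(500));         // timeout
```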

page.waitForFunction(pageFunction[, options[, ...args]])

The waitForFunction can be used to observe viewport size change:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      const watchDog = page.waitForFunction('window.innerWidth < 100');
      page.setViewport({width: 50, height: 50});
      await watchDog;
      await browser.close();
    });

Shortcut for page.mainFrame().waitForFunction(pageFunction[, options[, ...args]]).

page.waitForNavigation(options)

page.waitForSelector(selector[, options])

Waits for the selector to appear in the page. If the selector already exists when the method is called, the method returns immediately. If the selector doesn't appear within timeout milliseconds, the method throws.

This method works across navigations:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      let currentURL;
      page
        .waitForSelector('img')
        .then(() => console.log('First URL with image: ' + currentURL));
      for (currentURL of ['https://example.com', 'https://google.com', 'https://bbc.com'])
        await page.goto(currentURL);
      await browser.close();
    });

Shortcut for page.mainFrame().waitForSelector(selector[, options]).

class: Keyboard

Keyboard provides an API for managing a virtual keyboard. The high-level API is keyboard.type, which takes raw characters and generates proper keydown, keypress/input, and keyup events on your page.

For finer control, you can use keyboard.down, keyboard.up, and keyboard.sendCharacter to manually fire events as if they were generated from a real keyboard.

An example of holding down Shift in order to select and delete some text:

    await page.keyboard.type('Hello World!');
    await page.keyboard.press('ArrowLeft');

    await page.keyboard.down('Shift');
    for (let i = 0; i < ' World'.length; i++)
      await page.keyboard.press('ArrowLeft');
    await page.keyboard.up('Shift');

    await page.keyboard.press('Backspace');
    // Result text will end up saying 'Hello!'

An example of pressing A:

    await page.keyboard.down('Shift');
    await page.keyboard.press('KeyA');
    await page.keyboard.up('Shift');

NOTE On macOS, keyboard shortcuts like ⌘ A -> Select All do not work. See #1313

keyboard.down(key[, options])

Dispatches a keydown event.

If key is a single character and no modifier keys besides Shift are being held down, a keypress/input event will also be generated. The text option can be specified to force an input event to be generated.

If key is a modifier key, Shift, Meta, Control, or Alt, subsequent key presses will be sent with that modifier active. To release the modifier key, use keyboard.up.

After the key is pressed once, subsequent calls to keyboard.down will have repeat set to true. To release the key, use keyboard.up.

NOTE Modifier keys DO affect keyboard.down. Holding down Shift will type the text in upper case.

keyboard.press(key[, options])

If key is a single character and no modifier keys besides Shift are being held down, a keypress/input event will also be generated. The text option can be specified to force an input event to be generated.

NOTE Modifier keys DO affect keyboard.press. Holding down Shift will type the text in upper case.

Shortcut for keyboard.down and keyboard.up.

keyboard.sendCharacter(char)

Dispatches a keypress and input event. This does not send a keydown or keyup event.

    page.keyboard.sendCharacter('嗨');

NOTE Modifier keys DO NOT affect keyboard.sendCharacter. Holding down Shift will not type the text in upper case.

keyboard.type(text, options)

Sends a keydown, keypress/input, and keyup event for each character in the text.

To press a special key, like Control or ArrowDown, use keyboard.press.

    page.keyboard.type('Hello'); // Types instantly
    page.keyboard.type('World', {delay: 100}); // Types slower, like a user

NOTE Modifier keys DO NOT affect keyboard.type. Holding down Shift will not type the text in upper case.

keyboard.up(key)

Dispatches a keyup event.

class: Mouse

mouse.click(x, y, [options])

Shortcut for mouse.move, mouse.down and mouse.up.
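
Spelled out, the shortcut corresponds roughly to the three calls below. This is a sketch that ignores the options argument; the function name clickManually is hypothetical.

```javascript
// Sketch of what mouse.click(x, y) amounts to according to the description
// above. `mouse` is expected to be page.mouse. Options such as the button
// or click count are deliberately left out of this illustration.
async function clickManually(mouse, x, y) {
  await mouse.move(x, y);
  await mouse.down();
  await mouse.up();
}
```

With a real page this would be used as clickManually(page.mouse, 100, 200).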

mouse.down([options])

Dispatches a mousedown event.

mouse.move(x, y, [options])

Dispatches a mousemove event.

mouse.up([options])

Dispatches a mouseup event.

class: Touchscreen

touchscreen.tap(x, y)

Dispatches a touchstart and touchend event.

class: Tracing

You can use tracing.start and tracing.stop to create a trace file which can be opened in Chrome DevTools or timeline viewer.

    await page.tracing.start({path: 'trace.json'});
    await page.goto('https://www.google.com');
    await page.tracing.stop();

tracing.start(options)

Only one trace can be active at a time per browser.

tracing.stop()

class: Dialog

Dialog objects are dispatched by page via the 'dialog' event.

An example of using Dialog class:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      page.on('dialog', async dialog => {
        console.log(dialog.message());
        await dialog.dismiss();
        await browser.close();
      });
      page.evaluate(() => alert('1'));
    });

dialog.accept([promptText])

dialog.defaultValue()

dialog.dismiss()

dialog.message()

dialog.type

Dialog's type, can be one of alert, beforeunload, confirm or prompt.
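
Since the type distinguishes the kinds of dialog, a handler can branch on it. The sketch below accepts prompts with a fixed answer and dismisses everything else; the handler name and the answer text are arbitrary choices made for this example.

```javascript
// Hypothetical 'dialog' handler: answers prompts, dismisses other dialogs.
// `dialog` is expected to be a Dialog as described in this section, with
// `type` read as a property.
async function handleDialog(dialog) {
  if (dialog.type === 'prompt')
    await dialog.accept('sample answer'); // arbitrary promptText for illustration
  else
    await dialog.dismiss();
}
```

It would be registered with page.on('dialog', handleDialog).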

class: ConsoleMessage

ConsoleMessage objects are dispatched by page via the 'console' event.

consoleMessage.args

consoleMessage.text

consoleMessage.type

One of the following values: 'log', 'debug', 'info', 'error', 'warning', 'dir', 'dirxml', 'table', 'trace', 'clear', 'startGroup', 'startGroupCollapsed', 'endGroup', 'assert', 'profile', 'profileEnd', 'count', 'timeEnd'.
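
As one way to use the type value, the sketch below formats only messages of type 'error' or 'warning' and ignores the rest. The helper name formatIfProblem is hypothetical; it reads type and text as properties, as listed in this section.

```javascript
// Hypothetical helper: returns a printable line for error and warning
// messages, or null for every other console message type.
function formatIfProblem(msg) {
  if (msg.type !== 'error' && msg.type !== 'warning')
    return null;
  return `[${msg.type}] ${msg.text}`;
}

// Plain objects stand in for ConsoleMessage instances here.
console.log(formatIfProblem({type: 'error', text: 'boom'})); // [error] boom
console.log(formatIfProblem({type: 'log', text: 'hello'}));  // null
```

With a real page, the helper could be called from a page.on('console', ...) handler.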

class: Frame

At any point in time, the page exposes its current frame tree via the page.mainFrame() and frame.childFrames() methods.

Frame object's lifecycle is controlled by three events, dispatched on the page object:
- 'frameattached' - fired when the frame gets attached to the page. A Frame can be attached to the page only once.
- 'framenavigated' - fired when the frame commits navigation to a different URL.
- 'framedetached' - fired when the frame gets detached from the page. A Frame can be detached from the page only once.
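
These lifecycle events can be used to keep a running set of attached frames. The following is a minimal sketch; the helper name trackAttachedFrames is hypothetical, and frames that were attached before the helper is called (such as the main frame) are not included.

```javascript
// Hypothetical helper: returns a Set that is kept up to date with the frames
// attached after this call. `page` is expected to emit 'frameattached' and
// 'framedetached' with the frame as the event argument.
function trackAttachedFrames(page) {
  const attached = new Set();
  page.on('frameattached', frame => attached.add(frame));
  page.on('framedetached', frame => attached.delete(frame));
  return attached;
}
```

Reading the size of the returned set at any moment gives the number of frames attached since tracking started and not yet detached.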

An example of dumping frame tree:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      await page.goto('https://www.google.com/chrome/browser/canary.html');
      dumpFrameTree(page.mainFrame(), '');
      await browser.close();

      function dumpFrameTree(frame, indent) {
        console.log(indent + frame.url());
        for (let child of frame.childFrames())
          dumpFrameTree(child, indent + ' ');
      }
    });

frame.$(selector)

The method queries frame for the selector. If there's no such element within the frame, the method will resolve to null.

frame.$$(selector)

The method runs document.querySelectorAll within the frame. If no elements match the selector, the return value resolves to [].

frame.$$eval(selector, pageFunction[, ...args])

This method runs document.querySelectorAll within the frame and passes the result as the first argument to pageFunction.

If pageFunction returns a Promise, then frame.$$eval would wait for the promise to resolve and return its value.

Examples:

    const divsCounts = await frame.$$eval('div', divs => divs.length);

frame.$eval(selector, pageFunction[, ...args])

This method runs document.querySelector within the frame and passes it as the first argument to pageFunction. If there's no element matching selector, the method throws an error.

If pageFunction returns a Promise, then frame.$eval would wait for the promise to resolve and return its value.

Examples:

    const searchValue = await frame.$eval('#search', el => el.value);
    const preloadHref = await frame.$eval('link[rel=preload]', el => el.href);
    const html = await frame.$eval('.main-container', e => e.outerHTML);

frame.addScriptTag(options)

Adds a <script> tag into the page with the desired url or content.

frame.addStyleTag(options)

Adds a <link rel="stylesheet"> tag into the page with the desired url or a <style type="text/css"> tag with the content.

frame.childFrames()

frame.evaluate(pageFunction, ...args)

If the function passed to frame.evaluate returns a Promise, then frame.evaluate waits for the promise to resolve and returns its value.

If the function passed into frame.evaluate returns a non-Serializable value, then frame.evaluate resolves to undefined.

    const result = await frame.evaluate(() => {
      return Promise.resolve(8 * 7);
    });
    console.log(result); // prints "56"

A string can also be passed in instead of a function.

    console.log(await frame.evaluate('1 + 2')); // prints "3"

ElementHandle instances can be passed as arguments to the frame.evaluate:

    const bodyHandle = await frame.$('body');
    const html = await frame.evaluate(body => body.innerHTML, bodyHandle);
    await bodyHandle.dispose();

frame.executionContext()

frame.isDetached()

Returns true if the frame has been detached, or false otherwise.

frame.name()

Returns frame's name attribute as specified in the tag.

If the name is empty, returns the id attribute instead.

NOTE This value is calculated once when the frame is created, and will not update if the attribute is changed later.

frame.parentFrame()

frame.select(selector, ...values)

Triggers a change and input event once all the provided options have been selected.
If there's no <select> element matching selector, the method throws an error.

    frame.select('select#colors', 'blue'); // single selection
    frame.select('select#colors', 'red', 'green', 'blue'); // multiple selections

frame.title()

frame.url()

Returns frame's url.

frame.waitFor(selectorOrFunctionOrTimeout[, options[, ...args]])

This method behaves differently with respect to the type of the first parameter:
- if selectorOrFunctionOrTimeout is a string, then the first argument is treated as a selector to wait for and the method is a shortcut for frame.waitForSelector
- if selectorOrFunctionOrTimeout is a function, then the first argument is treated as a predicate to wait for and the method is a shortcut for frame.waitForFunction().
- if selectorOrFunctionOrTimeout is a number, then the first argument is treated as a timeout in milliseconds and the method returns a promise which resolves after the timeout
- otherwise, an exception is thrown

frame.waitForFunction(pageFunction[, options[, ...args]])

The waitForFunction can be used to observe viewport size change:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      const watchDog = page.mainFrame().waitForFunction('window.innerWidth < 100');
      page.setViewport({width: 50, height: 50});
      await watchDog;
      await browser.close();
    });

frame.waitForSelector(selector[, options])

Waits for the selector to appear in the page. If the selector already exists when the method is called, the method returns immediately. If the selector doesn't appear within timeout milliseconds, the method throws.

This method works across navigations:

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      let currentURL;
      page.mainFrame()
        .waitForSelector('img')
        .then(() => console.log('First URL with image: ' + currentURL));
      for (currentURL of ['https://example.com', 'https://google.com', 'https://bbc.com'])
        await page.goto(currentURL);
      await browser.close();
    });

class: ExecutionContext

The class represents a context for JavaScript execution. Examples of JavaScript contexts are:
- each frame has a separate execution context
- all kinds of workers have their own contexts

executionContext.evaluate(pageFunction, ...args)

If the function passed to executionContext.evaluate returns a Promise, then executionContext.evaluate waits for the promise to resolve and returns its value.

    const executionContext = page.mainFrame().executionContext();
    const result = await executionContext.evaluate(() => Promise.resolve(8 * 7));
    console.log(result); // prints "56"

A string can also be passed in instead of a function.

    console.log(await executionContext.evaluate('1 + 2')); // prints "3"

JSHandle instances can be passed as arguments to executionContext.evaluate:

    const oneHandle = await executionContext.evaluateHandle(() => 1);
    const twoHandle = await executionContext.evaluateHandle(() => 2);
    const result = await executionContext.evaluate((a, b) => a + b, oneHandle, twoHandle);
    await oneHandle.dispose();
    await twoHandle.dispose();
    console.log(result); // prints '3'.

executionContext.evaluateHandle(pageFunction, ...args)

If the function passed to executionContext.evaluateHandle returns a Promise, then executionContext.evaluateHandle waits for the promise to resolve and returns a handle to its value.

    const context = page.mainFrame().executionContext();
    const aHandle = await context.evaluateHandle(() => Promise.resolve(self));
    aHandle; // Handle for the global object.

A string can also be passed in instead of a function.

    const aHandle = await context.evaluateHandle('1 + 2'); // Handle for the '3' object.

JSHandle instances can be passed as arguments to executionContext.evaluateHandle:

    const aHandle = await context.evaluateHandle(() => document.body);
    const resultHandle = await context.evaluateHandle(body => body.innerHTML, aHandle);
    console.log(await resultHandle.jsonValue()); // prints body's innerHTML
    await aHandle.dispose();
    await resultHandle.dispose();

executionContext.queryObjects(prototypeHandle)

The method iterates the JavaScript heap and finds all the objects with the given prototype.

    // Create a Map object
    await page.evaluate(() => window.map = new Map());
    // Get a handle to the Map object prototype
    const mapPrototype = await page.evaluateHandle(() => Map.prototype);
    // Query all map instances into an array
    const mapInstances = await page.queryObjects(mapPrototype);
    // Count the number of map objects in the heap
    const count = await page.evaluate(maps => maps.length, mapInstances);
    await mapInstances.dispose();
    await mapPrototype.dispose();

class: JSHandle

JSHandle represents an in-page JavaScript object. JSHandles can be created with the page.evaluateHandle method.

    const windowHandle = await page.evaluateHandle(() => window);
    // ...

JSHandle prevents the referenced JavaScript object from being garbage collected unless the handle is disposed. JSHandles are auto-disposed when their origin frame gets navigated or the parent context gets destroyed.

JSHandle instances can be used as arguments in page.$eval(), page.evaluate() and page.evaluateHandle methods.

jsHandle.asElement()

Returns the object handle itself if it is an instance of ElementHandle, or null otherwise.

jsHandle.dispose()

The jsHandle.dispose method stops referencing the element handle.

jsHandle.executionContext()

Returns the execution context the handle belongs to.

jsHandle.getProperties()

The method returns a map with property names as keys and JSHandle instances for the property values.

    const handle = await page.evaluateHandle(() => ({window, document}));
    const properties = await handle.getProperties();
    const windowHandle = properties.get('window');
    const documentHandle = properties.get('document');
    await handle.dispose();

jsHandle.getProperty(propertyName)

Fetches a single property from the referenced object.
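
Reading one property as a plain value typically combines getProperty with jsonValue and dispose. A minimal sketch of that pattern follows; `readProperty` and the stand-in handle are hypothetical, and the stand-in exists only so the snippet runs without a browser.

```javascript
// Hypothetical helper: reads one property of an in-page object as a plain value.
async function readProperty(handle, propertyName) {
  const propertyHandle = await handle.getProperty(propertyName);
  try {
    return await propertyHandle.jsonValue();
  } finally {
    await propertyHandle.dispose(); // release the temporary handle
  }
}

// Stand-in for a JSHandle (not a real Puppeteer object).
const fakeLocationHandle = {
  getProperty: async name => ({
    jsonValue: async () => ({href: 'https://example.com/'}[name]),
    dispose: async () => {}
  })
};

const hrefPromise = readProperty(fakeLocationHandle, 'href');
hrefPromise.then(href => console.log(href)); // https://example.com/
```

With a real page, the handle could come from page.evaluateHandle(() => location).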

jsHandle.jsonValue()

Returns a JSON representation of the object. If the object has a toJSON function, it will not be called.

NOTE The method will return an empty JSON if the referenced object is not stringifiable. It will throw an error if the object has circular references.

class: ElementHandle

NOTE Class ElementHandle extends JSHandle.

ElementHandle represents an in-page DOM element. ElementHandles can be created with the page.$ method.

    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
      const page = await browser.newPage();
      await page.goto('https://google.com');
      const inputElement = await page.$('input[type=submit]');
      await inputElement.click();
      // ...
    });

ElementHandle prevents the DOM element from being garbage collected unless the handle is disposed. ElementHandles are auto-disposed when their origin frame gets navigated.

ElementHandle instances can be used as arguments in page.$eval() and page.evaluate() methods.

elementHandle.$(selector)

The method runs element.querySelector within the page. If no element matches the selector, the return value resolves to null.

elementHandle.$$(selector)

The method runs element.querySelectorAll within the page. If no elements match the selector, the return value resolves to [].
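
Since elementHandle.$ resolves to null when nothing matches, callers usually check the result before using it. The sketch below illustrates that check; `clickChildIfPresent` and the stand-in parent handle are hypothetical and only there so the snippet runs without a browser.

```javascript
// Hypothetical helper: clicks the first descendant matching the selector, if any.
async function clickChildIfPresent(parentHandle, selector) {
  const child = await parentHandle.$(selector);
  if (child === null)
    return false; // nothing matched the selector
  await child.click();
  await child.dispose();
  return true;
}

// Stand-in for an ElementHandle (not a real Puppeteer object).
const clickedSelectors = [];
const fakeForm = {
  $: async selector => selector === 'button'
    ? {click: async () => { clickedSelectors.push(selector); }, dispose: async () => {}}
    : null
};

const clickResults = Promise.all([
  clickChildIfPresent(fakeForm, 'button'),
  clickChildIfPresent(fakeForm, 'a.missing')
]);
clickResults.then(results => console.log(results)); // [ true, false ]
```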

elementHandle.asElement()

elementHandle.boundingBox()

This method returns the bounding box of the element (relative to the main frame), or null if the element is not visible.
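
As an illustration of working with the result, the following sketch computes the center point of a bounding box. It assumes the box has x, y, width and height fields (the field names are not listed in this section), and `centerOf` is a hypothetical helper.

```javascript
// Hypothetical helper: center point of a box shaped like the boundingBox result,
// or null when boundingBox resolved to null (element not visible).
function centerOf(box) {
  if (box === null)
    return null;
  return {x: box.x + box.width / 2, y: box.y + box.height / 2};
}

console.log(centerOf({x: 10, y: 20, width: 100, height: 40})); // { x: 60, y: 40 }
console.log(centerOf(null)); // null
```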

elementHandle.click([options])

This method scrolls element into view if needed, and then uses page.mouse to click in the center of the element.
If the element is detached from DOM, the method throws an error.

elementHandle.dispose()

The elementHandle.dispose method stops referencing the element handle.

elementHandle.executionContext()

elementHandle.focus()

Calls focus on the element.

elementHandle.getProperties()

The method returns a map with property names as keys and JSHandle instances for the property values.

    const listHandle = await page.evaluateHandle(() => document.body.children);
    const properties = await listHandle.getProperties();
    const children = [];
    for (const property of properties.values()) {
      const element = property.asElement();
      if (element)
        children.push(element);
    }
    children; // holds elementHandles to all children of document.body

elementHandle.getProperty(propertyName)

Fetches a single property from the objectHandle.

elementHandle.hover()

This method scrolls element into view if needed, and then uses page.mouse to hover over the center of the element.
If the element is detached from DOM, the method throws an error.

elementHandle.jsonValue()

Returns a JSON representation of the object. The JSON is generated by running JSON.stringify on the object in the page and a subsequent JSON.parse in puppeteer.

NOTE The method will throw if the referenced object is not stringifiable.
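
The stringify/parse round trip described above can be reproduced locally in Node to see what survives it. This is only a model of the described behavior, run outside any page; `roundTrip` is a hypothetical helper.

```javascript
// Local model of the round trip: stringify on one side, parse on the other.
function roundTrip(value) {
  return JSON.parse(JSON.stringify(value));
}

// Functions and undefined values do not survive serialization.
const copy = roundTrip({title: 'Example', onClick() {}, missing: undefined});
console.log(copy); // { title: 'Example' }

// Circular structures cannot be stringified at all.
const node = {};
node.self = node;
let circularError = null;
try {
  roundTrip(node);
} catch (error) {
  circularError = error;
}
console.log(circularError instanceof TypeError); // true
```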

elementHandle.press(key[, options])

Focuses the element, and then uses keyboard.down and keyboard.up.

If key is a single character and no modifier keys besides Shift are being held down, a keypress/input event will also be generated. The text option can be specified to force an input event to be generated.

NOTE Modifier keys DO affect elementHandle.press. Holding down Shift will type the text in upper case.

elementHandle.screenshot([options])

This method scrolls element into view if needed, and then uses page.screenshot to take a screenshot of the element.
If the element is detached from DOM, the method throws an error.

elementHandle.tap()

This method scrolls element into view if needed, and then uses touchscreen.tap to tap in the center of the element.
If the element is detached from DOM, the method throws an error.

elementHandle.toString()

elementHandle.type(text[, options])

Focuses the element, and then sends a keydown, keypress/input, and keyup event for each character in the text.

To press a special key, like Control or ArrowDown, use elementHandle.press.

    await elementHandle.type('Hello'); // Types instantly
    await elementHandle.type('World', {delay: 100}); // Types slower, like a user

An example of typing into a text field and then submitting the form:

    const elementHandle = await page.$('input');
    await elementHandle.type('some text');
    await elementHandle.press('Enter');

elementHandle.uploadFile(...filePaths)

This method expects elementHandle to point to an input element.

class: Request

Whenever the page sends a request, the following events are emitted by puppeteer's page:
- 'request' emitted when the request is issued by the page.
- 'response' emitted when/if the response is received for the request.
- 'requestfinished' emitted when the response body is downloaded and the request is complete.

If the request fails at some point, then instead of the 'requestfinished' event (and possibly instead of the 'response' event), the 'requestfailed' event is emitted.

If the request gets a 'redirect' response, the request is successfully finished with the 'requestfinished' event, and a new request is issued to the redirected URL.
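
The event sequences described above can be observed by attaching a listener for each event. The sketch below records them in arrival order; `traceRequests` is a hypothetical helper, and a plain Node EventEmitter with hand-made payloads stands in for the page so the snippet runs without a browser.

```javascript
const {EventEmitter} = require('events');

const REQUEST_EVENTS = ['request', 'response', 'requestfinished', 'requestfailed'];

// Hypothetical helper: records the request lifecycle events in arrival order.
function traceRequests(page) {
  const trace = [];
  for (const eventName of REQUEST_EVENTS)
    page.on(eventName, requestOrResponse => trace.push(eventName + ' ' + requestOrResponse.url));
  return trace;
}

// Stand-in page (not a real Puppeteer object).
const fakePage = new EventEmitter();
const trace = traceRequests(fakePage);

// A request that completes normally.
fakePage.emit('request', {url: 'https://example.com/'});
fakePage.emit('response', {url: 'https://example.com/'});
fakePage.emit('requestfinished', {url: 'https://example.com/'});

// A request that fails before any response arrives.
fakePage.emit('request', {url: 'https://example.com/missing.png'});
fakePage.emit('requestfailed', {url: 'https://example.com/missing.png'});

console.log(trace.join('\n'));
```

The same helper should work when handed a real page, since it only relies on page.on and the url property.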

request.abort([errorCode])

Aborts request. To use this, request interception should be enabled with page.setRequestInterception.
An exception is thrown immediately if request interception is not enabled.

request.continue([overrides])

Continues request with optional request overrides. To use this, request interception should be enabled with page.setRequestInterception.
An exception is thrown immediately if request interception is not enabled.
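
A common use of request.abort and request.continue together is to drop one kind of resource and let everything else through. The handler below is a minimal sketch of that idea; `blockImages` and `fakeRequest` are hypothetical names, and the fake requests only exist so the snippet runs without a browser.

```javascript
// Hypothetical request handler: drops image requests, lets everything else through.
function blockImages(request) {
  if (request.resourceType === 'image')
    request.abort();
  else
    request.continue();
}

// Stand-in requests that remember which method was called on them.
function fakeRequest(resourceType) {
  const fake = {resourceType, outcome: null};
  fake.abort = () => { fake.outcome = 'aborted'; };
  fake.continue = () => { fake.outcome = 'continued'; };
  return fake;
}

const imageRequest = fakeRequest('image');
const scriptRequest = fakeRequest('script');
blockImages(imageRequest);
blockImages(scriptRequest);
console.log(imageRequest.outcome, scriptRequest.outcome); // aborted continued
```

With a real page, interception would first be enabled with page.setRequestInterception(true), and the handler registered with page.on('request', blockImages).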

request.failure()

Returns null unless the request failed, as reported by the requestfailed event.

Example of logging all failed requests:

    page.on('requestfailed', request => {
      console.log(request.url + ' ' + request.failure().errorText);
    });

request.headers

request.method

Contains the request's method (GET, POST, etc.)

request.postData

Contains the request's post body, if any.

request.resourceType

Contains the request's resource type as it was perceived by the rendering engine.
ResourceType will be one of the following: document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other.

request.respond(response)

Fulfills the request with the given response. To use this, request interception should be enabled with page.setRequestInterception. An exception is thrown if request interception is not enabled.

An example of fulfilling all requests with 404 responses:

    await page.setRequestInterception(true);
    page.on('request', request => {
      request.respond({
        status: 404,
        contentType: 'text/plain',
        body: 'Not Found!'
      });
    });

NOTE Mocking responses for dataURL requests is not supported.
Calling request.respond for a dataURL request is a noop.

request.response()

request.url

Contains the URL of the request.

class: Response

Response class represents responses which are received by page.

response.buffer()

response.headers

response.json()

This method will throw if the response body is not parsable via JSON.parse.

response.ok

Contains a boolean stating whether the response was successful (status in the range 200-299) or not.
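
One way to use response.ok is to collect every response outside the 200-299 range. The following sketch does that from the 'response' event; `collectBadResponses` and `stubResponse` are hypothetical, and an EventEmitter stands in for the page so the snippet runs without a browser.

```javascript
const {EventEmitter} = require('events');

// Hypothetical helper: remembers every response whose ok flag is false.
function collectBadResponses(page) {
  const bad = [];
  page.on('response', response => {
    if (!response.ok)
      bad.push(response.status + ' ' + response.url);
  });
  return bad;
}

// Stand-in page and responses; ok follows the documented rule (status 200-299).
const stubPage = new EventEmitter();
const stubResponse = (status, url) => ({status, url, ok: status >= 200 && status <= 299});

const badResponses = collectBadResponses(stubPage);
stubPage.emit('response', stubResponse(200, 'https://example.com/'));
stubPage.emit('response', stubResponse(404, 'https://example.com/missing'));
stubPage.emit('response', stubResponse(301, 'https://example.com/old'));
console.log(badResponses);
```

Note that a redirect status such as 301 also counts as not ok under this rule.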

response.request()

response.status

Contains the status code of the response (e.g., 200 for a success).

response.text()

response.url

Contains the URL of the response.

class: Target

target.page()

If the target is not of type "page", returns null.

target.type()

Identifies what kind of target this is. Can be "page", "service_worker", or "other".
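
Checking target.type() before calling target.page() avoids handling null results. A minimal sketch, in which `pagesOf` and the stand-in targets are hypothetical; with a real browser the list of targets would presumably come from browser.targets(), which is an assumption not covered in this section.

```javascript
// Hypothetical helper: resolves to the pages behind all targets of type "page".
async function pagesOf(targets) {
  const pages = [];
  for (const target of targets) {
    if (target.type() !== 'page')
      continue; // target.page() would return null for these
    pages.push(await target.page());
  }
  return pages;
}

// Stand-in targets; strings take the place of real Page objects.
const fakeTargets = [
  {type: () => 'page', page: async () => 'page A'},
  {type: () => 'service_worker', page: async () => null},
  {type: () => 'other', page: async () => null}
];

const pagesPromise = pagesOf(fakeTargets);
pagesPromise.then(pages => console.log(pages)); // [ 'page A' ]
```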

target.url()
