@songying 2018-07-26T09:43:42.000000Z 3309 words, 1527 views

The requests-html library

Python libraries


    from requests_html import HTMLSession

Basic usage

    session = HTMLSession()
    r = session.get('https://python.org/')

Utility functions

requests_html.user_agent

    requests_html.user_agent(style=None)

Returns a plausible-looking User-Agent string. If no specific style is requested, it defaults to a Chrome-style User-Agent.

Main classes

HTML, Element, HTMLSession

The HTML class

Represents an HTML document, ready to be parsed.

    class requests_html.HTML(*, session: Union[_ForwardRef('HTTPSession'), _ForwardRef('AsyncHTMLSession')] = None, url: str = 'https://example.org/', html: Union[str, bytes], default_encoding: str = 'utf-8')
  • url: The URL from which the HTML originated, used for absolute_links.
  • html: The HTML to parse, as str or bytes.
  • default_encoding: Which encoding to default to.

1. Attributes

2. Methods

find()

Given a CSS selector, returns a list of Element objects, or a single Element when first=True.

    find(selector: str = '*', *, containing: Union[str, typing.List[str]] = None, clean: bool = False, first: bool = False, _encoding: str = None) -> Union[typing.List[_ForwardRef('Element')], _ForwardRef('Element')]
  • selector: CSS selector to use.
  • clean: Whether or not to sanitize the found HTML of <script> and <style> tags.
  • containing: If specified, only return elements that contain the provided text.
  • first: Whether or not to return just the first result.
  • _encoding: The encoding format.

render()

search_all()

xpath()

The Element class

Represents a single HTML element.

    class requests_html.Element(*, element, url: str, default_encoding: str = None)
  • element: The element from which to base the parsing upon.
  • url: The URL from which the HTML originated, used for absolute_links.
  • default_encoding: Which encoding to default to.

1. Attributes

2. Methods

The HTMLSession class

    class requests_html.HTMLSession(mock_browser=True)

Methods

request()

Makes an HTTP request with mocked User-Agent headers. Returns an HTMLResponse.

    request(*args, **kwargs)

send()

    send(request, **kwargs)
  • Return type: requests.Response

get()

    get(url, **kwargs)
  • Sends a GET request. Returns a Response object.
  • Return type: requests.Response
  • url: URL for the new Request object.
  • **kwargs: Optional arguments that request takes.

head()

    head(url, **kwargs)
  • Sends a HEAD request. Returns a Response object.
  • Return type: requests.Response
  • url: URL for the new Request object.
  • **kwargs: Optional arguments that request takes.

post()

    post(url, data=None, json=None, **kwargs)
  • Sends a POST request. Returns a Response object.
  • Return type: requests.Response
  • url: URL for the new Request object.
  • **kwargs: Optional arguments that request takes.

put()

    put(url, data=None, **kwargs)
  • Sends a PUT request. Returns a Response object.
  • Return type: requests.Response
  • url: URL for the new Request object.
  • **kwargs: Optional arguments that request takes.

options()

    options(url, **kwargs)
  • Sends an OPTIONS request. Returns a Response object.
  • Return type: requests.Response
  • url: URL for the new Request object.
  • **kwargs: Optional arguments that request takes.

patch()

    patch(url, data=None, **kwargs)
  • Sends a PATCH request. Returns a Response object.
  • Return type: requests.Response
  • url: URL for the new Request object.
  • **kwargs: Optional arguments that request takes.

delete()

    delete(url, **kwargs)
  • Sends a DELETE request. Returns a Response object.
  • Return type: requests.Response
  • url: URL for the new Request object.
  • **kwargs: Optional arguments that request takes.

close()

Closes the session. If a browser was created, it is closed first.
