@songying
2018-07-26T09:43:42.000000Z
Python libraries
from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://python.org/')
requests_html.user_agent
requests_html.user_agent(style=None)
Returns a plausible-looking User-Agent string unless one of a specific style is requested. Defaults to a Chrome-style User-Agent.
HTML, Element, HTMLSession
Represents an HTML document, ready to be parsed.
class requests_html.HTML(*, session: Union[_ForwardRef('HTTPSession'), _ForwardRef('AsyncHTMLSession')] = None, url: str = 'https://example.org/', html: Union[str, bytes], default_encoding: str = 'utf-8')
- url: The URL from which the HTML originated, used for absolute_links
- html: HTML from which to base the parsing upon (optional).
- default_encoding: Which encoding to default to.
absolute_links: All found links on page, in absolute form
base_url: The base URL for the page.
Given a CSS selector, returns a list of Element objects or a single one.
find(selector: str = '*', *, containing: Union[str, typing.List[str]] = None, clean: bool = False, first: bool = False, _encoding: str = None) → Union[typing.List[_ForwardRef('Element')], _ForwardRef('Element')]
- selector: CSS Selector to use.
- clean: Whether or not to sanitize the found HTML of <script> and <style> tags.
- containing: If specified, only return elements that contain the provided text.
- first: Whether or not to return just the first result.
- _encoding: The encoding format.
Represents an element of an HTML document.
class requests_html.Element(*, element, url: str, default_encoding: str = None)
- element: The element from which to base the parsing upon.
- url: The URL from which the HTML originated, used for absolute_links.
- default_encoding: Which encoding to default to.
class requests_html.HTMLSession(mock_browser=True)
Makes an HTTP request with mocked User-Agent headers. Returns an HTMLResponse.
request(*args, **kwargs)
- Sends a given PreparedRequest.
- Return type: requests.Response
send(request, **kwargs)
- Sends a GET request. Returns a Response object.
- Return type: requests.Response
get(url, **kwargs)
- url: URL for the new Request object.
- **kwargs: Optional arguments that request takes.
- Sends a HEAD request. Returns a Response object.
- Return type: requests.Response
head(url, **kwargs)
- url: URL for the new Request object.
- **kwargs: Optional arguments that request takes.
- Sends a POST request. Returns a Response object.
- Return type: requests.Response
post(url, data=None, json=None, **kwargs)
- url: URL for the new Request object.
- **kwargs: Optional arguments that request takes.
- Sends a PUT request. Returns a Response object.
- Return type: requests.Response
put(url, data=None, **kwargs)
- url: URL for the new Request object.
- **kwargs: Optional arguments that request takes.
- Sends an OPTIONS request. Returns a Response object.
- Return type: requests.Response
options(url, **kwargs)
- url: URL for the new Request object.
- **kwargs: Optional arguments that request takes.
- Sends a PATCH request. Returns a Response object.
- Return type: requests.Response
patch(url, data=None, **kwargs)
- url: URL for the new Request object.
- **kwargs: Optional arguments that request takes.
- Sends a DELETE request. Returns a Response object.
- Return type: requests.Response
delete(url, **kwargs)
- url: URL for the new Request object.
- **kwargs: Optional arguments that request takes.
If a browser was created, close it first.
close()