MetaGPT source code (the PlaywrightWrapper class)

Prerequisite: install Playwright. For a tutorial, see Getting started - Library.
Main commands:

pip install playwright
playwright install

1. PlaywrightWrapper source code

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import annotations

import asyncio
import sys
from pathlib import Path
from typing import Literal, Optional

from playwright.async_api import async_playwright
from pydantic import BaseModel, Field, PrivateAttr

from metagpt.logs import logger
from metagpt.utils.parse_html import WebPage


class PlaywrightWrapper(BaseModel):
    """Wrapper around Playwright.

    To use this module, you should have the `playwright` Python package installed and ensure that
    the required browsers are also installed. You can install playwright by running the command
    `pip install metagpt[playwright]` and download the necessary browser binaries by running the
    command `playwright install` for the first time.
    """

    browser_type: Literal["chromium", "firefox", "webkit"] = "chromium"
    launch_kwargs: dict = Field(default_factory=dict)
    proxy: Optional[str] = None
    context_kwargs: dict = Field(default_factory=dict)
    _has_run_precheck: bool = PrivateAttr(False)

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

        launch_kwargs = self.launch_kwargs
        if self.proxy and "proxy" not in launch_kwargs:
            args = launch_kwargs.get("args", [])
            if not any(str.startswith(i, "--proxy-server=") for i in args):
                launch_kwargs["proxy"] = {"server": self.proxy}

        if "ignore_https_errors" in kwargs:
            self.context_kwargs["ignore_https_errors"] = kwargs["ignore_https_errors"]

    async def run(self, url: str, *urls: str) -> WebPage | list[WebPage]:
        async with async_playwright() as ap:
            browser_type = getattr(ap, self.browser_type)
            await self._run_precheck(browser_type)
            browser = await browser_type.launch(**self.launch_kwargs)
            _scrape = self._scrape

            if urls:
                return await asyncio.gather(_scrape(browser, url), *(_scrape(browser, i) for i in urls))
            return await _scrape(browser, url)

    async def _scrape(self, browser, url):
        context = await browser.new_context(**self.context_kwargs)
        page = await context.new_page()
        async with page:
            try:
                await page.goto(url)
                await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
                html = await page.content()
                inner_text = await page.evaluate("() => document.body.innerText")
            except Exception as e:
                inner_text = f"Fail to load page content for {e}"
                html = ""
            return WebPage(inner_text=inner_text, html=html, url=url)

    async def _run_precheck(self, browser_type):
        if self._has_run_precheck:
            return

        executable_path = Path(browser_type.executable_path)
        if not executable_path.exists() and "executable_path" not in self.launch_kwargs:
            kwargs = {}
            if self.proxy:
                kwargs["env"] = {"ALL_PROXY": self.proxy}
            await _install_browsers(self.browser_type, **kwargs)

            if self._has_run_precheck:
                return

            if not executable_path.exists():
                parts = executable_path.parts
                available_paths = list(Path(*parts[:-3]).glob(f"{self.browser_type}-*"))
                if available_paths:
                    logger.warning(
                        "It seems that your OS is not officially supported by Playwright. "
                        "Try to set executable_path to the fallback build version."
                    )
                    executable_path = available_paths[0].joinpath(*parts[-2:])
                    self.launch_kwargs["executable_path"] = str(executable_path)
        self._has_run_precheck = True


def _get_install_lock():
    global _install_lock
    if _install_lock is None:
        _install_lock = asyncio.Lock()
    return _install_lock


async def _install_browsers(*browsers, **kwargs) -> None:
    async with _get_install_lock():
        browsers = [i for i in browsers if i not in _install_cache]
        if not browsers:
            return
        process = await asyncio.create_subprocess_exec(
            sys.executable,
            "-m",
            "playwright",
            "install",
            *browsers,
            # "--with-deps",
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            **kwargs,
        )

        await asyncio.gather(
            _log_stream(process.stdout, logger.info),
            _log_stream(process.stderr, logger.warning),
        )

        if await process.wait() == 0:
            logger.info("Install browser for playwright successfully.")
        else:
            logger.warning("Fail to install browser for playwright.")
        _install_cache.update(browsers)


async def _log_stream(sr, log_func):
    while True:
        line = await sr.readline()
        if not line:
            return
        log_func(f"[playwright install browser]: {line.decode().strip()}")


_install_lock: asyncio.Lock = None
_install_cache = set()

2. Test

async def webTest():
    websearch = PlaywrightWrapper()
    result = await websearch.run(url="https://playwright.dev/")
    print(result)


if __name__ == "__main__":
    asyncio.run(webTest())

Output:

inner_text='Skip to main content\nPlaywright\nDocs\nAPI\nNode.js\nCommunity\nSearch\nK\nPlaywright enables reliable end-to-end testing for modern web apps.\nGET STARTED\nStar\n67k+\n\n\n\n\n\nAny browser • Any platform • One API\n\nCross-browser. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox.\n\nCross-platform. Test on Windows, Linux, and macOS, locally or on CI, headless or headed.\n\nCross-language. Use the Playwright API in TypeScript, JavaScript, Python, .NET, Java.\n\nTest Mobile Web. Native mobile emulation of Google Chrome for Android and Mobile Safari. The same rendering engine works on your Desktop and in the Cloud.\n\nResilient • No flaky tests\n\nAuto-wait. Playwright waits for elements to be actionable prior to performing actions. It also has a rich set of introspection events. The combination of the two eliminates the need for artificial timeouts - the primary cause of flaky tests.\n\nWeb-first assertions. Playwright assertions are created specifically for the dynamic web. Checks are automatically retried until the necessary conditions are met.\n\nTracing. Configure test retry strategy, capture execution trace, videos, screenshots to eliminate flakes.\n\nNo trade-offs • No limits\n\nBrowsers run web content belonging to different origins in different processes. Playwright is aligned with the modern browsers architecture and runs tests out-of-process. This makes Playwright free of the typical in-process test runner limitations.\n\nMultiple everything. Test scenarios that span multiple tabs, multiple origins and multiple users. Create scenarios with different contexts for different users and run them against your server, all in one test.\n\nTrusted events. Hover elements, interact with dynamic controls, produce trusted events. Playwright uses real browser input pipeline indistinguishable from the real user.\n\nTest frames, pierce Shadow DOM. 
Playwright selectors pierce shadow DOM and allow entering frames seamlessly.\n\nFull isolation • Fast execution\n\nBrowser contexts. Playwright creates a browser context for each test. Browser context is equivalent to a brand new browser profile. This delivers full test isolation with zero overhead. Creating a new browser context only takes a handful of milliseconds.\n\nLog in once. Save the authentication state of the context and reuse it in all the tests. This bypasses repetitive log-in operations in each test, yet delivers full isolation of independent tests.\n\nPowerful Tooling\n\nCodegen. Generate tests by recording your actions. Save them into any language.\n\nPlaywright inspector. Inspect page, generate selectors, step through the test execution, see click points, explore execution logs.\n\nTrace Viewer. Capture all the information to investigate the test failure. Playwright trace contains test execution screencast, live DOM snapshots, action explorer, test source, and many more.\n\nChosen by companies and open source projects\nLearn\nGetting started\nPlaywright Training\nLearn Videos\nFeature Videos\nCommunity\nStack Overflow\nDiscord\nTwitter\nLinkedIn\nMore\nGitHub\nYouTube\nBlog\nAmbassadors\nCopyright © 2024 Microsoft' html='<!DOCTYPE html><html lang="en" dir="ltr" class="plugin-pages plugin-id-default" data-has-hydrated="true" data-theme="light" data-rh="lang,dir,class,data-has-hydrated"><head><meta charset="UTF-8"><meta name="generator" content="Docusaurus v3.6.3"><title>Fast and reliable end-to-end testing for modern web apps | Playwright</title><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://repository-images.githubusercontent.com/221981891/8c5c6942-c91f-4df1-825f-4cf474056bd7"><meta data-rh="true" name="twitter:image" content="https://repository-images.githubusercontent.com/221981891/8c5c6942-c91f-4df1-825f-4cf474056bd7"><meta data-rh="true" property="og:url" 
content="https://playwright.dev/"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docusaurus_tag" content="default"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docsearch:docusaurus_tag" content="default"><meta data-rh="true" property="og:title" content="Fast and reliable end-to-end testing for modern web apps | Playwright"><meta data-rh="true" name="description" content="Cross-browser end-to-end testing for modern web apps"><meta data-rh="true" property="og:description" content="Cross-browser end-to-end testing for modern web apps"><link data-rh="true" rel="icon" href="/img/playwright-logo.svg"><link data-rh="true" rel="canonical" href="https://playwright.dev/"><link data-rh="true" rel="alternate" href="https://playwright.dev/" hreflang="en"><link data-rh="true" rel="alternate" href="https://playwright.dev/" hreflang="x-default"><link rel="search" type="application/opensearchdescription+xml" title="Playwright" href="/opensearch.xml"><script src="/js/redirection.js"></script><link rel="stylesheet" href="/assets/css/styles.f6afdb5c.css"><script src="/assets/js/runtime~main.eab3411e.js" defer=""></script><script src="/assets/js/main.89c99183.js" defer=""></script><meta name="viewport" content="width=device-width, initial-scale=1.0" data-rh="true"><link rel="preconnect" href="https://K09ICMCV6X-dsn.algolia.net" crossorigin="anonymous" data-rh="true"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/aba21aa0.c688fdbe.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/0058b4c6.14b9012b.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/aba21aa0.c688fdbe.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link 
rel="prefetch" href="/assets/js/0058b4c6.14b9012b.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/aba21aa0.c688fdbe.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/0058b4c6.14b9012b.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/17896441.df3d9d27.js"><link rel="prefetch" href="/assets/js/4cf51b27.1271ec49.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/aba21aa0.c688fdbe.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/0058b4c6.14b9012b.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/aba21aa0.c688fdbe.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/0058b4c6.14b9012b.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/aba21aa0.c688fdbe.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/0058b4c6.14b9012b.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/17896441.df3d9d27.js"><link rel="prefetch" href="/assets/js/90f396e5.be5740f0.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/e0719818.d9721f6c.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/d2436a2b.e2223d7d.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/e0719818.d9721f6c.js"><link rel="prefetch" 
href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/d2436a2b.e2223d7d.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/5e95c892.d02a82d0.js"><link rel="prefetch" href="/assets/js/e0719818.d9721f6c.js"><link rel="prefetch" href="/assets/js/a7bd4aaa.beab1732.js"><link rel="prefetch" href="/assets/js/d2436a2b.e2223d7d.js"><link rel="prefetch" href="/assets/js/a94703ab.16a84712.js"><link rel="prefetch" href="/assets/js/17896441.df3d9d27.js"><link rel="prefetch" href="/assets/js/083f60f3.579058f4.js"><link rel="prefetch" href="/assets/js/1df93b7f.f328e6f7.js"><link rel="prefetch" href="/assets/js/a7456010.b01acea0.js"></head><body class="navigation-with-keyboard" data-rh="class" style="overflow: visible;"><script>!function(){var t,e=function(){try{return new URLSearchParams(window.location.search).get("docusaurus-theme")}catch(t){}}()||function(){try{return window.localStorage.getItem("theme")}catch(t){}}();t=null!==e?e:window.matchMedia("(prefers-color-scheme: dark)").matches?"dark":window.matchMedia("(prefers-color-scheme: light)").matches?"light":"dark",document.documentElement.setAttribute("data-theme",t)}(),function(){try{for(var[t,e]of new URLSearchParams(window.location.search).entries())if(t.startsWith("docusaurus-data-")){var a=t.replace("docusaurus-data-","data-");document.documentElement.setAttribute(a,e)}}catch(t){}}()</script><div id="__docusaurus"><div role="region" aria-label="Skip to main content"><a class="skipToContent_fXgn" href="#__docusaurus_skipToContent_fallback">Skip to main content</a></div><nav aria-label="Main" class="navbar navbar--fixed-top"><div class="navbar__inner"><div class="navbar__items"><button aria-label="Toggle navigation bar" aria-expanded="false" class="navbar__toggle clean-btn" type="button"><svg width="30" height="30" viewBox="0 0 30 30" aria-hidden="true"><path stroke="currentColor" stroke-linecap="round" stroke-miterlimit="10" stroke-width="2" d="M4 
7h22M4 15h22M4 23h22"></path></svg></button><a class="navbar__brand" href="/"><div class="navbar__logo"><img src="/img/playwright-logo.svg" alt="Playwright logo" class="themedComponent_mlkZ themedComponent--light_NVdE"></div><b class="navbar__title text--truncate">Playwright</b></a><a class="navbar__item navbar__link" href="/docs/intro">Docs</a><a class="navbar__item navbar__link" href="/docs/api/class-playwright">API</a><div class="navbar__item dropdown dropdown--hoverable"><a href="#" aria-haspopup="true" aria-expanded="false" role="button" class="navbar__link">Node.js</a><ul class="dropdown__menu"><li><a href="/" rel="noopener noreferrer" class="dropdown__link undefined dropdown__link--active" data-language-prefix="/">Node.js</a></li><li><a href="/python/" rel="noopener noreferrer" class="dropdown__link" data-language-prefix="/python/">Python</a></li><li><a href="/java/" rel="noopener noreferrer" class="dropdown__link" data-language-prefix="/java/">Java</a></li><li><a href="/dotnet/" rel="noopener noreferrer" class="dropdown__link" data-language-prefix="/dotnet/">.NET</a></li></ul></div><a class="navbar__item navbar__link" href="/community/welcome">Community</a></div><div class="navbar__items navbar__items--right"><a href="https://github.com/microsoft/playwright" target="_blank" rel="noopener noreferrer" class="navbar__item navbar__link header-github-link" aria-label="GitHub repository"></a><a href="https://aka.ms/playwright/discord" target="_blank" rel="noopener noreferrer" class="navbar__item navbar__link header-discord-link" aria-label="Discord server"></a><div class="toggle_vylO colorModeToggle_DEke"><button class="clean-btn toggleButton_gllP" type="button" title="Switch between dark and light mode (currently dark mode)" aria-label="Switch between dark and light mode (currently dark mode)" aria-live="polite" aria-pressed="true"><svg viewBox="0 0 24 24" width="24" height="24" class="lightToggleIcon_pyhR"><path fill="currentColor" 
d="M12,9c1.65,0,3,1.35,3,3s-1.35,3-3,3s-3-1.35-3-3S10.35,9,12,9 M12,7c-2.76,0-5,2.24-5,5s2.24,5,5,5s5-2.24,5-5 S14.76,7,12,7L12,7z M2,13l2,0c0.55,0,1-0.45,1-1s-0.45-1-1-1l-2,0c-0.55,0-1,0.45-1,1S1.45,13,2,13z M20,13l2,0c0.55,0,1-0.45,1-1 s-0.45-1-1-1l-2,0c-0.55,0-1,0.45-1,1S19.45,13,20,13z M11,2v2c0,0.55,0.45,1,1,1s1-0.45,1-1V2c0-0.55-0.45-1-1-1S11,1.45,11,2z M11,20v2c0,0.55,0.45,1,1,1s1-0.45,1-1v-2c0-0.55-0.45-1-1-1C11.45,19,11,19.45,11,20z M5.99,4.58c-0.39-0.39-1.03-0.39-1.41,0 c-0.39,0.39-0.39,1.03,0,1.41l1.06,1.06c0.39,0.39,1.03,0.39,1.41,0s0.39-1.03,0-1.41L5.99,4.58z M18.36,16.95 c-0.39-0.39-1.03-0.39-1.41,0c-0.39,0.39-0.39,1.03,0,1.41l1.06,1.06c0.39,0.39,1.03,0.39,1.41,0c0.39-0.39,0.39-1.03,0-1.41 L18.36,16.95z M19.42,5.99c0.39-0.39,0.39-1.03,0-1.41c-0.39-0.39-1.03-0.39-1.41,0l-1.06,1.06c-0.39,0.39-0.39,1.03,0,1.41 s1.03,0.39,1.41,0L19.42,5.99z M7.05,18.36c0.39-0.39,0.39-1.03,0-1.41c-0.39-0.39-1.03-0.39-1.41,0l-1.06,1.06 c-0.39,0.39-0.39,1.03,0,1.41s1.03,0.39,1.41,0L7.05,18.36z"></path></svg><svg viewBox="0 0 24 24" width="24" height="24" class="darkToggleIcon_wfgR"><path fill="currentColor" d="M9.37,5.51C9.19,6.15,9.1,6.82,9.1,7.5c0,4.08,3.32,7.4,7.4,7.4c0.68,0,1.35-0.09,1.99-0.27C17.45,17.19,14.93,19,12,19 c-3.86,0-7-3.14-7-7C5,9.07,6.81,6.55,9.37,5.51z M12,3c-4.97,0-9,4.03-9,9s4.03,9,9,9s9-4.03,9-9c0-0.46-0.04-0.92-0.1-1.36 c-0.98,1.37-2.58,2.26-4.4,2.26c-2.98,0-5.4-2.42-5.4-5.4c0-1.81,0.89-3.42,2.26-4.4C12.92,3.04,12.46,3,12,3L12,3z"></path></svg></button></div><div class="navbarSearchContainer_Bca1"><button type="button" class="DocSearch DocSearch-Button" aria-label="Search (Ctrl+K)"><span class="DocSearch-Button-Container"><svg width="20" height="20" class="DocSearch-Search-Icon" viewBox="0 0 20 20" aria-hidden="true"><path d="M14.386 14.386l4.0877 4.0877-4.0877-4.0877c-2.9418 2.9419-7.7115 2.9419-10.6533 0-2.9419-2.9418-2.9419-7.7115 0-10.6533 2.9418-2.9419 7.7115-2.9419 10.6533 0 2.9419 2.9418 2.9419 7.7115 0 10.6533z" stroke="currentColor" 
fill="none" fill-rule="evenodd" stroke-linecap="round" stroke-linejoin="round"></path></svg><span class="DocSearch-Button-Placeholder">Search</span></span><span class="DocSearch-Button-Keys"><kbd class="DocSearch-Button-Key"><svg width="15" height="15" class="DocSearch-Control-Key-Icon"><path d="M4.505 4.496h2M5.505 5.496v5M8.216 4.496l.055 5.993M10 7.5c.333.333.5.667.5 1v2M12.326 4.5v5.996M8.384 4.496c1.674 0 2.116 0 2.116 1.5s-.442 1.5-2.116 1.5M3.205 9.303c-.09.448-.277 1.21-1.241 1.203C1 10.5.5 9.513.5 8V7c0-1.57.5-2.5 1.464-2.494.964.006 1.134.598 1.24 1.342M12.553 10.5h1.953" stroke-width="1.2" stroke="currentColor" fill="none" stroke-linecap="square"></path></svg></kbd><kbd class="DocSearch-Button-Key">K</kbd></span></button></div></div></div><div role="presentation" class="navbar-sidebar__backdrop"></div></nav><div id="__docusaurus_skipToContent_fallback" class="main-wrapper mainWrapper_z2l0"><header class="hero hero--primary heroBanner_UJJx"><div class="container"><h1 class="hero__title heroTitle_ohkl"><span class="highlight_gXVj">Playwright</span> enables reliable end-to-end testing for modern web apps.</h1><div class="buttons_pzbO"><a class="getStarted_Sjon" href="/docs/intro">Get started</a><span class="github-btn github-stargazers github-btn-large"><a class="gh-btn" href="https://github.com/microsoft/playwright" rel="noopener noreferrer" target="_blank" aria-label="Star microsoft/playwright on GitHub"><span class="gh-ico" aria-hidden="true"></span><span class="gh-text">Star</span></a><a class="gh-count" href="https://github.com/microsoft/playwright/stargazers" rel="noopener noreferrer" target="_blank" aria-label="67k+ stargazers on GitHub" style="display:block">67k+</a></span></div></div></header><br><main><br><br><div style="text-align:center"><img src="img/logos/Browsers.png" width="40%" alt="Browsers (Chromium, Firefox, WebKit)"></div><section class="features_keug"><div class="container"><div class="row"><div class="col col--6" 
style="margin-top:40px"><h3>Any browser • Any platform • One API</h3><div><p><b>Cross-browser.</b> Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox.</p><p><b>Cross-platform.</b> Test on Windows, Linux, and macOS, locally or on CI, headless or headed.</p><p><b>Cross-language.</b> Use the Playwright API in <a href="https://playwright.dev/docs/intro">TypeScript</a>, <a href="https://playwright.dev/docs/intro">JavaScript</a>, <a href="https://playwright.dev/python/docs/intro">Python</a>, <a href="https://playwright.dev/dotnet/docs/intro">.NET</a>, <a href="https://playwright.dev/java/docs/intro">Java</a>.</p><p><b>Test Mobile Web.</b> Native mobile emulation of Google Chrome for Android and Mobile Safari. The same rendering engine works on your Desktop and in the Cloud.</p></div></div><div class="col col--6" style="margin-top:40px"><h3></h3><div></div></div><div class="col col--6" style="margin-top:40px"><h3></h3><div></div></div><div class="col col--6" style="margin-top:40px"><h3>Resilient • No flaky tests</h3><div><p><b>Auto-wait.</b> Playwright waits for elements to be actionable prior to performing actions. It also has a rich set of introspection events. The combination of the two eliminates the need for artificial timeouts - the primary cause of flaky tests.</p><p><b>Web-first assertions.</b> Playwright assertions are created specifically for the dynamic web. Checks are automatically retried until the necessary conditions are met.</p><p><b>Tracing.</b> Configure test retry strategy, capture execution trace, videos, screenshots to eliminate flakes.</p></div></div><div class="col col--6" style="margin-top:40px"><h3>No trade-offs • No limits</h3><div><p>Browsers run web content belonging to different origins in different processes. Playwright is aligned with the modern browsers architecture and runs tests out-of-process. 
This makes Playwright free of the typical in-process test runner limitations.</p><p><b>Multiple everything.</b> Test scenarios that span multiple <b>tabs</b>, multiple <b>origins</b> and multiple <b>users</b>. Create scenarios with different contexts for different users and run them against your server, all in one test.</p><p><b>Trusted events.</b> Hover elements, interact with dynamic controls, produce trusted events. Playwright uses real browser input pipeline indistinguishable from the real user.</p><p><b>Test frames, pierce Shadow DOM.</b> Playwright selectors pierce shadow DOM and allow entering frames seamlessly.</p></div></div><div class="col col--6" style="margin-top:40px"><h3></h3><div></div></div><div class="col col--6" style="margin-top:40px"><h3></h3><div></div></div><div class="col col--6" style="margin-top:40px"><h3>Full isolation • Fast execution</h3><div><p><b>Browser contexts.</b> Playwright creates a browser context for each test. Browser context is equivalent to a brand new browser profile. This delivers full test isolation with zero overhead. Creating a new browser context only takes a handful of milliseconds.</p><p><b>Log in once.</b> Save the authentication state of the context and reuse it in all the tests. This bypasses repetitive log-in operations in each test, yet delivers full isolation of independent tests.</p></div></div><div class="col col--6" style="margin-top:40px"><h3>Powerful Tooling</h3><div><p><b><a href="docs/codegen">Codegen.</a></b> Generate tests by recording your actions. Save them into any language.</p><p><b><a href="docs/debug#playwright-inspector">Playwright inspector.</a></b> Inspect page, generate selectors, step through the test execution, see click points, explore execution logs.</p><p><b><a href="docs/trace-viewer-intro">Trace Viewer.</a></b> Capture all the information to investigate the test failure. 
Playwright trace contains test execution screencast, live DOM snapshots, action explorer, test source, and many more.</p></div></div></div></div></section><section class="logosSection_gMWS"><div class="container"><div class="row"><div class="col col--12 logosColumn_GJVT"><h2>Chosen by companies and open source projects</h2><ul class="logosList_zAAF"><li><a href="https://code.visualstudio.com" target="_blank" rel="noreferrer noopener"><img src="img/logos/VSCode.png" alt="VS Code"></a></li><li><a href="https://bing.com" target="_blank" rel="noreferrer noopener"><img src="img/logos/Bing.png" alt="Bing"></a></li><li><a href="https://outlook.com" target="_blank" rel="noreferrer noopener"><img src="img/logos/Outlook.png" alt="Outlook"></a></li><li><a href="https://www.hotstar.com/" target="_blank" rel="noreferrer noopener"><img src="img/logos/DHotstar.jpg" alt="Disney+ Hotstar"></a></li><li><a href="https://github.com/mui-org/material-ui" target="_blank" rel="noreferrer noopener"><img src="img/logos/MUI.png" alt="Material UI"></a></li><li><a href="https://github.com/ing-bank/lion" target="_blank" rel="noreferrer noopener"><img src="img/logos/ING.png" alt="ING"></a></li><li><a href="https://github.com/adobe/spectrum-web-components" target="_blank" rel="noreferrer noopener"><img src="img/logos/Adobe2.png" alt="Adobe"></a></li><li><a href="https://github.com/react-navigation/react-navigation" target="_blank" rel="noreferrer noopener"><img src="img/logos/ReactNavigation.png" alt="React Navigation"></a></li><li><a href="https://accessibilityinsights.io/" target="_blank" rel="noreferrer noopener"><img src="img/logos/accessibilityinsights.png" alt="Accessibility Insights"></a></li></ul></div></div></div></section></main></div><footer class="footer footer--dark"><div class="container container-fluid"><div class="row footer__links"><div class="col footer__col"><div class="footer__title">Learn</div><ul class="footer__items clean-list"><li class="footer__item"><a 
class="footer__link-item" href="/docs/intro">Getting started</a></li><li class="footer__item"><a href="https://learn.microsoft.com/en-us/training/modules/build-with-playwright/" target="_blank" rel="noopener noreferrer" class="footer__link-item">Playwright Training<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li class="footer__item"><a class="footer__link-item" href="/community/learn-videos">Learn Videos</a></li><li class="footer__item"><a class="footer__link-item" href="/community/feature-videos">Feature Videos</a></li></ul></div><div class="col footer__col"><div class="footer__title">Community</div><ul class="footer__items clean-list"><li class="footer__item"><a href="https://stackoverflow.com/questions/tagged/playwright" target="_blank" rel="noopener noreferrer" class="footer__link-item">Stack Overflow<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li class="footer__item"><a href="https://aka.ms/playwright/discord" target="_blank" rel="noopener noreferrer" class="footer__link-item">Discord<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li class="footer__item"><a href="https://twitter.com/playwrightweb" target="_blank" rel="noopener noreferrer" class="footer__link-item">Twitter<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 
13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li class="footer__item"><a href="https://www.linkedin.com/company/playwrightweb" target="_blank" rel="noopener noreferrer" class="footer__link-item">LinkedIn<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li></ul></div><div class="col footer__col"><div class="footer__title">More</div><ul class="footer__items clean-list"><li class="footer__item"><a href="https://github.com/microsoft/playwright" target="_blank" rel="noopener noreferrer" class="footer__link-item">GitHub<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li class="footer__item"><a href="https://www.youtube.com/channel/UC46Zj8pDH5tDosqm1gd7WTg" target="_blank" rel="noopener noreferrer" class="footer__link-item">YouTube<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li class="footer__item"><a href="https://dev.to/playwright" target="_blank" rel="noopener noreferrer" class="footer__link-item">Blog<svg width="13.5" height="13.5" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li class="footer__item"><a class="footer__link-item" 
href="/community/ambassadors">Ambassadors</a></li></ul></div></div><div class="footer__bottom text--center"><div class="footer__copyright">Copyright © 2024 Microsoft</div></div></div></footer></div></body></html>' url='https://playwright.dev/'

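3. Code explanation

This code is an asynchronous Python utility built on Playwright that drives a web browser to load pages and scrape their content. It relies on several Python packages, including asyncio, pydantic, and playwright.async_api, to provide flexible browser control. The main parts are explained section by section below:

  1. Module imports
from __future__ import annotations

Enables postponed evaluation of annotations (PEP 563): type hints are stored as strings instead of being evaluated when the function is defined. This allows forward references and lets the union syntax (|) appear in annotations such as WebPage | list[WebPage], even on Python versions older than 3.10.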
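As a small illustration of postponed evaluation, the sketch below compiles a function whose return annotation mentions a name (WebPage) that is never defined in the demo. The source string and the namespace dictionary are assumptions made for this example and are not part of MetaGPT.

```python
# Demo only: the function body and the undefined name WebPage are placeholders.
source = '''
from __future__ import annotations

def run(url: str, *urls: str) -> WebPage | list[WebPage]:
    ...
'''

namespace = {}
exec(source, namespace)  # defining run() does not evaluate the annotation

# The annotation is stored as a plain string instead of an evaluated type.
return_annotation = namespace["run"].__annotations__["return"]
print(repr(return_annotation))
```

Because the annotation is never evaluated at definition time, the missing WebPage name causes no error here.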

import asyncio
import sys
from pathlib import Path
from typing import Literal, Optional

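asyncio: supports asynchronous programming.
sys: provides access to the running interpreter (used here for sys.executable).
Path: file-system path handling.
Literal, Optional: type-annotation helpers.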

from playwright.async_api import async_playwright
from pydantic import BaseModel, Field, PrivateAttr

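async_playwright: the entry point of Playwright's asynchronous API, used to drive the browser.
pydantic: provides data validation and model support (BaseModel, Field, PrivateAttr).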

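  2. Core class PlaywrightWrapper
    Overview
    Wraps Playwright to launch a browser, open browser contexts, and scrape page content, including dynamically rendered content.

Attributes

browser_type: Literal["chromium", "firefox", "webkit"] = "chromium"

The browser to use (chromium by default).

launch_kwargs: dict = Field(default_factory=dict)

Browser launch arguments (for example, headless mode).

proxy: Optional[str] = None

Optional proxy server address.

context_kwargs: dict = Field(default_factory=dict)

Browser context arguments (for example, ignoring HTTPS errors).

_has_run_precheck: bool = PrivateAttr(False)

Private attribute that records whether the precheck has already run.

Methods

def __init__(self, **kwargs):...

Initializes the PlaywrightWrapper object.
If proxy is set and launch_kwargs contains neither a "proxy" entry nor a --proxy-server= argument in args, sets launch_kwargs["proxy"] to {"server": proxy}. If ignore_https_errors is passed, copies it into context_kwargs.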
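The proxy handling in __init__ can be sketched on a plain dictionary, without Playwright or pydantic. The helper name apply_proxy and the proxy addresses are hypothetical and only serve the illustration.

```python
from typing import Optional


def apply_proxy(launch_kwargs: dict, proxy: Optional[str]) -> dict:
    """Mirror the proxy handling in __init__ on a plain dict (hypothetical helper)."""
    if proxy and "proxy" not in launch_kwargs:
        args = launch_kwargs.get("args", [])
        # Respect a proxy that was already passed as a browser command-line flag.
        if not any(str(arg).startswith("--proxy-server=") for arg in args):
            launch_kwargs["proxy"] = {"server": proxy}
    return launch_kwargs


print(apply_proxy({}, "http://127.0.0.1:8080"))
print(apply_proxy({"args": ["--proxy-server=http://10.0.0.1:3128"]}, "http://127.0.0.1:8080"))
```

In the second call the dictionary is returned unchanged, because an explicit --proxy-server= flag takes precedence over the proxy attribute.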

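  3. The run method
async def run(self, url: str, *urls: str) -> WebPage | list[WebPage]:

Accepts one or more URLs. A single URL returns one WebPage object; multiple URLs return a list of WebPage objects.
Uses asyncio.gather to scrape multiple URLs concurrently in the same browser instance.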
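The return shape of run (one result for a single URL, a list for several) can be reproduced with a stub coroutine. In this sketch fake_scrape stands in for _scrape and the example.com style URLs are placeholders; no browser is involved.

```python
import asyncio


async def fake_scrape(url: str) -> str:
    """Stand-in for _scrape: waits briefly instead of loading a real page."""
    await asyncio.sleep(0.01)
    return f"content of {url}"


async def run(url: str, *urls: str):
    """Return one result for a single URL, or a list when several are given."""
    if urls:
        # gather runs the coroutines concurrently and keeps the argument order.
        return await asyncio.gather(fake_scrape(url), *(fake_scrape(u) for u in urls))
    return await fake_scrape(url)


single = asyncio.run(run("https://a.example"))
several = asyncio.run(run("https://a.example", "https://b.example"))
print(single)
print(several)
```

Callers therefore need to handle both shapes, or always pass the result through a check such as isinstance(result, list).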

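  4. The _scrape method
async def _scrape(self, browser, url):

The core scraping logic: opens the page in a new browser context and extracts its content, both the HTML and the visible page text. If anything fails, the exception is caught and the returned WebPage carries an error message in inner_text and an empty html.

Key steps:

await page.goto(url): navigates to the URL.
await page.evaluate("window.scrollTo(0, document.body.scrollHeight)"): simulates scrolling to the bottom of the page.
await page.content(): gets the HTML.
await page.evaluate("() => document.body.innerText"): extracts the text content.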
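The error handling in _scrape turns any exception into a normal return value. A minimal sketch of that pattern follows; PageResult is a hypothetical stand-in for MetaGPT's WebPage, and load_ok and load_broken are made-up loaders that replace the Playwright calls.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class PageResult:
    """Hypothetical stand-in for MetaGPT's WebPage."""
    inner_text: str
    html: str
    url: str


async def load_ok(url: str):
    return "<html><body>hello</body></html>", "hello"


async def load_broken(url: str):
    raise TimeoutError("navigation timed out")


async def scrape(loader, url: str) -> PageResult:
    """Call loader(url); on any exception, return a result that describes the failure."""
    try:
        html, inner_text = await loader(url)
    except Exception as e:
        inner_text = f"Fail to load page content for {e}"
        html = ""
    return PageResult(inner_text=inner_text, html=html, url=url)


ok = asyncio.run(scrape(load_ok, "https://a.example"))
failed = asyncio.run(scrape(load_broken, "https://b.example"))
print(ok)
print(failed)
```

One consequence of this design is that a failed load never raises: a caller has to inspect the result, for example by checking for an empty html, to tell success from failure.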
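  5. The _run_precheck method
async def _run_precheck(self, browser_type):

Checks whether the browser binary that Playwright needs is installed.
If it is missing and no executable_path was supplied in launch_kwargs, calls _install_browsers to download it. If the expected executable still does not exist afterwards, looks for another installed build of the same browser and uses it as a fallback executable_path.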
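The fallback lookup can be tried out against a temporary directory. The layout used here (a chromium-<build> folder containing chrome-linux/chrome) and the build numbers are assumptions for the demo, and find_fallback is a hypothetical helper. It also sorts the candidates for a deterministic result, whereas the original takes the first glob match.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_fallback(expected: Path, browser: str) -> Optional[Path]:
    """Look for another installed build of the browser next to the expected one."""
    parts = expected.parts
    # Drop "<browser>-<build>/<platform dir>/<binary>" to reach the browsers root.
    root = Path(*parts[:-3])
    candidates = sorted(root.glob(f"{browser}-*"))
    if not candidates:
        return None
    # Reuse the last two components (platform dir and binary) under that build.
    return candidates[0].joinpath(*parts[-2:])


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Pretend build 1000 is installed while the expected build is 1091.
    installed = root / "chromium-1000" / "chrome-linux" / "chrome"
    installed.parent.mkdir(parents=True)
    installed.touch()
    expected = root / "chromium-1091" / "chrome-linux" / "chrome"

    fallback = find_fallback(expected, "chromium")
    fallback_matches = fallback == installed
    print(fallback_matches)
```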

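  6. Browser installation helpers
_install_browsers

Installs the specified browsers asynchronously.
Uses asyncio.create_subprocess_exec to run python -m playwright install.

_log_stream

Forwards the log output produced during installation to the logger line by line, as it arrives.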
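The subprocess and log-forwarding pattern can be exercised without installing anything. In this sketch a one-line Python child process stands in for the playwright install command, and the names log_stream, run_child, and captured are chosen for the example.

```python
import asyncio
import sys


async def log_stream(stream: asyncio.StreamReader, log_func) -> None:
    """Forward each line of the stream to log_func until end of file."""
    while True:
        line = await stream.readline()
        if not line:
            return
        log_func(f"[child]: {line.decode().strip()}")


async def run_child(lines: list) -> int:
    # A tiny Python child process stands in for the real install command.
    process = await asyncio.create_subprocess_exec(
        sys.executable,
        "-c",
        "import sys; print('downloading'); print('warning', file=sys.stderr)",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # Read both pipes concurrently so neither can fill up and block the child.
    await asyncio.gather(
        log_stream(process.stdout, lines.append),
        log_stream(process.stderr, lines.append),
    )
    return await process.wait()


captured = []
exit_code = asyncio.run(run_child(captured))
print(exit_code, sorted(captured))
```

Passing the logging function as a parameter is what lets the original route stdout to logger.info and stderr to logger.warning with the same helper.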

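  7. Global variables
_install_lock: asyncio.Lock = None

Prevents concurrent installations. The lock is created lazily by _get_install_lock on first use.

_install_cache = set()

Records the browsers for which an installation has already been attempted, to avoid repeated installs.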

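Use cases

Web scraping: extracting content from dynamic web pages.
Proxy support: accessing pages through a proxy server.
Automatic browser installation: downloading and configuring browsers on demand.
Efficient concurrency: scraping multiple URLs concurrently with asyncio.

References: https://github.com/geekan/MetaGPT
https://playwright.dev/python/docs/library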
