
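Playwright, Microsoft's open-source Python automation powerhouse: no need to write a single line of code! It generates the code for you, and does it remarkably smoothly. Mom no longer has to worry that I can't code!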

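Installation

# Install the playwright library
pip install playwright

# Install the browser binaries
python -m playwright install

# Equivalent form of the same command (either one is enough)
playwright install

Requirement: Python 3.7 or later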
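Before recording anything, it can help to confirm that the environment meets the requirement above. The following is a minimal sketch using only the standard library; the helper names are made up for illustration, and it only checks that the playwright package is importable, not that the browser binaries were downloaded.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in this article


def python_is_new_enough(version=None, minimum=MIN_PYTHON):
    """Return True if the given (or the running) Python version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def playwright_is_installed():
    """Return True if the playwright package can be found in this environment."""
    return importlib.util.find_spec("playwright") is not None


if __name__ == "__main__":
    print("Python OK:", python_is_new_enough())
    print("playwright installed:", playwright_is_installed())
```

If the second check prints False, run the pip command above again inside the same interpreter or virtual environment.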

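With Playwright you do not have to write any code yourself. You simply operate the browser by hand; Playwright records your actions and generates the script automatically.

Recording is done with the codegen command, which is a single line.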

python -m playwright codegen      

You can view the usage of codegen with --help. For simple use, just append the URL to the command; add options if you need anything else. Here is the help output:

Usage: npx playwright codegen [options] [url]

open page and generate code for user actions

Options:
  -o, --output <file name>     saves the generated script to a file
  --target <language>          language to generate, one of javascript, test, python, python-async, csharp (default: "python")
  -b, --browser <browserType>  browser to use, one of cr, chromium, ff, firefox, wk, webkit (default: "chromium")
  --channel <channel>          Chromium distribution channel, "chrome", "chrome-beta", "msedge-dev", etc
  --color-scheme <scheme>      emulate preferred color scheme, "light" or "dark"
  --device <deviceName>        emulate device, for example  "iPhone 11"
  --geolocation <coordinates>  specify geolocation coordinates, for example "37.819722,-122.478611"
  --ignore-https-errors        ignore https errors
  --load-storage <filename>    load context storage state from the file, previously saved with --save-storage
  --lang <language>            specify language / locale, for example "en-GB"
  --proxy-server <proxy>       specify proxy server, for example "http://myproxy:3128" or "socks5://myproxy:8080"
  --save-storage <filename>    save context storage state at the end, for later use with --load-storage
  --timezone <time zone>       time zone to emulate, for example "Europe/Rome"
  --timeout <timeout>          timeout for Playwright actions in milliseconds (default: "10000")
  --user-agent <ua string>     specify user agent string
  --viewport-size <size>       specify browser viewport size in pixels, for example "1280, 720"
  -h, --help                   display help for command

Examples:

  $ codegen
  $ codegen --target=python
  $ codegen -b webkit https://example.com
      

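Explanation:

-o: save the recorded script to a file

--target: the language of the generated script, one of javascript, test, python, python-async, or csharp; the default is python

-b: the browser to use

For example:

Say I want to search on baidu.com using Chromium and save the result as a Python file named my.py.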

python -m playwright codegen --target python -o 'my.py' -b chromium https://www.baidu.com      

This generates the file automatically.


Now I still have to rename the file. Give it a try yourself.
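On some shells (Windows cmd, for instance) the single quotes around 'my.py' become part of the file name, which may be why the file needed renaming. One way to sidestep shell quoting is to start the recorder from Python with an argument list. This is only a sketch: build_codegen_command and record are hypothetical helper names, and only the -o, --target and -b options from the help output above are used.

```python
import subprocess
import sys


def build_codegen_command(url, output="my.py", target="python", browser="chromium"):
    """Build the argument list for `python -m playwright codegen`."""
    return [
        sys.executable, "-m", "playwright", "codegen",
        "--target", target,
        "-o", output,
        "-b", browser,
        url,
    ]


def record(url, **options):
    """Start the recorder and wait until the codegen process exits."""
    return subprocess.run(build_codegen_command(url, **options), check=True)


# Usage (opens a browser window, so it is left commented out):
# record("https://www.baidu.com")
```

Because the arguments are passed as a list, no shell is involved and the output file is named exactly my.py.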

As you keep clicking in the browser, it keeps updating the generated code.

This time we run the command:

python -m playwright codegen --target python -o 'my.py' -b chromium https://www.baidu.com
      

Click once, and the code updates automatically.


Click again, and the code keeps updating.


When the session ends, the browser closes and the generated automation script is saved to the .py file, as follows:

from playwright.sync_api import Playwright, sync_playwright


def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()

    # Open new page
    page = context.new_page()

    # Go to https://www.baidu.com/
    page.goto("https://www.baidu.com/")

    # Click text=阿裡女員工稱被侵害 事發飯店回應
    with page.expect_popup() as popup_info:
        page.click("text=阿裡女員工稱被侵害 事發飯店回應")
    page1 = popup_info.value

    # Click em:has-text("阿裡女員工稱被侵害 事發飯店回應")
    with page1.expect_popup() as popup_info:
        page1.click("em:has-text(\"阿裡女員工稱被侵害 事發飯店回應\")")
    page2 = popup_info.value

    # Close page
    page2.close()

    # Close page
    page1.close()

    # Close page
    page.close()

    # ---------------------
    context.close()
    browser.close()


with sync_playwright() as playwright:
    run(playwright)
      

Playwright provides both a synchronous and an asynchronous API. If you are comfortable with English, see the official documentation:

https://playwright.dev/python/docs/intro/      

Here is another example.

I will use a visit to my CSDN blog as the example. For that I run the following command in a terminal:

python -m playwright codegen --target python -o 'my.py' -b chromium https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343      

Press Enter and the magic begins!


Here is how it plays out, step by step.


I clicked on one of the articles, and the code updated along with it.


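Next I wanted to give my own post a like, and the site asked me to log in, as if it had caught me up to something. That is normal: this is a fresh browser with no login history. So I logged in by scanning the QR code with my phone.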


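Once logged in, I gave my own post a like, and the code was updated with that action as well.

I also clicked an image by accident, and the recorder added code for that click too. Here is the new content of my.py: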

from playwright.sync_api import Playwright, sync_playwright


def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()

    # Open new page
    page = context.new_page()

    # Go to https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343
    page.goto("https://blog.csdn.net/weixin_46211269?spm=1000.2115.3001.5343")

    # Click text=Django3.0入門教程:文章釋出系統
    with page.expect_popup() as popup_info:
        page.click("text=Django3.0入門教程:文章釋出系統")
    page1 = popup_info.value

    # Click #is-like-img
    page1.click("#is-like-img")

    # Click text=CSDN App掃碼
    page1.frame(name="passport_iframe").click("text=CSDN App掃碼")

    # Go to https://blog.csdn.net/weixin_46211269/article/details/119553344?spm=1001.2014.3001.5501
    page1.goto("https://blog.csdn.net/weixin_46211269/article/details/119553344?spm=1001.2014.3001.5501")

    # Click text=0 點贊 >> a
    page1.click("text=0 點贊 >> a")

    # Click img[alt="在這裡插入圖檔描述"]
    page1.click("img[alt=\"在這裡插入圖檔描述\"]", button="right")

    # Click text=在model.py複制粘貼以下代碼:
    page1.click("text=在model.py複制粘貼以下代碼:")

    # Click .imgViewDom img
    page1.click(".imgViewDom img")

    # Close page
    page1.close()

    # Close page
    page.close()

    # ---------------------
    context.close()
    browser.close()


with sync_playwright() as playwright:
    run(playwright)
      

You can drop the quotes around the file name in the command. Run the generated script and it performs the same actions again.
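Replaying the script starts from a fresh browser again, so the like step would run into the login prompt. One possible way around this, sketched below, is Playwright's storage state: log in once by hand, save the cookies and local storage to a file, then load that file into new contexts. The file name auth.json and both function names are assumptions for illustration, and whether the saved login stays valid depends on the site. The --save-storage and --load-storage options in the help output above are the command-line counterparts.

```python
STATE_FILE = "auth.json"  # hypothetical file name


def save_login_state(login_url, state_file=STATE_FILE):
    """Open a visible browser, wait for a manual login, then save cookies and storage."""
    # Imported here so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto(login_url)
        input("Log in in the browser window, then press Enter here...")
        context.storage_state(path=state_file)  # write the state to disk
        browser.close()


def open_logged_in_context(playwright, state_file=STATE_FILE):
    """Launch Chromium and return (browser, context) with the saved state loaded."""
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context(storage_state=state_file)
    return browser, context
```

In the generated run(playwright) function, the two lines that create browser and context could then be replaced by a call to open_logged_in_context(playwright).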

You may also be curious how to use the synchronous and the asynchronous API.


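The generated script runs very fast, so what if the site notices? To slow it down, pass slow_mo when launching the browser (Firefox in this example); it delays every operation by the given number of milliseconds. The timeout option does not do this, since it only limits how long an action may take. To pause at a specific point, use page.wait_for_timeout(5000) instead of time.sleep(5). Note that headless=False launches the browser with a visible window; headless=True is headless mode.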

browser = playwright.firefox.launch(headless=False, slow_mo=50)  # slow_mo: delay per operation, in ms
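Put together, the slow-down options might look like the sketch below. The function names launch_options and slow_visit are made up for illustration; launch(), slow_mo, headless and wait_for_timeout() are the Playwright pieces mentioned above. It assumes Firefox was downloaded by the install step.

```python
def launch_options(slow_mo_ms=50, visible=True):
    """Keyword arguments for launch(); headless=False means a visible window."""
    return {"headless": not visible, "slow_mo": slow_mo_ms}


def slow_visit(url, pause_ms=5000):
    """Open url in a slowed-down Firefox, pause, and return the page title."""
    # Imported here so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(**launch_options())
        page = browser.new_page()
        page.goto(url)
        page.wait_for_timeout(pause_ms)  # pause through Playwright rather than time.sleep
        title = page.title()
        browser.close()
        return title


# Usage (opens a browser window, so it is left commented out):
# print(slow_visit("https://www.baidu.com"))
```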

Let's go straight to the asynchronous API, since we have already seen plenty of synchronous code. Here is a minimal async demo:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        await browser.close()

asyncio.run(main())      
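The demo above only opens and closes a browser. Where the asynchronous API pays off is when several pages load at the same time. Here is a hedged sketch of that idea: fetch_title and fetch_titles are hypothetical helper names, and the URLs in the usage comment are placeholders.

```python
import asyncio


async def fetch_title(context, url):
    """Open url in a new page of the given browser context and return its title."""
    page = await context.new_page()
    try:
        await page.goto(url)
        return await page.title()
    finally:
        await page.close()


async def fetch_titles(urls):
    """Load all urls concurrently in one headless browser and return their titles."""
    # Imported here so this module still loads where Playwright is not installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        context = await browser.new_context()
        try:
            # gather() runs the page loads concurrently and keeps the input order
            return await asyncio.gather(*(fetch_title(context, u) for u in urls))
        finally:
            await browser.close()


# Usage (needs the browsers installed, so it is left commented out):
# print(asyncio.run(fetch_titles(["https://example.com", "https://playwright.dev"])))
```

All pages share one browser context here, so they also share cookies; create one context per task if they should be isolated.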

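A browser context can also be used to emulate multi-page scenarios involving mobile devices, permissions, locale, and color scheme. For example: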

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        iphone_11 = p.devices['iPhone 11 Pro']
        browser = await p.chromium.launch()
        context = await browser.new_context(
            **iphone_11,
            locale='de-DE',
            geolocation={ 'longitude': 12.492507, 'latitude': 41.889938 },
            permissions=['geolocation'],
            color_scheme='dark',
        )
        page = await context.new_page()  # create the page inside the emulated context
        await browser.close()

asyncio.run(main())      

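A browser context can hold multiple pages. A page is a single tab or popup window within a browser context. Use it to navigate to URLs and to interact with the page content. The following snippet is only a demo: example.com is a placeholder, so change it to the address you need.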

page = await context.new_page()

# Navigate explicitly, similar to entering a URL in the browser.
await page.goto('http://example.com')
# Fill an input.
await page.fill('#search', 'query')

# Navigate implicitly by clicking a link.
await page.click('#submit')
# Expect a new url.
print(page.url)

# Page can navigate from the script - this will be picked up by Playwright.
# window.location.href = 'https://example.com'      

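There are far too many operations to demonstrate them all here. If you read English, take a look at the reference documentation linked above. This tool is seriously impressive!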