
Setting Up a Proxy Pool with proxy_pool


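Preface

Project URL: https://github.com/jhao104/proxy_pool

This project, written by a developer on GitHub, is a proxy pool built on Python crawlers: it periodically fetches free, working proxies and adds them to the pool.

Let's walk through setting it up.

Steps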

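1. Install and configure Redis

The crawled proxies are stored in a Redis database, so the first step is to install Redis.

Redis officially recommends installing on Linux. There are two main ways to do it: install the package directly, or build it manually.

- Package install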

apt-get install redis-server
           

- Manual install

Download the latest Redis release archive from the official site and copy it to the Linux machine.

tar -zxvf redis-6.2.6.tar.gz
cd redis-6.2.6/
make
make install
cd /usr/local/bin
mkdir config
cp /opt/redis-6.2.6/redis.conf config		# assumes the archive was unpacked under /opt
           

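Editing the configuration file

Edit the Redis configuration file. Its location depends on the install method: the package install puts it at /etc/redis/redis.conf, and the manual install at /opt/redis-6.2.6/redis.conf (if you copied it to /usr/local/bin/config as shown earlier, edit that copy, since it is the one passed to redis-server). Make the following changes: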

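# run as a daemon (redis.conf comments must be on their own line)
daemonize yes
# disable protected mode
protected-mode no
# the bind line allows local connections only, so it must be commented out
# bind 127.0.0.1 ::1
# Redis listening port (open it in the firewall if the server has one)
port 6379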
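As a quick sanity check on the settings above, here is a minimal sketch that parses the text of a redis.conf and reports which of the settings still differ from what this guide expects. The function names `parse_redis_conf` and `check_remote_access` are made up for illustration, and the whitespace-based parsing is an assumption that ignores quoted arguments.

```python
def parse_redis_conf(text):
    """Return {directive: [args...]} for the active (uncommented) lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        name, *args = line.split()
        settings[name.lower()] = args
    return settings


def check_remote_access(settings):
    """List the settings that differ from the values used in this guide."""
    problems = []
    if settings.get("daemonize") != ["yes"]:
        problems.append("daemonize should be 'yes'")
    if settings.get("protected-mode") != ["no"]:
        problems.append("protected-mode should be 'no'")
    if "bind" in settings:
        problems.append("bind is still active; comment it out")
    return problems


# Hypothetical sample mirroring the settings above.
sample = """
daemonize yes
protected-mode no
# bind 127.0.0.1 ::1
port 6379
"""
print(check_remote_access(parse_redis_conf(sample)))  # prints []
```

To check a real file, pass its contents instead of `sample`, for example the text read from /etc/redis/redis.conf.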
           

Starting Redis

redis-server config/redis.conf
redis-cli
           

To stop the server, run the following inside redis-cli:

shutdown
exit
           

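2. Pull and run the project

According to the project documentation, you can either configure it manually or deploy it with Docker (recommended).

For how to use Docker, see my other blog post.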
docker pull jhao104/proxy_pool
docker run --env DB_CONN=redis://:[password]@[ip]:[port]/[db] -p 5010:5010 jhao104/proxy_pool:latest
           

password can be left empty if Redis has no password set.

db defaults to 0.
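To make the DB_CONN format concrete, here is a small sketch that assembles the string from its parts. The helper name `build_db_conn` and the address 192.168.1.10 are hypothetical; the sketch also assumes the password contains no characters such as `@` or `/` that would need special handling inside a URL.

```python
def build_db_conn(host, port=6379, password="", db=0):
    """Build a string of the form redis://:[password]@[ip]:[port]/[db]."""
    return f"redis://:{password}@{host}:{port}/{db}"


# No password, default db 0 (hypothetical address).
print(build_db_conn("192.168.1.10"))
# With a password and a non-default db.
print(build_db_conn("192.168.1.10", password="secret", db=1))
```

The resulting string is what goes after `DB_CONN=` in the docker run command.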

A successful run should look like this:

[Screenshot: proxy_pool container running successfully]
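Once the container is up, the service published on port 5010 can be queried over HTTP. The sketch below assumes the `/get/` endpoint and the `proxy` field described in the project's README, and assumes the container runs on the local machine; the helper names are hypothetical, so check the README of the version you pulled before relying on them.

```python
import json
from urllib.request import urlopen

API_BASE = "http://127.0.0.1:5010"  # assumption: container runs on this machine


def get_proxy(base=API_BASE, timeout=5):
    """Fetch one proxy record from the pool's /get/ endpoint as a dict."""
    with urlopen(f"{base}/get/", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def to_proxy_url(record):
    """Turn a record such as {"proxy": "1.2.3.4:8080"} into a proxy URL."""
    if "proxy" not in record:
        raise ValueError("record has no 'proxy' field; the pool may be empty")
    return "http://" + record["proxy"]
```

Against a running pool, `to_proxy_url(get_proxy())` would give a string that can be used as the proxy address in an HTTP client.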

3. Generate the configuration file and import it into Proxifier

First, install the redis package with pip:

pip install redis
           

Run the following script, after setting the Redis ip and port used by redis.ConnectionPool:

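# -*- coding:utf8 -*-
import redis
import json
from xml.etree import ElementTree

REDIS_HOST = "127.0.0.1"  # change to the IP of your Redis server
REDIS_PORT = 6379         # change to the port of your Redis server


def RedisProxyGet():
    """Read every entry of the 'use_proxy' hash and return the decoded JSON records."""
    ConnectString = []
    pool = redis.ConnectionPool(host=REDIS_HOST, port=REDIS_PORT, db=0, decode_responses=True)
    use_proxy = redis.Redis(connection_pool=pool)
    key = use_proxy.hkeys('use_proxy')
    for temp in key:
        try:
            ConnectString.append(json.loads(use_proxy.hget('use_proxy', temp)))
        except json.JSONDecodeError:  # skip entries that are not valid JSON
            pass
    return ConnectString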

def xmlOutputs(data):
    i = 101
    ProxyIDList = []
    ProxifierProfile = ElementTree.Element("ProxifierProfile")
    ProxifierProfile.set("version", str(i))
    ProxifierProfile.set("platform", "Windows")
    ProxifierProfile.set("product_id", "0")
    ProxifierProfile.set("product_minver", "310")
    Options = ElementTree.SubElement(ProxifierProfile, "Options")
    Resolve = ElementTree.SubElement(Options, "Resolve")
    AutoModeDetection = ElementTree.SubElement(Resolve, "AutoModeDetection")
    AutoModeDetection.set("enabled", "false")
    ViaProxy = ElementTree.SubElement(Resolve, "ViaProxy")
    ViaProxy.set("enabled", "false")
    TryLocalDnsFirst = ElementTree.SubElement(ViaProxy, "TryLocalDnsFirst")
    TryLocalDnsFirst.set("enabled", "false")
    ExclusionList = ElementTree.SubElement(Resolve, "ExclusionList")
    ExclusionList.text = "%ComputerName%; localhost; *.local"
    Encryption = ElementTree.SubElement(Options, "Encryption")
    Encryption.set("mode", 'basic')
    # The remaining options are simple on/off flags.
    for name, enabled in (
        ("HttpProxiesSupport", "true"),
        ("HandleDirectConnections", "false"),
        ("ConnectionLoopDetection", "true"),
        ("ProcessServices", "false"),
        ("ProcessOtherUsers", "false"),
    ):
        ElementTree.SubElement(Options, name).set("enabled", enabled)
    ProxyList = ElementTree.SubElement(ProxifierProfile, "ProxyList")
    for temp in data:
        i += 1  # i starts at 101, so proxy ids begin at 102
        Proxy = ElementTree.SubElement(ProxyList, "Proxy")
        Proxy.set("id", str(i))
        if not temp['https']:
            Proxy.set("type", "HTTP")
        else:
            # Only HTTPS-capable proxies are recorded for the chain built below.
            Proxy.set("type", "HTTPS")
            Proxy.text = str(i)
            ProxyIDList.append(i)
        Address = ElementTree.SubElement(Proxy, "Address")
        Address.text = temp['proxy'].split(":", 1)[0]

        Port = ElementTree.SubElement(Proxy, "Port")
        Port.text = temp['proxy'].split(":", 1)[1]

        Options = ElementTree.SubElement(Proxy, "Options")
        Options.text = "48"
    ChainList = ElementTree.SubElement(ProxifierProfile, "ChainList")

    Chain = ElementTree.SubElement(ChainList, "Chain")
    Chain.set("id", str(i))
    Chain.set("type", "simple")

    Name = ElementTree.SubElement(Chain, "Name")
    Name.text="AgentPool"

    for temp_id in ProxyIDList:
        Proxy = ElementTree.SubElement(Chain, "Proxy")
        Proxy.set("enabled", "true")
        Proxy.text=str(temp_id)
    RuleList = ElementTree.SubElement(ProxifierProfile, "RuleList")

    Rule = ElementTree.SubElement(RuleList, "Rule")
    Rule.set("enabled", "true")
    Name = ElementTree.SubElement(Rule,"Name")
    Applications = ElementTree.SubElement(Rule,"Applications")
    Action = ElementTree.SubElement(Rule,"Action")

    Name.text="禦劍背景掃描工具.exe [auto-created]"
    Applications.text="禦劍背景掃描工具.exe"
    Action.set("type","Direct")

    # Rule
    Rule = ElementTree.SubElement(RuleList, "Rule")
    Rule.set("enabled", "true")
    Name = ElementTree.SubElement(Rule,"Name")
    Targets = ElementTree.SubElement(Rule,"Targets")
    Action = ElementTree.SubElement(Rule,"Action")

    Name.text="Localhost"
    Targets.text="localhost; 127.0.0.1; %ComputerName%"
    Action.set("type", "Direct")

    # Rule
    Rule = ElementTree.SubElement(RuleList, "Rule")
    Rule.set("enabled", "true")
    Name = ElementTree.SubElement(Rule, "Name")
    Action = ElementTree.SubElement(Rule, "Action")
    Name.text = "Default"
    Action.text = "102"
    Action.set("type", "Proxy")

    tree = ElementTree.ElementTree(ProxifierProfile)
    tree.write("ProxifierConf.ppx", encoding="UTF-8", xml_declaration=True)
if __name__ == '__main__':
    proxy_data = RedisProxyGet()
    xmlOutputs(proxy_data)
    print("ProxifierConf.ppx configuration file created.")
           

A successful run generates the file ProxifierConf.ppx. Double-click it to import it into Proxifier.
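Before importing, it can help to confirm that the generated profile actually contains proxies. The following sketch counts the proxies by type and lists the proxy ids each chain refers to. `summarize_profile` is a hypothetical helper, and the sample XML is a hand-written fragment that mirrors the element names used by the script above, not real output.

```python
from xml.etree import ElementTree


def summarize_profile(xml_text):
    """Count proxies by type and list the proxy ids referenced by each chain."""
    root = ElementTree.fromstring(xml_text)
    types = {}
    for proxy in root.findall("./ProxyList/Proxy"):
        kind = proxy.get("type")
        types[kind] = types.get(kind, 0) + 1
    chains = {
        chain.findtext("Name"): [(p.text or "").strip() for p in chain.findall("Proxy")]
        for chain in root.findall("./ChainList/Chain")
    }
    return types, chains


# Hand-written sample with made-up addresses.
sample = """<ProxifierProfile version="101">
  <ProxyList>
    <Proxy id="102" type="HTTP"><Address>10.0.0.1</Address><Port>8080</Port></Proxy>
    <Proxy id="103" type="HTTPS">103<Address>10.0.0.2</Address><Port>3128</Port></Proxy>
  </ProxyList>
  <ChainList>
    <Chain id="103" type="simple"><Name>AgentPool</Name><Proxy enabled="true">103</Proxy></Chain>
  </ChainList>
</ProxifierProfile>"""

print(summarize_profile(sample))
```

For the real file, read ProxifierConf.ppx in binary mode and pass its contents to `summarize_profile`. An empty result for the proxy types means the `use_proxy` hash in Redis was empty when the script ran.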

Note that the Proxifier version must not be too recent, otherwise the import fails with an error; version 3.3.1 is recommended.