4. Python Crawler Browser Disguise Techniques

2018-02-07 23:50:00

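# Browser disguise techniques for Python crawlers
# Crawling a CSDN blog returns a 403 error because the server blocks crawlers; the crawler has to disguise itself as a browser before it can fetch the page.
# Browser disguise is usually done through the request headers.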

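import urllib.request

url = "http://blog.csdn.net/bingoxubin/article/details/78503370"
# Replace the placeholder with the User-Agent string copied from your browser.
headers = ("User-Agent", "User-Agent value from your browser")
opener = urllib.request.build_opener()
# addheaders is a list of (name, value) tuples sent with every request made through this opener.
opener.addheaders = [headers]
data = opener.open(url).read()
print(len(data))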
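
The same disguise can also be applied per request instead of on the opener. The sketch below is one possible variant, not the only way to do it: it attaches the header to a urllib.request.Request object. The names EXAMPLE_USER_AGENT, build_request and fetch are hypothetical helpers introduced here for illustration, and the User-Agent string is only a sample value; copy the real one from your own browser's developer tools.

```python
import urllib.request

# Sample value only (an assumption for illustration); use your own browser's string.
EXAMPLE_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)


def build_request(url, user_agent=EXAMPLE_USER_AGENT):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=EXAMPLE_USER_AGENT, timeout=10):
    """Download the page body as bytes, sending the disguised header."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch("http://blog.csdn.net/bingoxubin/article/details/78503370") would then play the same role as opener.open(url).read() above. Whether the server accepts the request still depends on its own anti-crawler rules, so a browser-like User-Agent alone may not always be enough.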
