
Handling Password Login When Scraping Taobao

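While scraping information from Taobao recently, I ran into the login page. There are two ways to get past it, and both worked when I tested them. Because cookies can store login information, the first approach saves the cookies returned at login and reuses them to simulate a logged-in user.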

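Open Taobao and go to the login page, then open the developer tools and switch to the Network tab so that requests are recorded. Enter the account and password in the input boxes and click the login button. Back in the Network tab, find the entry whose name contains login. There may be several login entries; pick the one whose request method is POST. In its Headers panel, below the request headers, copy everything under Form Data into the code.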
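Quoting every copied field by hand is error-prone. As a minimal sketch, assuming the copied Form Data text has one `key: value` pair per line (the exact copy format depends on the browser), a small helper can turn it into a dict. The helper name `parse_form_data` and the sample values are hypothetical and are not part of the original code.

```python
from urllib.parse import urlencode


def parse_form_data(raw: str) -> dict:
    """Turn 'key: value' lines (assumed copy format) into a dict of strings."""
    form = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so URLs in the value stay intact.
        key, _, value = line.partition(":")
        form[key.strip()] = value.strip()
    return form


# Hypothetical sample text, shaped like a few of the login form fields.
raw = """
TPL_username: example_user
TPL_redirect_url: https://www.taobao.com/
sr: 1366*768
css_style:
"""

form = parse_form_data(raw)
body = urlencode(form).encode()  # bytes, ready to send as POST data
```

Every value stays a string here, so a field such as `sr` is sent exactly as the browser sent it.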

The code is as follows:

import http.cookiejar  # used to store cookies
import urllib.request
from urllib.parse import urlencode

# Create a CookieJar object to hold the cookies
cj = http.cookiejar.CookieJar()
# Build a handler from the CookieJar
handler = urllib.request.HTTPCookieProcessor(cj)
# Build an opener from the handler
opener = urllib.request.build_opener(handler)

def main():
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36',
    }
    # Form fields copied from the Form Data panel of the POST login request.
    # Replace them with the values captured from your own login request.
    form_data = {
        'TPL_username': 15013122417,
        'TPL_password': '',
        'ncoSig': '',
        'ncoSessionid': '',
        'ncoToken': 'e21d6ce02619ad1fb82ddce9f7f72a2413a7d537',
        'slideCodeShow': 'false',
        'useMobile': 'false',
        'lang': 'zh_CN',
        'loginsite': 0,
        'newlogin': 0,
        'TPL_redirect_url': 'https://www.taobao.com/',
        'from': 'tbTop',
        'fc': 'default',
        'style': 'default',
        'css_style': '',
        'keyLogin': 'false',
        'qrLogin': 'true',
        'newMini': 'false',
        'newMini2': 'false',
        'tid': '',
        'minititle': '',
        'minipara': '',
        'pstrong': '',
        'sign': '',
        'need_sign': '',
        'isIgnore': '',
        'full_redirect': '',
        'sub_jump': '',
        'popid': '',
        'callback': '',
        'guf': '',
        'not_duplite_str': '',
        'need_user_id': '',
        'poy': '',
        'loginType': 3,
        'gvfdcname': 10,
        'gvfdcre': '68747470733A2F2F7777772E74616F62616F2E636F6D2F',
        'from_encoding': '',
        'sub': '',
        'TPL_password_2': '1e60cefcfb3c403402b7627c7b8927764ed61e779f62f70add829d61878194b9e45e44a920246a13f2b0375f14157a1cc316ad46eab2ff85872e1c2055d32a2f1730c7fe7a6f5b529fd3d675b8a9ebb1860a5f0af83328df6ac4ef86f3a85977227a607bef4923cf88a01dfca44ddd4ddf4f1a2e1e74f6a4ee1317e3396bd1f1',
        'loginASR': 1,
        'loginASRSuc': 1,
        'allp': '',
        'oslanguage': 'zh - CN',
        'sr': '1366*768',  # kept as text; unquoted, 1366 * 768 would be multiplied
        'naviVer': 'chrome | 73.03683103',
        'osACN': 'Mozilla',
        'osAV': '5.0(Windows NT 10.0;Win64;x64) AppleWebKit / 537.36(KHTML, likeGecko) Chrome / 73.0.3683.103Safari / 537.36',
        'osPF': 'Win32',
        'appkey': '00000000',  # kept as text so the leading zeros are sent
        'nickLoginLink': '',
        'mobileLoginLink': 'https://login.taobao.com/member/login.jhtml?spm=a21bo.2017.754894437.1.5af911d9I6UfNH&f=top&redirectURL=https://www.taobao.com/&useMobile=true',
        'showAssistantLink': '',
        'um_token': 'T13D88A3902FED5E62629CFFE67A8764267D8ACF3FC100820CF24AD7585',
        'ua': '118#ZVWZzZ6PogTAvZ/Ope2VZeZTZeHhngdrZHxfIsqTzHRzZDeZXoq4YZ2+ZZCTVHW41g2uZCp9zeJO4ZZ2Xoq4zeHzzZw4XHRVagCCXDiPwvNa+824cfq4PHZZueZhVHWVZZA/uQOLC/WZZgCuc6q4ze22ZYfJ5eSYZgY0ZsqhzeC4ZZZuVfq4zH2ZZZChXHWYZg2ZZYqhzHJTZZZuVfq4zEWwZZZhVoR4eZRhZeg2+W+XZuiu/HuFqsu2Usv3spjSnjARFK83VjCyXtb68nrFgb012h6It3bWJu80Zydz2ee7ZZe3L4PXZZxqHHSwr9z3ZYiG0a+vyJs+TksTNAGu95bZjER7pfEd8Uq+fi85yCeRC1or28aik8dDkftGcdYYVKp/QGDz+PlHEs8ACH0JNpg+7j+c7XRGTvLhJbepxFcjsprZ3hSrkCLza5U8P6HHJIdR0wvPbGDnVb5lKPH37n4s3+89LVM1TcmD226XOMWl3PZ2TtG/vXVpZhIZrF2bkcKm1JwzGEIY6T50SpXSSBCc/QEs3Ge7o5rrSpZZT8Emcn0yQa+wCAXQy8eQ1EeTrFh3hDRoGtZD0fdOylE1KL0OdiiHX8tBQFq4cBUCAkfKoeiAw2h3kCrtzazo4aSVT3gS8tqpWc6ltu/h8CkioaaXaJP4ss0EBfrBFLu2Ubk+2zU0+VQGwP8PoOSqzp9+7ZGWlEgDE+xrrPs0QlMmeDgPGfLS/tT9SEMB7WIYYJnjDITyMuIYpaJHqAj08fdF/AY/ExG5IoMmVr3DjtZ40pCN9LVxL6oWuMvBZduk0+lhosx6/cMVCx3Xlu4oGYbUj8PWdrHm7jgA2ArHYv5zjctFEdMYGH1EpdA1PdF+cTVCCXvz5cDzLpUGdrlziWm60C/VqQjQrOuLsNvMApZn',
    }
    url = 'https://login.taobao.com/member/login.jhtml?'  # login URL
    # This URL is used to check that the cookies were saved, i.e. that the login succeeded
    url_t = 'https://www.taobao.com/markets/xie/nvxie/index?spm=a21bo.2017.201867-main.4.5af911d9DO03rO'
    form_data = urlencode(form_data).encode()
    request = urllib.request.Request(url, headers=headers)
    # Log in; the cookies from the login response are stored in the opener's CookieJar
    response = opener.open(request, data=form_data)
    request1 = urllib.request.Request(url=url_t, headers=headers)
    response1 = opener.open(request1)  # the opener sends the saved cookies
    print(response1.read().decode())


if __name__ == '__main__':
    main()
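The CookieJar in the code above lives only in memory, so the login request has to be repeated on every run. As a hedged sketch of one way around that, the standard library's `http.cookiejar.MozillaCookieJar` can write the cookies to a file and read them back later. The file name `taobao_cookies.txt` and both helper names are hypothetical, and this is not part of the original code.

```python
import http.cookiejar
import urllib.request

COOKIE_FILE = "taobao_cookies.txt"  # hypothetical file name


def build_opener_with_cookie_file(path=COOKIE_FILE):
    """Return (opener, jar); the jar is pre-filled from `path` if it exists."""
    jar = http.cookiejar.MozillaCookieJar(path)
    try:
        # ignore_discard keeps session cookies, which have no expiry date.
        jar.load(ignore_discard=True, ignore_expires=True)
    except FileNotFoundError:
        pass  # first run: nothing has been saved yet
    handler = urllib.request.HTTPCookieProcessor(jar)
    return urllib.request.build_opener(handler), jar


def save_cookies(jar):
    """Write the jar's cookies to the file it was created with."""
    jar.save(ignore_discard=True, ignore_expires=True)
```

A script could call `save_cookies(jar)` right after the login request and use `build_opener_with_cookie_file()` on later runs. Whether the saved cookies are still accepted depends on how long the site keeps the login valid.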

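The code above simulates login with cookies. Another approach is to keep the login information in a session, which also works. See this article: https://www.jianshu.com/p/9b317e95d0a6. Thanks to its author; I worked out my own method from that article together with what I had learned earlier.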
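For comparison, here is a rough sketch of the session approach. It assumes the third-party `requests` library, whose `Session` object keeps cookies between requests automatically. This is not the code from the referenced article; the helper names `make_session` and `login_and_fetch` are hypothetical, and `form_data` stands for the same dict of fields copied from the browser.

```python
import requests  # third-party library (assumed installed: pip install requests)

LOGIN_URL = "https://login.taobao.com/member/login.jhtml"


def make_session(user_agent):
    """Create a session that sends the same User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def login_and_fetch(session, form_data, check_url):
    """POST the login form, then fetch check_url with the same session."""
    # Cookies set by the login response are stored on the session object.
    session.post(LOGIN_URL, data=form_data, timeout=10)
    # The stored cookies are sent with this follow-up request.
    return session.get(check_url, timeout=10).text
```

The session plays the role of the opener and CookieJar together, so no separate cookie handling is needed.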