Installing the Python requests library

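Preparing to install pip

First check which Python is present and whether pip is already on the PATH. Neither pip nor pip3 is found.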

[[email protected] ~]# python36 --version
Python 3.6.3
[[email protected] ~]# command -v pi
pic              pidof            pifconfig        pinentry-curses  pinentry-gtk-2   ping             pinky            pivot_root       
piconv           pidstat          pinentry         pinentry-gtk     pinfo            ping6            pitchplay        
[[email protected] ~]# command -v pip
[[email protected] ~]# command -v pip3
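
The same lookup can be done from Python. This is a minimal sketch, assuming only the standard library; the helper name `find_pip` is hypothetical and not part of the original walkthrough. `shutil.which` searches the PATH much as `command -v` does for executables.

```python
import shutil


def find_pip(candidates=("pip", "pip3")):
    """Map each candidate command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in candidates}


# Print one line per candidate; a missing command shows up as "not found".
for name, path in find_pip().items():
    print(name, "->", path or "not found")
```

On the machine above, both entries would be reported as not found at this point, since `command -v` printed nothing.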
           

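Search for pip and install it. pip can be installed with yum; the drawback is that the packaged version is not the latest (8.1.2 here). Note that python34-pip depends on Python 3.4, so yum also installs python34, and the resulting pip3 manages packages for Python 3.4 rather than for the python36 interpreter checked earlier.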

[[email protected] ~]# yum list *pip*
Loaded plugins: fastestmirror, langpacks
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Loading mirror speeds from cached hostfile
 * base: mirror.bit.edu.cn
 * epel: mirrors.tongji.edu.cn
 * extras: mirrors.zju.edu.cn
 * updates: mirror.bit.edu.cn
Installed Packages
libpipeline.x86_64                                                                                    1.2.3-3.el7                                                                     @anaconda
Available Packages
aespipe.x86_64                                                                                        2.4d-2.el7                                                                      epel     
globus-xio-pipe-driver.x86_64                                                                         3.10-1.el7                                                                      epel     
globus-xio-pipe-driver-devel.x86_64                                                                   3.10-1.el7                                                                      epel     
libpipeline.i686                                                                                      1.2.3-3.el7                                                                     base     
libpipeline-devel.i686                                                                                1.2.3-3.el7                                                                     base     
libpipeline-devel.x86_64                                                                              1.2.3-3.el7                                                                     base     
nodejs-unpipe.noarch                                                                                  1.0.0-2.el7                                                                     epel     
pdns-backend-pipe.x86_64                                                                              3.4.11-4.el7                                                                    epel     
perl-IO-Pipely.noarch                                                                                 0.005-4.el7                                                                     epel     
pipelight-selinux.noarch                                                                              0.1.0-2.el7                                                                     epel     
python-apipkg.noarch                                                                                  1.2-7.el7                                                                       epel     
python-django-pipeline.noarch                                                                         1.3.27-1.el7                                                                    epel     
python2-pip.noarch                                                                                    8.1.2-6.el7                                                                     epel     
python34-pip.noarch                                                                                   8.1.2-6.el7                                                                     epel     
rubygem-apipie-bindings.noarch                                                                        0.0.10-2.el7                                                                    epel     
rubygem-apipie-bindings-doc.noarch                                                                    0.0.10-2.el7                                                                    epel     
uwsgi-logger-pipe.x86_64                                                                              2.0.16-1.el7                                                                    epel     
vanessa_socket-pipe.x86_64                                                                            0.0.12-3.el7                                                                    epel     
[[email protected] ~]# yum install python34-pip.noarch
Loaded plugins: fastestmirror, langpacks
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
base                                                                                                                                                                    | 3.6 kB  00:00:00     
epel/x86_64/metalink                                                                                                                                                    | 6.6 kB  00:00:00     
epel                                                                                                                                                                    | 3.2 kB  00:00:00     
extras                                                                                                                                                                  | 3.4 kB  00:00:00     
updates                                                                                                                                                                 | 3.4 kB  00:00:00     
zabbix                                                                                                                                                                  | 2.9 kB  00:00:00     
zabbix-non-supported                                                                                                                                                    |  951 B  00:00:00     
(1/5): extras/7/x86_64/primary_db                                                                                                                                       | 149 kB  00:00:00     
epel/x86_64/primary            FAILED                                            11% [========                                                               ] 201 kB/s | 814 kB  00:00:29 ETA 
http://mirror1.ku.ac.th/fedora/epel/7/x86_64/repodata/916333309cde50b4a977b81421d0043801ca99a6627933cc9270f48a30e61b57-primary.xml.gz: [Errno 14] HTTP Error 404 - Not Found4 kB  00:00:29 ETA 
Trying other mirror.
To address this issue please refer to the below knowledge base article 

https://access.redhat.com/articles/1320623

If above article doesn't help to resolve this issue please create a bug on https://bugs.centos.org/

(2/5): zabbix/x86_64/primary_db                                                                                                                                         |  87 kB  00:00:03     
(3/5): updates/7/x86_64/primary_db                                                                                                                                      | 2.0 MB  00:00:13     
(4/5): epel/x86_64/updateinfo                                                                                                                                           | 930 kB  00:00:16     
epel/x86_64/primary            FAILED                                          
http://ftp.kddilabs.jp/Linux/packages/fedora/epel/7/x86_64/repodata/916333309cde50b4a977b81421d0043801ca99a6627933cc9270f48a30e61b57-primary.xml.gz: [Errno 14] HTTP Error 404 - Not Found ETA 
Trying other mirror.
(5/5): epel/x86_64/primary                                                                                                                                              | 3.5 MB  00:00:35     
Loading mirror speeds from cached hostfile
 * base: mirror.bit.edu.cn
 * epel: mirrors.tuna.tsinghua.edu.cn
 * extras: mirrors.zju.edu.cn
 * updates: mirror.bit.edu.cn
epel                                                                                                                                                                               12583/12583
Resolving Dependencies
--> Running transaction check
---> Package python34-pip.noarch 0:8.1.2-6.el7 will be installed
--> Processing Dependency: python(abi) = 3.4 for package: python34-pip-8.1.2-6.el7.noarch
--> Processing Dependency: python34-setuptools for package: python34-pip-8.1.2-6.el7.noarch
--> Processing Dependency: /usr/bin/python3.4 for package: python34-pip-8.1.2-6.el7.noarch
--> Running transaction check
---> Package python34.x86_64 0:3.4.8-1.el7 will be installed
--> Processing Dependency: python34-libs(x86-64) = 3.4.8-1.el7 for package: python34-3.4.8-1.el7.x86_64
--> Processing Dependency: libpython3.4m.so.1.0()(64bit) for package: python34-3.4.8-1.el7.x86_64
---> Package python34-setuptools.noarch 0:19.2-3.el7 will be installed
--> Running transaction check
---> Package python34-libs.x86_64 0:3.4.8-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===============================================================================================================================================================================================
 Package                                                Arch                                      Version                                        Repository                               Size
===============================================================================================================================================================================================
Installing:
 python34-pip                                           noarch                                    8.1.2-6.el7                                    epel                                    1.7 M
Installing for dependencies:
 python34                                               x86_64                                    3.4.8-1.el7                                    epel                                     51 k
 python34-libs                                          x86_64                                    3.4.8-1.el7                                    epel                                    8.3 M
 python34-setuptools                                    noarch                                    19.2-3.el7                                     epel                                    373 k

Transaction Summary
===============================================================================================================================================================================================
Install  1 Package (+3 Dependent packages)

Total download size: 10 M
Installed size: 37 M
Is this ok [y/d/N]: y
Downloading packages:
(1/4): python34-3.4.8-1.el7.x86_64.rpm                                                                                                                                  |  51 kB  00:00:02     
(2/4): python34-pip-8.1.2-6.el7.noarch.rpm                                                                                                                              | 1.7 MB  00:00:05     
(3/4): python34-setuptools-19.2-3.el7.noarch.rpm                                                                                                                        | 373 kB  00:00:09     
(4/4): python34-libs-3.4.8-1.el7.x86_64.rpm                                                                                                                             | 8.3 MB  00:03:54     
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                           45 kB/s |  10 MB  00:03:54     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : python34-libs-3.4.8-1.el7.x86_64                                                                                                                                            1/4 
  Installing : python34-3.4.8-1.el7.x86_64                                                                                                                                                 2/4 
  Installing : python34-setuptools-19.2-3.el7.noarch                                                                                                                                       3/4 
  Installing : python34-pip-8.1.2-6.el7.noarch                                                                                                                                             4/4 
  Verifying  : python34-setuptools-19.2-3.el7.noarch                                                                                                                                       1/4 
  Verifying  : python34-3.4.8-1.el7.x86_64                                                                                                                                                 2/4 
  Verifying  : python34-pip-8.1.2-6.el7.noarch                                                                                                                                             3/4 
  Verifying  : python34-libs-3.4.8-1.el7.x86_64                                                                                                                                            4/4 

Installed:
  python34-pip.noarch 0:8.1.2-6.el7                                                                                                                                                            

Dependency Installed:
  python34.x86_64 0:3.4.8-1.el7                             python34-libs.x86_64 0:3.4.8-1.el7                             python34-setuptools.noarch 0:19.2-3.el7                            

Complete!
           

After installing pip, verify that it is available. pip3 is now on the PATH.

[[email protected] ~]# command -v pip3
/usr/bin/pip3
           

Next, install the requests library.

[[email protected] ~]# pip3 install requests
Collecting requests
  Downloading https://files.pythonhosted.org/packages/65/47/7e02164a2a3db50ed6d8a6ab1d6d60b69c4c3fdf57a284257925dfc12bda/requests-2.19.1-py2.py3-none-any.whl (91kB)
    100% |████████████████████████████████| 92kB 194kB/s 
Collecting certifi>=2017.4.17 (from requests)
  Downloading https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl (150kB)
    100% |████████████████████████████████| 153kB 464kB/s 
Collecting chardet<3.1.0,>=3.0.2 (from requests)
  Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
    100% |████████████████████████████████| 143kB 864kB/s 
Collecting urllib3<1.24,>=1.21.1 (from requests)
  Downloading https://files.pythonhosted.org/packages/bd/c9/6fdd990019071a4a32a5e7cb78a1d92c53851ef4f56f62a3486e6a7d8ffb/urllib3-1.23-py2.py3-none-any.whl (133kB)
    100% |████████████████████████████████| 143kB 943kB/s 
Collecting idna<2.8,>=2.5 (from requests)
  Downloading https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl (58kB)
    100% |████████████████████████████████| 61kB 92kB/s 
Installing collected packages: certifi, chardet, urllib3, idna, requests
Successfully installed certifi-2018.4.16 chardet-3.0.4 idna-2.7 requests-2.19.1 urllib3-1.23
You are using pip version 8.1.2, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
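
Because pip3 here belongs to Python 3.4, it can be worth confirming that the interpreter you actually run can see the package. The following is a rough sketch, assuming only the standard library; `is_installed` is a hypothetical helper name. It asks the import system whether a top-level module can be found, without importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Run this with the interpreter you intend to use (python3 in this walkthrough).
print("requests available:", is_installed("requests"))
```

Running the same check under a different interpreter (for example python36 on this machine) may well report False, because each Python installation has its own site-packages directory.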
           

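Start the Python interactive interpreter and call get() to fetch the 163 home page. The call fails because the requests library has not been imported yet.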

[[email protected] ~]# python3
Python 3.4.8 (default, Mar 23 2018, 10:04:27) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> r = requests.get("http://www.163.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'requests' is not defined
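
The failure happens before any network traffic: Python cannot resolve the name `requests`, so `get()` is never called. A small sketch that reproduces this and captures the message (the variable name `message` is only for illustration):

```python
# No "import requests" has been executed, so looking up the name fails
# before .get() is ever reached and no request is sent.
try:
    r = requests.get("http://www.163.com")
except NameError as exc:
    message = str(exc)
    print(message)
```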
           

Import the requests library and fetch again. This time it succeeds.

>>> import requests
>>> r = requests.get("http://www.163.com")
>>> r.status_code
200
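
In a script, rather than at the prompt, the same check is usually wrapped with a timeout and an explicit error check. This is a sketch under assumptions, not the author's method: `fetch_status` and its `session` parameter are hypothetical names, while `timeout=` and `raise_for_status()` are standard parts of the requests API.

```python
def fetch_status(url, session=None, timeout=10):
    """Return the HTTP status code for url.

    Raises an exception on network errors or on 4xx/5xx responses.
    `session` is any object with a requests-style get(); it defaults
    to the requests module itself.
    """
    if session is None:
        import requests  # imported lazily so a stand-in can be passed instead
        session = requests
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.status_code
```

Calling `fetch_status("http://www.163.com")` would correspond to the interactive session above. Without a timeout, `requests.get` can wait indefinitely on an unresponsive server.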
           

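The 163 page has too much content, so fetch a different page instead. The text first comes back garbled; after setting the encoding to utf-8, it becomes readable.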

>>> r = requests.get("http://www.baidu.com")
>>> r.status_code
200
>>> r.text
'<!DOCTYPE html>\r\n<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=http://s1.bdstatic.com/r/www/cache/bdorz/baidu.min.css><title>ç\x99¾åº¦ä¸\x80ä¸\x8bï¼\x8cä½\xa0å°±ç\x9f¥é\x81\x93</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=ç\x99¾åº¦ä¸\x80ä¸\x8b class="bg s_btn"></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>æ\x96°é\x97»</a> <a href=http://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>å\x9c°å\x9b¾</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>è§\x86é¢\x91</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>è´´å\x90§</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>ç\x99»å½\x95</a> </noscript> <script>document.write(\'<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u=\'+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" : "&")+ "bdorz_come=1")+ \'" name="tj_login" class="lb">ç\x99»å½\x95</a>\');</script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">æ\x9b´å¤\x9aäº§å\x93\x81</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>å\x85³äº\x8eç\x99¾åº¦</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>&copy;2017 Baidu <a href=http://www.baidu.com/duty/>ä½¿ç\x94¨ç\x99¾åº¦å\x89\x8då¿\x85è¯»</a>  <a href=http://jianyi.baidu.com/ class=cp-feedback>æ\x84\x8fè§\x81å\x8f\x8dé¦\x88</a> äº¬ICPè¯\x81030173å\x8f·  <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>\r\n'
>>> r.encoding = 'utf-8'
>>> r.text
'<!DOCTYPE html>\r\n<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=http://s1.bdstatic.com/r/www/cache/bdorz/baidu.min.css><title>百度一下，你就知道</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=百度一下 class="bg s_btn"></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>新闻</a> <a href=http://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>地图</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>视频</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>贴吧</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>登录</a> </noscript> <script>document.write(\'<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u=\'+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" : "&")+ "bdorz_come=1")+ \'" name="tj_login" class="lb">登录</a>\');</script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">更多产品</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>关于百度</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>&copy;2017 Baidu <a href=http://www.baidu.com/duty/>使用百度前必读</a>  <a href=http://jianyi.baidu.com/ class=cp-feedback>意见反馈</a> 京ICP证030173号  <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>\r\n'
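
The garbled first output is most likely the result of UTF-8 bytes being decoded as ISO-8859-1, which requests falls back to for text responses whose Content-Type header carries no charset. The sketch below reproduces the effect without any network access, using only built-in string methods; the variable names are illustrative.

```python
# The bytes a server would send for the page title fragment, encoded as UTF-8.
raw = "百度一下".encode("utf-8")

# Decoding with the wrong codec yields the garbled text seen in the first r.text.
garbled = raw.decode("iso-8859-1")

# Decoding with the right codec restores it, which is what setting
# r.encoding = 'utf-8' causes r.text to do.
readable = raw.decode("utf-8")

print(repr(garbled))
print(readable)
```

Nothing is lost in the wrong decoding: the original bytes are still in `r.content`, so changing `r.encoding` and reading `r.text` again is enough to fix the display.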