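I'm new to web scraping and am scraping the hot comments list on Jianshu. Everything inserts into the database except the like count. This is the error I get: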
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'like,reward) values ('462','7')' at line 1")
import requests
from bs4 import BeautifulSoup
from lxml import etree
from multiprocessing import Pool
import pymysql

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36'
}
conn = pymysql.connect(host='localhost', user='root', passwd='123456',
                       db='mydb', port=3306, charset='utf8')
cursor = conn.cursor()

def get_jianshu_info(url):
    res = requests.get(url, headers=headers)
    selector = etree.HTML(res.text)
    infos = selector.xpath('//ul[@class="note-list"]/li')
    for info in infos:
        try:
            title = info.xpath('div/a/text()')[0]
            author = info.xpath('div/div/a[1]/text()')[0]
            content = info.xpath('div/p/text()')[0].strip()
            comment = info.xpath('div/div/a/text()')[2].strip()
            if len(comment) == 0:
                comment = '無'  # '無' is the placeholder for "none"
            like = info.xpath('div/div/span[1]/text()')[0].strip()
            if len(like) == 0:
                like = '無'
            reward = info.xpath('div/div/span[2]/text()')
            if len(reward) == 0:
                reward = '無'
            else:
                reward = reward[0].strip()
            # This INSERT is the statement that raises the 1064 error above.
            cursor.execute(
                "insert into jianshureping (like,reward) "
                "values (%s,%s)",
                (str(like), str(reward))
            )
            conn.commit()
            print('ok')
        except IndexError:
            print('error')

if __name__ == '__main__':
    urls = ['https://www.jianshu.com/c/bDHhpK?order_by=commented_at&page={}'.format(str(i))
            for i in range(1, 3)]
    for url in urls:
        get_jianshu_info(url)
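The code above repeats the same "take one text node, strip it, fall back to '無' if empty" steps for each field. As a rough sketch only, that pattern could be pulled into one helper. `text_at` is a hypothetical name, not part of lxml, and it assumes a non-negative index. Unlike the original code, it returns the placeholder instead of raising IndexError when the node is missing.

```python
def text_at(nodes, index=0, default='無'):
    """Return the stripped text at `index`, or `default` if it is missing or blank.

    `nodes` is expected to be a list of strings, such as the result of an
    XPath query ending in text(). `index` is assumed to be non-negative.
    """
    if index >= len(nodes):
        return default          # node not present at all
    text = nodes[index].strip()  # drops surrounding newlines and spaces
    return text if text else default


# Example with plain lists standing in for XPath results:
print(text_at(['\n 462\n']))   # whitespace around the number is removed
print(text_at(['\n']))         # blank text falls back to the placeholder
print(text_at([]))             # missing node falls back to the placeholder
```

With a helper like this, a line such as `like = text_at(info.xpath('div/div/span[1]/text()'))` would replace the strip-and-check pair for that field.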
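If I leave the like count out, the insert works. When I scraped the like count on its own, I found it contained newline characters, which I have also filtered out with a conditional. The results look like this: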
Likes: ['462', '21349', '無', '2885', '118', '60', '17', '436', '18', '4']
Comments: ['112', '12572', '20', '179', '17', '46', '23', '237', '10', '6']
Rewards: ['7', '121', '58', '8', '無', '2', '無', '2', '無', '1']
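The values themselves look clean, and the error message points at `like,reward)` rather than at the data. `LIKE` is a reserved keyword in MySQL, so a likely cause is that the column name `like` is used without quoting. MySQL accepts reserved words as identifiers when they are wrapped in backticks. Below is a minimal sketch, assuming the table really has columns named `like` and `reward`; `quote_identifier` and `build_insert` are hypothetical helpers written for illustration, not pymysql functions.

```python
def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


def build_insert(table, columns):
    """Build a parameterized INSERT statement with every identifier quoted.

    The values are still passed separately to cursor.execute(), so the
    %s placeholders are left for the driver to fill in.
    """
    cols = ", ".join(quote_identifier(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({})".format(
        quote_identifier(table), cols, placeholders
    )


sql = build_insert("jianshureping", ["like", "reward"])
print(sql)
# Assumed usage with the cursor from the question:
# cursor.execute(sql, (str(like), str(reward)))
```

Writing the statement by hand with backticks around `like` should have the same effect. Renaming the column to something that is not a keyword, for example `likes`, would avoid the quoting altogether.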