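I want to extract the src of an img tag, and also the text of the anchor tag inside the div with class data.
I managed to extract the img src, but I am having trouble extracting the text from the anchor tag.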
Nikon COOLPIX L26 16.1 MP Digital Camera with 5x Zoom NIKKOR Glass Lens and 3-inch LCD (Red)
Here is my code:
for div in soup.findAll('div', attrs={'class':'image'}):
    print "\n"
    for data in div.findNextSibling('div', attrs={'class':'data'}):
        for a in data.findAll('a', attrs={'class':'title'}):
            print a.text
    for img in div.findAll('img'):
        print img['src']
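What I want is to extract the image src (the link) and the title in the adjacent div with class data.
So, for example, given:
Nikon COOLPIX L26 16.1 MP Digital Camera with 5x Zoom NIKKOR Glass Lens and 3-inch LCD (Red)
I want to extract: Nikon COOLPIX L26 16.1 MP Digital Camera with 5x Zoom NIKKOR Glass Lens and 3-inch LCD (Red)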
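
One likely cause of the trouble is that findNextSibling returns a single tag rather than a list, so looping over its result walks that tag's children (including plain text nodes) instead of the div itself. Below is a minimal sketch of querying the sibling directly. It assumes Python 3 with the bs4 package (where the methods are spelled find_all and find_next_sibling) rather than the older BeautifulSoup used in the code above. The sample markup and the example.com image URL are hypothetical, since the real HTML is not shown here; only the class names image, data and title come from the code.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page is not shown, so this layout
# (div.image followed by a sibling div.data containing a.title) is assumed.
html = """
<div class="image"><a href="#"><img src="http://example.com/nikon-l26.jpg"></a></div>
<div class="data">
  <a class="title" href="#">Nikon COOLPIX L26 16.1 MP Digital Camera with 5x Zoom NIKKOR Glass Lens and 3-inch LCD (Red)</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

results = []
for div in soup.find_all("div", class_="image"):
    img = div.find("img")
    # find_next_sibling returns one Tag (or None), not a list,
    # so search inside it directly instead of looping over it.
    data = div.find_next_sibling("div", class_="data")
    title = data.find("a", class_="title") if data is not None else None
    if img is not None and title is not None:
        results.append((img["src"], title.get_text(strip=True)))

for src, text in results:
    print(src)
    print(text)
```

If the data div is not a direct sibling of the image div in the real page, this lookup would need adjusting (for example by searching from a shared parent), so treat it as a starting point rather than a drop-in fix.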