Web Crawler: Taobao

import bs4
import requests
import xlwt
import datetime
# Request headers: a desktop Chrome User-Agent so the request looks like a normal browser visit
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}
date = datetime.datetime.now().strftime('%Y-%m-%d')  # date stamp for the output file, so the data can be refreshed on each run
url = 'https://uland.taobao.com/sem/tbsearch?refpid=mm_26632360_8858797_29866178&keyword=%E4%B9%A6%E7%B1%8D&clk1=da382fc3cc2efef28fd1c72638a78aca&upsId=da382fc3cc2efef28fd1c72638a78aca&spm=a2e0b.20350158.search.1&pid=mm_26632360_8858797_29866178&union_lens=recoveryid%3A201_11.170.86.131_4623454_1620196433004%3Bprepvid%3A201_11.170.86.131_4623454_1620196433004'  # search page URL; keyword=%E4%B9%A6%E7%B1%8D is the URL-encoded keyword 书籍 ("books")
# payload = {'SearchText': 'taob', 'page': '1', 'ie': 'utf8', 'g': 'y'}  # URL parameters passed as a dict
resp = requests.get(url, headers=headers, timeout=10)
resp.encoding = 'utf-8'  # set the encoding before resp.text is read, otherwise the parsed text may be mis-decoded
print(resp.url)  # print the final URL that was requested
# print(resp.text)
print(resp.status_code)
soup = bs4.BeautifulSoup(resp.text, "html.parser")
title = []
# Titles
all_title = soup.find_all('span', class_="title-text")
print(all_title)
# store, price, paynum = [], [], []
# for span in all_title:
#     title.append(span.get_text(strip=True))
# print(title)
#
# # Store names
# all_store = soup.find_all('span', class_="shopNick")
# for span in all_store:
#     store.append(span.get_text(strip=True))
#
# # Prices
# all_price = soup.find_all('span', class_="pricedetail")
# for span in all_price:
#     price.append(span.strong.get_text(strip=True))
#
# # Sales volume
# all_paynum = soup.find_all('span', class_="payNum")
# for span in all_paynum:
#     paynum.append(span.get_text(strip=True))
#
# # Sanity check: all four lists should have the same length
# print(len(title))
# print(len(store))
# print(len(price))
# print(len(paynum))
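
The script builds a date stamp and imports xlwt, but it stops before anything is written to disk. The sketch below shows one way the last step could look: check that the four lists line up, then save them under a date-stamped file name. It is only an illustration under a few assumptions. The helper names (build_rows, dated_filename, save_rows) and the "taobao" file prefix are hypothetical, and the standard-library csv module is used in place of xlwt so the example has no third-party dependencies.

```python
import csv
import datetime


def build_rows(title, store, price, paynum):
    """Pair the four parallel lists into rows, failing if they are misaligned."""
    lengths = {len(title), len(store), len(price), len(paynum)}
    if len(lengths) != 1:
        # Unequal lengths usually mean one selector missed some items.
        raise ValueError("field lists differ in length")
    return list(zip(title, store, price, paynum))


def dated_filename(prefix="taobao", ext="csv", today=None):
    """Build a name like 'taobao_2021-05-05.csv' (prefix and ext are assumptions)."""
    today = today or datetime.date.today()
    return f"{prefix}_{today.strftime('%Y-%m-%d')}.{ext}"


def save_rows(rows, path):
    """Write a header line plus one line per item; utf-8-sig helps Excel detect UTF-8."""
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "store", "price", "paynum"])
        writer.writerows(rows)
```

Once the four lists are filled, calling save_rows(build_rows(title, store, price, paynum), dated_filename()) would write one file per day. Raising an error on a length mismatch, rather than only printing the lengths, keeps misaligned rows from being saved silently.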
