深入理解Scrapy


Scrapy是什么

An open source and collaborative framework for extracting the data you need from websites. In a fast, simple, yet extensible way.

Scrapy是适用于Python的一个快速、简单、功能强大的web爬虫框架,通常用于抓取web站点并从页面中提取结构化的数据,也可以用来做监控与自动化测试。架构图如下所示:

640?wx_fmt=png&tp=wxpic&wxfrom=5&wx_lazy=1&wx_co=1

Scrapy如何工作

理解工作原理更有益于后面的学习(也可先看后面的快速上手后再返回来看这里),Scrapy运行流程图如下所示:

640?wx_fmt=png&tp=wxpic&wxfrom=5&wx_lazy=1&wx_co=1

运行过程如下:

  1. 程序启动后将会创建一个/多个Spiders(爬虫)Spiders会将Requests(请求)经过SpiderMiddlewares(爬虫中间件)加工,再交给Engine(引擎)

  2. EngineSpiders传递过来多个请求转交给Scheduler(调度器),由调度器来安排请求。

  3. Scheduler将需要马上执行的请求交回给Engine

  4. Engine将请求经过DownloaderMiddlewares(下载器中间件)加工,再发送给Downloader(下载器)

  5. Downloader使用Requests完成页面/接口的下载,并生成Responses(响应), 将Responses经过DownloaderMiddlewares再转交给Engine

  6. EngineResponses经过SpiderMiddlewares交回给爬虫处理Responses

  7. Spiders处理Responses后产生的结果返回给Engine。(Spiders处理 Responses

  8. 步骤7 中Spiders处理结果返回的Requests对象将回到步骤2;返回的Items(数据结构化对象)或者dict(字典对象)将交给ItemPipelines(数据管道)处理。

  9. 通过定制ItemPipelines来控制数据如何持久化及处理。

开始使用Scrapy

1.安装Scrapy

通过如下命令安装Scrapy

pip install scrapy

Scrapy安装完成会提供一个scrapy工具, 通过命令scrapy --help显示如下则表示安装成功:

> scrapy --help
Scrapy 2.6.2 - no active projectUsage:scrapy <command> [options] [args]Available commands:bench         Run quick benchmark testcommandsfetch         Fetch a URL using the Scrapy downloadergenspider     Generate new spider using pre-defined templatesrunspider     Run a self-contained spider (without creating a project)settings      Get settings valuesshell         Interactive scraping consolestartproject  Create new projectversion       Print Scrapy versionview          Open URL in browser, as seen by Scrapy[ more ]      More commands available when run from project directoryUse "scrapy <command> -h" to see more info about a command

2.创建scrapy项目

通过命令scrapy startproject xxx创建一个Scrapy项目:

scrapy startproject MySpider

命令执行之后在当前目录下会生成一个MySpider的目录,目录结构如下所示:

MySpider/
├─scrapy.cfg
└─MySpider/├─items.py├─middlewares.py├─pipelines.py├─settings.py├─__init__.py└─spiders/└─__init__.py
  • items.py文件存放自定义的Items

  • middlewares.py 文件存放SpiderMiddlewaresDownloaderMiddlewares

  • pipelines.py 文件存放自定义的ItemPipelines

  • settings.py 文件存放全局的配置信息

  • spiders/ 目录存放所有Spiders

之后以第一个MySpider/目录作为项目根目录

3.创建Scrapy爬虫

创建Scrapy爬虫命令scrapy genspider [spidername] [allow_domain] 以慢慢买历史价格接口为例,创建慢慢买爬虫:

scrapy genspider manmanbuy manmanbuy.com

慢慢买历史价格爬取流程:

  1. 访问 http://tool.manmanbuy.com/HistoryLowest.aspx 页面获取隐藏的 <input id="ticket" ...> 标签的 value 值。

  2. 通过 步骤1 获取的 value 值, 加工生成请求头的 Authorization 参数

  3. 生成 请求参数 token 的值

  4. 调用 http://tool.manmanbuy.com/api.ashx 接口获取商品历史价格 (该接口依赖有效cookie, 如何获取有效cookie不是本文重点暂不说明)

此时在spiders/目录下就能找到生成的manmanbuy.py文件,文件内容如下:

import scrapyclass ManmanbuySpider(scrapy.Spider):name = 'manmanbuy'allowed_domains = ['manmanbuy.com']start_urls = ['http://manmanbuy.com/']def parse(self, response):pass

其中name为爬虫名称, allowed_domains 为允许访问的域名, start_urls 为启动爬取的地址。

Spider 支持两种启动爬取方式, 一种为便捷的配置start_urls方式, 启动后将直接爬取配置的url, 另一种为重写start_requests方法,返回自定义初始化的Request

4.Scrapy爬虫开发

4.1. 编写Spiders(MySpider/spiders/manmanbuy.py)

import scrapy
from urllib.parse import quote
import hashlib
import time
import copyclass ManmanbuySpider(scrapy.Spider):name = 'manmanbuy'allowed_domains = ['manmanbuy.com']def start_requests(self):# 以京东单个商品查询历史价格为例, 商品ID: 100011493273item_urls = ['https://item.jd.com/100011493273.html']# 定义请求头headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',}# 第一步先从h5页面获取ticket参数for item_url in item_urls:yield scrapy.Request(url='http://tool.manmanbuy.com/HistoryLowest.aspx?url=' + item_url,headers=headers,# 透传参数meta={'key': item_url})def parse(self, response: scrapy.http.Response):# 从页面中获取ticket值ticket = response.css('#ticket')[0].attrib['value']# 获取下一段接口请求参数req = parse_req({'key': response.meta['key'], 'method': 'getHistoryTrend'})# 请求头headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',# 计算auth'Authorization': parse_basic_auth(ticket),}return scrapy.FormRequest(url='http://tool.manmanbuy.com/api.ashx',method='POST',formdata=req,headers=headers,cookies=self.get_cookies(),# 自定义回调地址callback=self.parse_history_price)def get_cookies(self):# 省略获取cookie逻辑cks = '_ga=GA1.2.604426644.1596510819; ASP.NET_SessionId=bbyuxdftfkcf5mrijdgkmnc5; Hm_lvt_01a310dc95b71311522403c3237671ae=1658906329; Hm_lvt_85f48cee3e51cd48eaba80781b243db3=1658748396,1658906330; _gid=GA1.2.472137414.1658906330; _gat_gtag_UA_145348783_1=1; 60014_mmbuser=VQYJA1IFBTBSVwNdClFVUgdRUQcLUgdXDg1RBgNTAVVUVAZRBQFeAw%3d%3d; Hm_lpvt_85f48cee3e51cd48eaba80781b243db3=1658906625; Hm_lpvt_01a310dc95b71311522403c3237671ae=1658906625'cookies = {}for one in cks.split(';'):k, v = one.strip().split("=")cookies[k] = vreturn cookiesdef parse_history_price(self, response: scrapy.http.Response):# 输出相应self.logger.info(response.text)def parse_basic_auth(ticket):"""这是解析ticket的值啊,就是上面说的那逻辑"""return 'BasicAuth ' + ticket[:160][-4:] + ticket[:160 - 4]def parse_req(d):"""这是解析请求,增加t和token参数"""d['t'] = str(int(time.time() * 1000))n = copy.deepcopy(d)ks = list(n.keys())ks.sort()ask = 'c5c3f201a8e8fc634d37a766a0299218'mask = askfor k in ks:mask += f'{k}{quote(str(n[k])).replace("/", "%2F")}'mask += askmask = mask.upper()md5 = hashlib.md5()md5.update(mask.encode('utf-8'))d['token'] = md5.hexdigest().upper()return d

4.2. 修改Settings(MySpider/settings.py)

# robots.txt 文件检查, 默认为: true, 需要改为Flase
ROBOTSTXT_OBEY = False

运行scrapy crawl manmanbuy命令启动爬虫, 观察日志能够正常获取数据:

{"msg":"","code":0,"data":{"haveTrend":1,"changPriceRemark":"降幅5%","runtime":41,"zouShi_test":2,"changePriceCount":14,"spbh":"1|100011493273","spUrl":"https://item.jd.com/100011493273.html","spPic":"http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg","currentPrice":1049.00,"spName":"荣耀Play5T 22.5W超级快充 5000mAh大电池 6.5英寸护眼屏 全网通8GB+128GB极光蓝","lowerDate":"2022-03-08T00:00:00+08:00","lowerPrice":899.00,"bjid":551120462,"zouShi":2,"siteId":1,"siteName":"京东商城","datePrice":"[1621353600000,1199.00,\"\"],[1621440000000,1199.00,\"\"],[1621526400000,1199.00,\"\"],[1621612800000,1199.00,\"\"],[1621699200000,1199.00,\"\"],[1621785600000,1199.00,\"\"],[1621872000000,1199.00,\"\"],[1621958400000,1199.00,\"\"],[1622044800000,1199.00,\"\"],[1622131200000,1199.00,\"\"],[1622217600000,1199.00,\"\"],[1622304000000,1199.00,\"\"],[1622390400000,1199.00,\"\"],[1622476800000,1199.00,\"\"],[1622563200000,1199.00,\"\"],[1622649600000,1199.00,\"\"],[1622736000000,1199.00,\"\"],[1622822400000,1199.00,\"\"],[1622908800000,1199.00,\"\"],[1622995200000,1199.00,\"\"],[1623081600000,1199.00,\"\"],[1623168000000,1199.00,\"\"],[1623254400000,1199.00,\"\"],[1623340800000,1199.00,\"1199元\"],[1623427200000,1199.00,\"\"],[1623513600000,1199.00,\"\"],[1623600000000,1199.00,\"\"],[1623686400000,1199.00,\"\"],[1623772800000,1199.00,\"\"],[1623859200000,1199.00,\"\"],[1623945600000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1624032000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624118400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624204800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624291200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624377600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624464000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624550400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624636800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624723200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624809600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624896000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624982400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625068800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625155200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625241600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625328000000,1199.00,\"\"],[1625414400000,1199.00,\"\"],[1625500800000,1189.0,\"京东秒杀价:1189\"],[1625587200000,1199.00,\"\"],[1625673600000,1189.0,\"\"],[1625760000000,1199.0,\"\"],[1625846400000,1189.0,\"\"],[1625932800000,1199.0000,\"\"],[1626019200000,1199.0000,\"\"],[1626105600000,1189.0,\"\"],[1626192000000,1199.0,\"\"],[1626278400000,1189.0,\"\"],[1626364800000,1199.0000,\"\"],[1626451200000,1199.0000,\"\"],[1626537600000,1199.0000,\"\"],[1626624000000,1199.0000,\"\"],[1626710400000,1199.0000,\"\"],[1626796800000,1199.0000,\"\"],[1626883200000,1189.00,\"\"],[1626969600000,1199.0000,\"\"],[1627056000000,1199.0000,\"\"],[1627142400000,1199.0000,\"\"],[1627228800000,1199.0000,\"\"],[1627315200000,1189.00,\"\"],[1627401600000,1199.0000,\"\"],[1627488000000,1189.00,\"1189元\"],[1627574400000,1189.00,\"\"],[1627660800000,1199.00,\"\"],[1627747200000,1179.00,\"1179元\"],[1627833600000,1189.0000,\"\"],[1627920000000,1199.00,\"\"],[1628006400000,1199.00,\"\"],[1628092800000,1189.00,\"\"],[1628179200000,1189.0000,\"\"],[1628265600000,1189.0000,\"\"],[1628352000000,1189.0000,\"\"],[1628438400000,1199.00,\"\"],[1628524800000,1189.00,\"\"],[1628611200000,1199.0,\"\"],[1628697600000,1189.0000,\"\"],[1628784000000,1189.0000,\"\"],[1628870400000,1189.0000,\"\"],[1628956800000,1199.00,\"1199元\"],[1629043200000,1189.0000,\"\"],[1629129600000,1199.00,\"\"],[1629216000000,1189.00,\"\"],[1629302400000,1199.0000,\"\"],[1629388800000,1169.0,\"京东秒杀价:1169\"],[1629475200000,1199.00,\"\"],[1629561600000,1199.00,\"\"],[1629648000000,1199.00,\"\"],[1629734400000,1169.00,\"\"],[1629820800000,1199.0,\"\"],[1629907200000,1189.0,\"京东秒杀价:1189\"],[1629993600000,1199.00,\"\"],[1630080000000,1199.00,\"\"],[1630166400000,1199.00,\"\"],[1630252800000,1199.00,\"\"],[1630339200000,1189.00,\"1189元包邮\"],[1630425600000,1189.00,\"\"],[1630512000000,1175.00,\"购买1件,plus价格1175\"],[1630598400000,1189.00,\"\"],[1630684800000,1199.0,\"\"],[1630771200000,1189.0000,\"\"],[1630857600000,1199.0,\"\"],[1630944000000,1189.0000,\"\"],[1631030400000,1199.0,\"\"],[1631116800000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1631203200000,1189.0000,\"\"],[1631289600000,1189.00,\"\"],[1631376000000,1199.00,\"\"],[1631462400000,1189.0000,\"\"],[1631548800000,1189.0,\"\"],[1631635200000,1199.0,\"\"],[1631721600000,1189.00,\"\"],[1631808000000,1199.00,\"\"],[1631894400000,1189.0,\"\"],[1631980800000,1199.0,\"\"],[1632067200000,1169.00,\"\"],[1632153600000,1169.00,\"\"],[1632240000000,1169.00,\"\"],[1632326400000,1189.0,\"\"],[1632412800000,1169.0,\"\"],[1632499200000,1199.0,\"\"],[1632585600000,1169.00,\"\"],[1632672000000,1199.00,\"\"],[1632758400000,1169.0,\"\"],[1632844800000,1199.0000,\"\"],[1632931200000,1169.0,\"\"],[1633017600000,1169.0,\"\"],[1633104000000,1169.0,\"\"],[1633190400000,1169.00,\"\"],[1633276800000,1169.0000,\"\"],[1633363200000,1199.00,\"\"],[1633449600000,1199.00,\"\"],[1633536000000,1169.0,\"\"],[1633622400000,1169.0,\"\"],[1633708800000,1169.0,\"京东秒杀价:1169\"],[1633795200000,1169.0,\"\"],[1633881600000,1199.00,\"\"],[1633968000000,1169.00,\"\"],[1634054400000,1199.0,\"\"],[1634140800000,1169.00,\"\"],[1634227200000,1199.0,\"\"],[1634313600000,1199.0,\"\"],[1634400000000,1169.0,\"\"],[1634486400000,1199.0,\"\"],[1634572800000,1169.00,\"\"],[1634659200000,1199.0000,\"\"],[1634745600000,1169.00,\"\"],[1634832000000,1199.0,\"\"],[1634918400000,1199.0,\"\"],[1635004800000,1199.0,\"\"],[1635091200000,1199.0,\"\"],[1635177600000,1199.0,\"\"],[1635264000000,1199.0,\"\"],[1635350400000,1199.0,\"\"],[1635436800000,1199.0,\"\"],[1635523200000,1099.00,\"1099元 \"],[1635609600000,1099.0,\"\"],[1635696000000,1099.00,\"\"],[1635782400000,1099.00,\"\"],[1635868800000,1099.00,\"\"],[1635955200000,1099.00,\"购买1件,plus价格1099\"],[1636041600000,949.00,\"购买1件,当前价:1099.00,可叠加优惠券2:满880减150\"],[1636128000000,1099.00,\"\"],[1636214400000,1199.0,\"\"],[1636300800000,1099.0,\"\"],[1636387200000,1099.0,\"\"],[1636473600000,1099.0,\"\"],[1636560000000,979.00,\"购买1件,当前价:1099.00,满减:每满1080减120\"],[1636646400000,1099.0000,\"\"],[1636732800000,1199.00,\"\"],[1636819200000,1199.00,\"\"],[1636905600000,1199.00,\"\"],[1636992000000,1199.00,\"\"],[1637078400000,1099.00,\"\"],[1637164800000,1099.00,\"\"],[1637251200000,1099.00,\"\"],[1637337600000,1099.00,\"\"],[1637424000000,1099.00,\"1099元\"],[1637510400000,1099.00,\"\"],[1637596800000,1099.0,\"\"],[1637683200000,1099.00,\"\"],[1637769600000,1099.00,\"\"],[1637856000000,1099.00,\"\"],[1637942400000,1099.00,\"\"],[1638028800000,1199.0000,\"\"],[1638115200000,1199.00,\"\"],[1638201600000,1199.00,\"\"],[1638288000000,1099.0,\"\"],[1638374400000,1099.00,\"\"],[1638460800000,1099.00,\"\"],[1638547200000,1199.0,\"\"],[1638633600000,1199.00,\"\"],[1638720000000,1099.0,\"\"],[1638806400000,1099.00,\"\"],[1638892800000,1099.00,\"\"],[1638979200000,1099.00,\"1099元\"],[1639065600000,1099.00,\"\"],[1639152000000,1089.0,\"\"],[1639238400000,1089.0,\"\"],[1639324800000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639411200000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639497600000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639584000000,1099.00,\"\"],[1639670400000,1099.0,\"\"],[1639756800000,1099.0,\"\"],[1639843200000,1099.0,\"\"],[1639929600000,1099.00,\"1099元\"],[1640016000000,1099.0,\"\"],[1640102400000,1099.0,\"\"],[1640188800000,1099.0,\"\"],[1640275200000,1099.00,\"1099元\"],[1640361600000,1099.0,\"\"],[1640448000000,1069.0,\"\"],[1640534400000,1099.0,\"\"],[1640620800000,1099.0,\"\"],[1640707200000,1099.0,\"\"],[1640793600000,1099.0,\"\"],[1640880000000,1099.0,\"\"],[1640966400000,1099.00,\"1099元\"],[1641052800000,1099.0,\"\"],[1641139200000,1099.0,\"\"],[1641225600000,1099.0,\"\"],[1641312000000,1099.0,\"\"],[1641398400000,1099.0,\"\"],[1641484800000,1099.0,\"\"],[1641571200000,1099.0,\"\"],[1641657600000,1099.00,\"1099元\"],[1641744000000,1099.0,\"\"],[1641830400000,1099.0,\"\"],[1641916800000,1099.0,\"\"],[1642003200000,1099.0,\"\"],[1642089600000,1099.0,\"\"],[1642176000000,1099.0,\"\"],[1642262400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642348800000,1099.00,\"\"],[1642435200000,1099.00,\"\"],[1642521600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642608000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642694400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642780800000,1099.0,\"\"],[1642867200000,1099.0,\"\"],[1642953600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643040000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643126400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643212800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643299200000,1099.0,\"\"],[1643385600000,1099.0,\"\"],[1643472000000,1099.0,\"\"],[1643558400000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643644800000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643731200000,1099.0,\"\"],[1643817600000,1099.0,\"\"],[1643904000000,1099.0,\"\"],[1643990400000,1099.0,\"\"],[1644076800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644163200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644249600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644336000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644422400000,1099.00,\"\"],[1644508800000,1099.00,\"1099元\"],[1644595200000,1199.00,\"1199元\"],[1644681600000,1199.00,\"1089元\"],[1644768000000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644854400000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644940800000,1099.00,\"1049元\"],[1645027200000,1099.00,\"\"],[1645113600000,1099.00,\"\"],[1645200000000,1099.00,\"\"],[1645286400000,1099.00,\"\"],[1645372800000,1099.00,\"\"],[1645459200000,1099.00,\"\"],[1645545600000,1099.00,\"\"],[1645632000000,1099.00,\"\"],[1645718400000,1099.00,\"1099元\"],[1645804800000,1099.00,\"1069元\"],[1645891200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1645977600000,1099.00,\"\"],[1646064000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646150400000,1099.00,\"\"],[1646236800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646323200000,1099.00,\"\"],[1646409600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646496000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646582400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646668800000,899.00,\"购买1件,当前价:1099.00,满减:每满1080减200\"],[1646755200000,1099.00,\"\"],[1646841600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646928000000,1099.0,\"\"],[1647014400000,1099.0,\"\"],[1647100800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647187200000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647273600000,1099.00,\"\"],[1647360000000,1099.00,\"1099元\"],[1647446400000,1049.0,\"购买1件,当前价:1099.0,满减:满1050减50\"],[1647532800000,1099.0,\"\"],[1647619200000,1099.00,\"\"],[1647705600000,1099.00,\"\"],[1647792000000,1099.00,\"\"],[1647878400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647964800000,1099.00,\"\"],[1648051200000,1099.00,\"\"],[1648137600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648224000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648310400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648396800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648483200000,1039.00,\"购买1件,当前价:1089.00,满减:满1050减50\"],[1648569600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648656000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648742400000,1049.00,\"\"],[1648828800000,1049.00,\"\"],[1648915200000,1049.00,\"1049元\"],[1649001600000,1099.00,\"\"],[1649088000000,1049.00,\"\"],[1649174400000,1049.00,\"\"],[1649260800000,1049.00,\"\"],[1649347200000,1049.00,\"\"],[1649433600000,1099.00,\"\"],[1649520000000,1099.00,\"\"],[1649606400000,1099.00,\"\"],[1649692800000,1049.0,\"购买1件,当前价格1049\"],[1649779200000,1099.00,\"\"],[1649865600000,1099.00,\"\"],[1649952000000,1049.00,\"1049元\"],[1650038400000,1099.00,\"\"],[1650124800000,1049.00,\"\"],[1650211200000,1049.00,\"\"],[1650297600000,1049.00,\"\"],[1650384000000,1049.00,\"\"],[1650470400000,1099.00,\"1099元\"],[1650556800000,1099.00,\"\"],[1650643200000,1099.00,\"\"],[1650729600000,1099.00,\"\"],[1650816000000,1099.00,\"\"],[1650902400000,1099.00,\"\"],[1650988800000,1099.00,\"\"],[1651075200000,1099.00,\"1099元\"],[1651161600000,1099.00,\"\"],[1651248000000,1099.00,\"\"],[1651334400000,1049.00,\"\"],[1651420800000,1099.00,\"\"],[1651507200000,1099.0000,\"\"],[1651593600000,1099.0000,\"\"],[1651680000000,1099.00,\"1099元\"],[1651766400000,1089.0,\"购买1件,当前价格1089\"],[1651852800000,1049.00,\"\"],[1651939200000,1089.00,\"\"],[1652025600000,1099.00,\"1099元\"],[1652112000000,1089.00,\"\"],[1652198400000,1049.00,\"\"],[1652284800000,1089.00,\"\"],[1652371200000,1049.00,\"\"],[1652457600000,1099.00,\"\"],[1652544000000,1099.00,\"\"],[1652630400000,1099.00,\"\"],[1652716800000,1099.00,\"1099元\"],[1652803200000,1099.00,\"\"],[1652889600000,1099.00,\"\"],[1652976000000,1049.00,\"\"],[1653062400000,1099.00,\"\"],[1653148800000,1099.00,\"\"],[1653235200000,1099.00,\"1099元\"],[1653321600000,1099.00,\"\"],[1653408000000,1099.00,\"\"],[1653494400000,1099.00,\"\"],[1653580800000,1099.00,\"\"],[1653667200000,1099.00,\"\"],[1653753600000,1099.00,\"\"],[1653840000000,1099.00,\"\"],[1653926400000,1049.0,\"\"],[1654012800000,1049.00,\"\"],[1654099200000,1049.00,\"\"],[1654185600000,1049.00,\"\"],[1654272000000,1049.00,\"\"],[1654358400000,1049.00,\"\"],[1654444800000,1049.00,\"\"],[1654531200000,1049.00,\"\"],[1654617600000,1049.00,\"\"],[1654704000000,1049.00,\"\"],[1654790400000,1049.00,\"\"],[1654876800000,1049.00,\"\"],[1654963200000,1049.00,\"\"],[1655049600000,1049.00,\"\"],[1655136000000,1049.00,\"\"],[1655222400000,1049.00,\"\"],[1655308800000,1049.00,\"1049元\"],[1655395200000,1049.00,\"\"],[1655481600000,1049.00,\"\"],[1655568000000,1049.00,\"\"],[1655654400000,1049.00,\"\"],[1655740800000,1049.00,\"\"],[1655827200000,1049.00,\"1049元\"],[1655913600000,1049.00,\"\"],[1656000000000,1049.00,\"\"],[1656086400000,1049.00,\"\"],[1656172800000,1049.00,\"1049元\"],[1656259200000,1049.00,\"\"],[1656345600000,1049.00,\"\"],[1656432000000,1049.00,\"\"],[1656518400000,1049.00,\"\"],[1656604800000,1049.00,\"1049元\"],[1656691200000,1049.00,\"\"],[1656777600000,1049.00,\"\"],[1656864000000,1049.00,\"1049元\"],[1656950400000,1049.00,\"\"],[1657036800000,1049.00,\"\"],[1657123200000,1049.00,\"\"],[1657209600000,1049.00,\"\"],[1657296000000,1049.00,\"\"],[1657382400000,1049.00,\"\"],[1657468800000,1049.00,\"\"],[1657555200000,1049.00,\"\"],[1657641600000,1049.00,\"\"],[1657728000000,1049.00,\"\"],[1657814400000,1049.00,\"\"],[1657900800000,1049.00,\"1049元\"],[1657987200000,1049.00,\"\"],[1658073600000,1049.00,\"\"],[1658160000000,1049.00,\"\"],[1658246400000,1049.00,\"\"],[1658332800000,1049.00,\"1049元\"],[1658419200000,1049.00,\"\"],[1658505600000,1049.00,\"\"],[1658592000000,1049.00,\"\"],[1658678400000,1049.00,\"\"],[1658764800000,1049.00,\"\"],[1658851200000,1049.00,\"\"]","ZheKouCount":95},"count":0}

5.Scrapy数据持久化开发

5.1. 编写Items(NySpider/items.py)

import scrapyclass HistoryPriceItem(scrapy.Item):"""自定义历史价格存储Item"""# 商品URLitemUrl = scrapy.Field()# 图片URLpicUrl = scrapy.Field()# 历史价格信息detailPrice = scrapy.Field()

5.2. 编写ItemPipelines(MySpider/pipelines.py), 以文件存储为例:

import scrapy.crawler
from itemadapter import ItemAdapter
from scrapy import signalsclass FilePipeline:def __init__(self, filename='store.txt'):self.filename = filenamedef process_item(self, item, spider):# 使用适配器包装item, 防止直接对item进行修改/删除影响后续Pipelineadapter = ItemAdapter(item)# 写入文件self.fp.write(adapter.get('itemUrl') + "    " + adapter.get('picUrl') + "    " + adapter.get('detailPrice') + '\n')return item@classmethoddef from_crawler(cls, crawler:scrapy.crawler.Crawler):s = cls()# 通过信号绑定行为# 爬虫启动时创建文件fpcrawler.signals.connect(s.opened, signal=signals.spider_opened)# 爬虫停止时关闭文件fpcrawler.signals.connect(s.closed, signal=signals.spider_closed)return sdef closed(self, spider):self.fp.close()def opened(self, spider):self.fp = open(self.filename, 'w', encoding='utf-8')self.fp.write('商品URL    主图URL    历史价格信息\n')

5.3. 修改Spiders(MySpider/spiders/manmanbuy.py)

import scrapy
import json
from MySpider.items import HistoryPriceItemclass ManmanbuySpider(scrapy.Spider):# 省略未修改内容custom_settings = {# 配置使用的Item管道'ITEM_PIPELINES': {'MySpider.pipelines.FilePipeline': 300,}}def parse_history_price(self, response: scrapy.http.Response):# 解析价格响应self.logger.info(response.text)data = json.loads(response.text)# 返回Itemreturn HistoryPriceItem(itemUrl=data['data']['spUrl'], picUrl=data['data']['spPic'], detailPrice=data['data']['datePrice'])

这里说明一下scrapy有5种添加配置方式,常用的有3种,高优先级配置会覆盖低优先级相同的Key的配置,不同的Key的配置则组合起来,按优先级从高到底分别是:

  1. 命令行配置

  2. 爬虫配置

  3. 项目全局配置

Spiders中的custom_settings参数就是爬虫配置

5.4. 运行scrapy crawl manmanbuy命令启动爬虫,观察当前目录发现生成一个store.txt文件,文件内容如下:

商品URL    主图URL    历史价格信息
https://item.jd.com/100011493273.html    http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg    [1621353600000,1199.00,""],[1621440000000,1199.00,""],[1621526400000,1199.00,""],[1621612800000,1199.00,""],[1621699200000,1199.00,""],[1621785600000,1199.00,""],[1621872000000,1199.00,""],[1621958400000,1199.00,""],[1622044800000,1199.00,""],[1622131200000,1199.00,""],[1622217600000,1199.00,""],[1622304000000,1199.00,""],[1622390400000,1199.00,""],[1622476800000,1199.00,""],[1622563200000,1199.00,""],[1622649600000,1199.00,""],[1622736000000,1199.00,""],[1622822400000,1199.00,""],[1622908800000,1199.00,""],[1622995200000,1199.00,""],[1623081600000,1199.00,""],[1623168000000,1199.00,""],[1623254400000,1199.00,""],[1623340800000,1199.00,"1199元"],[1623427200000,1199.00,""],[1623513600000,1199.00,""],[1623600000000,1199.00,""],[1623686400000,1199.00,""],[1623772800000,1199.00,""],[1623859200000,1199.00,""],[1623945600000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1624032000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624118400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624204800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624291200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624377600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624464000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624550400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624636800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624723200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624809600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624896000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624982400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625068800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625155200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625241600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625328000000,1199.00,""],[1625414400000,1199.00,""],[1625500800000,1189.0,"京东秒杀价:1189"],[1625587200000,1199.00,""],[1625673600000,1189.0,""],[1625760000000,1199.0,""],[1625846400000,1189.0,""],[1625932800000,1199.0000,""],[1626019200000,1199.0000,""],[1626105600000,1189.0,""],[1626192000000,1199.0,""],[1626278400000,1189.0,""],[1626364800000,1199.0000,""],[1626451200000,1199.0000,""],[1626537600000,1199.0000,""],[1626624000000,1199.0000,""],[1626710400000,1199.0000,""],[1626796800000,1199.0000,""],[1626883200000,1189.00,""],[1626969600000,1199.0000,""],[1627056000000,1199.0000,""],[1627142400000,1199.0000,""],[1627228800000,1199.0000,""],[1627315200000,1189.00,""],[1627401600000,1199.0000,""],[1627488000000,1189.00,"1189元"],[1627574400000,1189.00,""],[1627660800000,1199.00,""],[1627747200000,1179.00,"1179元"],[1627833600000,1189.0000,""],[1627920000000,1199.00,""],[1628006400000,1199.00,""],[1628092800000,1189.00,""],[1628179200000,1189.0000,""],[1628265600000,1189.0000,""],[1628352000000,1189.0000,""],[1628438400000,1199.00,""],[1628524800000,1189.00,""],[1628611200000,1199.0,""],[1628697600000,1189.0000,""],[1628784000000,1189.0000,""],[1628870400000,1189.0000,""],[1628956800000,1199.00,"1199元"],[1629043200000,1189.0000,""],[1629129600000,1199.00,""],[1629216000000,1189.00,""],[1629302400000,1199.0000,""],[1629388800000,1169.0,"京东秒杀价:1169"],[1629475200000,1199.00,""],[1629561600000,1199.00,""],[1629648000000,1199.00,""],[1629734400000,1169.00,""],[1629820800000,1199.0,""],[1629907200000,1189.0,"京东秒杀价:1189"],[1629993600000,1199.00,""],[1630080000000,1199.00,""],[1630166400000,1199.00,""],[1630252800000,1199.00,""],[1630339200000,1189.00,"1189元包邮"],[1630425600000,1189.00,""],[1630512000000,1175.00,"购买1件,plus价格1175"],[1630598400000,1189.00,""],[1630684800000,1199.0,""],[1630771200000,1189.0000,""],[1630857600000,1199.0,""],[1630944000000,1189.0000,""],[1631030400000,1199.0,""],[1631116800000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1631203200000,1189.0000,""],[1631289600000,1189.00,""],[1631376000000,1199.00,""],[1631462400000,1189.0000,""],[1631548800000,1189.0,""],[1631635200000,1199.0,""],[1631721600000,1189.00,""],[1631808000000,1199.00,""],[1631894400000,1189.0,""],[1631980800000,1199.0,""],[1632067200000,1169.00,""],[1632153600000,1169.00,""],[1632240000000,1169.00,""],[1632326400000,1189.0,""],[1632412800000,1169.0,""],[1632499200000,1199.0,""],[1632585600000,1169.00,""],[1632672000000,1199.00,""],[1632758400000,1169.0,""],[1632844800000,1199.0000,""],[1632931200000,1169.0,""],[1633017600000,1169.0,""],[1633104000000,1169.0,""],[1633190400000,1169.00,""],[1633276800000,1169.0000,""],[1633363200000,1199.00,""],[1633449600000,1199.00,""],[1633536000000,1169.0,""],[1633622400000,1169.0,""],[1633708800000,1169.0,"京东秒杀价:1169"],[1633795200000,1169.0,""],[1633881600000,1199.00,""],[1633968000000,1169.00,""],[1634054400000,1199.0,""],[1634140800000,1169.00,""],[1634227200000,1199.0,""],[1634313600000,1199.0,""],[1634400000000,1169.0,""],[1634486400000,1199.0,""],[1634572800000,1169.00,""],[1634659200000,1199.0000,""],[1634745600000,1169.00,""],[1634832000000,1199.0,""],[1634918400000,1199.0,""],[1635004800000,1199.0,""],[1635091200000,1199.0,""],[1635177600000,1199.0,""],[1635264000000,1199.0,""],[1635350400000,1199.0,""],[1635436800000,1199.0,""],[1635523200000,1099.00,"1099元 "],[1635609600000,1099.0,""],[1635696000000,1099.00,""],[1635782400000,1099.00,""],[1635868800000,1099.00,""],[1635955200000,1099.00,"购买1件,plus价格1099"],[1636041600000,949.00,"购买1件,当前价:1099.00,可叠加优惠券2:满880减150"],[1636128000000,1099.00,""],[1636214400000,1199.0,""],[1636300800000,1099.0,""],[1636387200000,1099.0,""],[1636473600000,1099.0,""],[1636560000000,979.00,"购买1件,当前价:1099.00,满减:每满1080减120"],[1636646400000,1099.0000,""],[1636732800000,1199.00,""],[1636819200000,1199.00,""],[1636905600000,1199.00,""],[1636992000000,1199.00,""],[1637078400000,1099.00,""],[1637164800000,1099.00,""],[1637251200000,1099.00,""],[1637337600000,1099.00,""],[1637424000000,1099.00,"1099元"],[1637510400000,1099.00,""],[1637596800000,1099.0,""],[1637683200000,1099.00,""],[1637769600000,1099.00,""],[1637856000000,1099.00,""],[1637942400000,1099.00,""],[1638028800000,1199.0000,""],[1638115200000,1199.00,""],[1638201600000,1199.00,""],[1638288000000,1099.0,""],[1638374400000,1099.00,""],[1638460800000,1099.00,""],[1638547200000,1199.0,""],[1638633600000,1199.00,""],[1638720000000,1099.0,""],[1638806400000,1099.00,""],[1638892800000,1099.00,""],[1638979200000,1099.00,"1099元"],[1639065600000,1099.00,""],[1639152000000,1089.0,""],[1639238400000,1089.0,""],[1639324800000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639411200000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639497600000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639584000000,1099.00,""],[1639670400000,1099.0,""],[1639756800000,1099.0,""],[1639843200000,1099.0,""],[1639929600000,1099.00,"1099元"],[1640016000000,1099.0,""],[1640102400000,1099.0,""],[1640188800000,1099.0,""],[1640275200000,1099.00,"1099元"],[1640361600000,1099.0,""],[1640448000000,1069.0,""],[1640534400000,1099.0,""],[1640620800000,1099.0,""],[1640707200000,1099.0,""],[1640793600000,1099.0,""],[1640880000000,1099.0,""],[1640966400000,1099.00,"1099元"],[1641052800000,1099.0,""],[1641139200000,1099.0,""],[1641225600000,1099.0,""],[1641312000000,1099.0,""],[1641398400000,1099.0,""],[1641484800000,1099.0,""],[1641571200000,1099.0,""],[1641657600000,1099.00,"1099元"],[1641744000000,1099.0,""],[1641830400000,1099.0,""],[1641916800000,1099.0,""],[1642003200000,1099.0,""],[1642089600000,1099.0,""],[1642176000000,1099.0,""],[1642262400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642348800000,1099.00,""],[1642435200000,1099.00,""],[1642521600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642608000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642694400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642780800000,1099.0,""],[1642867200000,1099.0,""],[1642953600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643040000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643126400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643212800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643299200000,1099.0,""],[1643385600000,1099.0,""],[1643472000000,1099.0,""],[1643558400000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643644800000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643731200000,1099.0,""],[1643817600000,1099.0,""],[1643904000000,1099.0,""],[1643990400000,1099.0,""],[1644076800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644163200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644249600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644336000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644422400000,1099.00,""],[1644508800000,1099.00,"1099元"],[1644595200000,1199.00,"1199元"],[1644681600000,1199.00,"1089元"],[1644768000000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644854400000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644940800000,1099.00,"1049元"],[1645027200000,1099.00,""],[1645113600000,1099.00,""],[1645200000000,1099.00,""],[1645286400000,1099.00,""],[1645372800000,1099.00,""],[1645459200000,1099.00,""],[1645545600000,1099.00,""],[1645632000000,1099.00,""],[1645718400000,1099.00,"1099元"],[1645804800000,1099.00,"1069元"],[1645891200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1645977600000,1099.00,""],[1646064000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646150400000,1099.00,""],[1646236800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646323200000,1099.00,""],[1646409600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646496000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646582400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646668800000,899.00,"购买1件,当前价:1099.00,满减:每满1080减200"],[1646755200000,1099.00,""],[1646841600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646928000000,1099.0,""],[1647014400000,1099.0,""],[1647100800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647187200000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647273600000,1099.00,""],[1647360000000,1099.00,"1099元"],[1647446400000,1049.0,"购买1件,当前价:1099.0,满减:满1050减50"],[1647532800000,1099.0,""],[1647619200000,1099.00,""],[1647705600000,1099.00,""],[1647792000000,1099.00,""],[1647878400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647964800000,1099.00,""],[1648051200000,1099.00,""],[1648137600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648224000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648310400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648396800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648483200000,1039.00,"购买1件,当前价:1089.00,满减:满1050减50"],[1648569600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648656000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648742400000,1049.00,""],[1648828800000,1049.00,""],[1648915200000,1049.00,"1049元"],[1649001600000,1099.00,""],[1649088000000,1049.00,""],[1649174400000,1049.00,""],[1649260800000,1049.00,""],[1649347200000,1049.00,""],[1649433600000,1099.00,""],[1649520000000,1099.00,""],[1649606400000,1099.00,""],[1649692800000,1049.0,"购买1件,当前价格1049"],[1649779200000,1099.00,""],[1649865600000,1099.00,""],[1649952000000,1049.00,"1049元"],[1650038400000,1099.00,""],[1650124800000,1049.00,""],[1650211200000,1049.00,""],[1650297600000,1049.00,""],[1650384000000,1049.00,""],[1650470400000,1099.00,"1099元"],[1650556800000,1099.00,""],[1650643200000,1099.00,""],[1650729600000,1099.00,""],[1650816000000,1099.00,""],[1650902400000,1099.00,""],[1650988800000,1099.00,""],[1651075200000,1099.00,"1099元"],[1651161600000,1099.00,""],[1651248000000,1099.00,""],[1651334400000,1049.00,""],[1651420800000,1099.00,""],[1651507200000,1099.0000,""],[1651593600000,1099.0000,""],[1651680000000,1099.00,"1099元"],[1651766400000,1089.0,"购买1件,当前价格1089"],[1651852800000,1049.00,""],[1651939200000,1089.00,""],[1652025600000,1099.00,"1099元"],[1652112000000,1089.00,""],[1652198400000,1049.00,""],[1652284800000,1089.00,""],[1652371200000,1049.00,""],[1652457600000,1099.00,""],[1652544000000,1099.00,""],[1652630400000,1099.00,""],[1652716800000,1099.00,"1099元"],[1652803200000,1099.00,""],[1652889600000,1099.00,""],[1652976000000,1049.00,""],[1653062400000,1099.00,""],[1653148800000,1099.00,""],[1653235200000,1099.00,"1099元"],[1653321600000,1099.00,""],[1653408000000,1099.00,""],[1653494400000,1099.00,""],[1653580800000,1099.00,""],[1653667200000,1099.00,""],[1653753600000,1099.00,""],[1653840000000,1099.00,""],[1653926400000,1049.0,""],[1654012800000,1049.00,""],[1654099200000,1049.00,""],[1654185600000,1049.00,""],[1654272000000,1049.00,""],[1654358400000,1049.00,""],[1654444800000,1049.00,""],[1654531200000,1049.00,""],[1654617600000,1049.00,""],[1654704000000,1049.00,""],[1654790400000,1049.00,""],[1654876800000,1049.00,""],[1654963200000,1049.00,""],[1655049600000,1049.00,""],[1655136000000,1049.00,""],[1655222400000,1049.00,""],[1655308800000,1049.00,"1049元"],[1655395200000,1049.00,""],[1655481600000,1049.00,""],[1655568000000,1049.00,""],[1655654400000,1049.00,""],[1655740800000,1049.00,""],[1655827200000,1049.00,"1049元"],[1655913600000,1049.00,""],[1656000000000,1049.00,""],[1656086400000,1049.00,""],[1656172800000,1049.00,"1049元"],[1656259200000,1049.00,""],[1656345600000,1049.00,""],[1656432000000,1049.00,""],[1656518400000,1049.00,""],[1656604800000,1049.00,"1049元"],[1656691200000,1049.00,""],[1656777600000,1049.00,""],[1656864000000,1049.00,"1049元"],[1656950400000,1049.00,""],[1657036800000,1049.00,""],[1657123200000,1049.00,""],[1657209600000,1049.00,""],[1657296000000,1049.00,""],[1657382400000,1049.00,""],[1657468800000,1049.00,""],[1657555200000,1049.00,""],[1657641600000,1049.00,""],[1657728000000,1049.00,""],[1657814400000,1049.00,""],[1657900800000,1049.00,"1049元"],[1657987200000,1049.00,""],[1658073600000,1049.00,""],[1658160000000,1049.00,""],[1658246400000,1049.00,""],[1658332800000,1049.00,"1049元"],[1658419200000,1049.00,""],[1658505600000,1049.00,""],[1658592000000,1049.00,""],[1658678400000,1049.00,""],[1658764800000,1049.00,""],[1658851200000,1049.00,""]

则说明程序执行正常

PS: 若想要将数据持久化至Mysql/MongoDB/Elasticsearch,只需编写对应的ItemPipelines实现, 修改爬虫导入的ITEM_PIPELINES配置即可,实现数据持久化与爬虫逻辑的解耦。

结语

本文为大家简要说明了使用Scrapy的理由,以及通过一个按理为大家演示了如何开发一个Scrapy爬虫项目。后续将持续为大家带来Scrapy更多。

如果大家觉得文章还不错的话,欢迎大家三连(点赞+在看+收藏)

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/news/109924.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

靶机 Chill_Hack

Chill_Hack 信息搜集 存活检测 arp-scan -l 详细扫描 扫描结果 显示允许 ftp 匿名链接 FTP 匿名登录 匿名登陆 ftp 下载文件并查看 anonymous10.4.7.139下载命令 get note.txt查看文件 译 Anurodh告诉我&#xff0c;在命令 Apaar 中有一些字符串过滤后台扫描 扫描结果…

leetCode 5. 最长回文子串 动态规划 + 优化空间 / 中心扩展法 + 双指针

5. 最长回文子串 - 力扣&#xff08;LeetC5. 最长回文子串 - 力扣&#xff08;LeetCode&#xff09;5. 最长回文子串 - 力扣&#xff08;LeetC 给你一个字符串 s&#xff0c;找到 s 中最长的回文子串。如果字符串的反序与原始字符串相同&#xff0c;则该字符串称为回文字符串。…

模型的选择与调优(网格搜索与交叉验证)

1、为什么需要交叉验证 交叉验证目的&#xff1a;为了让被评估的模型更加准确可信 2、什么是交叉验证(cross validation) 交叉验证&#xff1a;将拿到的训练数据&#xff0c;分为训练和验证集。以下图为例&#xff1a;将数据分成4份&#xff0c;其中一份作为验证集。然后经过…

VulnHub lazysysadmin

一、信息收集 1.nmap扫描开发端口 开放了&#xff1a;22、80、445 访问80端口&#xff0c;没有发现什么有价值的信息 2.扫描共享文件 enum4linux--扫描共享文件 使用&#xff1a; enum4linux 192.168.103.182windows访问共享文件 \\192.168.103.182\文件夹名称信息收集&…

UWB安全数据通讯STS-加密、身份认证

DW3000系列才能支持UWB安全数据通讯&#xff0c;DW1000不支持 IEEE 802.15.4a没有数据通讯安全保护机制&#xff0c;IEEE 802.15.4z中指定的扩展得到增强&#xff08;在PHY/RF级别&#xff09;&#xff1a;增添了一个重要特性“扰频时间戳序列&#xff08;STS&#xff09;”&a…

软件开发“自我毁灭”的七宗罪

软件开发是一门具有挑战性的学科&#xff0c;它建立在数以百万计的参数、变量、库以及更多必须绝对正确的因素之上。即便是一个字符不合适&#xff0c;整个堆栈也会随之瓦解。 多年来&#xff0c;软件开发团队已经想出了一些完成工作的规则。从复杂的方法论到新兴的学科和哲学…

百度地图高级进阶开发:圆形区域周边搜索地图监听事件(覆盖物重叠显示层级\图像标注监听事件、setZIndex和setTop方法)

百度地图API 使用百度地图API添加多覆盖物渲染时&#xff0c;会出现覆盖物被相互覆盖而导致都无法触发它们自己的监听&#xff1b;在百度地图API里&#xff0c;map的z-index为0&#xff0c;但是触发任意覆盖物的监听如click时也必定会触发map的监听&#xff1b; 项目需求 在…

最详细STM32,cubeMX 点亮 led

这篇文章将详细介绍 如何在 stm32103 板子上点亮一个LED. 文章目录 前言一、开发环境搭建。二、LED 原理图解读三、什么是 GPIO四、cubeMX 配置工程五、解读 cubeMX 生成的代码六、延时函数七、控制引脚状态函数点亮 LED 八、GPIO 的工作模式九、为什么使用推挽输出驱动 LED总结…

基于svg+js实现简单动态时钟

实现思路 创建SVG容器&#xff1a;首先&#xff0c;创建一个SVG容器元素&#xff0c;用于容纳时钟的各个部分。指定SVG的宽度、高度以及命名空间。 <svg width"200" height"200" xmlns"http://www.w3.org/2000/svg"><!-- 在此添加时钟…

排查手机应用app微信登录问题不跳转失败原因汇总及其解决方案

经过最近我发的文章,我个人觉得解决了不少小问题,因为最近很小白的问题已经没有人私聊问我了,我总结了一下排查手机应用app微信登录问题不跳转失败的原因汇总及其解决方案在这篇文章中,分析微信登录不跳转的原因,并提供解决方案。希望通过这篇文章,能够帮助大家顺利解决这…

紫光同创FPGA实现UDP协议栈网络视频传输,带录像和抓拍功能,基于YT8511和RTL8211,提供2套PDS工程源码和技术支持

目录 1、前言免责声明 2、相关方案推荐我这里已有的以太网方案紫光同创FPGA精简版UDP方案紫光同创FPGA带ping功能UDP方案紫光同创FPGA精简版UDP视频传输方案 3、设计思路框架OV5640摄像头配置及采集数据缓冲FIFOUDP协议栈详解MAC层发送MAC发送模式MAC层接收ARP发送ARP接收ARP缓…

DCDC Buck电路地弹造成的影响

很多读者都应该听过地弹&#xff0c;但是实际遇到的地弹的问题应该很少。本案例就是一个地弹现象导致电源芯片工作不正常的案例。 問題描述 如下图1 &#xff0c;产品其中一个供电是12V转3.3V的电路&#xff0c;产品发货50K左右以后&#xff0c;大约有1%的产品无法启动&#…

4.MidBook项目经验之MonogoDB和easyExcel导入导出

1.数据字典(固定的数据,省市级有层级关系的) //mp表如果没有这个字段,防报错,eleUI需要这个字段TableField(exist false) //父根据id得到子数据 ,从controller开始自动生成代码-->service//hasChildren怎么判断,只需要判断children的parentid的count数量>0就可以了//优化…

vue.js - 断开发送的请求,解决接口重复请求数据错误问题(vue中axios多次相同请求中断上一个)

描述 进入页面时第一个接口还在请求,立即切换tab请求第二个接口。但是第二个接口响应比第一个接口响应快,页面展示的时第一个接口的数据,如图: 解决方法 判断如果是相同的接

visual studio安装时候修改共享组件、工具和SDK路径方法

安装了VsStudio后,如果自己修改了Shared路径&#xff0c;当卸载旧版本&#xff0c;需要安装新版本时发现&#xff0c;之前的Shared路径无法进行修改&#xff0c;这就很坑爹了&#xff0c;因为我运行flutter程序的时候&#xff0c;报错找不到windows sdk的位置&#xff0c;所以我…

怎么把flac音频变为mp3?

怎么把flac音频变为mp3&#xff1f;FLAC音频格式在许多平台和应用程序中都得到支持和应用。FLAC音频格式被广泛支持和应用。许多平台、设备和应用程序都支持FLAC格式&#xff0c;如Windows、macOS和Linux操作系统、各种音乐播放器软件、智能手机和平板电脑、在线音乐平台和流媒…

c++小知识

内联函数 inline 用来替换宏函数 不能分文件编辑 在c语言中#define NULL 0在c中使用nullptr表示空指针class内存的大小计算规则使用的是内存对齐 没有成员&#xff0c;但是还有1个字节&#xff0c;我们使用这个来标记他是个类 类成员函数不存在于类中 为什么每个对象使用的…

基于Linux安装Hive

Hive安装包下载地址 Index of /dist/hive 上传解压 [rootmaster opt]# cd /usr/local/ [rootmaster local]# tar -zxvf /opt/apache-hive-3.1.2-bin.tar.gz重命名及更改权限 mv apache-hive-3.1.2-bin hivechown -R hadoop:hadoop hive配置环境变量 #编辑配置 vi /etc/pro…

云安全—云计算基础

0x00 前言 学习云安全&#xff0c;那么必然要对云计算相关的内容进行学习和了解&#xff0c;所以云安全会分为两个部分来进行&#xff0c;首先是云计算先关的内容。 0x01 云计算 广泛传播 云计算最早大范围传播是2006年&#xff0c;8月&#xff0c;在圣何塞【1】举办的SES&a…

SpringMVC的响应处理

目录 传统同步业务数据的响应 请求资源转发 请求资源重定向 响应数据模型 直接回写数据给客户端 前后端分离异步业务数据响应 在前面的文章中&#xff0c;我们已经介绍了Spring接收请求的部分&#xff0c;接下来看Spring如何给客户端响应数据 传统同步业务数据的响应 准…