
Understanding Scrapy in Depth


What Is Scrapy

An open source and collaborative framework for extracting the data you need from websites. In a fast, simple, yet extensible way.

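Scrapy is a fast, simple, and powerful web crawling framework for Python. It is typically used to crawl websites and extract structured data from their pages, and it can also be used for monitoring and automated testing. Its architecture is shown in the following diagram:

[Figure: Scrapy architecture diagram]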

How Scrapy Works

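Understanding how Scrapy works makes the later sections easier to follow (you can also read the quick start below first and then come back here). The Scrapy execution flow is shown in the following diagram:

[Figure: Scrapy execution flow diagram]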

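The process runs as follows:

  1. After the program starts, it creates one or more Spiders. The Spiders pass their Requests through the SpiderMiddlewares and then hand them to the Engine.

  2. The Engine forwards the Requests received from the Spiders to the Scheduler, which decides the order in which they run.

  3. The Scheduler hands the Requests that are due to run immediately back to the Engine.

  4. The Engine passes the Requests through the DownloaderMiddlewares and then sends them to the Downloader.

  5. The Downloader uses the Requests to download the page or API response, generates Responses, and passes them back through the DownloaderMiddlewares to the Engine.

  6. The Engine passes the Responses through the SpiderMiddlewares back to the Spiders for processing.

  7. The Spiders process the Responses and return the results to the Engine.

  8. Any Requests objects among the results from step 7 go back to step 2; any Items (structured data objects) or dicts are handed to the ItemPipelines for processing.

  9. Custom ItemPipelines control how the data is processed and persisted.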
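The flow above can be pictured as a loop around a request queue. The following is a toy, single-threaded model written purely for illustration; it is not how Scrapy is implemented (Scrapy is asynchronous and built on Twisted), the middlewares are left out, and every name in it (`run_engine`, `fake_download`, `parse`) is hypothetical.

```python
from collections import deque


def fake_download(url):
    """Stand-in for the Downloader: returns a canned 'response' instead of doing network I/O."""
    pages = {
        "page/1": {"links": ["page/2"], "title": "first"},
        "page/2": {"links": [], "title": "second"},
    }
    return {"url": url, **pages[url]}


def parse(response):
    """Stand-in for a Spider callback: yields items (dicts) and follow-up requests (URL strings)."""
    yield {"url": response["url"], "title": response["title"]}
    for link in response["links"]:
        yield link


def run_engine(start_urls):
    scheduler = deque(start_urls)          # step 2: requests wait in the Scheduler
    seen = set(start_urls)
    items = []
    while scheduler:
        url = scheduler.popleft()          # step 3: the next request due to run
        response = fake_download(url)      # steps 4-5: download and build a response
        for result in parse(response):     # steps 6-7: the spider processes the response
            if isinstance(result, dict):
                items.append(result)       # step 8: items go on to the pipelines
            elif result not in seen:
                seen.add(result)
                scheduler.append(result)   # step 8: new requests go back to step 2
    return items


items = run_engine(["page/1"])
print(items)
```

The only point of the sketch is the routing decision in step 8: follow-up requests re-enter the queue, while items leave the loop toward the pipelines.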

Getting Started with Scrapy

1. Install Scrapy

Install Scrapy with the following command:

pip install scrapy

Once installed, Scrapy provides a scrapy command-line tool. If running scrapy --help prints output like the following, the installation succeeded:

> scrapy --help
Scrapy 2.6.2 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  commands
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
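The same check can be scripted. This is a minimal sketch that uses only the standard library (Python 3.8 or later); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("scrapy")
print("Scrapy is not installed" if version is None else f"Scrapy {version} is installed")
```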

2. Create a Scrapy project

Create a Scrapy project with the command scrapy startproject xxx:

scrapy startproject MySpider

After the command runs, a MySpider directory is generated in the current directory, with the following structure:

MySpider/
├─scrapy.cfg
└─MySpider/
  ├─items.py
  ├─middlewares.py
  ├─pipelines.py
  ├─settings.py
  ├─__init__.py
  └─spiders/
    └─__init__.py

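  • items.py holds the custom Items

  • middlewares.py holds the SpiderMiddlewares and DownloaderMiddlewares

  • pipelines.py holds the custom ItemPipelines

  • settings.py holds the project-wide settings

  • spiders/ holds all the Spiders

From here on, the first (outer) MySpider/ directory is treated as the project root.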

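3. Create a Scrapy spider

The command for creating a Scrapy spider is scrapy genspider [spidername] [allow_domain]. Taking the Manmanbuy (慢慢买) price history API as an example, create a manmanbuy spider: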

scrapy genspider manmanbuy manmanbuy.com

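The Manmanbuy price history crawl works as follows:

  1. Request the page http://tool.manmanbuy.com/HistoryLowest.aspx and read the value attribute of the hidden <input id="ticket" ...> tag.

  2. Derive the Authorization request header from the value obtained in step 1.

  3. Generate the value of the token request parameter.

  4. Call the http://tool.manmanbuy.com/api.ashx endpoint to get the product's price history. (This endpoint requires a valid cookie; how to obtain one is outside the scope of this article.)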
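Step 1 amounts to finding one element in an HTML page and reading one attribute. The article's spider does this with Scrapy's CSS selector, `response.css('#ticket')`; the sketch below shows the same idea with only the standard library, as an illustration. The page fragment and ticket value are made up, and `TicketParser` and `extract_ticket` are hypothetical names.

```python
from html.parser import HTMLParser


class TicketParser(HTMLParser):
    """Collects the value attribute of the first <input id="ticket"> element."""

    def __init__(self):
        super().__init__()
        self.ticket = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("id") == "ticket" and self.ticket is None:
            self.ticket = attributes.get("value")


def extract_ticket(html):
    """Return the ticket value found in the HTML text, or None if there is none."""
    parser = TicketParser()
    parser.feed(html)
    return parser.ticket


# Made-up page fragment; the real page and ticket value will differ.
sample_html = '<form><input type="hidden" id="ticket" value="EXAMPLE-TICKET"/></form>'
print(extract_ticket(sample_html))
```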

After running the genspider command, the generated manmanbuy.py file can be found in the spiders/ directory, with the following content:

import scrapy


class ManmanbuySpider(scrapy.Spider):
    name = 'manmanbuy'
    allowed_domains = ['manmanbuy.com']
    start_urls = ['http://manmanbuy.com/']

    def parse(self, response):
        pass

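Here name is the spider's name, allowed_domains lists the domains the spider is allowed to visit, and start_urls lists the URLs where crawling starts.

A Spider can start crawling in two ways. The convenient way is to set start_urls, in which case the configured URLs are crawled directly at startup. The other is to override the start_requests method and return custom initial Requests.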
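The difference between the two styles can be shown without Scrapy itself. In the toy model below, `MiniSpider` is a hypothetical stand-in for scrapy.Spider that models only how the initial requests are produced, and requests are plain dicts rather than scrapy.Request objects.

```python
class MiniSpider:
    """Toy stand-in for scrapy.Spider; models only how the initial requests are produced."""
    start_urls = []

    def start_requests(self):
        # Default behaviour: one plain GET request per configured URL.
        for url in self.start_urls:
            yield {"url": url, "method": "GET", "headers": {}}


class UrlListSpider(MiniSpider):
    # Style 1: configure start_urls and rely on the default start_requests.
    start_urls = ["http://example.com/a", "http://example.com/b"]


class CustomStartSpider(MiniSpider):
    # Style 2: override start_requests to build each initial request by hand.
    def start_requests(self):
        yield {
            "url": "http://example.com/search",
            "method": "POST",
            "headers": {"User-Agent": "example-agent"},
        }


simple = list(UrlListSpider().start_requests())
custom = list(CustomStartSpider().start_requests())
print(simple)
print(custom)
```

The second style is what the spider in the next section uses, because its first request needs custom headers and extra metadata.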

4. Develop the Scrapy spider

4.1. Write the Spider (MySpider/spiders/manmanbuy.py)

import scrapy
from urllib.parse import quote
import hashlib
import time
import copy


class ManmanbuySpider(scrapy.Spider):
    name = 'manmanbuy'
    allowed_domains = ['manmanbuy.com']

    def start_requests(self):
        # Example: query the price history of a single JD.com product, product ID 100011493273
        item_urls = ['https://item.jd.com/100011493273.html']

        # Request headers
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
        }
        # Step 1: get the ticket parameter from the H5 page
        for item_url in item_urls:
            yield scrapy.Request(
                url='http://tool.manmanbuy.com/HistoryLowest.aspx?url=' + item_url,
                headers=headers,
                # Pass the product URL through to the callback
                meta={'key': item_url})

    def parse(self, response: scrapy.http.Response):
        # Read the ticket value from the page
        ticket = response.css('#ticket')[0].attrib['value']
        # Build the parameters for the next API request
        req = parse_req({'key': response.meta['key'], 'method': 'getHistoryTrend'})
        # Request headers
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
            # Compute the Authorization value
            'Authorization': parse_basic_auth(ticket),
        }
        return scrapy.FormRequest(
            url='http://tool.manmanbuy.com/api.ashx',
            method='POST',
            formdata=req,
            headers=headers,
            cookies=self.get_cookies(),
            # Custom callback
            callback=self.parse_history_price)

    def get_cookies(self):
        # The logic for obtaining the cookie is omitted
        cks = '_ga=GA1.2.604426644.1596510819; ASP.NET_SessionId=bbyuxdftfkcf5mrijdgkmnc5; Hm_lvt_01a310dc95b71311522403c3237671ae=1658906329; Hm_lvt_85f48cee3e51cd48eaba80781b243db3=1658748396,1658906330; _gid=GA1.2.472137414.1658906330; _gat_gtag_UA_145348783_1=1; 60014_mmbuser=VQYJA1IFBTBSVwNdClFVUgdRUQcLUgdXDg1RBgNTAVVUVAZRBQFeAw%3d%3d; Hm_lpvt_85f48cee3e51cd48eaba80781b243db3=1658906625; Hm_lpvt_01a310dc95b71311522403c3237671ae=1658906625'
        cookies = {}
        for one in cks.split(';'):
            # Split on the first "=" only, so values that contain "=" stay intact
            k, v = one.strip().split("=", 1)
            cookies[k] = v
        return cookies

    def parse_history_price(self, response: scrapy.http.Response):
        # Log the response
        self.logger.info(response.text)


def parse_basic_auth(ticket):
    """Derive the Authorization header value from the ticket (step 2 of the crawl flow above)."""
    return 'BasicAuth ' + ticket[:160][-4:] + ticket[:160 - 4]


def parse_req(d):
    """Process the request parameters, adding the t and token parameters."""
    d['t'] = str(int(time.time() * 1000))
    n = copy.deepcopy(d)
    ks = list(n.keys())
    ks.sort()
    ask = 'c5c3f201a8e8fc634d37a766a0299218'
    mask = ask
    for k in ks:
        mask += f'{k}{quote(str(n[k])).replace("/", "%2F")}'
    mask += ask
    mask = mask.upper()
    md5 = hashlib.md5()
    md5.update(mask.encode('utf-8'))
    d['token'] = md5.hexdigest().upper()
    return d
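To make the two helper functions easier to follow, here is a worked example on made-up inputs. It mirrors the logic of parse_basic_auth and parse_req above with two deliberate differences: the timestamp is passed in so the result is reproducible, and the parameters are copied instead of modified in place. The names `basic_auth_from_ticket` and `sign`, the ticket, and the secret are all placeholders for this sketch.

```python
import hashlib
from urllib.parse import quote


def basic_auth_from_ticket(ticket):
    # Characters 156-159 of the ticket move to the front, followed by the first 156 characters.
    return "BasicAuth " + ticket[:160][-4:] + ticket[:156]


def sign(params, secret, timestamp_ms):
    """Return a copy of params with 't' and 'token' added, using an explicit timestamp."""
    signed = dict(params, t=str(timestamp_ms))
    # Concatenate key + percent-encoded value for every parameter, in sorted key order.
    body = "".join(f"{key}{quote(str(signed[key]), safe='')}" for key in sorted(signed))
    # Wrap the body in the secret, upper-case everything, and take the upper-cased MD5 digest.
    digest = hashlib.md5((secret + body + secret).upper().encode("utf-8")).hexdigest()
    signed["token"] = digest.upper()
    return signed


ticket = "abcdefghij" * 20  # made-up 200-character ticket
auth = basic_auth_from_ticket(ticket)
print(auth[:20])

params = {"key": "https://item.jd.com/1.html", "method": "getHistoryTrend"}
signed = sign(params, secret="example-secret", timestamp_ms=1658906625000)
print(signed["t"], len(signed["token"]))
```

Passing the timestamp explicitly is only a convenience for testing; in the spider the value comes from the current time.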

4.2. Modify the settings (MySpider/settings.py)

# robots.txt check; it is set to True by default and needs to be changed to False
ROBOTSTXT_OBEY = False
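For context, when ROBOTSTXT_OBEY is enabled, Scrapy consults the site's robots.txt and skips requests that it disallows. The sketch below uses the standard library's urllib.robotparser (not Scrapy's own machinery) on a made-up robots.txt to show the kind of rule being checked.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(robots_lines)

allowed = parser.can_fetch("example-bot", "http://example.com/public/page.html")
blocked = parser.can_fetch("example-bot", "http://example.com/private/page.html")
print(allowed, blocked)
```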

Run scrapy crawl manmanbuy to start the spider. The log shows that the data is retrieved successfully:

{"msg":"","code":0,"data":{"haveTrend":1,"changPriceRemark":"降幅5%","runtime":41,"zouShi_test":2,"changePriceCount":14,"spbh":"1|100011493273","spUrl":"https://item.jd.com/100011493273.html","spPic":"http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg","currentPrice":1049.00,"spName":"荣耀Play5T 22.5W超级快充 5000mAh大电池 6.5英寸护眼屏 全网通8GB+128GB极光蓝","lowerDate":"2022-03-08T00:00:00+08:00","lowerPrice":899.00,"bjid":551120462,"zouShi":2,"siteId":1,"siteName":"京东商城","datePrice":"[1621353600000,1199.00,\"\"],[1621440000000,1199.00,\"\"],[1621526400000,1199.00,\"\"],[1621612800000,1199.00,\"\"],[1621699200000,1199.00,\"\"],[1621785600000,1199.00,\"\"],[1621872000000,1199.00,\"\"],[1621958400000,1199.00,\"\"],[1622044800000,1199.00,\"\"],[1622131200000,1199.00,\"\"],[1622217600000,1199.00,\"\"],[1622304000000,1199.00,\"\"],[1622390400000,1199.00,\"\"],[1622476800000,1199.00,\"\"],[1622563200000,1199.00,\"\"],[1622649600000,1199.00,\"\"],[1622736000000,1199.00,\"\"],[1622822400000,1199.00,\"\"],[1622908800000,1199.00,\"\"],[1622995200000,1199.00,\"\"],[1623081600000,1199.00,\"\"],[1623168000000,1199.00,\"\"],[1623254400000,1199.00,\"\"],[1623340800000,1199.00,\"1199元\"],[1623427200000,1199.00,\"\"],[1623513600000,1199.00,\"\"],[1623600000000,1199.00,\"\"],[1623686400000,1199.00,\"\"],[1623772800000,1199.00,\"\"],[1623859200000,1199.00,\"\"],[1623945600000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1624032000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624118400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624204800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624291200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624377600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624464000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624550400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624636800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624723200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],
[1624809600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624896000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624982400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625068800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625155200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625241600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625328000000,1199.00,\"\"],[1625414400000,1199.00,\"\"],[1625500800000,1189.0,\"京东秒杀价:1189\"],[1625587200000,1199.00,\"\"],[1625673600000,1189.0,\"\"],[1625760000000,1199.0,\"\"],[1625846400000,1189.0,\"\"],[1625932800000,1199.0000,\"\"],[1626019200000,1199.0000,\"\"],[1626105600000,1189.0,\"\"],[1626192000000,1199.0,\"\"],[1626278400000,1189.0,\"\"],[1626364800000,1199.0000,\"\"],[1626451200000,1199.0000,\"\"],[1626537600000,1199.0000,\"\"],[1626624000000,1199.0000,\"\"],[1626710400000,1199.0000,\"\"],[1626796800000,1199.0000,\"\"],[1626883200000,1189.00,\"\"],[1626969600000,1199.0000,\"\"],[1627056000000,1199.0000,\"\"],[1627142400000,1199.0000,\"\"],[1627228800000,1199.0000,\"\"],[1627315200000,1189.00,\"\"],[1627401600000,1199.0000,\"\"],[1627488000000,1189.00,\"1189元\"],[1627574400000,1189.00,\"\"],[1627660800000,1199.00,\"\"],[1627747200000,1179.00,\"1179元\"],[1627833600000,1189.0000,\"\"],[1627920000000,1199.00,\"\"],[1628006400000,1199.00,\"\"],[1628092800000,1189.00,\"\"],[1628179200000,1189.0000,\"\"],[1628265600000,1189.0000,\"\"],[1628352000000,1189.0000,\"\"],[1628438400000,1199.00,\"\"],[1628524800000,1189.00,\"\"],[1628611200000,1199.0,\"\"],[1628697600000,1189.0000,\"\"],[1628784000000,1189.0000,\"\"],[1628870400000,1189.0000,\"\"],[1628956800000,1199.00,\"1199元\"],[1629043200000,1189.0000,\"\"],[1629129600000,1199.00,\"\"],[1629216000000,1189.00,\"\"],[1629302400000,1199.0000,\"\"],[1629388800000,1169.0,\"京东秒杀价:1169\"],[1629475200000,1199.00,\"\"],[1629561600000,1199.00,\"\"],[1629648000000,1199.00,\"\"],[1629734400000,1169.00,\"\"],[1629820800000,1199.0,\"\"],[1629907200000,1189.0,\"京东秒杀
价:1189\"],[1629993600000,1199.00,\"\"],[1630080000000,1199.00,\"\"],[1630166400000,1199.00,\"\"],[1630252800000,1199.00,\"\"],[1630339200000,1189.00,\"1189元包邮\"],[1630425600000,1189.00,\"\"],[1630512000000,1175.00,\"购买1件,plus价格1175\"],[1630598400000,1189.00,\"\"],[1630684800000,1199.0,\"\"],[1630771200000,1189.0000,\"\"],[1630857600000,1199.0,\"\"],[1630944000000,1189.0000,\"\"],[1631030400000,1199.0,\"\"],[1631116800000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1631203200000,1189.0000,\"\"],[1631289600000,1189.00,\"\"],[1631376000000,1199.00,\"\"],[1631462400000,1189.0000,\"\"],[1631548800000,1189.0,\"\"],[1631635200000,1199.0,\"\"],[1631721600000,1189.00,\"\"],[1631808000000,1199.00,\"\"],[1631894400000,1189.0,\"\"],[1631980800000,1199.0,\"\"],[1632067200000,1169.00,\"\"],[1632153600000,1169.00,\"\"],[1632240000000,1169.00,\"\"],[1632326400000,1189.0,\"\"],[1632412800000,1169.0,\"\"],[1632499200000,1199.0,\"\"],[1632585600000,1169.00,\"\"],[1632672000000,1199.00,\"\"],[1632758400000,1169.0,\"\"],[1632844800000,1199.0000,\"\"],[1632931200000,1169.0,\"\"],[1633017600000,1169.0,\"\"],[1633104000000,1169.0,\"\"],[1633190400000,1169.00,\"\"],[1633276800000,1169.0000,\"\"],[1633363200000,1199.00,\"\"],[1633449600000,1199.00,\"\"],[1633536000000,1169.0,\"\"],[1633622400000,1169.0,\"\"],[1633708800000,1169.0,\"京东秒杀价:1169\"],[1633795200000,1169.0,\"\"],[1633881600000,1199.00,\"\"],[1633968000000,1169.00,\"\"],[1634054400000,1199.0,\"\"],[1634140800000,1169.00,\"\"],[1634227200000,1199.0,\"\"],[1634313600000,1199.0,\"\"],[1634400000000,1169.0,\"\"],[1634486400000,1199.0,\"\"],[1634572800000,1169.00,\"\"],[1634659200000,1199.0000,\"\"],[1634745600000,1169.00,\"\"],[1634832000000,1199.0,\"\"],[1634918400000,1199.0,\"\"],[1635004800000,1199.0,\"\"],[1635091200000,1199.0,\"\"],[1635177600000,1199.0,\"\"],[1635264000000,1199.0,\"\"],[1635350400000,1199.0,\"\"],[1635436800000,1199.0,\"\"],[1635523200000,1099.00,\"1099元 
\"],[1635609600000,1099.0,\"\"],[1635696000000,1099.00,\"\"],[1635782400000,1099.00,\"\"],[1635868800000,1099.00,\"\"],[1635955200000,1099.00,\"购买1件,plus价格1099\"],[1636041600000,949.00,\"购买1件,当前价:1099.00,可叠加优惠券2:满880减150\"],[1636128000000,1099.00,\"\"],[1636214400000,1199.0,\"\"],[1636300800000,1099.0,\"\"],[1636387200000,1099.0,\"\"],[1636473600000,1099.0,\"\"],[1636560000000,979.00,\"购买1件,当前价:1099.00,满减:每满1080减120\"],[1636646400000,1099.0000,\"\"],[1636732800000,1199.00,\"\"],[1636819200000,1199.00,\"\"],[1636905600000,1199.00,\"\"],[1636992000000,1199.00,\"\"],[1637078400000,1099.00,\"\"],[1637164800000,1099.00,\"\"],[1637251200000,1099.00,\"\"],[1637337600000,1099.00,\"\"],[1637424000000,1099.00,\"1099元\"],[1637510400000,1099.00,\"\"],[1637596800000,1099.0,\"\"],[1637683200000,1099.00,\"\"],[1637769600000,1099.00,\"\"],[1637856000000,1099.00,\"\"],[1637942400000,1099.00,\"\"],[1638028800000,1199.0000,\"\"],[1638115200000,1199.00,\"\"],[1638201600000,1199.00,\"\"],[1638288000000,1099.0,\"\"],[1638374400000,1099.00,\"\"],[1638460800000,1099.00,\"\"],[1638547200000,1199.0,\"\"],[1638633600000,1199.00,\"\"],[1638720000000,1099.0,\"\"],[1638806400000,1099.00,\"\"],[1638892800000,1099.00,\"\"],[1638979200000,1099.00,\"1099元\"],[1639065600000,1099.00,\"\"],[1639152000000,1089.0,\"\"],[1639238400000,1089.0,\"\"],[1639324800000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639411200000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639497600000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639584000000,1099.00,\"\"],[1639670400000,1099.0,\"\"],[1639756800000,1099.0,\"\"],[1639843200000,1099.0,\"\"],[1639929600000,1099.00,\"1099元\"],[1640016000000,1099.0,\"\"],[1640102400000,1099.0,\"\"],[1640188800000,1099.0,\"\"],[1640275200000,1099.00,\"1099元\"],[1640361600000,1099.0,\"\"],[1640448000000,1069.0,\"\"],[1640534400000,1099.0,\"\"],[1640620800000,1099.0,\"\"],[1640707200000,1099.0,\"\"],[1640793600000,1099.0,\"\"],[1640880000000,1099.0,\"\"],[1640966400000,1099.00,\"1099元\"],[
1641052800000,1099.0,\"\"],[1641139200000,1099.0,\"\"],[1641225600000,1099.0,\"\"],[1641312000000,1099.0,\"\"],[1641398400000,1099.0,\"\"],[1641484800000,1099.0,\"\"],[1641571200000,1099.0,\"\"],[1641657600000,1099.00,\"1099元\"],[1641744000000,1099.0,\"\"],[1641830400000,1099.0,\"\"],[1641916800000,1099.0,\"\"],[1642003200000,1099.0,\"\"],[1642089600000,1099.0,\"\"],[1642176000000,1099.0,\"\"],[1642262400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642348800000,1099.00,\"\"],[1642435200000,1099.00,\"\"],[1642521600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642608000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642694400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642780800000,1099.0,\"\"],[1642867200000,1099.0,\"\"],[1642953600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643040000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643126400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643212800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643299200000,1099.0,\"\"],[1643385600000,1099.0,\"\"],[1643472000000,1099.0,\"\"],[1643558400000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643644800000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643731200000,1099.0,\"\"],[1643817600000,1099.0,\"\"],[1643904000000,1099.0,\"\"],[1643990400000,1099.0,\"\"],[1644076800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644163200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644249600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644336000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644422400000,1099.00,\"\"],[1644508800000,1099.00,\"1099元\"],[1644595200000,1199.00,\"1199元\"],[1644681600000,1199.00,\"1089元\"],[1644768000000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644854400000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644940800000,1099.00,\"1049元\"],[1645027200000,1099.00,\"\"],[1645113600000,1099.00,\"\"],[1645200000000,1099.00,\"\"],[1645286400000,1099.00,\"\"],[1645372800000,1099.00,\"\"],[1645459200000,1099.00,\"\"],[164554560
0000,1099.00,\"\"],[1645632000000,1099.00,\"\"],[1645718400000,1099.00,\"1099元\"],[1645804800000,1099.00,\"1069元\"],[1645891200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1645977600000,1099.00,\"\"],[1646064000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646150400000,1099.00,\"\"],[1646236800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646323200000,1099.00,\"\"],[1646409600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646496000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646582400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646668800000,899.00,\"购买1件,当前价:1099.00,满减:每满1080减200\"],[1646755200000,1099.00,\"\"],[1646841600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646928000000,1099.0,\"\"],[1647014400000,1099.0,\"\"],[1647100800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647187200000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647273600000,1099.00,\"\"],[1647360000000,1099.00,\"1099元\"],[1647446400000,1049.0,\"购买1件,当前价:1099.0,满减:满1050减50\"],[1647532800000,1099.0,\"\"],[1647619200000,1099.00,\"\"],[1647705600000,1099.00,\"\"],[1647792000000,1099.00,\"\"],[1647878400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647964800000,1099.00,\"\"],[1648051200000,1099.00,\"\"],[1648137600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648224000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648310400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648396800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648483200000,1039.00,\"购买1件,当前价:1089.00,满减:满1050减50\"],[1648569600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648656000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648742400000,1049.00,\"\"],[1648828800000,1049.00,\"\"],[1648915200000,1049.00,\"1049元\"],[1649001600000,1099.00,\"\"],[1649088000000,1049.00,\"\"],[1649174400000,1049.00,\"\"],[1649260800000,1049.00,\"\"],[1649347200000,1049.00,\"\"],[1649433600000,1099.00,\"\"],[1649520000000,1099.00,\"\"],[1649606400000,1099.00,\"\"],[1649692800000,1049.0,\"购买1件,当前价格1049\"],[1649779200000,1099.00,\"\"],
[1649865600000,1099.00,\"\"],[1649952000000,1049.00,\"1049元\"],[1650038400000,1099.00,\"\"],[1650124800000,1049.00,\"\"],[1650211200000,1049.00,\"\"],[1650297600000,1049.00,\"\"],[1650384000000,1049.00,\"\"],[1650470400000,1099.00,\"1099元\"],[1650556800000,1099.00,\"\"],[1650643200000,1099.00,\"\"],[1650729600000,1099.00,\"\"],[1650816000000,1099.00,\"\"],[1650902400000,1099.00,\"\"],[1650988800000,1099.00,\"\"],[1651075200000,1099.00,\"1099元\"],[1651161600000,1099.00,\"\"],[1651248000000,1099.00,\"\"],[1651334400000,1049.00,\"\"],[1651420800000,1099.00,\"\"],[1651507200000,1099.0000,\"\"],[1651593600000,1099.0000,\"\"],[1651680000000,1099.00,\"1099元\"],[1651766400000,1089.0,\"购买1件,当前价格1089\"],[1651852800000,1049.00,\"\"],[1651939200000,1089.00,\"\"],[1652025600000,1099.00,\"1099元\"],[1652112000000,1089.00,\"\"],[1652198400000,1049.00,\"\"],[1652284800000,1089.00,\"\"],[1652371200000,1049.00,\"\"],[1652457600000,1099.00,\"\"],[1652544000000,1099.00,\"\"],[1652630400000,1099.00,\"\"],[1652716800000,1099.00,\"1099元\"],[1652803200000,1099.00,\"\"],[1652889600000,1099.00,\"\"],[1652976000000,1049.00,\"\"],[1653062400000,1099.00,\"\"],[1653148800000,1099.00,\"\"],[1653235200000,1099.00,\"1099元\"],[1653321600000,1099.00,\"\"],[1653408000000,1099.00,\"\"],[1653494400000,1099.00,\"\"],[1653580800000,1099.00,\"\"],[1653667200000,1099.00,\"\"],[1653753600000,1099.00,\"\"],[1653840000000,1099.00,\"\"],[1653926400000,1049.0,\"\"],[1654012800000,1049.00,\"\"],[1654099200000,1049.00,\"\"],[1654185600000,1049.00,\"\"],[1654272000000,1049.00,\"\"],[1654358400000,1049.00,\"\"],[1654444800000,1049.00,\"\"],[1654531200000,1049.00,\"\"],[1654617600000,1049.00,\"\"],[1654704000000,1049.00,\"\"],[1654790400000,1049.00,\"\"],[1654876800000,1049.00,\"\"],[1654963200000,1049.00,\"\"],[1655049600000,1049.00,\"\"],[1655136000000,1049.00,\"\"],[1655222400000,1049.00,\"\"],[1655308800000,1049.00,\"1049元\"],[1655395200000,1049.00,\"\"],[1655481600000,1049.00,\"\"],[1655568000000,1049.00,\"\"],[1
655654400000,1049.00,\"\"],[1655740800000,1049.00,\"\"],[1655827200000,1049.00,\"1049元\"],[1655913600000,1049.00,\"\"],[1656000000000,1049.00,\"\"],[1656086400000,1049.00,\"\"],[1656172800000,1049.00,\"1049元\"],[1656259200000,1049.00,\"\"],[1656345600000,1049.00,\"\"],[1656432000000,1049.00,\"\"],[1656518400000,1049.00,\"\"],[1656604800000,1049.00,\"1049元\"],[1656691200000,1049.00,\"\"],[1656777600000,1049.00,\"\"],[1656864000000,1049.00,\"1049元\"],[1656950400000,1049.00,\"\"],[1657036800000,1049.00,\"\"],[1657123200000,1049.00,\"\"],[1657209600000,1049.00,\"\"],[1657296000000,1049.00,\"\"],[1657382400000,1049.00,\"\"],[1657468800000,1049.00,\"\"],[1657555200000,1049.00,\"\"],[1657641600000,1049.00,\"\"],[1657728000000,1049.00,\"\"],[1657814400000,1049.00,\"\"],[1657900800000,1049.00,\"1049元\"],[1657987200000,1049.00,\"\"],[1658073600000,1049.00,\"\"],[1658160000000,1049.00,\"\"],[1658246400000,1049.00,\"\"],[1658332800000,1049.00,\"1049元\"],[1658419200000,1049.00,\"\"],[1658505600000,1049.00,\"\"],[1658592000000,1049.00,\"\"],[1658678400000,1049.00,\"\"],[1658764800000,1049.00,\"\"],[1658851200000,1049.00,\"\"]","ZheKouCount":95},"count":0}
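The datePrice field in this response is itself a string holding comma-separated `[timestamp_ms, price, note]` triples, so it needs a second parsing step before it can be analysed. A minimal sketch, assuming the timestamps mark midnight in UTC+8 (an inference from the `+08:00` offset in lowerDate); `parse_date_price` is a hypothetical helper, and the sample uses three entries taken from the response above in their unescaped form.

```python
import json
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))  # assumed time zone of the timestamps


def parse_date_price(date_price):
    """Turn the datePrice string into a list of (date, price, note) tuples."""
    rows = json.loads("[" + date_price + "]")  # wrap the triples to form one JSON array
    return [
        (datetime.fromtimestamp(ms / 1000, tz=BEIJING).date(), float(price), note)
        for ms, price, note in rows
    ]


# Three entries taken from the response above.
sample = (
    '[1621353600000,1199.00,""],'
    '[1646668800000,899.00,"购买1件,当前价:1099.00,满减:每满1080减200"],'
    '[1658851200000,1049.00,""]'
)
history = parse_date_price(sample)
lowest = min(history, key=lambda row: row[1])
print(lowest[0], lowest[1])
```

On this sample the lowest entry is 899.0 on 2022-03-08, which agrees with the lowerPrice and lowerDate fields of the response.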

5. Persist data with Scrapy

5.1. Write the Items (MySpider/items.py)

import scrapy


class HistoryPriceItem(scrapy.Item):
    """Custom Item for storing price history"""
    # Product URL
    itemUrl = scrapy.Field()
    # Image URL
    picUrl = scrapy.Field()
    # Price history data
    detailPrice = scrapy.Field()
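Scrapy (2.2 and later) also accepts dataclass objects as items, alongside dicts and scrapy.Item subclasses. The sketch below expresses the same three fields as a dataclass, for comparison only; the class name `HistoryPriceRecord` is made up, and the rest of this article keeps using the scrapy.Item version.

```python
from dataclasses import asdict, dataclass


@dataclass
class HistoryPriceRecord:
    itemUrl: str      # product URL
    picUrl: str       # image URL
    detailPrice: str  # raw price-history string


record = HistoryPriceRecord(
    itemUrl="https://item.jd.com/100011493273.html",
    picUrl="http://example.com/pic.jpg",  # placeholder URL
    detailPrice='[1621353600000,1199.00,""]',
)
print(asdict(record))
```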

5.2. Write the ItemPipelines (MySpider/pipelines.py), using file storage as an example:

import scrapy.crawler
from itemadapter import ItemAdapter
from scrapy import signals


class FilePipeline:

    def __init__(self, filename='store.txt'):
        self.filename = filename

    def process_item(self, item, spider):
        # Wrap the item in an adapter so it can be read the same way whatever its type
        adapter = ItemAdapter(item)
        # Write to the file
        self.fp.write(adapter.get('itemUrl') + "    " + adapter.get('picUrl') + "    " + adapter.get('detailPrice') + '\n')
        return item

    @classmethod
    def from_crawler(cls, crawler: scrapy.crawler.Crawler):
        s = cls()
        # Bind behaviour through signals
        # Open the file when the spider starts
        crawler.signals.connect(s.opened, signal=signals.spider_opened)
        # Close the file when the spider stops
        crawler.signals.connect(s.closed, signal=signals.spider_closed)
        return s

    def closed(self, spider):
        self.fp.close()

    def opened(self, spider):
        self.fp = open(self.filename, 'w', encoding='utf-8')
        # Header row: product URL, main image URL, price history
        self.fp.write('商品URL    主图URL    历史价格信息\n')
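Item pipelines can also define open_spider and close_spider methods, which Scrapy calls when the spider is opened and closed, as an alternative to connecting signals by hand. The sketch below takes that approach; `JsonLinesPipeline` and its default file name are made up for this example, and it writes one JSON object per line rather than the space-separated layout used above. Because a pipeline is a plain class, it can be exercised by hand without starting a crawl, as the last few lines do.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    def __init__(self, filename="store.jsonl"):
        self.filename = filename
        self.fp = None

    def open_spider(self, spider):
        # Called once when the spider is opened.
        self.fp = open(self.filename, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Called once when the spider is closed.
        self.fp.close()

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and for scrapy.Item instances.
        self.fp.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item


# Drive the pipeline by hand, writing to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "store.jsonl")
pipeline = JsonLinesPipeline(path)
pipeline.open_spider(spider=None)
pipeline.process_item({"itemUrl": "https://item.jd.com/100011493273.html"}, spider=None)
pipeline.close_spider(spider=None)
with open(path, encoding="utf-8") as fp:
    print(fp.read())
```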

5.3. Modify the Spider (MySpider/spiders/manmanbuy.py)

import scrapy
import json
from MySpider.items import HistoryPriceItem


class ManmanbuySpider(scrapy.Spider):
    # Unchanged code omitted

    custom_settings = {
        # Item pipelines to use
        'ITEM_PIPELINES': {
            'MySpider.pipelines.FilePipeline': 300,
        }
    }

    def parse_history_price(self, response: scrapy.http.Response):
        # Parse the price response
        self.logger.info(response.text)
        data = json.loads(response.text)
        # Return the Item
        return HistoryPriceItem(itemUrl=data['data']['spUrl'], picUrl=data['data']['spPic'], detailPrice=data['data']['datePrice'])
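The callback above indexes `data['data'][...]` directly, which raises an exception if the API returns a payload without those fields (for example when the cookie has expired). A more defensive extraction could look like the sketch below; the helper name `extract_price_fields` and the shape of the failing payload are assumptions made for illustration.

```python
import json


def extract_price_fields(text):
    """Return the three fields used by the item, or None if the payload has no usable data."""
    payload = json.loads(text)
    data = payload.get("data") or {}
    fields = {
        "itemUrl": data.get("spUrl"),
        "picUrl": data.get("spPic"),
        "detailPrice": data.get("datePrice"),
    }
    return fields if all(fields.values()) else None


ok = extract_price_fields(
    '{"code": 0, "data": {"spUrl": "https://item.jd.com/1.html", '
    '"spPic": "http://example.com/p.jpg", "datePrice": "[1621353600000,1199.00,\\"\\"]"}}'
)
# Made-up error payload; the real API's error format may differ.
missing = extract_price_fields('{"code": 1, "msg": "made-up error", "data": null}')
print(ok is not None, missing)
```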

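A note on settings: Scrapy has five ways to supply settings, three of which are commonly used. A higher-priority setting overrides a lower-priority one with the same key, while settings with different keys are combined. In descending order of priority, the three are:

  1. Command-line settings

  2. Per-spider settings

  3. Project-wide settings

The custom_settings attribute of a Spider is the per-spider setting.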
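The override-and-combine rule can be illustrated with a toy merge. This is not Scrapy's Settings class, only a sketch of the behaviour described above; the priority numbers follow Scrapy's SETTINGS_PRIORITIES mapping for these three sources, while `merge_settings` and the DOWNLOAD_DELAY values are made up for the example.

```python
# Higher number wins; values follow Scrapy's SETTINGS_PRIORITIES for these sources.
PRIORITIES = {"project": 20, "spider": 30, "cmdline": 40}


def merge_settings(layers):
    """layers maps a source name to its settings dict; returns the effective settings."""
    effective = {}
    for source in sorted(layers, key=PRIORITIES.__getitem__):
        # Sources are applied from lowest to highest priority, so later ones overwrite earlier ones.
        effective.update(layers[source])
    return effective


effective = merge_settings({
    "project": {"ROBOTSTXT_OBEY": False, "DOWNLOAD_DELAY": 1},
    "spider": {"ITEM_PIPELINES": {"MySpider.pipelines.FilePipeline": 300}, "DOWNLOAD_DELAY": 2},
    "cmdline": {"DOWNLOAD_DELAY": 5},
})
print(effective)
```

DOWNLOAD_DELAY appears in all three sources and ends up with the command-line value, while ROBOTSTXT_OBEY and ITEM_PIPELINES each appear once and are simply combined.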

5.4. Run scrapy crawl manmanbuy to start the spider. A store.txt file is generated in the current directory, with the following content:

商品URL    主图URL    历史价格信息
https://item.jd.com/100011493273.html    http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg    [1621353600000,1199.00,""],[1621440000000,1199.00,""],[1621526400000,1199.00,""],[1621612800000,1199.00,""],[1621699200000,1199.00,""],[1621785600000,1199.00,""],[1621872000000,1199.00,""],[1621958400000,1199.00,""],[1622044800000,1199.00,""],[1622131200000,1199.00,""],[1622217600000,1199.00,""],[1622304000000,1199.00,""],[1622390400000,1199.00,""],[1622476800000,1199.00,""],[1622563200000,1199.00,""],[1622649600000,1199.00,""],[1622736000000,1199.00,""],[1622822400000,1199.00,""],[1622908800000,1199.00,""],[1622995200000,1199.00,""],[1623081600000,1199.00,""],[1623168000000,1199.00,""],[1623254400000,1199.00,""],[1623340800000,1199.00,"1199元"],[1623427200000,1199.00,""],[1623513600000,1199.00,""],[1623600000000,1199.00,""],[1623686400000,1199.00,""],[1623772800000,1199.00,""],[1623859200000,1199.00,""],[1623945600000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1624032000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624118400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624204800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624291200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624377600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624464000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624550400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624636800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624723200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624809600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624896000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624982400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625068800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625155200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625241600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625328000000,1199.00,""],[1625414400000,1199.00,""],[1625500800000,1189.0,"京东秒杀价:1189"],[1625587200
000,1199.00,""],[1625673600000,1189.0,""],[1625760000000,1199.0,""],[1625846400000,1189.0,""],[1625932800000,1199.0000,""],[1626019200000,1199.0000,""],[1626105600000,1189.0,""],[1626192000000,1199.0,""],[1626278400000,1189.0,""],[1626364800000,1199.0000,""],[1626451200000,1199.0000,""],[1626537600000,1199.0000,""],[1626624000000,1199.0000,""],[1626710400000,1199.0000,""],[1626796800000,1199.0000,""],[1626883200000,1189.00,""],[1626969600000,1199.0000,""],[1627056000000,1199.0000,""],[1627142400000,1199.0000,""],[1627228800000,1199.0000,""],[1627315200000,1189.00,""],[1627401600000,1199.0000,""],[1627488000000,1189.00,"1189元"],[1627574400000,1189.00,""],[1627660800000,1199.00,""],[1627747200000,1179.00,"1179元"],[1627833600000,1189.0000,""],[1627920000000,1199.00,""],[1628006400000,1199.00,""],[1628092800000,1189.00,""],[1628179200000,1189.0000,""],[1628265600000,1189.0000,""],[1628352000000,1189.0000,""],[1628438400000,1199.00,""],[1628524800000,1189.00,""],[1628611200000,1199.0,""],[1628697600000,1189.0000,""],[1628784000000,1189.0000,""],[1628870400000,1189.0000,""],[1628956800000,1199.00,"1199元"],[1629043200000,1189.0000,""],[1629129600000,1199.00,""],[1629216000000,1189.00,""],[1629302400000,1199.0000,""],[1629388800000,1169.0,"京东秒杀价:1169"],[1629475200000,1199.00,""],[1629561600000,1199.00,""],[1629648000000,1199.00,""],[1629734400000,1169.00,""],[1629820800000,1199.0,""],[1629907200000,1189.0,"京东秒杀价:1189"],[1629993600000,1199.00,""],[1630080000000,1199.00,""],[1630166400000,1199.00,""],[1630252800000,1199.00,""],[1630339200000,1189.00,"1189元包邮"],[1630425600000,1189.00,""],[1630512000000,1175.00,"购买1件,plus价格1175"],[1630598400000,1189.00,""],[1630684800000,1199.0,""],[1630771200000,1189.0000,""],[1630857600000,1199.0,""],[1630944000000,1189.0000,""],[1631030400000,1199.0,""],[1631116800000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1631203200000,1189.0000,""],[1631289600000,1189.00,""],[1631376000000,1199.00,""],[1631462400000,1189.0000,""],[1631548800000,1189.0,
""],[1631635200000,1199.0,""],[1631721600000,1189.00,""],[1631808000000,1199.00,""],[1631894400000,1189.0,""],[1631980800000,1199.0,""],[1632067200000,1169.00,""],[1632153600000,1169.00,""],[1632240000000,1169.00,""],[1632326400000,1189.0,""],[1632412800000,1169.0,""],[1632499200000,1199.0,""],[1632585600000,1169.00,""],[1632672000000,1199.00,""],[1632758400000,1169.0,""],[1632844800000,1199.0000,""],[1632931200000,1169.0,""],[1633017600000,1169.0,""],[1633104000000,1169.0,""],[1633190400000,1169.00,""],[1633276800000,1169.0000,""],[1633363200000,1199.00,""],[1633449600000,1199.00,""],[1633536000000,1169.0,""],[1633622400000,1169.0,""],[1633708800000,1169.0,"京东秒杀价:1169"],[1633795200000,1169.0,""],[1633881600000,1199.00,""],[1633968000000,1169.00,""],[1634054400000,1199.0,""],[1634140800000,1169.00,""],[1634227200000,1199.0,""],[1634313600000,1199.0,""],[1634400000000,1169.0,""],[1634486400000,1199.0,""],[1634572800000,1169.00,""],[1634659200000,1199.0000,""],[1634745600000,1169.00,""],[1634832000000,1199.0,""],[1634918400000,1199.0,""],[1635004800000,1199.0,""],[1635091200000,1199.0,""],[1635177600000,1199.0,""],[1635264000000,1199.0,""],[1635350400000,1199.0,""],[1635436800000,1199.0,""],[1635523200000,1099.00,"1099元 
"],[1635609600000,1099.0,""],[1635696000000,1099.00,""],[1635782400000,1099.00,""],[1635868800000,1099.00,""],[1635955200000,1099.00,"购买1件,plus价格1099"],[1636041600000,949.00,"购买1件,当前价:1099.00,可叠加优惠券2:满880减150"],[1636128000000,1099.00,""],[1636214400000,1199.0,""],[1636300800000,1099.0,""],[1636387200000,1099.0,""],[1636473600000,1099.0,""],[1636560000000,979.00,"购买1件,当前价:1099.00,满减:每满1080减120"],[1636646400000,1099.0000,""],[1636732800000,1199.00,""],[1636819200000,1199.00,""],[1636905600000,1199.00,""],[1636992000000,1199.00,""],[1637078400000,1099.00,""],[1637164800000,1099.00,""],[1637251200000,1099.00,""],[1637337600000,1099.00,""],[1637424000000,1099.00,"1099元"],[1637510400000,1099.00,""],[1637596800000,1099.0,""],[1637683200000,1099.00,""],[1637769600000,1099.00,""],[1637856000000,1099.00,""],[1637942400000,1099.00,""],[1638028800000,1199.0000,""],[1638115200000,1199.00,""],[1638201600000,1199.00,""],[1638288000000,1099.0,""],[1638374400000,1099.00,""],[1638460800000,1099.00,""],[1638547200000,1199.0,""],[1638633600000,1199.00,""],[1638720000000,1099.0,""],[1638806400000,1099.00,""],[1638892800000,1099.00,""],[1638979200000,1099.00,"1099元"],[1639065600000,1099.00,""],[1639152000000,1089.0,""],[1639238400000,1089.0,""],[1639324800000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639411200000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639497600000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639584000000,1099.00,""],[1639670400000,1099.0,""],[1639756800000,1099.0,""],[1639843200000,1099.0,""],[1639929600000,1099.00,"1099元"],[1640016000000,1099.0,""],[1640102400000,1099.0,""],[1640188800000,1099.0,""],[1640275200000,1099.00,"1099元"],[1640361600000,1099.0,""],[1640448000000,1069.0,""],[1640534400000,1099.0,""],[1640620800000,1099.0,""],[1640707200000,1099.0,""],[1640793600000,1099.0,""],[1640880000000,1099.0,""],[1640966400000,1099.00,"1099元"],[1641052800000,1099.0,""],[1641139200000,1099.0,""],[1641225600000,1099.0,""],[1641312000000,1099.0,""],[1641398400000,1099.0,""
],[1641484800000,1099.0,""],[1641571200000,1099.0,""],[1641657600000,1099.00,"1099元"],[1641744000000,1099.0,""],[1641830400000,1099.0,""],[1641916800000,1099.0,""],[1642003200000,1099.0,""],[1642089600000,1099.0,""],[1642176000000,1099.0,""],[1642262400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642348800000,1099.00,""],[1642435200000,1099.00,""],[1642521600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642608000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642694400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642780800000,1099.0,""],[1642867200000,1099.0,""],[1642953600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643040000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643126400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643212800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643299200000,1099.0,""],[1643385600000,1099.0,""],[1643472000000,1099.0,""],[1643558400000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643644800000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643731200000,1099.0,""],[1643817600000,1099.0,""],[1643904000000,1099.0,""],[1643990400000,1099.0,""],[1644076800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644163200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644249600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644336000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644422400000,1099.00,""],[1644508800000,1099.00,"1099元"],[1644595200000,1199.00,"1199元"],[1644681600000,1199.00,"1089元"],[1644768000000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644854400000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644940800000,1099.00,"1049元"],[1645027200000,1099.00,""],[1645113600000,1099.00,""],[1645200000000,1099.00,""],[1645286400000,1099.00,""],[1645372800000,1099.00,""],[1645459200000,1099.00,""],[1645545600000,1099.00,""],[1645632000000,1099.00,""],[1645718400000,1099.00,"1099元"],[1645804800000,1099.00,"1069元"],[1645891200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1645977600000,1099.00,""],[1646064000000,1049.00,"购买1件,当前价:1099.00,
满减:满1050减50"],[1646150400000,1099.00,""],[1646236800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646323200000,1099.00,""],[1646409600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646496000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646582400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646668800000,899.00,"购买1件,当前价:1099.00,满减:每满1080减200"],[1646755200000,1099.00,""],[1646841600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646928000000,1099.0,""],[1647014400000,1099.0,""],[1647100800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647187200000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647273600000,1099.00,""],[1647360000000,1099.00,"1099元"],[1647446400000,1049.0,"购买1件,当前价:1099.0,满减:满1050减50"],[1647532800000,1099.0,""],[1647619200000,1099.00,""],[1647705600000,1099.00,""],[1647792000000,1099.00,""],[1647878400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647964800000,1099.00,""],[1648051200000,1099.00,""],[1648137600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648224000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648310400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648396800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648483200000,1039.00,"购买1件,当前价:1089.00,满减:满1050减50"],[1648569600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648656000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648742400000,1049.00,""],[1648828800000,1049.00,""],[1648915200000,1049.00,"1049元"],[1649001600000,1099.00,""],[1649088000000,1049.00,""],[1649174400000,1049.00,""],[1649260800000,1049.00,""],[1649347200000,1049.00,""],[1649433600000,1099.00,""],[1649520000000,1099.00,""],[1649606400000,1099.00,""],[1649692800000,1049.0,"购买1件,当前价格1049"],[1649779200000,1099.00,""],[1649865600000,1099.00,""],[1649952000000,1049.00,"1049元"],[1650038400000,1099.00,""],[1650124800000,1049.00,""],[1650211200000,1049.00,""],[1650297600000,1049.00,""],[1650384000000,1049.00,""],[1650470400000,1099.00,"1099元"],[1650556800000,1099.00,""],[1650643200000,1099.00,""],[1650729600000,1099.00,""],[1650816000000,1099.00,"
"],[1650902400000,1099.00,""],[1650988800000,1099.00,""],[1651075200000,1099.00,"1099元"],[1651161600000,1099.00,""],[1651248000000,1099.00,""],[1651334400000,1049.00,""],[1651420800000,1099.00,""],[1651507200000,1099.0000,""],[1651593600000,1099.0000,""],[1651680000000,1099.00,"1099元"],[1651766400000,1089.0,"购买1件,当前价格1089"],[1651852800000,1049.00,""],[1651939200000,1089.00,""],[1652025600000,1099.00,"1099元"],[1652112000000,1089.00,""],[1652198400000,1049.00,""],[1652284800000,1089.00,""],[1652371200000,1049.00,""],[1652457600000,1099.00,""],[1652544000000,1099.00,""],[1652630400000,1099.00,""],[1652716800000,1099.00,"1099元"],[1652803200000,1099.00,""],[1652889600000,1099.00,""],[1652976000000,1049.00,""],[1653062400000,1099.00,""],[1653148800000,1099.00,""],[1653235200000,1099.00,"1099元"],[1653321600000,1099.00,""],[1653408000000,1099.00,""],[1653494400000,1099.00,""],[1653580800000,1099.00,""],[1653667200000,1099.00,""],[1653753600000,1099.00,""],[1653840000000,1099.00,""],[1653926400000,1049.0,""],[1654012800000,1049.00,""],[1654099200000,1049.00,""],[1654185600000,1049.00,""],[1654272000000,1049.00,""],[1654358400000,1049.00,""],[1654444800000,1049.00,""],[1654531200000,1049.00,""],[1654617600000,1049.00,""],[1654704000000,1049.00,""],[1654790400000,1049.00,""],[1654876800000,1049.00,""],[1654963200000,1049.00,""],[1655049600000,1049.00,""],[1655136000000,1049.00,""],[1655222400000,1049.00,""],[1655308800000,1049.00,"1049元"],[1655395200000,1049.00,""],[1655481600000,1049.00,""],[1655568000000,1049.00,""],[1655654400000,1049.00,""],[1655740800000,1049.00,""],[1655827200000,1049.00,"1049元"],[1655913600000,1049.00,""],[1656000000000,1049.00,""],[1656086400000,1049.00,""],[1656172800000,1049.00,"1049元"],[1656259200000,1049.00,""],[1656345600000,1049.00,""],[1656432000000,1049.00,""],[1656518400000,1049.00,""],[1656604800000,1049.00,"1049元"],[1656691200000,1049.00,""],[1656777600000,1049.00,""],[1656864000000,1049.00,"1049元"],[1656950400000,1049.00,""],[1657036800000,
1049.00,""],[1657123200000,1049.00,""],[1657209600000,1049.00,""],[1657296000000,1049.00,""],[1657382400000,1049.00,""],[1657468800000,1049.00,""],[1657555200000,1049.00,""],[1657641600000,1049.00,""],[1657728000000,1049.00,""],[1657814400000,1049.00,""],[1657900800000,1049.00,"1049元"],[1657987200000,1049.00,""],[1658073600000,1049.00,""],[1658160000000,1049.00,""],[1658246400000,1049.00,""],[1658332800000,1049.00,"1049元"],[1658419200000,1049.00,""],[1658505600000,1049.00,""],[1658592000000,1049.00,""],[1658678400000,1049.00,""],[1658764800000,1049.00,""],[1658851200000,1049.00,""]

Output like the above indicates that the program ran correctly.
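As a quick sanity check on output of this shape, the sketch below assumes each record is a three-element list of `[timestamp in milliseconds, price, note]`. That layout is inferred from the sample output, not confirmed by the article, and `parse_record` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_record(record):
    """Turn one [timestamp_ms, price, note] record into a readable dict.

    The three-field layout is an assumption based on the sample output.
    """
    ts_ms, price, note = record
    # The timestamps look like epoch milliseconds, so divide by 1000 for seconds.
    moment = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return {"date": moment.isoformat(), "price": float(price), "note": note}


# Two records copied from the sample output.
sample = [
    [1646150400000, 1099.00, ""],
    [1647360000000, 1099.00, "1099元"],
]

for row in sample:
    print(parse_record(row))
```

The dates are printed in UTC; convert to a local time zone if the source site reports prices by local calendar day.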

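PS: To persist the data to MySQL, MongoDB, or Elasticsearch, you only need to write the corresponding ItemPipelines implementation and update the ITEM_PIPELINES setting that the spider loads. This keeps data persistence decoupled from the crawling logic.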
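A minimal sketch of such a pipeline follows. It uses the standard-library `sqlite3` module as a stand-in for MySQL so that it runs without extra dependencies. The class name `SQLitePipeline`, the table `price_history`, the item fields `ts`, `price`, and `note`, and the module path in `ITEM_PIPELINES` are all assumptions for illustration, not taken from the article's project. The `open_spider`, `process_item`, and `close_spider` methods are the hooks Scrapy calls on an item pipeline.

```python
import sqlite3


class SQLitePipeline:
    """Hypothetical pipeline that stores each item as one row in SQLite."""

    def __init__(self, db_path=":memory:"):
        # db_path is an assumed parameter; ":memory:" keeps the sketch self-contained.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the connection, create the table.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS price_history "
            "(ts INTEGER, price REAL, note TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every item the spider yields.
        row = dict(item)
        self.conn.execute(
            "INSERT INTO price_history (ts, price, note) VALUES (?, ?, ?)",
            (row.get("ts"), row.get("price"), row.get("note", "")),
        )
        self.conn.commit()
        return item  # hand the item on to the next pipeline, if any

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.conn.close()


# In settings.py (or the spider's custom_settings). The dotted path is an
# assumption about where the class lives; the number sets the run order
# (lower values run first).
ITEM_PIPELINES = {
    "MySpider.pipelines.SQLitePipeline": 300,
}
```

Switching to MySQL or MongoDB would mean replacing only the connection and insert calls inside this class; the spider code stays unchanged.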

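Conclusion

This article briefly explained the reasons for using Scrapy and walked through an example of developing a Scrapy crawler project. More Scrapy content will follow in later posts.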

