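Learning Python from Zero: Project 1, Day 01, Web Scraper
- 1. Importing the Code
- 2. Core Libraries Used
- 3. Functional Test
- 3.1 Initial Code
- 3.2 Creating a New File
- 3.3 Debugging the Code
- 4. Parsing Page Elements
- 4.1 The Web Page
- 4.2 Modifying the Code
- 4.3 Child Pages
- 4.4 Modifying the Code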
1. Importing the Code
Create a new Python project in InsCode, open a terminal, paste the command below, and press Enter to clone the project.
git clone https://gitee.com/52itstyle/Python.git
This is a Python project found on Gitee; the URL above is the project's source repository.
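2. Core Libraries Used
requests is a third-party library. Its API is more concise than Python's built-in urllib, and it handles the common HTTP request methods (GET, POST, and so on).
# Import the requests library
import requests
# File and directory operations
import os
# bs4 (BeautifulSoup 4) is one of the most commonly used libraries for Python scrapers; it parses HTML tags.
import bs4
from bs4 import BeautifulSoup
# Basic system library
import sys
# Carried over from the original script as an encoding workaround. Python 3 strings are already Unicode, so this reload is likely unnecessary.
import importlib
importlib.reload(sys)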
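As a rough illustration of the requests API mentioned above, the sketch below builds a GET request without sending it, so it runs offline. The query parameters and the User-Agent value are illustrative assumptions and are not taken from the original script.

```python
import requests

# Hypothetical header; some sites reject requests without a browser-like User-Agent.
headers = {"User-Agent": "Mozilla/5.0"}

# Illustrative query parameters, modeled on the album URLs shown later in this post.
params = {"tn": "albumsdetail", "album_id": "7", "rn": "30"}

# Build the request object and prepare it; nothing is sent over the network.
request = requests.Request(
    "GET",
    "https://image.baidu.com/search/albumsdetail",
    params=params,
    headers=headers,
)
prepared = request.prepare()

# requests encodes the params dict into the query string for us.
print(prepared.method, prepared.url)
```

Calling requests.get(url, params=params, headers=headers) would build the same request and actually send it.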
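3. Functional Test
3.1 Initial Code
The initial code is located in Python/Day01/脚本. Open a terminal and run:
# Change directory
cd Python/Day01/脚本
# Output
/root/Python_02/Python/Day01/脚本
# Run the script
python3 mzitu_linux.py
# Output (error)
  File "/root/Python_02/Python/Day01/脚本/mzitu_linux.py", line 21
    save_path = '/mnt/data/mzitu'
                                 ^
SyntaxError: invalid non-printable character U+200B
# Open mzitu_linux.py, go to line 21 of the original code, and change save_path (retyping the line also removes the invisible U+200B zero-width space)
save_path = './picture'
# Uncomment lines 56, 68, and 72
# Run again
python3 mzitu_linux.py
# Very slow; pasting the site URL into a browser shows that access is refused outright
Press Ctrl+C to stop the script.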
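The U+200B error above comes from an invisible zero-width space in the source file. A minimal sketch for locating and removing such characters follows; the set of characters checked and the helper names are assumptions chosen for illustration.

```python
# Zero-width and other invisible characters that can sneak in when code
# is copied from web pages (this particular set is an assumption).
INVISIBLE_CHARS = ("\u200b", "\u200c", "\u200d", "\ufeff")


def strip_invisible(text: str) -> str:
    """Return text with the invisible characters above removed."""
    for ch in INVISIBLE_CHARS:
        text = text.replace(ch, "")
    return text


def find_invisible(text: str):
    """Return (line_number, column, code_point) for each invisible character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in INVISIBLE_CHARS:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


# A two-line sample with a zero-width space at the end of the second line.
sample = "import os\nsave_path = '/mnt/data/mzitu'\u200b\n"
print(find_invisible(sample))
print(repr(strip_invisible(sample)))
```

To clean a real file, read it with open(path, encoding="utf-8"), pass the contents through strip_invisible, and write the result back.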
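3.2 Creating a New File
Next to the 脚本 directory (that is, under Python/Day01), create a learn folder, create spider.py inside it, and copy the contents of mzitu_linux.py into that file.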
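3.3 Debugging the Code
# Problem 1: the site is not accessible. Fix: change the URL to scrape
# Go to line 18 of the code
mziTu = 'https://image.baidu.com/'
# Run in the terminal
cd ../
cd learn/
python3 spider.py
# Output (error)
Traceback (most recent call last):
  File "/root/Python_02/Python/Day01/learn/spider.py", line 106, in <module>
    main()
  File "/root/Python_02/Python/Day01/learn/spider.py", line 90, in main
    img_max = soup.find('div', class_='nav-links').find_all('a')[3].text
AttributeError: 'NoneType' object has no attribute 'find_all'
This error is expected: after switching to a different site, the page structure is different, so the existing parsing code no longer matches. The next steps fix the parsing one piece at a time.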
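The AttributeError appears because find() returns None when nothing matches, and the chained find_all() is then called on None. A minimal sketch of guarding against that is shown here; the helper name find_links and both HTML snippets are hypothetical.

```python
from bs4 import BeautifulSoup


def find_links(html: str, container_class: str):
    """Return the <a> tags inside the first <div> with the given class, or [] if absent."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=container_class)
    if container is None:
        # find() returns None when nothing matches, so chaining .find_all()
        # directly would raise the AttributeError seen in the traceback.
        return []
    return container.find_all("a")


# Hypothetical markup: the old site had a nav-links div, the new one does not.
old_page = '<div class="nav-links"><a href="/page/2">2</a></div>'
new_page = '<div id="bd-home-content-album"><a href="/x">album</a></div>'

print(len(find_links(old_page, "nav-links")))
print(len(find_links(new_page, "nav-links")))
```

Returning an empty list keeps the caller's loop simple; raising a descriptive exception instead would be equally reasonable if a missing container should stop the run.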
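4. Parsing Page Elements
4.1 The Web Page
# Open the Baidu Images site
https://image.baidu.com/
Press F12 to open the developer tools and switch to the Elements tab. Press Ctrl+Shift+C, click one of the album images, and inspect the element.
Select the <a> tag, right-click, and choose Copy > Copy element. Pay attention to target and href: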
<a class="bd-home-content-album-item
" target="_blank" href="https://image.baidu.com/search/albumsdetail?tn=albumsdetail&word=%E5%9F%8E%E5%B8%82%E5%BB%BA%E7%AD%91%E6%91%84%E5%BD%B1%E4%B8%93%E9%A2%98&fr=searchindex_album%20&album_tab=%E5%BB%BA%E7%AD%91&album_id=7&rn=30" data-type="0"> <div class="bd-home-content-album-item-pic" style="background-image: url(https://t7.baidu.com/it/u=1595072465,3644073269&fm=193&f=GIF); background-color: #EACFC5"> </div> <div class="bd-home-content-album-item-inner-border"></div> <div class="bd-home-content-album-item-title"> 城市建筑摄影专题 </div>
</a>
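Select the <a> tag, right-click, and choose Copy > Copy selector to copy its selector:
#bd-home-content-album > a:nth-child(1)
From this we can infer that the unique id 'bd-home-content-album' locates the <div> that contains all the <a> tags, and that the <a> tag copied here is the first child element of that parent.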
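The selector reasoning above can be tried offline on a small piece of markup. This is a sketch only: the HTML below is a simplified, hypothetical stand-in for the real page, and the example.com links are placeholders.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup modeled on the structure described above.
html = """
<div id="bd-home-content-album">
  <a class="bd-home-content-album-item" target="_blank" href="https://example.com/album/1">A</a>
  <a class="bd-home-content-album-item" target="_blank" href="https://example.com/album/2">B</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector form: every <a> that is a direct child of the container.
by_selector = soup.select("#bd-home-content-album > a")

# find/find_all form: locate the container by id, then filter by attribute.
container = soup.find("div", id="bd-home-content-album")
by_find = container.find_all("a", target="_blank")

hrefs = [a["href"] for a in by_selector]
print(hrefs)

# nth-child(1) picks the <a> that is the first child element of its parent.
first = soup.select_one("#bd-home-content-album > a:nth-child(1)")
print(first["href"])
```

Both forms return the same two tags here; the find/find_all form matches the style already used in the script.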
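4.2 Modifying the Code
# Modify line 39
# Get the album URLs on the page
all_a = soup_sub.find('div', id='bd-home-content-album').find_all('a', target='_blank')
# Modify the main function; this page has no pagination
def main():
    res = requests.get(mziTu, headers=headers)
    # Parse with the built-in html.parser
    soup = BeautifulSoup(res.text, 'html.parser')
    # Create the output folder
    createFile(save_path)
    file = save_path
    createFile(file)
    print("开始执行")
    download(mziTu, file)
Switch to the terminal and run the script:
python3 spider.py
# Output with errors (log labels: 开始执行 = "start", 内页第几页 = "inner page number", 套图地址 = "album URL")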
开始执行
内页第几页:2
套图地址:https://image.baidu.com/search/albumsdetail?tn=albumsdetail&word=%E6%B8%90%E5%8F%98%E9%A3%8E%E6%A0%BC%E6%8F%92%E7%94%BB&fr=albumslist&album_tab=%E8%AE%BE%E8%AE%A1%E7%B4%A0%E6%9D%90&album_id=409&rn=30
'NoneType' object has no attribute 'find_all'
内页第几页:4
套图地址:https://image.baidu.com/search/albumsdetail?tn=albumsdetail&word=%E5%AE%A0%E7%89%A9%E5%9B%BE%E7%89%87&fr=albumslist&album_tab=%E5%8A%A8%E7%89%A9&album_id=688&rn=30
'NoneType' object has no attribute 'find_all'
内页第几页:6
套图地址:https://image.baidu.com/search/albumslist?tn=albumslist&word=%E4%BA%BA%E7%89%A9&album_tab=%E4%BA%BA%E7%89%A9&rn=15&fr=searchindex_album
'NoneType' object has no attribute 'find_all'
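The elements on the parent page differ from what the original code expects, and so do those on the child pages, so we keep modifying the code.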
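4.3 Child Pages
Copy one of the printed album URLs to open the child page, then use the same steps to locate an image on the child page: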
<a class="albumsdetail-item" href="/search/detail?tn=baiduimagedetail&word=%E5%9F%8E%E5%B8%82%E5%BB%BA%E7%AD%91%E6%91%84%E5%BD%B1%E4%B8%93%E9%A2%98&album_tab=%E5%BB%BA%E7%AD%91&album_id=7&ie=utf-8&fr=albumsdetail&cs=1595072465,3644073269&pi=3977&pn=0&ic=0&objurl=https%3A%2F%2Ft7.baidu.com%2Fit%2Fu%3D1595072465%2C3644073269%26fm%3D193%26f%3DGIF" target="_blank" data-index="0" width="310.4" style="width: 310.4px; height: 310px;"><img class="albumsdetail-item-img" src="https://t7.baidu.com/it/u=1595072465,3644073269&fm=193&f=GIF" style="width: 310.4px; height: 310px; background-color: rgb(234, 207, 197);"><div class="albumsdetail-item-inner-border"></div>
</a>
Element selector:
#imgList > div:nth-child(1) > a:nth-child(1)
Selector for the image-count element:
#bd-albumsdetail-content > div.albumsdetail-cover.clearfix > div.albumsdetail-info > div.albumsdetail-info-text > p.albumsdetail-info-num > span
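The href on the child-page <a> tag is relative and carries the image address in its objurl query parameter. A small sketch using only the standard library shows how such a link could be resolved and decoded; the href here is a shortened copy of the one above, and BASE_URL is an assumed site root.

```python
from urllib.parse import urljoin, urlsplit, parse_qs

# Assumed site root used to resolve relative links.
BASE_URL = "https://image.baidu.com/"

# Shortened copy of the relative href from the <a class="albumsdetail-item"> tag.
href = (
    "/search/detail?tn=baiduimagedetail&album_id=7"
    "&objurl=https%3A%2F%2Ft7.baidu.com%2Fit%2Fu%3D1595072465%2C3644073269%26fm%3D193%26f%3DGIF"
)

# Resolve the relative link against the site root.
detail_url = urljoin(BASE_URL, href)

# parse_qs splits the query string and percent-decodes each value.
query = parse_qs(urlsplit(detail_url).query)
image_url = query["objurl"][0]

print(detail_url)
print(image_url)
```

The decoded objurl matches the src attribute of the <img> inside the same tag.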
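4.4 Modifying the Code
# Modify line 53. This value could also be read from the page element, but that is not the focus here, so it is assigned directly
# Get the maximum number of images in the album
pic_max = "791"
# Modify line 62
img = soup_sub_2.find('div', id='imgList').find('img')
# Switch to the terminal and run the code
python3 spider.py
# Output with errors (log labels: 开始执行 = "start", 内页第几页 = "inner page number", 套图地址 = "album URL", 套图数量 = "album image count", 子内页第几页 = "sub-page number")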
开始执行
内页第几页:2
套图地址:https://image.baidu.com/search/albumsdetail?tn=albumsdetail&word=%E6%B8%90%E5%8F%98%E9%A3%8E%E6%A0%BC%E6%8F%92%E7%94%BB&fr=albumslist&album_tab=%E8%AE%BE%E8%AE%A1%E7%B4%A0%E6%9D%90&album_id=409&rn=30
套图数量:791
子内页第几页:1
https://image.baidu.com/search/albumsdetail?tn=albumsdetail&word=%E6%B8%90%E5%8F%98%E9%A3%8E%E6%A0%BC%E6%8F%92%E7%94%BB&fr=albumslist&album_tab=%E8%AE%BE%E8%AE%A1%E7%B4%A0%E6%9D%90&album_id=409&rn=30/1
'NoneType' object has no attribute 'find'
内页第几页:4
套图地址:https://image.baidu.com/search/albumsdetail?tn=albumsdetail&word=%E5%AE%A0%E7%89%A9%E5%9B%BE%E7%89%87&fr=albumslist&album_tab=%E5%8A%A8%E7%89%A9&album_id=688&rn=30
套图数量:791
子内页第几页:1
https://image.baidu.com/search/albumsdetail?tn=albumsdetail&word=%E5%AE%A0%E7%89%A9%E5%9B%BE%E7%89%87&fr=albumslist&album_tab=%E5%8A%A8%E7%89%A9&album_id=688&rn=30/1
'NoneType' object has no attribute 'find'
内页第几页:6
套图地址:https://image.baidu.com/search/albumslist?tn=albumslist&word=%E4%BA%BA%E7%89%A9&album_tab=%E4%BA%BA%E7%89%A9&rn=15&fr=searchindex_album
套图数量:791
子内页第几页:1
https://image.baidu.com/search/albumslist?tn=albumslist&word=%E4%BA%BA%E7%89%A9&album_tab=%E4%BA%BA%E7%89%A9&rn=15&fr=searchindex_album/1
'NoneType' object has no attribute 'find'
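We searched using the copied element selector, so why was the element not found? Print the parent element to check:
# Insert at line 63 to print the parent element
print(soup_sub_2.find('div', id='bd-albumsdetail-content'))
# Run in the terminal
python3 spider.py
# Output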
<div id="bd-albumsdetail-content">
</div>
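That is the problem. The root cause is that the elements inside this div are rendered and loaded dynamically at runtime. A browser runs the page's scripts, so we can see the elements there, but the scraper only receives the initial HTML and cannot find them. We need another approach.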
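One way to spot this situation in code is to check whether the container exists in the raw HTML but has no child tags. The sketch below assumes that heuristic; the helper name is_empty_container and the rendered_html sample are hypothetical, while raw_html mirrors the printed output above.

```python
from bs4 import BeautifulSoup


def is_empty_container(html: str, container_id: str) -> bool:
    """True if the element exists in the raw HTML but contains no child tags."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find(id=container_id)
    if container is None:
        return False
    # find(True) matches any tag; None means only text or whitespace is inside.
    return container.find(True) is None


# What the scraper received, as printed above.
raw_html = '<div id="bd-albumsdetail-content">\n</div>'
# Hypothetical example of the same container after the browser has rendered it.
rendered_html = (
    '<div id="bd-albumsdetail-content">'
    '<div id="imgList"><img src="a.gif"></div>'
    "</div>"
)

print(is_empty_container(raw_html, "bd-albumsdetail-content"))
print(is_empty_container(rendered_html, "bd-albumsdetail-content"))
```

An empty container in the raw response is a hint, not proof, that the content is filled in by scripts; checking the Network tab remains the more direct way to confirm it.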
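We could have detected the dynamic rendering earlier:
Open the developer tools and switch to the Network tab. You can see many requests being sent; these request URLs actually come from ...
Look at the response of the first request. Pick any one of the image requests and copy its parameter, then press Ctrl+F in the Response tab to open the search box and locate where that value appears in the response.
The detailed data is as follows, with the formatting adjusted slightly: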
linkData: '[{\x22pid\x22:3977,\x22width\x22:1100,\x22height\x22:1100,\x22oriwidth\x22:1200,\x22oriheight\x22:1200,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=1595072465,3644073269&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/811557570\x22,\x22contSign\x22:\x221595072465,3644073269\x22},
{\x22pid\x22:3978,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=4198287529,2774471735&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.veer.com\\\/photo\\\/147317368?utm_source=baidu&utm_medium=imagesearch&chid=902\x22,\x22contSign\x22:\x224198287529,2774471735\x22},
{\x22pid\x22:3979,\x22width\x22:1200,\x22height\x22:813,\x22oriwidth\x22:1200,\x22oriheight\x22:813,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=1956604245,3662848045&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/809773493\x22,\x22contSign\x22:\x221956604245,3662848045\x22},
{\x22pid\x22:3980,\x22width\x22:1200,\x22height\x22:760,\x22oriwidth\x22:1200,\x22oriheight\x22:760,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=2529476510,3041785782&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/805192561\x22,\x22contSign\x22:\x222529476510,3041785782\x22},
{\x22pid\x22:3981,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=727460147,2222092211&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/811065917\x22,\x22contSign\x22:\x22727460147,2222092211\x22},
{\x22pid\x22:3982,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=2511982910,2454873241&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810968731\x22,\x22contSign\x22:\x222511982910,2454873241\x22},
{\x22pid\x22:3983,\x22width\x22:1200,\x22height\x22:686,\x22oriwidth\x22:1200,\x22oriheight\x22:686,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=825057118,3516313570&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810073156\x22,\x22contSign\x22:\x22825057118,3516313570\x22},
{\x22pid\x22:3984,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3435942975,1552946865&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/811932564\x22,\x22contSign\x22:\x223435942975,1552946865\x22},
{\x22pid\x22:3985,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3569419905,626536365&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/809770618\x22,\x22contSign\x22:\x223569419905,626536365\x22},
{\x22pid\x22:3986,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3779234486,1094031034&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810970358\x22,\x22contSign\x22:\x223779234486,1094031034\x22},
{\x22pid\x22:3987,\x22width\x22:1200,\x22height\x22:482,\x22oriwidth\x22:1200,\x22oriheight\x22:482,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=2397542458,3133539061&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/811063723\x22,\x22contSign\x22:\x222397542458,3133539061\x22},
{\x22pid\x22:3988,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=2763645735,2016465681&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/809771013\x22,\x22contSign\x22:\x222763645735,2016465681\x22},
{\x22pid\x22:3989,\x22width\x22:1149,\x22height\x22:1100,\x22oriwidth\x22:1200,\x22oriheight\x22:1149,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3911840071,2534614245&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810877786\x22,\x22contSign\x22:\x223911840071,2534614245\x22},
{\x22pid\x22:3990,\x22width\x22:1200,\x22height\x22:687,\x22oriwidth\x22:1200,\x22oriheight\x22:687,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3908717,2002330211&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810968672\x22,\x22contSign\x22:\x223908717,2002330211\x22},
{\x22pid\x22:3991,\x22width\x22:1200,\x22height\x22:799,\x22oriwidth\x22:1200,\x22oriheight\x22:799,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=318887420,2894941323&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810056726\x22,\x22contSign\x22:\x22318887420,2894941323\x22},
{\x22pid\x22:3992,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=1063451194,1129125124&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.veer.com\\\/photo\\\/146287060?utm_source=baidu&utm_medium=imagesearch&chid=902\x22,\x22contSign\x22:\x221063451194,1129125124\x22},
{\x22pid\x22:3993,\x22width\x22:800,\x22height\x22:1200,\x22oriwidth\x22:800,\x22oriheight\x22:1200,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3785402047,1898752523&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810970018\x22,\x22contSign\x22:\x223785402047,1898752523\x22},
{\x22pid\x22:3994,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3691080281,11347921&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/809782140\x22,\x22contSign\x22:\x223691080281,11347921\x22},
{\x22pid\x22:3995,\x22width\x22:1200,\x22height\x22:799,\x22oriwidth\x22:1200,\x22oriheight\x22:799,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=2374506090,1216769752&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.veer.com\\\/photo\\\/146290795?utm_source=baidu&utm_medium=imagesearch&chid=902\x22,\x22contSign\x22:\x222374506090,1216769752\x22},
{\x22pid\x22:3996,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=1285847167,3193778276&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/809771315\x22,\x22contSign\x22:\x221285847167,3193778276\x22},
{\x22pid\x22:3997,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3251197759,2520670799&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/814059806\x22,\x22contSign\x22:\x223251197759,2520670799\x22},
{\x22pid\x22:3998,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=602106375,407124525&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/813923414\x22,\x22contSign\x22:\x22602106375,407124525\x22},
{\x22pid\x22:3999,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=2906406936,2666005453&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/811706433\x22,\x22contSign\x22:\x222906406936,2666005453\x22},
{\x22pid\x22:4000,\x22width\x22:1200,\x22height\x22:798,\x22oriwidth\x22:1200,\x22oriheight\x22:798,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3124693600,356058981&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/805197127\x22,\x22contSign\x22:\x223124693600,356058981\x22},
{\x22pid\x22:4001,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=3646282624,1156077026&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810999167\x22,\x22contSign\x22:\x223646282624,1156077026\x22},
{\x22pid\x22:4002,\x22width\x22:1200,\x22height\x22:797,\x22oriwidth\x22:1200,\x22oriheight\x22:797,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=4158958181,280757487&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810880655\x22,\x22contSign\x22:\x224158958181,280757487\x22},
{\x22pid\x22:4003,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=2371362259,3988640650&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/809782065\x22,\x22contSign\x22:\x222371362259,3988640650\x22},
{\x22pid\x22:4004,\x22width\x22:800,\x22height\x22:1200,\x22oriwidth\x22:800,\x22oriheight\x22:1200,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=355704943,1318565630&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/810998065\x22,\x22contSign\x22:\x22355704943,1318565630\x22},
{\x22pid\x22:4005,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=655876807,3707807800&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/809770741\x22,\x22contSign\x22:\x22655876807,3707807800\x22},
{\x22pid\x22:4006,\x22width\x22:1200,\x22height\x22:800,\x22oriwidth\x22:1200,\x22oriheight\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=1423490396,3473826719&fm=193&f=GIF\x22,\x22fromUrl\x22:\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/811796379\x22,\x22contSign\x22:\x221423490396,3473826719\x22}]',
Take a single record and look at it:
{\x22pid\x22:4006,
\x22width\x22:1200,
\x22height\x22:800,
\x22oriwidth\x22:1200,
\x22oriheight\x22:800,
\x22thumbnailUrl\x22:
\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=1423490396,3473826719&fm=193&f=GIF\x22,
\x22fromUrl\x22:
\x22https:\\\/\\\/www.vcg.com\\\/creative\\\/811796379\x22,\x22contSign\x22:\x221423490396,3473826719\x22}]',
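The record above looks like JSON stored inside a JavaScript string literal, with quotes written as \x22 and slashes as \\\/. Under that assumption, a minimal sketch for turning it into Python objects is given here. The helper decode_js_string is hypothetical and only handles the escapes visible in this sample; the sample string is a shortened copy of the last record.

```python
import json
import re


def decode_js_string(raw: str) -> str:
    """Undo the hex and backslash escapes of a JavaScript string literal body.

    Only the escapes seen in the sample are handled (an assumption);
    other escape sequences such as newlines or Unicode escapes are not interpreted.
    """
    def replace(match):
        token = match.group(1)
        if token[0] == "x" and len(token) == 3:
            return chr(int(token[1:], 16))  # \x22 becomes a double quote
        return token  # a doubled backslash becomes one; \/ becomes /

    return re.sub(r"\\(x[0-9a-fA-F]{2}|.)", replace, raw)


# Shortened copy of one linkData record, kept as a raw string so the
# backslashes stay exactly as they appear in the page source.
raw = r'[{\x22pid\x22:4006,\x22width\x22:1200,\x22height\x22:800,\x22thumbnailUrl\x22:\x22https:\\\/\\\/t7.baidu.com\\\/it\\\/u=1423490396,3473826719&fm=193&f=GIF\x22}]'

# After unescaping, the text is ordinary JSON (json.loads handles the remaining \/).
records = json.loads(decode_js_string(raw))
for record in records:
    print(record["pid"], record["thumbnailUrl"])
```

Each decoded record exposes thumbnailUrl directly, which avoids parsing the dynamically rendered elements at all.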
To be continued in the next post.