Real-Time Monitoring of COVID-19 Outbreak Data (Python crawler + pyecharts data visualization + wordcloud word clouds)


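Contents

    • 【1x00】Preface
    • 【2x00】Mind map
    • 【3x00】Data structure analysis
    • 【4x00】Main function main()
    • 【5x00】Data retrieval module data_get
      • 【5x01】Initialization function init()
      • 【5x02】China totals china_total_data()
      • 【5x03】Global totals global_total_data()
      • 【5x04】China daily data china_daily_data()
      • 【5x05】Foreign daily data foreign_daily_data()
    • 【6x00】Word cloud module data_wordcloud
      • 【6x01】China cumulative confirmed word cloud china_wordcloud()
      • 【6x02】Global cumulative confirmed word cloud global_wordcloud()
    • 【7x00】Map module data_map
      • 【7x01】China cumulative confirmed map china_total_map()
      • 【7x02】Global cumulative confirmed map global_total_map()
      • 【7x03】China daily data line chart china_daily_map()
      • 【7x04】Foreign daily data line chart foreign_daily_map()
    • 【8x00】Result screenshots
      • 【8x01】Data stored in Excel
      • 【8x02】Word clouds
      • 【8x03】Maps + line charts
    • 【9x00】Full code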


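This is a block of anti-crawler text; readers can ignore it.
This article was first published on CSDN by TRHX.
Blog home page: https://itrhx.blog.csdn.net/
Article link: https://itrhx.blog.csdn.net/article/details/107140534
Reproduction without authorization is prohibited. Please respect original work.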

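【1x00】Preface

I had wanted to build a real-time display of outbreak data two or three months ago, but kept putting it off for one reason or another, and only recently found the time to write it. The data is scraped with Python from Baidu's real-time epidemic big-data report: requests sends the request, XPath expressions parse the page, the wordcloud library draws the word clouds, pyecharts draws the maps and line charts, and the data is stored in Excel workbooks that are handled with openpyxl.

The program shows maps of cumulative confirmed cases and line charts of how the daily figures change. Other data can be fetched and displayed by extending the program. If the program is deployed on a server and run on a schedule, the displayed data stays up to date. The pyecharts charts can also be integrated into a web framework such as Django or Flask.

Two terms come up when fetching the data: "global" and "foreign". Global includes China; foreign does not. Four charts are drawn later: the China cumulative confirmed map, the global cumulative confirmed map (including China), the China daily data line chart, and the foreign daily data line chart (excluding China).

Note: the response you get by requesting the page directly does not contain daily data for each country. That data is served from: https://voice.baidu.com/newpneumonia/get?target=trend&isCaseIn=1&stage=publish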
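To make the query string of that address easier to read, here is a minimal sketch that rebuilds the URL from its parts using only the standard library. The article does not document what each parameter means, so the sketch only splits them out; the names BASE and params are my own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint and query parameters copied from the address quoted above.
BASE = "https://voice.baidu.com/newpneumonia/get"
params = {"target": "trend", "isCaseIn": "1", "stage": "publish"}

# urlencode keeps the insertion order of the dict, so the result matches the original URL.
trend_url = BASE + "?" + urlencode(params)

# Going the other way: split an existing URL back into its parameters.
query = parse_qs(urlsplit(trend_url).query)
print(trend_url)
print(query)
```

Keeping the parameters in a dict makes it easy to pass them to requests.get(BASE, params=params) instead of pasting the full URL.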

  • Preview: http://cov.itrhx.com/

  • Data source: https://voice.baidu.com/act/newpneumonia/newpneumonia/

  • pyecharts documentation: https://pyecharts.org/

  • openpyxl documentation: https://openpyxl.readthedocs.io/

  • wordcloud documentation: http://amueller.github.io/word_cloud/

【2x00】Mind map

[Image 01: mind map]

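【3x00】Data structure analysis

Baidu's outbreak data page shows a lot of neatly arranged figures, which are presumably the epidemic data. Save the page and format it, and it is easy to see that all of the data sits inside <script type="application/json" id="captain-config"></script>. The title fields contain Unicode escape sequences; converting them to Chinese makes it easier to tell the different categories of data apart.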
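As a small illustration of the Unicode conversion, the sketch below decodes a \uXXXX-escaped string with json.loads. The fragment in raw is made up for the example and is not copied from Baidu's page.

```python
import json

# A made-up fragment in the style of the escaped text found in the page source.
# The doubled backslashes keep the \uXXXX sequences as literal text in this string.
raw = '{"title": "\\u56fd\\u5185\\u75ab\\u60c5"}'

# json.loads decodes \uXXXX escapes as part of ordinary JSON parsing.
decoded = json.loads(raw)
title = decoded["title"]
print(title)
```

Because the whole captain-config block is JSON, parsing it once with json.loads converts every escaped title at the same time; no separate decoding pass is needed.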

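[Image 02]

Because there is so much data, it helps to pull out the main body, delete the repeated entries and other clutter, and keep only the rough position of each piece of data. Analysing that structure makes the later extraction easier. After this clean-up, the data looks roughly like this: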

<script type="application/json" id="captain-config">
{
    "component": [
        {
            "mapLastUpdatedTime": "2020.07.05 16:13",    // time the China data was last updated
            "caseList": [                                // caseList: a list whose elements are dicts
                {
                    "confirmed": "1",                    // each dict holds every metric for one Chinese province
                    "died": "0",
                    "crued": "1",
                    "relativeTime": "1593792000",
                    "confirmedRelative": "0",
                    "diedRelative": "0",
                    "curedRelative": "0",
                    "curConfirm": "0",
                    "curConfirmRelative": "0",
                    "icuDisable": "1",
                    "area": "西藏",
                    "subList": [                         // subList: a list whose elements are dicts
                        {
                            "city": "拉萨",              // each dict holds the data for one city of that province
                            "confirmed": "1",
                            "died": "0",
                            "crued": "1",
                            "confirmedRelative": "0",
                            "curConfirm": "0",
                            "cityCode": "100"
                        }
                    ]
                }
            ],
            "caseOutsideList": [                         // caseOutsideList: a list whose elements are dicts
                {
                    "confirmed": "241419",               // each dict holds every metric for one country
                    "died": "34854",
                    "crued": "191944",
                    "relativeTime": "1593792000",
                    "confirmedRelative": "223",
                    "curConfirm": "14621",
                    "icuDisable": "1",
                    "area": "意大利",
                    "subList": [                         // subList: a list whose elements are dicts
                        {
                            "city": "伦巴第",            // each dict holds the data for one city or region of that country
                            "confirmed": "94318",
                            "died": "16691",
                            "crued": "68201",
                            "curConfirm": "9426"
                        }
                    ]
                }
            ],
            "summaryDataIn": {                           // summaryDataIn: totals for China
                "confirmed": "85307",
                "died": "4648",
                "cured": "80144",
                "asymptomatic": "99",
                "asymptomaticRelative": "7",
                "unconfirmed": "7",
                "relativeTime": "1593792000",
                "confirmedRelative": "19",
                "unconfirmedRelative": "1",
                "curedRelative": "27",
                "diedRelative": "0",
                "icu": "6",
                "icuRelative": "0",
                "overseasInput": "1931",
                "unOverseasInputCumulative": "83375",
                "overseasInputRelative": "6",
                "unOverseasInputNewAdd": "13",
                "curConfirm": "515",
                "curConfirmRelative": "-8",
                "icuDisable": "1"
            },
            "summaryDataOut": {                          // summaryDataOut: totals outside China
                "confirmed": "11302569",
                "died": "528977",
                "curConfirm": "4410601",
                "cured": "6362991",
                "confirmedRelative": "206165",
                "curedRelative": "190018",
                "diedRelative": "4876",
                "curConfirmRelative": "11271",
                "relativeTime": "1593792000"
            },
            "trend": {                                   // trend: a dict with the daily data for China
                "updateDate": [],                        // dates
                "list": [                                // list: each series and its values
                    {"name": "确诊", "data": []},
                    {"name": "疑似", "data": []},
                    {"name": "治愈", "data": []},
                    {"name": "死亡", "data": []},
                    {"name": "新增确诊", "data": []},
                    {"name": "新增疑似", "data": []},
                    {"name": "新增治愈", "data": []},
                    {"name": "新增死亡", "data": []},
                    {"name": "累计境外输入", "data": []},
                    {"name": "新增境外输入", "data": []}
                ]
            },
            "foreignLastUpdatedTime": "2020.07.05 16:13",    // time the foreign data was last updated
            "globalList": [                              // globalList: a list whose elements are dicts
                {
                    "area": "亚洲",                      // grouped by continent
                    "subList": [                         // subList: data for each country of the continent
                        {
                            "died": "52",
                            "confirmed": "6159",
                            "crued": "4809",
                            "curConfirm": "1298",
                            "confirmedRelative": "0",
                            "relativeTime": "1593792000",
                            "country": "塔吉克斯坦"
                        }
                    ],
                    "died": "56556",                     // totals for the continent
                    "crued": "1625562",
                    "confirmed": "2447873",
                    "curConfirm": "765755",
                    "confirmedRelative": "60574"
                },
                {
                    "area": "其他",                      // data for other special areas
                    "subList": [
                        {
                            "died": "13",
                            "confirmed": "712",
                            "crued": "651",
                            "curConfirm": "48",
                            "confirmedRelative": "0",
                            "relativeTime": "1593792000",
                            "country": "钻石公主号邮轮"
                        }
                    ],
                    "died": "13",                        // totals for the other special areas
                    "crued": "651",
                    "confirmed": "712",
                    "curConfirm": "48",
                    "confirmedRelative": "0"
                },
                {
                    "area": "热门",                      // data for the "hot" (most watched) countries
                    "subList": [
                        {
                            "died": "5206",
                            "confirmed": "204610",
                            "crued": "179492",
                            "curConfirm": "19912",
                            "confirmedRelative": "1172",
                            "relativeTime": "1593792000",
                            "country": "土耳其"
                        }
                    ],
                    "died": "528967",                    // totals for the "hot" countries
                    "crued": "6362924",
                    "confirmed": "11302357",
                    "confirmedRelative": "216478",
                    "curConfirm": "4410466"
                }
            ],
            "allForeignTrend": {                         // allForeignTrend: a dict with the daily data outside China
                "updateDate": [],                        // dates
                "list": [                                // list: each series and its values
                    {"name": "累计确诊", "data": []},
                    {"name": "治愈", "data": []},
                    {"name": "死亡", "data": []},
                    {"name": "现有确诊", "data": []},
                    {"name": "新增确诊", "data": []}
                ]
            },
            "topAddCountry": [                           // countries with the largest increase in confirmed cases
                {"name": "美国", "value": 53162}
            ],
            "topOverseasInput": [                        // provinces with the most imported cases
                {"name": "黑龙江", "value": 386}
            ]
        }
    ]
}
</script>
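To show how this structure is walked once it has been parsed, here is a minimal sketch on a heavily trimmed copy of the data above. The sample string keeps only a few keys, and the variable names are my own.

```python
import json

# A trimmed copy of the structure above, kept small so the example is self-contained.
sample = json.loads('''
{"component": [{
    "mapLastUpdatedTime": "2020.07.05 16:13",
    "caseList": [
        {"area": "西藏", "confirmed": "1",
         "subList": [{"city": "拉萨", "confirmed": "1"}]}
    ],
    "summaryDataIn": {"confirmed": "85307"}
}]}
''')

# Everything of interest lives in the first element of "component".
component = sample["component"][0]

# Province name -> cumulative confirmed (the values arrive as strings).
provinces = {p["area"]: int(p["confirmed"]) for p in component["caseList"]}

# (province, city) pairs taken from each province's subList.
cities = [(p["area"], c["city"])
          for p in component["caseList"]
          for c in p["subList"]]

print(component["mapLastUpdatedTime"], provinces, cities)
```

Note that every count is a string in the source data, so it has to be converted with int() before any arithmetic or sorting.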

【4x00】Main function main()

Data retrieval, word cloud drawing and map drawing are written in three separate files: data_get.py, data_wordcloud.py and data_map.py. A main file, main.py, then calls the functions in those three files.

import data_get
import data_wordcloud
import data_map

data_dict = data_get.init()
data_get.china_total_data(data_dict)
data_get.global_total_data(data_dict)
data_get.china_daily_data(data_dict)
data_get.foreign_daily_data(data_dict)

data_wordcloud.china_wordcloud()
data_wordcloud.global_wordcloud()

data_map.all_map()

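【5x00】Data retrieval module data_get

【5x01】Initialization function init()

The XPath expression //script[@id="captain-config"]/text() extracts the content of the script tag, and json.loads converts it into a dict that the other functions can use.

import json

import openpyxl
import requests
from lxml import etree


def init():
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.13 Safari/537.36'
    }
    url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/'
    response = requests.get(url=url, headers=headers)
    tree = etree.HTML(response.text)
    # xpath() returns a list; its first item is the JSON text inside the script tag
    dict1 = tree.xpath('//script[@id="captain-config"]/text()')
    print(type(dict1[0]))           # debug output: type of the extracted text
    dict2 = json.loads(dict1[0])    # parse the JSON text into a dict
    return dict2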
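For anyone who wants to try the extraction step without network access or lxml, here is a sketch that does the same job with the standard library's html.parser. This is a substitute technique, not the method used in this article; the class name CaptainConfigParser, the function extract_config and the sample HTML are all made up for the example.

```python
import json
from html.parser import HTMLParser


class CaptainConfigParser(HTMLParser):
    """Collects the text inside <script id="captain-config">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "script" and dict(attrs).get("id") == "captain-config":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_config(html):
    """Return the parsed JSON from the captain-config script tag."""
    parser = CaptainConfigParser()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    if not text:
        raise ValueError("captain-config script tag not found")
    return json.loads(text)


# Offline sample that stands in for response.text
sample_html = (
    '<html><body><script type="application/json" id="captain-config">'
    '{"component": [{"mapLastUpdatedTime": "2020.07.05 16:13"}]}'
    '</script></body></html>'
)
config = extract_config(sample_html)
print(config["component"][0]["mapLastUpdatedTime"])
```

Raising an error when the tag is missing gives a clearer failure than an IndexError on an empty list if the page layout ever changes.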

【5x02】China totals china_total_data()

def china_total_data(data):
    """
    1. Data for each Chinese province / municipality / autonomous region / administrative region
    Region name: area
    Current confirmed: curConfirm
    Cumulative confirmed: confirmed
    Cumulative cured: crued
    Cumulative deaths: died
    Change in current confirmed: curConfirmRelative
    Change in cumulative confirmed: confirmedRelative
    Change in cumulative cured: curedRelative
    Change in cumulative deaths: diedRelative
    """
    wb = openpyxl.Workbook()            # create the workbook
    ws_china = wb.active                # get the active worksheet
    ws_china.title = "中国省份疫情数据"   # name the worksheet (China province data)
    ws_china.append(['省/直辖市/自治区/行政区', '现有确诊', '累计确诊', '累计治愈',
                     '累计死亡', '现有确诊增量', '累计确诊增量',
                     '累计治愈增量', '累计死亡增量'])
    china = data['component'][0]['caseList']
    for province in china:
        ws_china.append([province['area'],
                         province['curConfirm'],
                         province['confirmed'],
                         province['crued'],
                         province['died'],
                         province['curConfirmRelative'],
                         province['confirmedRelative'],
                         province['curedRelative'],
                         province['diedRelative']])

    # 2. Data for each Chinese city
    #    City: city
    #    Current confirmed: curConfirm
    #    Cumulative confirmed: confirmed
    #    Cumulative cured: crued
    #    Cumulative deaths: died
    #    Change in cumulative confirmed: confirmedRelative
    ws_city = wb.create_sheet('中国城市疫情数据')    # worksheet: China city data
    ws_city.append(['城市', '现有确诊', '累计确诊',
                    '累计治愈', '累计死亡', '累计确诊增量'])
    for province in china:
        for city in province['subList']:
            # Some cities have no curConfirm field, so default it to '0';
            # empty crued and died values are replaced with '0' as well
            if 'curConfirm' not in city:
                city['curConfirm'] = '0'
            if city['crued'] == '':
                city['crued'] = '0'
            if city['died'] == '':
                city['died'] = '0'
            ws_city.append([city['city'], city['curConfirm'], city['confirmed'],
                            city['crued'], city['died'], city['confirmedRelative']])

    # 3. Time the China data was last updated: mapLastUpdatedTime
    time_domestic = data['component'][0]['mapLastUpdatedTime']
    ws_time = wb.create_sheet('中国疫情数据更新时间')    # worksheet: China update time
    ws_time.column_dimensions['A'].width = 22          # adjust the column width
    ws_time.append(['中国疫情数据更新时间'])
    ws_time.append([time_domestic])

    wb.save('COVID-19-China.xlsx')
    print('中国疫情数据已保存至 COVID-19-China.xlsx!')
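The missing-field handling for cities can also be written as a small pure function, which is easier to test than code that writes straight into a worksheet. This is only a sketch: the helper name city_row is my own, and treating a missing confirmedRelative as '0' is an assumption that goes slightly beyond what the function above does.

```python
def city_row(city):
    """Build one worksheet row, treating missing or empty fields as '0'."""
    def field(key):
        value = city.get(key, "")
        return value if value != "" else "0"

    return [city["city"], field("curConfirm"), field("confirmed"),
            field("crued"), field("died"), field("confirmedRelative")]


# A city entry with no curConfirm or confirmedRelative and an empty died value
row = city_row({"city": "拉萨", "confirmed": "1", "died": "", "crued": "1"})
print(row)
```

Unlike the loop above, this version does not modify the city dict in place, so the parsed data stays exactly as it was downloaded.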

【5x03】Global totals global_total_data()

After extracting the global totals, I found while drawing the map that China's data was missing. China's data therefore has to be inserted into the Excel file separately when the global data is written.

def global_total_data(data):
    """
    1. Data for each country
    Country: country
    Current confirmed: curConfirm
    Cumulative confirmed: confirmed
    Cumulative cured: crued
    Cumulative deaths: died
    Change in cumulative confirmed: confirmedRelative
    """
    wb = openpyxl.Workbook()
    ws_global = wb.active
    ws_global.title = "全球各国疫情数据"    # worksheet: data for every country

    # Save the data country by country
    countries = data['component'][0]['caseOutsideList']
    ws_global.append(['国家', '现有确诊', '累计确诊', '累计治愈', '累计死亡', '累计确诊增量'])
    for country in countries:
        ws_global.append([country['area'],
                          country['curConfirm'],
                          country['confirmed'],
                          country['crued'],
                          country['died'],
                          country['confirmedRelative']])

    # Save the data continent by continent, one worksheet per continent
    continent = data['component'][0]['globalList']
    for area in continent:
        ws_foreign = wb.create_sheet(area['area'] + '疫情数据')
        ws_foreign.append(['国家', '现有确诊', '累计确诊', '累计治愈', '累计死亡', '累计确诊增量'])
        for country in area['subList']:
            ws_foreign.append([country['country'],
                               country['curConfirm'],
                               country['confirmed'],
                               country['crued'],
                               country['died'],
                               country['confirmedRelative']])

    # Write China's data into the "全球各国疫情数据" and "亚洲疫情数据" sheets
    ws1, ws2 = wb['全球各国疫情数据'], wb['亚洲疫情数据']
    original_data = data['component'][0]['summaryDataIn']
    add_china_data = ['中国',
                      original_data['curConfirm'],
                      original_data['confirmed'],
                      original_data['cured'],
                      original_data['died'],
                      original_data['confirmedRelative']]
    ws1.append(add_china_data)
    ws2.append(add_china_data)

    # 2. Time the global data was last updated: foreignLastUpdatedTime
    time_foreign = data['component'][0]['foreignLastUpdatedTime']
    ws_time = wb.create_sheet('全球疫情数据更新时间')    # worksheet: global update time
    ws_time.column_dimensions['A'].width = 22          # adjust the column width
    ws_time.append(['全球疫情数据更新时间'])
    ws_time.append([time_foreign])

    wb.save('COVID-19-Global.xlsx')
    print('全球疫情数据已保存至 COVID-19-Global.xlsx!')
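The step of adding China to the foreign list is easy to get wrong because the two sources spell the cured key differently: summaryDataIn uses cured, while the per-country entries use crued. A minimal sketch of the merge on plain lists follows; the helper names are made up, and the figures are the ones from the sample structure in section 3x00.

```python
def country_row(country):
    # Entries in caseOutsideList spell the key "crued".
    return [country["area"], country["curConfirm"], country["confirmed"],
            country["crued"], country["died"], country["confirmedRelative"]]


def china_row(summary):
    # summaryDataIn spells the key "cured".
    return ["中国", summary["curConfirm"], summary["confirmed"],
            summary["cured"], summary["died"], summary["confirmedRelative"]]


def global_rows(component):
    """Foreign countries first, then China appended as one extra row."""
    rows = [country_row(c) for c in component["caseOutsideList"]]
    rows.append(china_row(component["summaryDataIn"]))
    return rows


component = {
    "caseOutsideList": [
        {"area": "意大利", "curConfirm": "14621", "confirmed": "241419",
         "crued": "191944", "died": "34854", "confirmedRelative": "223"},
    ],
    "summaryDataIn": {"curConfirm": "515", "confirmed": "85307",
                      "cured": "80144", "died": "4648", "confirmedRelative": "19"},
}
rows = global_rows(component)
print(rows)
```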

【5x04】China daily data china_daily_data()

def china_daily_data(data):
    """
    i_dict = data['component'][0]['trend']
    i_dict['updateDate']: dates
    i_dict['list'][0]: confirmed
    i_dict['list'][1]: suspected
    i_dict['list'][2]: cured
    i_dict['list'][3]: deaths
    i_dict['list'][4]: new confirmed
    i_dict['list'][5]: new suspected
    i_dict['list'][6]: new cured
    i_dict['list'][7]: new deaths
    i_dict['list'][8]: cumulative imported cases
    i_dict['list'][9]: new imported cases
    """
    ccd_dict = data['component'][0]['trend']
    update_date = ccd_dict['updateDate']              # dates
    china_confirmed = ccd_dict['list'][0]['data']     # daily cumulative confirmed
    china_crued = ccd_dict['list'][2]['data']         # daily cumulative cured
    china_died = ccd_dict['list'][3]['data']          # daily cumulative deaths
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')

    # Write the daily cumulative confirmed data
    ws_china_confirmed = wb.create_sheet('中国每日累计确诊数据')
    ws_china_confirmed.append(['日期', '数据'])
    for row in zip(update_date, china_confirmed):
        ws_china_confirmed.append(row)

    # Write the daily cumulative cured data
    ws_china_crued = wb.create_sheet('中国每日累计治愈数据')
    ws_china_crued.append(['日期', '数据'])
    for row in zip(update_date, china_crued):
        ws_china_crued.append(row)

    # Write the daily cumulative deaths data
    ws_china_died = wb.create_sheet('中国每日累计死亡数据')
    ws_china_died.append(['日期', '数据'])
    for row in zip(update_date, china_died):
        ws_china_died.append(row)

    wb.save('COVID-19-China.xlsx')
    print('中国每日累计确诊/治愈/死亡数据已保存至 COVID-19-China.xlsx!')
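The function above picks each series by its position in trend['list']. A possible alternative, sketched below, is to look a series up by its name field, so the code keeps working if the order of the list changes. The helper names are my own, and the dates and numbers in the sample are made up.

```python
def series_by_name(trend):
    """Map each series name (e.g. '确诊') to its list of daily values."""
    return {item["name"]: item["data"] for item in trend["list"]}


def dated_rows(trend, name):
    """Pair every date with the value of the named series on that date."""
    return list(zip(trend["updateDate"], series_by_name(trend)[name]))


# Made-up sample in the shape of data['component'][0]['trend']
trend = {
    "updateDate": ["1.20", "1.21"],
    "list": [
        {"name": "确诊", "data": [291, 440]},
        {"name": "疑似", "data": [54, 37]},
        {"name": "治愈", "data": [25, 28]},
    ],
}
cured_rows = dated_rows(trend, "治愈")
print(cured_rows)
```

Each tuple in the result can be passed to ws.append() directly, just like the rows produced by zip in the function above.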

【5x05】Foreign daily data foreign_daily_data()

def foreign_daily_data(data):
    """
    te_dict = data['component'][0]['allForeignTrend']
    te_dict['updateDate']: dates
    te_dict['list'][0]: cumulative confirmed
    te_dict['list'][1]: cured
    te_dict['list'][2]: deaths
    te_dict['list'][3]: current confirmed
    te_dict['list'][4]: new confirmed
    """
    te_dict = data['component'][0]['allForeignTrend']
    update_date = te_dict['updateDate']                # dates
    foreign_confirmed = te_dict['list'][0]['data']     # daily cumulative confirmed
    foreign_crued = te_dict['list'][1]['data']         # daily cumulative cured
    foreign_died = te_dict['list'][2]['data']          # daily cumulative deaths
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')

    # Write the daily cumulative confirmed data
    ws_foreign_confirmed = wb.create_sheet('境外每日累计确诊数据')
    ws_foreign_confirmed.append(['日期', '数据'])
    for row in zip(update_date, foreign_confirmed):
        ws_foreign_confirmed.append(row)

    # Write the daily cumulative cured data
    ws_foreign_crued = wb.create_sheet('境外每日累计治愈数据')
    ws_foreign_crued.append(['日期', '数据'])
    for row in zip(update_date, foreign_crued):
        ws_foreign_crued.append(row)

    # Write the daily cumulative deaths data
    ws_foreign_died = wb.create_sheet('境外每日累计死亡数据')
    ws_foreign_died.append(['日期', '数据'])
    for row in zip(update_date, foreign_died):
        ws_foreign_died.append(row)

    wb.save('COVID-19-Global.xlsx')
    print('境外每日累计确诊/治愈/死亡数据已保存至 COVID-19-Global.xlsx!')

【6x00】Word cloud module data_wordcloud

【6x01】China cumulative confirmed word cloud china_wordcloud()

import openpyxl
import wordcloud


def china_wordcloud():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')  # open the existing xlsx file
    ws_china = wb['中国省份疫情数据']                     # get the China province data sheet
    ws_china.delete_rows(1)                             # delete the first (header) row
    china_dict = {}                                     # province -> cumulative confirmed
    for data in ws_china.values:
        china_dict[data[0]] = int(data[2])
    word_cloud = wordcloud.WordCloud(font_path='C:/Windows/Fonts/simsun.ttc',   # a font with Chinese glyphs (Windows path)
                                     background_color='#CDC9C9',
                                     min_font_size=15,
                                     width=900, height=500)
    word_cloud.generate_from_frequencies(china_dict)
    word_cloud.to_file('WordCloud-China.png')
    print('中国省份疫情词云图绘制完毕!')

【6x02】Global cumulative confirmed word cloud global_wordcloud()

def global_wordcloud():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')  # open the existing xlsx file
    ws_global = wb['全球各国疫情数据']                     # get the per-country data sheet
    ws_global.delete_rows(1)                             # delete the first (header) row
    global_dict = {}                                     # country -> cumulative confirmed
    for data in ws_global.values:
        global_dict[data[0]] = int(data[2])
    word_cloud = wordcloud.WordCloud(font_path='C:/Windows/Fonts/simsun.ttc',   # a font with Chinese glyphs (Windows path)
                                     background_color='#CDC9C9',
                                     width=900, height=500)
    word_cloud.generate_from_frequencies(global_dict)
    word_cloud.to_file('WordCloud-Global.png')
    print('全球各国疫情词云图绘制完毕!')
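Both word cloud functions build the same kind of name-to-count dict before handing it to generate_from_frequencies. A sketch of that step as a standalone helper is below. The name frequencies is my own; it skips the header row while iterating instead of deleting it, and the rows are trimmed samples taken from the figures in section 3x00.

```python
def frequencies(rows):
    """rows: iterable of worksheet rows whose first row is the header.

    Returns {name: cumulative confirmed}, reading the name from column 0
    and the cumulative confirmed count from column 2.
    """
    rows = iter(rows)
    next(rows, None)  # skip the header row rather than deleting it from the sheet
    return {row[0]: int(row[2]) for row in rows}


sample_rows = [
    ("国家", "现有确诊", "累计确诊"),
    ("意大利", "14621", "241419"),
    ("土耳其", "19912", "204610"),
]
freq = frequencies(sample_rows)
print(freq)
```

Since openpyxl's ws.values yields one tuple per row, it could be passed to this helper as is, leaving the worksheet untouched.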


【7x00】Map module data_map

【7x01】China cumulative confirmed map china_total_map()

import openpyxl
from pyecharts import options as opts
from pyecharts.charts import Line, Map


def china_total_map():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')  # open the existing xlsx file
    ws_time = wb['中国疫情数据更新时间']                   # sheet with the China update time
    ws_data = wb['中国省份疫情数据']                      # sheet with the China province data
    ws_data.delete_rows(1)                              # delete the first (header) row
    province = []                                       # province names
    confirmed = []                                      # cumulative confirmed
    for data in ws_data.values:
        province.append(data[0])
        confirmed.append(data[2])
    time_china = ws_time['A2'].value                    # update time

    # Colour for each value range
    pieces = [
        {'max': 0, 'min': 0, 'label': '0', 'color': '#FFFFFF'},
        {'max': 9, 'min': 1, 'label': '1-9', 'color': '#FFE5DB'},
        {'max': 99, 'min': 10, 'label': '10-99', 'color': '#FF9985'},
        {'max': 999, 'min': 100, 'label': '100-999', 'color': '#F57567'},
        {'max': 9999, 'min': 1000, 'label': '1000-9999', 'color': '#E64546'},
        {'max': 99999, 'min': 10000, 'label': '≧10000', 'color': '#B80909'}
    ]

    # Draw the map
    ct_map = (
        Map()
        .add(series_name='累计确诊人数',
             data_pair=[list(z) for z in zip(province, confirmed)],
             maptype="china")
        .set_global_opts(
            title_opts=opts.TitleOpts(title="中国疫情数据(累计确诊)",
                                      subtitle='数据更新至:' + time_china + '\n\n来源:百度疫情实时大数据报告'),
            visualmap_opts=opts.VisualMapOpts(max_=300, is_piecewise=True, pieces=pieces)
        )
    )
    return ct_map
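To check the colour ranges without rendering a map, the sketch below looks up which piece a value falls into. It only mirrors the min/max bounds of the pieces list; it is my own helper, not part of pyecharts, and how ECharts itself treats values outside every range may differ.

```python
# Same ranges as the pieces list in china_total_map(), without the colours.
pieces = [
    {'max': 0, 'min': 0, 'label': '0'},
    {'max': 9, 'min': 1, 'label': '1-9'},
    {'max': 99, 'min': 10, 'label': '10-99'},
    {'max': 999, 'min': 100, 'label': '100-999'},
    {'max': 9999, 'min': 1000, 'label': '1000-9999'},
    {'max': 99999, 'min': 10000, 'label': '≧10000'},
]


def piece_label(value, pieces):
    """Return the label of the first piece whose [min, max] range contains value."""
    for piece in pieces:
        if piece['min'] <= value <= piece['max']:
            return piece['label']
    return None  # the value is outside every range


for value in (0, 515, 68135, 100000):
    print(value, piece_label(value, pieces))
```

The last value shows a gap worth knowing about: the top range is labelled ≧10000 but its max is 99999, so a count of 100000 or more matches none of the ranges as written.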

【7x02】Global cumulative confirmed map global_total_map()

def global_total_map():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_time = wb['全球疫情数据更新时间']    # sheet with the global update time
    ws_data = wb['全球各国疫情数据']       # sheet with the per-country data
    ws_data.delete_rows(1)               # delete the first (header) row
    country = []                         # country names
    confirmed = []                       # cumulative confirmed
    for data in ws_data.values:
        country.append(data[0])
        confirmed.append(data[2])
    time_global = ws_time['A2'].value    # update time

    # Mapping from the English country names used by the map to the Chinese names used in the data
    name_map = {
        "Somalia": "索马里", "Liechtenstein": "列支敦士登", "Morocco": "摩洛哥", "W. Sahara": "西撒哈拉",
        "Serbia": "塞尔维亚", "Afghanistan": "阿富汗", "Angola": "安哥拉", "Albania": "阿尔巴尼亚",
        "Andorra": "安道尔共和国", "United Arab Emirates": "阿拉伯联合酋长国", "Argentina": "阿根廷",
        "Armenia": "亚美尼亚", "Australia": "澳大利亚", "Austria": "奥地利", "Azerbaijan": "阿塞拜疆",
        "Burundi": "布隆迪", "Belgium": "比利时", "Benin": "贝宁", "Burkina Faso": "布基纳法索",
        "Bangladesh": "孟加拉国", "Bulgaria": "保加利亚", "Bahrain": "巴林", "Bahamas": "巴哈马",
        "Bosnia and Herz.": "波斯尼亚和黑塞哥维那", "Belarus": "白俄罗斯", "Belize": "伯利兹",
        "Bermuda": "百慕大", "Bolivia": "玻利维亚", "Brazil": "巴西", "Barbados": "巴巴多斯",
        "Brunei": "文莱", "Bhutan": "不丹", "Botswana": "博茨瓦纳", "Central African Rep.": "中非共和国",
        "Canada": "加拿大", "Switzerland": "瑞士", "Chile": "智利", "China": "中国",
        "Côte d'Ivoire": "科特迪瓦", "Cameroon": "喀麦隆", "Dem. Rep. Congo": "刚果(金)",
        "Congo": "刚果(布)", "Colombia": "哥伦比亚", "Cape Verde": "佛得角", "Costa Rica": "哥斯达黎加",
        "Cuba": "古巴", "N. Cyprus": "北塞浦路斯", "Cyprus": "塞浦路斯", "Czech Rep.": "捷克",
        "Germany": "德国", "Djibouti": "吉布提", "Denmark": "丹麦", "Dominican Rep.": "多米尼加",
        "Algeria": "阿尔及利亚", "Ecuador": "厄瓜多尔", "Egypt": "埃及", "Eritrea": "厄立特里亚",
        "Spain": "西班牙", "Estonia": "爱沙尼亚", "Ethiopia": "埃塞俄比亚", "Finland": "芬兰",
        "Fiji": "斐济", "France": "法国", "Gabon": "加蓬", "United Kingdom": "英国",
        "Georgia": "格鲁吉亚", "Ghana": "加纳", "Guinea": "几内亚", "Gambia": "冈比亚",
        "Guinea-Bissau": "几内亚比绍", "Eq. Guinea": "赤道几内亚", "Greece": "希腊", "Grenada": "格林纳达",
        "Greenland": "格陵兰岛", "Guatemala": "危地马拉", "Guam": "关岛", "Guyana": "圭亚那合作共和国",
        "Honduras": "洪都拉斯", "Croatia": "克罗地亚", "Haiti": "海地", "Hungary": "匈牙利",
        "Indonesia": "印度尼西亚", "India": "印度", "Br. Indian Ocean Ter.": "英属印度洋领土",
        "Ireland": "爱尔兰", "Iran": "伊朗", "Iraq": "伊拉克", "Iceland": "冰岛", "Israel": "以色列",
        "Italy": "意大利", "Jamaica": "牙买加", "Jordan": "约旦", "Japan": "日本",
        "Siachen Glacier": "锡亚琴冰川", "Kazakhstan": "哈萨克斯坦", "Kenya": "肯尼亚",
        "Kyrgyzstan": "吉尔吉斯斯坦", "Cambodia": "柬埔寨", "Korea": "韩国", "Kuwait": "科威特",
        "Lao PDR": "老挝", "Lebanon": "黎巴嫩", "Liberia": "利比里亚", "Libya": "利比亚",
        "Sri Lanka": "斯里兰卡", "Lesotho": "莱索托", "Lithuania": "立陶宛", "Luxembourg": "卢森堡",
        "Latvia": "拉脱维亚", "Moldova": "摩尔多瓦", "Madagascar": "马达加斯加", "Mexico": "墨西哥",
        "Macedonia": "马其顿", "Mali": "马里", "Malta": "马耳他", "Myanmar": "缅甸",
        "Montenegro": "黑山", "Mongolia": "蒙古国", "Mozambique": "莫桑比克", "Mauritania": "毛里塔尼亚",
        "Mauritius": "毛里求斯", "Malawi": "马拉维", "Malaysia": "马来西亚", "Namibia": "纳米比亚",
        "New Caledonia": "新喀里多尼亚", "Niger": "尼日尔", "Nigeria": "尼日利亚", "Nicaragua": "尼加拉瓜",
        "Netherlands": "荷兰", "Norway": "挪威", "Nepal": "尼泊尔", "New Zealand": "新西兰",
        "Oman": "阿曼", "Pakistan": "巴基斯坦", "Panama": "巴拿马", "Peru": "秘鲁",
        "Philippines": "菲律宾", "Papua New Guinea": "巴布亚新几内亚", "Poland": "波兰",
        "Puerto Rico": "波多黎各", "Dem. Rep. Korea": "朝鲜", "Portugal": "葡萄牙", "Paraguay": "巴拉圭",
        "Palestine": "巴勒斯坦", "Qatar": "卡塔尔", "Romania": "罗马尼亚", "Russia": "俄罗斯",
        "Rwanda": "卢旺达", "Saudi Arabia": "沙特阿拉伯", "Sudan": "苏丹", "S. Sudan": "南苏丹",
        "Senegal": "塞内加尔", "Singapore": "新加坡", "Solomon Is.": "所罗门群岛",
        "Sierra Leone": "塞拉利昂", "El Salvador": "萨尔瓦多", "Suriname": "苏里南",
        "Slovakia": "斯洛伐克", "Slovenia": "斯洛文尼亚", "Sweden": "瑞典", "Swaziland": "斯威士兰",
        "Seychelles": "塞舌尔", "Syria": "叙利亚", "Chad": "乍得", "Togo": "多哥", "Thailand": "泰国",
        "Tajikistan": "塔吉克斯坦", "Turkmenistan": "土库曼斯坦", "Timor-Leste": "东帝汶",
        "Tonga": "汤加", "Trinidad and Tobago": "特立尼达和多巴哥", "Tunisia": "突尼斯",
        "Turkey": "土耳其", "Tanzania": "坦桑尼亚", "Uganda": "乌干达", "Ukraine": "乌克兰",
        "Uruguay": "乌拉圭", "United States": "美国", "Uzbekistan": "乌兹别克斯坦",
        "Venezuela": "委内瑞拉", "Vietnam": "越南", "Vanuatu": "瓦努阿图", "Yemen": "也门",
        "South Africa": "南非", "Zambia": "赞比亚", "Zimbabwe": "津巴布韦", "Aland": "奥兰群岛",
        "American Samoa": "美属萨摩亚", "Fr. S. Antarctic Lands": "南极洲",
        "Antigua and Barb.": "安提瓜和巴布达", "Comoros": "科摩罗", "Curaçao": "库拉索岛",
        "Cayman Is.": "开曼群岛", "Dominica": "多米尼克", "Falkland Is.": "福克兰群岛马尔维纳斯",
        "Faeroe Is.": "法罗群岛", "Micronesia": "密克罗尼西亚",
        "Heard I. and McDonald Is.": "赫德岛和麦克唐纳群岛", "Isle of Man": "曼岛", "Jersey": "泽西岛",
        "Kiribati": "基里巴斯", "Saint Lucia": "圣卢西亚", "N. Mariana Is.": "北马里亚纳群岛",
        "Montserrat": "蒙特塞拉特", "Niue": "纽埃", "Palau": "帕劳", "Fr. Polynesia": "法属波利尼西亚",
        "S. Geo. and S. Sandw. Is.": "南乔治亚岛和南桑威奇群岛", "Saint Helena": "圣赫勒拿",
        "St. Pierre and Miquelon": "圣皮埃尔和密克隆群岛", "São Tomé and Principe": "圣多美和普林西比",
        "Turks and Caicos Is.": "特克斯和凯科斯群岛", "St. Vin. and Gren.": "圣文森特和格林纳丁斯",
        "U.S. Virgin Is.": "美属维尔京群岛", "Samoa": "萨摩亚"
    }

    # Colour for each value range
    pieces = [
        {'max': 0, 'min': 0, 'label': '0', 'color': '#FFFFFF'},
        {'max': 49, 'min': 1, 'label': '1-49', 'color': '#FFE5DB'},
        {'max': 99, 'min': 50, 'label': '50-99', 'color': '#FFC4B3'},
        {'max': 999, 'min': 100, 'label': '100-999', 'color': '#FF9985'},
        {'max': 9999, 'min': 1000, 'label': '1000-9999', 'color': '#F57567'},
        {'max': 99999, 'min': 10000, 'label': '10000-99999', 'color': '#E64546'},
        {'max': 999999, 'min': 100000, 'label': '100000-999999', 'color': '#B80909'},
        {'max': 9999999, 'min': 1000000, 'label': '≧1000000', 'color': '#8A0808'}
    ]

    # Draw the map
    gt_map = (
        Map()
        .add(series_name='累计确诊人数',
             data_pair=[list(z) for z in zip(country, confirmed)],
             maptype="world", name_map=name_map, is_map_symbol_show=False)
        .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
        .set_global_opts(
            title_opts=opts.TitleOpts(title="全球疫情数据(累计确诊)",
                                      subtitle='数据更新至:' + time_global + '\n\n来源:百度疫情实时大数据报告'),
            visualmap_opts=opts.VisualMapOpts(max_=300, is_piecewise=True, pieces=pieces),
        )
    )
    return gt_map
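A country whose Chinese name does not appear among the values of name_map cannot be matched to a region on the world map. The sketch below lists such names so the mapping can be checked against the scraped data. The helper unmapped is my own and the name_map here is a two-entry excerpt; in real use you would pass the full dict and the country list read from the workbook.

```python
def unmapped(countries, name_map):
    """Return the names in countries that have no entry among name_map's values."""
    known = set(name_map.values())
    return sorted(c for c in countries if c not in known)


# Two-entry excerpt of the mapping, just for the example
name_map_excerpt = {"Italy": "意大利", "China": "中国"}

# Names as they might appear in the scraped data
scraped = ["意大利", "钻石公主号邮轮", "中国"]

missing = unmapped(scraped, name_map_excerpt)
print(missing)
```

Entries such as the cruise ship are expected to be reported, since they have no region on the map; anything else in the output points to a spelling difference between the data and the mapping.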

【7x03】China daily data line chart china_daily_map()

def china_daily_map():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')
    ws_china_confirmed = wb['中国每日累计确诊数据']
    ws_china_crued = wb['中国每日累计治愈数据']
    ws_china_died = wb['中国每日累计死亡数据']

    # Delete the header row of each sheet
    ws_china_confirmed.delete_rows(1)
    ws_china_crued.delete_rows(1)
    ws_china_died.delete_rows(1)

    x_date = []               # dates
    y_china_confirmed = []    # daily cumulative confirmed
    y_china_crued = []        # daily cumulative cured
    y_china_died = []         # daily cumulative deaths

    for china_confirmed in ws_china_confirmed.values:
        y_china_confirmed.append(china_confirmed[1])
    for china_crued in ws_china_crued.values:
        x_date.append(china_crued[0])     # the dates are read once, from the cured sheet
        y_china_crued.append(china_crued[1])
    for china_died in ws_china_died.values:
        y_china_died.append(china_died[1])

    fi_map = (
        Line(init_opts=opts.InitOpts(height='420px'))
        .add_xaxis(xaxis_data=x_date)
        .add_yaxis(
            series_name="中国累计确诊数据",
            y_axis=y_china_confirmed,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="中国累计治愈趋势",
            y_axis=y_china_crued,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="中国累计死亡趋势",
            y_axis=y_china_died,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .set_global_opts(
            title_opts=opts.TitleOpts(title="中国每日累计确诊/治愈/死亡趋势"),
            legend_opts=opts.LegendOpts(pos_bottom="bottom", orient='horizontal'),
            tooltip_opts=opts.TooltipOpts(trigger="axis"),
            yaxis_opts=opts.AxisOpts(
                type_="value",
                axistick_opts=opts.AxisTickOpts(is_show=True),
                splitline_opts=opts.SplitLineOpts(is_show=True),
            ),
            xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False),
        )
    )
    return fi_map

【7x04】Foreign daily data line chart foreign_daily_map()

def foreign_daily_map():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_foreign_confirmed = wb['境外每日累计确诊数据']
    ws_foreign_crued = wb['境外每日累计治愈数据']
    ws_foreign_died = wb['境外每日累计死亡数据']

    # Delete the header row of each sheet
    ws_foreign_confirmed.delete_rows(1)
    ws_foreign_crued.delete_rows(1)
    ws_foreign_died.delete_rows(1)

    x_date = []                # dates
    y_foreign_confirmed = []   # cumulative confirmed
    y_foreign_crued = []       # cumulative cured
    y_foreign_died = []        # cumulative deaths

    for foreign_confirmed in ws_foreign_confirmed.values:
        y_foreign_confirmed.append(foreign_confirmed[1])
    for foreign_crued in ws_foreign_crued.values:
        x_date.append(foreign_crued[0])   # the dates are read once, from the cured sheet
        y_foreign_crued.append(foreign_crued[1])
    for foreign_died in ws_foreign_died.values:
        y_foreign_died.append(foreign_died[1])

    fte_map = (
        Line(init_opts=opts.InitOpts(height='420px'))
        .add_xaxis(xaxis_data=x_date)
        .add_yaxis(
            series_name="境外累计确诊趋势",
            y_axis=y_foreign_confirmed,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="境外累计治愈趋势",
            y_axis=y_foreign_crued,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="境外累计死亡趋势",
            y_axis=y_foreign_died,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .set_global_opts(
            title_opts=opts.TitleOpts(title="境外每日累计确诊/治愈/死亡趋势"),
            legend_opts=opts.LegendOpts(pos_bottom="bottom", orient='horizontal'),
            tooltip_opts=opts.TooltipOpts(trigger="axis"),
            yaxis_opts=opts.AxisOpts(
                type_="value",
                axistick_opts=opts.AxisTickOpts(is_show=True),
                splitline_opts=opts.SplitLineOpts(is_show=True),
            ),
            xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False),
        )
    )
    return fte_map

【8x00】Result screenshots

【8x01】Data stored in Excel

[Image 03]

[Image 04]

【8x02】Word clouds

[Image 05]

[Image 06]

【8x03】Maps + line charts

[Image 07]

【9x00】Full code

Preview: http://cov.itrhx.com/

Full code (a star is appreciated): https://github.com/TRHX/Python3-Spider-Practice/tree/master/SpiderDataVisualization/COVID-19

Collection of other crawler practice code (continuously updated): https://github.com/TRHX/Python3-Spider-Practice

Crawler practice column (continuously updated): https://itrhx.blog.csdn.net/article/category/9351278

