A Brief Introduction to pandas

Of all the technologies used in data analysis, pandas is one of the most popular and widely used libraries.

Here is what we are going to cover:

  1. Installation of pandas
  2. Key components of pandas
  3. Read/import data from a CSV file
  4. Write/export data to CSV files
  5. Viewing and selecting data

1. Installation of pandas

Let's take care of the boring but important stuff first: setting up a workspace for pandas.

If you are using conda as your environment manager, with Miniconda or Anaconda:

  • Activate your environment

conda activate ./env

  • Install the pandas package

conda install pandas

If you are using a virtual environment with virtualenv:

  • Activate your environment

source ./env/bin/activate

  • Install the pandas package

pip install pandas

If you are using a virtual environment with pipenv:

  • Create an environment and install pandas in it

pipenv install pandas

  • Activate the environment

pipenv shell
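
Whichever route you took, a quick check from inside the activated environment can confirm that the install worked. This is only a minimal sketch; the variable name installed_version is chosen here for illustration.

```python
import pandas as pd

# If the import succeeds, pandas is installed in the active environment.
installed_version = pd.__version__

# Print the version string so you know which release you are working with.
print(installed_version)
```

If the import raises ModuleNotFoundError, the environment you activated is probably not the one you installed pandas into.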

2. Key components of pandas

Pandas provides two compound data types. They are its key components, and they are what give us so much flexibility in selecting, viewing and manipulating data:

  • Pandas Series
  • Pandas Data Frame

Pandas Series

It is a one-dimensional array offered by pandas. It can store different types of data (int, string, float, boolean, etc.).

A pandas Series can be created as:

import pandas as pd

student_pass_percentage_in_country = pd.Series(["90", "67", "85"])

countries = pd.Series(["India", "USA", "China"])
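
The Series above stores the percentages as strings. As a small sketch of what changes with numeric values, the example below assumes integer percentages and uses the country names as index labels; the variable names are illustrative, not from the original.

```python
import pandas as pd

# Assumption: numeric values instead of the strings used above,
# with country names as the index labels.
pass_percent = pd.Series([90, 67, 85], index=["India", "USA", "China"])

# Numeric Series support arithmetic and aggregation directly.
mean_pass = pass_percent.mean()

# Values can be looked up by their index label.
usa_pass = pass_percent["USA"]

# Strings that hold numbers can be converted with pd.to_numeric.
as_numbers = pd.to_numeric(pd.Series(["90", "67", "85"]))

print(mean_pass, usa_pass, as_numbers.sum())
```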

Pandas Data Frame

This is where most of the magic happens. It is a two-dimensional array; you can think of it as an Excel sheet.

  • The default index in pandas starts from 0.
  • Rows run along axis=0 and columns along axis=1.
  • When a data frame is displayed, the leftmost column shows the index.
  • More than one row can be associated with the same index label. So there are two ways of looking up data: by index label and by position. Positions also start from 0.

A pandas data frame can be created as:

student_pass_percent_by_country = pd.DataFrame({'Country': countries, 'Pass Percent': student_pass_percentage_in_country})
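
The axis convention is easy to mix up, so here is a small sketch of it in action. The scores frame and its column names are made up for this illustration.

```python
import pandas as pd

# Hypothetical data: three students, two subjects.
scores = pd.DataFrame({"math": [80, 90, 70], "physics": [60, 75, 85]})

# axis=0 works down the rows, giving one result per column.
column_totals = scores.sum(axis=0)

# axis=1 works across the columns, giving one result per row.
row_totals = scores.sum(axis=1)

# Dropping along axis=1 removes a column; scores itself is unchanged.
without_physics = scores.drop("physics", axis=1)

print(column_totals)
print(row_totals)
```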

3. Read/import data from a CSV file

First, let's see what CSV file data looks like.

A CSV file contains data in comma-separated format. Opened in a spreadsheet viewer, it looks like an Excel sheet; opened in a text editor (here, VS Code), it appears in its raw, comma-separated form.

Reading CSV data is very straightforward in pandas. It provides the read_csv function, which accepts either a local file path, read_csv('file_path'), or a URL, read_csv('file_url'). The data is returned as a data frame.

I have taken this data from curran's public repository, so that you can use it as well.

csv_data = pd.read_csv('https://github.com/curran/data/blob/gh-pages/indiaGovOpenData/All_India_Index-February2016.csv')

Note that this URL is the GitHub page for the file, not the raw CSV. If read_csv fails to parse it, use the link behind the Raw button on that page instead.

The displayed output tells us right away how many rows and columns the data contains.
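
To try read_csv without a network connection, you can feed it CSV text from memory. The sketch below assumes a tiny made-up data set (csv_text) in place of the real file; shape reports the row and column counts programmatically.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk or at a URL.
csv_text = "name,age\nArun,12\nShiva,34\nRafah,45\n"

# read_csv accepts a path, a URL, or a file-like object such as StringIO.
people = pd.read_csv(io.StringIO(csv_text))

# shape is a (rows, columns) tuple.
n_rows, n_cols = people.shape
print(n_rows, n_cols)
```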

4. Write/export data to CSV files

Exporting data to a CSV file is as simple as importing it. A data frame has a method called to_csv('file_name'), which exports its data to a CSV file.

csv_data.to_csv('new_exported_data.csv')
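
By default to_csv also writes the index as an extra first column. A minimal sketch of a round trip, assuming a small made-up frame and a temporary directory so nothing is left behind on disk:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data used only for this illustration.
people = pd.DataFrame({"name": ["Arun", "Shiva", "Rafah"], "age": [12, 34, 45]})

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "people.csv")

    # index=False leaves the row index out of the file.
    people.to_csv(out_path, index=False)

    # Read the file back before the temporary directory is removed.
    round_trip = pd.read_csv(out_path)

print(round_trip)
```

Without index=False, reading the file back would produce an additional unnamed column holding the old index values.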

5. Viewing and selecting data

We often work with a lot of data, so being able to view and select it the way we want gives us better insight into the data from the start.

To view a snippet of data (5 rows by default):

csv_data.head()

To view more than 5 records, say 23 records from the top:

csv_data.head(23)

To view a snippet of data from the bottom:

csv_data.tail()

To view more than 5 records from the bottom, say 11 records:

csv_data.tail(11)
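
Here is a small self-contained sketch of head and tail, using a made-up ten-row frame so the results are easy to check by eye.

```python
import pandas as pd

# Hypothetical frame with rows numbered 0 through 9.
numbers = pd.DataFrame({"n": range(10)})

# With no argument, head returns the first 5 rows.
first_five = numbers.head()

# Passing a count changes how many rows come back.
first_three = numbers.head(3)
last_two = numbers.tail(2)

print(first_three)
print(last_two)
```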

To list all the columns in the data:

csv_data.columns

In a pandas data frame, more than one row can share the same index value. By default the index starts from 0, but you can set it explicitly:

sample_data = pd.DataFrame({'name': ['Arun', 'Shiva', 'Rafah'], 'age': [12, 34, 45]}, index=[1, 1, 2])

One thing you may have noticed above is that a data frame can be created from plain Python lists as well.

View data at index 1:

sample_data.loc[1]

View data at position 1:

sample_data.iloc[1]
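
Because the index label 1 appears twice in sample_data, the two lookups return different things. A minimal sketch that rebuilds the same frame and compares them:

```python
import pandas as pd

sample_data = pd.DataFrame(
    {"name": ["Arun", "Shiva", "Rafah"], "age": [12, 34, 45]},
    index=[1, 1, 2],
)

# loc selects by label: label 1 appears twice, so two rows come back.
by_label = sample_data.loc[1]

# iloc selects by position: position 1 is the second row (Shiva).
by_position = sample_data.iloc[1]

print(by_label)
print(by_position)
```

Here by_label is a two-row data frame, while by_position is a single row returned as a Series.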

You can select a column in two ways:

a. Dot notation:

sample_data.age

b. Bracket (index) notation:

sample_data['age']

Dot notation (a) will not work if the column name contains spaces or clashes with an existing DataFrame attribute or method, so pick one style and stick to it; bracket notation always works.
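
The limits of dot notation can be seen in a short sketch. The results frame and its "count" column are invented for this example; "count" is chosen because it is also the name of a DataFrame method.

```python
import pandas as pd

# Hypothetical frame with a simple name, a name containing a space,
# and a name that matches a DataFrame method.
results = pd.DataFrame(
    {"Country": ["India", "USA"], "Pass Percent": [90, 67], "count": [10, 20]}
)

# Dot notation works for simple names that are valid Python identifiers.
countries = results.Country

# A name with a space can only be reached with brackets.
pass_percent = results["Pass Percent"]

# results.count is the DataFrame method, not the column;
# brackets return the column itself.
count_column = results["count"]
dot_count_is_method = callable(results.count)

print(countries.tolist(), pass_percent.tolist(), count_column.tolist())
```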

Selecting only the rows where age is greater than 20:

sample_data[sample_data['age'] > 20]
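
The filter above works in two steps, which the sketch below spells out: the comparison builds a True/False value per row, and indexing with it keeps the True rows. The intermediate names (is_over_20, over_20, between_20_and_40) are illustrative.

```python
import pandas as pd

sample_data = pd.DataFrame(
    {"name": ["Arun", "Shiva", "Rafah"], "age": [12, 34, 45]}
)

# The comparison yields one True/False value per row.
is_over_20 = sample_data["age"] > 20

# Indexing with that boolean Series keeps only the True rows.
over_20 = sample_data[is_over_20]

# Combine conditions with & (and) or | (or); wrap each one in parentheses.
between_20_and_40 = sample_data[
    (sample_data["age"] > 20) & (sample_data["age"] < 40)
]

print(over_20)
print(between_20_and_40)
```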

I have listed only the most used functions here. I plan to keep updating this article, since I will refer back to it myself whenever I forget something. If you have any questions or want to discuss a project, feel free to comment here.

Thank you for reading :)

Translated from: https://medium.com/@lax_17478/data-analysis-a-complete-introduction-to-pandas-part-1-3dd06922144a
