Python for Data Analysis – Pandas

Pandas

  • Pandas is an open-source library built on top of NumPy

  • It supports fast analysis as well as data cleaning and preparation

  • It excels in both performance and productivity

  • It has built-in visualization features

  • It can work with data from a wide variety of sources

How to install Pandas?

Using pip

(venv) -bash-4.2$ pip install pandas
Requirement already satisfied: pandas in ./venv/lib/python3.6/site-packages (0.25.1)
Requirement already satisfied: python-dateutil>=2.6.1 in ./venv/lib/python3.6/site-packages (from pandas) (2.8.0)
Requirement already satisfied: pytz>=2017.2 in ./venv/lib/python3.6/site-packages (from pandas) (2019.2)
Requirement already satisfied: numpy>=1.13.3 in ./venv/lib/python3.6/site-packages (from pandas) (1.17.2)
Requirement already satisfied: six>=1.5 in ./venv/lib/python3.6/site-packages (from python-dateutil>=2.6.1->pandas) (1.12.0)
(venv) -bash-4.2$
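To confirm that the installation worked, one quick check is to import the library and print its version. This is only a sketch; the version printed depends on your environment and need not match the 0.25.1 shown above.

```python
import pandas as pd

# Importing without an error confirms pandas is available in this environment.
version = pd.__version__
print(version)
```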

Series

A Series is a one-dimensional ndarray with axis labels; time series are one example. It can hold data of any type. The axis labels are collectively known as the index. A Series is very similar to a NumPy array and is built on the NumPy array object. The difference is that a Series can be indexed by labels.
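The contrast with a plain NumPy array can be sketched as follows. The values and the labels 'x', 'y', 'z' are made up for illustration and do not come from the original article.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30])
ser = pd.Series(arr, index=['x', 'y', 'z'])  # illustrative labels

# A NumPy array is addressed by integer position only.
first_from_array = arr[0]

# A Series can be addressed by label, or by position through .iloc.
first_by_label = ser['x']
first_by_position = ser.iloc[0]

print(first_from_array, first_by_label, first_by_position)
print(list(ser.index))   # the axis labels, i.e. the index
print(ser.to_numpy())    # the underlying values as a NumPy array
```

All three lookups return the same element; only the way it is addressed differs.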

Syntax:

class pandas.Series(
    data=None,
    index=None,
    dtype=None,
    name=None,
    copy=False,
    fastpath=False
)

The snippet below shows several ways to create a Series:

import numpy as np
import pandas as pd
labels = ['a','e','i','o'] #python list
data = [1,2,3,4] #python list
arr = np.array(data) #NumPy array
d = {'a':1,'b':2,'c':3} #python dict
# creating a series object with default index
print(pd.Series(data = data))
# creating a series object with labels as index
print(pd.Series(data = data, index = labels))
# creating a series with NumPy array
print(pd.Series(arr,index = labels))
# creating a series with dictionary, 
# here the key becomes the index
print(pd.Series(d))
# Series can also hold built-in func
print(pd.Series(data = [sum, print, len]))

Output

0    1
1    2
2    3
3    4
dtype: int64
a    1
e    2
i    3
o    4
dtype: int64
a    1
e    2
i    3
o    4
dtype: int64
a    1
b    2
c    3
dtype: int64
0       <built-in function sum>
1       <built-in function print>
2       <built-in function len>
dtype: object
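The examples above do not use the dtype and name parameters listed in the syntax. A minimal sketch of both follows; the variable name scores and its values are illustrative assumptions.

```python
import pandas as pd

# dtype forces the element type; name attaches a label to the Series itself.
scores = pd.Series(data=[1, 2, 3], index=['a', 'b', 'c'],
                   dtype='float64', name='scores')

print(scores)
print(scores.dtype)
print(scores.name)
```

Here the integers are stored as floats because dtype was given explicitly, and the name is shown at the bottom of the printed Series.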

Operations on Series

Create two Series objects:

import pandas as pd
ser1 = pd.Series([1,2,3,4],['Delhi','Bangalore','Mysore', 'Pune'])
print(ser1)
ser2 = pd.Series([1,2,5,4],['Delhi','Bangalore','Vizag','Pune'])
print(ser2)

Output

Delhi        1
Bangalore    2
Mysore       3
Pune         4
dtype: int64
Delhi        1
Bangalore    2
Vizag        5
Pune         4
dtype: int64

Retrieving a value from a Series works much like a lookup in a Python dictionary: pass the index label of the value you want. In the example above, the index labels are strings.

print(ser1['Delhi'])
# Output: 1
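The dictionary analogy extends to a few other operations. The sketch below reuses ser1 from above and shows membership tests, a lookup with a default value, and explicit label-based access.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['Delhi', 'Bangalore', 'Mysore', 'Pune'])

# Membership tests check the index labels, just as `in` checks dict keys.
has_delhi = 'Delhi' in ser1
has_vizag = 'Vizag' in ser1

# .get() returns a default instead of raising KeyError for a missing label.
vizag_value = ser1.get('Vizag', 0)

# .loc makes label-based access explicit.
pune_value = ser1.loc['Pune']

print(has_delhi, has_vizag, vizag_value, pune_value)
# Output: True False 0 4
```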

Now let's try adding the two Series:

print(ser1+ser2)
'''
Output:
Bangalore    4.0
Delhi        2.0
Mysore       NaN
Pune         8.0
Vizag        NaN
dtype: float64
'''

pandas aligns the two Series on their index labels and adds the values that share a label. Where a label appears in only one Series, the result is NaN (a missing value). Because NaN is a floating-point value, the integers in the result are converted to float; when every label matches, no NaN is introduced and the result stays int64.
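If NaN is not the desired result for unmatched labels, one option is the add method with its fill_value parameter, which substitutes a value for the missing side before adding. The sketch below reuses ser1 and ser2 and also checks the dtype when all labels match.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['Delhi', 'Bangalore', 'Mysore', 'Pune'])
ser2 = pd.Series([1, 2, 5, 4], ['Delhi', 'Bangalore', 'Vizag', 'Pune'])

# Treat a label missing from one Series as 0 instead of producing NaN.
total = ser1.add(ser2, fill_value=0)
print(total)

# When every label matches, no NaN appears and the dtype stays integer.
doubled = ser1 + ser1
print(doubled.dtype)
```

With fill_value=0, Mysore and Vizag keep their single values (3.0 and 5.0) rather than becoming NaN.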

Translated from: https://www.includehelp.com/python/python-for-data-analysis-pandas.aspx
