A Tour of Cool Python Libraries: The Third-Party Library pandas (029)

I. Usage in Detail

74. The pandas.api.interchange.from_dataframe function

74-1. Syntax

# 74. The pandas.api.interchange.from_dataframe function
pandas.api.interchange.from_dataframe(df, allow_copy=True)
Build a pd.DataFrame from any DataFrame supporting the interchange protocol.

Parameters:
df : DataFrameXchg
    Object supporting the interchange protocol, i.e. __dataframe__ method.
allow_copy : bool, default: True
    Whether to allow copying the memory to perform the conversion (if false then zero-copy approach is requested).

Returns:
pd.DataFrame

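74-2. Parameters

74-2-1. df (required): A DataFrame-like object to convert into a pandas DataFrame. It can be any object that implements the DataFrame interchange protocol (that is, one that has a __dataframe__ method), such as a DataFrame from another library (for example Dask or Vaex).

74-2-2. allow_copy (optional, default True): Whether data may be copied during the conversion. If True, the function may copy memory where the conversion needs it. If False, a zero-copy conversion is requested, which can improve performance and reduce memory use, but the conversion can fail for data that cannot be converted without copying.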

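74-3. Function

A pandas API function for importing a DataFrame from another DataFrame library through the interchange protocol. It converts the other library's DataFrame object into a pandas DataFrame.

74-4. Return value

Returns a pandas DataFrame (pd.DataFrame) built from the input object. The DataFrame interchange protocol is a common standard for exchanging data between data-processing libraries; it is what the input must support, not the type of the result.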
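To make the return type concrete, here is a minimal sketch that passes data through the interchange protocol and back. It assumes a pandas version that provides DataFrame.__dataframe__ (1.5 or later). The column names and values are made up for illustration, and a pandas DataFrame stands in for a DataFrame from another library.

```python
import pandas as pd
from pandas.api.interchange import from_dataframe

# Illustrative data (hypothetical values); numeric columns keep the sketch simple.
source = pd.DataFrame({"age": [25, 30, 35], "score": [88.5, 92.0, 79.5]})

# __dataframe__() returns the interchange object that another library would expose.
xchg = source.__dataframe__()

# from_dataframe converts the interchange object into a pandas DataFrame.
result = from_dataframe(xchg, allow_copy=True)

print(type(result))
print(result)
```

The result is an ordinary pandas DataFrame, so all the usual DataFrame methods are available on it.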

74-5. Notes

None

74-6. Usage

74-6-1. Data preparation

None

74-6-2. Code example

# 74. The pandas.api.interchange.from_dataframe function
import pandas as pd
from pandas.api.interchange import from_dataframe
# Create a sample DataFrame
data = {
    'Name': ['Myelsa', 'Bryce', 'Jimmy'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
# Convert with from_dataframe (a pandas DataFrame also supports the interchange protocol)
converted_df = from_dataframe(df, allow_copy=True)
# Print the type and contents of the result, which is a pandas DataFrame
print(type(converted_df))
print(converted_df)

74-6-3. Output

# 74. The pandas.api.interchange.from_dataframe function
# <class 'pandas.core.frame.DataFrame'>
#      Name  Age         City
# 0  Myelsa   25     New York
# 1   Bryce   30  Los Angeles
# 2   Jimmy   35      Chicago

75. The pandas.Series class

75-1. Syntax

# 75. The pandas.Series class
pandas.Series(data=None, index=None, dtype: 'Dtype | None' = None, name=None, copy: 'bool | None' = None, fastpath: 'bool | lib.NoDefault' = <no_default>) -> 'None'

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object
supports both integer- and label-based indexing and provides a host of
methods for performing operations involving the index. Statistical
methods from ndarray have been overridden to automatically exclude
missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their
associated index values-- they need not be the same length. The result
index will be the sorted union of the two indexes.

Parameters
----------
data : array-like, Iterable, dict, or scalar value
    Contains data stored in Series. If data is a dict, argument order is
    maintained.
index : array-like or Index (1d)
    Values must be hashable and have the same length as `data`.
    Non-unique index values are allowed. Will default to
    RangeIndex (0, 1, 2, ..., n) if not provided. If data is dict-like
    and index is None, then the keys in the data are used as the index. If the
    index is not None, the resulting Series is reindexed with the index values.
dtype : str, numpy.dtype, or ExtensionDtype, optional
    Data type for the output Series. If not specified, this will be
    inferred from `data`.
    See the :ref:`user guide <basics.dtypes>` for more usages.
name : Hashable, default None
    The name to give to the Series.
copy : bool, default False
    Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes
-----
Please reference the :ref:`User Guide <basics.series>` for more information.
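The docstring's remark about index alignment is easier to see with a small example. The following is a minimal sketch with made-up labels and values: two Series of different lengths are added, and pandas matches the values by label rather than by position.

```python
import pandas as pd

# Two Series with different lengths and differently ordered labels (illustrative values).
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20], index=["c", "a"])

# Addition aligns on labels; the result index is the sorted union of both indexes.
total = s1 + s2
print(total)
```

The label "b" exists only in s1, so its result is NaN, and the result dtype becomes float64 to hold the missing value.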

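75-2. Parameters

75-2-1. data (optional, default None): The data for the Series. It can be a list, a NumPy array, a dict, or a scalar value (such as a single number). A scalar is repeated for every label in index.

75-2-2. index (optional, default None): The index labels of the Series. If omitted, a default integer index starting at 0 (a RangeIndex) is created. For list or array data, index must have the same length as data. If data is a dict and index is given, the Series is reindexed to those labels.

75-2-3. dtype (optional, default None): The data type. If omitted, pandas infers it from data.

75-2-4. name (optional, default None): The name of the Series. Naming a Series makes it easier to refer to, for example as a column label in a DataFrame.

75-2-5. copy (optional, default None): If True, the input data is copied, so later changes to the input do not affect the Series. It only affects Series or 1-D ndarray input.

75-2-6. fastpath (optional): An internal parameter used for performance optimization; users normally do not need to set it.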
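Two of these parameters are easier to understand with an example. The sketch below, which uses made-up data and variable names, shows how an explicit index reindexes dict data, and how copy=True isolates the Series from the source array.

```python
import numpy as np
import pandas as pd

# dict data plus an explicit index: the Series is reindexed to the given labels.
prices = pd.Series({"a": 1, "b": 2}, index=["b", "c"])
print(prices)  # "b" keeps its value; "c" has no entry in the dict, so it is NaN

# copy=True: the Series gets its own copy of the ndarray.
arr = np.array([1, 2, 3])
copied = pd.Series(arr, copy=True)
arr[0] = 99  # change the source array afterwards
print(copied.tolist())  # the Series still holds the original values
```

Whether a Series built without copy=True shares memory with the source array depends on the pandas version and on whether Copy-on-Write is enabled, so the sketch demonstrates only the copy=True case.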

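75-3. Function

pandas.Series is one of the most basic data structures in pandas. It resembles a one-dimensional array and can store data of any type (integers, floats, strings, and so on). Its constructor creates a Series object from many kinds of input.

75-4. Return value

Calling pandas.Series returns a pandas Series object with the following characteristics:

75-4-1. One-dimensional data structure: A Series is one-dimensional and can be seen as an array with labels.

75-4-2. Index: Every element has a corresponding label (index), and data can be accessed through the index.

75-4-3. Data type: A Series has a single dtype shared by all of its elements. If no dtype is specified, it is inferred from the data; values of mixed types are stored with the object dtype.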
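The single-dtype rule can be checked directly. This is a minimal sketch with made-up values showing how the inferred dtype depends on the input.

```python
import pandas as pd

ints = pd.Series([1, 2, 3])              # all integers
mixed_numbers = pd.Series([1, 2.5])      # integer and float are upcast to float
mixed_types = pd.Series([1, "a", 2.5])   # unrelated types fall back to object

print(ints.dtype)
print(mixed_numbers.dtype)
print(mixed_types.dtype)
```

An object-dtype Series stores arbitrary Python objects, so it gives up most of the speed of numeric dtypes.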

75-5. Notes

None

75-6. Usage

75-6-1. Data preparation

None

75-6-2. Code example

# 75. The pandas.Series class
import pandas as pd
# 75-1. Create a Series from a list
data = [1, 2, 3, 4, 5]
series1 = pd.Series(data)
print(series1, end='\n\n')
# 75-2. Create a Series from a dict
data = {'a': 1, 'b': 2, 'c': 3}
series2 = pd.Series(data)
print(series2, end='\n\n')
# 75-3. Specify the index and data type
data = [1.5, 2.5, 3.5]
index = ['a', 'b', 'c']
series3 = pd.Series(data, index=index, dtype=float, name='Example Series')
print(series3, end='\n\n')
# 75-4. Create a Series from a scalar value
scalar_data = 10
series4 = pd.Series(scalar_data, index=['a', 'b', 'c'])
print(series4)

75-6-3. Output

# 75. The pandas.Series class
# 75-1. Create a Series from a list
# 0    1
# 1    2
# 2    3
# 3    4
# 4    5
# dtype: int64
# 75-2. Create a Series from a dict
# a    1
# b    2
# c    3
# dtype: int64
# 75-3. Specify the index and data type
# a    1.5
# b    2.5
# c    3.5
# Name: Example Series, dtype: float64
# 75-4. Create a Series from a scalar value
# a    10
# b    10
# c    10
# dtype: int64

76. The pandas.Series.index attribute

76-1. Syntax

# 76. The pandas.Series.index attribute
pandas.Series.index
The index (axis labels) of the Series.

The index of a Series is used to label and identify each element of the underlying data. The index can be thought of as an immutable ordered set (technically a multi-set, as it may contain duplicate labels), and is used to index and align data in pandas.

Returns:
Index
    The index labels of the Series.

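76-2. Parameters

None

76-3. Function

Provides access to the index labels of the data in a Series.

76-4. Return value

Returns a pandas.Index object containing the index label of each data point in the Series.

76-5. Notes

In pandas, a Series is a one-dimensional labeled array that can hold any data type (although in practice it usually stores data of a single type), and every element has an associated index label.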
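The description of the index as an immutable multi-set has two practical consequences, sketched below with made-up labels: duplicate labels are allowed, and individual labels cannot be changed in place, although the whole index can be replaced.

```python
import pandas as pd

# Duplicate labels are allowed: selecting "a" returns every matching element.
s = pd.Series([10, 20, 30], index=["a", "a", "b"])
dup_values = s["a"].tolist()
print(dup_values)

# Individual labels cannot be modified in place.
mutation_failed = False
try:
    s.index[0] = "x"
except TypeError as exc:
    mutation_failed = True
    print("TypeError:", exc)

# The whole index can still be replaced by assignment.
s.index = ["x", "y", "z"]
print(s.index.tolist())
```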

76-6. Usage

76-6-1. Data preparation

None

76-6-2. Code example

# 76. The pandas.Series.index attribute
import pandas as pd
# Create a Series with a custom index
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
# Access the index attribute of the Series
index_obj = s.index
# 76-1. Print the type of index_obj
print(type(index_obj), end='\n\n')
# 76-2. Print the contents of index_obj
print(index_obj, end='\n\n')
# 76-3. Convert the index to a list
index_list = index_obj.tolist()
print(index_list, end='\n\n')
# 76-4. Get the index as a NumPy array
index_array = index_obj.values
print(index_array)

76-6-3. Output

# 76. The pandas.Series.index attribute
# 76-1. Print the type of index_obj
# <class 'pandas.core.indexes.base.Index'>
# 76-2. Print the contents of index_obj
# Index(['a', 'b', 'c', 'd'], dtype='object')
# 76-3. Convert the index to a list
# ['a', 'b', 'c', 'd']
# 76-4. Get the index as a NumPy array
# ['a' 'b' 'c' 'd']

77. The pandas.Series.array attribute

77-1. Syntax

# 77. The pandas.Series.array attribute
pandas.Series.array
The ExtensionArray of the data backing this Series or Index.

Returns:
ExtensionArray
    An ExtensionArray of the values stored within. For extension types, this is the actual array. For NumPy native types, this is a thin (no copy) wrapper around numpy.ndarray.

.array differs from .values, which may require converting the data to a different form.

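77-2. Parameters

None

77-3. Function

Gets the underlying array representation of the data stored in a Series.

77-4. Return value

The return value depends on the type of data in the Series:

77-4-1. For NumPy native types (such as integers, floats, or object-dtype strings), .array returns a NumpyExtensionArray, a thin wrapper around the internal NumPy ndarray that does not copy the data. The returned array therefore shares memory with the data in the Series unless you copy it explicitly.

77-4-2. For extension types (such as categorical data, timestamps, and intervals), .array returns the actual ExtensionArray object. These objects exist to support data types in pandas that are not native to NumPy; they offer an interface similar to a NumPy array, with extra functionality or attributes suited to the specific type.

77-5. Notes

Characteristics of the return value:

77-5-1. Type dependence: The concrete type of the return value depends on the type of data in the Series.

77-5-2. Memory sharing (for NumPy native types): In most cases the returned array shares memory with the data in the Series, which avoids unnecessary copies.

77-5-3. Flexibility: By exposing the underlying array, .array allows lower-level operations or optimizations, although this is not the usual way of working with pandas.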
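The type dependence described above can be observed directly. The following is a minimal sketch with made-up data that prints the class of .array for a NumPy-backed Series, a categorical Series, and a nullable-integer Series. The wrapper class name for the NumPy-backed case differs between pandas versions, so the sketch prints it rather than assuming it.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray

numeric = pd.Series([1, 2, 3])                               # NumPy native type
categorical = pd.Series(["a", "b", "a"], dtype="category")   # extension type
nullable = pd.Series([1, None, 3], dtype="Int64")            # extension type

for name, series in [("numeric", numeric),
                     ("categorical", categorical),
                     ("nullable", nullable)]:
    arr = series.array
    # Every result is an ExtensionArray; only the concrete class differs.
    print(name, type(arr).__name__, isinstance(arr, ExtensionArray))
```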

77-6. Usage

77-6-1. Data preparation

None

77-6-2. Code example

# 77. The pandas.Series.array attribute
import pandas as pd
# Create a simple Series (holding data of a NumPy native type)
s_numpy = pd.Series([1, 2, 3, 4])
# Use the .array attribute to get the underlying array
arr_numpy = s_numpy.array
print(arr_numpy)
print(type(arr_numpy))

77-6-3. Output

# 77. The pandas.Series.array attribute
# <NumpyExtensionArray>
# [1, 2, 3, 4]
# Length: 4, dtype: int64
# <class 'pandas.core.arrays.numpy_.NumpyExtensionArray'>

78. The pandas.Series.values attribute

78-1. Syntax

# 78. The pandas.Series.values attribute
pandas.Series.values
Return Series as ndarray or ndarray-like depending on the dtype.

Warning
We recommend using Series.array or Series.to_numpy(), depending on whether you need a reference to the underlying data or a NumPy array.

Returns:
numpy.ndarray or ndarray-like

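78-2. Parameters

None

78-3. Function

Gets the NumPy representation of the data in a Series.

78-4. Return value

Returns a NumPy ndarray, or an ndarray-like object for some extension dtypes (for example a Categorical for categorical data), containing all the data in the Series. The index is not included.

78-5. Notes

The .values attribute is a quick way to get the data of a Series, especially when passing it to a function or library that expects a NumPy array. Note that for NumPy-backed dtypes the returned array may share memory with the Series, so modifying the array can also modify the Series. With Copy-on-Write enabled (optional in pandas 2.x), the returned array is read-only instead, and writing to it raises an error.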
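The pandas documentation recommends Series.to_numpy() or Series.array over .values. The sketch below, with made-up data, shows two reasons: to_numpy(copy=True) returns an array that is safe to modify, and .values is not always a NumPy ndarray.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# to_numpy(copy=True) always returns an independent array.
safe = s.to_numpy(copy=True)
safe[0] = 100
print(s.tolist())  # the Series still holds its original values

# For a categorical Series, .values is not a NumPy ndarray.
cat = pd.Series(["x", "y", "x"], dtype="category")
print(type(cat.values).__name__)
print(type(cat.to_numpy()).__name__)
```

Using to_numpy() makes the intent explicit: the result is always a NumPy array, and the copy argument states whether sharing memory is acceptable.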

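78-6. Usage

78-6-1. Data preparation

None

78-6-2. Code example

# 78. The pandas.Series.values attribute
import pandas as pd
# Create a simple Series
s = pd.Series([1, 2, 3, 4])
# Use the .values attribute to get a NumPy array
np_array = s.values
# Print the result
print(np_array)
print(type(np_array))
# Modify the NumPy array. Without Copy-on-Write the array shares memory with the Series,
# so the Series changes too; with Copy-on-Write enabled the array is read-only and this raises ValueError.
np_array[0] = 10
# Check whether the Series was modified (in the output shown here it was: the first value is now 10)
print(s)
# For a copy that is guaranteed not to modify the original Series, use .copy()
np_array_copy = s.values.copy()
np_array_copy[0] = 100
print(s)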

78-6-3. Output

# 78. The pandas.Series.values attribute
# [1 2 3 4]
# <class 'numpy.ndarray'>
# 0    10
# 1     2
# 2     3
# 3     4
# dtype: int64
# 0    10
# 1     2
# 2     3
# 3     4
# dtype: int64