Python Data Analysis and Presentation: Sorting Data with pandas - 12: pandas Sorting Code Examples

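Basic statistics (including sorting); distribution and cumulative statistics; data features (correlation, periodicity, etc.); data mining (forming knowledge)

A set of data expresses one or more meanings.

Summarization: the process of deriving lossy features from data.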

Sorting data with pandas

The .sort_index() method sorts by index along the specified axis, in ascending order by default.

.sort_index(axis=0, ascending=True)
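As a minimal sketch of the call above, the snippet below sorts a small Series by its index. The sample data is hypothetical and not part of the original notes. It also shows that .sort_index() returns a new object and leaves the original untouched.

```python
import pandas as pd

# Hypothetical sample data for illustration only
s = pd.Series([10, 20, 30], index=["c", "a", "b"])

# Returns a new Series ordered by index label (ascending by default)
sorted_s = s.sort_index()

print(sorted_s.index.tolist())  # ['a', 'b', 'c']
print(s.index.tolist())         # ['c', 'a', 'b'] (the original is unchanged)
```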

The .sort_values() method sorts by value along the specified axis, in ascending order by default.

Series.sort_values(axis=0, ascending=True)

DataFrame.sort_values(by, axis=0, ascending=True)

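by: a label or list of labels to use as the sort key. With axis=0 these are column labels (rows are reordered); with axis=1 they are row labels (columns are reordered).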
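Since by also accepts a list, a multi-key sort might look like the sketch below. The DataFrame and its column names "group" and "score" are made up for illustration. Passing a list to ascending sets the direction for each key separately.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for this sketch
df = pd.DataFrame(
    {"group": ["x", "y", "x", "y"], "score": [3, 1, 2, 4]},
    index=["r1", "r2", "r3", "r4"],
)

# Sort rows by "group" ascending, then by "score" descending within each group
result = df.sort_values(by=["group", "score"], ascending=[True, False])

print(result.index.tolist())  # ['r1', 'r3', 'r4', 'r2']
```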

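By default, .sort_values() places NaN values at the end of the result, whether the sort is ascending or descending.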
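The NaN behavior can be checked on a small Series, as in the sketch below with hypothetical data. The na_position parameter is not mentioned in the original notes. It is shown here only as the pandas option for changing the default placement.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing one missing value
s = pd.Series([2.0, np.nan, 1.0], index=["a", "b", "c"])

asc = s.sort_values()                       # NaN goes last
desc = s.sort_values(ascending=False)       # NaN still goes last
first = s.sort_values(na_position="first")  # NaN moved to the front

print(asc.index.tolist())    # ['c', 'a', 'b']
print(desc.index.tolist())   # ['a', 'c', 'b']
print(first.index.tolist())  # ['b', 'c', 'a']
```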

Code example

# -*- coding: utf-8 -*-

# @File    : pandas_sort.py
# @Date    : 2018-05-20

# Sorting data with pandas

import pandas as pd
import numpy as np

# Prepare the data
df = pd.DataFrame(np.arange(20).reshape(4, 5), index=["c", "a", "d", "b"])
print(df)
"""
    0   1   2   3   4
c   0   1   2   3   4
a   5   6   7   8   9
d  10  11  12  13  14
b  15  16  17  18  19
"""

# Sort by index in ascending order; the default axis=0 is the row index
print(df.sort_index())
"""
    0   1   2   3   4
a   5   6   7   8   9
b  15  16  17  18  19
c   0   1   2   3   4
d  10  11  12  13  14
"""

# Sort by index in descending order
print(df.sort_index(ascending=False))
"""
    0   1   2   3   4
d  10  11  12  13  14
c   0   1   2   3   4
b  15  16  17  18  19
a   5   6   7   8   9
"""

# Sort along axis=1, the column index
print(df.sort_index(axis=1, ascending=False))
"""
    4   3   2   1   0
c   4   3   2   1   0
a   9   8   7   6   5
d  14  13  12  11  10
b  19  18  17  16  15
"""

# Sort by value: reorder the rows by the values in column 2
print(df.sort_values(2, ascending=False))
"""
    0   1   2   3   4
b  15  16  17  18  19
d  10  11  12  13  14
a   5   6   7   8   9
c   0   1   2   3   4
"""

# Reorder the columns, using the values in row "a" as the sort key
print(df.sort_values("a", axis=1, ascending=False))
"""
    4   3   2   1   0
c   4   3   2   1   0
a   9   8   7   6   5
d  14  13  12  11  10
b  19  18  17  16  15
"""

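# NaN values are placed at the end of the sort result
# (a + b aligns on labels, so positions missing from either frame become NaN)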
a = pd.DataFrame(np.arange(12).reshape(3, 4), index=["a", "b", "c"])
b = pd.DataFrame(np.arange(20).reshape(4, 5), index=["a", "b", "c", "d"])
c = a + b
print(c)
"""
      0     1     2     3   4
a   0.0   2.0   4.0   6.0 NaN
b   9.0  11.0  13.0  15.0 NaN
c  18.0  20.0  22.0  24.0 NaN
d   NaN   NaN   NaN   NaN NaN
"""

print(c.sort_values(2, ascending=False))
"""
      0     1     2     3   4
c  18.0  20.0  22.0  24.0 NaN
b   9.0  11.0  13.0  15.0 NaN
a   0.0   2.0   4.0   6.0 NaN
d   NaN   NaN   NaN   NaN NaN
"""

print(c.sort_values(2, ascending=True))
"""
      0     1     2     3   4
a   0.0   2.0   4.0   6.0 NaN
b   9.0  11.0  13.0  15.0 NaN
c  18.0  20.0  22.0  24.0 NaN
d   NaN   NaN   NaN   NaN NaN
"""