Sunday, September 8, 2024

pandas (Part 1)

pandas is an extension library for Python, built on NumPy and used for data analysis.

Install

pip install pandas

Version

import pandas
pandas.__version__   # 0.25.2

Series: a one-dimensional array

A Series stores its values in order, each paired with an index label.

From a list

  • By default, an integer index starting at 0 is created; values are accessed through the index
var = pandas.Series([1, 2, 3])
var[1]   # 2

0    1
1    2
2    3
dtype: int64

  • Specify a custom index
var = pandas.Series([1, 2, 3], index=["x", "y", "z"])
var["x"]   # 1

x    1
y    2
z    3
dtype: int64
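
With a custom index, a value can be reached either by its label or by its integer position. The sketch below reuses the sample data from above and uses the explicit `.loc` and `.iloc` accessors; the variable names `by_label` and `by_position` are illustrative only.

```python
import pandas

# Same sample data as the example above
s = pandas.Series([1, 2, 3], index=["x", "y", "z"])

by_label = s.loc["y"]      # look up by index label
by_position = s.iloc[1]    # look up by integer position (0-based)

print(by_label, by_position)
```

Both lookups reach the same element here, because "y" is the second label.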

From a dictionary

info = {"name": "Tom", "age": 15, "sex": "male", "email": "123@qq.com"}
var = pandas.Series(info)

name            Tom
age              15
sex            male
email    123@qq.com
dtype: object

  • Select only some keys with index, and give the Series a name
pandas.Series(info, index=["name", "age"], name="student information")

name    Tom
age      15
Name: student information, dtype: object
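
When `index` lists a key that the dictionary does not contain, pandas keeps the label and fills its value with NaN. A minimal sketch, in which "height" is a hypothetical key chosen only because it is absent from the data:

```python
import pandas

info = {"name": "Tom", "age": 15}

# "height" is a hypothetical label that does not exist in info
s = pandas.Series(info, index=["name", "height"])

missing = s.isna()   # boolean Series marking the NaN entries
print(s)
print(missing)
```

Checking `isna()` after construction is one way to catch a misspelled key early.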

Series operations

Access values by index and by slice

var = pandas.Series({"name": "Tom", "age": 15, "sex": "male", "email": "123@qq.com"})
# Each operation below is shown against the initial data defined above

var["name"]  # Tom
var.iloc[0]  # Tom (by position; plain var[0] on a labeled index is deprecated in recent pandas)

# Slicing returns a new Series
var["name":"email"]
var[:"email"]        # Label slice: the end label is included
var[:3]              # Positional slice: the end position is excluded
var.drop(["sex"])    # Returns a new Series with the given label removed

for index, value in var.items():
    print(f"Index: {index}, Value: {value}")  # Index: name, Value: Tom ....

# Modify the original data in place: add / delete / update
var["height"] = 178
del var["email"]
var["age"] = 18
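
The difference between label slices and positional slices is easy to miss, so here is a small sketch that makes it explicit with `.loc` and `.iloc`. It also shows that `drop` leaves the original Series untouched. The variable names are illustrative.

```python
import pandas

var = pandas.Series({"name": "Tom", "age": 15, "sex": "male", "email": "123@qq.com"})

label_slice = var.loc["name":"sex"]   # end label "sex" is included
position_slice = var.iloc[0:2]        # end position 2 is excluded
dropped = var.drop(["sex"])           # new Series; var itself is unchanged

print(list(label_slice.index))
print(list(position_slice.index))
print("sex" in var.index, "sex" in dropped.index)
```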

Basic operations

  • arithmetic operations
var = pandas.Series({"name": "Mei", "age": 15, "sex": "M"})
var * 2
# pandas.Series({"name": "MeiMei", "age": 30, "sex": "MM"})

name    MeiMei
age         30
sex         MM
dtype: object
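
Arithmetic between two Series is matched by index label rather than by position. The following sketch uses made-up sample data; labels that appear in only one operand produce NaN in the result.

```python
import pandas

# Hypothetical sample data
a = pandas.Series({"Mei": 21, "Lily": 15})
b = pandas.Series({"Lily": 1, "Tom": 2})

doubled = a * 2   # the scalar is applied to every element
total = a + b     # aligned by label; unmatched labels give NaN

print(doubled)
print(total)
```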

  • filter

Comparing values of different types (for example, a string with a number) raises TypeError.

var = pandas.Series({"name": "Tom", "sex": "M"})
var[var > 'M']    
# pandas.Series({"name": "Tom"})

var = pandas.Series({"Mei": 21, "Lily": 15, "Tom": 19})
var[var > 18]    
# pandas.Series({"Mei": 21, "Tom": 19})
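
Several conditions can be combined into one filter, and the TypeError mentioned above can be reproduced with a Series that mixes strings and numbers. A minimal sketch with illustrative variable names:

```python
import pandas

ages = pandas.Series({"Mei": 21, "Lily": 15, "Tom": 19})

# Combine conditions with & (and) or | (or); each condition needs parentheses
adults_under_21 = ages[(ages > 18) & (ages < 21)]

# A Series holding both a string and a number cannot be compared with 18
mixed = pandas.Series({"name": "Tom", "age": 15})
try:
    mixed > 18
    raised = False
except TypeError:
    raised = True

print(adults_under_21)
print(raised)
```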

  • Math functions
import numpy
var = pandas.Series({"Mei": 21, "Lily": 15, "Tom": 19})
numpy.sum(var)   # 55
numpy.max(var)   # 21
numpy.sqrt(var)  
# pandas.Series({"Mei": 4.582576, "Lily": 3.872983, "Tom": 4.358899})
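
Element-wise NumPy functions return a Series that keeps the original index, while reductions such as `numpy.sum` return a single scalar. The sketch below checks both on the same data:

```python
import numpy
import pandas

var = pandas.Series({"Mei": 21, "Lily": 15, "Tom": 19})

roots = numpy.sqrt(var)   # element-wise: result is a Series with the same index
total = numpy.sum(var)    # reduction: result is a scalar

print(type(roots).__name__, list(roots.index))
print(total)
```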

Series methods

These methods compute statistics or return new objects; none of them modifies the source data.

var = pandas.Series({"noodle": 21, "rice": 2, "bread": 6, "coke": 15})
var.sum()   # 44
var.max()   # 21
var.min()   # 2
var.mean()  # mean: 11.0
var.std()   # sample standard deviation: 8.602325267042627
var.idxmax()   # index label of the maximum value: noodle
var.idxmin()   # index label of the minimum value: rice
var.head(2)    # first 2 rows: pandas.Series({"noodle": 21, "rice": 2}); default is 5
var.tail(2)    # last 2 rows: pandas.Series({"bread": 6, "coke": 15}); default is 5

var.astype('float64')
# pandas.Series({"noodle": 21.0, "rice": 2.0, "bread": 6.0, "coke": 15.0})

var.describe() # Descriptive statistics

count     4.000000
mean     11.000000
std       8.602325
min       2.000000
25%       5.000000
50%      10.500000
75%      16.500000
max      21.000000
dtype: float64
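
The `std` value above is the sample standard deviation, which divides by n - 1. The sketch below recomputes it by hand to show where the number comes from, and uses the `ddof=0` argument to get the population version instead. The variable names are illustrative.

```python
import math
import pandas

var = pandas.Series({"noodle": 21, "rice": 2, "bread": 6, "coke": 15})

mean = var.mean()
squared_deviations = ((var - mean) ** 2).sum()

# Sample standard deviation: divide by n - 1 (the default for var.std())
sample_std = math.sqrt(squared_deviations / (len(var) - 1))

# Population standard deviation: divide by n
population_std = var.std(ddof=0)

print(sample_std, var.std(), population_std)
```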

Properties of Series

var = pandas.Series({"noodle": 21, "rice": 2, "bread": 6, "coke": 15})
var.size      # number of elements: 4
var.shape     # shape: (4,)
var.dtype     # data type: int64
var.values    # values as a numpy.ndarray: [21  2  6 15]
var.index     # index: Index(['noodle', 'rice', 'bread', 'coke'], dtype='object')
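
The index and the values can be pulled out separately and recombined, which is sometimes handy when passing data to code that does not know about pandas. A small sketch, assuming plain Python lists and dictionaries are the target format:

```python
import pandas

var = pandas.Series({"noodle": 21, "rice": 2, "bread": 6, "coke": 15})

labels = var.index.tolist()   # plain list of index labels
values = var.to_numpy()       # numpy.ndarray of the values

# Rebuild a plain dictionary from the two parts
as_dict = dict(zip(labels, values.tolist()))

print(labels)
print(values.shape)
print(as_dict)
```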