About Pandas version: This article is written based on pandas2.2.0.
Updates on the content of this article: As the stable version of pandas changes, this article continues to be updated and continuously improved and supplemented.
Portal: Pandas API Reference Directory
Portal: Pandas version updates and new features
Portal: Pandas tutorial series from easy to deep
Directory of this section
- Pandas.DataFrame.loc[]
-
- grammar:
- return value:
- Syntax description:
-
-
- Filter 1 row, filter 1 column, and filter cells
-
-
- Filter multiple rows and columns
-
- Scope screening
-
- Boolean filtering
-
- Callable filtering
- Related methods:
- Example:
Pandas.DataFrame.loc[]
The Pandas.DataFrame.loc[]
method is used to filter DataFrame
data by index, column name.
- Notice! In this method, the number you pass is not interpreted as a
natural index
, but is passed as a string toDataFrame.loc
as the value of the row index, or the value of the column name. - > ⚠️ Note:
>
> * In this method, the number you pass will not be understood as anatural index
, but will only be passed as a string toDataFrame.loc
as the value of the row index, or the value of the column name.
> * If you perform range filtering on aDataFrame
withmulti-level index
, it must be sorted first. It is recommended to usedf.sort_index(inplace=True)
to sort and then perform range filtering.
> * Support assignment of original data after filtering Example 10
grammar:
DataFrame.loc ['row index','column name']
return value:
- Series or DataFrame or Scalar
- The filtering range, whether the
DataFrame
has multi-level indexes, etc. will all affect the specific return form. - If the filter result is
Series
orScalar
, the filter condition can be forced to be returned inDataFrame
style by enclosing [ ] square brackets. example 1
Syntax description:
1. Filter 1 row, filter 1 column, and filter cells
- DataFrame.loc[index, column name] Example 1
Index filter conditions and column name filter conditions are separated by commas.
- Filter 1 row: DataFrame.loc[_'index'_,: ] When only passing index conditions, the red commas and colons can be omitted. Red colons indicate all columns.
-
Filter 1 column: DataFrame.loc[:, _'Column name'_] _The red colon must be present, indicating all rows. _
-
Filter cells: DataFrame.loc[_'index'_, _'column name'_]
If DataFrame
has multiple levels of indexes and column names, when you want to filter non-top data, you need to use tuple
to pass the level of index and column names. Example 2
2. Filter multiple rows and columns
- DataFrame.loc[['Index 1', 'Index 2', …],['Column name 1', 'Column name 2', …]]Example 3
Multiple index filter conditions are wrapped in square brackets, and multiple column name filter conditions are wrapped in square brackets. The two conditions are separated by commas.
- Filter multiple rows: DataFrame.loc[[_'Index 1', 'Index 2'_, …], ] _When only passing index conditions, the red comma can be omitted. _
- Filter multiple columns: DataFrame.loc[, [_'Column Name 1', 'Column Name 2'_, …]]
- Filter multiple rows and columns at the same time: DataFrame.loc[[_'index 1','index 2', …], ['column name 1','column name 2', … _]]
- > ⚠️ Note:
>
> * Multiple conditions must be wrapped in [ ] square brackets!
>
> * UnlikePython slice
, elements at the beginning and end positions wrapped by [ ] will be included in the filter condition.
>
> * IfDataFrame
has multiple levels of indexes and column names, when you want to filter non-top-level data, you need to usetuple
to pass the level of index and column names. Example 4
>
3. Scope filtering
- DataFrame.loc[['Index1':'Index2'] Example 5
Supports line range filtering, and the starting and ending ranges are separated by English colons. Range filtering of columns is not supported.
- Filter only the row range: DataFrame.loc[[_'Index 1':'Index 2'_], ] _When only passing index conditions, the red comma can be omitted. _
-
**Filter row range + filter 1 column: ** DataFrame.loc[[_'index 1':'index 2'_], _'column name 1'_]
-
**Filter row range + filter multiple columns: ** DataFrame.loc[[_'index 1':'index 2'], ['column name 1','column name 2', …_ ]]
-
⚠️ Note:
- The start and end ranges must be separated by: English colon!
-
Range must be wrapped in [ ] square brackets!
-
Unlike
Python slice
, the range, start and end positions wrapped by [ ] will be included in the filter condition.
If DataFrame
has multiple levels of indexes, column names, and starting range, it must be accurate to the lowest index or column name. Because the top-level index and column name may represent multiple rows or columns, they cannot be used as start conditions. Example 6
If you perform range filtering on a DataFrame
with a multi-level index
, it must be sorted first. It is recommended to use df.sort_index(inplace=True)
to sort and then perform range filtering.
4. Boolean filtering
- DataFrame.loc[row conditions, column conditions]
-
Row filtering: You can pass a
boolean list
with the same length as the row index indicating which rows to keep and which rows to discard. Example 7 -
Row filtering: Rows can be filtered using Boolean operations. If the number of Boolean operations exceeds 3, it is recommended to use advanced indexing
-
⚠️ Note:
Boolean operations on rows are done through column names. Expressed in the form of
df[column name]
. Example 8For multiple conditions, you can use
&
,|
to express union or, but you cannot useand
,or
. Example 8 - Column filtering: Boolean operations are not supported.
5. Callable filtering
- DataFrame.loc[Callable]
You can use Callable
to filter, which is also a Boolean filter in principle. Example 9
Related methods:
➡️ Related methods
- DataFrame.at
Access a single value for a row/column label pair.
DataFrame.iloc
Filter data-natural index method
- DataFrame.xs
Returns a cross-section (row(s) or column(s)) from the Series/DataFrame.
Series.loc
Access group of values using labels.
Example:
Example 1: DataFrame with only single-level index, filtering single data
-
- Filter 1 row and return
Series
by default. If you apply [ ] to the filtering conditions, you can force the return toDataFrame
.
- Filter 1 row and return
import pandas as pd
# ConstructDF
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['max_speed', 'shield'])
# General single row filtering,returnSeries
df.loc['cobra']
# ... max_speed 1
# ... shield 2
# ... Name: cobra, dtype: int64
# Single row filter,Apply conditions[ ],Forced return DataFrame
df.loc[['cobra']]
# ... max_speed shield
# ... cobra 1 2
-
- Filter 1 column and return
Series
by default. If you apply the filter condition to [ ], you can force the return toDataFrame
.
- Filter 1 column and return
# General single column filter'max_speed',returnSeries
df.loc[:,'max_speed']
# ... cobra 1
# ... viper 4
# ... sidewinder 7
# ... Name: max_speed, dtype: int64
# Single column filter,Apply conditions[ ],Forced return DataFrame
df.loc[:,['max_speed']]
# ... max_speed
# ... cobra 1
# ... viper 4
# ... sidewinder 7
-
- When filtering cells, the scalar value
Scalar
is returned by default. If the filtering condition is set to [ ], it can be forced to returnDataFrame
.
- When filtering cells, the scalar value
# General cell filtering,Returns a scalar value
df.loc['cobra', 'max_speed']
# ... 1
# Apply filter conditions [ ],Can force return DataFrame
df.loc[['cobra'], ['max_speed']]
# ... max_speed
# ... cobra 1
Example 2: Multi-level indexed DataFrame, filtering data
-
- Construct demonstration data and observe data content
import pandas as pd
# Build demo data
tuples = [
('shooter', 'Trolls'), ('shooter', 'Necrons'),
('mage', 'Trolls'), ('mage', 'Necrons'),
('warrior', 'Trolls'), ('warrior', 'Necrons')
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[9, 20], [10, 18], [7, 23],
[6, 25], [4, 30], [3, 35]]
df = pd.DataFrame(values, columns=[['Attributes1','Attributes2'], ['Attack speed','Attack power']], index=index)
# Observe data content
df
-
- Filter 1 row or 1 column of top-level index and pass the conditions normally.
# Filter top row index
df.loc['shooter']
# ... Attributes1 Attributes2
# ... Attack speed Attack power
# ... Trolls 9 20
# ... Necrons 10 18
# Filter on top column index
df.loc[:,'Attributes1']
# ... Attack speed
# ... shooter Trolls 9
# ... Necrons 10
# ... mage Trolls 7
# ... Necrons 6
# ... warrior Trolls 4
# ... Necrons 3
-
- To filter non-top-level indexes and column names, you need to use tuples to set the conditions.
# Filter the bottom row
df.loc[('shooter','Trolls')]
# ... Attributes1 Attack speed 9
# ... Attributes2 Attack power 20
# ... Name: (shooter, Trolls), dtype: int64
# Filter the underlying index,Top level column name
df.loc[('shooter','Trolls'),'Attributes1']
# ... Attack speed 9
# ... Name: (shooter, Trolls), dtype: int64
# Filter the underlying index,underlying column names
df.loc[('shooter','Trolls'),('Attributes1','Attack speed')]
# ... 9
9
-
- Multi-level index filtering, the conditions are set with [], and you can also force the return of
DataFrame
- Multi-level index filtering, the conditions are set with [], and you can also force the return of
df.loc[[('shooter','Trolls')],['Attributes1']]
# ... Attributes1
# ... Attack speed
# ... shooter Trolls 9
Example 3: Single-level indexed DataFrame, filtering multiple pieces of data
import pandas as pd
# ConstructDF
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['max_speed', 'shield'])
# Filter multiple rows
df.loc[['cobra', 'viper']]
# ... max_speed shield
# ... cobra 1 2
# ... viper 4 5
# Filter multiple rows、single column
df.loc[['cobra', 'viper'],'max_speed']
# ... cobra 1
# ... viper 4
# ... Name: max_speed, dtype: int64
# Filter multiple rows、multiple columns
df.loc[['cobra', 'viper'],['max_speed', 'shield']]
# ... max_speed shield
# ... cobra 1 2
# ... viper 4 5
Example 4: DataFrame with multi-level index, filtering multiple pieces of data
import pandas as pd
# Build demo data
tuples = [
('shooter', 'Trolls'), ('shooter', 'Necrons'),
('mage', 'Trolls'), ('mage', 'Necrons'),
('warrior', 'Trolls'), ('warrior', 'Necrons')
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[9, 20], [10, 18], [7, 23],
[6, 25], [4, 30], [3, 35]]
df = pd.DataFrame(values, columns=[['Attributes1','Attributes2'], ['Attack speed','Attack power']], index=index)
# Filter multiple top row indexes
df.loc[['shooter','mage']]
# ... Attributes1 Attributes2
# ... Attack speed Attack power
# ... shooter Trolls 9 20
# ... Necrons 10 18
# ... mage Trolls 7 23
# ... Necrons 6 25
# Filter multiple underlying navigation indexes
df.loc[[('shooter','Trolls'),('mage','Necrons')]]
# ... Attributes1 Attributes2
# ... Attack speed Attack power
# ... shooter Trolls 9 20
# ... mage Necrons 6 25
# OK、Column combination conditions
df.loc[('shooter','Trolls'),('Attributes2','Attack power')]
# ... 20
# Filter multiple rows at the same time、multiple columns
df.loc[[('shooter','Trolls'),('mage','Necrons')],[('Attributes1','Attack speed'),('Attributes2','Attack power')]]
# ... Attributes1 Attributes2
# ... Attack speed Attack power
# ... shooter Trolls 9 20
# ... mage Necrons 6 25
Example 5: DataFrame with single-level index, filtering row range
import pandas as pd
# ConstructDF
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['max_speed', 'shield'])
# Filter only a range of rows
df.loc['viper':'sidewinder']
# ... max_speed shield
# ... viper 4 5
# ... sidewinder 7 8
# row range + 1List
df.loc['viper':'sidewinder','shield']
# ... viper 5
# ... sidewinder 8
# ... Name: shield, dtype: int64
# row range + multiple columns
df.loc['viper':'sidewinder',['max_speed','shield']]
# ... max_speed shield
# ... viper 4 5
# ... sidewinder 7 8
**Example 6: For a multi-level indexed DataFrame, the starting range of the filtered row range must be accurate to the bottom index or column name. **
import pandas as pd
# Build demo data
tuples = [
('shooter', 'Trolls'), ('shooter', 'Necrons'),
('warrior', 'Trolls'), ('warrior', 'Necrons'),
('mage', 'Trolls'), ('mage', 'Necrons')
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[9, 20], [10, 18], [7, 23],
[6, 25], [4, 30], [3, 35]]
df = pd.DataFrame(values, columns=[['Attributes1','Attributes2'], ['Attack speed','Attack power']], index=index)
# Observation data
# df
# Attributes1 Attributes2
# Attack speed Attack power
# shooter Trolls 9 20
# Necrons 10 18
# warrior Trolls 4 30
# Necrons 3 35
# mage Trolls 7 23
# Necrons 6 25
# Filter from Archer to Mage,Even if they are all top-level indexes,The starting position of the range condition,It must also be accurate to the troll clan,Means to specify this line。Because the top index、List,May represent multiple rows or columns,This cannot be used as a start condition。
df.loc[('shooter','Trolls'):'warrior']
# Attributes1 Attributes2
# Attack speed Attack power
# shooter Trolls 9 20
# Necrons 10 18
# warrior Trolls 4 30
# Necrons 3 35
# Filter row range + column range
df.loc[('shooter','Trolls'):('warrior', 'Trolls'),('Attributes1','Attack speed'):('Attributes2', 'Attack power')]
# Attributes1 Attributes2
# Attack speed Attack power
# shooter Trolls 9 20
# Necrons 10 18
# warrior Trolls 4 30
Example 7: Pass a boolean list indicating which rows are left
import pandas as pd
# Build demo data
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['max_speed', 'shield'])
# boolean list
list_bool = [False, False, True]
# Pass in boolean list,Only keep the first3OK'sidewinder'
df.loc[list_bool]
# ... max_speed shield
# ... sidewinder 7 8
Example 8: Filter rows with Boolean operations
import pandas as pd
# Build demo data
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['Movement speed', 'Armor'])
# filter 'Movement speed column' > 6 Target
df.loc[df['Movement speed'] > 6]
# ... Movement speed Armor
# ... sidewinder 7 8
# filter 'Movement speed column' > 6 Target,at the same time,Only keep the armor column
df.loc[df['Movement speed'] > 6,['Armor']] # Armor added in square brackets,is forDataFrameshow。
# ... Armor
# ... sidewinder 8
# use & express and
df.loc[(df['Movement speed'] > 1) & (df['Armor'] < 8)]
# ... Movement speed Armor
# ... viper 4 5
# use | express or
df.loc[(df['Movement speed'] > 4) | (df['Armor'] < 5)]
# ... Movement speed Armor
# ... cobra 1 2
# ... sidewinder 7 8
Example 9: Filtering using Callable
- 1、lambda
import pandas as pd
# Build demo data
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['Movement speed', 'Armor'])
# uselambda filter Armor column ==8 Target
df.loc[lambda df: df['Armor'] == 8]
# ... Movement speed Armor
# ... sidewinder 7 8
-
- Custom function
import pandas as pd
# Build demo data
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['Movement speed', 'Armor'])
# Define filter function
def slect_df(df):
return df['Armor'] == 8
# Call functions
df.loc[slect_df]
Example 10: Assign value after filtering and change data content
import pandas as pd
# Build demo data
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
index=['cobra', 'viper', 'sidewinder'],
columns=['Movement speed', 'Armor'])
# After filtering,Modify data in batches
df.loc[df['Movement speed']>2] = 50
df
# ... Movement speed Armor
# ... cobra 1 2
# ... viper 50 50
# ... sidewinder 50 50
# After filtering,batch + 30
df.loc[df['Movement speed'] == 50] += 5
df
# ... Movement speed Armor
# ... cobra 1 2
# ... viper 55 55
# ... sidewinder 55 55