How to Select Rows from Pandas DataFrame

Pandas is built on top of the Python NumPy library and has two primary data structures: the one-dimensional Series and the two-dimensional DataFrame. A Pandas DataFrame can hold both homogeneous and heterogeneous data. You can perform basic operations on DataFrame rows, such as selecting, deleting, adding, and renaming.

Create a Pandas DataFrame with data

import pandas as pd
import numpy as np

df = pd.DataFrame()
df['Name'] = ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben']
df['TotalMarks'] = [82, 38, 63, 22, 55, 40]
df['Grade'] = ['A', 'E', 'B', 'E', 'C', 'D']
df['Promoted'] = [True, False, True, False, True, True]
df

    Name  TotalMarks Grade  Promoted
0   John          82     A      True
1    Doe          38     E     False
2   Bill          63     B      True
3    Jim          22     E     False
4  Harry          55     C      True
5    Ben          40     D      True

Selecting rows using []

You can use square brackets with a slice to access rows of a Pandas DataFrame.

df[2:4]

   Name  TotalMarks Grade  Promoted
2  Bill          63     B      True
3   Jim          22     E     False
**Select the rows at positions 2 and 3 (the third and fourth rows) with all columns. The slice end, position 4, is excluded.
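To make the positional behavior of the slice easier to see, here is a minimal sketch that reuses two of the columns above but assumes a hypothetical string index ('a' to 'f'), which is not part of the original example. With such an index, an integer slice in square brackets still counts row positions and leaves out the end position.

```python
import pandas as pd

# Two columns from the example data, with a hypothetical string index
# (an assumption made here only to separate labels from positions).
df = pd.DataFrame(
    {
        "Name": ["John", "Doe", "Bill", "Jim", "Harry", "Ben"],
        "TotalMarks": [82, 38, 63, 22, 55, 40],
    },
    index=list("abcdef"),
)

# An integer slice inside [] selects by position: rows 2 and 3 only.
subset = df[2:4]
print(subset)
```

The rows returned carry the labels 'c' and 'd', which shows that the numbers in the slice refer to positions and not to index labels.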

Selected columns

You can specify the column names when retrieving data from a DataFrame.

df[2:4][['TotalMarks', 'Grade']]

   TotalMarks Grade
2          63     B
3          22     E
**Select the rows at positions 2 and 3 (the third and fourth rows), keeping only the columns 'TotalMarks' and 'Grade'.

Selecting rows using loc[]

df.loc[2:3]

   Name  TotalMarks Grade  Promoted
2  Bill          63     B      True
3   Jim          22     E     False
**Select the rows with index labels 2 through 3 with all columns. Unlike [] and iloc[], loc[] slices by label and includes the end label.

Selected columns

With loc, you can also specify the column names when retrieving data from a DataFrame.

df.loc[2:4, ['TotalMarks', 'Grade']]

   TotalMarks Grade
2          63     B
3          22     E
4          55     C
**Select the rows with index labels 2 through 4 (the end label is included), keeping only the columns 'TotalMarks' and 'Grade'.
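The difference between label-based and position-based slicing is easy to miss, so the following minimal sketch puts loc and iloc side by side on the same data. The variable names by_label and by_position are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["John", "Doe", "Bill", "Jim", "Harry", "Ben"],
        "TotalMarks": [82, 38, 63, 22, 55, 40],
        "Grade": ["A", "E", "B", "E", "C", "D"],
        "Promoted": [True, False, True, False, True, True],
    }
)

# loc slices by index label and includes the end label: labels 2, 3, 4.
by_label = df.loc[2:4, ["TotalMarks", "Grade"]]

# iloc slices by integer position and excludes the end: positions 2, 3.
# Columns are also given by position (1 = TotalMarks, 2 = Grade).
by_position = df.iloc[2:4, [1, 2]]

print(by_label)
print(by_position)
```

With the default integer index the labels and positions happen to coincide, which is why the only visible difference here is the extra row returned by loc.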

Select rows based on condition using loc

df.loc[df['Grade'] == 'E']

  Name  TotalMarks Grade  Promoted
1  Doe          38     E     False
3  Jim          22     E     False
**Select all rows from DataFrame where Grade is 'E'.

Using 'loc' and '!='

df.loc[df['Grade'] != 'E']

    Name  TotalMarks Grade  Promoted
0   John          82     A      True
2   Bill          63     B      True
4  Harry          55     C      True
5    Ben          40     D      True
**Select all rows whose Grade does not equal 'E'.

Combine multiple conditions with & operator

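df.loc[(df['TotalMarks'] >= 50) & (df['TotalMarks'] <= 79)]

    Name  TotalMarks Grade  Promoted
2   Bill          63     B      True
4  Harry          55     C      True
**Select all rows from the DataFrame where total marks are between 50 and 79, both inclusive. Each condition must be wrapped in parentheses because & binds more tightly than the comparison operators.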
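As a complement to the & example, the sketch below shows two related patterns on the same data: Series.between(), which is inclusive on both ends by default and gives the same range filter, and the | operator for an "or" condition. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["John", "Doe", "Bill", "Jim", "Harry", "Ben"],
        "TotalMarks": [82, 38, 63, 22, 55, 40],
        "Grade": ["A", "E", "B", "E", "C", "D"],
        "Promoted": [True, False, True, False, True, True],
    }
)

# between() includes both endpoints by default, so this matches
# (TotalMarks >= 50) & (TotalMarks <= 79).
in_range = df.loc[df["TotalMarks"].between(50, 79)]

# | combines conditions with "or": marks below 50 or above 79.
out_of_range = df.loc[(df["TotalMarks"] < 50) | (df["TotalMarks"] > 79)]

print(in_range)
print(out_of_range)
```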

Selected columns using loc

df.loc[(df['TotalMarks'] >= 50) & (df['TotalMarks'] <= 79), ['Name', 'TotalMarks', 'Grade']]

    Name  TotalMarks Grade
2   Bill          63     B
4  Harry          55     C
**Retrieve the Name, TotalMarks, and Grade columns for rows where total marks are between 50 and 79, both inclusive.

Using loc[] and isin()

df.loc[df['Grade'].isin(['A', 'B'])]

   Name  TotalMarks Grade  Promoted
0  John          82     A      True
2  Bill          63     B      True
**Select all rows where grade is 'A' or 'B'.

Selected column using loc[] and isin()

df.loc[df['Grade'].isin(['A', 'B']), ['Name', 'TotalMarks', 'Grade']]

   Name  TotalMarks Grade
0  John          82     A
2  Bill          63     B
**Select only the Name, TotalMarks, and Grade columns where grade is 'A' or 'B'.
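The isin() mask can also be inverted. The sketch below, using the same data, applies the ~ operator to select the rows whose grade is not in the list. The variable name not_top is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["John", "Doe", "Bill", "Jim", "Harry", "Ben"],
        "TotalMarks": [82, 38, 63, 22, 55, 40],
        "Grade": ["A", "E", "B", "E", "C", "D"],
        "Promoted": [True, False, True, False, True, True],
    }
)

# ~ negates the boolean mask: keep rows whose Grade is neither 'A' nor 'B'.
not_top = df.loc[~df["Grade"].isin(["A", "B"]), ["Name", "Grade"]]
print(not_top)
```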

Using DataFrame.query()

df.query('Grade == "A" or Grade == "B"')

   Name  TotalMarks Grade  Promoted
0  John          82     A      True
2  Bill          63     B      True
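query() expressions can also refer to Python variables by prefixing them with @, and they support the in operator for membership tests. The following is a minimal sketch on the same data. The variables wanted and min_marks are hypothetical names introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["John", "Doe", "Bill", "Jim", "Harry", "Ben"],
        "TotalMarks": [82, 38, 63, 22, 55, 40],
        "Grade": ["A", "E", "B", "E", "C", "D"],
        "Promoted": [True, False, True, False, True, True],
    }
)

wanted = ["A", "B"]   # hypothetical list of grades to keep
min_marks = 50        # hypothetical threshold

# "in" with an @-prefixed variable behaves like isin() on the column.
by_grade = df.query("Grade in @wanted")

# Local variables can be used in comparisons in the same way.
by_marks = df.query("TotalMarks >= @min_marks")

print(by_grade)
print(by_marks)
```

Keeping the values in variables avoids building the query string by hand when the filter criteria change.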