Selecting columns from a Pandas DataFrame

Selecting one or more columns from a Pandas DataFrame is one of the most frequently performed tasks in data manipulation. Pandas provides several techniques to efficiently retrieve subsets of data from a DataFrame. The Python indexing operator '[]' and the attribute operator '.' allow simple and fast access to DataFrame columns across a wide range of use cases. This article discusses different ways to select columns, which is especially useful when a DataFrame has a large number of columns.

Create a DataFrame with data

import pandas as pd
import numpy as np

df = pd.DataFrame()
df['Name'] = ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben']
df['TotalMarks'] = [82, 38, 63, 22, 55, 40]
df['Grade'] = ['A', 'E', 'B', 'E', 'C', 'D']
df['Promoted'] = [True, False, True, False, True, True]

    Name  TotalMarks Grade  Promoted
0   John          82     A      True
1    Doe          38     E     False
2   Bill          63     B      True
3    Jim          22     E     False
4  Harry          55     C      True
5    Ben          40     D      True

Selecting a single column from a Pandas DataFrame

To select a single column, pass its name to the indexing operator [] on the DataFrame itself. The result is a pandas Series.
df['Name']
0     John
1      Doe
2     Bill
3      Jim
4    Harry
5      Ben
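
The introduction also mentions the attribute operator '.'. The following is a minimal sketch of how attribute access compares with bracket access. It uses a shortened copy of the example data, and the 'Total Marks' column (with a space) is a hypothetical addition made only to illustrate a name that attribute access cannot reach.

```python
import pandas as pd

# Shortened copy of the article's example data.
df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill'],
    'TotalMarks': [82, 38, 63],
})

# Bracket access and attribute access return the same Series.
by_bracket = df['Name']
by_attribute = df.Name
same_column = by_bracket.equals(by_attribute)
print(same_column)

# Hypothetical column whose name contains a space: it is not a valid
# Python identifier, so only bracket access can select it.
df['Total Marks'] = df['TotalMarks']
with_space = df['Total Marks']
print(with_space.tolist())
```

Bracket access is generally the safer choice, because it also works when a column name contains spaces or collides with an existing DataFrame attribute or method.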

Selecting multiple columns from a Pandas DataFrame

To select multiple columns from a DataFrame, pass a list of column names inside the selection brackets [].
df[['Name','TotalMarks']]
    Name  TotalMarks
0   John          82
1    Doe          38
2   Bill          63
3    Jim          22
4  Harry          55
5    Ben          40

Here the inner square brackets [] define a Python list of column names from the DataFrame, whereas the outer brackets [] are used to select the data from the DataFrame.
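
As a small sketch of the difference between the two bracket forms, the snippet below compares the type and shape of each selection. The variable names (single, one_col, two_cols) are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
})

single = df['Name']                      # one label: a Series
one_col = df[['Name']]                   # list with one label: a DataFrame
two_cols = df[['Name', 'TotalMarks']]    # list with two labels: a DataFrame

print(type(single).__name__, single.shape)      # Series (6,)
print(type(one_col).__name__, one_col.shape)    # DataFrame (6, 1)
print(type(two_cols).__name__, two_cols.shape)  # DataFrame (6, 2)
```

Passing a list, even a list with a single name, keeps the result two-dimensional, which matters when later code expects a DataFrame rather than a Series.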

If you want to get the dimensionality of the DataFrame, use the shape attribute, which returns a (rows, columns) tuple.

df[['Name','TotalMarks']].shape
(6, 2)

Selecting a range of columns

# select the second and third columns with all rows
df[df.columns[1:3]]

   TotalMarks Grade
0          82     A
1          38     E
2          63     B
3          22     E
4          55     C
5          40     D
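
The selection above works in two steps, which the following sketch separates: df.columns[1:3] first slices the column labels by position, and the resulting labels are then passed to []. The variable names cols, subset, and same are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
    'Promoted': [True, False, True, False, True, True],
})

# Step 1: slice the column labels by position (stop position excluded).
cols = df.columns[1:3]
print(list(cols))

# Step 2: use those labels to select the columns.
subset = df[cols]

# The result matches selecting the same columns by explicit name.
same = subset.equals(df[['TotalMarks', 'Grade']])
print(same)
```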

Select two columns with the first 3 rows

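DataFrame.loc accesses a group of rows and columns by label(s) or a boolean array. Unlike ordinary Python slices, label slices in loc include both endpoints, so 0:2 returns the rows labeled 0, 1, and 2.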
df.loc[0:2, 'Name':'TotalMarks']
   Name  TotalMarks
0  John          82
1   Doe          38
2  Bill          63
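
The description of loc also mentions boolean arrays. A minimal sketch of that form, assuming the existing Promoted column is used as the mask, might look like this (the names mask and promoted are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
    'Promoted': [True, False, True, False, True, True],
})

# Boolean Series: True for the rows to keep.
mask = df['Promoted']

# Keep only the rows where mask is True, and only two columns by label.
promoted = df.loc[mask, ['Name', 'Grade']]
print(promoted)
```

The rows keep their original index labels, so the result is indexed 0, 2, 4, 5 rather than renumbered.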

Select all columns of the first row

df.loc[0, :]
Name          John
TotalMarks      82
Grade            A
Promoted      True

How to select column values from a Pandas DataFrame

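Select all rows with the first three columns

DataFrame.iloc selects rows and columns by integer position. As with ordinary Python slices, the stop position is excluded.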

df.iloc[:, 0:3]
    Name  TotalMarks Grade
0   John          82     A
1    Doe          38     E
2   Bill          63     B
3    Jim          22     E
4  Harry          55     C
5    Ben          40     D

Select the first three rows with the first four columns

df.iloc[0:3, 0:4]
   Name  TotalMarks Grade  Promoted
0  John          82     A      True
1   Doe          38     E     False
2  Bill          63     B      True
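
Because loc slices by label and iloc slices by position, the same three rows and two columns are written with different slice bounds. The sketch below compares the two. It assumes the default integer index, where row labels happen to coincide with row positions, and the names by_label and by_position are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
    'Promoted': [True, False, True, False, True, True],
})

# loc: label-based, both endpoints included (rows 0, 1, 2).
by_label = df.loc[0:2, 'Name':'TotalMarks']

# iloc: position-based, stop excluded (rows 0, 1, 2; columns 0, 1).
by_position = df.iloc[0:3, 0:2]

print(by_label.equals(by_position))
```

With a non-default index (for example, after filtering or sorting), labels and positions no longer line up, and the two accessors can return different rows for the same numbers.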