Selecting columns from a Pandas DataFrame
Selecting one or more columns from a Pandas DataFrame is one of the most frequently performed tasks when manipulating data. Pandas provides several techniques to efficiently retrieve subsets of data from a DataFrame. The Python indexing operator '[]' and the attribute operator '.' allow simple and fast access to DataFrame columns across a wide range of use cases. This article discusses different ways to select columns, which is especially useful when a DataFrame has a large number of columns.
Create a DataFrame with data
import pandas as pd

# Build the example DataFrame one column at a time
df = pd.DataFrame()
df['Name'] = ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben']
df['TotalMarks'] = [82, 38, 63, 22, 55, 40]
df['Grade'] = ['A', 'E', 'B', 'E', 'C', 'D']
df['Promoted'] = [True, False, True, False, True, True]
df
    Name  TotalMarks  Grade  Promoted
0   John          82      A      True
1    Doe          38      E     False
2   Bill          63      B      True
3    Jim          22      E     False
4  Harry          55      C      True
5    Ben          40      D      True
Selecting a single column from a Pandas DataFrame
To select a single column, pass its label to the indexing operator []. The result is a Pandas Series.
df['Name']
0   John
1    Doe
2   Bill
3    Jim
4  Harry
5    Ben
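The introduction mentions the attribute operator '.' as an alternative to '[]'. The following is a minimal sketch of how the two compare, using a reduced copy of the example DataFrame. The 'count' column is a hypothetical name added only to illustrate a label that clashes with a DataFrame method.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
})

# Bracket access and attribute access return the same Series
by_bracket = df['Name']
by_attribute = df.Name
print(by_bracket.equals(by_attribute))  # True

# Hypothetical column whose label matches an existing DataFrame method
df['count'] = [1, 2, 3, 4, 5, 6]

# Attribute access now resolves to the method, not the column
print(callable(df.count))     # True

# Bracket access still returns the column
print(df['count'].tolist())   # [1, 2, 3, 4, 5, 6]
```

Attribute access is convenient for quick exploration, but bracket access is the safer choice when a label contains spaces or matches an existing DataFrame attribute.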
Selecting multiple columns from a Pandas DataFrame
To select multiple columns from a DataFrame, pass a list of column names inside the selection brackets [].
df[['Name','TotalMarks']]
    Name  TotalMarks
0   John          82
1    Doe          38
2   Bill          63
3    Jim          22
4  Harry          55
5    Ben          40
Here the inner square brackets [] define a Python list of column names, whereas the outer brackets [] select the data from the DataFrame. The result is itself a DataFrame.
To get the dimensionality of the selection as a (rows, columns) tuple, use the shape attribute:
df[['Name','TotalMarks']].shape
(6, 2)
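The distinction between single and double brackets also determines the type of the result. The sketch below, which rebuilds a reduced copy of the example DataFrame, compares the two forms for the same column.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
})

single = df['Name']       # a label -> one-dimensional Series
as_frame = df[['Name']]   # a list of labels -> two-dimensional DataFrame

print(type(single).__name__)     # Series
print(type(as_frame).__name__)   # DataFrame
print(single.shape)              # (6,)
print(as_frame.shape)            # (6, 1)
```

Use the double-bracket form when later code expects a DataFrame, even if only one column is needed.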
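Selecting a range of columns
df.columns is the Index of column labels. Slicing it by position (the end position is excluded) gives the labels to select.
# select the second and third columns with all rows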
df[df.columns[1:3]]
   TotalMarks  Grade
0          82      A
1          38      E
2          63      B
3          22      E
4          55      C
5          40      D
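To make the range selection above more concrete, this sketch inspects the labels produced by slicing df.columns and checks that the result matches a purely positional selection with iloc. It rebuilds the example DataFrame so that it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
    'Promoted': [True, False, True, False, True, True],
})

# Slicing the column Index by position yields the labels at positions 1 and 2
cols = df.columns[1:3]
print(list(cols))   # ['TotalMarks', 'Grade']

# Selecting by those labels gives the same result as selecting by position
by_label = df[cols]
by_position = df.iloc[:, 1:3]
print(by_label.equals(by_position))   # True
```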
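Select two columns with the first three rows
DataFrame.loc accesses a group of rows and columns by label(s) or a boolean array. Unlike ordinary Python slices, label slices with loc include both endpoints, so 0:2 returns the rows labeled 0, 1 and 2.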
df.loc[0:2, 'Name':'TotalMarks']
   Name  TotalMarks
0  John          82
1   Doe          38
2  Bill          63
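Because loc slices by label and iloc slices by position, the same slice bounds can return different numbers of rows. The following sketch contrasts the two on the example DataFrame, assuming the default integer index shown above.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
    'Promoted': [True, False, True, False, True, True],
})

# Label-based slice: both endpoints are included
by_label = df.loc[0:2, 'Name':'TotalMarks']
print(by_label.shape)      # (3, 2)

# Position-based slice: the end position is excluded
by_position = df.iloc[0:2, 0:2]
print(by_position.shape)   # (2, 2)

# The positional equivalent of the loc call needs 0:3 for the rows
print(df.iloc[0:3, 0:2].equals(by_label))   # True
```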
Select all columns of the first row
df.loc[0, :]
Name        John
TotalMarks    82
Grade          A
Promoted    True
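Selecting a row with a single label returns a Series whose index holds the column names. As a small sketch of the alternative, passing the label inside a list keeps the result as a one-row DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
    'Promoted': [True, False, True, False, True, True],
})

# A scalar row label returns a Series indexed by the column names
row_series = df.loc[0, :]
print(row_series.index.tolist())   # ['Name', 'TotalMarks', 'Grade', 'Promoted']
print(row_series['TotalMarks'])    # 82

# A list with one row label returns a DataFrame with a single row
row_frame = df.loc[[0], :]
print(row_frame.shape)             # (1, 4)
```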

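Select all rows with the first three columns
DataFrame.iloc selects rows and columns by integer position. The end of a position slice is excluded, so 0:3 covers positions 0, 1 and 2.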
df.iloc[:, 0:3]
    Name  TotalMarks  Grade
0   John          82      A
1    Doe          38      E
2   Bill          63      B
3    Jim          22      E
4  Harry          55      C
5    Ben          40      D
Select the first three rows with the first four columns
df.iloc[0:3, 0:4]
   Name  TotalMarks  Grade  Promoted
0  John          82      A      True
1   Doe          38      E     False
2  Bill          63      B      True
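The iloc examples above use contiguous slices. As a brief sketch of two related forms, iloc also accepts a list of positions for non-adjacent columns and negative positions counted from the end.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben'],
    'TotalMarks': [82, 38, 63, 22, 55, 40],
    'Grade': ['A', 'E', 'B', 'E', 'C', 'D'],
    'Promoted': [True, False, True, False, True, True],
})

# A list of positions selects non-adjacent columns (first and third here)
first_and_third = df.iloc[:, [0, 2]]
print(first_and_third.columns.tolist())   # ['Name', 'Grade']

# A negative position counts from the end; -1 is the last column
last_column = df.iloc[:, -1]
print(last_column.name)                   # Promoted
```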