Get Column Names as List in Pandas DataFrame

Python Pandas is a powerful library for data manipulation and analysis, designed to handle diverse datasets with ease. It provides a wide range of functions for cleaning, transforming, visualizing, and analyzing data. The columns of a Pandas DataFrame can hold different types of data, including text, numerical values, and Boolean (logical) values, and the library offers efficient tools for working with each of these types.

The following examples show how to get DataFrame column headers using various methods.

A fast and simple way to get the column header names is:

DataFrame.columns.values.tolist()

Examples:

Create a Pandas DataFrame with data:

import pandas as pd

# Build the DataFrame one column at a time
df = pd.DataFrame()
df['Students_name'] = ['John', 'Doe', 'Bill', 'Jim', 'Harry', 'Ben']
df['TotalMarks'] = [82, 38, 63, 22, 55, 40]
df['Grade'] = ['A', 'E', 'B', 'E', 'C', 'D']
df['IsPromoted'] = [True, False, True, False, True, True]
df

  Students_name  TotalMarks Grade  IsPromoted
0          John          82     A        True
1           Doe          38     E       False
2          Bill          63     B        True
3           Jim          22     E       False
4         Harry          55     C        True
5           Ben          40     D        True

Get a list of column headers from a Pandas DataFrame:

df.columns.values.tolist()
['Students_name', 'TotalMarks', 'Grade', 'IsPromoted']

Or

list(df.columns)
['Students_name', 'TotalMarks', 'Grade', 'IsPromoted']
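
As a minimal sketch of the two approaches above, the following builds a small DataFrame with the column names shown in the output and checks that both calls return the same plain Python list. The sample rows are illustrative only.

```python
import pandas as pd

# Illustrative two-row DataFrame using the column names from the output above
df = pd.DataFrame({
    'Students_name': ['John', 'Doe'],
    'TotalMarks': [82, 38],
    'Grade': ['A', 'E'],
    'IsPromoted': [True, False],
})

# Route 1: Index -> NumPy array -> list
via_values = df.columns.values.tolist()

# Route 2: build a list directly from the Index
via_list = list(df.columns)

print(via_values)              # ['Students_name', 'TotalMarks', 'Grade', 'IsPromoted']
print(via_values == via_list)  # True
```

Both results are ordinary lists, so they can be indexed, sliced, or passed to any function that expects a list.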

If you want the column header names as a Python tuple:

*df,
('Students_name', 'TotalMarks', 'Grade', 'IsPromoted')

Note the trailing comma in *df, which is what makes the result a tuple. Without it, *df on its own is a syntax error.

If you want the column header names as a Python set:

{*df}
{'Grade', 'IsPromoted', 'Students_name', 'TotalMarks'}

If you want the column header names as a Python list:

[*df]
['Students_name', 'TotalMarks', 'Grade', 'IsPromoted']
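
The three unpacking forms above work because iterating over a DataFrame yields its column labels. The sketch below, using an illustrative one-row DataFrame, compares each form with the equivalent constructor call.

```python
import pandas as pd

# Illustrative one-row DataFrame; only the column labels matter here
df = pd.DataFrame({
    'Students_name': ['John'],
    'TotalMarks': [82],
    'Grade': ['A'],
    'IsPromoted': [True],
})

as_tuple = *df,   # the trailing comma makes this a tuple
as_set = {*df}    # sets are unordered, so display order may vary
as_list = [*df]

print(as_tuple == tuple(df))  # True
print(as_set == set(df))      # True
print(as_list == list(df))    # True
```

Because a set has no defined order, use the tuple or list form whenever the column order matters.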

A Pandas DataFrame follows dict-like conventions: its keys are the column names, so iterating over a DataFrame yields the column labels, and each column can be accessed by its name. You can therefore also get the column headers with keys(), which returns a pandas Index rather than a plain list:

df.keys()
Index(['Students_name', 'TotalMarks', 'Grade', 'IsPromoted'], dtype='object')
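
The dict-like behaviour described above can be sketched as follows. The DataFrame is illustrative, and 'Age' is a hypothetical column name used only to show a failed membership test.

```python
import pandas as pd

# Illustrative DataFrame with the column names from the output above
df = pd.DataFrame({
    'Students_name': ['John', 'Doe'],
    'TotalMarks': [82, 38],
    'Grade': ['A', 'E'],
    'IsPromoted': [True, False],
})

keys = df.keys()    # a pandas Index, same labels as df.columns
names = list(keys)  # convert to a plain list when one is needed

# Membership tests check column labels, as with dict keys
has_grade = 'Grade' in df
has_age = 'Age' in df  # hypothetical column that does not exist

# items() yields (column name, Series) pairs, like dict.items()
first_values = {name: column.iloc[0] for name, column in df.items()}

print(names)      # ['Students_name', 'TotalMarks', 'Grade', 'IsPromoted']
print(has_grade)  # True
print(has_age)    # False
```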

If you want the column header names sorted alphabetically:

sorted(df)
['Grade', 'IsPromoted', 'Students_name', 'TotalMarks']

Conclusion

sorted(df) returns the column names in alphabetical order, which generally differs from the order in which the columns appear in the DataFrame. list(df), by contrast, returns the names in their original order. If the original column order matters, use list(df) rather than sorted(df).
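
To make the difference concrete, here is a small sketch with an illustrative DataFrame whose columns are not in alphabetical order. It also shows one way to reorder the columns alphabetically when that is actually the goal.

```python
import pandas as pd

# Illustrative DataFrame; the columns are deliberately not in alphabetical order
df = pd.DataFrame({
    'Students_name': ['John', 'Doe'],
    'TotalMarks': [82, 38],
    'Grade': ['A', 'E'],
    'IsPromoted': [True, False],
})

original_order = list(df)  # order in which the columns appear
alphabetical = sorted(df)  # new list, sorted alphabetically

print(original_order)  # ['Students_name', 'TotalMarks', 'Grade', 'IsPromoted']
print(alphabetical)    # ['Grade', 'IsPromoted', 'Students_name', 'TotalMarks']

# Selecting with the sorted names returns a new DataFrame with reordered columns;
# df itself is left unchanged
df_sorted = df[alphabetical]
print(list(df_sorted) == alphabetical)  # True
```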