Pandas DataFrame operations
Data comes in a variety of types. A data type is essentially an internal construct that a programming language uses to understand how to store and operate on data. The format of individual rows and columns affects any analysis performed on a dataset once it is read into a programming environment. The Pandas DataFrame is a structure that contains two-dimensional data and its corresponding labels.
- Types of Data
- Numeric Data Types
- Text Data Type
How to Check the Data Type in Pandas DataFrame
It is worth checking the data types in a DataFrame. Pandas automatically assigns types based on the values it reads from the original dataset, and for a number of reasons this assignment may or may not be correct. The data type of a column in a Pandas DataFrame or a Series is known as its dtype. You can use the dtype property to get the type of a specific column. You can use the following syntax to check the data type of all columns in a Pandas DataFrame:
df.dtypes
Alternatively, you may use the syntax below to check the data type of a specific column in a DataFrame:
df['DataFrame Column'].dtypes
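As a minimal sketch of both checks, the example below builds a small DataFrame and inspects its types. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are for illustration only.
df = pd.DataFrame({
    "age": [25, 32, 47],
    "height": [1.70, 1.82, 1.65],
    "name": ["Ann", "Bob", "Cy"],
})

# dtypes of every column, returned as a Series indexed by column name
all_types = df.dtypes
print(all_types)

# dtype of one specific column
age_type = df["age"].dtype
print(age_type)
```

The text column is reported as a generic object or string type depending on the pandas version, which is one reason to check the assigned types before doing any analysis.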
How to change column type in pandas?
You have four main options for converting types in pandas:
- astype()
- to_numeric()
- infer_objects()
- convert_dtypes()
astype()
The astype() method is generally used for casting a pandas object to a specified dtype. It can also convert any suitable existing column to a categorical type. Example:
df = df.astype(int) # convert all columns to int64
df = df.astype({"x": int, "y": complex}) # column "x" to int64 dtype and "y" to complex type
s = s.astype(np.float16) # Series to float16 type
s = s.astype(str) # Series to Python strings
s = s.astype('category') # Series to categorical type
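The snippets above assume that df and s already exist. A self-contained sketch, using hypothetical sample data, might look like the following. It also shows that astype() raises an error when a value cannot be cast to the requested type.

```python
import pandas as pd

# Hypothetical sample data: "x" holds numbers stored as strings.
df = pd.DataFrame({"x": ["1", "2", "3"], "y": [1.5, 2.5, 3.5]})

# Cast each column to an explicitly named dtype
converted = df.astype({"x": "int64", "y": "float32"})
print(converted.dtypes)

# Cast a Series of repeated labels to the categorical type
grades = pd.Series(["a", "b", "a"]).astype("category")
print(grades.dtype)

# astype() fails when a value cannot be converted
try:
    pd.Series(["1", "two"]).astype("int64")
    failed = False
except ValueError:
    failed = True
print("conversion failed:", failed)
```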
to_numeric()
The pandas to_numeric() function tries to convert non-numeric objects (such as strings) into integers or floating-point numbers as appropriate.
df["a"] = pd.to_numeric(df["a"]) # column "a" of a DataFrame
df[["a", "b"]] = df[["a", "b"]].apply(pd.to_numeric) # convert columns "a" and "b"
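Real data often contains values that cannot be parsed as numbers. The sketch below, built on hypothetical sample Series, uses the errors and downcast parameters of to_numeric() to handle such values and to pick a smaller integer type.

```python
import pandas as pd

# Hypothetical sample data: the last value is not a number.
s = pd.Series(["1", "2.5", "abc"])

# errors="coerce" replaces unparseable values with NaN instead of raising
coerced = pd.to_numeric(s, errors="coerce")
print(coerced)

# Strings that all look like integers become an integer column
ints = pd.to_numeric(pd.Series(["1", "2", "3"]))
print(ints.dtype)

# downcast="integer" picks the smallest integer dtype that fits the values
small = pd.to_numeric(pd.Series(["1", "2", "3"]), downcast="integer")
print(small.dtype)
```

Without errors="coerce", the first call would raise a ValueError because "abc" cannot be parsed.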
infer_objects()
Use infer_objects() to convert columns of a Pandas DataFrame that have an object dtype to a more specific type.

For example, if column 'a' holds integers but is stored with the object dtype, infer_objects() changes its type to int64.
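A minimal sketch of this conversion is shown here. The DataFrame is a hypothetical example in which both columns are deliberately created with the object dtype.

```python
import pandas as pd

# Hypothetical sample data, forced to the object dtype on creation.
# Column "a" holds real integers; column "b" holds strings.
df = pd.DataFrame({"a": [7, 1, 5], "b": ["3", "2", "1"]}, dtype="object")

before = df.dtypes
print(before)

# infer_objects() returns a new DataFrame with better-fitting dtypes
inferred = df.infer_objects()
print(inferred.dtypes)
```

Column "a" becomes int64 because its values are already integers. Column "b" is not turned into a numeric column because its values are strings; converting those would need to_numeric() or astype().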

convert_dtypes()
You can use the pandas convert_dtypes() method to convert the default assigned data types to more suitable ones automatically. One big advantage of convert_dtypes() is that it supports pd.NA, the newer marker for missing values, in addition to NaN.
import pandas as pd
import numpy as np

# create a DataFrame with some missing values
df = pd.DataFrame({"Roll_No.": [101, 102, 103],
                   "Name": ["John", "Doe", "Bill"],
                   "Result": ["Pass", "Fail", np.nan],
                   "Promoted": [True, False, np.nan],
                   "Marks": [80.34, 36.6, np.nan]})

# print the DataFrame (display(df) also works inside a Jupyter notebook)
print("PRINTING DATAFRAME")
print(df)

# check the dtypes assigned by default
print()
print("PRINTING DATATYPE")
print(df.dtypes)

# convert to dtypes that support pd.NA and check again
print()
print("AFTER CONVERTING DATATYPE")
print(df.convert_dtypes().dtypes)