Create a Pandas DataFrame from List of Dicts

To convert a list of dicts to a pandas DataFrame, use one of the following methods:

  1. pd.DataFrame(data)
  2. pd.DataFrame.from_dict(data)
  3. pd.DataFrame.from_records(data)

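When the input is a flat list of dicts with consistent keys, all three produce the same DataFrame. They differ in which other inputs they accept and which options they expose (for example, from_dict has an orient parameter and from_records can take an index field), so for other data shapes one method may fit better than the others, or may not work at all.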
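As a quick check of the case where all three methods agree, the sketch below builds the same frame three ways from a small list of dicts. The sample data and variable names are made up for illustration, and the behavior shown assumes a reasonably recent pandas version.

```python
import pandas as pd

# Hypothetical sample data: every dict has the same keys.
records = [
    {"Name": "John", "Age": 25},
    {"Name": "Alice", "Age": 30},
]

df_ctor = pd.DataFrame(records)
df_from_dict = pd.DataFrame.from_dict(records)        # default orient="columns"
df_from_records = pd.DataFrame.from_records(records)

# equals() is True only when the frames hold the same labels and values.
same = df_ctor.equals(df_from_dict) and df_ctor.equals(df_from_records)
print(same)
```

If the dicts were nested or keyed by row label instead, the three calls would no longer be interchangeable.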

pd.DataFrame(data)

This method creates a DataFrame from a list of dictionaries, where each dictionary represents a row in the DataFrame. The keys of the dictionaries become column names, and the values become the row values.

    import pandas as pd

    # Sample list of dictionaries
    data = [
        {'Name': 'John', 'Age': 25, 'City': 'New York'},
        {'Name': 'Alice', 'Age': 30, 'City': 'Los Angeles'},
        {'Name': 'Bob', 'Age': 28, 'City': 'Chicago'}
    ]

    # Convert list of dictionaries to DataFrame
    df = pd.DataFrame(data)
    print(df)
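Real data often has dicts with differing keys. The following sketch, using made-up rows, illustrates how the constructor handles a missing key and how the columns parameter can select a subset of keys; treat it as an illustration rather than a complete treatment of the constructor's options.

```python
import pandas as pd

# Hypothetical rows: the second dict has no "City" key.
rows = [
    {"Name": "John", "Age": 25, "City": "New York"},
    {"Name": "Alice", "Age": 30},
]

# Missing keys are filled with NaN.
df = pd.DataFrame(rows)
print(df)

# Passing columns keeps only the listed keys, in the given order.
subset = pd.DataFrame(rows, columns=["Name", "City"])
print(subset)
```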

pd.DataFrame.from_dict(data)

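This method is documented for a dictionary. With the default orient='columns', keys represent column names and values represent column data, so the dictionary should be in the format {column_name: column_data}. In current pandas versions the default orient hands the data straight to the DataFrame constructor, so a list of dicts is also accepted and gives the same result as pd.DataFrame(data); orient='index' requires an actual dictionary.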

    import pandas as pd

    # Sample dictionary with column data
    data = {
        'Name': ['John', 'Alice', 'Bob'],
        'Age': [25, 30, 28],
        'City': ['New York', 'Los Angeles', 'Chicago']
    }

    # Convert dictionary to DataFrame
    df = pd.DataFrame.from_dict(data)
    print(df)
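The orient parameter is the main thing from_dict offers beyond the plain constructor. As a minimal sketch, assuming the data is a dict of dicts keyed by row label (the by_person name and its contents are hypothetical), orient='index' turns the outer keys into the row index:

```python
import pandas as pd

# Hypothetical data keyed by row label rather than by column name.
by_person = {
    "John": {"Age": 25, "City": "New York"},
    "Alice": {"Age": 30, "City": "Los Angeles"},
}

# Outer keys become the index; inner keys become the columns.
df = pd.DataFrame.from_dict(by_person, orient="index")
print(df)
```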

pd.DataFrame.from_records(data)

This method creates a DataFrame from a sequence of records: tuples, dicts, or a structured NumPy array. Each record becomes a row. Plain tuples carry no field names, so the column names are supplied through the columns parameter; for dicts, the keys become the column names.

    import pandas as pd

    # Sample list of tuples
    data = [
        ('John', 25, 'New York'),
        ('Alice', 30, 'Los Angeles'),
        ('Bob', 28, 'Chicago')
    ]

    # Convert list of tuples to DataFrame
    df = pd.DataFrame.from_records(data, columns=['Name', 'Age', 'City'])
    print(df)
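Since the topic here is a list of dicts, it may help to see from_records applied to one directly. This sketch uses made-up records and the index parameter of from_records to promote one field to the row index; it is one possible usage, not the only way to set an index.

```python
import pandas as pd

# Hypothetical records as dicts instead of tuples.
records = [
    {"Name": "John", "Age": 25, "City": "New York"},
    {"Name": "Alice", "Age": 30, "City": "Los Angeles"},
]

# The "Name" field becomes the index and is dropped from the columns.
df = pd.DataFrame.from_records(records, index="Name")
print(df)
```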

Conclusion

All three methods can be useful depending on the format of your data. If your data is already a list of dictionaries, pd.DataFrame(data) is the most direct choice. If your data is a dictionary of column data, or a dictionary keyed by row label, pd.DataFrame.from_dict(data) is suitable. If your data is a list of tuples, use pd.DataFrame.from_records(data) with the column names specified in the columns parameter.