pandas - Python Data Analysis Library
Pandas is an open-source software library for analysing, cleaning, exploring, and manipulating data, built on top of the Python programming language. The main data structures in Pandas are the Series and the DataFrame (similar to R's data frame). A Pandas Series is a one-dimensional labelled array consisting of data and an index. All the data in a Series is of the same data type. A Pandas DataFrame is a two-dimensional, tabular data structure with column and row indexes. The columns of a DataFrame are made up of Series objects. The pandas module allows developers to import data from various file formats (csv, json, sql, xls, etc.) and perform data manipulation operations, including cleaning and reshaping the data, summarizing observations, grouping data, and merging multiple datasets.
Importing the Pandas module
If you make many calls to pandas functions, writing pandas.x() over and over becomes tedious. Instead, it is conventional to import the library under the short alias pd.
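A minimal sketch of the aliased import. The empty DataFrame is created only to show the alias in use.

```python
import pandas as pd

# With the alias in place, pandas.DataFrame is written as pd.DataFrame.
df = pd.DataFrame()
print(type(df))
```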
Get your data into a DataFrame
There are several ways to take a standard Python data structure and create a Pandas DataFrame from it.
Pandas DataFrame from Python List
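One way this might look, assuming a list of lists in which each inner list is one row. The student names and grades are hypothetical sample values, not data from the original article.

```python
import pandas as pd

# Hypothetical sample data: each inner list becomes one row.
rows = [["Alice", 85], ["Bob", 72], ["Carol", 91]]

# Column labels are supplied separately through the columns argument.
df = pd.DataFrame(rows, columns=["Name", "Grade"])
print(df)
```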
Pandas DataFrame from Python Dictionary
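A sketch of the dictionary route, assuming each key is a column name and each value is a list of that column's entries. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: keys become column labels,
# and the lists (all of equal length) become the column values.
data = {
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [20, 21, 19],
    "Grade": [85, 72, 91],
}
df = pd.DataFrame(data)
print(df)
```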
Working with DataFrame Columns and Rows
Select Columns from DataFrame
From the DataFrame, select only the Name and Grade columns.
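Column selection could be sketched as below, assuming a hypothetical DataFrame with Name, Age, and Grade columns. Passing a list of labels inside the square brackets returns a new DataFrame containing only those columns.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [20, 21, 19],
    "Grade": [85, 72, 91],
})

# Note the double brackets: the inner list names the columns to keep.
subset = df[["Name", "Grade"]]
print(subset)
```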
Select Rows from DataFrame
A Pandas DataFrame uses the loc indexer to return one or more specified rows. Note that loc is used with square brackets, as in df.loc[0], rather than called like a method.
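A short sketch of selecting a single row by label, assuming hypothetical sample data and the default integer index. Here loc is indexed with square brackets; a single label returns the row as a Series.

```python
import pandas as pd

# Hypothetical sample data with the default index 0, 1, 2.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Grade": [85, 72, 91],
})

# Select the row whose index label is 0.
row = df.loc[0]
print(row)
```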
Select Multiple rows from DataFrame
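Selecting several rows might look like the following sketch, again on hypothetical sample data. A list of labels picks specific rows, while a label slice with loc includes both endpoints.

```python
import pandas as pd

# Hypothetical sample data with the default index 0, 1, 2.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Grade": [85, 72, 91],
})

# A list of labels returns those rows as a DataFrame.
picked = df.loc[[0, 2]]

# A label slice is inclusive of both the start and the stop label.
sliced = df.loc[0:1]

print(picked)
print(sliced)
```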
Adding Named Indexes
You can give a DataFrame your own row labels by passing the index argument when creating it.
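As an illustration, the sketch below passes a list of labels through the index argument. The labels s1 to s3 and the student values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
data = {
    "Name": ["Alice", "Bob", "Carol"],
    "Grade": [85, 72, 91],
}

# The index list must have one label per row.
df = pd.DataFrame(data, index=["s1", "s2", "s3"])
print(df)
```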
Retrieve data using Named Index
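A sketch of label-based retrieval, assuming a DataFrame built with the hypothetical row labels s1 to s3. The loc indexer accepts the named label in place of an integer position.

```python
import pandas as pd

# Hypothetical sample data with named row labels.
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Carol"], "Grade": [85, 72, 91]},
    index=["s1", "s2", "s3"],
)

# Look up one row by its label, then a single cell by row and column label.
row = df.loc["s2"]
grade = df.loc["s3", "Grade"]
print(row)
print(grade)
```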
DataFrame from NumPy ndarray
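A minimal sketch, assuming a small two-dimensional array; the column names a, b, and c are chosen only for illustration. Each row of the array becomes a row of the DataFrame.

```python
import numpy as np
import pandas as pd

# A 2 x 3 array: two rows, three columns.
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Column labels are optional; without them pandas uses 0, 1, 2.
df = pd.DataFrame(arr, columns=["a", "b", "c"])
print(df)
```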
View the first or last N rows
- The DataFrame head() method returns the first 5 rows by default
- The DataFrame tail() method returns the last 5 rows by default
You can pass the number of rows as an argument.
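The behaviour described above can be sketched with a hypothetical ten-row DataFrame holding the numbers 0 to 9 in a column named n.

```python
import pandas as pd

# Hypothetical data: a single column n with the values 0 through 9.
df = pd.DataFrame({"n": range(10)})

first_five = df.head()    # default: first 5 rows
last_three = df.tail(3)   # explicit count: last 3 rows

print(first_five)
print(last_three)
```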
Loading Data from files
The functions read_csv (for comma-separated values), read_excel (for Microsoft Excel spreadsheets), read_fwf (for fixed-width formatted text), etc. are used to read data from external files.
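A sketch of read_csv in use. To keep the example self-contained, it reads hypothetical CSV text from an in-memory buffer; in practice you would pass a file path instead.

```python
import io
import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "Name,Grade\nAlice,85\nBob,72\nCarol,91\n"

# The first line is used as the header row by default.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```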
Saving a DataFrame
Read data into a DataFrame and save it to a CSV file with the to_csv() method.
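One possible round trip, assuming hypothetical sample data and a temporary directory so that nothing is left behind on disk. The filename students.csv is an illustrative choice.

```python
import os
import tempfile
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Grade": [85, 72, 91],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "students.csv")  # hypothetical filename
    # index=False leaves the row labels out of the file.
    df.to_csv(path, index=False)
    # Read the file back to confirm what was written.
    restored = pd.read_csv(path)

print(restored)
```

Without index=False, to_csv also writes the row labels as an extra first column, which then shows up as an unnamed column when the file is read back.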
Find column data types
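The dtypes attribute reports the data type of each column. The sketch below uses hypothetical sample data; the exact type names printed can vary with the pandas version and platform.

```python
import pandas as pd

# Hypothetical sample data with text, integer, and float columns.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Grade": [85, 72, 91],
    "Score": [3.5, 2.9, 3.8],
})

# dtypes is an attribute (no parentheses) that returns a Series
# indexed by column name.
print(df.dtypes)
```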
Statistical Summary of Data
The Pandas describe() method outputs a brief statistical summary of the numeric columns in the data, including descriptive statistics of central tendency and dispersion.
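A short sketch of describe() on hypothetical sample data. With mixed column types, the default call summarises only the numeric columns, so Name is omitted from the result.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Grade": [85, 72, 91],
})

# The result is itself a DataFrame, with rows such as
# count, mean, std, min, the quartiles, and max.
summary = df.describe()
print(summary)
```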