Reading CSV files in Python

CSV stands for comma-separated values, and it is a common format for storing tabular data. A CSV file is a text file that contains a list of records, where each record is a list of values separated by commas.

To read a CSV file in Python, you can use the built-in csv module. Its reader() function takes an open file object (or any other iterable of lines), not a file name, and returns a reader object that yields each row as a list of strings as you iterate over it. Optional formatting parameters control how the file is parsed; the most common is delimiter, the character that separates the values. The default delimiter is a comma, but you can specify a different one if necessary. Here's how to read a CSV file in Python:

Reading CSV File Line by Line

import csv

# Open the CSV file in read mode; newline='' lets the csv module handle line endings
with open('data.csv', 'r', newline='') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        print(row)

In this example, the csv.reader() function returns a reader object that reads the CSV file one row at a time. The for loop iterates through each row in the file, and the print() function displays the row as a list of strings.
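To see what the reader yields without needing a file on disk, the sketch below feeds csv.reader() an in-memory buffer instead of an open file. The sample data is made up for illustration, and treating the first row as a header is an assumption about that data.

```python
import csv
import io

# Hypothetical sample data standing in for the contents of data.csv.
sample = "Name,Age,City\nAlice,30,Paris\nBob,25,Berlin\n"

# csv.reader accepts any iterable of lines, including an in-memory buffer.
reader = csv.reader(io.StringIO(sample))

header = next(reader)  # consume the first row (assumed to hold column names)
rows = list(reader)    # collect the remaining rows as lists of strings

print(header)
print(rows)
```

Calling list() on the reader loads every remaining row into memory, which is convenient for small files; for large files, iterating row by row as in the loop above keeps memory use low.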

Reading CSV File with Header

import csv

# Open the CSV file in read mode; newline='' lets the csv module handle line endings
with open('data_with_header.csv', 'r', newline='') as file:
    csv_reader = csv.DictReader(file)
    for row in csv_reader:
        print(row)

When your CSV file includes a header row (with column names), you can use the csv.DictReader class. This class treats the first row as the header and each subsequent row as a dictionary where column names are keys and row values are dictionary values.
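As a minimal sketch of how the dictionaries can be used, the example below reads a small in-memory sample and looks up values by column name. The sample data and its column names are assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data with a header row.
sample = "Name,Age,City\nAlice,30,Paris\nBob,25,Berlin\n"

reader = csv.DictReader(io.StringIO(sample))
records = list(reader)  # each record maps column name -> string value

# fieldnames holds the column names taken from the first row.
print(reader.fieldnames)

for record in records:
    # Values are looked up by column name rather than by position.
    print(record["Name"], record["City"])
```

Looking up values by name keeps the code working even if the column order in the file changes.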

Reading CSV File and Extracting Data

import csv

# Open the CSV file in read mode; newline='' lets the csv module handle line endings
with open('data.csv', 'r', newline='') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        # Assuming the CSV has columns: Name, Age, City
        name, age, city = row
        print(f"Name: {name}, Age: {age}, City: {city}")

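In this example, each row of the CSV file is assumed to hold exactly three values in the order name, age, city. The script unpacks these values and prints them for each row. Note that csv.reader() does not treat the first line specially, so a header row, if present, is printed like any other row, and a row with a different number of fields raises a ValueError during unpacking.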
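Because every value comes back as a string, extracted data usually needs converting before use. The following is one possible sketch, assuming a header row and three columns; the sample data, including the deliberately malformed last line, is made up for illustration.

```python
import csv
import io

# Hypothetical sample data: a header, two valid rows, and one malformed row.
sample = "Name,Age,City\nAlice,30,Paris\nBob,25,Berlin\nbroken line\n"

people = []
reader = csv.reader(io.StringIO(sample))
next(reader, None)  # skip the header row (assumed to be present)

for row in reader:
    if len(row) != 3:
        continue  # ignore rows that do not have exactly three fields
    name, age, city = row
    people.append((name, int(age), city))  # convert age from str to int

print(people)
```

Silently skipping malformed rows is only one design choice; depending on the data, logging them or raising an error may be more appropriate.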


Using the pandas library

Alternatively, you can read CSV files with the pandas library. pandas is a third-party library for data manipulation and analysis, which makes it a good choice when you plan to process the data after loading it. Here's an example of how to use pandas to read a CSV file:

import pandas as pd

df = pd.read_csv('file.csv')
print(df)

This returns a DataFrame, a two-dimensional labeled data structure whose columns can hold different data types. Many options control how the CSV file is parsed, such as skipping rows or defining column names.

For example, you can pass additional parameters to read_csv() to customize how the file is loaded:

df = pd.read_csv("data.csv", delimiter='\t', header=None, names=["col1","col2","col3"])

This reads "data.csv" as a tab-separated file with no header row and assigns the column names "col1", "col2", and "col3".
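The same call can be tried without a file on disk by passing an in-memory buffer, which read_csv() also accepts. This is a small sketch that assumes pandas is installed; the tab-separated sample data is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical tab-separated data with no header row.
sample = "1\tapple\t0.5\n2\tbanana\t0.25\n"

df = pd.read_csv(
    io.StringIO(sample),
    delimiter="\t",                  # fields are separated by tabs
    header=None,                     # the data has no header row
    names=["col1", "col2", "col3"],  # supply the column names ourselves
)

print(df.dtypes)  # pandas infers a type for each column
print(df)
```

Unlike the csv module, which returns every value as a string, pandas infers numeric types for columns that contain only numbers.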

Python Pandas

pandas is an open-source Python library built for data manipulation and analysis. It supports data wrangling and preprocessing as well as data exploration, analysis, and modeling.

CSV Files

A CSV (comma-separated values) file is a plain-text file that stores tabular data in a simple layout. Each line represents one row of the table, and the fields (columns) in each row are separated by commas. The format is widely used for exporting and importing data and can be opened and edited in spreadsheet programs and text editors.
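One detail worth knowing is that a field may itself contain a comma, in which case it is normally wrapped in double quotes. The sketch below, using a made-up sample line, shows why splitting on commas by hand is fragile and how the csv module handles the quoted field.

```python
import csv
import io

# Hypothetical line whose second field contains a comma inside quotes.
line = 'Alice,"Paris, France",30\n'

# Naive splitting breaks the quoted field into two pieces.
naive = line.strip().split(",")

# csv.reader understands the quoting and keeps the field intact.
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # four pieces, with stray quote characters
print(parsed)  # three fields
```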

Conclusion

The csv module can handle a range of CSV formats and delimiters. You can specify a different delimiter using the delimiter parameter (e.g., delimiter='\t' for tab-separated files). The csv module also supports writing CSV files and offers additional configuration options to fine-tune the reading and writing process.
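As a brief illustration of both points, the sketch below writes a few rows as tab-separated text with csv.writer() and reads them back with csv.reader(). It uses an in-memory buffer in place of a real file, and the rows are made up for illustration.

```python
import csv
import io

# Hypothetical rows to write out.
rows = [["Name", "Age"], ["Alice", "30"], ["Bob", "25"]]

# Write the rows as tab-separated values into an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t")
writer.writerows(rows)

# Rewind and read the data back using the same delimiter.
buffer.seek(0)
round_trip = list(csv.reader(buffer, delimiter="\t"))

print(round_trip)
```

When writing to a real file, open it with newline='' as well, so that the line endings produced by the writer are not translated a second time.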