Importing Data with pandas.read_csv()

The simplest way to read data from a CSV file is:

import pandas as pd
df = pd.read_csv('data.csv')
print(df)

Specifying Delimiter

pd.read_csv('data.csv', sep='\t')
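
As a minimal sketch of the sep argument, the snippet below parses tab-separated text. The sample data is made up, and io.StringIO stands in for a real file so the example runs without anything on disk.

```python
import io
import pandas as pd

# Hypothetical tab-separated content standing in for a real file.
tsv_text = "Name\tAge\nAlice\t30\nBob\t25\n"

# sep='\t' tells read_csv to split fields on tab characters.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(df.columns.tolist())  # ['Name', 'Age']
```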

Reading specific Columns only

pd.read_csv('data.csv', usecols=['Name', 'Age'])

Read CSV without headers

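pd.read_csv('data.csv', header=None)
The argument header=None tells pandas that the file has no header row: the first row is read as data and the columns are labeled 0, 1, 2, and so on. (To use the second row as the header instead, pass header=1.)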
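
A small sketch of what header=None does, using made-up in-memory data without a header row. The names argument shown at the end is one way to supply your own column labels; the labels themselves are only illustrative.

```python
import io
import pandas as pd

# Hypothetical data with no header row.
csv_text = "Alice,30\nBob,25\n"

# With header=None every row is data and columns get integer labels.
df = pd.read_csv(io.StringIO(csv_text), header=None)
print(df.columns.tolist())  # [0, 1]
print(len(df))              # 2

# Optionally supply column names yourself (illustrative names).
named = pd.read_csv(io.StringIO(csv_text), header=None, names=["Name", "Age"])
print(named.columns.tolist())  # ['Name', 'Age']
```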


skiprows allows you to specify the number of lines to skip at the start of the file.
df = pd.read_csv('data.csv', skiprows=3)
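
To illustrate, here is a sketch with three made-up preamble lines ahead of the real header. With skiprows=3 those lines are dropped and the fourth line becomes the header.

```python
import io
import pandas as pd

# Hypothetical file: three preamble lines, then the header and data.
csv_text = (
    "exported by some tool\n"
    "version 1\n"
    "do not edit\n"
    "Name,Age\n"
    "Alice,30\n"
    "Bob,25\n"
)

# Skip the first three lines; the next line is used as the header.
df = pd.read_csv(io.StringIO(csv_text), skiprows=3)
print(df.columns.tolist())  # ['Name', 'Age']
```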

Use a specific encoding (e.g. 'utf-8' )

pd.read_csv('data.csv', encoding='utf-8')

Parsing date columns

pd.read_csv('data.csv', parse_dates=['date'])
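
A minimal sketch of parse_dates, assuming a column named date that holds ISO-formatted dates (the data is invented). Once parsed, the column supports datetime accessors such as .dt.month.

```python
import io
import pandas as pd

# Hypothetical data with an ISO-formatted date column.
csv_text = "date,value\n2023-01-15,10\n2023-02-20,20\n"

# parse_dates converts the listed columns to datetime values.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Datetime accessors now work on the column.
months = df["date"].dt.month.tolist()
print(months)  # [1, 2]
```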

Specify dtype

import numpy as np
df = pd.read_csv('data.csv', usecols=['Height'], dtype=np.float32)
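
The following sketch checks that the requested type is actually applied. It uses invented data and passes dtype as a per-column dictionary, which is an alternative to passing one type for every column.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical data; only the Height column is loaded.
csv_text = "Name,Height\nAlice,1.70\nBob,1.82\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Height"],
    dtype={"Height": np.float32},  # per-column type mapping
)
print(df["Height"].dtype)  # float32
```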

Multi-character separator

By default, pandas read_csv() uses the C parser engine for high performance. The C engine can only handle single-character separators (apart from the special case '\s+'). If your CSV uses a multi-character or regular-expression separator, pass engine='python' so that the Python parser is used:
pd.read_csv ('data.csv', sep=r'\s*\\s*', engine='python')
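
As a simple illustration, the sketch below uses a made-up two-character separator, '::', together with the Python engine. The separator and the data are assumptions chosen for the example.

```python
import io
import pandas as pd

# Hypothetical data using '::' as the field separator.
text = "Name::Age\nAlice::30\nBob::25\n"

# Multi-character separators are treated as regular expressions,
# so the Python engine is requested explicitly.
df = pd.read_csv(io.StringIO(text), sep="::", engine="python")
print(df.columns.tolist())  # ['Name', 'Age']
```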

UnicodeDecodeError while read_csv()


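UnicodeDecodeError occurs when the data was stored in one encoding but read with a different, incompatible one. The most reliable fix is to pass the file's actual encoding through the encoding argument (for example encoding='latin-1', if that is how the file was saved). A workaround that is sometimes suggested is switching to the Python engine:
pd.read_csv('data.csv', engine='python')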
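
The sketch below reproduces the error with made-up data and then reads the same bytes by naming their encoding explicitly. It assumes the data was saved as Latin-1; for a real file you would need to know or detect its actual encoding.

```python
import io
import pandas as pd

# Hypothetical bytes saved as Latin-1: 'é' becomes the single byte 0xE9,
# which is not valid UTF-8 in this position.
raw = "Name,City\nRené,Orléans\n".encode("latin-1")

# Reading with the default UTF-8 decoding fails.
try:
    pd.read_csv(io.BytesIO(raw))
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Naming the encoding the data was written in lets it load.
df = pd.read_csv(io.BytesIO(raw), encoding="latin-1")
print(df.loc[0, "City"])
```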

"Unnamed: 0" while read_csv()

"Unnamed: 0" appears as a column name when a DataFrame with an unnamed index is saved to CSV and then read back. It is not an error, only an extra column. To avoid it, pass index_col=[0] to read_csv(), which makes it read the first column as the index.
pd.read_csv('data.csv', index_col=[0])

Alternatively, you can avoid the issue at write time by not saving the index:

df.to_csv('data.csv', index=False)
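
A round-trip sketch of both fixes, using a small invented DataFrame. Calling to_csv() without a path returns the CSV text as a string, which keeps the example in memory.

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [30, 25]})

# Written with the index: the first column has an empty header.
with_index = df.to_csv()
# Written without the index.
without_index = df.to_csv(index=False)

# Reading back naively produces the extra column.
reread_default = pd.read_csv(io.StringIO(with_index))
print(reread_default.columns.tolist())  # ['Unnamed: 0', 'Name', 'Age']

# Fix while reading: treat the first column as the index.
reread_indexed = pd.read_csv(io.StringIO(with_index), index_col=[0])

# Fix while writing: the index was never saved.
reread_clean = pd.read_csv(io.StringIO(without_index))
```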

Error tokenizing data while read_csv()

In most cases, this error is caused by (1) the wrong delimiter being used for your data, or (2) pandas being confused by the header row of the file. Solution:

pd.read_csv('data.csv', sep='your_delimiter', header=None)

The code above tells pandas which delimiter to use and that your source data has no row for headers/column titles.

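pd.read_csv('data.csv', on_bad_lines='skip')

The code above causes the offending lines to be skipped. In pandas versions before 1.3 the equivalent argument was error_bad_lines=False; it was deprecated in 1.3 and removed in 2.0.

To get information about the rows that cause errors, use on_bad_lines='warn', which skips each bad line and emits a warning for it (the older equivalent was error_bad_lines=False combined with warn_bad_lines=True):
pd.read_csv('data.csv', on_bad_lines='warn')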
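
The sketch below shows the tokenizing error and the skipping behavior on made-up data with one over-long row. It assumes pandas 1.3 or later, where the on_bad_lines argument is available.

```python
import io
import pandas as pd

# Hypothetical data: the third line has four fields instead of three.
csv_text = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# By default the malformed line raises a ParserError.
try:
    pd.read_csv(io.StringIO(csv_text))
    raised = False
except pd.errors.ParserError:
    raised = True

# Assumes pandas >= 1.3: skip lines with too many fields.
df = pd.read_csv(io.StringIO(csv_text), on_bad_lines="skip")
print(df["a"].tolist())  # [1, 8]
```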


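In most cases:

just put r before the string that holds your file path, because in an ordinary string a backslash (\) starts an escape sequence.

Here the r prefix marks a raw string, in which backslashes are kept as literal characters.

Another way is to write \\ in your string to escape each \.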
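
The difference is easy to see without touching the file system. The Windows-style path below is hypothetical and is only compared as a string, never opened.

```python
# Hypothetical Windows path written three ways.
raw_path = r"C:\temp\new_data.csv"       # raw string: backslashes kept as-is
escaped_path = "C:\\temp\\new_data.csv"  # each backslash escaped by hand
plain_path = "C:\temp\new_data.csv"      # \t and \n become tab and newline

print(raw_path == escaped_path)  # True
print(plain_path == raw_path)    # False
```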




Memory errors happen a lot with Python on the 32-bit Windows version, because a 32-bit process gets only 2 GB of memory to play with by default. One solution is the dtype option of pandas.read_csv(), which tells pandas what types exist inside your CSV data. For example, specifying dtype={'age': int} as an option to read_csv() lets pandas know that age should be interpreted as an integer, so it does not have to infer the type itself. Choosing compact types this way can save a lot of memory.
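
To make the memory effect concrete, the sketch below compares the size of one column read with the inferred type against the same column read with a narrower type. The data is generated for the example, and np.int8 is an assumption that only holds because every value fits in 8 bits.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical column of 1000 ages between 20 and 69.
csv_text = "age\n" + "\n".join(str(20 + i % 50) for i in range(1000)) + "\n"

# Let pandas infer the type (a 64-bit integer for this data).
default_df = pd.read_csv(io.StringIO(csv_text))

# Ask for a compact type; valid here because all values fit in int8.
small_df = pd.read_csv(io.StringIO(csv_text), dtype={"age": np.int8})

default_bytes = default_df["age"].memory_usage(index=False)
small_bytes = small_df["age"].memory_usage(index=False)
print(default_bytes, small_bytes)
```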

Or try the solution below: