Net-informations.com

Importing Data with pandas.read_csv()

The simplest way to read data from a CSV file is to pass its path to pandas.read_csv(), which returns a DataFrame.
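A minimal sketch of the basic call. The sample data is made up, and io.StringIO stands in for a real file path such as 'data.csv' so that the snippet runs on its own.

```python
import io
import pandas as pd

# Hypothetical sample data; in practice, pass a path such as "data.csv".
csv_text = "name,age\nAlice,30\nBob,25\n"

# The first line is used as the header row by default.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```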

Specifying Delimiter
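As an illustration, the sep argument tells read_csv() which character separates the fields. The semicolon-separated sample below is an assumption made for the example.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated data.
csv_text = "name;age\nAlice;30\nBob;25\n"

# sep sets the field delimiter (the default is a comma).
df = pd.read_csv(io.StringIO(csv_text), sep=";")
print(df)
```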

Reading specific Columns only
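One way to load only some columns is the usecols argument. The column names here are invented for the sketch.

```python
import io
import pandas as pd

# Hypothetical data with three columns.
csv_text = "name,age,city\nAlice,30,Paris\nBob,25,Oslo\n"

# Only the listed columns are parsed and kept.
df = pd.read_csv(io.StringIO(csv_text), usecols=["name", "city"])
print(df)
```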

Read CSV without headers

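With header=None, pandas treats every row as data and assigns integer column labels (0, 1, 2, ...). To skip the first row and use the second row as headers, pass header=1 instead.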
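The sketch below contrasts two settings of the header argument on made-up data: header=None for a file with no header row, and header=1 for a file whose column names sit on the second line.

```python
import io
import pandas as pd

# Hypothetical file with no header row at all.
no_header_text = "Alice,30\nBob,25\n"
df_no_header = pd.read_csv(io.StringIO(no_header_text), header=None)
print(df_no_header.columns.tolist())  # integer labels

# Hypothetical file whose real header is on the second line (row index 1).
second_row_text = "report,2021\nname,age\nAlice,30\nBob,25\n"
df_second_row = pd.read_csv(io.StringIO(second_row_text), header=1)
print(df_second_row.columns.tolist())
```

If the file has no header but you know the column names, the names argument (for example names=["name", "age"]) can supply them.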

Skiprows

skiprows allows you to specify the number of lines to skip at the start of the file.
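A short sketch of skiprows, assuming a file that starts with two comment lines before the header. The sample content is invented.

```python
import io
import pandas as pd

# Hypothetical export with two preamble lines before the header.
csv_text = (
    "# generated by export tool\n"
    "# version 1\n"
    "name,age\n"
    "Alice,30\n"
    "Bob,25\n"
)

# Skip the first two lines; the third line becomes the header.
df = pd.read_csv(io.StringIO(csv_text), skiprows=2)
print(df)
```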

Use a specific encoding (e.g. 'utf-8')
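To illustrate, the encoding argument names the character encoding used to decode the file. The bytes below are sample data encoded in the snippet itself, and io.BytesIO stands in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical UTF-8 encoded bytes containing non-ASCII characters.
raw = "name,city\nZoë,Kraków\n".encode("utf-8")

# encoding tells pandas how to decode the bytes into text.
df = pd.read_csv(io.BytesIO(raw), encoding="utf-8")
print(df)
```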

Parsing date columns
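A minimal sketch using the parse_dates argument, which converts the listed columns to datetime values while reading. The column names and dates are invented.

```python
import io
import pandas as pd

# Hypothetical data with an ISO-formatted date column.
csv_text = "date,value\n2021-01-15,10\n2021-02-20,12\n"

# Columns listed in parse_dates are converted to datetime64.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(df.dtypes)
```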

Specify dtype
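As a sketch, dtype accepts a mapping from column name to type. The example assumes an id column whose leading zeros should survive, which only works if it is read as text.

```python
import io
import pandas as pd

# Hypothetical data: ids have leading zeros that integer parsing would drop.
csv_text = "id,price\n001,9.5\n002,12.0\n"

# Read id as text and price as a float.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": str, "price": float})
print(df)
```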

Multi-character separator

By default, pandas read_csv() uses a C parser engine for high performance. The C parser engine can only handle single-character separators. If your CSV has a multi-character separator, you will need to use the 'python' engine (engine='python'), which interprets separators longer than one character as regular expressions.
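A small sketch of reading a file with a two-character separator. The "::" separator is an assumption chosen because it needs no regex escaping; a separator such as "||" would have to be written as r"\|\|".

```python
import io
import pandas as pd

# Hypothetical data separated by a double colon.
csv_text = "name::age\nAlice::30\nBob::25\n"

# Multi-character separators require the Python parser engine.
df = pd.read_csv(io.StringIO(csv_text), sep="::", engine="python")
print(df)
```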


UnicodeDecodeError while read_csv()


UnicodeDecodeError occurs when the data was stored in one encoding but read in a different, incompatible one. The easiest solution is to pass the encoding the file was actually written in to read_csv() through the encoding argument.
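One common fix, sketched here under the assumption that the file was written in Latin-1, is to name that encoding explicitly. The sample bytes are made up, and the right encoding for a real file depends on how it was produced.

```python
import io
import pandas as pd

# Hypothetical bytes written in Latin-1 rather than UTF-8.
raw = "name,city\nJosé,Málaga\n".encode("latin-1")

try:
    # The default encoding is UTF-8, which cannot decode these bytes.
    pd.read_csv(io.BytesIO(raw))
except UnicodeDecodeError as exc:
    print("Decoding failed:", exc)

# Passing the encoding the data was written in resolves the error.
df = pd.read_csv(io.BytesIO(raw), encoding="latin-1")
print(df)
```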

"Unnamed: 0" while read_csv()

"Unnamed: 0" appears as a column name when a DataFrame with an unnamed index is saved to CSV and then read back. To fix this, pass index_col=[0] to read_csv() so that the first column is read as the index.

Instead of fixing this while reading, you can also avoid it when writing by passing index=False to to_csv(), so that the index is never written to the file.
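The round trip below sketches both fixes on a made-up DataFrame. It uses index_col=0, the scalar form of the index_col argument, for the read-side fix and index=False for the write-side fix.

```python
import io
import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [30, 25]})

# Writing with the default settings stores the index as an unnamed first column.
buffer = io.StringIO()
df.to_csv(buffer)

buffer.seek(0)
with_unnamed = pd.read_csv(buffer)           # has an "Unnamed: 0" column

buffer.seek(0)
fixed_on_read = pd.read_csv(buffer, index_col=0)  # first column becomes the index

# Alternatively, leave the index out when writing.
clean_buffer = io.StringIO()
df.to_csv(clean_buffer, index=False)
clean_buffer.seek(0)
fixed_on_write = pd.read_csv(clean_buffer)

print(with_unnamed.columns.tolist())
print(fixed_on_read.columns.tolist())
print(fixed_on_write.columns.tolist())
```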

Error tokenizing data while read_csv()

In most cases, the cause is either (1) the delimiter used in your data or (2) the header row or column count of the file confusing the parser.

Passing header=None tells pandas that your source data has no row for headers/column titles.

Alternatively, passing error_bad_lines=False (on_bad_lines='skip' in pandas 1.3 and later) causes the offending lines to be skipped.
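A sketch of skipping malformed lines, assuming pandas 1.3 or later, where on_bad_lines="skip" replaces the older error_bad_lines=False. The sample data is invented, with one row that has an extra field.

```python
import io
import pandas as pd

# Hypothetical data: the third line has three fields instead of two.
csv_text = "name,age\nAlice,30\nBob,25,extra\nCarol,41\n"

try:
    pd.read_csv(io.StringIO(csv_text))
except pd.errors.ParserError as exc:
    print("Parsing failed:", exc)

# Assumes pandas 1.3 or later; older versions use error_bad_lines=False.
df = pd.read_csv(io.StringIO(csv_text), on_bad_lines="skip")
print(df)
```

Skipped rows are dropped silently, so this is best used only when losing those rows is acceptable.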

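To get information about the rows that cause errors, combine error_bad_lines=False with warn_bad_lines=True. Both arguments were deprecated in pandas 1.3 and removed in 2.0; on current versions, on_bad_lines='warn' has the same effect.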
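On pandas 1.3 or later, the equivalent of that combination is on_bad_lines="warn", sketched below on made-up data. How the warning is reported (printed to stderr or raised as a Python warning) differs between pandas versions.

```python
import io
import pandas as pd

# Hypothetical data: the third line has one field too many.
csv_text = "name,age\nAlice,30\nBob,25,extra\nCarol,41\n"

# Bad lines are skipped, and a warning identifies each one.
# Assumes pandas 1.3 or later.
df_warned = pd.read_csv(io.StringIO(csv_text), on_bad_lines="warn")
print(df_warned)
```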

FileNotFoundError

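In most cases, the path contains backslashes (as Windows paths do), and Python treats \ as the start of an escape sequence, so the string no longer matches the real path. To fix this, put r before the opening quote of your path.

Here r is a prefix that marks the string as a raw string, in which backslashes are kept as literal characters.

Another way is to write \\ in your string to escape each \.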
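The difference can be checked without touching the file system. The path below is hypothetical and chosen so that \n and \t would be read as a newline and a tab in an ordinary string.

```python
# Hypothetical Windows path; no file is opened in this sketch.
raw_path = r"C:\new_data\table.csv"        # raw string: backslashes kept as-is
escaped_path = "C:\\new_data\\table.csv"   # each backslash escaped by hand
broken_path = "C:\new_data\table.csv"      # \n and \t become newline and tab

print(raw_path == escaped_path)  # both spell the same path
print(repr(broken_path))         # shows the unintended control characters

# Either of the first two forms can be passed on, e.g. pd.read_csv(raw_path).
```

Forward slashes ("C:/new_data/table.csv") are also accepted by Python on Windows and avoid the escaping question altogether.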



MemoryError

Memory errors happen a lot with Python when using the 32-bit Windows version. This is because 32-bit processes only get 2 GB of memory to work with by default.

One solution is the dtype option of pandas.read_csv(), which lets pandas know what types exist inside your CSV data.

For example, specifying dtype={'age': int} as an option to read_csv() lets pandas know that age should be interpreted as a number. This can save a lot of memory, especially when a compact type is chosen for the column.
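To make the saving visible, the sketch below compares the default integer type with int8 on generated data. int8 is an assumption for this example (it only fits values from -128 to 127); the right type depends on the range of your data.

```python
import io
import pandas as pd

# Hypothetical data: 1000 ages between 20 and 69.
csv_text = "age\n" + "\n".join(str(20 + i % 50) for i in range(1000)) + "\n"

df_default = pd.read_csv(io.StringIO(csv_text))
df_small = pd.read_csv(io.StringIO(csv_text), dtype={"age": "int8"})

# Compare the bytes used by the column itself (index excluded).
default_bytes = df_default["age"].memory_usage(index=False)
small_bytes = df_small["age"].memory_usage(index=False)
print(df_default["age"].dtype, default_bytes)
print(df_small["age"].dtype, small_bytes)
```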

Or try the solution below:
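One widely used option, assumed here rather than taken from the text above, is to read the file in pieces with the chunksize argument so that only one chunk is held in memory at a time. The data and the chunk size are made up for the sketch.

```python
import io
import pandas as pd

# Hypothetical data: 1000 rows with a single numeric column.
csv_text = "age\n" + "\n".join(str(20 + i % 50) for i in range(1000)) + "\n"

total = 0
rows = 0
# With chunksize set, read_csv yields DataFrames of up to 250 rows each.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=250):
    total += chunk["age"].sum()
    rows += len(chunk)

mean_age = total / rows
print(rows, mean_age)
```

This works when the computation can be done incrementally, as with sums and counts; operations that need the whole table at once still require it to fit in memory.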










net-informations.com (C) 2021    Founded by raps mk
All Rights Reserved. All other trademarks are property of their respective owners.