How do I read a CSV file in Python using pandas?

In this post, we’ll go over how to import a CSV file into Python.

Short Answer

The easiest way to do this:

import pandas as pd
df = pd.read_csv('file_name.csv')
print(df)

If you want to import a subset of columns, simply add usecols=['column_name']:

pd.read_csv('file_name.csv', usecols=['column_name1', 'column_name2'])

If you want to use another separator, simply add sep='\t'. The default separator is ','.

pd.read_csv('file_name.csv', sep='\t')

Recap on Pandas DataFrame

A pandas DataFrame is an Excel-like data structure with labeled axes (rows and columns). Here is an example of a pandas DataFrame that we will use below:

Code to generate DataFrame:
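A minimal sketch of such a DataFrame follows. The column names (Name, Age, City) and all values are hypothetical stand-ins chosen for illustration; only the 'Name' column is referred to later in this post. The sketch also converts the DataFrame to CSV text and reads it back from memory, so it runs without any file on disk.

```python
import io
import pandas as pd

# Hypothetical sample data; the column names and values are illustrative only.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [30, 25, 35],
    "City": ["Paris", "Berlin", "Madrid"],
})

# With no path argument, to_csv() returns the CSV text as a string.
csv_text = df.to_csv(index=False)
print(csv_text)

# read_csv() accepts a file-like object as well as a file path.
df_roundtrip = pd.read_csv(io.StringIO(csv_text))
print(df_roundtrip)
```

Passing index=False keeps the row index out of the CSV text, so the round trip gives back the same three columns.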

Importing a CSV file into the DataFrame

The pandas read_csv() function imports a CSV file into a DataFrame.

Here are some options:

filepath_or_buffer: this is the file name or file path

pd.read_csv('file_name.csv') # relative path
pd.read_csv('C:/Users/abc/Desktop/file_name.csv') # absolute path

header: this allows you to specify which row will be used as column names for your dataframe. It expects an int value or a list of int values.

The default behaviour is equivalent to header=0, which means the first row of the CSV file will be treated as column names.

If your file doesn’t have a header, simply set header=None.

pd.read_csv('file_name.csv', header=None) # no header

The output of no header:
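As a minimal sketch of what this looks like, the snippet below reads a tiny made-up CSV (the names and values are hypothetical) from an in-memory string rather than a file, with io.StringIO standing in for the file on disk. It compares header=None with the default behaviour.

```python
import io
import pandas as pd

# Hypothetical CSV text with no header row.
csv_text = "Alice,30\nBob,25\n"

# header=None: the first line is data, and columns are labelled 0, 1, ...
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(df_no_header)

# Default behaviour: the first line is consumed as column names.
df_default = pd.read_csv(io.StringIO(csv_text))
print(df_default)
```

With the default setting, the first record is silently turned into column names, which is why header=None matters for files that have no header line.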

sep: Specify a custom delimiter for the CSV input. The default is a comma.

pd.read_csv('file_name.csv', sep='\t') # Use Tab to separate

index_col: This lets you choose which column(s) to use as the index of the dataframe. The default value is None, in which case pandas creates a default integer index starting from 0.

It can be set to a column name or a column position, and that column will be used as the index.

pd.read_csv('file_name.csv', index_col='Name') # Use 'Name' column as index

nrows: Read only the first nrows rows of the file. Needs an int value.
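A short sketch of nrows, using made-up in-memory data (the column names id and value are hypothetical) so that it runs without a file:

```python
import io
import pandas as pd

# Hypothetical CSV text: one header line followed by four data rows.
csv_text = "id,value\n1,a\n2,b\n3,c\n4,d\n"

# Read only the first two data rows; the header line is not counted.
df_head = pd.read_csv(io.StringIO(csv_text), nrows=2)
print(df_head)
```

This is handy for previewing a large file without loading all of it.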

usecols: Specify which columns to import to the dataframe. It can be a list of int values or column names.

pd.read_csv('file_name.csv', usecols=[1, 2, 3]) # Only reads col1, col2, col3. col0 will be ignored.
pd.read_csv('file_name.csv', usecols=['Name']) # Only reads 'Name' column. Other columns will be ignored.

converters: A dict mapping column names (or positions) to functions. Each function is applied to the raw values of its column as the file is read.
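A minimal sketch of converters, assuming a made-up file in which prices are stored as text with a leading dollar sign. The data and the helper parse_price are hypothetical and exist only for this illustration.

```python
import io
import pandas as pd

# Hypothetical data: prices stored as text with a leading dollar sign.
csv_text = "item,price\npen,$1.50\nbook,$12.00\n"

def parse_price(raw):
    """Strip the currency symbol and convert the remaining text to a float."""
    return float(raw.lstrip("$"))

# Each raw string in the 'price' column is passed through parse_price.
df_prices = pd.read_csv(io.StringIO(csv_text), converters={"price": parse_price})
print(df_prices)
```

The converter receives each cell as a string, so any cleaning has to happen before the numeric conversion.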

na_values: By default, pandas already treats certain strings (for example an empty field, 'NA' or 'NaN') as missing values and loads them as NaN. Use this option if you want additional strings to be considered as NaN. A typical input is a list of strings.

pd.read_csv('file_name.csv', na_values=['a', 'b']) # a and b values will be treated as NaN after importing into dataframe.

To access data from a CSV file, we use the read_csv() function, which returns the data as a DataFrame.

Syntax of read_csv()

Syntax: pd.read_csv(filepath_or_buffer, sep=',', header='infer', index_col=None, usecols=None, engine=None, skiprows=None, nrows=None)

Parameters: 

  • filepath_or_buffer: The location of the file to be read. It accepts any string path or URL of the file.
  • sep: It stands for separator. The default is ‘,’ as in CSV (comma-separated values).
  • header: It accepts an int or a list of ints giving the row number(s) to use as the column names and the start of the data. If header=None is passed, the columns are labelled 0, 1, 2, and so on.
  • usecols: It is used to retrieve only selected columns from the CSV file.
  • nrows: The number of rows to read from the file.
  • index_col: The column(s) to use as the row labels. If None, pandas uses a default integer index starting from 0.
  • skiprows: Line numbers to skip (0-indexed), or the number of lines to skip at the start of the file.
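The parameters above can be combined in a single call. The sketch below is one illustration under stated assumptions: the semicolon-separated data and the column names id, name and score are made up, and the text is read from memory via io.StringIO instead of a real file.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated data: a header line and four data rows.
csv_text = "id;name;score\n1;Ann;10\n2;Ben;20\n3;Cat;30\n4;Dan;40\n"

df_combined = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",                 # fields are separated by semicolons
    header=0,                # line 0 holds the column names
    usecols=["id", "name"],  # load only these two columns
    index_col="id",          # use 'id' as the row labels
    skiprows=[1],            # skip line 1, i.e. the first data row
    nrows=2,                 # then read at most two data rows
)
print(df_combined)
```

Here the row for Ann is skipped, the rows for Ben and Cat are read, and the row for Dan is never loaded because of nrows=2.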

Read CSV using Pandas read_csv

Before using this function, we must import the pandas library. Then we load the CSV file.

Python3

import pandas as pd

pd.read_csv("example1.csv")

Output:

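Example 1: Using sep in read_csv()

In this example, we take our existing CSV file and add some special characters as separators to see how the sep parameter works. Because sep is given as a regular expression here (matching a colon, comma, space, | or _), the Python parsing engine is needed, hence engine='python'.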

Python3

import pandas as pd

df = pd.read_csv('headbrain1.csv',
                 sep='[:, |_]',
                 engine='python')

df

Output:
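As a self-contained illustration of the same idea, the sketch below applies that regular-expression separator to a made-up line of text in which every field is split by a different character. The data is hypothetical and is not the content of headbrain1.csv.

```python
import io
import pandas as pd

# Hypothetical data: each field is separated by a different character.
csv_text = "a:b,c|d_e\n1:2,3|4_5\n"

# The character class matches any one of ':', ',', ' ', '|' or '_'.
df_regex = pd.read_csv(io.StringIO(csv_text), sep="[:, |_]", engine="python")
print(df_regex)
```

All five separators are treated alike, so the single data line is split into five columns named a to e.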

Example 2: Using usecols in read_csv()

Here, we load only 3 columns, i.e. ["tip", "sex", "time"], and use header=0, which matches the default behaviour.

Python3

df = pd.read_csv('example1.csv',
                 header=0,
                 usecols=["tip", "sex", "time"])

df

Output:

Example 3: Using index_col in read_csv()

Here, we use “sex” as the first index level and “tip” as the second by passing both column names to the index_col parameter.

Python3

df = pd.read_csv('example1.csv',
                 header=0,
                 index_col=["sex", "tip"],
                 usecols=["tip", "sex", "time"])

df

Output:
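To make the resulting structure concrete, here is a minimal sketch that uses the same column names on a few made-up rows (the values are hypothetical and are not taken from example1.csv). Passing two names to index_col produces a two-level index.

```python
import io
import pandas as pd

# Hypothetical rows; only the column names come from the example above.
csv_text = (
    "total_bill,tip,sex,time\n"
    "16.0,1.5,Female,Dinner\n"
    "10.0,2.0,Male,Lunch\n"
)

df_multi = pd.read_csv(
    io.StringIO(csv_text),
    header=0,
    index_col=["sex", "tip"],        # two columns -> a two-level MultiIndex
    usecols=["tip", "sex", "time"],  # total_bill is not loaded
)
print(df_multi)
print(df_multi.index.names)
```

Only “time” remains as a regular column, because the other two loaded columns have become index levels.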

Example 4: Using nrows in read_csv()

Here, we read only the first 5 rows using the nrows parameter.

Python3

df = pd.read_csv('example1.csv',
                 header=0,
                 index_col=["tip", "sex"],
                 usecols=["tip", "sex", "time"],
                 nrows=5)

df

Output:

Example 5: Using skiprows in read_csv()

The skiprows parameter skips the given lines of the CSV file. Line numbers are 0-indexed and the header is line 0, so here lines 1 and 12 are skipped, i.e. the top data row and, in this file, the last row of the original CSV data.

Python3

pd.read_csv("example1.csv", skiprows=[1, 12])

Output:
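The sketch below shows the two forms of skiprows on a tiny made-up file (the column name n and the values are hypothetical). A list skips those specific line numbers, while a single int skips that many lines from the top of the file.

```python
import io
import pandas as pd

# Hypothetical file: line 0 is the header, lines 1-4 are data.
csv_text = "n\n10\n20\n30\n40\n"

# A list skips specific 0-indexed lines: here lines 1 and 3 (values 10 and 30).
df_list = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 3])
print(df_list)

# An int skips that many lines from the top, including the header line,
# so the next remaining line ("20") is then used as the header.
df_int = pd.read_csv(io.StringIO(csv_text), skiprows=2)
print(df_int)
```

The second call shows a common pitfall: an integer skiprows also removes the header line, so a list such as [1, 2] is the safer choice when the header should be kept.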


How do I read a CSV file in Python?

Read a CSV file using Python.

Using the csv library:

import csv

with open("./bwq.csv", 'r') as file:
    csvreader = csv.reader(file)
    for row in csvreader:
        print(row)

Here we are importing the csv library in order to use the ...

Using the pandas library:

import pandas as pd

data = pd.read_csv("bwq.csv")
data

How do I read a CSV file row by row in Python using pandas?

15 ways to read CSV file with pandas.
Example 1: Read CSV file with header row.
Example 2: Read CSV file with header in second row.
Example 3: Skip rows but keep header.
Example 4: Read CSV file without header row.
Example 5: Specify missing values.
Example 6: Set index column.
Example 7: Read CSV file from external URL.

Which pandas function is used to read from a CSV file?

The pandas read_csv() function imports a CSV file into a DataFrame. Its header parameter allows you to specify which row will be used as column names for your dataframe.

How do I read a column from a CSV file in Python using pandas?

This can be done with the pandas.read_csv() function. Pass the CSV file as the first argument and the list of columns you want as the usecols keyword argument. It returns only the data from those columns of the CSV file.
