How to Parse CSV Files in Python

11.03.2025

Content of the article:

What is a CSV File?
Parsing CSV Files with Python
Writing CSV Files with Python
Parsing CSV with Pandas Library

Key Features of Pandas Library

Reading CSV Files with Pandas
Writing CSV Files with Pandas
Conclusion

Data parsing is the automated collection and processing of information, and CSV files are one of its most common targets. Here, parsing means breaking a CSV file down into rows, columns, and values so that the data can be easily analyzed, filtered, and extracted for further work. In this article, we will explain how to read CSV files in Python and how to parse the data they contain.

What is a CSV File?

CSV (Comma-Separated Values) is a plain-text file format in which each line is a record and the values within a record are separated by commas. Because the format is so simple, it is used in a variety of contexts, such as creating or modifying data in Excel.

One main strength of CSV files is how easy they make it to access and share information. Because the format is plain text, a file can be opened and processed regardless of the software being used. This makes CSV a convenient way to export data from a spreadsheet or a database.
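To make the format concrete, here is a minimal sketch of what CSV text looks like and why a dedicated parser helps. The sample records and column names are invented for illustration. The point is that a quoted field may itself contain a comma, which a naive split on commas would break apart.

```python
import csv
import io

# Hypothetical sample data: a header line followed by two records.
# The second record has a quoted field that contains a comma.
sample = (
    "name,department,city\n"
    "David,MCE,Kyiv\n"
    'Monika,"PIE, evening group",Lviv\n'
)

# Naive approach: splitting on commas cuts the quoted field in two.
naive = sample.splitlines()[2].split(",")

# csv.reader follows the quoting rules and keeps the field whole.
parsed = list(csv.reader(io.StringIO(sample)))

print(naive)      # four pieces, with stray quote characters
print(parsed[2])  # three clean values
```

This is the main reason to prefer the csv module over manual string splitting, even for small files.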

Now, let us show how to open and read a CSV file in Python in the next section.

Parsing CSV Files with Python

Python has a built-in csv module that reads and writes CSV data with ease. No external libraries need to be installed, which makes opening files and analyzing their content straightforward.

The following code shows how to open and print a CSV file called university_records.csv in Python. It opens the file in read mode, passes it to csv.reader, and prints each row with a for loop.


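import csv

# newline='' lets the csv module handle line endings itself
with open('university_records.csv', 'r', newline='') as csv_file:
    reader = csv.reader(csv_file)

    # Each row is returned as a list of strings
    for row in reader:
        print(row)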
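When the file has a header row, csv.DictReader can be more convenient than csv.reader, because each row becomes a dictionary keyed by column name. The sketch below is only an illustration: the column names (name, department, year, gpa) are assumptions, since the actual layout of university_records.csv is not given, and the file is created in a temporary directory so the example runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical columns; the real university_records.csv may differ.
rows = [
    ["name", "department", "year", "gpa"],
    ["David", "MCE", "3", "7.8"],
    ["Monika", "PIE", "3", "9.1"],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "university_records.csv")

    # Create a small file so the example is self-contained.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        csv.writer(csv_file).writerows(rows)

    # DictReader uses the first row as the keys for every following row.
    with open(path, "r", newline="", encoding="utf-8") as csv_file:
        records = list(csv.DictReader(csv_file))

for record in records:
    # Every value is read as a string, so convert numbers explicitly.
    print(record["name"], float(record["gpa"]))
```

Accessing fields by name keeps the code readable and makes it less sensitive to column order.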

Writing CSV Files with Python

To write data, we will again use the csv module. Two of its tools do most of the work:

csv.writer() – creates a writer object that formats rows and writes them to an open file;
.writerow() – writes a single row to the file.

Both are illustrated in the code below:


import csv

row = ['David', 'MCE', '3', '7.8']
row1 = ['Monika', 'PIE', '3', '9.1']
row2 = ['Raymond', 'ECE', '2', '8.5']

# 'a' appends to the file; newline='' prevents blank lines between rows on Windows
with open('university_records.csv', 'a', newline='') as csv_file:
    writer = csv.writer(csv_file)

    writer.writerow(row)
    writer.writerow(row1)
    writer.writerow(row2)
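Because the example above opens the file in append mode, it is worth deciding when the header row gets written. One possible approach, sketched below, is to write the header only when the file is new or empty and to pass all rows at once to writerows(). The helper name append_rows and the column names are hypothetical and chosen only for illustration.

```python
import csv
import os
import tempfile


def append_rows(path, header, rows):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        if is_new:
            writer.writerow(header)
        # writerows() writes every row in the iterable in one call.
        writer.writerows(rows)


# Hypothetical column names for the records used in this article.
header = ["name", "department", "year", "gpa"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "university_records.csv")

    append_rows(path, header, [["David", "MCE", "3", "7.8"]])
    append_rows(path, header, [["Monika", "PIE", "3", "9.1"],
                               ["Raymond", "ECE", "2", "8.5"]])

    # Read the file back to check what was written.
    with open(path, "r", newline="", encoding="utf-8") as csv_file:
        written = list(csv.reader(csv_file))

print(written)
```

Calling the helper twice still produces a single header line, followed by the three records.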

Parsing CSV with Pandas Library

Parsing CSV files with Python is a common task today, from financial spreadsheets to colossal datasets for machine learning. Working with those files can be a pain, especially when you need more features than Python provides out of the box. In such cases, the Pandas library comes in handy.

The example below writes data to a CSV file using a DataFrame. DataFrame is one of the main data structures in the Pandas library and is used for working with tabular data.


import pandas as pd

data = {"Name": ["David", "Monika", "Raymond"], 
        "Age": [30, 25, 40], 
        "City": ["Kyiv", "Lviv", "Odesa"]
} 

df = pd.DataFrame(data) 

file_path = "data.csv" 
df.to_csv(file_path, index=False, encoding="utf-8")

Key Features of Pandas Library

Pandas is widely regarded as one of the most effective Python libraries for parsing CSV files. Here is why it is so powerful and convenient:

Simple file loading. Even when a dataset comes from several sources and is formatted inconsistently, a single read_csv call treats the first row as the header by default, infers column types, and accepts parameters for the delimiter and encoding, which removes most of the manual parsing work.
Scalability. Reading a large CSV file row by row with the standard csv module can be slow, whereas Pandas is optimized for large files. It can also read a file in chunks, which prevents memory overload.
Data cleaning. CSV files often contain missing values, wrong formats, and duplicates. Pandas has built-in tools for filling or dropping missing data, converting types, cleaning up strings, and restructuring data for further analysis.

These features make the library well suited to quick analysis of CSV files, where the standard module offers far less. It also handles large volumes of data, which makes it a common choice for data analysis work.
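The chunked reading and cleaning tools mentioned above can be sketched as follows. The data is invented for illustration (one duplicate row and one missing age), and filling the gap with the median is only one of several reasonable strategies. The sketch uses the chunksize parameter of read_csv together with drop_duplicates(), fillna(), and astype().

```python
import io

import pandas as pd

# Hypothetical messy data: one duplicate row and one missing age.
raw = (
    "Name,Age,City\n"
    "David,30,Kyiv\n"
    "Monika,,Lviv\n"
    "David,30,Kyiv\n"
    "Raymond,40,Odesa\n"
)

# With chunksize, read_csv returns an iterator of DataFrames, so only
# one chunk (here 2 rows) is held in memory at a time.
row_count = 0
for chunk in pd.read_csv(io.StringIO(raw), chunksize=2):
    row_count += len(chunk)

# Cleaning: load the table, drop exact duplicates, fill the missing
# age with the median, then convert the column to integers.
df = pd.read_csv(io.StringIO(raw))
df = df.drop_duplicates().reset_index(drop=True)
df["Age"] = df["Age"].fillna(df["Age"].median()).astype(int)

print(row_count)
print(df)
```

With a real large file, the loop body would typically aggregate or filter each chunk rather than just count its rows.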

Reading CSV Files with Pandas

Before you can work with a CSV document, the first step is to load it into a DataFrame.


import pandas as pd

df = pd.read_csv("data.csv")
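Files from other sources do not always follow the default layout, so read_csv accepts parameters that describe the file. The sketch below assumes a hypothetical export that uses semicolons as the delimiter and contains a column we do not need. It uses the sep, usecols, and dtype parameters, and reads from an in-memory string so that it runs without an external file.

```python
import io

import pandas as pd

# Hypothetical export: semicolon-delimited, with an extra "Note" column.
raw = (
    "Name;Age;City;Note\n"
    "David;30;Kyiv;x\n"
    "Monika;25;Lviv;y\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",                  # delimiter used in this file
    usecols=["Name", "Age"],  # load only the columns we need
    dtype={"Age": "int64"},   # set the column type explicitly
)

print(df)
print(df.dtypes)
```

Loading only the required columns also reduces memory use on large files.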

Once the data is loaded, Pandas provides tools for inspecting it, which is especially useful with extensive datasets. The following calls give a quick overview of the file's contents.


df.head() # Shows the first 5 rows
df.tail(10) # Shows the last 10 rows
df.info() # Outputs a list of columns, data types, and the number of filled values

For selecting one or multiple columns, execute:


df["Name"] # Get the column "Name"


df[["Name", "Age"]] # Extract only "Name" and "Age"
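Rows can be selected in a similar way by passing a boolean condition instead of a column name. The following sketch rebuilds the sample table from this article in memory and filters it. The thresholds and variable names are chosen only for illustration.

```python
import pandas as pd

# The sample table used earlier in this article.
df = pd.DataFrame({
    "Name": ["David", "Monika", "Raymond"],
    "Age": [30, 25, 40],
    "City": ["Kyiv", "Lviv", "Odesa"],
})

# Keep only the rows where the condition is True.
older = df[df["Age"] >= 30]

# .loc combines a row condition with a column selection.
names_of_older = df.loc[df["Age"] >= 30, "Name"].tolist()

# Combine conditions with | (or) and & (and); each needs parentheses.
kyiv_or_young = df[(df["City"] == "Kyiv") | (df["Age"] < 30)]

print(older)
print(names_of_older)
print(kyiv_or_young)
```

The same kind of boolean mask is what the modify and remove examples in the next section rely on.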

Writing CSV Files with Pandas

Now let’s look at how to insert, modify, and remove particular rows.

Inserting a new row:


# Load the CSV file
df = pd.read_csv(file_path) 

# Add a new row
new_row = pd.DataFrame([{"Name": "Denys", "Age": 35, "City": "Kharkiv"}])
df = pd.concat([df, new_row], ignore_index=True)


# Save
df.to_csv(file_path, index=False, encoding="utf-8")

Modifying a particular row:


df = pd.read_csv(file_path) 

# Change the age of Monika (a name that exists in the sample data)
df.loc[df["Name"] == "Monika", "Age"] = 26

df.to_csv(file_path, index=False, encoding="utf-8")

Removing a row:


df = pd.read_csv(file_path) 

# Remove the row where Name == "Raymond" (a name that exists in the sample data)
df = df[df["Name"] != "Raymond"]

df.to_csv(file_path, index=False, encoding="utf-8")

Conclusion

To sum up, in this article we showed how to open, read, and write CSV files in Python. When you need more powerful tools for cleaning and analyzing data, Pandas is a good fit: it automates repetitive steps, handles very large files, and saves time. In short, the standard csv module covers basic tasks, while Pandas is designed for working with large datasets.