Rows in a Pandas DataFrame represent individual records or observations, and accessing them efficiently is fundamental to data manipulation and analysis. The most basic approach is the iloc indexer, which performs positional indexing: it selects rows by their integer position, counting from 0. To access a single row, pass its integer position:
import pandas as pd
from io import StringIO
data = """Name,Age,Gender,Salary
John,25,Male,50000
Alice,30,Female,55000
Bob,22,Male,40000
Eve,35,Female,70000
Charlie,28,Male,48000"""
df = pd.read_csv(StringIO(data))
row1 = df.iloc[0]
print(row1)
Output
Name       John
Age          25
Gender     Male
Salary    50000
Name: 0, dtype: object
This returns a single row of data as a Series, with the column names as its index.
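As a small sketch of what the return type looks like, the snippet below rebuilds the same sample data with a dictionary (an assumption made only to keep the example self-contained) and compares a scalar position with a one-element list. The variable names row_as_series and row_as_frame are illustrative.

```python
import pandas as pd

# Same sample data as above, built from a dict so the snippet runs on its own
df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
})

# A scalar position returns the row as a Series
row_as_series = df.iloc[0]

# A one-element list returns a DataFrame that contains just that row
row_as_frame = df.iloc[[0]]

print(type(row_as_series).__name__)  # Series
print(type(row_as_frame).__name__)   # DataFrame
print(row_as_frame.shape)            # (1, 4)
```

The list form can be convenient when later code expects a DataFrame rather than a Series.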
Now, let's select multiple rows using iloc with the help of slicing:
# Access the rows at positions 1 to 3 (the slice end, 4, is excluded)
rows = df.iloc[1:4]
print(rows)
Output
    Name  Age  Gender  Salary
1  Alice   30  Female   55000
2    Bob   22    Male   40000
3    Eve   35  Female   70000
This approach enables us to select and manipulate multiple rows simultaneously.
In addition to iloc, there are several other ways to access rows in a Pandas DataFrame:
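Positional selection is not limited to contiguous slices. The sketch below, which recreates the sample data from a dictionary so it runs on its own, shows a list of positions and a negative position. The names picked, last_row and first_three are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
})

# A list of positions selects non-adjacent rows, in the order given
picked = df.iloc[[0, 2, 4]]

# Negative positions count from the end, as with Python lists
last_row = df.iloc[-1]

# The stop of a positional slice is excluded, so this yields positions 0, 1, 2
first_three = df.iloc[:3]

print(picked["Name"].tolist())  # ['John', 'Bob', 'Charlie']
print(last_row["Name"])         # Charlie
print(len(first_three))         # 3
```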
Accessing Rows Using loc
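The loc indexer is used for label-based indexing. It allows you to access rows by their index labels (the row names), which is useful when we know the row labels but not their integer positions. In this DataFrame the default index labels are the integers 0 to 4, so the label 2 happens to coincide with position 2.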
# Access the row with label 2
row_label_2 = df.loc[2]
print(row_label_2)
Output
Name        Bob
Age          22
Gender     Male
Salary    40000
Name: 2, dtype: object
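The difference between labels and positions is easier to see with a non-default index. The following sketch assumes we promote the Name column to the index with set_index (a choice made here for illustration, not something the example above does) and then look rows up by name. It also shows that a loc slice includes its end label, unlike an iloc slice.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
})

# Use the names as row labels (illustrative choice)
named = df.set_index("Name")

# Look up one row by its label
alice = named.loc["Alice"]

# Label slices include BOTH endpoints
alice_to_eve = named.loc["Alice":"Eve"]

# On the default integer index, the same contrast shows up directly
by_label = df.loc[1:3]      # labels 1, 2, 3 -> three rows
by_position = df.iloc[1:3]  # positions 1, 2 -> two rows

print(alice["Age"])                 # 30
print(alice_to_eve.index.tolist())  # ['Alice', 'Bob', 'Eve']
print(len(by_label), len(by_position))  # 3 2
```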
Accessing Rows Using Conditions
Pandas allows us to access rows based on a condition or filter. This is useful when we want to retrieve rows that meet specific criteria.
# Access rows where 'Age' is greater than 25
row_f = df[df['Age'] > 25]
print(row_f)
Output
      Name  Age  Gender  Salary
1    Alice   30  Female   55000
3      Eve   35  Female   70000
4  Charlie   28    Male   48000
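Conditions can also be combined. A minimal sketch, using the same sample data rebuilt from a dictionary: boolean masks are joined with & (and) and | (or), each comparison needs its own parentheses, and isin tests membership in a list of values. The result names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
})

# Both conditions must hold; parentheses are required around each comparison
women_over_25 = df[(df["Age"] > 25) & (df["Gender"] == "Female")]

# Either condition may hold
young_or_high_paid = df[(df["Age"] < 25) | (df["Salary"] > 60000)]

# Membership test against a list of values
selected_names = df[df["Name"].isin(["John", "Eve"])]

print(women_over_25["Name"].tolist())       # ['Alice', 'Eve']
print(young_or_high_paid["Name"].tolist())  # ['Bob', 'Eve']
print(selected_names["Name"].tolist())      # ['John', 'Eve']
```

Python's and / or keywords do not work on whole columns, which is why the element-wise operators are used here.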
Accessing Specific Rows Using query
The query() method allows us to filter rows with an expression string, similar in spirit to a SQL WHERE clause. This method is useful for complex queries and makes the code more readable when dealing with conditions on multiple columns.
# Access rows where 'Age' is greater than 25 and 'Salary' is less than 60000
query_f = df.query('Age > 25 and Salary < 60000')
print(query_f)
Output
      Name  Age  Gender  Salary
1    Alice   30  Female   55000
4  Charlie   28    Male   48000
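Thresholds do not have to be hard-coded in the expression string. As a sketch, query() can reference local Python variables by prefixing them with @, and string values are quoted inside the expression. The variable names min_age and max_salary are assumptions chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
})

# Illustrative thresholds held in ordinary Python variables
min_age = 25
max_salary = 60000

# '@name' pulls the value of a local variable into the expression
in_range = df.query("Age > @min_age and Salary < @max_salary")

# String comparisons use quotes inside the expression
young_men = df.query('Gender == "Male" and Age < 28')

print(in_range["Name"].tolist())   # ['Alice', 'Charlie']
print(young_men["Name"].tolist())  # ['John', 'Bob']
```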
Accessing Rows Using head and tail
The head() and tail() methods allow us to quickly access the first or last few rows of a DataFrame, respectively. These methods are useful when we want to inspect the top or bottom records in a dataset.
# Access the first 2 rows
r1 = df.head(2)
print(r1)
# Access the last 2 rows
r2 = df.tail(2)
print(r2)
Output
    Name  Age  Gender  Salary
0   John   25    Male   50000
1  Alice   30  Female   55000
      Name  Age  Gender  Salary
3      Eve   35  Female   70000
4  Charlie   28    Male   48000
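Two further behaviours of head() and tail() are worth a quick sketch: with no argument they return up to five rows, and with a negative argument they return everything except that many rows from the opposite end. The sample data is rebuilt from a dictionary so the snippet is self-contained.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
})

# Without an argument, head() and tail() default to n=5
default_head = df.head()

# A negative n drops rows from the other end:
all_but_last_two = df.head(-2)   # everything except the last 2 rows
all_but_first_two = df.tail(-2)  # everything except the first 2 rows

print(len(default_head))                   # 5
print(all_but_last_two["Name"].tolist())   # ['John', 'Alice', 'Bob']
print(all_but_first_two["Name"].tolist())  # ['Bob', 'Eve', 'Charlie']
```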
Accessing Rows Using iloc with Conditions
We can combine iloc with conditions to access specific rows by position after applying a filter. This allows us to select rows based on criteria and then access them by their position in the filtered DataFrame.
# Access the first row where 'Age' is greater than 25
row_f = df[df['Age'] > 25].iloc[0]
print(row_f)
Output
Name       Alice
Age           30
Gender    Female
Salary     55000
Name: 1, dtype: object
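One caveat with this pattern is that iloc[0] raises an IndexError when the filter matches nothing. The sketch below guards against that case with a small helper; first_match is a hypothetical function written for this example, not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Eve", "Charlie"],
    "Age": [25, 30, 22, 35, 28],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Salary": [50000, 55000, 40000, 70000, 48000],
})

def first_match(frame, mask):
    """Return the first row where mask is True, or None if no row matches.

    Hypothetical helper for illustration only.
    """
    matches = frame[mask]
    if matches.empty:
        return None
    return matches.iloc[0]

# A filter with matches returns the first matching row as a Series
over_25 = first_match(df, df["Age"] > 25)

# A filter with no matches returns None instead of raising IndexError
over_40 = first_match(df, df["Age"] > 40)

print(over_25["Name"])  # Alice
print(over_40)          # None
```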