Manipulating Data with DataFrames
A DataFrame in Pandas is a data structure for systematically handling tabular data, similar to an Excel spreadsheet.
A DataFrame is a 2-dimensional labeled table with both rows and columns, in which each column is a Series.
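To make the relationship between a DataFrame and its columns concrete, here is a minimal sketch. The two-row sample data is hypothetical and chosen only to match the item and sales theme of this lesson.

```python
import pandas as pd

# Hypothetical sample data for illustration only
data_frame = pd.DataFrame({
    'Item': ['Apple', 'Banana'],
    'Sales': [1000, 2000],
})

# shape is a (rows, columns) tuple
print(data_frame.shape)

# Each column, taken on its own, is a Series
item_column = data_frame['Item']
print(type(item_column).__name__)

# The column shares the row index of the DataFrame it came from
print(item_column.index.equals(data_frame.index))
```

Because every column shares the same row index, values in the same row of different columns stay aligned with each other.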
Below is a simple code example that creates a DataFrame containing item and sales data and manipulates the data.
Data Manipulation Example
import pandas as pd

# Create DataFrame
data_frame = pd.DataFrame({
    'Item': ['Apple', 'Banana', 'Strawberry', 'Grapes'],
    'Sales': [1000, 2000, 1500, 3000]
})

# Select a specific column
sales = data_frame['Sales']
print("sales:", sales)

# Filter rows based on a condition
filtered_data = data_frame[data_frame['Sales'] > 1500]
print("filtered_data:", filtered_data)

# Sort the data
sorted_data = data_frame.sort_values(by='Sales', ascending=False)
print("sorted_data:", sorted_data)
The line sales = data_frame['Sales'] selects only the 'Sales' column from the DataFrame and returns it as a Series.
Output of print(sales):
0    1000
1    2000
2    1500
3    3000
Name: Sales, dtype: int64
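As a small aside to column selection, the sketch below contrasts single brackets with a list of column names. The variable names as_series, as_frame, and both are illustrative and not part of the original lesson.

```python
import pandas as pd

data_frame = pd.DataFrame({
    'Item': ['Apple', 'Banana', 'Strawberry', 'Grapes'],
    'Sales': [1000, 2000, 1500, 3000],
})

# A single column name in brackets returns a Series
as_series = data_frame['Sales']

# A list of column names returns a DataFrame, even with one name in the list
as_frame = data_frame[['Sales']]

# The same list syntax selects several columns at once
both = data_frame[['Item', 'Sales']]

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(both.shape)
```

Which form to use depends on what the next step expects: a Series is convenient for calculations on one column, while a DataFrame keeps the tabular layout.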
The line filtered_data = data_frame[data_frame['Sales'] > 1500] filters the rows where the value in the 'Sales' column is greater than 1500 and creates a new DataFrame.
Output of print(filtered_data):
     Item  Sales
1  Banana   2000
3  Grapes   3000
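The filter works in two steps: the comparison produces a Series of True/False values, and that Series then picks the matching rows. The sketch below spells this out and also shows one way to combine two conditions; the names mask and mid_range are illustrative.

```python
import pandas as pd

data_frame = pd.DataFrame({
    'Item': ['Apple', 'Banana', 'Strawberry', 'Grapes'],
    'Sales': [1000, 2000, 1500, 3000],
})

# Step 1: the comparison yields one boolean per row
mask = data_frame['Sales'] > 1500
print(mask.tolist())

# Step 2: indexing with the mask keeps only the rows marked True
filtered_data = data_frame[mask]
print(filtered_data)

# Conditions are combined with & (and) or | (or),
# with each condition wrapped in its own parentheses
mid_range = data_frame[(data_frame['Sales'] >= 1500) & (data_frame['Sales'] < 3000)]
print(mid_range)
```

Note that the original row labels (1 and 3 in the filtered result) are preserved, which is why the index in the output is not renumbered.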
The line sorted_data = data_frame.sort_values(by='Sales', ascending=False) sorts the DataFrame in descending order based on the 'Sales' column.
Output of print(sorted_data):
         Item  Sales
3      Grapes   3000
1      Banana   2000
2  Strawberry   1500
0       Apple   1000
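Two details of sorting are easy to miss: sort_values returns a new DataFrame by default and leaves the original untouched, and the sorted rows keep their original index labels. The sketch below illustrates both; the renumbering step with reset_index is an optional addition, not something the lesson requires.

```python
import pandas as pd

data_frame = pd.DataFrame({
    'Item': ['Apple', 'Banana', 'Strawberry', 'Grapes'],
    'Sales': [1000, 2000, 1500, 3000],
})

# Sorting returns a new DataFrame; data_frame itself is not modified
sorted_data = data_frame.sort_values(by='Sales', ascending=False)
print(data_frame['Sales'].tolist())

# The sorted rows keep their original index labels
print(sorted_data.index.tolist())

# Optionally renumber the rows from 0, discarding the old labels
renumbered = sorted_data.reset_index(drop=True)
print(renumbered)
```

Omitting ascending=False (or passing ascending=True) sorts from the smallest value to the largest instead.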
Calculating Maximum, Minimum, and Average Values
There are methods to calculate the maximum, minimum, and average values of a specific column in a DataFrame.
- max(): Maximum value
- min(): Minimum value
- mean(): Average value
Here is example code that calculates the maximum, minimum, and average values of the 'Sales' column.
Calculating Maximum, Minimum, Average Values
import pandas as pd

data_frame = pd.DataFrame({
    'Item': ['Apple', 'Banana', 'Strawberry', 'Grapes'],
    'Sales': [1000, 2000, 1500, 3000]
})

# Maximum value
max_sales = data_frame['Sales'].max()
# Output: Maximum value: 3000
print(f'Maximum value: {max_sales}')

# Minimum value
min_sales = data_frame['Sales'].min()
# Output: Minimum value: 1000
print(f'Minimum value: {min_sales}')

# Average value
mean_sales = data_frame['Sales'].mean()
# Output: Average value: 1875.0
print(f'Average value: {mean_sales}')
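As a possible extension of the example above, the three statistics can be requested in one call with agg, and idxmax can be used to find which item has the highest sales. This is a sketch of one approach, not the only way to do it; the names summary and top_item are illustrative.

```python
import pandas as pd

data_frame = pd.DataFrame({
    'Item': ['Apple', 'Banana', 'Strawberry', 'Grapes'],
    'Sales': [1000, 2000, 1500, 3000],
})

# Compute several statistics at once; the result is a Series
# indexed by the statistic names
summary = data_frame['Sales'].agg(['max', 'min', 'mean'])
print(summary)

# idxmax() returns the row label of the largest value,
# which loc then uses to look up the matching item name
top_item = data_frame.loc[data_frame['Sales'].idxmax(), 'Item']
print(f'Top-selling item: {top_item}')
```

Looking up the row label this way connects a summary number back to the row it came from, which max() alone does not provide.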
Mission
True or false: selecting a single column of a DataFrame by name in Pandas returns a Series.