lesson1Title

lesson2Title

lesson3Title

lesson4Title

automationIntroApplicationChapter3Title

lesson5Title

lesson6Title

lesson7Title

lesson8Title

lesson9Title

automationIntroApplicationChapter1Title

automationIntroApplicationChapter2Title

automationIntroApplicationChapter4Title

import pandas as pd

# Load Excel file
file_path = 'input_file.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

# Group data by Customer ID and Customer Name (i.e., group data belonging to the same customer)
grouped = df.groupby(['Customer ID', 'Customer Name'])

# Select only the 'Sales Amount' column for each group (customer)
sales_data = grouped['Sales Amount']

# Sum the sales amount for each group (customer)
total_sales = sales_data.sum()

# Convert the grouped result back to a regular DataFrame format (reindex)
customer_sales = total_sales.reset_index()

# Rename columns for clarity
customer_sales.columns = ['Customer ID', 'Customer Name', 'Total Sales Amount']

# Print results
print(customer_sales)

# Analyze Monthly Sales Data and Send Reports via Email

Imagine a scenario where your company has to prepare and send monthly sales reports via email!

Manually handling a large amount of data can be time-consuming and prone to errors.

Automating these repetitive tasks can save time and significantly improve work efficiency.

In this lesson, we will learn how to analyze customer sales data stored in an Excel file using Python and send customized reports through email.

 

## Sample Data

The data used for this task is stored in an `.xlsx` Excel file as shown below.

The data includes customer ID, name, sales amount, and sales date.

| Customer ID | Customer Name | Sales Amount | Sales Date |
|-------------|----------------|--------------|------------|
| 101 | John Doe | 500000 | 2024-01-15 |
| 102 | Jane Smith | 300000 | 2024-01-22 |
| 101 | John Doe | 250000 | 2024-02-10 |
| 101 | John Doe | 300000 | 2024-02-18 |
| 103 | Peter Johnson | 400000 | 2024-02-14 |
| 102 | Jane Smith | 150000 | 2024-03-05 |
| 102 | Jane Smith | 50000 | 2024-03-08 |
| 101 | John Doe | 100000 | 2024-03-18 |

Using this Excel data, we will analyze the monthly sales data for each customer and write an email report.

 

## Code Explanation

The practice code uses the `Pandas` library to read data from an Excel file, group it by specific columns, and sum up the `Sales Amount` for each customer.

Let's go through the code step by step.

 

### 1. Import the Excel File

You can import the Excel file using the `read_excel` function from *Pandas*.

```python title="Import Excel File with read_excel"
file_path = 'input_file.xlsx'

df = pd.read_excel(file_path, sheet_name='Sheet1')
```

- `file_path` is the path to the Excel file to be loaded.

- The `pd.read_excel` function reads the file and stores the data in a DataFrame called `df`.

- `sheet_name='Sheet1'` specifies that data should be imported from Sheet1 of the Excel file.

 

### 2. Group Data by Customer ID and Name

```python title="Group Data by Customer ID and Name"
grouped = df.groupby(['Customer ID', 'Customer Name'])
```

The `groupby` function groups the data according to the specified columns (`Customer ID`, `Customer Name`).

Rows with the same customer ID and name are grouped together.

 

### 3. Select Sales Amount Data

```python title="Select Sales Amount Data"
sales_data = grouped['Sales Amount']
```

Select the 'Sales Amount' column from the grouped data.

The values in this column will be processed for each group.

 

### 4. Sum the Sales Amount for Each Group

```python title="Sum Sales Amount"
total_sales = sales_data.sum()
```

The `sum()` function adds up the sales amounts for each group to calculate the total sales amount for each customer.

 

### 5. Convert to DataFrame and Reset Index

```python title="Convert to DataFrame"
customer_sales = total_sales.reset_index()
```

The `reset_index()` function converts the grouped result back into a typical DataFrame format.

During this process, the DataFrame’s index is reset starting from 0, assigning new row numbers.

 

### 6. Rename Columns

```python title="Rename Columns"
customer_sales.columns = ['Customer ID', 'Customer Name', 'Total Sales Amount']
```

Rename the columns of the DataFrame to clearly indicate `Customer ID`, `Customer Name`, and `Total Sales Amount`.

 

Printing the `customer_sales` DataFrame with the print function will show the following results.

```text title="Sales Data Output by Customer"
 Customer ID Customer Name Total Sales Amount
0 101 John Doe 1150000
1 102 Jane Smith 500000
2 103 Peter Johnson 400000
```

Customer ID	Customer Name	Sales Amount	Sales Date
101	John Doe	500000	2024-01-15
102	Jane Smith	300000	2024-01-22
101	John Doe	250000	2024-02-10
101	John Doe	300000	2024-02-18
103	Peter Johnson	400000	2024-02-14
102	Jane Smith	150000	2024-03-05
102	Jane Smith	50000	2024-03-08
101	John Doe	100000	2024-03-18