# The Machine Learning Workflow

<br/>

A **machine learning workflow** is a structured process for moving from a raw dataset to a deployed, working model.  
Following a clear workflow makes each step easier to repeat and check, which improves efficiency and reproducibility and usually leads to better results.

Rather than listing each stage here, take a look at the **whiteboard diagram** for a visual breakdown of the workflow steps and their relationships.

<br/>

## Example: Simple Workflow in Scikit-learn

```python title="ML Workflow Example: Classification"
# Install scikit-learn in JupyterLite (skip these two lines in a regular Python environment)
import piplite
await piplite.install('scikit-learn')

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# 1. Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# 2. Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Choose model
model = KNeighborsClassifier(n_neighbors=3)

# 4. Train model
model.fit(X_train, y_train)

# 5. Evaluate
predictions = model.predict(X_test)
acc = accuracy_score(y_test, predictions)

print(f"Accuracy: {acc:.2f}")
```

This example demonstrates the **core loop** of the ML workflow:

* Data preparation
* Model selection
* Training
* Evaluation
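
To make the four steps easier to rerun, they can be wrapped in a single function. The sketch below is illustrative only: `run_workflow` is a hypothetical helper name, not part of scikit-learn, and it assumes scikit-learn is already installed (so the `piplite` lines are omitted). It also shows why a fixed `random_state` matters for reproducibility: two runs with the same seed use the same split and should report the same accuracy.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def run_workflow(n_neighbors=3, random_state=42):
    """Hypothetical helper: run the four core steps once and return test accuracy."""
    # Data preparation: load the data and hold out 20% for testing
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )

    # Model selection
    model = KNeighborsClassifier(n_neighbors=n_neighbors)

    # Training
    model.fit(X_train, y_train)

    # Evaluation
    return accuracy_score(y_test, model.predict(X_test))


# Same seed, same split, same score
first = run_workflow()
second = run_workflow()
print(f"Run 1: {first:.2f}, run 2: {second:.2f}, identical: {first == second}")
```

Changing `random_state` produces a different train/test split, so the reported accuracy may shift slightly from run to run.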

<br/>

## Key Takeaways

* A well-structured ML workflow reduces errors and improves reproducibility.
* The steps are iterative: you might return to earlier stages if performance isn’t satisfactory.
* Scikit-learn provides tools for almost every stage, from preprocessing to evaluation.
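
The iterative nature of the workflow can be sketched as a small loop that revisits the model-selection step. This is a minimal example under a few assumptions: the candidate values in `candidate_ks` are arbitrary, the preprocessing step (`StandardScaler`) is one possible choice rather than a requirement, and scikit-learn is assumed to be installed already. Cross-validation on the training set is used to compare candidates so that the test set is touched only once at the end.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Iterate over candidate settings (arbitrary values chosen for illustration)
candidate_ks = [1, 3, 5, 7, 9]
cv_scores = {}
for k in candidate_ks:
    # Preprocessing and model are chained so scaling is fit on training folds only
    pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    cv_scores[k] = cross_val_score(pipeline, X_train, y_train, cv=5).mean()

# Keep the setting with the best mean cross-validated accuracy
best_k = max(cv_scores, key=cv_scores.get)

# Retrain on the full training set and evaluate once on the held-out test set
final_model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=best_k))
final_model.fit(X_train, y_train)
test_acc = final_model.score(X_test, y_test)

print(f"Best k: {best_k}, test accuracy: {test_acc:.2f}")
```

If the cross-validated scores are all unsatisfactory, that is a signal to go back further, for example to the data preparation stage or to a different model family.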

<br/>

## What’s Next?

In the next lesson, we’ll dive into **Supervised vs. Unsupervised Learning** to understand the two main types of machine learning.


