Lecture

Responsible Data Use: Ethics and Privacy

As a data analyst, you need to think not only about what you can do with data, but also about what you should do with it.

Even when data is available, using it without considering privacy, consent, or fairness can lead to harm.

That's why responsible data use is a critical skill in your journey.


What do we need to consider?

When working with people's data — whether survey responses, user activity, or customer feedback — you need to consider:

  • Privacy: Does the data expose personal details?
  • Consent: Was it collected with permission?
  • Bias: Are certain groups underrepresented or misrepresented?
  • Security: Is the data stored and accessed safely?

Just because you have access to names, emails, or ages doesn't mean they should be used in every analysis. Responsible data use builds trust and protects individuals.
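
Two of the questions above can be checked directly in code. The sketch below is only an illustration: the survey rows, the field names (email, rating), and the age brackets are all made up for this example. It keeps only the fields an analysis needs, then counts how many responses each age group contributed.

```python
from collections import Counter

# Hypothetical survey rows; every name, email, and value here is made up.
responses = [
    {"name": "Lina", "email": "lina@example.com", "age": 25, "rating": 4},
    {"name": "Marcus", "email": "marcus@example.com", "age": 30, "rating": 5},
    {"name": "Priya", "email": "priya@example.com", "age": 62, "rating": 3},
]

# Privacy: keep only the fields this analysis actually needs.
NEEDED_FIELDS = ("age", "rating")
minimal = [{field: row[field] for field in NEEDED_FIELDS} for row in responses]


def age_group(age):
    """Assign an age to a coarse, illustrative bracket."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60+"


# Bias: count responses per group to spot groups with few or no responses.
group_counts = Counter(age_group(row["age"]) for row in minimal)

print(minimal)
print(group_counts)
```

A group with a very low count is not proof of bias, but it is a signal to check how the data was collected before drawing conclusions about that group.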


What is Anonymization?

To protect sensitive data, analysts often anonymize it. This means removing or masking information that could identify someone.

Let's see a simple example in Python.


Anonymizing Personal Data

    # Imagine we collected survey responses with names and ages
    data = [
        {"name": "Lina", "age": 25},
        {"name": "Marcus", "age": 30}
    ]

    # To protect privacy, we remove or mask names before analysis
    for row in data:
        row["name"] = "REDACTED"

    # View the anonymized data
    print(data)

  • data holds personal info (name, age) collected in a survey.
  • We replace names with "REDACTED" to protect identities.
  • This is a common first step before sharing or analyzing personal data. Other fields, such as age, may still need attention if they could identify someone when combined.
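
The example above masks names by overwriting them in the original list. A variant, sketched here on the assumption that the names are not needed for the analysis at all, builds a new list that drops the name field and labels each row with an arbitrary ID. The field name respondent_id is hypothetical.

```python
# Same survey rows as in the example above.
data = [
    {"name": "Lina", "age": 25},
    {"name": "Marcus", "age": 30},
]

# Build new rows instead of modifying the originals:
# drop "name" and label each row with an arbitrary sequential ID.
anonymized = [
    {"respondent_id": index, "age": row["age"]}
    for index, row in enumerate(data, start=1)
]

print(anonymized)
# → [{'respondent_id': 1, 'age': 25}, {'respondent_id': 2, 'age': 30}]
```

Because the original rows are left unchanged, they still contain names and should be kept in restricted storage or deleted once they are no longer needed.
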
Quiz

Why is it important for data analysts to anonymize personal data before analysis?

To protect sensitive data, analysts often ____ it.

  • Encrypt
  • Anonymize
  • Delete
  • Share
