Real datasets have holes: blank cells, NaN, NULL, sentinel values such as -999, and the dreaded empty string. Learning to see them is a core skill.
In a perfect world, every row would have every column filled in. In reality, datasets are full of gaps. A survey respondent skipped a question. A sensor cut out for three seconds. A database migration dropped a field. All of this creates missing data.
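Gaps do not always arrive as NaN. As a minimal sketch, assuming a toy table whose column names (`age`, `city`) and placeholder values (`-999`, `"N/A"`, blank strings) are invented for illustration, the snippet below shows how disguised gaps stay invisible to `isna()` until they are converted:

```python
import pandas as pd

# Hypothetical toy data: the columns and placeholder values are assumptions.
raw = pd.DataFrame({
    "age": [34, -999, 51, 28],
    "city": ["Lyon", "", "  ", "N/A"],
})

# Before cleaning, pandas reports no missing values at all.
missing_before = raw.isna().sum()

cleaned = raw.copy()
# Turn the numeric sentinel into a real missing value.
cleaned["age"] = raw["age"].mask(raw["age"] == -999)
# Strip whitespace, then treat empty strings and "N/A" as missing.
stripped = raw["city"].str.strip()
cleaned["city"] = stripped.mask(stripped.isin(["", "N/A"]))

missing_after = cleaned.isna().sum()
print(missing_before)
print(missing_after)
```

Comparing the two counts is a quick check that the placeholders were really standing in for gaps. Which values count as placeholders depends on the dataset, so the list here should be treated as an example rather than a rule.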
import pandas as pd
import numpy as np
df = pd.read_csv('survey.csv', na_values=['-999', 'N/A', 'unknown'])
# How much is missing in each column?
print(df.isna().sum())
print(df.isna().mean()) # as a fraction
# Fill age with the median
df['age'] = df['age'].fillna(df['age'].median())
# Flag missingness before filling
df['income_was_missing'] = df['income'].isna()
df['income'] = df['income'].fillna(df['income'].median())
Detecting and handling missing data in pandas
The big idea: missing data is not just absence, it is often information. Treat every gap as a question, not an error.
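One way to treat a gap as a question is to ask whether it shows up more often in some groups than in others. The sketch below assumes a hypothetical `employment` column and made-up values. It reuses the `income_was_missing` flag idea from the lesson code and compares the share of missing income per group:

```python
import numpy as np
import pandas as pd

# Hypothetical survey data: column names and values are assumptions.
df = pd.DataFrame({
    "employment": ["employed", "employed", "unemployed",
                   "unemployed", "employed", "unemployed"],
    "income": [52000, 61000, np.nan, np.nan, 48000, 15000],
})

# Flag missingness before doing any filling.
df["income_was_missing"] = df["income"].isna()

# Share of missing income within each employment group.
missing_rate = df.groupby("employment")["income_was_missing"].mean()
print(missing_rate)
```

If one group skips the question far more often than another, the gaps are probably not random, and a plain median fill could hide that pattern. A table like this is only a first look, not proof of why the values are missing.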
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-data-missing-data
What is the core idea behind "Missing Data and How to Spot It"?
Which term best describes a foundational idea in "Missing Data and How to Spot It"?
A learner studying Missing Data and How to Spot It would need to understand which concept?
Which of these is directly relevant to Missing Data and How to Spot It?
Which of the following is a key point about Missing Data and How to Spot It?
What is one important takeaway from studying Missing Data and How to Spot It?
Which of these does NOT belong in a discussion of Missing Data and How to Spot It?
What is the key insight about "The many faces of missing" in the context of Missing Data and How to Spot It?
What is the key insight about "MNAR is sneaky" in the context of Missing Data and How to Spot It?
Which statement accurately describes an aspect of Missing Data and How to Spot It?
What does working with Missing Data and How to Spot It typically involve?
Which best describes the scope of "Missing Data and How to Spot It"?
Which section heading best belongs in a lesson about Missing Data and How to Spot It?
Which section heading best belongs in a lesson about Missing Data and How to Spot It?
Which of the following is a concept covered in Missing Data and How to Spot It?