Pandas is the Python library that made data science what it is today. Ten verbs get you through 90 percent of day-to-day data work.
Pandas was created in 2008 by Wes McKinney while he was working at the hedge fund AQR Capital Management. Today it is the default Python library for tabular data, downloaded over 100 million times per month. Its two main types are Series (a single column) and DataFrame (a table).
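To make the Series/DataFrame distinction concrete, here is a minimal sketch. The column names and values are invented for illustration and do not come from the lesson's data.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe"],
    "age": [31, 17, 45],
})

ages = df["age"]      # single brackets return a Series (one column)
subset = df[["age"]]  # double brackets return a DataFrame, even with one column

print(type(ages).__name__)    # Series
print(type(subset).__name__)  # DataFrame
print(ages.shape)             # (3,)
print(subset.shape)           # (3, 1)
```

The shapes show the difference: a Series is one-dimensional, while a DataFrame always has rows and columns.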
```python
import pandas as pd

# 1. Load
df = pd.read_csv('data.csv')

# 2. Peek
df.head()
df.info()
df.describe()

# 3. Select columns
df['age']              # one column (Series)
df[['age', 'income']]  # multiple columns (DataFrame)

# 4. Filter rows
df[df['age'] > 18]
df[(df['age'] > 18) & (df['country'] == 'US')]

# 5. Sort
df.sort_values('income', ascending=False)

# 6. Create columns
df['income_per_age'] = df['income'] / df['age']

# 7. Group and aggregate
df.groupby('country')['income'].mean()
df.groupby(['country', 'gender']).agg({
    'income': ['mean', 'median'],
    'age': 'mean'
})

# 8. Join tables
merged = pd.merge(df, other_df, on='user_id', how='left')

# 9. Pivot
pd.pivot_table(df, index='country', columns='year', values='income')

# 10. Save
df.to_csv('clean.csv', index=False)
df.to_parquet('clean.parquet')
```
The ten most important pandas operations

```python
# .loc uses labels
df.loc[5]                       # row with index label 5
df.loc[df['age'] > 18, 'name']  # name column, filtered rows

# .iloc uses positions
df.iloc[5]        # 6th row regardless of index label
df.iloc[:10, :3]  # first 10 rows, first 3 cols

# Chained assignment is a trap
# df[df.age > 18]['score'] = 100   # DO NOT DO THIS
df.loc[df.age > 18, 'score'] = 100  # CORRECT
```
Correct indexing patterns

```python
# Top N per group
top3 = df.groupby('country').apply(
    lambda g: g.nlargest(3, 'income')
).reset_index(drop=True)

# Rolling stats
df['7d_avg'] = df['sales'].rolling(window=7).mean()

# Replace based on mapping
df['country'] = df['country'].replace({'USA': 'US', 'U.S.A.': 'US'})

# One-hot encoding
df_encoded = pd.get_dummies(df, columns=['color'])

# Handle dates
df['date'] = pd.to_datetime(df['date'])
df['day_of_week'] = df['date'].dt.day_name()
```
Patterns you will use every week

The big idea: pandas rewards the ten verbs you use 90 percent of the time. Master those before chasing fancier features, and the other 10 percent will come naturally when you need it.
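The snippets above assume a `data.csv` and an `other_df` that the lesson does not provide. As a rough sketch, a few of the verbs (filter, sort, group, join) and the `.loc` assignment pattern can be tried end to end on small in-memory tables. All values below, and the `plan` column, are invented for illustration.

```python
import pandas as pd

# Hypothetical data standing in for data.csv (invented for this sketch).
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5],
    "country": ["US", "US", "DE", "DE", "FR"],
    "age": [25, 17, 40, 33, 52],
    "income": [50000, 0, 72000, 61000, 58000],
})

# Hypothetical second table; the "plan" column is an assumption.
other_df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "plan": ["free", "free", "pro"],
})

# Filter rows, then sort: adults ordered by income, highest first.
adults = df[df["age"] > 18].sort_values("income", ascending=False)
print(adults["user_id"].tolist())  # [3, 4, 5, 1]

# Group and aggregate: mean adult income per country.
mean_income = adults.groupby("country")["income"].mean()
print(mean_income)  # DE 66500.0, FR 58000.0, US 50000.0

# Left join: every row of df is kept; users 4 and 5 have no match,
# so their "plan" value is NaN.
merged = pd.merge(df, other_df, on="user_id", how="left")
print(merged["plan"].isna().sum())  # 2

# Conditional assignment with .loc: one indexing call selects both
# the rows and the column, so the write lands on df itself.
df["score"] = 0
df.loc[df["age"] > 18, "score"] = 100
print(df["score"].tolist())  # [100, 0, 100, 100, 100]
```

The `.loc` form is preferred over `df[df.age > 18]['score'] = 100` because the chained version first builds a filtered intermediate object and then assigns into that, which may leave `df` unchanged.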
6 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-data-pandas-fundamentals
What is the main idea of "Pandas Fundamentals in 40 Minutes"?
Which concept is most central to "Pandas Fundamentals in 40 Minutes"?
What should a careful learner remember about "The SettingWithCopyWarning"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about pandas be treated?
Name one way to verify an AI answer about pandas.