Pandas Cheat Sheet
Basic commands for the Pandas Python library.
Importing and Creating Data Structures
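- Importing Pandas (and NumPy, which the examples below use as np):
import pandas as pd
import numpy as np
- Series:
s = pd.Series([1, 3, 5, np.nan, 6, 8])
- DataFrame (from a NumPy array, or from a list of dicts):
df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
df = pd.DataFrame([{'A': 0, 'B': 1}, {'A': 2, 'B': 3}])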
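As a quick check of the constructors above, here is a minimal sketch. The names rng, df_random and df_records are illustrative only, and the seeded generator is an assumption added so the random values are reproducible.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labeled array; np.nan marks a missing value.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

# DataFrame from a NumPy array. The seeded generator (an assumption, not part
# of the cheat sheet) keeps the values reproducible between runs.
rng = np.random.default_rng(0)
df_random = pd.DataFrame(rng.standard_normal((6, 4)), columns=list("ABCD"))

# DataFrame from a list of dicts: keys become columns, each dict becomes a row.
df_records = pd.DataFrame([{"A": 0, "B": 1}, {"A": 2, "B": 3}])

print(s.dtype)          # the NaN forces a float dtype
print(df_random.shape)  # six rows, four columns
print(df_records)
```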
Viewing Data
- Display the top rows:
df.head(n)
- Display the bottom rows:
df.tail(n)
- Display basic statistics:
df.describe()
- Display column data types:
df.dtypes
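A small sketch of the viewing commands on a toy frame; the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only to show what each call returns.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "score": [10, 20, 30, 40, 50, 60],
})

top = df.head(2)       # first two rows
bottom = df.tail(2)    # last two rows
stats = df.describe()  # summary statistics; numeric columns only by default

print(top)
print(bottom)
print(stats)
print(df.dtypes)       # one dtype per column
```

Note that describe() skips the non-numeric name column unless include='all' is passed.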
Selection
- Select a column:
df['column_name']
- Select multiple columns:
df[['column_name1', 'column_name2']]
- Select rows by position:
df.iloc[3]
df.iloc[2:5]
- Select rows by condition:
df[df['column_name'] > 2]
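The selection forms above can be compared side by side in a short sketch. The data is hypothetical, and .loc is included only to contrast label-based access with the position-based .iloc.

```python
import pandas as pd

# Hypothetical data with a non-default index, so position and label differ.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune", "Kyiv"], "temp": [1, 3, 2, 5]},
    index=["w", "x", "y", "z"],
)

one_col = df["temp"]             # single column -> Series
two_cols = df[["city", "temp"]]  # list of columns -> DataFrame
row = df.iloc[3]                 # fourth row by position -> Series
rows = df.iloc[1:3]              # positions 1 and 2 (end is exclusive)
warm = df[df["temp"] > 2]        # boolean mask keeps rows where it is True
by_label = df.loc["x"]           # row by index label rather than position

print(rows)
print(warm)
```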
Data Cleaning
- Drop rows with missing values:
df.dropna(axis=0)
- Fill missing values:
df.fillna(value)
- Replace values:
df.replace(to_replace, value)
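A minimal sketch of the three cleaning calls, assuming a toy frame with one missing number and a "?" placeholder string. Each call returns a new object and leaves df unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "a" has a missing value, column "b" a placeholder.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "?"]})

dropped = df.dropna(axis=0)            # drops the row that contains NaN
filled = df.fillna({"a": 0.0})         # per-column fill value for column "a"
replaced = df.replace("?", "unknown")  # swap the placeholder for a clearer label

print(dropped)
print(filled)
print(replaced)
```

Passing a dict to fillna limits the fill to the named columns, which avoids writing a numeric fill value into a text column.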
Data Operations
- Apply a function:
df.apply(np.mean)
- Sort by values:
df.sort_values(by='column_name')
- Group by values:
df.groupby('column_name').sum()
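The operations above, sketched on hypothetical team data. The lambda passed to apply is an illustrative stand-in for np.mean: apply calls it once per column.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "team": ["red", "blue", "red", "blue"],
    "points": [3, 1, 2, 5],
    "fouls": [1, 0, 2, 1],
})

# apply runs the function once per column; here it computes each column's range.
col_range = df[["points", "fouls"]].apply(lambda col: col.max() - col.min())

# Sort rows by one column, highest first.
sorted_df = df.sort_values(by="points", ascending=False)

# Group rows by team and sum the remaining numeric columns.
totals = df.groupby("team").sum()

print(col_range)
print(sorted_df)
print(totals)
```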
Merging Data
- Inner Join:
pd.merge(df1, df2, on='key')
- Left Join:
pd.merge(df1, df2, on='key', how='left')
- Right Join:
pd.merge(df1, df2, on='key', how='right')
- Full Outer Join:
pd.merge(df1, df2, on='key', how='outer')
- Concatenate DataFrames:
pd.concat([df1, df2])
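To see how the join types differ, here is a sketch with two small hypothetical frames that share only the keys "b" and "c".

```python
import pandas as pd

# Hypothetical frames: "a" exists only in df1, "d" only in df2.
df1 = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

inner = pd.merge(df1, df2, on="key")               # keys present in both
left = pd.merge(df1, df2, on="key", how="left")    # every key from df1
right = pd.merge(df1, df2, on="key", how="right")  # every key from df2
outer = pd.merge(df1, df2, on="key", how="outer")  # union of keys

# concat stacks rows; columns missing from one frame are filled with NaN.
stacked = pd.concat([df1, df2], ignore_index=True)

for name, frame in [("inner", inner), ("left", left),
                    ("right", right), ("outer", outer), ("stacked", stacked)]:
    print(name, len(frame))
```

Unmatched rows in the left, right and outer joins get NaN in the columns that come from the other frame.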
Input and Output
- Read from CSV:
df = pd.read_csv('file.csv')
- Write to CSV:
df.to_csv('file.csv', index=False)
- Read from Excel:
df = pd.read_excel('file.xlsx')
- Write to Excel:
df.to_excel('file.xlsx', sheet_name='Sheet1', index=False)
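A CSV round trip can be sketched without touching the disk by using an in-memory buffer in place of 'file.csv'. The buffer and the sample data are assumptions made for illustration; the Excel calls follow the same pattern but need an Excel engine such as openpyxl installed.

```python
import io

import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"id": [1, 2], "name": ["ann", "bob"]})

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False leaves the row index out
csv_text = buffer.getvalue()

# Rewind the buffer and read the same text back into a new DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(csv_text)
print(restored)
```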
Handling Dates
- Convert to datetime:
df['date_column'] = pd.to_datetime(df['date_column'])
- Set as index:
df.set_index('date_column', inplace=True)
- Resample time series data (monthly mean; pandas 2.2 and later use 'ME' in place of 'M'):
df.resample('M').mean()
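The three date steps chained together, as a sketch on hypothetical data. It swaps in the 'MS' (month start) alias rather than 'M', because 'MS' is accepted across pandas versions; the monthly buckets are then labeled by the first day of each month instead of the last.

```python
import pandas as pd

# Hypothetical data: dates arrive as strings, as they often do from CSV files.
df = pd.DataFrame({
    "date_column": ["2024-01-05", "2024-01-20", "2024-02-10", "2024-02-25"],
    "value": [10, 20, 30, 50],
})

# Parse the strings into datetime64 values.
df["date_column"] = pd.to_datetime(df["date_column"])

# resample needs a datetime-like index, so move the column into the index.
df = df.set_index("date_column")

# Average the values within each calendar month.
monthly = df.resample("MS").mean()

print(monthly)
```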
Plotting
- Basic plot:
df.plot()
- Histogram:
df['column_name'].plot.hist(bins=20)
- Bar plot:
df.plot.bar()
- Scatter plot:
df.plot.scatter(x='column_name1', y='column_name2')