Pandas Cheat Sheet
Basics commmands for the Pandas Python lib.
Pandas Cheat Sheet
Importing and Creating Data Structures
- Importing Pandas:
import pandas as pd
- Series:
s = pd.Series([1, 3, 5, np.nan, 6, 8])
- DataFrame:
df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
df = pd.DataFrame([{'A': 0, 'B': 1}, {'A': 2, 'B': 3}])
Viewing Data
- Display the top rows:
df.head(n)
- Display the bottom rows:
df.tail(n)
- Display basic statistics:
df.describe()
- Display column data types:
df.dtypes
Selection
- Select a column:
df['column_name']
- Select multiple columns:
df[['column_name1', 'column_name2']]
- Select rows by index:
df.iloc[3]
,df.iloc[2:5]
- Select rows by condition:
df[df['column_name'] > 2]
Data Cleaning
- Drop rows with missing values:
df.dropna(axis=0)
- Fill missing values:
df.fillna(value)
- Replace values:
df.replace(to_replace, value)
Data Operations
- Apply a function:
df.apply(np.mean)
- Sort by values:
df.sort_values(by='column_name')
- Group by values:
df.groupby('column_name').sum()
Merging Data
- Inner Join:
pd.merge(df1, df2, on='key')
- Left Join:
pd.merge(df1, df2, on='key', how='left')
- Right Join:
pd.merge(df1, df2, on='key', how='right')
- Full Outer Join:
pd.merge(df1, df2, on='key', how='outer')
- Concatenate DataFrames:
pd.concat([df1, df2])
Input and Output
- Read from CSV:
df = pd.read_csv('file.csv')
- Write to CSV:
df.to_csv('file.csv', index=False)
- Read from Excel:
df = pd.read_excel('file.xlsx')
- Write to Excel:
df.to_excel('file.xlsx', sheet_name='Sheet1', index=False)
Handling Dates
- Convert to datetime:
df['date_column'] = pd.to_datetime(df['date_column'])
- Set as index:
df.set_index('date_column', inplace=True)
- Resample time series data:
df.resample('M').mean()
Plotting
- Basic plot:
df.plot()
- Histogram:
df['column_name'].plot.hist(bins=20)
- Bar plot:
df.plot.bar()
- Scatter plot:
df.plot.scatter(x='column_name1', y='column_name2')