Basics commmands for the Pandas Python lib.

Pandas Cheat Sheet

Importing and Creating Data Structures

  • Importing Pandas: import pandas as pd
  • Series: s = pd.Series([1, 3, 5, np.nan, 6, 8])
  • DataFrame:
    • df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
    • df = pd.DataFrame([{'A': 0, 'B': 1}, {'A': 2, 'B': 3}])

Viewing Data

  • Display the top rows: df.head(n)
  • Display the bottom rows: df.tail(n)
  • Display basic statistics: df.describe()
  • Display column data types: df.dtypes

Selection

  • Select a column: df['column_name']
  • Select multiple columns: df[['column_name1', 'column_name2']]
  • Select rows by index: df.iloc[3], df.iloc[2:5]
  • Select rows by condition: df[df['column_name'] > 2]

Data Cleaning

  • Drop rows with missing values: df.dropna(axis=0)
  • Fill missing values: df.fillna(value)
  • Replace values: df.replace(to_replace, value)

Data Operations

  • Apply a function: df.apply(np.mean)
  • Sort by values: df.sort_values(by='column_name')
  • Group by values: df.groupby('column_name').sum()

Merging Data

  • Inner Join: pd.merge(df1, df2, on='key')
  • Left Join: pd.merge(df1, df2, on='key', how='left')
  • Right Join: pd.merge(df1, df2, on='key', how='right')
  • Full Outer Join: pd.merge(df1, df2, on='key', how='outer')
  • Concatenate DataFrames: pd.concat([df1, df2])

Input and Output

  • Read from CSV: df = pd.read_csv('file.csv')
  • Write to CSV: df.to_csv('file.csv', index=False)
  • Read from Excel: df = pd.read_excel('file.xlsx')
  • Write to Excel: df.to_excel('file.xlsx', sheet_name='Sheet1', index=False)

Handling Dates

  • Convert to datetime: df['date_column'] = pd.to_datetime(df['date_column'])
  • Set as index: df.set_index('date_column', inplace=True)
  • Resample time series data: df.resample('M').mean()

Plotting

  • Basic plot: df.plot()
  • Histogram: df['column_name'].plot.hist(bins=20)
  • Bar plot: df.plot.bar()
  • Scatter plot: df.plot.scatter(x='column_name1', y='column_name2')