Pandas - Missing Data

Missing Data

Let’s look at a few convenient DataFrame methods for dealing with missing data in pandas:
DataFrame.dropna(axis, thresh)
DataFrame.fillna(value)

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})
df
     A    B  C
0  1.0  5.0  1
1  2.0  NaN  2
2  NaN  NaN  3

Drop Rows with NaN Data

df.dropna()
     A    B  C
0  1.0  5.0  1

Drop Rows with Fewer than 2 Non-NaN Values

df.dropna(thresh=2)
     A    B  C
0  1.0  5.0  1
1  2.0  NaN  2
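
The thresh argument counts non-NaN values, not NaNs: a row is kept only if it has at least thresh non-missing entries. The sketch below illustrates this on the same DataFrame; the names non_missing, kept, and same are illustrative only and are not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

# Count the non-NaN values in each row; thresh is compared against this count.
non_missing = df.notna().sum(axis=1)

# Keep rows with at least 2 non-NaN values (rows 0 and 1 here).
kept = df.dropna(thresh=2)

# The same filter written as an explicit boolean mask.
same = df[non_missing >= 2]
```

Row 2 has only one non-NaN value (column C), so it is the only row dropped.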

Drop Columns with NaN Data

df.dropna(axis=1)
   C
0  1
1  2
2  3
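
The axis and thresh arguments can also be combined. As a minimal sketch on the same DataFrame (the name cols_kept is illustrative only), the call below keeps every column that has at least 2 non-NaN values:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

# Column A has 2 non-NaN values, B has 1, and C has 3,
# so only column B falls below the threshold and is dropped.
cols_kept = df.dropna(axis=1, thresh=2)
print(cols_kept)
```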

Fill Data with Specified Value

df.fillna(value='FILL VALUE')
            A           B  C
0           1           5  1
1           2  FILL VALUE  2
2  FILL VALUE  FILL VALUE  3

Fill Data with Column’s Mean Value

df['A'].fillna(value=df['A'].mean())
0    1.0
1    2.0
2    1.5
Name: A, dtype: float64
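
The same idea can be extended to the whole DataFrame. The sketch below assumes every column is numeric, as in this example; the names col_means and filled are illustrative only. Passing a Series of column means to fillna fills each column with its own mean. Note that fillna returns a new object and leaves df unchanged unless the result is assigned back.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

# Per-column means; NaN values are skipped by default.
col_means = df.mean()

# Each NaN is replaced by the mean of its own column.
filled = df.fillna(value=col_means)
print(filled)

# The original DataFrame still contains its NaN values.
print(df.isna().sum())
```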