How to replace all blank/empty cells in a pandas dataframe with NaNs
As has already been mentioned, the best way to accomplish this is to use df.replace().
    >>> import pandas as pd
    >>> import numpy as np
    >>> df = pd.DataFrame({'A': [1, 2, '', 4, ''], 'B': [6, 7, '', '', 10]})
    >>> df
       A   B
    0  1   6
    1  2   7
    2
    3  4
    4     10
    >>> df.replace('', np.nan)
         A     B
    0  1.0   6.0
    1  2.0   7.0
    2  NaN   NaN
    3  4.0   NaN
    4  NaN  10.0
By default this returns a new DataFrame and leaves df unchanged. Either assign the result to a variable, or pass inplace=True to apply the replacement to your current DataFrame object.
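If some of the "blank" cells actually contain spaces or tabs rather than a truly empty string, the exact match on '' will miss them. A minimal sketch of one way to cover that case, assuming that whitespace-only strings should also count as blank (the sample data with ' ' and '  ' is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: some cells are empty strings, others hold only whitespace.
df = pd.DataFrame({'A': [1, 2, '', 4, '  '], 'B': [6, 7, ' ', '', 10]})

# With regex=True, the pattern matches strings that are empty or
# whitespace-only. Non-string cells (the integers) are left untouched.
cleaned = df.replace(r'^\s*$', np.nan, regex=True)

# Count how many cells per column were turned into NaN.
print(cleaned.isna().sum())
```

Because replace() is not called with inplace=True here, df itself still holds the original strings and only cleaned contains the NaNs.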
This also works with columns. If, for example, you only wanted to replace all of the blanks in column A while leaving the blanks in column B, then you could use df.A.replace().
    >>> df.A.replace('', np.nan, inplace=True)
    >>> df
         A   B
    0  1.0   6
    1  2.0   7
    2  NaN
    3  4.0
    4  NaN  10
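Calling an in-place method on a selected column relies on that selection sharing data with the parent DataFrame. Newer pandas versions warn about this pattern, and with Copy-on-Write enabled the change may not reach df at all. A hedged alternative that does not depend on that behaviour is to assign the result back to the column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, '', 4, ''], 'B': [6, 7, '', '', 10]})

# Replace blanks in column A only, then write the new Series back into df.
df['A'] = df['A'].replace('', np.nan)

# Column B is untouched and still contains its empty strings.
print(df['A'].isna().sum(), (df['B'] == '').sum())
```

This is a sketch of the same column-only replacement, not a different technique. The only change is that the result is stored explicitly instead of relying on inplace=True.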