✍Tips and Tricks in Python

What is the difference between NaN, None, pd.nan and np.nan?

Photo by Dave Gandy under the Public Domain Dedication License

Warning: There is no magical formula or Holy Grail here, though it might open the door to a new world for you.

TL;DR:

  • First of all, there is no pd.nan, but there is np.nan.
  • If a value is missing and shows as NaN, be careful about testing it with == np.nan: np.nan does not compare equal to np.nan.
np.nan == np.nan
False

NaN is used consistently as the placeholder for missing data in pandas, and consistency is good. I usually read/translate NaN as “missing”. Also see the ‘working with missing data’ section in the docs.

Wes writes in the docs ‘choice of NA-representation’:

After years of production use [NaN] has proven, at least in my opinion, to be the best decision given the state of affairs in NumPy and Python in general. The special value NaN (Not-A-Number) is used everywhere as the NA value, and there are API functions isnull and notnull which can be used across the dtypes to detect NA values.
...
Thus, I have chosen the Pythonic “practicality beats purity” approach and traded integer NA capability for a much simpler approach of using a special value in float and object arrays to denote NA, and promoting integer arrays to floating when NAs must be introduced.

Note the “gotcha”: an integer Series containing missing data is upcast to float.
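The upcast is easy to see in a small sketch. The variable names are illustrative only, and reindex is used here simply as one convenient way to introduce a missing value into an integer Series.

```python
import numpy as np
import pandas as pd

# An integer Series with no missing data keeps an integer dtype.
s_int = pd.Series([1, 2, 3])
print(s_int.dtype)

# Reindexing to a label that does not exist introduces a missing value,
# so pandas promotes the whole Series to float to make room for NaN.
s_upcast = s_int.reindex([0, 1, 2, 3])
print(s_upcast.dtype)
print(s_upcast.isnull().tolist())
```

The last row is the only missing one, yet every value in the Series is now stored as a float.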

In my opinion the main reason to use NaN (over None) is that it can be stored with numpy’s float64 dtype, rather than the less efficient object dtype, see NA type promotions.

import numpy as np
import pandas as pd

# Without forcing dtype=object, pandas converts None to NaN
s_bad = pd.Series([1, None], dtype=object)
s_good = pd.Series([1, np.nan])
In [13]: s_bad.dtype
Out[13]: dtype('O')
In [14]: s_good.dtype
Out[14]: dtype('float64')

Jeff comments (below) on this:

np.nan allows for vectorized operations; its a float value, while None, by definition, forces object type, which basically disables all efficiency in numpy.

So repeat 3 times fast: object==bad, float==good
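One rough way to see the cost of object dtype is to compare memory footprints. This is only a sketch: the size n and the variable names are arbitrary choices, and the exact byte counts depend on the Python build, so only the direction of the difference is meaningful.

```python
import numpy as np
import pandas as pd

n = 1000
values = [float(i) for i in range(n)]

s_float = pd.Series(values + [np.nan])               # float64, NaN marks the gap
s_object = pd.Series(values + [None], dtype=object)  # object, None marks the gap

# deep=True also counts the Python objects that an object column points to.
float_bytes = s_float.memory_usage(index=False, deep=True)
object_bytes = s_object.memory_usage(index=False, deep=True)

print(s_float.dtype, float_bytes)
print(s_object.dtype, object_bytes)
```

The float64 Series stores raw 8-byte values, while the object Series stores pointers to separate Python objects, which is also why NumPy cannot run its fast vectorized loops over it.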

That said, many operations still work just as well with None as with NaN (though they may not be officially supported, i.e. they may sometimes give surprising results):

In [15]: s_bad.sum()
Out[15]: 1
In [16]: s_good.sum()
Out[16]: 1.0

To answer the second question, how to test for missing data (NaN):
You should be using pd.isnull and pd.notnull (also available as pd.isna and pd.notna).

np.nan does not compare equal to anything, including itself:

np.nan == np.nan
False

So if a value is missing and shows as NaN, an equality test against np.nan will never find it.

np.isnan, on the other hand, does:

np.isnan(np.nan)
True

You could also use pd.isnull:

pd.isnull(np.nan)
True
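The practical difference between the two checks shows up once None is involved. The sketch below uses a made-up mixed Series; the flag name isnan_accepts_none is just for illustration.

```python
import numpy as np
import pandas as pd

# pd.isnull treats both None and NaN as missing, for scalars and containers.
print(pd.isnull(None))
print(pd.isnull(np.nan))

s_mixed = pd.Series([1, None, np.nan, "a"], dtype=object)
mask = pd.isnull(s_mixed)
print(mask.tolist())

# np.isnan only understands numeric input, so None is rejected.
try:
    np.isnan(None)
    isnan_accepts_none = True
except TypeError:
    isnan_accepts_none = False
print(isnan_accepts_none)
```

For data that may contain None or non-numeric values, pd.isnull is therefore the safer default.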

Examples:
This filters nothing, because nothing is equal to np.nan

s = pd.Series([1., np.nan, 2.])
s[s != np.nan]
0 1.0
1 NaN
2 2.0
dtype: float64

Filters out the null

s = pd.Series([1., np.nan, 2.])
s[s.notnull()]
0 1.0
2 2.0
dtype: float64

Use the odd comparison behavior to get what we want anyway. Since np.nan != np.nan is True, s == s is False exactly where s is NaN:

s = pd.Series([1., np.nan, 2.])
s[s == s]
0 1.0
2 2.0
dtype: float64

Just dropna

s = pd.Series([1., np.nan, 2.])
s.dropna()
0 1.0
2 2.0
dtype: float64
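The four filters can also be checked side by side in one runnable sketch. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 2.0])

kept_by_inequality = s[s != np.nan]   # every element passes, NaN included
kept_by_notnull = s[s.notnull()]
kept_by_self_equality = s[s == s]     # NaN != NaN, so the NaN row drops out
kept_by_dropna = s.dropna()

print(len(kept_by_inequality))
print(kept_by_notnull.equals(kept_by_self_equality))
print(kept_by_notnull.equals(kept_by_dropna))
```

Of the three that work, notnull and dropna say what they mean, while s == s relies on the reader knowing the NaN comparison rule.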

If you do need to replace NaN with None, you can use where. It's worth noting that you can do this natively in pandas:

df1 = df.where(pd.notnull(df), None)

Note: this changes the dtype of all columns to object.

Example:

In [1]: df = pd.DataFrame([1, np.nan])
In [2]: df
Out[2]:
     0
0    1
1  NaN
In [3]: df1 = df.where(pd.notnull(df), None)
In [4]: df1
Out[4]:
      0
0     1
1  None

Note: what you cannot do is recast the DataFrame's dtype to object, using astype, and then fill with None via the DataFrame fillna method. The following inserts the string 'None', not the None object:

df1 = df.astype(object).replace(np.nan, 'None')

Unfortunately neither fillna nor replace works with None; see this (closed) issue.

As an aside, it’s worth noting that for most use cases you don’t need to replace NaN with None, see this question about the difference between NaN and None in pandas.

However, in this specific case it seems you do.
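For that case, here is a minimal sketch that casts to object before calling where. This is an assumption on my part rather than something from the answer quoted above: in some newer pandas versions the plain where call may leave NaN in float columns, and casting to object first is a commonly used workaround. The name df_none is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([1, np.nan])

# Cast to object first so each cell can hold an arbitrary Python object,
# then keep the non-missing cells and put None everywhere else.
df_none = df.astype(object).where(pd.notnull(df), None)

print(df_none.dtypes.tolist())
print(df_none[0].tolist())
```

The result is an object column, so the efficiency caveats about object dtype apply to it as well.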

