I'm curious about the behavior of 'where' and why you would use it instead of 'loc'.

If I create a dataframe:

import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Run Distance': [234, 35, 77, 787, 243, 5435, 775, 123, 355, 123],
    'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
    'Gender': ['m', 'm', 'm', 'f', 'f', 'm', 'f', 'm', 'f', 'm'],
})

And then apply the 'where' method:

df2 = df.where(df['Goals']>10)

I get the following, which keeps the rows where Goals > 10 but replaces every other row with NaN:

  Gender  Goals    ID  Run Distance
0      m   12.0   1.0         234.0
1      m   23.0   2.0          35.0
2      m   56.0   3.0          77.0
3    NaN    NaN   NaN           NaN
4    NaN    NaN   NaN           NaN
5    NaN    NaN   NaN           NaN
6    NaN    NaN   NaN           NaN
7    NaN    NaN   NaN           NaN
8    NaN    NaN   NaN           NaN
9      m   34.0  10.0         123.0
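
To make the observation above concrete, here is a minimal sketch that checks what 'where' does to the shape, index, and dtypes. It uses a trimmed copy of the frame (only 'ID' and 'Goals'); the names `mask` and `masked` are introduced purely for illustration.

```python
import pandas as pd

# Trimmed copy of the frame from the question (two columns only).
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
})

mask = df['Goals'] > 10      # boolean Series, one entry per row
masked = df.where(mask)      # rows where mask is False become NaN

# 'where' keeps every row, so shape and index are unchanged.
print(masked.shape == df.shape)
print(masked.index.equals(df.index))

# The integer columns are upcast to float, since NaN is a float value.
print(masked.dtypes)
```

This would explain why the values print as 12.0, 1.0, and so on in the output: the result has the same number of rows as the input, and the missing entries force a float dtype.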

If, however, I use the 'loc' indexer:

df2 = df.loc[df['Goals']>10]

It returns only the rows that match, with no NaN rows:

  Gender  Goals  ID  Run Distance
0      m     12   1           234
1      m     23   2            35
2      m     56   3            77
9      m     34  10           123
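
For comparison, a small sketch of the 'loc' result on the same trimmed frame (only 'ID' and 'Goals'; `mask`, `subset`, and `same_rows` are illustrative names). It also checks that dropping the all-NaN rows from the 'where' result leaves the same row labels that 'loc' selects.

```python
import pandas as pd

# Trimmed copy of the frame from the question (two columns only).
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
})

mask = df['Goals'] > 10
subset = df.loc[mask]        # only rows where mask is True

print(subset.shape)          # (4, 2)
print(list(subset.index))    # [0, 1, 2, 9], original labels are kept
print(subset.dtypes)         # integer dtypes are preserved, no NaN introduced

# Dropping the all-NaN rows from the 'where' result gives the same row labels.
same_rows = df.where(mask).dropna(how='all').index.equals(subset.index)
print(same_rows)
```

So the two calls agree on which rows match; they differ in whether the non-matching rows are dropped ('loc') or kept as NaN placeholders ('where').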

So essentially, why would you use 'where' over 'loc'/'iloc', and why does it return NaN values?

