Answer a question

I have two boolean columns A and B in a pandas dataframe, each with missing data (represented by NaN). What I want is to do an AND operation on the two columns, but I want the resulting boolean column to be NaN if either of the original columns is NaN. I have the following table:

    A      B
0   True   True    
1   True   False   
2   False  True   
3   True   NaN    
4   NaN    NaN
5   NaN    False

Now when I do df.A & df.B I want:

0    True
1    False
2    False
3    NaN
4    NaN
5    False
dtype: bool

but instead I get:

0    True
1    False
2    False
3    True
4    True
5    False
dtype: bool

This behaviour is consistent with np.bool(np.nan) & np.bool(False) and its permutations, but what I really want is a column that tells me whether each row is certainly True for both columns or certainly not True for both. If I know both are True, the result should be True; if I know at least one is False, it should be False; otherwise I need NaN to show that the datum is missing.

Is there a way to achieve this?

Answers

Let's use np.logical_and:

import numpy as np
import pandas as pd
df = pd.DataFrame({'A':[True, True, False, True, np.nan, np.nan], 
                   'B':[True, False, True, np.nan, np.nan, False]})

s = np.logical_and(df['A'],df['B'])
print(s)

Output:

0     True
1    False
2    False
3      NaN
4      NaN
5    False
Name: A, dtype: object
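
A possible alternative, not part of the original answer: np.logical_and on object columns appears to follow Python's `and`, so a row with NaN in A and True in B may come out as True rather than NaN. That case does not occur in the sample data. The sketch below assumes pandas 1.0 or later, where the nullable "boolean" dtype is available and `&` follows three-valued (Kleene) logic. The names `nullable`, `result` and `extra` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [True, True, False, True, np.nan, np.nan],
                   'B': [True, False, True, np.nan, np.nan, False]})

# Convert the object columns to the nullable "boolean" extension dtype;
# NaN entries become pd.NA.
nullable = df.astype('boolean')

# With nullable booleans, & uses three-valued logic:
# False & NA is False, True & NA is NA, NA & NA is NA.
result = nullable['A'] & nullable['B']
print(result)

# An extra case that is not in the question's table: NA & True stays NA.
extra = pd.array([pd.NA], dtype='boolean') & pd.array([True], dtype='boolean')
print(extra)
```

With the sample data, rows 3 and 4 come out as <NA> and row 5 as False, which matches the result asked for in the question. The missing values are shown as <NA> rather than NaN, and the resulting dtype is boolean rather than object.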