Answer a question

I'm trying to replace some empty list in my data with a NaN values. But how to represent an empty list in the expression?

import numpy as np
import pandas as pd
d = pd.DataFrame({'x' : [[1,2,3], [1,2], ["text"], []], 'y' : [1,2,3,4]})
d

    x           y
0   [1, 2, 3]   1
1   [1, 2]      2
2   [text]      3
3   []          4



d.loc[d['x'] == [],['x']] = d.loc[d['x'] == [],'x'].apply(lambda x: np.nan)
d

ValueError: Arrays were different lengths: 4 vs 0

And, I want to select [text] by using d[d['x'] == ["text"]] with a ValueError: Arrays were different lengths: 4 vs 1 error, but select 3 by using d[d['y'] == 3] is correct. Why?

Answers

If you wish to replace empty lists in the column x with numpy nan's, you can do the following:

d.x = d.x.apply(lambda y: np.nan if len(y)==0 else y)

If you want to subset the dataframe on rows equal to ['text'], try the following:

d[[y==['text'] for y in d.x]]

I hope this helps.

Logo

Python社区为您提供最前沿的新闻资讯和知识内容

更多推荐