Answer a question

Is there a way to convert values like '34%' directly to int or float when using read_csv in pandas? I want '34%' to be directly read as 0.34

  1. Using this in read_csv did not work:

    read_csv(..., dtype={'col':np.float})

  2. After loading the csv as 'df', this also failed, with the error "invalid literal for float(): 34%":

    df['col'] = df['col'].astype(float)

  3. I ended up using this which works but is long winded:

    df['col'] = df['col'].apply(lambda x: np.nan if x in ['-'] else x[:-1]).astype(float)/100

Answers

You can define a custom function to convert your percents to floats at read_csv() time:

import io
import pandas as pd

# dummy data
temp1 = """index col
113 34%
122 50%
123 32%
301 12%"""

# Custom function taken from https://stackoverflow.com/questions/12432663/what-is-a-clean-way-to-convert-a-string-percent-to-a-float
def p2f(x):
    return float(x.strip('%'))/100

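# Pass to the `converters` param as a dict mapping column name -> function.
# The separator is a raw string so that `\s` is not treated as a string escape.
df = pd.read_csv(io.StringIO(temp1), sep=r'\s+', index_col=[0], converters={'col': p2f})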
df

        col
index      
113    0.34
122    0.50
123    0.32
301    0.12

# Check that dtypes really are floats
df.dtypes

col    float64
dtype: object

The percent-to-float code is courtesy of ashwini's answer to "What is a clean way to convert a string percent to a float?"
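The question's own workaround also maps '-' to NaN, which p2f as written would not handle (float('-') raises ValueError). Here is a minimal sketch of two ways to cover that case. It assumes '-' and empty strings are the only placeholders; the names percent_to_float and sample, and the comma-separated sample data, are made up for illustration.

```python
import io

import numpy as np
import pandas as pd


def percent_to_float(text):
    """Convert a string such as '34%' to 0.34; treat '-' or '' as missing."""
    text = text.strip()
    if text in ("", "-"):  # assumed placeholders for missing values
        return np.nan
    return float(text.rstrip("%")) / 100


# Hypothetical sample data containing one '-' placeholder
sample = "index,col\n113,34%\n122,-\n123,32%\n"

# Option 1: convert while reading, via `converters`
df = pd.read_csv(
    io.StringIO(sample),
    index_col=0,
    converters={"col": percent_to_float},
)

# Option 2: read the column as text, then convert it in one vectorized step;
# errors="coerce" turns anything non-numeric (here '-') into NaN
df2 = pd.read_csv(io.StringIO(sample), index_col=0, dtype={"col": str})
df2["col"] = pd.to_numeric(df2["col"].str.rstrip("%"), errors="coerce") / 100
```

Option 1 keeps everything inside read_csv, which is what the question asked for. Option 2 is a shorter replacement for the apply/lambda line in the question if converting after loading is acceptable.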
