Answer a question

When I query a service's API for daily data, it adds a time portion equal to the time at which the query was made. For example, when I called the function at 14:54:36, my pandas DataFrame looked like this:

2018-05-16 14:54:36  1024.75  1008.25      ...        39221        242897
2018-05-17 14:54:36  1017.00  1002.00      ...        35361        241132
2018-05-18 14:54:36  1015.75  1002.75      ...        49090        242938
2018-05-21 14:54:36  1034.50  1020.75      ...        56950        243316
2018-05-22 14:54:36  1043.75  1028.50      ...        49724        247874
2018-05-23 14:54:36  1049.00  1036.25      ...        46256        253609
2018-05-24 14:54:36  1059.75  1047.00      ...        65352        259617

As this is daily data, the time portion is useless. When I do:

data = pd.read_csv(StringIO(data), index_col=0, header=None,names=['High','Low','Open','Close','Volume','OpenInterest'])
data.index = pd.to_datetime(data.index,format="%Y-%m-%d")

The format argument doesn't seem to have any effect: the DatetimeIndex still contains the time. Any idea how I can remove the time portion?

Answers

You can maintain the datetime functionality and set the time portion to 00:00:00 with normalize.

df.index = df.index.normalize()

# For datetime64[ns] columns (rather than the index), use the `.dt` accessor:
# df['column'] = df['column'].dt.normalize()

import pandas as pd
df = pd.DataFrame([1, 2, 3, 4], index=pd.date_range('2018', periods=4, freq='H'))

df.index = df.index.normalize()

print(df)
#            0
#2018-01-01  1
#2018-01-01  2
#2018-01-01  3
#2018-01-01  4
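
The `.dt` variant mentioned in the comment can be sketched the same way. This is only an illustration: the frame and the column name `when` are made up for the example, and the two timestamps are borrowed from the question.

```python
import pandas as pd

# Hypothetical frame with a datetime column called 'when' (illustrative name only).
frame = pd.DataFrame({
    'when': pd.to_datetime(['2018-05-16 14:54:36', '2018-05-17 14:54:36']),
    'High': [1024.75, 1017.00],
})

# Series.dt.normalize() sets every timestamp to midnight and keeps the datetime dtype.
frame['when'] = frame['when'].dt.normalize()

print(frame)
```

The column stays a datetime column, so date arithmetic and resampling should keep working on it.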

Looking at the index:

df.index
#DatetimeIndex(['2018-01-01', '2018-01-01', '2018-01-01', '2018-01-01'], dtype='datetime64[ns]', freq=None)

And the values are Timestamps:

df.index[0]
#Timestamp('2018-01-01 00:00:00')
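
Applied to the data in the question, a minimal sketch might look like the following. It assumes the API returns comma-separated text. Only the columns visible in the question are used, because Open and Close are elided there, and the index name `Date` is added purely for illustration.

```python
from io import StringIO

import pandas as pd

# Two rows taken from the question. The Open and Close values are elided
# in the question's output, so those columns are left out of this sketch.
raw = (
    "2018-05-16 14:54:36,1024.75,1008.25,39221,242897\n"
    "2018-05-17 14:54:36,1017.00,1002.00,35361,241132\n"
)

# 'Date' is an illustrative name for the first column, which becomes the index.
data = pd.read_csv(
    StringIO(raw),
    index_col=0,
    header=None,
    names=['Date', 'High', 'Low', 'Volume', 'OpenInterest'],
)

# Parse the strings into timestamps, then reset the time portion to midnight.
data.index = pd.to_datetime(data.index).normalize()

print(data)
```

The `format` argument of `pd.to_datetime` only describes how the input strings are parsed. A `DatetimeIndex` always stores full timestamps, which is why normalizing afterwards is what clears the time portion.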