Pandas中時間和日期處理

1、生成一個時間段

In [1]:import pandas as pd
In [2]:import numpy as np
#1、生成一個時間區(qū)間段,間隔為小時
In [3]:rng = pd.date_range('1/1/2011', periods=72, freq='H')
#2、生成一個Series,并制定索引為時間段
In [4]:ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [5]:ts
Out[5]: 
2011-01-01 00:00:00   -0.204085
2011-01-01 01:00:00    1.101711
2011-01-01 02:00:00    1.840500
2011-01-01 03:00:00    0.112426
2011-01-01 04:00:00   -0.310413
2011-01-01 05:00:00    1.180762
2011-01-01 06:00:00    0.087775
2011-01-01 07:00:00    1.087877
2011-01-01 08:00:00   -0.950237
2011-01-01 09:00:00   -0.468453
Freq: H, dtype: float64

#3、改變時間間隔
In [6]:converted = ts.asfreq('45Min', method='pad')
In [7]:converted
Out[7]: 
2011-01-01 00:00:00   -0.204085
2011-01-01 00:45:00   -0.204085
2011-01-01 01:30:00    1.101711
2011-01-01 02:15:00    1.840500
2011-01-01 03:00:00    0.112426
2011-01-01 03:45:00    0.112426
2011-01-01 04:30:00   -0.310413
2011-01-01 05:15:00    1.180762
2011-01-01 06:00:00    0.087775
2011-01-01 06:45:00    0.087775
2011-01-01 07:30:00    1.087877
2011-01-01 08:15:00   -0.950237
2011-01-01 09:00:00   -0.468453
Freq: 45T, dtype: float64

2、轉華為日期格式

2.1 數字生成日期格式

In [8]: pd.Timestamp(datetime(2012, 5, 1))
Out[8]: Timestamp('2012-05-01 00:00:00')

2.2 字符生成日期格式

In [9]: pd.Timestamp('2012-05-01')
Out[9]: Timestamp('2012-05-01 00:00:00')

2.3 只有年月

In [10]: pd.Period('2011-01')
Out[10]: Period('2011-01', 'M')

In [11]: pd.Period('2012-05', freq='D')
Out[11]: Period('2012-05-01', 'D')

2.4 轉化為日期格式

In [22]: pd.to_datetime(pd.Series(['Jul 31, 2009', '2010-01-10', None]))
Out[22]: 
0   2009-07-31
1   2010-01-10
2          NaT
dtype: datetime64[ns]

In [23]: pd.to_datetime(['2005/11/23', '2010.12.31'])
Out[23]: DatetimeIndex(['2005-11-23', '2010-12-31'], dtype='datetime64[ns]', freq=None)

3、生成一個時間段

3.1 生成索引的方法

In [35]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]

# Note the frequency information
In [36]: index = pd.DatetimeIndex(dates)

In [37]: index
Out[37]: DatetimeIndex(['2012-05-01', '2012-05-02', '2012-05-03'], dtype='datetime64[ns]', freq=None)

# Automatically converted to DatetimeIndex
In [38]: index = pd.Index(dates)

In [39]: index
Out[39]: DatetimeIndex(['2012-05-01', '2012-05-02', '2012-05-03'], dtype='datetime64[ns]', freq=None)


# date_range日歷,bdate_range工作日
In [40]: index = pd.date_range('2000-1-1', periods=1000, freq='M')

In [41]: index
Out[41]: 
DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-30',
               '2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31',
               '2000-09-30', '2000-10-31',
               ...
               '2082-07-31', '2082-08-31', '2082-09-30', '2082-10-31',
               '2082-11-30', '2082-12-31', '2083-01-31', '2083-02-28',
               '2083-03-31', '2083-04-30'],
              dtype='datetime64[ns]', length=1000, freq='M')

In [42]: index = pd.bdate_range('2012-1-1', periods=250)

In [43]: index
Out[43]: 
DatetimeIndex(['2012-01-02', '2012-01-03', '2012-01-04', '2012-01-05',
               '2012-01-06', '2012-01-09', '2012-01-10', '2012-01-11',
               '2012-01-12', '2012-01-13',
               ...
               '2012-12-03', '2012-12-04', '2012-12-05', '2012-12-06',
               '2012-12-07', '2012-12-10', '2012-12-11', '2012-12-12',
               '2012-12-13', '2012-12-14'],
              dtype='datetime64[ns]', length=250, freq='B')


In [44]: start = datetime(2011, 1, 1)

In [45]: end = datetime(2012, 1, 1)

In [46]: rng = pd.date_range(start, end)

In [47]: rng
Out[47]: 
DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04',
               '2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08',
               '2011-01-09', '2011-01-10',
               ...
               '2011-12-23', '2011-12-24', '2011-12-25', '2011-12-26',
               '2011-12-27', '2011-12-28', '2011-12-29', '2011-12-30',
               '2011-12-31', '2012-01-01'],
              dtype='datetime64[ns]', length=366, freq='D')

In [48]: rng = pd.bdate_range(start, end)

In [49]: rng
Out[49]: 
DatetimeIndex(['2011-01-03', '2011-01-04', '2011-01-05', '2011-01-06',
               '2011-01-07', '2011-01-10', '2011-01-11', '2011-01-12',
               '2011-01-13', '2011-01-14',
               ...
               '2011-12-19', '2011-12-20', '2011-12-21', '2011-12-22',
               '2011-12-23', '2011-12-26', '2011-12-27', '2011-12-28',
               '2011-12-29', '2011-12-30'],
              dtype='datetime64[ns]', length=260, freq='B')

3.2 每個月末,每隔一周

In [50]: pd.date_range(start, end, freq='BM')
Out[50]: 
DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29',
               '2011-05-31', '2011-06-30', '2011-07-29', '2011-08-31',
               '2011-09-30', '2011-10-31', '2011-11-30', '2011-12-30'],
              dtype='datetime64[ns]', freq='BM')

In [51]: pd.date_range(start, end, freq='W')
Out[51]: 
DatetimeIndex(['2011-01-02', '2011-01-09', '2011-01-16', '2011-01-23',
               '2011-01-30', '2011-02-06', '2011-02-13', '2011-02-20',
               '2011-02-27', '2011-03-06', '2011-03-13', '2011-03-20',
               '2011-03-27', '2011-04-03', '2011-04-10', '2011-04-17',
               '2011-04-24', '2011-05-01', '2011-05-08', '2011-05-15',
               '2011-05-22', '2011-05-29', '2011-06-05', '2011-06-12',
               '2011-06-19', '2011-06-26', '2011-07-03', '2011-07-10',
               '2011-07-17', '2011-07-24', '2011-07-31', '2011-08-07',
               '2011-08-14', '2011-08-21', '2011-08-28', '2011-09-04',
               '2011-09-11', '2011-09-18', '2011-09-25', '2011-10-02',
               '2011-10-09', '2011-10-16', '2011-10-23', '2011-10-30',
               '2011-11-06', '2011-11-13', '2011-11-20', '2011-11-27',
               '2011-12-04', '2011-12-11', '2011-12-18', '2011-12-25',
               '2012-01-01'],
              dtype='datetime64[ns]', freq='W-SUN')

3.3 從End往前數20個工作日,從start往后數20個工作日

In [52]: pd.bdate_range(end=end, periods=20)
Out[52]: 
DatetimeIndex(['2011-12-05', '2011-12-06', '2011-12-07', '2011-12-08',
               '2011-12-09', '2011-12-12', '2011-12-13', '2011-12-14',
               '2011-12-15', '2011-12-16', '2011-12-19', '2011-12-20',
               '2011-12-21', '2011-12-22', '2011-12-23', '2011-12-26',
               '2011-12-27', '2011-12-28', '2011-12-29', '2011-12-30'],
              dtype='datetime64[ns]', freq='B')

In [53]: pd.bdate_range(start=start, periods=20)
Out[53]: 
DatetimeIndex(['2011-01-03', '2011-01-04', '2011-01-05', '2011-01-06',
               '2011-01-07', '2011-01-10', '2011-01-11', '2011-01-12',
               '2011-01-13', '2011-01-14', '2011-01-17', '2011-01-18',
               '2011-01-19', '2011-01-20', '2011-01-21', '2011-01-24',
               '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28'],
              dtype='datetime64[ns]', freq='B')

4、根據部分索引選擇,切片

In [56]: rng = pd.date_range(start, end, freq='BM')

In [57]: ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [58]: ts.index
Out[58]: 
DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29',
               '2011-05-31', '2011-06-30', '2011-07-29', '2011-08-31',
               '2011-09-30', '2011-10-31', '2011-11-30', '2011-12-30'],
              dtype='datetime64[ns]', freq='BM')

In [59]: ts[:5].index
Out[59]: 
DatetimeIndex(['2011-01-31', '2011-02-28', '2011-03-31', '2011-04-29',
               '2011-05-31'],
              dtype='datetime64[ns]', freq='BM')

In [60]: ts[::2].index
Out[60]: 
DatetimeIndex(['2011-01-31', '2011-03-31', '2011-05-31', '2011-07-29',
               '2011-09-30', '2011-11-30'],
              dtype='datetime64[ns]', freq='2BM')

In [61]: ts['1/31/2011']
Out[61]: -1.2812473076599531

In [62]: ts[pd.datetime(2011, 12, 25):]
Out[62]: 
2011-12-30    0.687738
Freq: BM, dtype: float64

In [63]: ts['10/31/2011':'12/31/2011']
Out[63]: 
2011-10-31    0.149748
2011-11-30   -0.732339
2011-12-30    0.687738
Freq: BM, dtype: float64

In [64]: ts['2011']
Out[64]: 
2011-01-31   -1.281247
2011-02-28   -0.727707
2011-03-31   -0.121306
2011-04-29   -0.097883
2011-05-31    0.695775
2011-06-30    0.341734
2011-07-29    0.959726
2011-08-31   -1.110336
2011-09-30   -0.619976
2011-10-31    0.149748
2011-11-30   -0.732339
2011-12-30    0.687738
Freq: BM, dtype: float64

In [65]: ts['2011-6']
Out[65]: 
2011-06-30    0.341734
Freq: BM, dtype: float64


# DataFrame中指定了時間索引,可以根據時間索引提取子集
In [66]: dft = pd.DataFrame(np.random.randn(100000,1),columns=['A'],index=pd.date_range('20130101',periods=100000,freq='T'))


In [67]: dft
Out[67]: 
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00  1.252450
2013-03-11 10:36:00  0.046611
2013-03-11 10:37:00  0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00  0.841669

[100000 rows x 1 columns]

In [68]: dft['2013']
Out[68]: 
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00  1.252450
2013-03-11 10:36:00  0.046611
2013-03-11 10:37:00  0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00  0.841669

[100000 rows x 1 columns]


In [69]: dft['2013-1':'2013-2']
Out[69]: 
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-02-28 23:53:00  0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00  0.451943
2013-02-28 23:56:00  0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00  0.093915
2013-02-28 23:59:00 -1.087454

[84960 rows x 1 columns]


In [70]: dft['2013-1':'2013-2-28']
Out[70]: 
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-02-28 23:53:00  0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00  0.451943
2013-02-28 23:56:00  0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00  0.093915
2013-02-28 23:59:00 -1.087454

[84960 rows x 1 columns]


In [71]: dft['2013-1':'2013-2-28 00:00:00']
Out[71]: 
                            A
2013-01-01 00:00:00  0.176444
2013-01-01 00:01:00  0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00  0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
...                       ...
2013-02-27 23:54:00  0.897051
2013-02-27 23:55:00 -0.309230
2013-02-27 23:56:00  1.944713
2013-02-27 23:57:00  0.369265
2013-02-27 23:58:00  0.053071
2013-02-27 23:59:00 -0.019734
2013-02-28 00:00:00  1.388189

[83521 rows x 1 columns]


In [72]: dft['2013-1-15':'2013-1-15 12:30:00']
Out[72]: 
                            A
2013-01-15 00:00:00  0.501288
2013-01-15 00:01:00 -0.605198
2013-01-15 00:02:00  0.215146
2013-01-15 00:03:00  0.924732
2013-01-15 00:04:00 -2.228519
2013-01-15 00:05:00  1.517331
2013-01-15 00:06:00 -1.188774
...                       ...
2013-01-15 12:24:00  1.358314
2013-01-15 12:25:00 -0.737727
2013-01-15 12:26:00  1.838323
2013-01-15 12:27:00 -0.774090
2013-01-15 12:28:00  0.622261
2013-01-15 12:29:00 -0.631649
2013-01-15 12:30:00  0.193284

[751 rows x 1 columns]


In [73]: dft.loc['2013-1-15 12:30:00']
Out[73]: 
A    0.193284
Name: 2013-01-15 12:30:00, dtype: float64

6、常用時間

類別 解釋
year
month
day
hour
minute 分鐘
second
microsecond 微秒
nanosecond 納秒
date 返回日期
time 返回時間
dayofyear 年序日
weekofyear 年序周
week
dayofweek 周中的第幾天,Monday=0, Sunday=6
weekday 周中的第幾天,Monday=0, Sunday=6
weekday_name 周中的星期幾,ex: Friday
quarter 季度
days_in_month 一個月中有多少天
is_month_start 是否月初第一天
is_month_end 是否月末最后一天
is_quarter_start 是否季度的最開始
is_quarter_end 是否季度的最后一個
is_year_start 是否年初第一天
is_year_end 是否年末第一天

7、某一時間點,往前往后加一段時間

類別 解釋
BDay 工作日
CDay 自定義日期
Week
WeekOfMonth 月中的第幾周
LastWeekOfMonth 月中的最后一周
MonthEnd 日歷上月末
MonthBegin 日歷上月初
BMonthEnd 工作月初
BMonthBegin 月開始營業(yè)
CBMonthEnd 自定義月末
CBMonthBegin 自定義月初
QuarterEnd 日歷季末
QuarterBegin 日歷季初
BQuarterEnd 工作季末
BQuarterBegin 工作季初
FY5253Quarter retail (aka 52-53 week) quarter
YearEnd 日歷年末
YearBegin 日歷年初
BYearEnd 工作年末
BYearBegin 工作年初
FY5253 retail (aka 52-53 week) year
BusinessHour 工作小時
CustomBusinessHour 自定義小時
Hour 小時
Minute 分鐘
Second
In [84]: d = pd.datetime(2008, 8, 18, 9, 0)
In [86]: from pandas.tseries.offsets import *

In [87]: d + DateOffset(months=4, days=5)
Out[87]: Timestamp('2008-12-23 09:00:00')


In [88]: d - 5 * BDay()
Out[88]: Timestamp('2008-08-11 09:00:00')

# 月末
In [89]: d + BMonthEnd()
Out[89]: Timestamp('2008-08-29 09:00:00')


In [90]: d
Out[90]: datetime.datetime(2008, 8, 18, 9, 0)

# 往前數月末
In [91]: offset = BMonthEnd()

In [92]: offset.rollforward(d)
Out[92]: Timestamp('2008-08-29 09:00:00')

# 往后數月末
In [93]: offset.rollback(d)
Out[93]: Timestamp('2008-07-31 09:00:00')


# 時間方面的
In [94]: day = Day()

In [95]: day.apply(pd.Timestamp('2014-01-01 09:00'))
Out[95]: Timestamp('2014-01-02 09:00:00')

In [96]: day = Day(normalize=True)

In [97]: day.apply(pd.Timestamp('2014-01-01 09:00'))
Out[97]: Timestamp('2014-01-02 00:00:00')

In [98]: hour = Hour()

In [99]: hour.apply(pd.Timestamp('2014-01-01 22:00'))
Out[99]: Timestamp('2014-01-01 23:00:00')

In [100]: hour = Hour(normalize=True)

In [101]: hour.apply(pd.Timestamp('2014-01-01 22:00'))
Out[101]: Timestamp('2014-01-01 00:00:00')

In [102]: hour.apply(pd.Timestamp('2014-01-01 23:00'))
Out[102]: Timestamp('2014-01-02 00:00:00')


# 周相關的
In [103]: d
Out[103]: datetime.datetime(2008, 8, 18, 9, 0)

In [104]: d + Week()
Out[104]: Timestamp('2008-08-25 09:00:00')

In [105]: d + Week(weekday=4)
Out[105]: Timestamp('2008-08-22 09:00:00')

In [106]: (d + Week(weekday=4)).weekday()
Out[106]: 4

In [107]: d - Week()
Out[107]: Timestamp('2008-08-11 09:00:00')

8、時間序列相關的時間處理

In [213]: ts = ts[:5]

In [214]: ts.shift(1)
Out[214]: 
2011-01-31         NaN
2011-02-28   -1.281247
2011-03-31   -0.727707
2011-04-29   -0.121306
2011-05-31   -0.097883
Freq: BM, dtype: float64


In [215]: ts.shift(5, freq=datetools.bday)
Out[215]: 
2011-02-07   -1.281247
2011-03-07   -0.727707
2011-04-07   -0.121306
2011-05-06   -0.097883
2011-06-07    0.695775
dtype: float64

In [216]: ts.shift(5, freq='BM')
Out[216]: 
2011-06-30   -1.281247
2011-07-29   -0.727707
2011-08-31   -0.121306
2011-09-30   -0.097883
2011-10-31    0.695775
Freq: BM, dtype: float64


In [217]: ts.tshift(5, freq='D')
Out[217]: 
2011-02-05   -1.281247
2011-03-05   -0.727707
2011-04-05   -0.121306
2011-05-04   -0.097883
2011-06-05    0.695775
dtype: float64

最后編輯于
?著作權歸作者所有,轉載或內容合作請聯系作者
【社區(qū)內容提示】社區(qū)部分內容疑似由AI輔助生成,瀏覽時請結合常識與多方信息審慎甄別。
平臺聲明:文章內容(如有圖片或視頻亦包括在內)由作者上傳并發(fā)布,文章內容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務。

相關閱讀更多精彩內容

  • 國家電網公司企業(yè)標準(Q/GDW)- 面向對象的用電信息數據交換協議 - 報批稿:20170802 前言: 排版 ...
    庭說閱讀 12,537評論 6 13
  • 上次給大家分享了《2017年最全的excel函數大全(5)——邏輯函數》,這次分享給大家日期和時間函數(上)。 D...
    幸福的耗子閱讀 5,997評論 0 5
  • 為了我的愛,我的自由 為了實現理想中的歡欣與永久 發(fā)起一次逃離 將人間遠遠地甩在背后 別回頭,我的朋友 毫無顧忌地...
    梵文米娜閱讀 310評論 0 0
  • 聽! 滴--嗒--滴--嗒 哇!下雨了誒! 可是 為什么太陽爺爺還在天上呢? 我問媽媽 媽媽說 “這是太陽雨哦! ...
    顧汐若_思卿閱讀 379評論 0 3
  • 小時候起,她就覺得婚姻是墳墓,具體多小的時候,要從她有了記憶,她清晰的記得父母扭打在一起,婚姻便在她腦中有了具體的...
    佳妞依然閱讀 164評論 0 1

友情鏈接更多精彩內容