Timestamp Window Conversion

How the Problem Arose

I have recently been building a time-based statistics feature. The rough requirement is to aggregate data in windows of 1min, 10min, 2h, and 24h. The time field in the raw data is a millisecond timestamp, so the idea is simple: subtract from the timestamp its remainder modulo the window size. This works fine for the 1min, 10min, and 2h windows, but it goes wrong for the 24h window, as the test below shows.

/**
 * Returns the start of the time window that the given timestamp falls into,
 * ideally aligned to the current day.
 *
 * @param timestamp timestamp, in ms
 * @param scale     window size, in ms
 * @return start of the window, in ms
 */
public static long getTimestampWindow(long timestamp, long scale) {
    long remain = timestamp % scale;
    return timestamp - remain;
}

The above is the timestamp window algorithm in use.

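// DateUtils and DateFormatUtils come from Apache Commons Lang 3.
// DateUtil.DATE_PATTERN and DateUtil.LONG_DATE_PATTERN are project constants not shown here;
// judging by the output they are presumably "yyyy-MM-dd" and "yyyy-MM-dd HH:mm:ss".
public static void main(String[] args) throws ParseException {
    long scale = TimeUnit.HOURS.toMillis(24);
    // Walk through every hour of 2022-06-24 (default time zone) and print its window.
    for (int i = 0; i < 24; i++) {
        Date date = DateUtils.addHours(DateUtils.parseDate("2022-06-24", DateUtil.DATE_PATTERN), i);
        Date scaledDate = new Date(getTimestampWindow(date.getTime(), scale));
        System.out.printf("Current time: %s, window time: %s%n",
                          DateFormatUtils.format(date, DateUtil.LONG_DATE_PATTERN),
                          DateFormatUtils.format(scaledDate, DateUtil.LONG_DATE_PATTERN));
    }
}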

Running the test program above shows that the window is not the expected "2022-06-24 00:00:00". Instead, two windows are printed, "2022-06-23 08:00:00" and "2022-06-24 08:00:00", so a single day's data is split across windows that start on different days.

Of course, Java's date functions would solve this easily, but the time data we return is a timestamp, so plain arithmetic on it should be the most efficient approach.

Current time: 2022-06-24 00:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 01:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 02:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 03:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 04:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 05:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 06:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 07:00:00, window time: 2022-06-23 08:00:00
Current time: 2022-06-24 08:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 09:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 10:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 11:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 12:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 13:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 14:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 15:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 16:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 17:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 18:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 19:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 20:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 21:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 22:00:00, window time: 2022-06-24 08:00:00
Current time: 2022-06-24 23:00:00, window time: 2022-06-24 08:00:00
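
For comparison, the date-function route mentioned above could look roughly like the following for the 24h case. This is only a sketch using java.time; the class name DayWindowWithJavaTime and the method name startOfDay are made up for illustration and are not part of the original code.

```java
import java.time.Instant;
import java.time.ZoneId;

class DayWindowWithJavaTime {
    /** Returns the epoch-ms timestamp of local midnight of the day that contains the given timestamp. */
    static long startOfDay(long timestamp, ZoneId zone) {
        return Instant.ofEpochMilli(timestamp)
                .atZone(zone)            // interpret the instant in the given zone
                .toLocalDate()           // drop the time-of-day part
                .atStartOfDay(zone)      // local midnight of that date
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Asia/Shanghai");
        long ts = 1656003600000L; // 2022-06-24 01:00:00 in UTC+8
        System.out.println(startOfDay(ts, zone)); // 1656000000000 = 2022-06-24 00:00:00 in UTC+8
    }
}
```

This handles the day window correctly but allocates several intermediate objects per call, which is the cost the arithmetic approach tries to avoid.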

Problem Analysis

Why does this happen? The root cause is that our time zone is UTC+8, which is 8 hours ahead of Greenwich Mean Time. Timestamp 0 represents "1970-01-01 00:00:00" in the zero time zone (UTC/GMT+0), but "1970-01-01 08:00:00" in UTC+8 (UTC/GMT+08:00).

public static void main(String[] args) {
    System.out.println(TimeZone.getDefault());
    System.out.println(DateFormatUtils.format(new Date(0), DateUtil.LONG_DATE_PATTERN));
    System.out.println(TimeZone.getTimeZone("UTC"));
    System.out.println(DateFormatUtils.format(new Date(0), DateUtil.LONG_DATE_PATTERN, TimeZone.getTimeZone("UTC")));
}

sun.util.calendar.ZoneInfo[id="Asia/Shanghai",offset=28800000,dstSavings=0,useDaylight=false,transitions=19,lastRule=null]
1970-01-01 08:00:00
sun.util.calendar.ZoneInfo[id="UTC",offset=0,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
1970-01-01 00:00:00

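We only discuss windows that divide 24 hours evenly here, such as 2h and 12h, and not windows such as 5h or 7h, because those inevitably cross day boundaries. In other words, when the window is computed with the remainder, a window that divides 24h still crosses day boundaries if it is larger than 8 hours (12h, 24h) or does not divide 8 hours evenly (3h, 6h). Both cases come down to the same condition: the 8-hour offset is not a multiple of the window size.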
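
The rule above can be checked mechanically. The following is a minimal sketch, assuming the window divides 24h evenly and the zone has a fixed positive offset; the class and method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

class WindowAlignmentCheck {
    /**
     * True if windows of windowMs, counted from epoch 0, have a boundary at local midnight.
     * Assumes windowMs divides 24h evenly and zoneOffsetMs is a fixed, non-negative offset.
     */
    static boolean alignsWithLocalMidnight(long windowMs, long zoneOffsetMs) {
        return zoneOffsetMs % windowMs == 0;
    }

    public static void main(String[] args) {
        long offset = TimeUnit.HOURS.toMillis(8); // UTC+8, as in the article
        int[] windowHours = {1, 2, 3, 4, 6, 8, 12, 24}; // every whole-hour divisor of 24
        for (int h : windowHours) {
            boolean ok = alignsWithLocalMidnight(TimeUnit.HOURS.toMillis(h), offset);
            System.out.printf("%2dh window: %s%n", h, ok ? "stays within one day" : "crosses midnight");
        }
    }
}
```

For UTC+8 this reports 1h, 2h, 4h, and 8h as staying within one day, and 3h, 6h, 12h, and 24h as crossing midnight.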

[Figure: window size 6h]
[Figure: window size 12h]
[Figure: window size 24h]

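In the figures above, entries marked with a - sign denote times on the previous day. Take a moment to check whether this matches your expectation. It is a bit convoluted; the thing to remember is that the starting timestamp 0 stands for 08:00.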
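
The window boundaries for these three sizes can also be computed directly. The sketch below is an illustration only, assuming a whole-hour UTC offset; the helper localBoundaryHours is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class WindowBoundaries {
    /** Local hours of day (0-23) at which windows of windowHours start, for a whole-hour UTC offset. */
    static List<Integer> localBoundaryHours(int windowHours, int offsetHours) {
        List<Integer> hours = new ArrayList<>();
        // Windows counted from epoch 0 start at UTC hours 0, w, 2w, ...; shift each into local time.
        for (int utcHour = 0; utcHour < 24; utcHour += windowHours) {
            hours.add(Math.floorMod(utcHour + offsetHours, 24));
        }
        return hours;
    }

    public static void main(String[] args) {
        for (int w : new int[] {6, 12, 24}) {
            System.out.println(w + "h window starts at local hours " + localBoundaryHours(w, 8));
        }
    }
}
```

In UTC+8 the 6h windows start at 08:00, 14:00, 20:00, and 02:00, so the window that starts at 20:00 runs into the next day. The 12h windows start at 08:00 and 20:00, and the single 24h window starts at 08:00.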

Once this is understood, it becomes clear that for a 24h window it is actually correct for data from 01:00 today to fall into the window that starts at 08:00 yesterday, even though that looks odd. With a 24h window, our intuition says that all data produced today should belong to today.
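
To make the 01:00 case concrete, here is the arithmetic with actual numbers. It is a small sketch that repeats the remainder-based calculation; the class name and the literal timestamp are chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

class NaiveWindowArithmetic {
    /** Same remainder-based calculation as the original getTimestampWindow. */
    static long naiveWindow(long timestamp, long scale) {
        return timestamp - timestamp % scale;
    }

    public static void main(String[] args) {
        long day = TimeUnit.HOURS.toMillis(24);
        long ts = 1656003600000L;           // 2022-06-24 01:00:00 in UTC+8 (2022-06-23 17:00:00 UTC)
        long remain = ts % day;             // 61200000 ms = 17h since the last UTC midnight
        long window = naiveWindow(ts, day); // 1655942400000 = 2022-06-23 00:00:00 UTC
        System.out.println(TimeUnit.MILLISECONDS.toHours(remain)); // 17
        System.out.println(window);         // 1655942400000, i.e. 2022-06-23 08:00:00 in UTC+8
    }
}
```

The remainder counts from UTC midnight, not local midnight, which is why the window start lands on 08:00 of the previous local day.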

Solution

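Now that the cause and the desired behavior are known, the problem is fairly easy to solve. The idea is simple: first shift the timestamp by 8 hours (add the offset) so that "timestamp 0 represents 00:00". After the window has been computed, shift the window back by 8 hours (subtract the offset) to get the real window.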

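    // Not shown in the original; assumed here to be the UTC+8 offset in ms.
    private static final long TIMESTAMP_8H = TimeUnit.HOURS.toMillis(8);

    /**
     * Returns the start of the time window that the given timestamp falls into,
     * ideally aligned to the current day.
     *
     * @param timestamp timestamp, in ms
     * @param scale     window size, in ms
     * @return start of the window, in ms
     */
    public static long getTimestampWindow(long timestamp, long scale) {
        // Shift into UTC+8 local time so that 0 corresponds to local midnight.
        timestamp = timestamp + TIMESTAMP_8H;
        long remain = timestamp % scale;
        // Truncate to the window start, then shift back to a real timestamp.
        return timestamp - remain - TIMESTAMP_8H;
    }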

With the algorithm changed as shown above, running the test program again shows that the windows now match our expectations.

Current time: 2022-06-24 00:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 01:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 02:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 03:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 04:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 05:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 06:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 07:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 08:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 09:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 10:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 11:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 12:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 13:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 14:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 15:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 16:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 17:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 18:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 19:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 20:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 21:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 22:00:00, window time: 2022-06-24 00:00:00
Current time: 2022-06-24 23:00:00, window time: 2022-06-24 00:00:00
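
The fix hard-codes 8 hours, so it only fits UTC+8. One possible generalization, which is not part of the original solution, is to look up the offset from a java.util.TimeZone and to use Math.floorMod so that timestamps before 1970 also truncate downward. The sketch below assumes the offset at the given instant also applies at the window start, which may not hold around daylight-saving transitions; the class name ZoneAwareWindow is hypothetical.

```java
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

class ZoneAwareWindow {
    /**
     * Start of the window containing timestamp, aligned to local midnight of the given zone.
     * Assumption: the zone offset at timestamp is also valid at the window start.
     */
    static long getTimestampWindow(long timestamp, long scale, TimeZone zone) {
        long offset = zone.getOffset(timestamp);   // ms to add to UTC to get local time
        long local = timestamp + offset;           // shift so that 0 means local midnight
        return local - Math.floorMod(local, scale) - offset;
    }

    public static void main(String[] args) {
        long day = TimeUnit.HOURS.toMillis(24);
        long ts = 1656003600000L; // 2022-06-24 01:00:00 in UTC+8
        // 1656000000000 = 2022-06-24 00:00:00 in UTC+8
        System.out.println(getTimestampWindow(ts, day, TimeZone.getTimeZone("Asia/Shanghai")));
        // 1655942400000 = 2022-06-23 00:00:00 UTC
        System.out.println(getTimestampWindow(ts, day, TimeZone.getTimeZone("UTC")));
    }
}
```

With Asia/Shanghai this gives the same result as the hard-coded version, and with UTC it reduces to the original remainder-based calculation.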