I was reading a file saved as UTF-8 in Python. Its first line was empty, so I used line.strip() == '' to test for a blank line, but the check gave the wrong result.

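After line.strip(), the value printed looked like '', so why was it not equal to ''? len(line.strip()) turned out to be 3, so it was clearly not empty. I then passed the result to repr() to see the escaped form and found the bytes \xef\xbb\xbf. What do they mean?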
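The symptom can be reproduced without a file. The following is a minimal sketch, assuming the line is handled as raw bytes (which is what Python 2's plain open() returns); raw_line is a made-up stand-in for the first line of such a file:

```python
# Hypothetical first line of the file: three leading bytes, then a newline.
raw_line = b'\xef\xbb\xbf\n'

# strip() removes whitespace only, so the three leading bytes survive.
stripped = raw_line.strip()

print(repr(stripped))    # shows the escaped bytes instead of an "empty" string
print(len(stripped))     # 3
print(stripped == b'')   # False
```

Printing the value directly may show nothing visible in a terminal, which is why repr() is the more reliable way to inspect it.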

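EF BB BF is a file marker known as the Byte order mark (BOM). It is the UTF-8 encoding of the character U+FEFF, and some editors write it at the start of a file to indicate that the file is UTF-8 encoded.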
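As a quick check of where these three bytes come from, the sketch below encodes U+FEFF as UTF-8 and compares the result with the constant codecs.BOM_UTF8 from the standard library; the variable name bom_bytes is only for illustration:

```python
import codecs

# U+FEFF is the BOM code point; encoding it as UTF-8 yields three bytes.
bom_bytes = u'\ufeff'.encode('utf-8')

print(repr(bom_bytes))                 # the bytes EF BB BF
print(bom_bytes == codecs.BOM_UTF8)    # True
```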

For how to handle it, see the first answer to "Reading Unicode file data with BOM chars in Python", quoted below:

There is no reason to check if a BOM exists or not, utf-8-sig manages that for you and behaves exactly as utf-8 if the BOM does not exist:

# Standard UTF-8 without BOM
>>> b'hello'.decode('utf-8')
'hello'
>>> b'hello'.decode('utf-8-sig')
'hello'

# BOM encoded UTF-8
>>> b'\xef\xbb\xbfhello'.decode('utf-8')
'\ufeffhello'
>>> b'\xef\xbb\xbfhello'.decode('utf-8-sig')
'hello'

In the example above, you can see utf-8-sig correctly decodes the given string regardless of the existence of BOM. If you think there is even a small chance that a BOM character might exist in the files you are reading, just use utf-8-sig and not worry about it.

So I now read the file using utf-8-sig. In Python 2.7 the code looks like this:

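import codecs

# 'utf-8-sig' drops a leading BOM if present; otherwise it behaves like 'utf-8'
with codecs.open(file_path, 'r', 'utf-8-sig') as fh:
    for line in fh:
        is_blank = line.strip() == u''  # now True for an empty first line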
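For Python 3, the built-in open() accepts the same codec name through its encoding parameter. The sketch below is one way to check the behaviour end to end; the helper read_lines and the temporary demo file are assumptions made for illustration, not part of the original code:

```python
import codecs
import os
import tempfile


def read_lines(path):
    """Return the lines of a UTF-8 text file, ignoring a leading BOM if present."""
    with open(path, 'r', encoding='utf-8-sig') as fh:
        return [line.rstrip('\n') for line in fh]


# Build a throwaway file whose first line is empty but preceded by a BOM.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as out:
    out.write(codecs.BOM_UTF8 + b'\nhello\n')

lines = read_lines(demo_path)
os.remove(demo_path)

print(lines[0].strip() == '')   # True: the blank-line test works again
```

Opening the same file with encoding='utf-8' instead would leave '\ufeff' at the start of the first line, so the blank-line test would fail in the same way as before.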

Reading a UTF-8 encoded file in Python yields \xef\xbb\xbf
