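1. Filtering data
- Filter out the negative numbers in a list
- Select the dictionary entries whose values are 90 or higher
- Select the elements of a set that are divisible by 3
Consider using the filter function or a comprehension (list, dict, or set):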
from random import randint
data_list = [randint(-10, 10) for x in range(10)]
print(data_list)
print(list(filter(lambda x: x >= 0, data_list)))
print([x for x in data_list if x >= 0])
print("=============================================")
data_dict = {x: randint(60, 100) for x in range(10)}
print(data_dict)
print({k: v for k, v in data_dict.items() if v >= 90})
print("=============================================")
data_set = {randint(0, 10) for x in range(10)}
print(data_set)
print({x for x in data_set if x % 3 == 0})
Both approaches work, but the comprehension is usually faster and is the preferred choice.
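To check the speed claim on your own machine, a rough timing sketch with the standard timeit module might look like the following. The helper names keep_with_filter and keep_with_comprehension are hypothetical, introduced only for this illustration. The measured times depend on the interpreter and the data size, so no expected timings are shown.

```python
from random import randint
from timeit import timeit

data_list = [randint(-10, 10) for _ in range(1000)]


def keep_with_filter(values):
    # Hypothetical helper: filter() calls the lambda once per element.
    return list(filter(lambda x: x >= 0, values))


def keep_with_comprehension(values):
    # Hypothetical helper: the comprehension evaluates the condition inline.
    return [x for x in values if x >= 0]


# Both approaches must produce the same result.
assert keep_with_filter(data_list) == keep_with_comprehension(data_list)

# Time 200 repetitions of each approach on the same data.
t_filter = timeit(lambda: keep_with_filter(data_list), number=200)
t_comp = timeit(lambda: keep_with_comprehension(data_list), number=200)
print(f"filter: {t_filter:.4f}s, comprehension: {t_comp:.4f}s")
```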
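2. Naming tuple elements
To reduce storage overhead when there are many records, you can store them as tuples. For example, there may be many students whose records all have the same fields. Code that uses these tuples typically accesses individual values by index, which hurts readability. How can each element of the tuple be given a name so the program is easier to read?
Option 1:
Define something like the enumeration types of other languages, that is, a series of numeric constants.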
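A minimal sketch of this first option, assuming a hypothetical student record with four fields. The constant names and the Field class are illustrative and do not come from the original notes. The second half uses enum.IntEnum from the standard library as one possible way to group the constants.

```python
from enum import IntEnum

# Hypothetical student record stored as a plain tuple.
student = ('Dai', 22, 'male', 'beijing')

# Named index constants, similar to an enum in other languages.
NAME, AGE, SEX, ADDR = range(4)

print(student[NAME])  # instead of student[0]
print(student[AGE])   # instead of student[1]


# The same idea with IntEnum; members are ints, so they work as indexes.
class Field(IntEnum):
    NAME = 0
    AGE = 1
    SEX = 2
    ADDR = 3


print(student[Field.ADDR])
```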
Option 2:
Use collections.namedtuple from the standard library in place of the built-in tuple:
from collections import namedtuple
student = namedtuple('student', ['name', 'age', 'sex', 'addr'])
s = student('Dai', '22', 'male', 'beijing')
print(s)
print(s.name)
print(isinstance(s, tuple))
namedtuple creates a subclass of tuple, so a namedtuple can be used anywhere a tuple is used.
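As a short illustration of that interchangeability, the sketch below treats a namedtuple instance as an ordinary tuple and then uses two of its own helper methods, _replace and _asdict. The Student type and its values are sample data chosen for the example.

```python
from collections import namedtuple

Student = namedtuple('Student', ['name', 'age', 'sex', 'addr'])
s = Student('Dai', 22, 'male', 'beijing')

# Works anywhere a tuple does: indexing, len(), and unpacking.
name, age, sex, addr = s
print(s[0], len(s))

# Instances are immutable; _replace returns a modified copy.
older = s._replace(age=23)
print(older.age, s.age)

# _asdict converts the record to a dictionary keyed by field name.
print(s._asdict())
```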
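3. Counting frequencies
Find the three most frequent elements in a sequence;
Find the ten most frequent words in a file.
The straightforward approach is to create a dictionary whose keys are the letters or numbers that can appear in the sequence, with every value initialized to 0. Then iterate over the sequence and add 1 to the matching entry for each item. Finally, sort the dictionary by value and take the three largest.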
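The manual approach described above can be sketched as follows. The sample data is made up for the illustration; ties are resolved by first appearance because sorted is stable.

```python
data = [3, 1, 3, 2, 3, 2, 5]

# Start every distinct element at zero, then count occurrences.
counts = dict.fromkeys(data, 0)
for item in data:
    counts[item] += 1

# Sort the (element, count) pairs by count, highest first, and keep three.
top_three = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)[:3]
print(top_three)  # [(3, 3), (2, 2), (1, 1)]
```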
Below is a simpler implementation that uses collections.Counter:
from collections import Counter
from random import randint
data = [randint(0, 20) for _ in range(1, 15)]
print(Counter(data).items())
print(Counter(data).most_common(3))
print(isinstance(Counter(data), dict))
# dict_items([(1, 1), (2, 1), (3, 3), (4, 2), (5, 1), (6, 1), (7, 1), (8, 1), (20, 1), (12, 1), (15, 1)])
# [(3, 3), (4, 2), (1, 1)]
# True
For text, you can use a regular expression to split the whole string on non-word characters into a list, then call Counter to compute the result.
import re
zen = """
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!"""
data_zen = re.split(r"\W+", zen)  # raw string avoids the invalid-escape warning for "\W"
print(Counter(data_zen).most_common(10))
# [('is', 10), ('better', 8), ('than', 8), ('to', 5), ('the', 5), ('idea', 3), ('Although', 3), ('it', 3), ('be', 3), ('never', 3)]
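For the file variant of the problem, one possible sketch reads the file and counts its words the same way. The function name top_words is hypothetical, and the temporary file only stands in for a real document. Lowercasing the text is an added assumption so that "Now" and "now" count as one word; the snippet above does not do this.

```python
import os
import re
import tempfile
from collections import Counter


def top_words(path, n=10):
    # Hypothetical helper: read the file, lowercase it, and extract words.
    with open(path, encoding='utf-8') as f:
        words = re.findall(r"\w+", f.read().lower())
    return Counter(words).most_common(n)


# Demo with a temporary file standing in for a real document.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write("Now is better than never. "
              "Although never is often better than right now.")
    sample_path = tmp.name

result = top_words(sample_path, 3)
print(result)
os.remove(sample_path)
```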
4. Sorting a dictionary
Sort a dictionary by its values.
Use the built-in sorted function, which is efficient.
Option 1:
Use zip to turn the dictionary into (value, key) tuples, then sort them:
from random import randint
data = {x: randint(60, 100) for x in "abcdefg"}
print(sorted(data))
print(sorted(zip(data.values(), data.keys())))
# ['a', 'b', 'c', 'd', 'e', 'f', 'g']
# [(60, 'f'), (66, 'b'), (74, 'c'), (80, 'e'), (94, 'a'), (94, 'd'), (96, 'g')]
Option 2:
Pass a key function to sorted:
print(sorted(data.items(), key=lambda x: x[1]))
# [('a', 72), ('c', 74), ('e', 78), ('g', 91), ('b', 92), ('d', 94), ('f', 97)]
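Because the snippets above use random data, their printed results change on every run. The sketch below uses fixed sample scores (made up for illustration) so the ordering can be verified, and it swaps the lambda for operator.itemgetter, which is an equivalent key function.

```python
from operator import itemgetter

scores = {'a': 94, 'b': 66, 'c': 74, 'd': 94}

# Ascending by value; ties keep insertion order because sorted() is stable.
ascending = sorted(scores.items(), key=itemgetter(1))
print(ascending)   # [('b', 66), ('c', 74), ('a', 94), ('d', 94)]

# Descending by value.
descending = sorted(scores.items(), key=itemgetter(1), reverse=True)
print(descending)  # [('a', 94), ('d', 94), ('c', 74), ('b', 66)]

# The zip approach sorts by value first and breaks ties by key.
by_zip = sorted(zip(scores.values(), scores.keys()))
print(by_zip)      # [(66, 'b'), (74, 'c'), (94, 'a'), (94, 'd')]
```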
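5. Common keys across dictionaries
Given several dictionaries, find the keys they all share.
The idea is to use set intersection, which is an efficient solution. In Python 3, dict.keys() returns a set-like view of the keys (Python 2 provided this through viewkeys()). Use map to obtain the key views of all the dictionaries, then use reduce to take the intersection of those views.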
from random import randint, sample
from functools import reduce
data1 = {x: randint(1, 4) for x in sample("abcdefg", randint(3, 6))}
data2 = {x: randint(1, 4) for x in sample("abcdefg", randint(3, 6))}
data3 = {x: randint(1, 4) for x in sample("abcdefg", randint(3, 6))}
print(data1.keys() & data2.keys() & data3.keys())
print(reduce(lambda a, b: a & b, map(dict.keys, [data1, data2, data3])))
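With fixed sample dictionaries (made up for this illustration) the result is easy to check. The second form, set.intersection with several arguments, is an alternative not used in the notes above; it accepts any iterables, including dictionaries.

```python
from functools import reduce

data1 = {'a': 1, 'b': 2, 'c': 3}
data2 = {'b': 4, 'c': 1, 'd': 2}
data3 = {'c': 2, 'b': 3, 'e': 1}

# Intersect the key views pairwise with reduce.
common = reduce(lambda a, b: a & b, map(dict.keys, [data1, data2, data3]))
print(sorted(common))      # ['b', 'c']

# Alternative: build one set and intersect it with the other dictionaries.
common_alt = set(data1).intersection(data2, data3)
print(sorted(common_alt))  # ['b', 'c']
```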
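6. Dictionary ordering
In Python versions before 3.7, the built-in dict did not guarantee that items were stored in insertion order. Since Python 3.7, dict preserves insertion order as part of the language. If you need ordering on older versions, or order-specific operations such as move_to_end, you can use: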
from collections import OrderedDict
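A brief sketch of what OrderedDict offers beyond insertion order. The names and ranks are sample data for illustration only.

```python
from collections import OrderedDict

ranking = OrderedDict()
ranking['Jim'] = 1
ranking['Leo'] = 2
ranking['Bob'] = 3
print(list(ranking))  # ['Jim', 'Leo', 'Bob']

# Move an existing key to the end of the order.
ranking.move_to_end('Jim')
print(list(ranking))  # ['Leo', 'Bob', 'Jim']

# Pop the oldest entry (first in order) instead of the newest.
first = ranking.popitem(last=False)
print(first)          # ('Leo', 2)

# Equality between two OrderedDicts is order-sensitive, unlike dict.
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))  # False
print(dict(a=1, b=2) == dict(b=2, a=1))                # True
```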
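7. History feature
Look up the values the user entered most recently, and save them so that they can still be queried the next time the program runs.
The approach is to use Python's double-ended queue, deque, and then save the Python object to a file with pickle.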
import os
import pickle
from random import randint
from collections import deque

N = randint(0, 100)

# Load the saved history if it exists; otherwise start with an empty deque
# that keeps only the five most recent guesses.
if os.path.exists('history'):
    with open('history', 'rb') as f:
        history = pickle.load(f)
else:
    history = deque([], 5)


def guess(k):
    # Compare the guess with the target and report the result.
    if k == N:
        print('right')
        return True
    if k < N:
        print('less')
    else:
        print('high')
    return False


while True:
    line = input('input number: ')
    if line.isdigit():
        k = int(line)
        if guess(k):
            history.clear()
            break
        else:
            history.append(k)
    elif line == 'history':
        print(history)
    elif line == 'end':
        # Persist the history so the next run can load it, then exit.
        with open('history', 'wb') as f:
            pickle.dump(history, f)
        break
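The two building blocks of the program above can be tried without the interactive loop. This sketch shows that a deque with a maximum length of 5 discards the oldest entries, and that pickle restores both the contents and the length limit. The temporary directory is an assumption made so the example does not leave a file behind.

```python
import os
import pickle
import tempfile
from collections import deque

# A bounded deque keeps only the five most recent items.
history = deque([], 5)
for k in range(1, 8):
    history.append(k)
print(history)  # deque([3, 4, 5, 6, 7], maxlen=5)

# Save the deque to a file and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'history')
    with open(path, 'wb') as f:
        pickle.dump(history, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == history, restored.maxlen)  # True 5
```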