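Given the ID of any NetEase Cloud Music playlist, this project fetches the information and lyrics for every song in that playlist, then generates a word cloud image from the word frequencies in the lyrics. It also saves the song information and lyrics to a local database. For details, see the code on GitHub: lyricWordCloud.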
Word cloud image

[Image: QQ20180404-182638.png]
1. Get the list of songs in a playlist from the playlist ID
def get163SongList(song_url, headers):
    # Request the playlist URL and return the list of track dicts it contains.
    res = requests.request('GET', song_url, headers=headers)
    song_list = res.json()['result']['tracks']
    return song_list
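The function above returns the raw `tracks` list as-is. As a rough sketch of a possible next step, the snippet below pulls an ID and a name out of each track. The helper `summarize_tracks`, the `id` and `name` field names, and the sample data are all assumptions made for illustration; the post does not show what a track entry actually contains.

```python
def summarize_tracks(song_list):
    """Return (id, name) pairs for each track dict, skipping incomplete entries."""
    summary = []
    for track in song_list:
        # 'id' and 'name' are assumed field names, not confirmed by the post.
        track_id = track.get('id')
        name = track.get('name')
        if track_id is None or name is None:
            continue
        summary.append((track_id, name))
    return summary


# Hypothetical response in the shape get163SongList reads: result -> tracks.
sample_response = {
    'result': {
        'tracks': [
            {'id': 1, 'name': 'Song A'},
            {'id': 2, 'name': 'Song B'},
            {'name': 'No ID'},
        ]
    }
}
pairs = summarize_tracks(sample_response['result']['tracks'])
```

Skipping entries that lack a field keeps one malformed track from aborting the whole playlist.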
2. Get the lyrics for each song
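def getSongLyric(headers, lyric_url):
    res = requests.request('GET', lyric_url, headers=headers)
    data = res.json()  # parse the response body once
    if 'lrc' in data:
        lyric = data['lrc']['lyric']
        # Strip timestamps by deleting digits, colons, dots, and square brackets.
        # Note: this also deletes those characters from the lyric text itself.
        lyric_without_time = re.sub(r'[\d:.[\]]', '', lyric)
        return lyric_without_time
    else:
        return ''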
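The regular expression in `getSongLyric` deletes every digit, colon, dot, and square bracket, so a digit that belongs to the lyric text disappears as well. If that matters, one possible alternative is to remove only the time tags. The sketch below assumes the tags look like `[mm:ss.xx]`; the helper `strip_time_tags` and the sample text are hypothetical and not part of the original project.

```python
import re

# Matches LRC time tags such as [00:12.34], [01:02], or [00:12.345].
TIME_TAG = re.compile(r'\[\d{1,3}:\d{2}(?:[.:]\d{1,3})?\]')


def strip_time_tags(lyric):
    """Remove LRC time tags and drop lines that end up empty."""
    lines = []
    for line in lyric.splitlines():
        text = TIME_TAG.sub('', line).strip()
        if text:
            lines.append(text)
    return '\n'.join(lines)


sample = '[00:01.00]Track 7\n[00:05.50]\n[00:09.120]hello world'
cleaned = strip_time_tags(sample)
```

Here the digit in "Track 7" survives, and the line that held only a time tag is dropped.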
3. Generate the word cloud from word frequencies
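# f is assumed to hold the lyric text collected in step 2.
print('Generating the word cloud from word frequencies!')
# Drop the lyricist (作詞) and composer (作曲) credit labels.
f1 = f.replace('作詞', '')
f2 = f1.replace('作曲', '')
# Segment the Chinese text; WordCloud expects space-separated words.
cut_text = " ".join(jieba.cut(f2, cut_all=False, HMM=True))
# print(cut_text)
# Optional mask image:
# color_mask = plt.imread("dy.png")
# color_mask = np.array(Image.open(os.path.join(os.path.dirname(__file__), "aa.jpg")))
wc = WordCloud(
    font_path="aaa.ttf",  # a font with Chinese glyphs, needed to render Chinese text
    # mask=color_mask,
    max_words=100,
    width=2000,
    height=1200,
    margin=2,
)
wordcloud = wc.generate(cut_text)
wordcloud.to_file(os.path.join(os.path.dirname(__file__), "h11.jpg"))
print('Opening the word cloud image')
plt.imshow(wordcloud)
plt.axis("off")
plt.show()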
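`wc.generate` counts word frequencies internally, so the counts never appear in the code above. To inspect them, or to filter words before drawing, something like the following sketch could be used. It relies only on the standard library; the helper `count_words`, its `min_length` parameter, and the sample words are made up for illustration.

```python
from collections import Counter


def count_words(cut_text, stop_words=frozenset(), min_length=2):
    """Count space-separated tokens, ignoring stop words and very short tokens."""
    tokens = [
        token for token in cut_text.split()
        if len(token) >= min_length and token not in stop_words
    ]
    return Counter(tokens)


# Stand-in for the output of " ".join(jieba.cut(...)); the words are made up.
sample_cut_text = "夜空 星星 夜空 的 作詞 星星 夜空"
frequencies = count_words(sample_cut_text, stop_words={'作詞', '作曲'})
top_words = frequencies.most_common(2)
```

A mapping of word to count like `frequencies` can be passed to `WordCloud.generate_from_frequencies` in place of calling `generate` on the raw text.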
Modules used
import requests
from bs4 import BeautifulSoup
import sqlite3
import sys
import re
import os
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import jieba
from PIL import Image
import numpy as np
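`sqlite3` is in the list because the project saves song information and lyrics to a local database, but the post does not show that part. The following is only a sketch of how such storage might look. The `songs` table, its columns, and the helper `save_songs` are hypothetical and not taken from the project.

```python
import sqlite3


def save_songs(conn, songs):
    """Insert or update (id, name, lyric) rows in a hypothetical 'songs' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS songs ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, lyric TEXT)"
    )
    # INSERT OR REPLACE keeps repeated runs from duplicating rows.
    conn.executemany(
        "INSERT OR REPLACE INTO songs (id, name, lyric) VALUES (?, ?, ?)",
        songs,
    )
    conn.commit()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(':memory:')
save_songs(conn, [(1, 'Song A', 'la la la'), (2, 'Song B', '')])
save_songs(conn, [(1, 'Song A', 'updated lyric')])
rows = conn.execute("SELECT id, name, lyric FROM songs ORDER BY id").fetchall()
```

Using the song ID as the primary key means fetching the same playlist twice updates the existing rows.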
The result looks like this:
[image]
GitHub: lyricWordCloud.