麻豆系列在线观看,熟妇肥臀亚洲国产一区

3.1 內(nèi)省Excel工作薄

使用xlrd和xlwt擴(kuò)展包，確定工作簿中工作表的數(shù)量、名稱和每個(gè)工作表中行列的數(shù)量。
1excel_introspect_workbook.py

#!/usr/bin/env python3
import sys
from xlrd import open_workbook

input_file = sys.argv[1]

workbook = open_workbook(input_file)
print('Number of worksheets: ', workbook.nsheets)
for worksheet in workbook.sheets():
    print("Worksheet name:", worksheet.name, "\tRows:",\
        worksheet.nrows, "\tColumns:", worksheet.ncols)

導(dǎo)入xlrd模塊open_workbook函數(shù)讀取和分析Excel文件。
for循環(huán)在所有工作表之間迭代，workbook對(duì)象的sheets方法可以識(shí)別出工作簿中所有的工作表。
print語(yǔ)句使用worksheet對(duì)象的name屬性確定每個(gè)工作表名稱，使用nrows和ncols屬性確定每個(gè)工作表中行與列的數(shù)量。
輸出結(jié)果：

Number of worksheets: 3
Worksheet name: january_2013 Rows: 7 Columns: 5
Worksheet name: february_2013 Rows: 7 Columns: 5
Worksheet name: march_2013 Rows: 7 Columns: 5

3.2 處理單個(gè)工作表

3.2.1 讀寫Excel文件

使用pandas分析Excel文件
pandas_read_and_write_excel.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
data_frame = pd.read_excel(input_file, sheet_name='january_2013')
writer = pd.ExcelWriter(output_file)
data_frame.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()

3.2.2 篩選特定行

行中的值滿足某個(gè)條件

用pandas篩選出Sale Amount大于$1400.00的行。
pandas_value_meets_condition.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013', index_col=None)
data_frame_value_meets_condition = \
    data_frame[data_frame['Sale Amount'].astype(float) > 1400.0]
writer = pd.ExcelWriter(output_file)
data_frame_value_meets_condition.to_excel(\
    writer, sheet_name='jan_13_output', index=False)
writer.save()

行中的值屬于某個(gè)集合

用pandas篩選出購(gòu)買日期屬于集合（01/24/2013-01/31/2013）的行。
pandas提供isin函數(shù)檢驗(yàn)一個(gè)特定值是否在一個(gè)列表中
pandas_value_in_set.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013', index_col=None)
important_dates = ['01/24/2013', '01/31/2013']
data_in_set = data_frame[data_frame['PurchaseDate']\
.isin(important_dates)]
writer = pd.ExcelWriter(output_file)
data_in_set.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()

行中的值匹配于特定模式

用pandas篩選出客戶姓名以大寫字母J開(kāi)頭的行。
pandas_value_matches_pattern.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013', index_col=None)
data_frame_value = data_frame[data_frame['Customer Name']\
.str.startswith("J")]
writer = pd.ExcelWriter(output_file)
data_frame_value.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()

3.2.3 選取特定列

有兩種方法可以在Excel文件中選取特定的列：

使用列索引值
使用列標(biāo)題

使用列索引值
用pandas設(shè)置數(shù)據(jù)框，在方括號(hào)中列出要保留的列的索引值或名稱（字符串）。設(shè)置數(shù)據(jù)框和iloc函數(shù)，同時(shí)選擇特定的行與特定的列。如果使用iloc函數(shù)來(lái)選擇列，那么就需要在列索引值前面加上一個(gè)冒號(hào)和一個(gè)逗號(hào)，表示為這些特定的列保留所有的行。
pandas_column_by_index.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file =sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013', index_col=None)
data_frame_index = data_frame.iloc[:, [1:4]]
writer = pd.ExcelWriter(output_file)
data_frame_index.to_excel(writer, sheet_name='jan_13_out', index=False)
writer.save()

使用列標(biāo)題
用pandas基于列標(biāo)題選取Customer ID和Purchase Date列的兩種方法：

在數(shù)據(jù)框名稱后面的方括號(hào)中將列名以字符串方式列出。
用loc函數(shù)，在列標(biāo)題列表前面加上一個(gè)冒號(hào)和一個(gè)逗號(hào)，表示為這些特定的列保留所有行。

pandas_column_by_name.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, 'january_2013', index_col=None)
data_frame_column = data_frame.iloc[:, ['Customer ID', 'Purchase Date']]
writer = pd.ExcelWriter(output_file)
data_frame_column.to_excel(writer, sheet_name='jan_13_output', index=False)
writer.save()

3.3 讀取工作簿中的所有工作表

3.3.1 在所有工作表中篩選特定行

pandas通過(guò)在read_excel函數(shù)中設(shè)置sheetname=None，可以一次性讀取工作簿中的所有工作表。
pandas將所有工作表讀入數(shù)據(jù)框字典，字典中的鍵就是工作表的名稱，值就是包含工作表中數(shù)據(jù)的數(shù)據(jù)框。所以，通過(guò)在字典的鍵和值之間迭代，可以使用工作簿中所有的數(shù)據(jù)。
當(dāng)在每個(gè)數(shù)據(jù)框中篩選特定行時(shí)，結(jié)果是一個(gè)新的篩選過(guò)的數(shù)據(jù)框，所以可以創(chuàng)建一個(gè)列表保存這些篩選過(guò)的數(shù)據(jù)框，然后將它們連接成一個(gè)最終數(shù)據(jù)框。

在所有工作表中篩選出銷售額大于$2000.00的所有行。
pandas_value_ meets_condition_all_worksheets.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, sheet_name=None, index_col=None)
row_output = []
for worksheet_name, data in data_frame.items():
    row_output_append(data[data['Sale Amount'].astype(float) > 2000.0])
filtered_rows = pd.concat(row_output, axis=0, ignore_index=True)
writer = pd.ExcelWriter(output_file)
filtered_rows.to_excel(writer, sheet_name='sale_amount_gt2000', index=False)
writer.save()

3.3.2 在所有工作表中選取特定列

有兩種方法可以從工作表中選取一組列：

使用列索引值
使用列標(biāo)題

在所有工作表中選取Customer Name和Sale Amount列
用pandas的read_excel函數(shù)將所有工作表讀入字典。然后，用loc函數(shù)在每個(gè)工作表中選取特定的列，創(chuàng)建一個(gè)篩選過(guò)的數(shù)據(jù)框列表，并將這些數(shù)據(jù)框連接在一起，形成一個(gè)最終數(shù)據(jù)框。
pandas_column_ by_name_all_worksheets.py

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_excel(input_file, sheet_name=None, index_col=None)
column_output = []
for worksheet_name, data in data_frame.items():
    column_output.append(data.loc[:, ['Customer Name', 'Sale Amount']])
selected_columns = pd.concat(column_output, axis=0, ignore_index=True)
writer = pd.ExcelWriter(output_file)
selected_columns.to_excel(writer, sheet_name='selected_columns_all_worksheets',\
index=False)
writer.save()

3.4 在Excel工作簿中讀取一組工作表

在一組工作表中篩選特定行

用pandas在工作簿中選擇一組工作表，在read_excel函數(shù)中將工作表的索引值或名稱設(shè)置成一個(gè)列表。創(chuàng)建索引值列表my_ sheets，在read_excel函數(shù)中設(shè)定sheetname等于my_sheets。想從第一個(gè)和第二個(gè)工作表中篩選出銷售額大于$1900.00 的行。

pandas_value_meets_condition_set_of_worksheets.py

#!/usr/bin/env python
import pandas as pd
import sys

input_file = sys.argv[1]
output_file =sys.argv[2]

my_sheets = [0,1]
threshold = 1900.0
data_frame = pd.read_excel(input_file, sheet_name=my_sheets, index_col=None)
row_list = [ ]
for worksheet_name, data in data_frame.items():
    row_list.append(data[data['Sale Amount'].astype(float) > threshold])
filtered_rows = pd.concat(row_list, axis=0, ignore_index=True)
writer = pd.ExcelWriter(output_file)
filtered_rows.to_excel(writer, sheet_name='set_of_worksheets', index=False)
writer.save()

3.5 處理多個(gè)工作簿

3.5.1 工作表計(jì)數(shù)以及每個(gè)工作表中的行列計(jì)數(shù)

在開(kāi)始處理工作表之前，獲取關(guān)于工作表的描述性信息非常重要。

想知道一個(gè)文件夾中工作簿的數(shù)量，每個(gè)工作簿中工作表的數(shù)量，以及每個(gè)工作表中行與列的數(shù)量：
12excel_introspect_all_ workbooks.py

#!/usr/bin/env python3
import glob
import os
import sys
from xlrd import open_work

input_directory = sys.argv[1]
workbook_counter = 0
for input_file in glob.glob(os.path.join(input_directory, '*.xls*')):
    workbook = open_workbook(input_file)
    print('Workbook: %s' % os.path.basename(input_file))
    print('Number of worksheets: %d' % workbook.nsheets)
    for worksheet in workbook.sheets():
        print('Worksheet name:', worksheet.name, '\tRows:',\
                  worksheet.nrows, '\tColumns:', worksheet.nclos)
    workbook_counter += 1
print('Number of Excel workbooks: %d' % (workbook_counter))

導(dǎo)入Python內(nèi)置的glob模塊和os模塊，使用其中的函數(shù)識(shí)別和解析待處理文件的路徑名。
使用Python內(nèi)置的glob模塊和os模塊，創(chuàng)建要處理的輸入文件列表，并對(duì)輸入文件列表應(yīng)用for循環(huán)，對(duì)所有要處理的工作簿進(jìn)行迭代。

3.5.2 從多個(gè)工作簿中連接數(shù)據(jù)

pandas提供concat函數(shù)連接數(shù)據(jù)框。

如果想把數(shù)據(jù)框一個(gè)一個(gè)地垂直堆疊，設(shè)置參數(shù)axis=0。
如果想把數(shù)據(jù)框一個(gè)一個(gè)地平行連接，設(shè)置參數(shù)axis=1。

如果要基于某個(gè)關(guān)鍵字列連接數(shù)據(jù)框，pandas的merge函數(shù)提供類似SQL join的操作。

用pandas將多個(gè)工作簿中所有工作表的數(shù)據(jù)垂直連接成一個(gè)輸出文件
pandas_concat_data_from_multiple_workbook.py

#!/usr/bin/env python3
import pandas as pd
import glot
import os
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

all_workbooks = glob.glob(os.path.join(input_path, '*.xls*'))
data_frames = [ ]
for workbook in all_workbooks:
    all_workbooks = pd.read_excel(workbook, sheet_name=None, index_col=None)
    for worksheet_name, data in all_worksheets.items():
            data_frames.append(data)
all_data_concatenated = pd.concat(data_frame, axis=0, ignore_index=True)
writer = pd.ExcelWriter(output_file)
all_data_concatenated.to_excel(writer, sheet_name='all_data_all_workbooks',\
index=False)
writer.save()

3.5.3 為每個(gè)工作簿和工作表計(jì)算總數(shù)和均值

pandas在多個(gè)工作簿間迭代，在工作簿級(jí)和工作表級(jí)計(jì)算統(tǒng)計(jì)量。為工作簿的每個(gè)工作表計(jì)算統(tǒng)計(jì)量，并將結(jié)果連接成一個(gè)數(shù)據(jù)框。
接下來(lái)，計(jì)算工作簿級(jí)的統(tǒng)計(jì)量，將它們轉(zhuǎn)換成一個(gè)數(shù)據(jù)框，然后通過(guò)基于工作簿名稱的左連接將兩個(gè)數(shù)據(jù)框合并在一起，并將結(jié)果數(shù)據(jù)框添加到一個(gè)列表中。
當(dāng)所有工作簿級(jí)的數(shù)據(jù)框都進(jìn)入列表后，將這些數(shù)據(jù)框連接成一個(gè)獨(dú)立數(shù)據(jù)框，并寫入輸出文件。
pandas_sum_average_multiple_workbook.py

#!/usr/bin/env python3
import pandas as pd
import glob
import os
import sys

input_path = sys.argv[1]
output_file = sys.argv[2]
all_workbooks = glob.glob(os.path.join(input_path, '*.xls*'))
data_frames = [ ]
for workbook in all_workbooks:
    all_workbooks = pd.read_excel(workbook, sheet_name=None, index_col=None)
    workbook_total_sales = [ ]
    workbook_number_of_sales = [ ]
    worksheet_data_frames = [ ]
    worksheets_data_frame = None
    workbook_data_frame = None
    for worksheet_name, data in all_worksheets.items():
        total_sales = pd.DataFrame([float(str(value).strip('$').replace(\
        ',',' '))
        for value in data.loc[:, 'Sale Amount']]).sum()
        number_of_sales = len(data.loc[:, 'Sale Amount'])
        average_sales = pd.DataFrame(total_sales / number_of_sales)

        workbook_total_sales.append(total_sales)
        workbook_number_of_sales.append(number_of_sales)
        data = {'workbook': os.path.basename(workbook),
                  'worksheet': worksheet_name,
                  'worksheet_total': total_sales,
                  'wprksheet_average': average_sales}
        
        worksheet_Data_frame.append(pd.DataFrame(data, \
        columns=['workbook', ''worksheet, \
        'worksheet_total', 'worksheet_average']))
    worksheets_data_frame = pd.concat(\
    worksheet_data_frames, axis=0, ignore_index=True
    workbook_total = pd.DataFrame(workbook_total_sales).sum()
    workbook_total_number_of_sales = pd.DaraFrame(\
    workbook_numbwe_of_sales).sum()
    workbook_average = pd.DataFrame(\
    workbook_total / workbook_total_number_of_sales)

    work_stats = {'workbook': os.path.basename(workbook),
                     'workbook_total'}: workbook_total,
                     'workbook_average': workbook_average}
    workbooks_stats = pd.DataFrame(workbook_stats, columns=\
    ['workbook', 'workbook_total', 'workbook_average'])
    workbook_data_frame = pd.merge(worksheets_data_frame, workbook_stats, \
    on='workbook', how='left')
    data_frames.append(workbook_data_frame)
all_data_concatenated = pd.concat(data_frames, axis=0, ignore_index=True)
writer = pd.ExcelWriter(output_file)
all_data_concatenated.to_excel(writer, sheet_name='sums_and_averages', index=False)
writer.save()

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

Python3分析Excel數(shù)據(jù)

Python3分析Excel數(shù)據(jù)

3.1 內(nèi)省Excel工作薄

3.2 處理單個(gè)工作表

3.2.1 讀寫Excel文件

3.2.2 篩選特定行

行中的值滿足某個(gè)條件

行中的值屬于某個(gè)集合

行中的值匹配于特定模式

3.2.3 選取特定列

3.3 讀取工作簿中的所有工作表

3.3.1 在所有工作表中篩選特定行

3.3.2 在所有工作表中選取特定列

3.4 在Excel工作簿中讀取一組工作表

在一組工作表中篩選特定行

3.5 處理多個(gè)工作簿

3.5.1 工作表計(jì)數(shù)以及每個(gè)工作表中的行列計(jì)數(shù)

3.5.2 從多個(gè)工作簿中連接數(shù)據(jù)

3.5.3 為每個(gè)工作簿和工作表計(jì)算總數(shù)和均值

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

Python3分析Excel數(shù)據(jù)

3.1 內(nèi)省Excel工作薄

3.2 處理單個(gè)工作表

3.2.1 讀寫Excel文件

3.2.2 篩選特定行

行中的值滿足某個(gè)條件

行中的值屬于某個(gè)集合

行中的值匹配于特定模式

3.2.3 選取特定列

3.3 讀取工作簿中的所有工作表

3.3.1 在所有工作表中篩選特定行

3.3.2 在所有工作表中選取特定列

3.4 在Excel工作簿中讀取一組工作表

在一組工作表中篩選特定行

3.5 處理多個(gè)工作簿

3.5.1 工作表計(jì)數(shù)以及每個(gè)工作表中的行列計(jì)數(shù)

3.5.2 從多個(gè)工作簿中連接數(shù)據(jù)

3.5.3 為每個(gè)工作簿和工作表計(jì)算總數(shù)和均值

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av