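1. Objective
Develop an operator in the intelligent programming language (BANGC), use it to extend the high-performance library (CNML), and finally integrate it into the programming framework (TensorFlow). The goal is to learn how to extend both the high-performance library and the framework, so that readers can freely design and optimize new operators on DLP hardware for specific application scenarios and keep up with the practical demands of rapidly evolving AI algorithms.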
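2. Background
The compiler toolchain needed for development in the intelligent programming language includes, but is not limited to, CNCC and CNGDB. See the theory course slides (PPT) for details.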
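3. Experiment content
a) Operator implementation: implement the PowerDifference operator in the intelligent programming language BCL;
b) Operator test: test the PowerDifference operator on its own to make sure it is functionally correct;
c) Framework integration: wrap the PowerDifference operator through the high-performance library's PluginOp interface so that it is called the same way as the library's built-in operators, then integrate the wrapped operator into the TensorFlow framework;
d) Framework operator test: use the framework API to test the operator integrated into TensorFlow in the previous step and make sure it is functionally correct.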
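A plain CPU reference is handy for both test steps above. The sketch below assumes PowerDifference computes the elementwise value (x - y) ** power; that definition is not spelled out in these notes, so check it against the course material. The function name power_difference_ref is hypothetical and is not part of CNML or TensorFlow.

```python
def power_difference_ref(x, y, power):
    """Elementwise (x - y) ** power over two equal-length sequences.

    Assumption: this is the PowerDifference definition used by the course;
    verify it against the course material before relying on it.
    """
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return [(a - b) ** power for a, b in zip(x, y)]


if __name__ == "__main__":
    # Small hand-checkable input: differences are 2, 3 and -1.
    print(power_difference_ref([3.0, 5.0, 2.0], [1.0, 2.0, 3.0], 2))
```

Keeping the reference on flat lists makes it easy to check by hand; a real test would flatten the tensor first or use an array library.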
4. Procedure
a) Log in to the cloud platform: ssh root@xxx.xxx.xxx.xxx -p xxx
(Cambricon developer platform: log in and apply for resources)
b) Initialize the environment: cd /opt/AICSE-demo-student/env;source env.sh
c) Enter the operator directory:
cd /opt/AICSE-demo-student/demo/style_transfer_bcl/src/bangc/PluginPowerDifferenceOp
d) Implement the PowerDifference BANGC operator:
complete plugin_power_difference_kernel.h and plugin_power_difference_kernel.mlu.
e) Test the PowerDifference BANGC operator:
complete powerDiff.cpp, then run ./make.sh
f) cnplugin integration:
complete plugin_power_difference_op.cc and cnplugin.h, then build a new Cambricon-CNPlugin.
Before building, copy the modified PluginPowerDifferenceOp folder and the header file:
cp -r /opt/AICSE-demo-student/demo/style_transfer_bcl/src/bangc/PluginPowerDifferenceOp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/pluginops/
cp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/pluginops/PluginPowerDifferenceOp/cnplugin.h /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/common/include/
Build:
cd /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/
./build_cnplugin.sh --mlu200
This produces a new libcnplugin.so. Copy it into the neuware lib64 directory:
cp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/build/libcnplugin.so /opt/AICSE-demo-student/env/neuware/lib64/
Copy the header file as well (this command repeats the earlier copy into the CNPlugin common/include directory):
cp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/pluginops/PluginPowerDifferenceOp/cnplugin.h /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/common/include/cnplugin.h
g) TensorFlow operator integration: add the files in the folder below to the TensorFlow source tree one by one (because of limited course time, this code is provided ready-made).
Source folder: /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff
TensorFlow source tree: /opt/AICSE-demo-student/env/tensorflow-v1.10
Following the readme in the course materials, copy these files from the demo project into the TensorFlow source tree:
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/BUILD /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/core/kernels/BUILD
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/mlu_stream.h /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/mlu_lib_ops.* /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/mlu_api/lib_ops/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/mlu_ops.h /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/mlu_api/ops/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/power_difference.cc /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/mlu_api/ops/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/math_ops.cc /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/core/ops/
Build:
cd /opt/AICSE-demo-student/env/tensorflow-v1.10
./build_tensorflow-v1.10_mlu.sh
h) Framework operator test:
complete .../src/online_mlu/power_difference_test_bcl.py
and .../src/online_cpu/power_difference_test_cpu.py,
then run: python power_difference_test_xxx.py
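Both test steps (e and h) come down to comparing the operator's output against a trusted reference. The sketch below shows one common way to do that, a summed relative error with a tolerance. The helper names and the 1e-2 tolerance are assumptions for illustration only; they are not taken from the course test scripts, which may use a different metric or threshold.

```python
def relative_error(actual, expected):
    """Sum of absolute differences divided by the sum of |expected|."""
    if len(actual) != len(expected):
        raise ValueError("outputs must have the same length")
    diff = sum(abs(a - e) for a, e in zip(actual, expected))
    base = sum(abs(e) for e in expected)
    # Fall back to the absolute error when the reference is all zeros.
    return diff / base if base else diff


def outputs_match(actual, expected, tol=1e-2):
    """True when the relative error is within tol (tol is an assumed value)."""
    return relative_error(actual, expected) <= tol


if __name__ == "__main__":
    cpu_out = [4.0, 9.0, 1.0]      # stand-in for the CPU result
    mlu_out = [4.001, 8.998, 1.0]  # stand-in for the MLU result
    print(relative_error(mlu_out, cpu_out), outputs_match(mlu_out, cpu_out))
```

In practice, actual would be the flattened output of the MLU run and expected the flattened output of the CPU run on the same inputs.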
Pitfalls
The TensorFlow build fails with "socket closed":
in the build script, change job_nums=32 to job_nums=16.