Before learning PyCaret, set up Jupyter Notebook first; all the code in this post is run in Jupyter.

Installing PyCaret (the default is the CPU version)

# create a conda environment
conda create --name pycaret3 python=3.9

# activate conda environment
conda activate pycaret3

# install pycaret with all optional dependencies
# (quote the extra and leave no space before the bracket)
pip install "pycaret[full]"

# create a notebook kernel
python -m ipykernel install --user --name pycaret3 --display-name "pycaret3"
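
To confirm that a notebook is really running in the new environment, a quick check can be run in a cell of the pycaret3 kernel. This is a minimal sketch using only the standard library; the helper name check_packages and the list of package names are illustrative choices, not something PyCaret itself provides.

```python
import importlib.util
import sys


def check_packages(names):
    """Return {package name: True/False}, True when the package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# The names below are an illustrative selection, not a required list.
status = check_packages(["pycaret", "lightgbm", "xgboost", "catboost"])

# The executable path shows which environment the kernel is using.
print("Python executable:", sys.executable)
for name, found in status.items():
    print(f"{name:10s} {'installed' if found else 'missing'}")
```

If the executable path does not point into the pycaret3 environment, the notebook is probably using a different kernel.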

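If you have a GPU, consider installing PyCaret with GPU support

The preliminary steps are exactly the same as for the CPU version above. The following has to be installed manually.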

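pip3 uninstall lightgbm -y
# downgrade pip first; otherwise the --install-option flag cannot be used
pip3 install pip==22.2.1
# $HOME is used instead of ~ because the shell does not expand ~ inside double quotes
pip3 install lightgbm --install-option=--gpu --install-option="--opencl-include-dir=$HOME/CUDA11.8/include/" --install-option="--opencl-library=$HOME/CUDA11.8/lib64/libOpenCL.so"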

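The CUDA11.8 directory under my home directory (~/CUDA11.8/) in the command above is where CUDA is installed on my machine; change it to your own CUDA installation path.
cuML is also required. Choose the version that matches your setup by following the Installation Guide in the RAPIDS Docs.
cuML is part of RAPIDS.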

PyCaret is a wrapper around several machine learning libraries and frameworks,
including scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, and Ray.

Supervised machine learning

Classification

Binary classification
Multiclass classification
pycaret.classification
Official API reference for all classification functions

Regression

pycaret.regression
Official API reference for all regression functions

Unsupervised machine learning

Anomaly Detection

pycaret.anomaly
Official API reference for anomaly detection

Clustering

pycaret.clustering
Official API reference for clustering

Time Series Forecasting

pycaret.time_series
Official API reference for time series

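Basic steps of a PyCaret analysis

Read the data with get_data
Initialize the experiment with setup, importing the module for the type of analysis
Train and select models
Visualize the best model
Predict on the test set
Predict on new data
Save the model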
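
The steps above map onto a handful of PyCaret calls. The following is a minimal sketch for a classification task, assuming PyCaret 3 is installed. The function name run_classification_workflow, the session_id value, the choice of plot, and the model file name are illustrative; the hepatitis dataset and its Class target are the ones used in this post. Imports are placed inside the function so that defining it does not require PyCaret.

```python
def run_classification_workflow(model_path="best_hepatitis_model"):
    """Sketch of the basic PyCaret steps for a classification task."""
    from pycaret.datasets import get_data
    from pycaret.classification import (
        setup, compare_models, plot_model, predict_model, save_model,
    )

    # 1. Read the data
    data = get_data("hepatitis")

    # 2. Initialize the experiment (session_id only fixes the random seed)
    setup(data=data, target="Class", session_id=123)

    # 3. Train the available models with cross-validation and keep the best one
    best = compare_models()

    # 4. Visualize the best model
    plot_model(best, plot="confusion_matrix")

    # 5. Predict on the hold-out test set created by setup
    holdout_predictions = predict_model(best)

    # 6. Predict on new data; a few rows without the target stand in for it here
    new_data = data.drop(columns=["Class"]).head()
    new_predictions = predict_model(best, data=new_data)

    # 7. Save the model pipeline to disk
    save_model(best, model_path)

    return best, holdout_predictions, new_predictions
```

Calling run_classification_workflow() in a notebook cell runs all seven steps in order.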

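Data preprocessing

Data preprocessing (original documentation)
Missing values usually appear as blanks or NaN.
Calling the setup function initializes the experiment and imputes missing values automatically.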

# load dataset
from pycaret.datasets import get_data
hepatitis = get_data('hepatitis')

Raw data

# init setup
from pycaret.classification import *
clf1 = setup(data = hepatitis, target = 'Class')

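After initialization, the missing values have been imputed automatically.

MAPE of the different imputation methods

The lower the MAPE, the closer the imputed values are to the true values.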
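
MAPE (mean absolute percentage error) is the average of |true - imputed| / |true| over the compared entries. The sketch below computes it in plain Python, assuming the true values of the imputed entries are known (for example because they were masked on purpose for the comparison). The numbers are made up for illustration and are not results from this post.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.1 means 10%)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is 0")
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up example values: the known true entries and what an imputer filled in.
true_values = [100.0, 200.0, 50.0]
imputed_values = [110.0, 180.0, 50.0]
print(f"MAPE: {mape(true_values, imputed_values):.3f}")
```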
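Default imputation of missing values
Numeric values: numeric_imputation: int, float, or string, default: mean
Accepted values:
drop: drop the rows that contain missing values
mean: fill with the mean
median: fill with the median
mode: fill with the most frequent value
knn: fill using K-nearest neighbors
int or float: fill with the supplied number

Categorical values: categorical_imputation: string, default: mode
Accepted values:
drop: drop the rows that contain missing values
mode: fill with the most frequent value
str: fill with the supplied string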
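
To make the simple strategies concrete, here is a small standard-library sketch of what they compute on a single numeric column. It is an illustration of the idea only, not PyCaret's implementation: the helper impute_numeric is hypothetical, None stands in for NaN, "drop" here removes entries of one column rather than whole rows, and knn is left out because it needs the other columns.

```python
import statistics


def impute_numeric(values, strategy="mean"):
    """Fill None entries of one numeric column using a simple strategy."""
    observed = [v for v in values if v is not None]
    if strategy == "drop":
        return observed
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "mode":
        fill = statistics.mode(observed)
    elif isinstance(strategy, (int, float)):
        fill = strategy  # a user-supplied constant
    else:
        raise ValueError(f"unsupported strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


column = [1.0, None, 3.0, 3.0]
for strategy in ("drop", "mean", "median", "mode", 0):
    print(strategy, impute_numeric(column, strategy))
```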

`imputation_type` sets the imputation type

The default is simple.
Accepted values: simple, iterative, None
If None, no imputation is performed.

Models used for imputation (applies when imputation_type is iterative)

numeric_iterative_imputer: str or sklearn estimator, default: lightgbm
categorical_iterative_imputer: str or sklearn estimator, default: lightgbm
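
The imputation parameters above are passed to setup. The following sketch switches from the default simple imputation to iterative imputation, assuming PyCaret 3 and a classification task. The names ITERATIVE_IMPUTATION and setup_with_iterative_imputation are illustrative, and the two imputer values simply restate the defaults listed above.

```python
# Keyword arguments for setup; the imputer values repeat the defaults.
ITERATIVE_IMPUTATION = {
    "imputation_type": "iterative",
    "numeric_iterative_imputer": "lightgbm",
    "categorical_iterative_imputer": "lightgbm",
}


def setup_with_iterative_imputation(data, target):
    """Initialize a classification experiment with iterative imputation."""
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.classification import setup

    return setup(data=data, target=target, **ITERATIVE_IMPUTATION)
```

With the dataset used in this post, the call would be setup_with_iterative_imputation(hepatitis, "Class").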

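Data types: numeric, categorical, or datetime. PyCaret detects data types automatically.

If the data type PyCaret detects differs from what you expect, you can specify the correct type manually.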
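
A sketch of overriding the detected types, assuming PyCaret 3 and a classification task: setup accepts lists of column names through numeric_features, categorical_features, and date_features. The wrapper function and its argument names are illustrative, and any column names passed to it must exist in your own data.

```python
def setup_with_explicit_types(data, target, numeric=None, categorical=None, dates=None):
    """Initialize a classification experiment with manually specified column types.

    Each of numeric, categorical, and dates is a list of column names,
    or None to leave detection of that type to PyCaret.
    """
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.classification import setup

    return setup(
        data=data,
        target=target,
        numeric_features=numeric,
        categorical_features=categorical,
        date_features=dates,
    )
```

For example, a column of integer codes that PyCaret treats as numeric could be forced to categorical with setup_with_explicit_types(df, "label", categorical=["region_code"]), where df, label, and region_code are hypothetical names.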

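One-hot encoding: for categorical features in the dataset that contain label values

Ordinal encoding: for categorical features whose values have an intrinsic natural order, for example (low, medium, high)

Cardinal encoding: for categorical features with many distinct levels (high cardinality)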
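
The difference between one-hot and ordinal encoding can be shown in a few lines of plain Python. This is a conceptual sketch with hypothetical helper names, not PyCaret's implementation; inside PyCaret the encoding is applied by setup, and the exact output columns may differ. In PyCaret 3 the order for ordinal columns is supplied through the ordinal_features parameter of setup as a dict that maps a column name to its ordered list of levels.

```python
def one_hot_encode(values):
    """Return (rows, categories): one 0/1 indicator per category for each value."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories] for value in values]
    return rows, categories


def ordinal_encode(values, order):
    """Map each value to its position in the given natural order."""
    rank = {level: position for position, level in enumerate(order)}
    return [rank[value] for value in values]


colors = ["red", "blue", "red"]           # labels with no natural order
sizes = ["low", "high", "medium"]         # labels with a natural order

rows, categories = one_hot_encode(colors)
print(categories, rows)
print(ordinal_encode(sizes, order=["low", "medium", "high"]))
```

One-hot encoding adds one column per category, which is why features with many distinct levels are usually handled differently.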

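Target imbalance: when the target classes in the training dataset are unevenly distributed, this can be corrected with the fix_imbalance parameter of setup.
Outlier removal: remove_outliers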
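
Both options are boolean switches on setup. A minimal sketch, assuming PyCaret 3 and a classification task; the wrapper name and the session_id value are illustrative, and the resampling and outlier-detection methods are left at whatever defaults the installed version uses.

```python
def setup_balanced_without_outliers(data, target):
    """Initialize a classification experiment with class rebalancing
    and outlier removal switched on."""
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.classification import setup

    return setup(
        data=data,
        target=target,
        fix_imbalance=True,    # rebalance the target classes in the training data
        remove_outliers=True,  # drop rows flagged as outliers from the training data
        session_id=123,        # fixed seed, an arbitrary illustrative value
    )
```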

Models available in PyCaret 3

Classification models

ID	Model name
lr	Logistic Regression
knn	K Neighbors Classifier
nb	Naive Bayes
dt	Decision Tree Classifier
svm	SVM - Linear Kernel
rbfsvm	SVM - Radial Kernel
gpc	Gaussian Process Classifier
mlp	MLP Classifier
ridge	Ridge Classifier
rf	Random Forest Classifier
qda	Quadratic Discriminant Analysis
ada	Ada Boost Classifier
gbc	Gradient Boosting Classifier
lda	Linear Discriminant Analysis
et	Extra Trees Classifier
xgboost	Extreme Gradient Boosting
lightgbm	Light Gradient Boosting Machine
catboost	CatBoost Classifier
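
The IDs in the table are the strings passed to PyCaret's training functions. A sketch, assuming PyCaret 3: compare_models takes a list of IDs through include, and create_model trains a single model by ID. The function name, the default selection of IDs, and session_id are illustrative.

```python
def train_selected_classifiers(data, target, model_ids=("lr", "rf", "lightgbm")):
    """Compare only the given classifier IDs, then train the first one on its own."""
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.classification import setup, compare_models, create_model

    setup(data=data, target=target, session_id=123)

    # Restrict the comparison to the chosen IDs instead of every available model
    best = compare_models(include=list(model_ids))

    # Train one specific model by its ID
    single = create_model(model_ids[0])
    return best, single
```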

Regression models

ID	Model name
lr	Linear Regression
lasso	Lasso Regression
ridge	Ridge Regression
en	Elastic Net
lar	Least Angle Regression
llar	Lasso Least Angle Regression
omp	Orthogonal Matching Pursuit
br	Bayesian Ridge
ard	Automatic Relevance Determination
par	Passive Aggressive Regressor
ransac	Random Sample Consensus
tr	TheilSen Regressor
huber	Huber Regressor
kr	Kernel Ridge
svm	Support Vector Regression
knn	K Neighbors Regressor
dt	Decision Tree Regressor
rf	Random Forest Regressor
et	Extra Trees Regressor
ada	AdaBoost Regressor
gbr	Gradient Boosting Regressor
mlp	MLP Regressor
xgboost	Extreme Gradient Boosting
lightgbm	Light Gradient Boosting Machine
catboost	CatBoost
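
The regression IDs are used the same way through pycaret.regression. A sketch, assuming PyCaret 3 and its bundled insurance sample dataset with charges as the target; that dataset is not used elsewhere in this post and serves only as an illustration, as do the function name and the default ID.

```python
def train_regressor(model_id="ridge"):
    """Train one regression model by ID and score it on the hold-out set."""
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.datasets import get_data
    from pycaret.regression import setup, create_model, predict_model

    data = get_data("insurance")
    setup(data=data, target="charges", session_id=123)

    model = create_model(model_id)          # any ID from the table above
    holdout_predictions = predict_model(model)
    return model, holdout_predictions
```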

Time series models

ID	Model name
naive	Naive Forecaster
grand_means	Grand Means Forecaster
snaive	Seasonal Naive Forecaster (disabled when seasonal_period = 1)
polytrend	Polynomial Trend Forecaster
arima	ARIMA family of models (ARIMA, SARIMA, SARIMAX)
auto_arima	Auto ARIMA
exp_smooth	Exponential Smoothing
stlf	STL Forecaster
croston	Croston Forecaster
ets	ETS
theta	Theta Forecaster
tbats	TBATS
bats	BATS
prophet	Prophet Forecaster
lr_cds_dt	Linear w/ Cond. Deseasonalize & Detrending
en_cds_dt	Elastic Net w/ Cond. Deseasonalize & Detrending
ridge_cds_dt	Ridge w/ Cond. Deseasonalize & Detrending
lasso_cds_dt	Lasso w/ Cond. Deseasonalize & Detrending
llar_cds_dt	Lasso Least Angular Regressor w/ Cond. Deseasonalize & Detrending
br_cds_dt	Bayesian Ridge w/ Cond. Deseasonalize & Detrending
huber_cds_dt	Huber w/ Cond. Deseasonalize & Detrending
omp_cds_dt	Orthogonal Matching Pursuit w/ Cond. Deseasonalize & Detrending
knn_cds_dt	K Neighbors w/ Cond. Deseasonalize & Detrending
dt_cds_dt	Decision Tree w/ Cond. Deseasonalize & Detrending
rf_cds_dt	Random Forest w/ Cond. Deseasonalize & Detrending
et_cds_dt	Extra Trees w/ Cond. Deseasonalize & Detrending
gbr_cds_dt	Gradient Boosting w/ Cond. Deseasonalize & Detrending
ada_cds_dt	AdaBoost w/ Cond. Deseasonalize & Detrending
lightgbm_cds_dt	Light Gradient Boosting w/ Cond. Deseasonalize & Detrending
catboost_cds_dt	CatBoost w/ Cond. Deseasonalize & Detrending
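
For time series, setup additionally needs a forecast horizon (fh). A sketch, assuming PyCaret 3 and its bundled airline sample series; the dataset, the function name, the 12-step horizon, and the default ID are illustrative choices rather than anything prescribed by this post.

```python
def forecast_airline(model_id="naive", horizon=12):
    """Fit one forecaster by ID and return its forecast for the next `horizon` periods."""
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.datasets import get_data
    from pycaret.time_series import setup, create_model, predict_model

    series = get_data("airline")
    setup(data=series, fh=horizon, session_id=123)

    model = create_model(model_id)   # any ID from the table above
    return predict_model(model)
```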

Clustering models

ID	Model name
kmeans	K-Means Clustering
ap	Affinity Propagation
meanshift	Mean Shift Clustering
sc	Spectral Clustering
hclust	Agglomerative Clustering
dbscan	Density-Based Spatial Clustering
optics	OPTICS Clustering
birch	Birch Clustering
kmodes	K-Modes Clustering
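
Clustering has no target column, so setup only receives the data, and assign_model attaches the cluster labels to it. A sketch, assuming PyCaret 3 and its bundled jewellery sample dataset; the dataset, the function name, and the choice of four clusters are illustrative.

```python
def cluster_jewellery(model_id="kmeans", num_clusters=4):
    """Fit one clustering model by ID and return the data with cluster labels attached."""
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.datasets import get_data
    from pycaret.clustering import setup, create_model, assign_model

    data = get_data("jewellery")
    setup(data=data, session_id=123)

    model = create_model(model_id, num_clusters=num_clusters)
    return assign_model(model)
```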

Anomaly detection models

ID	Model name
abod	Angle-based Outlier Detection
cluster	Clustering-Based Local Outlier
cof	Connectivity-Based Outlier Factor
histogram	Histogram-based Outlier Detection
iforest	Isolation Forest
knn	k-Nearest Neighbors Detector
lof	Local Outlier Factor
svm	One-class SVM detector
pca	Principal Component Analysis
mcd	Minimum Covariance Determinant
sod	Subspace Outlier Detection
sos	Stochastic Outlier Selection
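
Anomaly detection follows the same unsupervised pattern as clustering. A sketch, assuming PyCaret 3 and its bundled anomaly sample dataset; the dataset, the function name, and the 5% fraction of expected outliers are illustrative.

```python
def detect_anomalies(model_id="iforest", fraction=0.05):
    """Fit one anomaly detector by ID and return the data with anomaly labels attached."""
    # Imported here so that defining the function does not require PyCaret.
    from pycaret.datasets import get_data
    from pycaret.anomaly import setup, create_model, assign_model

    data = get_data("anomaly")
    setup(data=data, session_id=123)

    # fraction is the expected share of outliers in the data
    model = create_model(model_id, fraction=fraction)
    return assign_model(model)
```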
