mirror of
https://github.com/LCTT/TranslateProject.git
synced 2024-12-26 21:30:55 +08:00
commit
70474628e6
@@ -1,157 +0,0 @@
[#]: subject: "PyCaret: Machine Learning Model Development Made Easy"
[#]: via: "https://www.opensourceforu.com/2022/05/pycaret-machine-learning-model-development-made-easy/"
[#]: author: "S Ratan Kumar https://www.opensourceforu.com/author/s-ratan/"
[#]: collector: "lkxed"
[#]: translator: "geekpi"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "

PyCaret: Machine Learning Model Development Made Easy
======
In today’s fast-paced digital world, organisations use low-code/no-code (LC/NC) apps to build new information systems quickly. This article introduces PyCaret, a low-code machine learning library written in Python.

![Featured-image-of-pycaret][1]

PyCaret is a Python counterpart of the Caret (short for Classification And REgression Training) package for the R programming language, and it offers several benefits.

- **Increased productivity:** As a low-code library, PyCaret makes you more productive. With less time spent coding, you and your team can focus on business problems.
- **Easy to use:** This simple machine learning library lets you run end-to-end ML experiments in fewer lines of code.
- **Business ready:** PyCaret is a business-ready solution. It lets you prototype quickly and efficiently from the notebook environment of your choice.

To install the full version of PyCaret, create a Python virtual environment and run the following command:

```
pip install pycaret[full]
```

A machine learning practitioner can use PyCaret for classification, regression, clustering, anomaly detection, natural language processing, association rule mining and time series analysis.

### Classification model building with PyCaret

This article walks through building a classification model with PyCaret, using the Iris data set from PyCaret’s data repository.

We will use the Google Colab environment to keep things simple, and follow the steps below.

#### Step 1

First, install PyCaret with the following command:

```
pip install pycaret
```
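Once the install finishes, it can be useful to confirm that the package is visible to the interpreter the notebook is actually using. The following is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is made up for this illustration and is not part of PyCaret.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether PyCaret is available in the current environment.
version = installed_version("pycaret")
if version is None:
    print("pycaret is not installed in this environment")
else:
    print(f"pycaret {version} is installed")
```

If the check reports that the package is missing right after a successful install, the notebook kernel is probably bound to a different environment than the one `pip` wrote to.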

#### Step 2

Next, load the data set, as shown in Figure 1:

![Loading the data set][2]

```
from pycaret.datasets import get_data
dataset = get_data('iris')

# (or) load your own CSV file with pandas instead:
# import pandas as pd
# dataset = pd.read_csv('/path_to_data/file.csv')
```

#### Step 3

Now set up the PyCaret environment, as shown in Figure 2:

![PyCaret environment setup][3]

```
from pycaret.classification import *
clf1 = setup(data=dataset, target='species')
```

![PyCaret environment setup result][4]

For any type of model building with PyCaret, the environment setup is the most important step. The *setup()* function takes *data*, a pandas DataFrame, and *target*, the name of the column that holds the class label in the data set. By default, it splits the data into a 70 per cent train set and a 30 per cent test set, and preprocesses the data. The result of the setup function is shown in Figure 3.
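To make the 70/30 split concrete, here is a plain-Python sketch of a shuffled hold-out split. It is not PyCaret's implementation: the function name `train_test_split`, the seed value and the use of integers as stand-in rows are all assumptions made for illustration, and the sketch ignores refinements such as stratifying by class.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

Row = TypeVar("Row")


def train_test_split(
    rows: Sequence[Row], train_size: float = 0.7, seed: int = 123
) -> Tuple[List[Row], List[Row]]:
    """Shuffle the rows reproducibly and cut them into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # round() avoids losing a row to floating-point error (150 * 0.7 may be 104.999...).
    cut = int(round(len(shuffled) * train_size))
    return shuffled[:cut], shuffled[cut:]


# 150 integers stand in for the 150 rows of the Iris data set.
rows = list(range(150))
train, test = train_test_split(rows)
print(len(train), len(test))
```

With 150 rows, the train part holds 105 rows and the test part the remaining 45. The test part is set aside and only used later, when the finished model is scored.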

#### Step 4

Next, find the best model, as shown in Figure 4:

![Finding the best model][5]

```
best = compare_models()
```

By default, *compare_models()* applies ten-fold cross-validation and calculates performance metrics such as accuracy, AUC, recall, precision, F1 score, Kappa and MCC for the classifiers with shorter training times, as shown in Figure 4. Passing *turbo=False* to the *compare_models()* function makes it try all the classifiers.
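Ten-fold cross-validation means the training rows are cut into ten parts, and each part takes one turn as the validation set while the other nine are used for fitting. The sketch below generates such index splits in plain Python. It is a simplified, non-stratified illustration under my own naming (`k_fold_indices`), not the splitter PyCaret uses internally.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, n_folds: int = 10
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training_indices, validation_indices) for each of n_folds folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds] * n_folds
    for i in range(n_samples % n_folds):
        fold_sizes[i] += 1

    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        training = list(range(0, start)) + list(range(stop, n_samples))
        yield training, validation
        start = stop


# 105 is the size of a 70 per cent train set taken from 150 rows.
folds = list(k_fold_indices(105, 10))
for number, (training, validation) in enumerate(folds, start=1):
    print(f"fold {number}: {len(training)} training rows, {len(validation)} validation rows")
```

Each metric in the comparison table is then an average over the ten validation folds, which is why the scores are less sensitive to one lucky or unlucky split.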

#### Step 5

Now create the model, as shown in Figure 5:

![Creating the model][6]

```
lda_model = create_model('lda')
```

The Linear Discriminant Analysis classifier performs well, as shown in Figure 4, so we fit it by passing ‘lda’ to the *create_model()* function.

#### Step 6

The next step is to fine-tune the model, as shown in Figure 6:

![Tuning the model][7]

```
tuned_lda = tune_model(lda_model)
```

Tuning the hyperparameters can improve model accuracy. Here, the *tune_model()* function improved the accuracy of the Linear Discriminant Analysis model from 0.9818 to 0.9909, as shown in Figure 7.

![Tuned model details][8]
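As a rough picture of what hyperparameter tuning involves, the sketch below runs a random search: it samples a fixed number of parameter combinations from a search space and keeps the best-scoring one. This is one common tuning strategy, shown here under stated assumptions rather than as the exact procedure behind *tune_model()*. The search space, the parameter names and the `toy_score` function (a stand-in for a cross-validated accuracy) are all hypothetical.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, Any]


def random_search(
    score_fn: Callable[[Params], float],
    search_space: Dict[str, List[Any]],
    n_iter: int = 10,
    seed: int = 0,
) -> Tuple[Params, float]:
    """Sample n_iter parameter combinations and return the best one with its score."""
    rng = random.Random(seed)
    best_params: Params = {}
    best_score = float("-inf")
    for _ in range(n_iter):
        # Draw one candidate value per hyperparameter.
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: Params) -> float:
    """Hypothetical score that peaks when shrinkage is 0.1 (not a real model)."""
    return 1.0 - abs(params["shrinkage"] - 0.1)


# Illustrative search space; the names and values are assumptions.
space = {"shrinkage": [0.0, 0.01, 0.1, 0.5, 1.0], "solver": ["lsqr", "eigen"]}
best_params, best_score = random_search(toy_score, space)
print(best_params, best_score)
```

In a real experiment, `score_fn` would fit the model with the candidate parameters and return its cross-validated score, which is what makes tuning slower than a single fit.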

#### Step 7

The next step is to make predictions, as shown in Figure 8:

![Predictions using the tuned model][9]

```
predictions = predict_model(tuned_lda)
```

The *predict_model()* function makes predictions for the samples in the test data.
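Once predictions exist for the test samples, they can be scored against the true labels. The sketch below computes two of the metrics mentioned earlier, accuracy and Cohen's Kappa, in plain Python. The six labels are invented for illustration and are not output from the model above, and the function names are my own.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cohen_kappa(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Agreement between truth and prediction, corrected for chance agreement.

    Assumes more than one class appears, so chance agreement is below 1.
    """
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both pick the same label independently.
    expected = sum(true_counts[label] * pred_counts[label] for label in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up labels for six test samples.
y_true = ["setosa", "setosa", "versicolor", "virginica", "virginica", "versicolor"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "versicolor", "versicolor"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"kappa    = {cohen_kappa(y_true, y_pred):.4f}")
```

Five of the six made-up predictions are correct, so the accuracy is 5/6. Kappa comes out lower (0.75) because part of that agreement would be expected by chance.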

#### Step 8

Now plot the model performance, as shown in Figure 9:

![Evaluating and plotting the model performance: confusion matrix][10]

```
evaluate_model(tuned_lda)
```

The *evaluate_model()* function lets you explore different performance plots and metrics with minimal effort. Try the available options to see their output.
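The confusion matrix in Figure 9 counts, for every true class, how often each class was predicted. The sketch below builds such a matrix in plain Python. The labels are invented for illustration, and the helper `confusion_matrix` is my own, not a PyCaret function.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def confusion_matrix(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted labels and a matrix with true classes as rows, predictions as columns."""
    labels = sorted(set(y_true) | set(y_pred))
    pair_counts = Counter(zip(y_true, y_pred))
    matrix = [[pair_counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


# Made-up labels for six test samples.
y_true = ["setosa", "setosa", "versicolor", "virginica", "virginica", "versicolor"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "versicolor", "versicolor"]

labels, matrix = confusion_matrix(y_true, y_pred)
print("columns:", labels)
for label, row in zip(labels, matrix):
    print(f"{label:<12}", row)
```

Correct predictions sit on the diagonal. The single off-diagonal count in the last row is the one virginica sample that was predicted as versicolor.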

--------------------------------------------------------------------------------

via: https://www.opensourceforu.com/2022/05/pycaret-machine-learning-model-development-made-easy/

Author: [S Ratan Kumar][a]
Topic selection: [lkxed][b]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]: https://www.opensourceforu.com/author/s-ratan/
[b]: https://github.com/lkxed
[1]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Featured-image-of-pycaret-696x477.jpg
[2]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-1-loading-the-dataset.jpg
[3]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-2-PyCaret-Environment-Setup.jpg
[4]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-3-PyCaret-Environment-Setup-Result.jpg
[5]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-4-Finding-the-best-model.jpg
[6]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-5-Creating-the-model.jpg
[7]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-6-Tuning-the-model.jpg
[8]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-7Tuned-model-details.jpg
[9]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-8-Predictions-using-tuned-model.jpg
[10]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-9-Evaluating-and-ploting-the-model-performance-Confusion-Matrix.jpg
@@ -0,0 +1,157 @@
[#]: subject: "PyCaret: Machine Learning Model Development Made Easy"
[#]: via: "https://www.opensourceforu.com/2022/05/pycaret-machine-learning-model-development-made-easy/"
[#]: author: "S Ratan Kumar https://www.opensourceforu.com/author/s-ratan/"
[#]: collector: "lkxed"
[#]: translator: "geekpi"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "

PyCaret: Machine Learning Model Development Made Easy
======
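In today's fast-paced digital world, organisations use low-code/no-code (LC/NC) applications to build new information systems quickly. This article introduces PyCaret, a low-code machine learning library written in Python.

![Featured-image-of-pycaret][1]

PyCaret is a Python counterpart of the Caret (short for Classification And REgression Training) package for the R programming language, and it has many advantages.

- **Increased productivity:** PyCaret is a low-code library that makes you more productive. Because less time goes into coding, you and your team can focus on business problems.
- **Easy to use:** This simple, easy-to-use machine learning library helps you run end-to-end machine learning experiments with fewer lines of code.
- **Business ready:** PyCaret is a business-ready solution. It lets you prototype quickly and efficiently from the notebook environment of your choice.

You can create a Python virtual environment and run the following command to install the full version of PyCaret:

```
pip install pycaret[full]
```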
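
Machine learning practitioners can use PyCaret for classification, regression, clustering, anomaly detection, natural language processing, association rule mining and time series analysis.

### Building a classification model with PyCaret

This article explains how to build a classification model with PyCaret, using the Iris data set from PyCaret's data repository.

We will use the Google Colab environment to keep things simple, and follow the steps below.

#### Step 1

First, install PyCaret with the following command:

```
pip install pycaret
```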

#### Step 2

Next, load the data set, as shown in Figure 1:

![Loading the data set][2]

```
from pycaret.datasets import get_data
dataset = get_data('iris')

# (or) load your own CSV file with pandas instead:
# import pandas as pd
# dataset = pd.read_csv('/path_to_data/file.csv')
```

#### Step 3

Now set up the PyCaret environment, as shown in Figure 2:

![PyCaret environment setup][3]

```
from pycaret.classification import *
clf1 = setup(data=dataset, target='species')
```

![PyCaret environment setup result][4]

Environment setup is the most important step when building any type of model with PyCaret. The *setup()* function takes *data*, a pandas DataFrame, and *target*, which points to the class label variable in the data set. By default, it splits the data into a 70% training set and a 30% test set, and preprocesses the data. The result of the setup function is shown in Figure 3.

#### Step 4

Next, find the best model, as shown in Figure 4:

![Finding the best model][5]

```
best = compare_models()
```

By default, *compare_models()* applies ten-fold cross-validation and computes performance metrics such as accuracy, AUC, recall, precision, F1 score, Kappa and MCC for the classifiers with shorter training times, as shown in Figure 4. By passing *turbo=False* to the *compare_models()* function, we can try all the classifiers.

#### Step 5

Now create the model, as shown in Figure 5:

![Creating the model][6]

```
lda_model = create_model('lda')
```

The Linear Discriminant Analysis classifier performs well, as shown in Figure 4. So we can fit the model by passing 'lda' to the *create_model()* function.

#### Step 6

The next step is to fine-tune the model, as shown in Figure 6:

![Tuning the model][7]

```
tuned_lda = tune_model(lda_model)
```

Tuning the hyperparameters can improve the accuracy of the model. The *tune_model()* function raised the accuracy of the Linear Discriminant Analysis model from 0.9818 to 0.9909, as shown in Figure 7.

![Tuned model details][8]

#### Step 7

The next step is to make predictions, as shown in Figure 8:

![Predictions using the tuned model][9]

```
predictions = predict_model(tuned_lda)
```

The *predict_model()* function makes predictions for the samples in the test data.

#### Step 8

Now plot the model performance, as shown in Figure 9:

![Evaluating and plotting the model performance: confusion matrix][10]

```
evaluate_model(tuned_lda)
```

The *evaluate_model()* function lets you explore different performance plots and metrics with minimal effort. You can try them out and look at the output.

--------------------------------------------------------------------------------

via: https://www.opensourceforu.com/2022/05/pycaret-machine-learning-model-development-made-easy/

Author: [S Ratan Kumar][a]
Topic selection: [lkxed][b]
Translator: [geekpi](https://github.com/geekpi)
Proofreader: [Proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]: https://www.opensourceforu.com/author/s-ratan/
[b]: https://github.com/lkxed
[1]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Featured-image-of-pycaret-696x477.jpg
[2]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-1-loading-the-dataset.jpg
[3]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-2-PyCaret-Environment-Setup.jpg
[4]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-3-PyCaret-Environment-Setup-Result.jpg
[5]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-4-Finding-the-best-model.jpg
[6]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-5-Creating-the-model.jpg
[7]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-6-Tuning-the-model.jpg
[8]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-7Tuned-model-details.jpg
[9]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-8-Predictions-using-tuned-model.jpg
[10]: https://www.opensourceforu.com/wp-content/uploads/2022/03/Figure-9-Evaluating-and-ploting-the-model-performance-Confusion-Matrix.jpg