Executing
Now that the model is trained, we don’t want to re-train it every time we need to run classification or regression on new data. Because we saved the trained model to the document library, we can retrieve it and execute it against a dataset using an executor model.
The model takes the following parameters:
- training_model_name: the name of the model created in the training section of the tutorial.
- target: the target column to fill with the prediction, or leave blank for the default of result.
- data: tabular data that contains every column used as a feature in the trained model, with as many rows as should be processed.
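To make the role of these parameters concrete, here is a minimal sketch of what an executor does with them. It is not the code from the Akumen executor repository: `run_executor` and `ThresholdModel` are hypothetical names, and the stand-in model only mimics the `predict` method that scikit-learn style estimators expose.

```python
from typing import Dict, List, Sequence


class ThresholdModel:
    """Hypothetical stand-in for a trained model loaded from storage."""

    def __init__(self, feature_names: Sequence[str]) -> None:
        self.feature_names = list(feature_names)

    def predict(self, rows: List[List[float]]) -> List[int]:
        # Toy rule for illustration only: flag rows whose first feature exceeds 15.
        return [1 if row[0] > 15 else 0 for row in rows]


def run_executor(model, data: List[Dict[str, float]], target: str = "") -> List[Dict[str, float]]:
    """Return a copy of each row with the prediction stored in the target column."""
    column = target or "result"  # a blank target falls back to "result"
    missing = [name for name in model.feature_names if data and name not in data[0]]
    if missing:
        raise ValueError(f"data is missing feature columns: {missing}")
    # Pass the features in the order the model was trained on.
    features = [[row[name] for name in model.feature_names] for row in data]
    predictions = model.predict(features)
    return [{**row, column: pred} for row, pred in zip(data, predictions)]


model = ThresholdModel(["radius_mean", "texture_mean"])
rows = [{"radius_mean": 17.99, "texture_mean": 10.38}]
print(run_executor(model, rows))
```

Leaving `target` blank writes the prediction to a `result` column, while passing a name such as `"diagnosis"` writes it there instead.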
Examples are provided below for retrieving the trained model data from either the Akumen Document Manager or a third-party cloud storage service.
Akumen Document Manager
To create an ML execution model, you can do the following:
- Go to the App Manager, and select Create Application -> Python Model, named ML Executor - Breast Cancer.
- Click the Git Clone button on the toolbar, and enter the git url: https://gitlab.com/optika-solutions/apps/auto-sklearn-executor-document.git. You can leave the username and password blank, and the branch on master. Click ok.
- Go to the research grid and enter the following:
  - training_model_name: ML Trainer - Breast Cancer
  - target: leave blank.
  - data: see below.
As a sample, you can use the following for data (save the contents as a CSV and upload to the data spreadsheet):
radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,fractal_dimension_mean,radius_se,texture_se,perimeter_se,area_se,smoothness_se,compactness_se,concavity_se,concave points_se,symmetry_se,fractal_dimension_se,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
17.99,10.38,122.8,1001,.1184,.2776,.3001,.1471,.2419,.07871,1.095,.9053,8.589,153.4,.006399,.04904,.05373,.01587,.03003,.006193,25.38,17.33,184.6,2019,.1622,.6656,.7119,.2654,.4601,.1189
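Because data must contain every feature column the model was trained on, it can be worth checking a CSV before uploading it. The sketch below uses only the Python standard library. `missing_columns` is a hypothetical helper, and the expected column list is shortened to three names for illustration rather than listing all 30 features.

```python
import csv
import io
from typing import List


def missing_columns(csv_text: str, expected: List[str]) -> List[str]:
    """Return the expected feature columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    # An empty file has no header row, so every expected column counts as missing.
    header = [name.strip() for name in next(reader, [])]
    return [name for name in expected if name not in header]


sample = "radius_mean,texture_mean,perimeter_mean\n17.99,10.38,122.8\n"
expected = ["radius_mean", "texture_mean", "concave points_mean"]
print(missing_columns(sample, expected))
```

Note that some feature names, such as concave points_mean, contain a space, so header names should be compared exactly rather than normalised.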
Execute the scenario. Once completed, go to the data tab and find the result column.
Third Party Cloud Storage
Alternatively, if an Amazon S3 bucket was used for the ML Training model, the steps below can be used to retrieve the trained model data.
The model takes the following parameters:
- training_model_name: the name of the model created in the training section of the tutorial.
- joblib_location: see below.
- target: the target column to fill with the prediction, or leave blank for the default of result.
- data: tabular data that contains every column used as a feature in the trained model, with as many rows as should be processed.
joblib_location JSON:
{
  "provider": "s3",
  "bucket": "model-bucket",
  "key": "xxx",
  "secret": "xxx",
  "region": "ap-southeast-2"
}
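Since this JSON carries credentials, one option is to assemble it from environment variables rather than typing the key and secret by hand. This is a sketch only: the environment variable names `MODEL_S3_KEY` and `MODEL_S3_SECRET` and the helper `build_joblib_location` are assumptions for illustration and are not part of Akumen. Treating all five fields as required is also an assumption based on the example above.

```python
import json
import os

# Fields shown in the joblib_location example; assumed here to all be required.
REQUIRED_FIELDS = ("provider", "bucket", "key", "secret", "region")


def build_joblib_location(bucket: str, region: str) -> str:
    """Build the joblib_location JSON string, reading credentials from the environment."""
    location = {
        "provider": "s3",
        "bucket": bucket,
        # Hypothetical variable names; use whatever your environment defines.
        "key": os.environ.get("MODEL_S3_KEY", ""),
        "secret": os.environ.get("MODEL_S3_SECRET", ""),
        "region": region,
    }
    empty = [field for field in REQUIRED_FIELDS if not location[field]]
    if empty:
        raise ValueError(f"joblib_location has empty fields: {empty}")
    return json.dumps(location, indent=2)


# Placeholder values so the sketch runs; real credentials would already be set.
os.environ.setdefault("MODEL_S3_KEY", "xxx")
os.environ.setdefault("MODEL_S3_SECRET", "xxx")
print(build_joblib_location("model-bucket", "ap-southeast-2"))
```

The printed string can then be pasted into the joblib_location parameter.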
To create an ML execution model, you can do the following:
- Go to the App Manager, and select Create Application -> Python Model, named ML Executor - Breast Cancer.
- Click the Git Clone button on the toolbar, and enter the git url: https://gitlab.com/optika-solutions/apps/auto-sklearn-executor.git. You can leave the username and password blank, and the branch on master. Click ok.
- Go to the research grid and enter the following:
  - training_model_name: ML Trainer - Breast Cancer
  - joblib_location: see above.
  - target: leave blank.
  - data: see below.
As a sample, you can use the following for data (save the contents as a CSV and upload to the data spreadsheet):
radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,fractal_dimension_mean,radius_se,texture_se,perimeter_se,area_se,smoothness_se,compactness_se,concavity_se,concave points_se,symmetry_se,fractal_dimension_se,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
17.99,10.38,122.8,1001,.1184,.2776,.3001,.1471,.2419,.07871,1.095,.9053,8.589,153.4,.006399,.04904,.05373,.01587,.03003,.006193,25.38,17.33,184.6,2019,.1622,.6656,.7119,.2654,.4601,.1189
Execute the scenario. Once completed, go to the data tab and find the result column.