Table of Contents

Step 1: The Data

Step 2: Creating a Task
Step 3: Model Training and Testing
Step 4: Review Models
Step 5: Publishing the Model


This tutorial provides an overview of how to use the platform for a real-world business problem. For this example, we are using the “Dress Recommendation” dataset from the UCI Archive, which we’ve slightly modified to fit the format of this tutorial.

A dress store is trying to determine what dresses to keep stocked in the future in order to make the best use of their limited space and maximize sales. They have already gathered data on dresses sold in the past and determined which dresses were worth stocking based on their sales and other qualities. Now they have a new selection of dresses and need an AI model to predict which dresses to stock for the future.

Step 1: The Data

Every task needs a training dataset, as this is the only effective way to explain the task to a computer. The dataset used in this task contains information about past dresses and stocking recommendations:


We look for several different types of things in the dataset used for model training:

  • Features: (shown in blue) These are the columns that inform the predictions. Data should be as varied and detailed as possible to help inform the prediction.
    • Because the features are the only information the AI has about the dresses, it is essential that they are sufficient to support the prediction. Insufficient information leads to more errors in the predictions.
      • Imagine if “dress sizes” were not included: a crucial link, such as the fact that large-sized dresses do not sell well, could never be discovered.
  • Label: (shown in yellow) This is the column the model will learn to predict. It is shown filled here because this is past data used to build the model; future data may not have this column at all.
    • The future data we run through the model will have the same features but no recommendations column, since the label is exactly what the model will supply.
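The split between features and the label can be sketched in Python. The column names and values below are illustrative placeholders, not the dataset’s exact headers:

```python
# Each past dress is a row: feature columns plus the label column.
# Names and values here are hypothetical, for illustration only.
past_dress = {
    "style": "casual",      # feature
    "price": "average",     # feature
    "size": "L",            # feature
    "season": "summer",     # feature
    "recommendation": 1,    # label: 1 = worth stocking, 0 = not
}

LABEL = "recommendation"

# Features inform the prediction; the label is what the model learns to predict.
features = {k: v for k, v in past_dress.items() if k != LABEL}
label = past_dress[LABEL]

# Future data has the same features but no label column --
# filling it in is the model's job.
future_dress = {"style": "vintage", "price": "low", "size": "M", "season": "winter"}
```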

All datasets must be converted to CSV format before uploading (support for other formats is coming soon). There are also other things to consider when formatting your data for more complex tasks. For more information on formatting data, go here.
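If your data lives in another structure (a spreadsheet export, a list of records, and so on), a minimal sketch of writing it out as CSV with Python’s standard library might look like this; the records and column names are illustrative:

```python
import csv

# Illustrative records; real column names come from your own dataset.
rows = [
    {"style": "casual", "price": "average", "size": "L", "recommendation": 1},
    {"style": "vintage", "price": "low", "size": "M", "recommendation": 0},
]

with open("dresses.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()    # first line: the column names
    writer.writerows(rows)  # one line per dress
```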

Step 2: Creating a Task

Now we can use this dataset to train AI models that recommend dresses to the store managers based on sales potential.

Upon clicking the button, you will be asked to upload your data. This can be done by dragging and dropping the file or by locating it with the file browser:

Once the data is uploaded, you have another chance to review the data:

  • Data Review: (red box) Confirm that the data is displayed correctly and contains no errors. Make any corrections if needed.

After the data has been uploaded and reviewed, the task parameters are set:

Certain areas of the screen have been marked to point out the three parameters that must be specified:

  • Column to Predict: (red box) Select your label column. This is the information the model will predict after training.
    • We decided that the “recommendation” column is our label, so it is selected as the column to predict.
  • Task: (blue box) Several task types, such as prediction, forecasting, and recommendation, can be selected. These tell the platform what kind of model to build.
    • “Classification” is for discrete values (which binary decisions are), so it is chosen.
  • Metric: (green box) Picking a key metric helps you evaluate the model based on what you value most. The metrics offered here are all different ways to measure the quality of your model; the chosen “accuracy” metric measures how often the model makes the correct prediction out of all predictions made.
    • Since our sales decisions will depend on the model’s predictions, we want a model that is correct as often as possible, so we choose accuracy as our metric for evaluating the created models.
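As a concrete illustration of the accuracy metric described above (correct predictions divided by all predictions), here is a minimal sketch with made-up values:

```python
# Hypothetical stocking labels: 1 = stock the dress, 0 = don't.
actual    = [1, 0, 1, 1, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Accuracy = correct predictions / total predictions.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)  # 6 of 8 correct -> 0.75
```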

Metrics depend on the type of learning task. For other available learning tasks and metrics, please refer to Classification and Regression.

Step 3: Model Training and Testing

The process of training, testing, and creating models is entirely automatic. Still, it is not immediate, and the process can take anywhere from a few minutes to a few days depending on the type and size of the data the AI needs to learn.

While the process may be automatic, it can still be monitored:

To get a more detailed look at the process, click on the model’s name.

Step 4: Review Models

After training, the best models can be reviewed and selected:

On this screen, models can be reviewed and selected based on previously set metrics. Key points to notice are:

  • Key Metric: (red box) This is the pre-selected “important” metric that should most inform the model picking decision.
  • Detailed Info: (blue box) Here is where additional details of a certain model can be viewed and factored into the decision making process.
  • Copy ID: (green box) This is where the ID of each model can be copied to your clipboard for use in Publishing.
  • API Exporting: (orange box) If the model is to be exported to another site, this option formats it as an API and publishes it immediately. (For more information on APIs, click here.)

With the model selected, the last stage is to apply the model and make a prediction.

Step 5: Publishing the Model

A model is just a file until it is put into action; this is called “Publishing.” There are two ways to publish the model:

  • Use the platform’s built-in commands to publish the model “offline” on the user’s computer.
  • Use the API function to export the model to another site to be used by other AIs and programs.

Built-in commands use an internal program called “Eva” to function. Key things to know when using Eva:

  • Chat Window: (red box) The “conversation” with Eva is shown here and can be reviewed later. This is also where the results of the built-in commands are displayed and can be downloaded.
  • Command Box: (blue box) Where commands to Eva are typed. All commands must begin with @eva

In this scenario, the “apply model (task Id) (data Id)” command is used (shown in the red box), as it automatically applies the selected prediction model to the selected dataset.

Task and data Ids can be found by clicking the gear icon next to each model and dataset; if you want to use a specific model’s Id, it can be copied from the model review screen.

Through the use of the built-in commands, predictions are created (heavily cropped and colored here):


There are three distinct columns with each one representing a key idea concerning whether or not to stock certain dresses:

  • General Recommendations: (red column) Recommendations of whether or not to stock a dress, represented as binary 1s (yes) and 0s (no). Each 0 or 1 was chosen based on the default 0.5 (50%) threshold, which applies to this first column only.
    • Threshold: The point where a decision is made. A 0.5 (50%) threshold means that if the prediction was more than 50% certain a dress should be stocked, it said “yes” (1); otherwise it gave a “no” (0).
  • Probability of “No”: (blue column) How “certain” or “confident” the prediction was in a “no” (0) answer.
  • Probability of “Yes”: (yellow column) How “certain” or “confident” the prediction was in a “yes” (1) answer.

The second and third columns are useful for more detailed predictions using a different threshold than the default 50% of the first column.

The first row may have shown a 1 (“yes”) recommending the dress be stocked, but a breakdown of that row shows there was still a 0.43 (43%) confidence that it should not be stocked and only a 0.56 (56%) confidence that it should. This is useful information in situations like the following:

If, after using the first column, the dress store found it did not have room for all of the recommended dresses, the second and third columns could be used to pick only dresses with greater than 60% (0.6) confidence that they should be stocked. This new threshold would turn the first row’s prediction into a “no” (0).

This also works the other way: a lower threshold could be used to stock more dresses if the store has more space than the first column’s recommendations would fill.
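The re-thresholding described above can be sketched in Python. The probability values are illustrative, with the first row mirroring the 0.43/0.56 example:

```python
# Each row: (probability of "no", probability of "yes"); values are made up.
predictions = [
    (0.43, 0.56),  # the default 0.5 threshold says "yes" (1)
    (0.80, 0.20),
    (0.25, 0.75),
]

def stock_decision(p_yes, threshold=0.5):
    """Return 1 ("stock") if the 'yes' confidence clears the threshold, else 0."""
    return 1 if p_yes > threshold else 0

default  = [stock_decision(p_yes) for _, p_yes in predictions]       # 0.5 threshold
stricter = [stock_decision(p_yes, 0.6) for _, p_yes in predictions]  # 0.6 threshold
```

Raising the threshold to 0.6 flips the first row from “yes” to “no” while leaving the clearly confident rows unchanged.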
