Forecasting is the process of using historical data to predict the future. It is commonly used to predict future sales or demand so that businesses can plan ahead. Forecasting works by predicting numerical quantities for the future using observations from the past, and therefore does not require any knowledge about the future. The immediate applications of forecasting include Sales Forecasting, Manufacturing Forecasting, and Customer Retention, to name a few.

While forecasting primarily uses historical data to make predictions, that same technology is also used in our other tasks. If you happen to have prior knowledge about future transactions and want to predict the transaction amounts, you can use numeric prediction (regression) or classification as they will both still take advantage of historical transactions when making predictions.

However, while they may use similar technology, forecasting does not require a trained model to make predictions. Instead, a forecasting model directly predicts the future using the provided data. When you update the data or change the goal of the prediction when forecasting, a new prediction is made. This way, all predictions are based on the latest data available.

Forecasting Data

Forecasting uses transactional data to gain insight into the past. In the case of sales forecasting, typically, one would provide rows of individual sales transactions:

[Image FP1: example rows of individual sales transactions]

If this leads to excessively large data files, it might be easier to aggregate the transactions together by time and product:

[Image FC2: example rows of transactions aggregated by time and product]
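As a rough illustration of this kind of aggregation, the sketch below sums individual sales into one row per week and product. The sample rows, the SKU names, and the helper names (`week_start`, `aggregate_weekly`) are hypothetical and are not part of the platform; any tool that produces the same totals would work just as well.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical raw transactions: (date, SKU, sales amount in dollars).
transactions = [
    (date(2023, 1, 2), "MILK-1L", 3.50),
    (date(2023, 1, 3), "MILK-1L", 7.00),
    (date(2023, 1, 4), "YOGURT", 2.25),
    (date(2023, 1, 9), "MILK-1L", 3.50),
]


def week_start(d: date) -> date:
    """Return the Monday of the week that contains d."""
    return d - timedelta(days=d.weekday())


def aggregate_weekly(rows):
    """Sum the sales amounts for each (week start, SKU) pair."""
    totals = defaultdict(float)
    for day, sku, amount in rows:
        totals[(week_start(day), sku)] += amount
    # Sort by week and then SKU so the output is easy to read.
    return dict(sorted(totals.items()))


weekly = aggregate_weekly(transactions)
for (week, sku), total in weekly.items():
    print(week.isoformat(), sku, total)
```

Aggregating this way only makes sense because sales amounts can be summed over time, which is the same requirement placed on the target column.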

Both examples share some common information that allows our forecasting algorithms to understand your data.

  • Date/Time (Date Column): A forecasting task must always have a date/time column so that we know the order in which the transactions took place. The granularity of the time values does not matter: they may be years, months, weeks, days, minutes, or seconds.
    • Common formats are accepted and detected automatically
    • Continuous integers are also acceptable
  • Target Column (Transaction Amount Column): A forecasting task must include a numerical column (sales in this example) for which the prediction is to be made. The quantity must make sense when summed up over time, so that it is possible to predict the total transaction amount over a period of time. For example:
    • Sales amounts in dollars can be summed up over a month to compute the monthly sales.
    • Price in stock markets can’t be summed up over time.
    • Make sure the Target Column is formatted in number format.
      • For example, a value displayed as 1,234.78 is formatted incorrectly. Change the format to number, which removes the comma and displays 1234.78.
  • Reporting Columns (SKU and Store ID Columns): These columns are optional. They are the grouping key for the forecasting report. For example, a company with a national scope might want to forecast its national sales by product or SKU (see the above table), without breaking them down further into states, counties, or cities.
    • These columns are optional because you can run a forecast with just a “time” column and a numeric column. In that case, we predict the total amount without any breakdown.
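The column requirements above can be checked before uploading. The following is a minimal sketch, assuming rows arrive as dictionaries of strings with the hypothetical keys `date`, `sales`, and `sku`, that dates are ISO-formatted or plain integers, and that commas are thousands separators. It illustrates the kind of clean-up described and is not the platform's own parsing logic.

```python
from datetime import datetime


def parse_amount(text: str) -> float:
    """Convert a display-formatted number such as '1,234.78' to 1234.78.

    Assumes commas are thousands separators (US-style formatting).
    """
    return float(text.replace(",", "").strip())


def parse_timestamp(value: str):
    """Accept either a plain integer time index or an ISO date/time string."""
    try:
        return int(value)
    except ValueError:
        return datetime.fromisoformat(value)


def clean_row(row: dict) -> dict:
    """Return a row with a parsed date, a numeric target, and an optional SKU."""
    cleaned = {
        "date": parse_timestamp(row["date"]),
        "sales": parse_amount(row["sales"]),
    }
    # The reporting column is optional, so only keep it when present.
    if row.get("sku"):
        cleaned["sku"] = row["sku"]
    return cleaned


example = clean_row({"date": "2023-01-02", "sku": "MILK-1L", "sales": "1,234.78"})
print(example)
```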

The order of the rows in this kind of data does not matter. You could reshuffle the rows in any order you like and the end prediction would still be the same, because the platform automatically sorts the transactions based on the date/time column.
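To see why row order does not matter, consider the small sketch below. It uses hypothetical sample rows and a hypothetical helper, `sort_by_time`, and simply shows that sorting by the date column yields the same sequence no matter how the input was shuffled.

```python
import random
from datetime import date

# Hypothetical transactions: (date, SKU, sales amount).
rows = [
    (date(2023, 1, 2), "MILK-1L", 3.50),
    (date(2023, 1, 3), "YOGURT", 2.25),
    (date(2023, 1, 4), "MILK-1L", 7.00),
]

# Shuffle a copy with a fixed seed so the example is repeatable.
shuffled = rows[:]
random.Random(42).shuffle(shuffled)


def sort_by_time(data):
    """Sort rows by date; ties fall back to the remaining fields."""
    return sorted(data)


ordered_original = sort_by_time(rows)
ordered_shuffled = sort_by_time(shuffled)
print(ordered_original == ordered_shuffled)
```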

Forecasting Task Specifications

Using OneClick.ai to perform a forecasting task is easy, but there are a few choices one must make before a forecast can be made. We discuss two common scenarios by example.

The first one is the dairy producer case. In this example, a dairy producer is trying to predict how much dairy will be sold next month, so they can gear their milk production to make just enough milk to meet demand without creating significant waste.

After uploading the data and selecting “Forecasting” as the prediction task, a series of options will appear to customize the forecast:

  • Column to Predict: Pick the column you will be predicting.
  • Date/time column: Pick the name of the column that contains your dates and times.
  • Reporting column: Pick the name of the column that contains your reporting data.
  • Forecasting Range: Select the range of the forecast, that is, how far into the future it should predict.
    • For the dairy example, we select 4 weeks as this gives us roughly 1 month of forecasting range. 30 days would also work.
  • Forecasting Duration:
    • From: Select the time at which the forecast should begin running. This may line up with the range, but sometimes the two differ greatly.
      • In this example, we would simply pick the beginning of the month we want to predict for.
    • To: Select a time when the forecast should stop running.
      • We would pick the end of the month we are predicting.
  • Update Frequency: Select how often the forecast should update and give a new forecast.
    • Because new data would only come in every few days, we can safely pick a weekly update schedule to keep the forecast updated. Daily updates would also be acceptable.

With these settings, a dairy producer can begin forecasting dairy sales 4 weeks in advance for 1 month, with the forecast updating each week.
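The dairy settings can be written down as a small data structure to make the relationship between range, duration, and update frequency concrete. This is only a sketch: `ForecastSettings`, its field names, the column names, and the March 2023 dates are hypothetical, and it assumes that "From" and "To" bound the times at which a new forecast is issued.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ForecastSettings:
    """Hypothetical container mirroring the options listed above."""
    target_column: str
    date_column: str
    reporting_column: Optional[str]
    forecast_range: timedelta   # how far ahead each forecast looks
    start: datetime             # "From"
    end: datetime               # "To"
    update_every: timedelta     # update frequency

    def update_times(self):
        """List the moments at which a fresh forecast would be issued."""
        times = []
        current = self.start
        while current <= self.end:
            times.append(current)
            current += self.update_every
        return times


dairy = ForecastSettings(
    target_column="sales",
    date_column="date",
    reporting_column="sku",
    forecast_range=timedelta(weeks=4),
    start=datetime(2023, 3, 1),
    end=datetime(2023, 3, 31),
    update_every=timedelta(weeks=1),
)

updates = dairy.update_times()
for moment in updates:
    print(moment.date())
```

Under these assumptions, a weekly schedule across one month yields a handful of refreshed forecasts, each one looking 4 weeks ahead from the latest data.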

While the range and forecasting duration lined up in this example, in a more complex scenario, one may need to make multiple forecasts at once and they may overlap. To demonstrate this possibility we will use the example of a restaurant that wants to predict how much food needs to be prepared every 20 minutes:

*Specifications not mentioned here are identical to the previous example

  • Forecasting Range: Select the range of the forecast, that is, how far into the future it should predict.
    • For the restaurant example, we would select 20 minutes as that is the rough window between preparing food and it needing to be thrown away.
  • Forecasting Duration:
    • From: Select the time at which the forecast should begin running. This may line up with the range, but sometimes the two differ greatly.
      • For the restaurant example, if we wanted our 20-minute forecast to run throughout the day we would select the beginning of the workday.
    • To: Select a time when the forecast should stop running.
      • For the restaurant, we wouldn’t want the forecast to be running when we are closed so we would select the closing time here.
  • Update Frequency: Select how often the forecast should update and give a new forecast.
    • For the restaurant, we would probably set the frequency to every 5 minutes. That way, every 5 minutes we have a new 20-minute forecast as more food is sold and the amount of food needed changes.

Through the use of a short-range forecast (20 minutes) running throughout the day and updating every 5 minutes, a restaurant can have an accurate look into the future and know how much food needs to be prepared for the next 20 minutes. By preparing only the difference between the forecast and what is already on hand, they avoid wasting food while still being able to fill the predicted orders.
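The overlap between forecasts in the restaurant example can be sketched as a list of rolling windows. The function name `forecast_windows`, the opening hours, and the choice to clip the last windows at closing time are assumptions made for illustration, not a description of how the platform schedules forecasts internally.

```python
from datetime import datetime, timedelta


def forecast_windows(open_time, close_time, forecast_range, update_every):
    """Return (issued_at, window_end) pairs for a rolling forecast.

    A new forecast is issued every `update_every` while the restaurant is
    open, and each one covers the next `forecast_range`, clipped at closing.
    """
    windows = []
    issued_at = open_time
    while issued_at < close_time:
        window_end = min(issued_at + forecast_range, close_time)
        windows.append((issued_at, window_end))
        issued_at += update_every
    return windows


# Hypothetical lunch service from 11:00 to 12:00.
windows = forecast_windows(
    open_time=datetime(2023, 3, 1, 11, 0),
    close_time=datetime(2023, 3, 1, 12, 0),
    forecast_range=timedelta(minutes=20),
    update_every=timedelta(minutes=5),
)

for issued_at, window_end in windows:
    print(issued_at.strftime("%H:%M"), "->", window_end.strftime("%H:%M"))
```

Because the update interval is shorter than the range, each window starts before the previous one ends, which is what lets the forecast adjust as more food is sold.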

Reporting

Once the forecasting parameters are set, the forecasting models will be trained on your dataset and several forecasting models will be available for use. Once available, any model created can be applied to any dataset uploaded to OneClick.ai or exported to a custom API. For detailed instructions on how to do both actions please see our Walkthrough.

Once the report file is created, it can be downloaded and viewed. Using the previous dairy sales example, with daily sales predicted over a period of a month, opening the report file in Excel shows this information:
[Image FP3.png: the forecasting report file opened in Excel]
The information given for each row is:
  • Starting DateTime: Each row covers one interval of time, and the starting DateTime is when that interval begins. Since this report contains a new prediction for each day, the starting DateTime is the beginning of each day.
  • Ending DateTime: This is when the interval for that row ends. In this example, it is the end of the day that began at the starting DateTime.
  • Reporting Column (Product ID): This is the optional column selected to attribute sales to specific products. It exists so that the company can keep track of sales per product, rather than serving as a modeling step.
  • Prediction (Sales): This is where the predicted value for each moment in time is reported.

From this daily sales information, we can predict how many sales we will have in total each week or month. With that information, dairy production can be increased or decreased to precisely meet predicted demand.
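Rolling the daily predictions up into monthly totals could look like the sketch below. The report rows, product IDs, and the helper name `monthly_totals` are hypothetical; the real report file would first need to be read from the downloaded spreadsheet.

```python
from collections import defaultdict
from datetime import date

# Hypothetical report rows: (starting date, product ID, predicted sales).
report = [
    (date(2023, 3, 1), "MILK-1L", 120.0),
    (date(2023, 3, 2), "MILK-1L", 130.0),
    (date(2023, 3, 1), "YOGURT", 40.0),
    (date(2023, 4, 1), "MILK-1L", 110.0),
]


def monthly_totals(rows):
    """Sum daily predictions into one total per (year, month, product)."""
    totals = defaultdict(float)
    for start, product, predicted in rows:
        totals[(start.year, start.month, product)] += predicted
    return dict(totals)


totals = monthly_totals(report)
for (year, month, product), total in sorted(totals.items()):
    print(f"{year}-{month:02d}", product, total)
```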

Tips For Better Forecasting

  1. Provide as much history as possible. This helps to capture seasonality or yearly changes.
  2. It is okay to aggregate if your data is very large. We can create a model using either aggregated or raw data.
  3. It is okay if some products (reporting columns) don’t have transactions that go back to the same date. Simply provide all transactions recorded.
  4. The prediction window typically should be shorter than the history provided, otherwise we will not have enough data to validate a prediction. That leaves the prediction untrustworthy. So the range of the forecast shouldn’t be too far into the future from your data.
  5. Prediction windows also shouldn’t be too small. They should be at least the same scale as the transactions provided.
    • For example, we can’t predict daily sales if the history is provided as weekly sales. In addition, data that is too detailed adds randomness. For some products, daily sales data may be scarce, so it is more reliable to predict sales for a month than for a single day.
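Tips 4 and 5 can be expressed as a quick sanity check on a planned forecast. The function below is a hypothetical helper, not a platform feature, and the two conditions are a simplified reading of the tips: the window should be shorter than the history, and no finer than the granularity of the data.

```python
from datetime import date, timedelta


def check_prediction_window(history_start, history_end, data_granularity, window):
    """Return a list of warnings; an empty list means both checks pass."""
    warnings = []
    history_length = history_end - history_start
    if window >= history_length:
        warnings.append("window is not shorter than the history provided")
    if window < data_granularity:
        warnings.append("window is finer than the granularity of the data")
    return warnings


# Two years of weekly history.
history = (date(2021, 1, 1), date(2023, 1, 1))
weekly = timedelta(weeks=1)

print(check_prediction_window(*history, weekly, timedelta(days=1)))
print(check_prediction_window(*history, weekly, timedelta(weeks=4)))
```

The first call flags a daily window against weekly data, while the second, a 4-week window, passes both checks.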

