Sales Forecasting with Machine Learning

Machine learning offers a way to consume large datasets and turn them into predictions. This is particularly useful in sales forecasting, where the goal is to know in advance, as accurately as possible, how much of a product will be purchased. A company with a reliable view of future demand can manage stock levels better and make better-informed market decisions.

Determining the future demand for a product is often tackled with regression analysis, which examines how different variables relate to one another.

Put simply, say I have a product about which I know the following:

  • It was sold 50 times yesterday.
  • It has been sold a maximum of 65 times in any one day.
  • It has been sold a minimum of 10 times in any one day.
  • It sells on average 52 times a day.
  • It was sold 360 times last week.
  • It was sold 1533 times last month.
  • It was sold 3231 times in the last two months.

If I want to predict how many times it will be sold tomorrow, I can try to use that information. Manually, I might say that the figure will likely fall within the minimum and maximum, so between 10 and 65. Then I might say it is likely to be close to the average of 52. The amount sold last week probably has an impact, but it is difficult to know how much. Maybe I guess at around 50, and over a lifetime I would hope to get better at guessing.
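As a rough sketch of that manual reasoning in Python, the guess could be written as a weighted blend of the figures listed above, kept inside the observed minimum and maximum. The function name `manual_guess` and the weights 0.5, 0.3 and 0.2 are my own assumptions, picked by hand purely for illustration. Choosing them by eye is exactly the guesswork described here.

```python
# Figures taken from the list above.
yesterday = 50
daily_min, daily_max = 10, 65
daily_average = 52
last_week_total = 360


def manual_guess(yesterday, daily_average, last_week_total, daily_min, daily_max):
    """Blend three daily rates with hand-picked weights, then clamp to the observed range."""
    last_week_daily = last_week_total / 7
    # Hypothetical weights: nothing in the data says these are the right ones.
    guess = 0.5 * daily_average + 0.3 * last_week_daily + 0.2 * yesterday
    return max(daily_min, min(daily_max, guess))


guess = manual_guess(yesterday, daily_average, last_week_total, daily_min, daily_max)
print(round(guess))  # prints 51
```

The result lands near 50 only because I chose weights that favour the average. A different person would choose different weights and get a different answer.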

Machine learning gives an alternative to guessing. With the information available, I can instead build a prediction machine that, through training, gains the equivalent of many lifetimes of guessing experience and can consider far more information at once than a human mind could easily handle. In machine learning I am not limited to the variables I think are important, chosen to narrow the information down to what I can easily consider. Instead I can provide everything I have and let the computer determine what is important, which can reveal surprising relationships that are not clear to a human observer.


The way to achieve this is to build a prediction engine that uses a machine learning model. First I build the model, which simply means specifying what data it should expect as input and what format of data it should produce as output.
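One way to picture this step is as a pair of plain data definitions: one for what goes in and one for what comes out. The sketch below uses Python dataclasses. The class and field names (`ProductFeatures`, `Prediction`, `sold_yesterday` and so on) are hypothetical names I have made up to mirror the example figures above, not part of any particular library.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ProductFeatures:
    """Input the model should expect (hypothetical field names)."""
    sold_yesterday: int
    daily_max: int
    daily_min: int
    daily_average: float
    sold_last_week: int
    sold_last_month: int
    sold_last_two_months: int


@dataclass(frozen=True)
class Prediction:
    """Output the model should provide: a single number."""
    units_tomorrow: float


# The example product from the list above.
example = ProductFeatures(50, 65, 10, 52.0, 360, 1533, 3231)

# Most regression models consume a flat list of numbers in a fixed order.
feature_vector = list(astuple(example))
print(feature_vector)
```

Fixing the order and type of the inputs up front means the same layout can be used for training on historical data and for predicting tomorrow.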

I then train the model. This involves providing it with historical data and observing how closely its predictions match the actual historical sales figures. By going backwards in time I can give the model data for which I already know the answer, and this allows me to adjust the model based on the data I have. Repeated iterations and modifications should, hopefully, produce an accurate model.
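To make the idea of going backwards in time concrete, here is a minimal sketch using only the Python standard library. It rests on several assumptions. The sales history is synthetic, generated by a seeded random walk rather than taken from real data. Only two inputs are used, yesterday's sales and the previous week's daily average. The fit is ordinary least squares solved in one step, swapped in for the iterative training described above because it is short enough to show in full. All function names are my own.

```python
import random


def make_history(days=120, seed=0):
    """Synthetic daily sales: a clamped random walk (illustrative, not real data)."""
    rng = random.Random(seed)
    sales, level = [], 50.0
    for _ in range(days):
        level = min(65.0, max(10.0, level + rng.uniform(-3, 3)))
        sales.append(round(level))
    return sales


def build_rows(sales, window=7):
    """For each past day t, pair what was known before t with what actually sold on t."""
    rows = []
    for t in range(window, len(sales)):
        past = sales[t - window:t]
        # The leading 1.0 lets the model learn a constant offset (intercept).
        features = [1.0, float(sales[t - 1]), sum(past) / window]
        rows.append((features, float(sales[t])))
    return rows


def fit_least_squares(rows):
    """Solve the normal equations (X^T X) w = X^T y by Gauss-Jordan elimination."""
    n = len(rows[0][0])
    a = [[sum(x[i] * x[j] for x, _ in rows) for j in range(n)]
         + [sum(x[i] * y for x, y in rows)] for i in range(n)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(weights, features):
    """A prediction is the weighted sum of the input features."""
    return sum(w * x for w, x in zip(weights, features))


history = make_history()
rows = build_rows(history)
weights = fit_least_squares(rows)

# Features for the day after the history ends, built the same way as in training.
tomorrow = [1.0, float(history[-1]), sum(history[-7:]) / 7]
print("weights:", weights)
print("predicted sales tomorrow:", predict(weights, tomorrow))
```

Each training row only uses information that would have been available before the day being predicted, which is what lets the known outcome act as the answer to check against.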

Then I evaluate the model for accuracy to get some idea of what to expect. This shows which variables are the most important and how closely the model's predictions track the historical data. Because market demand has a random element, perfect accuracy is unlikely ever to be achieved, so knowing the level of accuracy to expect is a guide to how the results should be used.
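A simple way to put a number on accuracy is the mean absolute error: the average number of units by which predictions miss. The sketch below compares a model against the naive rule "tomorrow will match today". Every figure in it is invented for illustration, including the model's predictions, and the function name `mean_absolute_error` is my own.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the miss, in units sold."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up numbers for five held-out days (illustrative only).
actual = [50, 55, 48, 60, 52]
model_predictions = [52, 53, 50, 57, 51]

# Naive baseline: predict each day's sales to equal the previous day's sales.
# 49 is the (also made-up) figure for the day before the first held-out day.
baseline_predictions = [49] + actual[:-1]

model_mae = mean_absolute_error(actual, model_predictions)
baseline_mae = mean_absolute_error(actual, baseline_predictions)

print(model_mae)     # 2.0
print(baseline_mae)  # 6.6
```

An error of 2 units on typical sales of around 50 suggests the forecast is good enough to guide stock levels, but not to the exact unit. Comparing against a naive baseline shows whether the model is adding anything over the simplest possible rule.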

With regression analysis for machine learning, I can picture the prediction engine as a funnel: it takes in a huge stream of data and makes a single prediction from it.