Forecaster’s & Planner’s Guide To Data

In supply chain and operations, raw materials are the substances used to manufacture goods. They are commodities to be transformed into another state so they can be used or sold. For algorithms and predictive models, data is the raw material that every insight begins with.

A piece of data, or a collection of it, can drive a predictive analytics process and uncover insights. Data are the building blocks and inputs, and without data it is nearly impossible to find answers and make decisions. That said, data is not the destination. Data is not a decision. And while data may take many forms and serve many purposes, data by itself is not insight.

Data Is Information In Its Raw Form

Information is a collection of data points that we can use to understand something about the thing being measured.

Insight is gained by analyzing data and information to understand what is going on with a particular thing or situation. The insight can then be used to make better business decisions.

Data on its own is meaningless. It is just a raw material that needs to be transformed, analyzed, turned into understanding, and shared by people with the skills, training, and commitment to do so. At the same time, predictive modeling or any business insight without data is equally meaningless. No matter how skilled you are, or how good your model is, it is like trying to produce a finished product without the proper parts.

There’s no arguing the power of data in today’s business landscape. Businesses are analyzing a seemingly endless array of data sources to glean insights into just about every activity, both inside their businesses and out. Right now, it seems that enterprises cannot get their hands on enough data for analysis. They are collecting multiple sources and forms of data to learn more about customers and markets, and to predict how they will behave.

What Are The Different Types Of Data?

We can think about data in terms of how it is organized, as well as where it comes from. Data may be either structured or unstructured, and its source may be either internal or external.

Knowing what types of data you have, and where they come from, is crucial in the age of Big Data and analytics.

Internal Sources: Internal sources of data are those which are procured and consolidated from different branches within your organization. Examples include: purchase orders, internal transactions, marketing information, loyalty card information, information collected by websites or transactional systems owned by the company, and any other internal source that collects information about your customers.

Before you begin to look for external sources, it’s critical to ensure that all of your business’s internal data sources are mined, analyzed, and leveraged for the good of the company. While external data can offer a range of benefits (which we’ll get into later), internal data sources are typically easier and quicker to collect, and can be more relevant for the company’s own purposes and insights.

External Sources: External sources of data are those which are procured, collected, or originate outside of the organization. Examples include external POS or inventory data from a retail partner, paid third party information, demographic and government or other external site data, web crawlers, macroeconomic data, and any other external source that collects information about your customers. Collection of external data may be difficult because the data has much greater variety and the sources are much more numerous.

Structured Data: Structured data is both highly organized and easy to digest, and generally refers to data that has a defined length and format. It is sometimes thought of as more traditional data, and may include names, numbers, and other information that is easily formatted into columns and rows. Structured data is largely managed with legacy analytics solutions given its already-organized nature, and it may be collected, processed, manipulated, and analyzed using traditional relational databases. Before the era of big data and new, emerging data sources, structured data was what organizations used to make business decisions.

Unstructured Data: Unstructured data does not have an easily definable structure; it is unorganized and raw, and typically isn’t a good fit for a mainstream relational database. It is essentially the opposite of structured data and includes everything generated through the variety of human activities: comments on web pages, word processing documents, videos, photos, audio files, presentations, and many other kinds of files that do not fit into the columns and rows of an Excel spreadsheet.
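To make the contrast concrete, here is a minimal Python sketch (the order records and review text are invented for illustration) showing why one form drops straight into rows and columns while the other needs extraction work first:

```python
import pandas as pd

# Structured: defined columns, fixed formats -- fits rows and columns
# and loads straight into a relational table or DataFrame.
orders = pd.DataFrame(
    {"order_id": [1001, 1002], "sku": ["GL-250", "GL-500"], "qty": [24, 6]}
)
print(orders.dtypes)  # every column has a defined type

# Unstructured: free text with no fixed length or format. Any structure
# (sentiment, product mentions, complaints) must be extracted first.
review = "Loved the glasses, but two arrived chipped and shipping was slow."
mentions_damage = "chipped" in review.lower()  # crude feature extraction
print(mentions_damage)
```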

These new data sources are made up largely of streaming data coming from social media platforms, mobile applications, location services, and Internet of Things technologies. Because unstructured data sources are so diverse, businesses have much more trouble managing them than they do traditional structured data. As a result, companies are being challenged in a way they weren’t before and are having to get creative in order to pull relevant data for analytics.

Don’t Get Left Behind When It Comes To Data

You may believe that only the largest companies, with massive funding and technology, are implementing data analytics and pushing the limits of the types of data that can be collected. While 90% or more of the data companies use today is internal structured data, it is important to understand that 90%-plus of the data ‘out there’ (external data) is unstructured.

With this increase in data and the need to be competitive, along with the expansion of data storage capabilities and data analytics tools, the playing field has leveled. While data is not insight, new forms and types of data have given rise to demand for newer insights, and this focus on data has embedded itself into the culture of more and more businesses.

 

Eric will reveal how to update your S&OP process to incorporate predictive analytics and adapt to the changing retail landscape at IBF’s Business Planning, Forecasting & S&OP Conferences in Orlando (Oct 20-23) and Amsterdam (Nov 20-22). Join Eric and a host of forecasting, planning and analytics leaders for unparalleled learning and networking.

What Is Predictive Analytics?

As a discipline, predictive analytics has been around for decades and has long been a hot topic in academia. Its application in the field of demand planning, though, is still relatively untapped, and its effective uses in business are more the exception than the rule. Despite the mass of information available to us, and machine learning algorithms that can model the supply chain for insights, companies have barely scratched the surface with data analytics.

Part of the reason for this may be confusion between traditional demand planning and predictive analytics. Demand is for “something”, whether a product or a service. It manifests itself as a sale to an end user, an order, a shipment, an inter-plant transfer, a distribution requirement, and so on. Demand planning is the process and set of techniques used to create a demand plan.

Broadly speaking, there are two approaches to demand forecasting: one is to obtain information or make assumptions about patterns of past purchases; the other is to obtain information or make assumptions about external factors and the likely purchase behavior of the buyer.

Predictive Analytics Tells Us Not Only What Will Happen, But Why

While predictive analytics can be used to develop a demand plan, more often than not demand planners still use only past demand to forecast demand. Predictive analytics does not just forecast demand itself; it uses systematic computational analysis of data (analytics) to try to determine why. Demand planning creates an estimate of demand; predictive analytics creates an evaluation of what the future may be “if”.

Predictive Analytics Interprets A Wide Range Of Factors Affecting Demand

Predictive analytics is the discipline of extracting information from data sets and using advanced statistical algorithms, or even machine learning techniques, to identify the likelihood of an unknown future outcome. The goal is to go beyond knowing what has happened to providing a best assessment of why, and of which drivers will impact what occurs in the future. Predictive analytics takes a more humanistic, and sometimes intuitive, approach: instead of relying on past historical activity alone, it analyzes the influencers, interactions, and activities of the actors (consumers) behind demand.

Traditional Demand Planning asks – what did the item do last year? 

The new era of Predictive Analytics asks – what does the consumer do when this happens?

Because of this, it is more than just a forecast of how much of an item we will sell next month; it opens a door to many more insights. That door is not limited to the supply chain either: it brings the predictive analytics professional into every function and can add value to every business decision.

It Allows For Micro Targeting Campaigns

Applied to business, predictive models are used to analyze current data and historical facts in order to better understand customers, products, and partners, and to identify potential risks and opportunities for a company. They can be used for micro-targeting campaigns to gain strategic advantage, or to determine the color and font on a website that drives the most traffic. As an online retailer, with predictive analytics you can understand how your page ranking, number of comments, ratings, and “winning the buy box” on Amazon impact your sales on any given day.

This translates into greater consumer loyalty, less customer churn, and customized experiences that create a higher probability of sales.

Predictive analytics has grown in prominence alongside the emergence of big data systems. As enterprises have amassed larger pools of data, they have created increased data mining opportunities to gain predictive insights. Heightened development and commercialization of machine learning tools have also helped expand predictive analytics capabilities. Because of this and the changing business environment, professionals in our field will continue to migrate to new ways of modeling and planning and start to see cross-over of these predictive models into traditional forecasting and planning as well.

Revealing the future by getting into the heads of consumers, rather than by analyzing the history of the item, can pay enormous dividends. And with the abundance of real-time consumer data available today, future demand for your organization’s products and services may be determined more precisely using predictive analytics than by relying solely on traditional demand planning processes.

I’ll be speaking at the Predictive Business Analytics, Forecasting & Planning Conference in New Orleans from May 6-18, 2019. It’s a 2-day conference with an optional data science workshop, designed to get you up to speed with the basics of predictive analytics, or get you to the cutting edge of the field with the latest methodologies, tools and best practices.

Beware The Correlation/Causation Forecasting Trap

“Shallow men believe in luck or in circumstance. Strong men believe in cause and effect.” – Ralph Waldo Emerson

It would be easy to see the demise of Sears Roebuck retail stores as a triumph for e-commerce. But if you compare newspaper advertising revenue to the annual revenue of Sears, an obvious pattern emerges. Newspaper revenue rose steadily from the 1950s, tracking the growth of Sears, and then fell in step with its decline. The fortunes of the newspaper industry and Sears both peaked around the turn of this century, with a small resurgence in 2006, only for the former to decline and the latter to be forced into bankruptcy.

Looking at this data, it is clear that what Sears must do to rejuvenate itself is pour all of its remaining marketing dollars into newspaper print advertising: boosting newspaper revenue will save Sears from pending liquidation.

Obviously.

If you did not pick up on my hint of sarcasm, and on how this is an example of the difference between causation and correlation, don’t feel too bad. Newspaper revenue and Sears’ revenue are only a spurious correlation, much like linking ice cream sales to the murder rate in New York City.

My point is that humans are evolutionarily predisposed to see patterns, and psychologically inclined to fill in the blanks of information to make connections.

The Human Brain Looks For Correlation, Even If There Is None

The question of cause versus correlation has haunted science and philosophy from their earliest days, and it still dogs our heels for numerous reasons. You would think that with the vast number of articles warning of its perils this would no longer be a problem, but we still get fooled.

We confuse coincidence with correlation, and correlation with causality. Just think of the last time something great happened when you had those special socks on – admit it, they are now your “lucky” socks even though you know deep down it had to be coincidence.

Unfortunately, we run into this problem in business planning and forecasting as well. Of course, if we stock out or have too much inventory, the “cause” is the forecast. With poor planning we may see a correlation between forecast error and missed sales, or between long supplier lead times and excess inventory; in both scenarios, the cause is most likely not the forecast.

Forecasts do not cause excess inventory; uncertainty (in supply and demand) does. Likewise, forecast error is not always the result of good (or bad) forecasting, but is indicative of the uncertainty of the demand you are actually measuring.

Forecasts Do Not Reduce Demand Variability

Forecast accuracy is ultimately limited by the nature of the behavior we are trying to forecast. Accuracy is the degree of closeness of a stated quantity to that quantity’s actual (true) value. Whilst I accept that one’s ability to create an accurate forecast is correlated with demand variability, we also need to remember that an accurate forecast does not reduce demand variability. Demand variability is an expression of how much demand changes over time and, to some extent, of the predictability of the demand.
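As a rough illustration, demand variability is often summarized with the coefficient of variation (standard deviation divided by mean). Here is a minimal Python sketch using invented weekly demand figures:

```python
import statistics

demand = [88, 110, 71, 95, 240, 84, 102, 90]  # hypothetical weekly units

mean = statistics.mean(demand)
stdev = statistics.stdev(demand)
cv = stdev / mean  # coefficient of variation
print(f"mean={mean:.1f}, stdev={stdev:.1f}, CV={cv:.2f}")

# A high CV signals volatile demand -- and a ceiling on how accurate any
# forecast of this series can be, regardless of the method used.
```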

Analysis suggests that the more we can do to reduce volatility in demand for our products, the better we should be at predicting demand. Unfortunately, most organizational policies and practices are designed to add volatility to demand rather than make it more stable. We continue to contribute to SKU proliferation, shorter life cycles, and more complex channels with dynamic marketing, and still our business partners attribute missing the forecast to demand planners.

Historical Patterns Always Repeat, Right?!

There are other ways we can become victim to the deception of causation.  Inexperienced forecasters, and those outside of forecasting, may assume that history will tell us exactly what will happen in the future. If we sold 2,000 units last year, then we will sell 2,000 this year as well. If we are using sophisticated software and the fitted model is showing 10% error, then we should expect no more than 10% variation. If there is an obvious historic pattern, then it will obviously repeat itself with the same predictability.

And we are surprised when this doesn’t happen. This mindset incorporates bad assumptions because we fall into the trap of correlation versus causation.

Know The Limitations Of Time Series Forecasting

Time series analysis looks for correlations in successive observations, not causation. One of the five laws of demand planning is that you cannot expect the same results from forecasting if environmental conditions change. This seems obvious, but we still slip into unwittingly thinking that seasonality will cause a lift, forgetting that the seasonal index is only a correlation; the underlying cause could be weather conditions or holidays that have whole other dependent variables.
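To see why a seasonal index is only a summary of past co-movement, consider a minimal sketch of how a classical index is computed (the quarterly figures are invented):

```python
# Each quarter's average demand divided by the overall average.
quarterly_demand = {
    "Q1": [100, 95, 105],
    "Q2": [120, 130, 125],
    "Q3": [80, 85, 75],
    "Q4": [160, 150, 170],
}

overall = sum(sum(v) for v in quarterly_demand.values()) / sum(
    len(v) for v in quarterly_demand.values()
)
indices = {q: (sum(v) / len(v)) / overall for q, v in quarterly_demand.items()}
print(indices)  # Q4 index ~1.38: Q4 runs ~38% above the overall average

# The index is a ratio describing what demand *did*. It says nothing
# about *why* Q4 was high (weather? holidays? promotions?), so it cannot
# guarantee the lift will repeat if those underlying drivers change.
```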

We wear our historic data like a pair of lucky socks, without ever understanding the cause, and we state it as absolute. We become comfortable and do not do enough to look at the cause rather than just the correlation.

In today’s business environment, changes in the marketplace are swift, sudden, and may not follow the historical correlation. Just looking at historic shipments alone may not give you what you need or tell the whole picture.

No matter how many articles are written on causation versus correlation, people are wired to focus on descriptive analytics and on making real or imaginary connections to what happened. But we come from a different background: one that must use the data to infer what is going to happen next. To accomplish what we need, we may not always fully understand the “why” but instead use algorithms to help reveal “what” the future holds. We just need to remember that one thing may not cause the other, but merely rhyme with it, moving in a similar direction and manner.

That said, I too will most likely have my lucky orange striped socks on for the next special executive S&OP meeting, or when I am speaking at a conference.

 

Attention Demand Planners: Is Your Thinking Linear Or Probabilistic?

Research shows that 5 out of 4 business professionals do not understand statistics (pun intended). Most people don’t have a natural affinity for probability either. But probabilistic thinking lies behind predictive analytics, becoming a data driven organization, and the next wave of demand planning.

It’s not intuitive to most people, and we tend to think linearly or are still stuck in random thinking. Learning the mechanics of probabilistic thinking takes time and unless you are engrossed in it frequently, it does not come naturally.

Most Of Us Think Linearly, And That’s A Problem In Planning

For example, let’s take a hypothetical planner named Lauren, who forecasts that we will sell 93 units in the next period. We may know that this is only a single best estimate, and may even present it as such, but what other people hear is “blah blah blah, 93 units.” That is, they treat the forecast as just a single number, even though predicting an exact figure with any degree of confidence is impossible. People, and therefore most companies, think linearly and focus on a number when they should be thinking about the risk or opportunity associated with that number.

But Other Functions Need An Exact Number

But in many ways, we are providing what others want and need. Individuals have difficulty comprehending multiple truths and therefore desire “the” answer. Most MRP systems and supply chains require a discrete signal to drive supply planning and operations, and struggle with uncertainty. Finance is looking for “the” number of what sales will be, so they can plan their P&L to the penny. To accommodate everyone, and to cater to people’s lack of understanding of probabilities, demand planning provides a single-point, or deterministic, forecast of what will occur. A deterministic forecast is one in which the outcome is a single point determined through known relationships or patterns, without any room for random variation.

As Humans, We Tend To Struggle With Understanding Probabilities

Look at a common recurring miracle that happens in organizations every year. The forecast is 93 units, and yes there is upside, but the most we will ever sell is 153 units. That is the 95%-confidence bound, so there is less than a 5% chance we would ever sell more than that in a given week. And then the miracle occurs: sales far exceed expectations for a single period and everyone is shocked. Consider that a 5% chance of exceedance in any one week compounds to roughly a 20% chance of at least one exceedance over a 4-week period, and just shy of a 50% chance over a quarter; over the course of a year, you should expect about 2.6 such miracles.
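If you want to verify that arithmetic yourself, a few lines of Python do the compounding:

```python
p_exceed = 0.05  # 5% chance per week of exceeding the 95% upper bound

for weeks, label in [(4, "4-week month"), (13, "quarter"), (52, "year")]:
    p_at_least_one = 1 - (1 - p_exceed) ** weeks
    print(f"{label}: P(at least one 'miracle') = {p_at_least_one:.0%}")
# 4-week month: ~19%, quarter: ~49%, year: ~93%

# Expected number of exceedance weeks in a 52-week year:
print(f"expected miracles per year: {52 * p_exceed:.1f}")  # 2.6
```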

Whilst this may be slightly confusing, and you may have to pause and reread that paragraph a couple of times, the point is that we all get caught thinking the implausible is impossible because we do not comprehend probabilities. This is just one subtle example of many. The more egregious ones come in the form of confusing accuracy and probability. And no, 55% MAPE does not necessarily mean you could do better with a two-sided coin.

Embrace Ambiguity In Your Forecasting & Planning

So, if so many people do not understand probabilities, and so many are asking only for a single deterministic number, why is this important?

At the core it is what demand planning and predictive analytics does. Two of the golden rules of demand planning clearly state:

  • Demand Plans are rarely precise.
  • Demand Plans can be accurate and should include probability and/or an estimate of error.

We as demand planners live in the world of ambiguity and uncertainty, and transform it into insights the business can use. More than managing numbers, we manage assumptions and need to understand their individual contribution. We use weighting and ratios and work towards the best fit of our data sets to the right model to minimize error or uncertainty and provide answers.

Our world is changing as well, and we need to adapt. Predictive analytics and probabilities may just be the train taking us into the future. We have already seen a shift from traditional time series modeling to predictive analytics, driven by omnichannel and e-planning, with much of it powered by regression models or even more sophisticated machine learning and probabilistic forecasting.

Black Swans Are Real, & Probabilistic Thinking Allows Us To Be Ready When They Happen

Most importantly, whether the business knows it or not, it needs probabilities, and to better understand likelihoods along with uncertainty. Black swans are real, and we need to understand what we can mitigate and what we will accept. Whilst many companies are doing this already, data-driven organizations are using analytics, demand planning, and above all probabilities to drive advanced decision making. For supply chain, it is more than 93 units with a coefficient of variation of 10%: it is understanding what to do when you have a 70% chance of selling 5% less and a 30% chance of selling 50% more (how would you plan for this?). For finance, it is a better understanding of cash flow; for marketing, it is about identifying the micro-targeted ads that are likely to succeed.
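As a minimal sketch of what planning against that distribution looks like, here is the 93-unit scenario from above, probability-weighted in Python:

```python
plan = 93  # single-point forecast, in units
scenarios = [
    (0.70, plan * 0.95),  # 70% chance: sell 5% below plan (~88 units)
    (0.30, plan * 1.50),  # 30% chance: sell 50% above plan (~140 units)
]

expected = sum(p * qty for p, qty in scenarios)
print(f"probability-weighted demand: {expected:.0f} units")  # ~104

# Stocking to the 93-unit point forecast leaves the 30% upside entirely
# uncovered; the shape of the distribution, not the point estimate,
# should drive the inventory and capacity decision.
```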

Probabilistic forecasts allow us to identify the likelihood of a black swan event occurring, meaning we can be prepared when they do happen.

Your Talent For Probabilistic Thinking Will Help Your Company Become Data Driven

Even though the benefits are great, thinking in these terms is neither natural nor common. The good news is that probabilistic thinking is like a muscle: the more you use it, the easier it becomes. It is our responsibility as predictive analytics and demand planning professionals to embrace it and to push others to see what we see: not only because it is our job and an important tool in our toolbox, and not only because it is becoming a necessity in the new e-planning environment, but because we can help move the company towards being a data-driven organization.

 

What Is Quick Response Forecasting?

Some time ago I was on a call with IBF board members discussing situations where social media data signals rapid changes in demand for a product. These data might include some favorable online reviews of products or a celebrity mentioning a certain product, leading to a rise in demand. These are great “predictive” downstream demand signals, but since they cause very rapid spikes in demand, current forecasting processes could not incorporate the information quickly enough to act upon it. I blurted out “We need to start thinking about Quick Response Forecasting techniques”, making up the term on the spot.

Since then, Quick Response Forecasting (QRF) has gained a lot of traction among forward-thinking demand planners and forecasters, and I recently wrote a detailed piece on it in the Journal of Business Forecasting. Now we are at the point where demand planning organizations can identify very real applications for QRF in their own companies, and develop a framework to turn advance knowledge of demand changes into meeting spikes in demand with sufficient supply.

What Is Quick Response Forecasting?

Quick Response Forecasting is updating forecasts in line with ‘real’ and rapid changes in demand, both during and between planning cycles. Data sources can be POS data or unstructured data like social media comments.

Quick Response Forecasting Solves The Availability Issue In Case of Spikes In Demand

A contact of mine works for a company that sells nail polish. Lady Gaga, at one of her concerts, wore a unique color of nail polish that his company sells. Shortly thereafter, social media lit up and sales of this item went through the roof. The company and its supply chain partners ran out of the product, as well as a key ingredient that went into making it. The company was not prepared to take full advantage of the rapid change of demand for this nail polish and missed out on a significant revenue opportunity. Had QRF been employed by the company, they would have been able to react quickly enough to ensure enough inventory was available.

Quick Response Forecasting Makes Sense Of Big Data

QRF leverages predictive analytics, social media information, and other Big Data. This relates to the explosion in digital data and the enormous amount of information available about customers and product users on the World Wide Web. As data get bigger, companies are looking for techniques and methods to both find and incorporate a few key demand signals among Big Data’s noisy information deluge.

Quick Response Forecasting Supports Short-Cycle Planning

Like the nail-polish demand spike mentioned above, supply chains often cannot act quickly enough to take full advantage of the opportunity that the demand signal offers. More frequent forecasting has the potential to increase forecast accuracy in terms of identifying rapid changes in demand but supply chain responsiveness may be too sluggish to take full advantage of it. For example, manufacturing managers might complain about getting whipsawed by demand forecasts that change rapidly, despite their increased accuracy.

One observation I often repeat with regard to planning responsiveness came from a manager who ran the S&OP process at a high-tech company. Generally, these types of companies operate “responsive” rather than “efficient” supply chains. Responsive supply chains handle high-margin, high-value products.

Their major goals are less about minimizing operating costs and inventories, and more about maximizing inventory availability at the point of sale/consumption in order to capture potential upside revenue. One thing QRF is not, though, is supply chain optimization.

How QRF Can Work In Practice

A typical S&OP process is a routine planning process that would be too disrupted by having to incorporate QRF, so it is not a good candidate process. QRF is needed to support teams that are put in place specially, on an ad hoc basis, to manage significant event-based demand changes such as natural disasters, celebrity endorsements, and the like. These teams should be cross-functional and supplemental to the S&OP process. They need to be quickly assembled once an ongoing QRF forecasting organization detects that a demand spike or significant demand change has occurred, or is likely to occur.

Once the team is put in place, QRF forecasts for the event need to be continually provided to the quick response supply team. If forecasters ask managers whether they need QRF today, they will likely say no. They don’t want their operations to be whipsawed by frequently changing forecasts. This is why a separate and special quick-response supply process will be needed to handle each event.

If you are on the lookout for an organization to partner with to develop QRF and supply response teams, your sales organization is the best bet. The teams are focused on going after significant revenue opportunities that current processes are too slow to take advantage of. If you can identify highly lucrative revenue opportunities, Sales will jump at the chance to exploit them. A supply response team will be tasked with taking full advantage of each opportunity, in terms of squeezing as much revenue from it as possible.

Bottom Line: We All Need QRF To Maximize Big Data Opportunities

In short, QRF is a way to maximize revenue from rapidly emerging opportunities that are happening right now, and from opportunities that are likely to develop in the very near future. It incorporates the ability to forecast from unstructured data like Facebook comments or online reviews, and requires a specialist supply team that sits apart from the standard S&OP process to respond quickly, switching supply up or down as required. In the age of Big Data, where technology promises to deliver greater insight and greater revenue opportunities, QRF is exactly what we need to make it a practical reality.

Larry first discussed QRF in the Spring 2018 issue of the Journal Of Business Forecasting. Become an IBF member and gain access to the journal, as well as a host of other benefits.

How To Use Microsoft Azure

If, like me, you work in a small to medium sized enterprise where forecasting is still done with pen and paper, you’d be forgiven for thinking that Machine Learning is the exclusive preserve of big budget corporations. If you thought that, then get ready for a surprise. Not only are advanced data science tools largely accessible to the average user, you can also access them without paying a bean.

If this sounds too good to be true, let me prove it to you with a quick tutorial that will show you just how easy it is to make and deploy a predictive webservice using Microsoft’s Azure Machine Learning (ML) Studio, using real-world (anonymised) data.

What is Azure ML?

To most people, the words ‘Microsoft Azure’ conjure up vague ideas of cloud computing and TV adverts with bearded hipsters working in designer industrial lofts. And yet, in my opinion, the Azure Machine Learning Studio is one of the more powerful predictive modelling tools available on the market. And again, it’s free. What’s more, because it has a graphical user interface, you don’t need any advanced coding or mathematical skills to use it; it’s all click and drag. In fact, it is entirely possible to build a machine learning model from beginning to end without typing a single line of code. How’s that for a piece of gold?

You can make a free account or sign in as a guest at https://studio.azureml.net. The free account or guest sign-in to the Microsoft Azure Machine Learning Studio gives you complete access to the easy-to-use drag-and-drop graphical user interface that allows you to build, test, and deploy predictive analytics solutions. You don’t need much more.

Microsoft Azure Tutorial Time!

I promised you a quick tutorial on how to make a forecast that drives purchasing and other planning decisions in Azure ML, and a quick tutorial you shall have.

If you’re still with me, here are a couple of resources to help you get rolling:

A great hands on lab: https://github.com/Azure-Readiness/hol-azure-machine-learning

Edx courses you can access for free: https://www.edx.org/course/principles-machine-learning-microsoft-dat203-2x-6

https://www.edx.org/course/data-science-essentials-microsoft-dat203-1x-6

Having pointed you in the direction of more expansive and detailed resources, it’s time to get into this quick demo. Here are the basic steps we’ll go through:

  • Uploading datasets
  • Exploring and visualising data
  • Pre-processing and transforming
  • Predictive modelling
  • Publishing a model and using it in Excel

Uploading Datasets To Microsoft Azure

So, you’ve signed up. Once you’re in, you’re going to want to upload some data. I’m loading the weekly sales data of a crystal glass product for the years 2016 and 2017, which I’m going to try to forecast. You can read in a flat file in CSV format by clicking on the ‘Datasets’ icon and clicking the big ‘+ New’ button:

Then you’re going to want to load your data from the file location and give it a name you can find easily later. Clicking on the ‘flask’ icon and hitting the same ‘+ New’ button will open a new experiment. You can drag your uploaded dataset from the ‘My Datasets’ list onto the blank workflow:

Exploring and Visualizing

Right-clicking on the workflow module number (1) gives you access to exploratory data analysis tools, either through ‘Visualise’ or by opening a Jupyter notebook (Jupyter is an open-source web application) in which to explore the data in Python or R code. If you want to learn how to use and apply Python to your forecasting, practical insights will also be revealed at IBF’s upcoming New Orleans conference on Predictive Business Analytics & Forecasting.

Clicking on the ‘Visualise’ option calls up a view of the data, summary statistics and graphs. A quick look at the histogram of sales quantity shows that the data has some very large outliers. I’ll have to do something about those during the transformation step. You also get some handy summary statistics for each feature. Let’s have a look at the sales quantity column.

I’m guessing that zero will be Christmas week, when the office is closed. The max is likely to be a promotional offer. I can also see that the standard deviation is nearly 12,000 pieces, which is high compared to the mean. You can also compare columns/features to each other to see if there is any correlation:

Looking at a scatter plot comparing sales quantity to the consumer confidence index value, that feature really doesn’t seem to be adding anything to the data, so I’ll want to get rid of it. I’ve also included a quick Python line plot of sales over the two-year period.
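That plot came from a short script along these lines; a hedged sketch, assuming the dataset has been exported to a CSV with ‘week’ and ‘sales_qty’ columns (the file and column names here are illustrative):

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("weekly_sales.csv")     # hypothetical export of the dataset
df["week"] = pd.to_datetime(df["week"])  # parse the string week column

df.plot(x="week", y="sales_qty", figsize=(10, 4), legend=False)
plt.title("Weekly sales of crystal glassware, 2016-2017")
plt.ylabel("Units sold")
plt.show()
```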

As you can see, there is a lot of variability in the data and perhaps a slight downward trend. Without some powerful explanatory variables, this is going to be a challenge to accurately forecast. A lot of tutorials use rich datasets which the Machine Learning systems can predict well to give you a glossy version. I wanted to keep this real. I work in an SME and getting even basic sales data is an epic battle involving about fifty lines of code.

Pre-processing and Transforming

Now it’s time to transform the data. For simplicity, I’ve loaded a dataset with no missing or invalid entries by cleaning up and resampling sales by week with Python, but you can use the ‘scrub missing values’ module or execute a Python/R script in the Azure ML workspace to take care of this kind of problem.

In this case, all I need to do is change the ‘week’ column into a datetime feature (it loaded as a string object) and drop the OECD consumer confidence index feature, as it wasn’t helping. I could equally have excluded the column without code using the ‘Select Columns’ module:

One of the other things I’m going to do is trim outliers, using another ‘Execute Python Script’ module to identify and remove extreme values from the sales quantity column so the results are not skewed by rare sales events.
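One possible body for that ‘Execute Python Script’ module is a simple z-score trim; a sketch, assuming the sales column is named ‘sales_qty’ (Azure ML Studio passes datasets into and out of an azureml_main function as pandas DataFrames):

```python
def azureml_main(dataframe1=None, dataframe2=None):
    # Keep only rows within 3 standard deviations of the mean sales
    # quantity; the threshold and column name are illustrative choices.
    qty = dataframe1["sales_qty"]
    z = (qty - qty.mean()) / qty.std()
    trimmed = dataframe1[z.abs() <= 3]
    return trimmed,  # the module expects a tuple of DataFrames back
```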

Again, I could have accomplished a similar effect by using Azure’s inbuilt ‘Clip Values’ module. You genuinely do not have to be able to write code to use Azure (but it helps.)

There are too many possible options within the transformation step to cover in a single article, but I will mention one more important step: you should normalise the data to stop differences in the scale of features from letting certain features dominate over others. 90% of the work in forecasting is getting and cleaning the data so that it is usable for analysis (Adobe, take note: PDFs are evil and everyone who works with data hates them). Luckily, you can do all your wrangling inside the machine learning experiment, so that when you use the service, it does all the wrangling automatically based on your modules and code.

The ‘Normalize Data’ module allows you to select columns and choose a method of normalisation, including Z-score and Min-Max.
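Both methods are simple to reason about. Outside of Azure, the same two transformations look like this (invented values, one column):

```python
import pandas as pd

df = pd.DataFrame({"sales_qty": [0, 4500, 12000, 78000]})

# Z-score: centre on the mean, scale by the standard deviation.
df["zscore"] = (df["sales_qty"] - df["sales_qty"].mean()) / df["sales_qty"].std()

# Min-Max: rescale into the [0, 1] range.
span = df["sales_qty"].max() - df["sales_qty"].min()
df["minmax"] = (df["sales_qty"] - df["sales_qty"].min()) / span

print(df)
```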

Predictive Modelling In Microsoft Azure

Having completed the data transformation stage, you’re now ready to move on to the fun part: making a machine learning model. The first step is to split the data into a training set and a testing set. This should be a familiar practice for anyone working in forecasting; before you let your forecast out into the wild, you want to test how well it performs against the sales history. It’s that, or face a screaming sales manager wanting to know where his stock is, and I like my life as stress-free as possible. As with nearly everything in Azure ML, data splitting can be achieved by selecting a module: just click on the search pane and type in what you want to do. I’m going to split my data 70-30.
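For readers who want to see what the ‘Split Data’ module is doing under the hood, here is a rough scikit-learn equivalent (file and column names are illustrative; I’ve disabled shuffling to respect the time ordering of the weeks):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("weekly_sales.csv")  # hypothetical export of the dataset
X = df.drop(columns=["sales_qty"])    # features (assumed already numeric)
y = df["sales_qty"]                   # the quantity we want to predict

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, shuffle=False
)
```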

The next step is to connect the left output of the ‘Split Data’ module to the right input of a ‘Train Model’ module, the right output of the ‘Split Data’ to a ‘Score Model’ module, and a learning model to the right input of the ‘Train model’.

At first this might seem a little complicated, but as you can see, the left output of the ‘Split Data’ is the training dataset which goes through the training model and then outputs the resulting learned technique to the ‘Score Model’ where this learned function is tested against the testing dataset which comes in through the right data input node. In the ‘Train Model’ module you must select a single column of interest. In this case it is the quantity of product sold that I want to know. 

Microsoft offers a couple of guides to help you choose the right machine learning algorithm: here’s a broad discussion, and if you’re short on time, check this lightning-quick guidance. In the above, I’ve opted for a simple Linear Regression module, and for comparison purposes I’ve included a Decision Forest Regression by adding connectors to the same ‘Split Data’ module. One of the great things about Azure ML is that you can very quickly add and compare lots of models during your building and testing phase, and then clear them down before launching your web service.

Azure ML offers a wide array of machine learning algorithms from linear and polynomial regression to powerful adaptive boosted ensemble methods and neural networks. I think the best way to get to know these is to build your own models and try them out. As I have two competing models at work, I’ve added in an ‘Evaluate Model’ module and linked in the two ‘Score Model’ modules so that I can compare the results. I’ve also put in a quick Python script to graph the residuals and plot the forecasts against the results.
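As a rough scikit-learn analogue of the two competing modules, continuing from the split sketch above, you would train a linear regression and a random forest (scikit-learn’s closest cousin to Azure’s Decision Forest Regression) on the same training set:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

models = {
    "linear_regression": LinearRegression(),
    "decision_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)                # learn from the 70% slice
    predictions[name] = model.predict(X_test)  # score against the 30% slice
```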

Here’s the Decision Forest algorithm predictions against the actual sales quantity:

Clearly something happened around May 2016 that the Decision Forest model is unable to explain, but it seems to do quite well at finding the peaks over the rest of the period and into 2017. Looking at the Linear Regression model, one can see that it does a better job of finding the peak around May 2016 but consistently overestimates in the latter half of 2017.

Clicking on the ‘Evaluate Model’ module enables a more detailed statistical view of the comparative accuracy of the two models. The linear regression model is the top row and the decision forest model is the bottom row.

Coefficients of determination of 0.60 and 0.72: the models are explaining between half and three-quarters of the variance in sales, and the Decision Forest scored significantly better overall. As results go, neither brilliant nor terrible. A perfect coefficient of determination of 1 would suggest the model was overfitted and therefore unlikely to perform well on new data. The range of sales was from 0 to nearly 80,000, so I’ll take a mean absolute error of 4,421 pieces without complaint.
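Those are the same two headline metrics you can compute by hand; continuing from the sketches above:

```python
from sklearn.metrics import mean_absolute_error, r2_score

for name, y_pred in predictions.items():
    r2 = r2_score(y_test, y_pred)              # coefficient of determination
    mae = mean_absolute_error(y_test, y_pred)  # average miss, in pieces
    print(f"{name}: R^2 = {r2:.2f}, MAE = {mae:,.0f} pieces")
```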

It would be ideal if we had a little more information at the feature engineering stage. The ending in-stock inventory value for each week, or customer forecasts from the S&OP process, would help accuracy as features.

One of the benefits of forecasting in this way is you can incorporate features without having to worry about how accurate they are as the model will figure that out for you. I’d recommend having as many as possible and then pruning. I think the next step for this model would be to try incorporating inventory and S&OP pipeline customer forecasts as a feature. Building a model is an iterative process and one can and should keep improving it over time.

Publishing A Model And Consuming It In Excel

Azure ML makes setting up a model as a webservice and using it in Excel very easy. To deploy the model, simply click on the ‘Setup Web Service’ icon at the bottom of the screen.

Once you’ve deployed the webservice, you’ll get an API (Application Programming Interface) key and a Request-Response URL. You’ll need these to access your app in Excel and start predicting beyond your training and testing set. Finally, you’re ready to open good old Excel: go to the ‘Insert’ tab and select the ‘Store’ icon to download the free Azure add-in for Excel.

Then all you need to do is click the ‘+ Add web service’ button and paste in your Response Request URL and your secure API key, so that only your team can access the service.
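Excel is not the only possible consumer, either. The same Request-Response endpoint can be called from any language; here is a hedged Python sketch in the shape of the classic Studio request format, with illustrative URL, key, and column names:

```python
import json
import urllib.request

url = "https://<your-region>.services.azureml.net/.../execute?api-version=2.0"
api_key = "YOUR_API_KEY"  # from the web service dashboard

body = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["week", "promotion", "holiday_days"],
            "Values": [["2018-02-05", "1", "0"]],
        }
    },
    "GlobalParameters": {},
}

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer " + api_key,
}
request = urllib.request.Request(url, json.dumps(body).encode("utf-8"), headers)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))  # the predicted quantity comes back here
```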

After that it’s a simple process to input the new sales weeks to be predicted for the item and the known data for other variables (in this case promotions, holiday days in the week, historic average annual/seasonal sales pattern for the category etc.). You can make this easy by clicking on the ‘Use sample data’ to populate the column headers so you don’t have to remember the order of the columns used in the training set.

Congratulations! You now have a basic predictive webservice built for producing forecasts. By adding in additional features to your dataset and retraining and improving the model, you can rapidly build up a business specific forecasting function using Machine Learning that is secure, shareable and scalable.

Good luck!

If you’re keen to leverage Python and R in your forecasting, we also recommend attending IBF’s upcoming Predictive Analytics, Forecasting & Planning conference in New Orleans where attendees will receive hands-on Python training. For practical and step-by-step insight into applying Machine Learning with R for forecasting in your organization, check out IBF’s Demand Planning & Forecasting Bootcamp w/ Hands-On Data Science & Predictive Business Analytics Workshop in Chicago.
