
How Python and Data Science Can Improve Daily Forecasting Tasks
The last six years of my career have been solely devoted to creating hypothetical futures, selecting the most likely one, and developing strategies and business contingencies to prepare for those eventualities. When I began this career path, I was under the honest impression that I would kick things off with a refresher on statistics, perhaps a course on market patterns, or maybe even a class on coding. I was wrong.
Like many corporate analysts, I was introduced to Microsoft Excel and the “sweet science” of CAGR Roll Ups. The process often involved an arbitrary method of quantifying market assumptions, and was far less scientific than I believed it could be. I knew that there had to be a better way, and I actively sought out institutions that could teach me how.
Being able to quantify your assumptions is the cornerstone of time series forecasting. It allows you to observe qualitative elements in your business environment and understand their direct effect on sales. In a traditional board room this is often done by establishing a ratio between sales and that variable (e.g., if we sold 12 radios this year and spent 6 dollars, we sold 2 radios per dollar spent). The problem with this methodology is that it oversimplifies the relationship between variables and does not take into account the combined effect of the other variables in question. There were two essential questions that were not being answered:
- What are the mathematically quantifiable relationships between individual variables?
- How strong are these relationships?
Luckily, the answers were not hard to find. In fact, they were available in Flatiron Academy’s introductory material and could be expressed in two concepts and four simple lines of code:
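Correlation:
correlation_matrix = np.corrcoef(sales, inventory)  # 2x2 matrix; assumes numpy is imported as np
correlation_matrix[0, 1]  # off-diagonal entry: correlation between sales and inventory
Covariance:
covariance_matrix = np.cov(sales, inventory)  # 2x2 matrix; the diagonal holds each variable's variance
covariance_matrix[0, 1]  # off-diagonal entry: covariance between sales and inventory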
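Covariance tells you whether two variables move together (positive) or in opposite directions (negative), but its magnitude depends on the units the variables are measured in. Correlation rescales covariance to a value between -1 and 1, so it tells you both the direction and the strength of the relationship. These simple statistical constructs can be the foundation of better forecasts and, more importantly, a better understanding of how the ecosystem of variables within a company relate to one another.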
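To make the difference between the two measures concrete, here is a minimal sketch. The monthly sales and inventory figures are invented purely for illustration, and the helper name cov_and_corr is my own, not something from the course material. It computes both measures, then recomputes them with inventory expressed in dozens instead of single units.

```python
import numpy as np

# Hypothetical monthly figures, invented purely for illustration.
sales = np.array([12, 15, 14, 20, 22, 25])      # radios sold per month
inventory = np.array([30, 34, 33, 41, 45, 50])  # radios in stock per month


def cov_and_corr(x, y):
    """Return (covariance, correlation) for two equal-length series."""
    covariance = np.cov(x, y)[0, 1]         # off-diagonal of the 2x2 covariance matrix
    correlation = np.corrcoef(x, y)[0, 1]   # off-diagonal of the 2x2 correlation matrix
    return covariance, correlation


# Both series measured in single units.
cov_units, corr_units = cov_and_corr(sales, inventory)

# Same inventory data, re-expressed in dozens. The relationship has not
# changed, only the unit of measurement.
cov_dozens, corr_dozens = cov_and_corr(sales, inventory / 12)

print(f"covariance  (units):  {cov_units:.3f}")
print(f"covariance  (dozens): {cov_dozens:.3f}")
print(f"correlation (units):  {corr_units:.3f}")
print(f"correlation (dozens): {corr_dozens:.3f}")
```

With this made-up data, the covariance shrinks by a factor of 12 when the units change, while the correlation stays exactly the same. That is why covariance on its own is a poor gauge of strength, and why correlation is the number to compare across different pairs of variables.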
It is my aspiration to continue to search for more statistically sound methods to make good business decisions through Data Science and hopefully construct a platform that allows other operations managers to do the same.