The New Old Statistical Learning
You have probably heard the term statistical learning and how it relates to machine learning and even artificial intelligence. The term itself is relatively new, and it has become very popular in recent years, especially with the spread of data science (another topic that deserves a post). But although the term is new, many of the concepts and techniques we use today were developed more than 100 years ago.
In the early 19th century, Legendre and Gauss published articles on the method of least squares, the earliest form of what is now known as linear regression, a technique used to predict quantitative values, such as an individual's salary or the price of a house.
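To give a feel for how little machinery the idea needs, here is a minimal sketch of least squares with a single predictor, in plain Python. The function name `fit_line` and the house data are made up for illustration; they do not come from Legendre, Gauss, or any library.

```python
def fit_line(xs, ys):
    """Fit y ~ intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: house size (square meters) vs. price (thousands)
sizes = [50, 70, 90, 110, 130]
prices = [150, 200, 260, 300, 360]

intercept, slope = fit_line(sizes, prices)
predicted_price = intercept + slope * 100  # prediction for a 100 m^2 house
print(intercept, slope, predicted_price)
```

The closed-form solution above only covers one predictor; with several predictors the same principle applies, but you would normally let a library solve the linear system for you.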
In 1936, Fisher proposed linear discriminant analysis, which can be used to predict qualitative values, such as whether a patient survives or dies, or whether the stock market goes up or down (Fisher is the guy, right?). And in the 1940s, several authors proposed logistic regression as an alternative to linear discriminant analysis. Today logistic regression is often sold as a revolutionary machine learning technique.
In 1972, Nelder and Wedderburn introduced generalized linear models (GLM), an entire class of statistical learning methods that includes linear regression, logistic regression and Poisson regression as special cases. Several other techniques were proposed in the same decade, but they were almost exclusively linear methods. In the following decade, thanks to advances in computing technology, Breiman, Friedman, Olshen and Stone presented classification and regression trees (CART), along with the use of cross-validation for model selection. These techniques are still widely used and give good results.
Since then, statistical learning has emerged as a new field of statistics, with a focus on modeling and prediction. In recent years, its progress has been driven by growing computing power and by the increasing availability of powerful software and high-level programming languages, such as Python and R.
Finally, I would like you to take a look here to see the timeline of machine learning and statistical learning techniques (a beautiful visualization made by this guy). Do you still think that statistical learning techniques (or even machine learning techniques) are as new as people tell you? Or are people only now discovering them?
See ya in the next post, or on Twitter.
References
Friedman, J., Hastie, T. and Tibshirani, R., 2001. The elements of statistical learning (Vol. 1, No. 10). New York: Springer series in statistics.