A Popular Query
Many a time in interactive sessions I get asked, “What is meant by learning?” Or, more precisely, “How does software learn?” Let us start from the concept of Machine Learning.
Machine learning is a sub-field of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions.
Logistic Regression is a simple Machine Learning technique. The variable of interest (the target or dependent variable) is binary, taking one of two values: 0 or 1, or say “Yes” or “No”. There is also a set of independent (or explanatory) variables, based on which I want my Logistic Regression model to LEARN to predict 0 or 1 for a new record that is yet to be classified. The classifier should reach a desired level of accuracy.
We model the probability of being labelled ‘0’ as a function of the known explanatory variables xi and the unknown coefficients βi. Given the entire dataset of dependent and independent variables, we solve iteratively for the coefficients βi to find the best predictor among the Logistic Regression models. There are readily available packages for Logistic Regression that solve for the βi.
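To make the "solve in iterations" step concrete, here is a minimal sketch in plain Python. It is not how any particular package does it: it uses simple batch gradient ascent on the log-likelihood, whereas real packages typically use faster solvers. The function names, the learning rate, the iteration count and the toy data are all my own assumptions for illustration. Note also that, following the more common convention, this sketch models the probability of label 1; the probability of label 0 is simply one minus that.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Estimate coefficients by batch gradient ascent on the log-likelihood.

    X is a list of feature rows, y a list of 0/1 labels.
    Returns [b0, b1, ..., bk], where b0 is the intercept.
    lr and n_iter are illustrative defaults, not tuned values.
    """
    beta = [0.0] * (len(X[0]) + 1)           # start with every coefficient at zero
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for row, label in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            error = label - sigmoid(z)       # observed label minus predicted P(label = 1)
            grad[0] += error
            for j, x in enumerate(row):
                grad[j + 1] += error * x
        # take a small step in the direction that increases the likelihood
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

# Toy data, made up for illustration: one explanatory variable, binary target.
X = [[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [4.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
beta = fit_logistic(X, y)
```

Each pass nudges the coefficients so that the predicted probabilities move closer to the observed labels. On this toy data the slope coefficient ends up positive, since larger values of the explanatory variable go with label 1.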
Setting the βi to their correct values is what, in my opinion, LEARNING means. Once they are set, we say our classifier has learnt from the data (the training data) and is ready to be used to classify unlabeled records.
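As a rough illustration of what "ready to be used" means, the sketch below takes a set of already learnt coefficients and classifies new records. The coefficient values, the function names and the 0.5 threshold are hypothetical choices of mine, and the model here gives the probability of label 1.

```python
import math

def predict_probability(beta, row):
    """Probability of label 1 for one record; beta[0] is the intercept."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def classify(beta, row, threshold=0.5):
    """Turn the predicted probability into a 0/1 label (threshold is an assumption)."""
    return 1 if predict_probability(beta, row) >= threshold else 0

# Hypothetical coefficients, standing in for values learnt from training data.
learned_beta = [-4.0, 1.6]

# New, unlabeled records with a single explanatory variable each.
new_records = [[1.0], [2.0], [4.0]]
labels = [classify(learned_beta, record) for record in new_records]
```

Nothing is being learnt at this stage: the coefficients are fixed, and classification is just a matter of plugging each new record into the fitted function.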
Learning from data, or Machine Learning, has proved highly successful in both the literature and industry. But the reason behind its success is yet to be established mathematically. No one knows exactly why using a non-linear function in place of the good old linear link function has led to such a big success.
Question 1: What type of Activation Function is used in Logistic Regression?
Question 2: Can a Linear Activation Function be used in any Neural Network?