Predictive modeling is a technique that uses mathematical and computational methods to predict
an event or outcome. A mathematical approach uses an equation-based model that describes the
phenomenon under consideration. The model is used to forecast an outcome at some future state
or time based upon changes to the model inputs. The model parameters help explain how model
inputs influence the outcome. Examples include time series regression models for
predicting airline traffic volume or predicting fuel efficiency based on a linear regression model of
engine speed versus load.
The computational predictive modeling approach differs from the mathematical approach because
it relies on models that are not easy to explain in equation form and often require simulation
techniques to create a prediction. Predictive modeling is often performed using curve and surface
fitting or time series regression approaches. Regardless of the approach used, the process of creating
a predictive model is the same across methods. The steps are:
➢ Clean the data by removing outliers and treating missing data
➢ Identify a parametric or nonparametric predictive modeling approach to use
➢ Preprocess the data into a form suitable for the chosen modeling algorithm
➢ Specify a subset of the data to be used for training the model
➢ Train, or estimate, model parameters from the training data set
➢ Conduct model performance or goodness-of-fit tests to check model adequacy
➢ Validate predictive modeling accuracy on data not used for calibrating the model
➢ Use the model for prediction if satisfied with its performance.
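
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic, the 3-standard-deviation outlier rule, the 80/20 split, and the 0.9 acceptance threshold are arbitrary choices, and a simple least-squares line stands in for whatever parametric model is actually chosen.

```python
import random
import statistics

def clean(rows):
    """Drop rows with missing values, then drop rows whose y is an outlier."""
    complete = [(x, y) for x, y in rows if x is not None and y is not None]
    ys = [y for _, y in complete]
    mean_y, sd_y = statistics.mean(ys), statistics.stdev(ys)
    # Assumed rule: more than 3 standard deviations from the mean is an outlier.
    return [(x, y) for x, y in complete if abs(y - mean_y) <= 3 * sd_y]

def fit_line(rows):
    """Estimate slope and intercept by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(rows, slope, intercept):
    """Coefficient of determination of the fitted line on the given rows."""
    ys = [y for _, y in rows]
    mean_y = statistics.mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic data (hypothetical): y = 2x + 5 plus noise, one outlier, one missing value.
rng = random.Random(0)
raw = [(float(x), 2.0 * x + 5.0 + rng.gauss(0, 1)) for x in range(40)]
raw += [(10.0, 1000.0), (12.0, None)]

rows = clean(raw)                                    # clean the data
rng.shuffle(rows)
split = int(0.8 * len(rows))                         # assumed 80/20 split
train, holdout = rows[:split], rows[split:]          # training subset and held-out data
slope, intercept = fit_line(train)                   # estimate model parameters
fit_r2 = r_squared(train, slope, intercept)          # goodness of fit on training data
holdout_r2 = r_squared(holdout, slope, intercept)    # accuracy on data not used for fitting

if fit_r2 > 0.9 and holdout_r2 > 0.9:                # assumed acceptance threshold
    print("prediction at x = 50:", slope * 50 + intercept)
```

Keeping the held-out rows away from `fit_line` is the point of the last two steps: the validation score is only meaningful if those rows played no part in estimating the parameters.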
An artificial neural network (ANN) is a computational model based on the structure and functions
of biological neural networks. Information that flows through the network affects the structure of
the ANN because a neural network changes – or learns, in a sense – based on that input and output.
ANNs are considered nonlinear statistical data modelling tools where the complex relationships
between inputs and outputs are modelled or patterns are found. ANN is also known as a neural
network.
An ANN has several advantages, but the most widely recognized is that it can learn from
observed data sets. In this way, an ANN is used as a tool for approximating arbitrary functions.
Tools of this type help estimate the most cost-effective methods for arriving at solutions while
defining computing functions or distributions. An ANN works from data samples rather than entire
data sets to arrive at solutions, which saves both time and money. ANNs are considered fairly
simple mathematical models that enhance existing data analysis technologies.
ANNs have three layers that are interconnected. The first layer consists of input neurons. Those
neurons send data on to the second layer, which in turn sends its output to the output neurons in the third layer.
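
The flow of data through the three layers can be illustrated with a minimal forward pass. The sketch below assumes a sigmoid activation in the second (hidden) layer and a linear output layer; the layer sizes and all weight values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic activation (an assumed choice for the hidden layer)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, w_hidden, b_hidden, w_output, b_output):
    """Pass the inputs through the hidden layer and then the output layer."""
    # Second layer: each hidden neuron takes a weighted sum of all inputs.
    hidden = [sigmoid(sum(w * x for w, x in zip(weights, inputs)) + b)
              for weights, b in zip(w_hidden, b_hidden)]
    # Third layer: each output neuron takes a weighted sum of the hidden values.
    outputs = [sum(w * h for w, h in zip(weights, hidden)) + b
               for weights, b in zip(w_output, b_output)]
    return hidden, outputs

# Hypothetical network: 3 input neurons, 2 hidden neurons, 1 output neuron.
w_hidden = [[0.5, -0.4, 0.1], [-0.3, 0.8, 0.2]]
b_hidden = [0.0, 0.1]
w_output = [[1.0, -1.0]]
b_output = [0.2]

hidden, outputs = forward([1.0, 0.5, -1.0], w_hidden, b_hidden, w_output, b_output)
print("hidden layer values:", hidden)
print("network output:", outputs)
```

Learning consists of adjusting the weight and bias values so that the outputs move closer to known target values; the forward pass itself stays the same.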
Statistical analysis was performed to measure the contribution of each economic factor to the sales of passenger vehicles, so that only the major factors are used as input neurons when developing the ANN model and the factors that contribute little are discarded.
➢ Based on the correlation and regression analysis of the macroeconomic factors, three factors (income, interest rate, and unemployment) contribute most to the sales of passenger vehicles and are included in the ANN model; the remaining two (inflation and GDP) are discarded.
Factor         Correlation with sales                       R-squared   R-squared (adj)
Income         0.932 (strong positive correlation)          86.9%       90.7%
Interest       -0.636 (intermediate negative correlation)   40.5%       40.2%
Unemployment   -0.775 (strong negative correlation)         97.39%      97.16%
Inflation      -0.041 (very weak negative correlation)      53.83%      58.0%
GDP            0.035 (very weak positive correlation)       50.39%      43.73%
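
The screening step can be sketched as follows. The correlation values are the ones reported above; the cut-off of 0.5 on the absolute correlation is an assumption made here for illustration, since the selection rule is not stated. The helper functions show how a Pearson correlation and an adjusted R-squared are commonly computed; the small `income`/`sales` series is hypothetical. For a regression on a single predictor, R-squared equals the square of the correlation, which is consistent with the income row (0.932 squared is about 0.869).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def adjusted_r_squared(r2, n, p=1):
    """Adjust R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Correlations with sales as reported in the table above.
reported = {
    "income": 0.932,
    "interest": -0.636,
    "unemployment": -0.775,
    "inflation": -0.041,
    "GDP": 0.035,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the text
selected = [name for name, r in reported.items() if abs(r) >= THRESHOLD]
print("factors kept as ANN inputs:", selected)

# Hypothetical series, only to show the helpers in use.
income = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
r = pearson_r(income, sales)
r2 = r ** 2  # R-squared of a single-predictor linear regression
print("r:", r, "R-squared:", r2, "adjusted:", adjusted_r_squared(r2, len(income)))
```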
In this paper we studied conventional forecasting and statistical methods for the predictive analysis of passenger vehicle sales, using month-wise data covering 16 years. First, we collected the sales data for the past 16 years; then statistical methods, namely correlation and regression analysis, were applied. In the correlation analysis, we examined the relation between sales and the macroeconomic factors that affect the sales of passenger vehicles, including GDP, per capita income, unemployment, inflation, and interest rate. Regression analysis was then used to characterize how the dependent variable varies with each independent variable and how strongly it is affected. The regression analysis showed that three factors affected sales critically: per capita income, unemployment, and interest rate.
Factor         Correlation with sales                       R-squared   R-squared (adj)
Income         0.932 (strong positive correlation)          86.9%       90.7%
Interest       -0.636 (intermediate negative correlation)   40.5%       40.2%
Unemployment   -0.775 (strong negative correlation)         97.39%      97.16%
These three factors were then used in the ANN model. They were taken as the inputs, and the previous sales data of passenger vehicles were taken as the target values. The network was built in stages: training (to train the network), validation (to validate it on the data), and testing. Finally, a trained network was obtained for predicting sales in the upcoming years from the macroeconomic factors, and predicted values were extracted for the sales of passenger vehicles over the past years.
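
A minimal sketch of such a network, with three inputs and one sales output, is given below. It is not the network used in this study: the data are synthetic stand-ins for the three scaled factors, and the hidden-layer size, tanh activation, learning rate, epoch count, and the roughly 70/15/15 split into training, validation, and test sets are all assumptions.

```python
import math
import random

rng = random.Random(42)

def make_sample():
    """Synthetic month: three scaled factors and a made-up sales target."""
    x = [rng.uniform(0, 1) for _ in range(3)]  # income, unemployment, interest
    y = 0.5 + 0.8 * x[0] - 0.5 * x[1] - 0.3 * x[2]
    return x, y

data = [make_sample() for _ in range(192)]  # 16 years x 12 months
train, val, test = data[:134], data[134:163], data[163:]  # assumed split

N_HIDDEN = 4  # assumed hidden-layer size
net = {
    "w_h": [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(N_HIDDEN)],
    "b_h": [0.0] * N_HIDDEN,
    "w_o": [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)],
    "b_o": 0.0,
}

def forward(x):
    """Return the network output and the hidden-layer activations."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w_h"], net["b_h"])]
    out = sum(w * h for w, h in zip(net["w_o"], hidden)) + net["b_o"]
    return out, hidden

def mse(samples):
    """Mean squared error of the network on a set of samples."""
    return sum((forward(x)[0] - y) ** 2 for x, y in samples) / len(samples)

def train_epoch(samples, lr=0.05):
    """One pass of stochastic gradient descent with backpropagation."""
    for x, y in samples:
        out, hidden = forward(x)
        err = out - y  # gradient of 0.5 * (out - y)^2 with respect to out
        for j in range(N_HIDDEN):
            # Gradient at hidden neuron j, computed before w_o[j] is updated.
            grad_pre = err * net["w_o"][j] * (1.0 - hidden[j] ** 2)
            net["w_o"][j] -= lr * err * hidden[j]
            net["b_h"][j] -= lr * grad_pre
            for i in range(3):
                net["w_h"][j][i] -= lr * grad_pre * x[i]
        net["b_o"] -= lr * err

initial_val = mse(val)
for epoch in range(200):
    train_epoch(train)
final_val = mse(val)
test_error = mse(test)
print("validation MSE before and after training:", initial_val, final_val)
print("test MSE:", test_error)
```

Only the training set updates the weights; the validation set is used to monitor the fit during training, and the test set gives an error estimate on data that played no role in either.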