Over the past two decades, Machine Learning has become one of the mainstays of information technology. The really basic idea is to take a dataset D, learn from it while taking certain factors into account, and then predict an output for a given input. There are many fields within Machine Learning, but here we just do a really basic linear regression to get into it:
From: Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville
Example: Predicting house prices (non-linear regression)
You are thinking about buying an old villa and want to know when, which one, and where to buy. You could now do different things: either predict the price on a timeline (e.g., how much prices in winter differ from those in summer) or predict the price of houses in different areas (e.g., cities). We stick with the first option because it is much easier for now.
Example:
Just a simple function: y_predict(x) = 100 + sin(x) * 5
In this (non-linear) regression example, the y-axis shows the price of a house of, e.g., 150 m², and the x-axis shows the months of a year. We can now see that the price of a house is different for each month. If you buy a house in the middle of January you probably pay around €100,000, and at the end of April you tend to pay roughly €10,000 less. So one would likely buy a house around April-May.
But why is this called regression? Well, as said earlier, we get a continuous output for every value we put into our predictor y_predict(x).
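The toy predictor above can be sketched in a few lines of Python (treating x as the month number and the output as a price in thousands of euros is my assumption for illustration):

```python
import numpy as np

# toy seasonal price model: y_predict(x) = 100 + sin(x) * 5
# x is the month number; output is in thousands of euros (assumed units)
def y_predict(x):
    return 100 + np.sin(x) * 5

months = np.arange(1, 13)
prices = y_predict(months)
print(prices.round(2))
```

Since the sine term swings between -1 and 1, all predicted prices stay within 100 ± 5, which is exactly the seasonal pattern described above.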
Linear regression:
What is a linear function?
In general we have: y = a * x + b
Where b is our bias, a our slope, and y our output value.
However, in Machine Learning one uses different letters: y_hat(x) = beta0_hat + beta1_hat * x
We would call the coefficients beta-hat. beta0_hat is still the same bias and beta1_hat is still the same slope, just with different names. But we change y to y_hat because we say it is a predictor.
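As a minimal sketch (the bias and slope values here are made up for illustration), the linear predictor looks like this in Python:

```python
# linear predictor: y_hat(x) = beta0_hat + beta1_hat * x
# the bias (beta0) and slope (beta1) values are made up for illustration
def y_hat(x, beta0=2.0, beta1=0.5):
    return beta0 + beta1 * x

print(y_hat(4.0))  # bias 2.0 plus slope 0.5 times 4.0 gives 4.0
```

Training means finding good values for beta0 and beta1 instead of picking them by hand, which is what the next sections do.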
But how would one train a model like this and how can we optimize it?
We need:
- a dataset with two variables
- a training algorithm with a formula to minimize analytically (usually we minimize a cost function/error measure)
- an error measure for our wrong predictions, to see how well our model predicts. In general we look at how much difference there is between the real and the predicted values.
Our observed dataset contains x values and y values. We later want to predict a y value for any input x.
Now that we have our dataset, we can move on to our training algorithm.
We now have to calculate our betas, i.e., the bias and the slope. We do that by minimizing an error measure, the Residual Sum of Squares (RSS), and solving for the betas:
RSS(beta0, beta1) = sum_{i=1..N} (y_i - (beta0 + beta1 * x_i))^2
Keep in mind: N is the size of the dataset, cf. the indices of our x, y columns above.
Minimizing this with respect to beta0 and beta1 turns it into a least squares problem with the solution:
beta1_hat = sum_{i=1..N} (x_i - xbar) * (y_i - ybar) / sum_{i=1..N} (x_i - xbar)^2
beta0_hat = ybar - beta1_hat * xbar
Furthermore, I need to mention that xbar is the mean of all x values, i.e., the sum over all x_i divided by N (and ybar is the mean of all y values).
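As a sanity check on these formulas, here is a short sketch (the data points are made up) comparing the closed-form solution with NumPy's least-squares fit, which minimizes the same RSS:

```python
import numpy as np

# made-up example data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

xbar, ybar = x.mean(), y.mean()
# closed-form least squares solution
beta1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
beta0 = ybar - beta1 * xbar

# np.polyfit with degree 1 solves the same least squares problem
slope, intercept = np.polyfit(x, y, 1)
print(np.isclose(beta1, slope), np.isclose(beta0, intercept))
```

Both routes give the same line, because the closed-form expressions are exactly the minimizer of the RSS.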
And lastly we use the Mean Squared Error to see how good our model is: we subtract the predicted y_hat(x_i) from the real y_i value in our dataset, square the difference, sum over all N points, and divide by N:
MSE = (1/N) * sum_{i=1..N} (y_i - y_hat(x_i))^2
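A tiny numeric example of the MSE formula (the values are made up):

```python
import numpy as np

# made-up true values and predictions
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 6.0])

# MSE = (1/N) * sum((y_i - y_hat_i)^2)
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # (0.25 + 0.25 + 1.0) / 3 = 0.5
```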
But since this would be a lot of work to do by hand, we instead use Python to compute all of these things.
Source Code
- """
- @author: Chris
- """
- #used for calculating the solution of the predictor
- def predictor(beta0, beta1, x):
- return beta0 + beta1 * x
- import pandas as pd
- import numpy as np
- import matplotlib.pyplot as plt
- #read dataset
- dataset = pd.read_csv("tutorial3.dat", sep = " ",
- names= ["x", "y"])
- #compute ß1
- xmean = np.mean(dataset["x"])
- ymean = np.mean(dataset["y"])
- numerator = 0
- denominator = 0
- #calcilate numerator and denominator of the formula
- for i, row in dataset.iterrows():
- numerator += ((row["x"] - xmean)) * (row["y"] - ymean)
- denominator += (row["x"] - xmean)**2
- #fraction
- beta1 = numerator / denominator
- #compute ß0 using the formula given
- beta0 = ymean - beta1 * xmean
- print("Predictor:")
- print("yhat = " + str(beta0) + " + " + str(beta1) + "*x")
- #plot our data
- plt.scatter(dataset["x"],dataset["y"])
- #plot predictor function
- x = np.linspace(1,6, 100)
- y = beta0 + beta1 * x
- plt.plot(x,y, color = "red")
- plt.show()
- #Calculate Mean Squared error using the formula
- MSE = 0
- for i, row in dataset.iterrows():
- MSE += (row["y"] - predictor(beta0, beta1, row["x"])) **2
- MSE = MSE / len(dataset)
- print("Mean Squared Error:")
- print(np.around(MSE,4))
In the end we get the following result:
For this specific dataset we see that linear regression performs quite well.
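As an optional cross-check (assuming scikit-learn is available; the small dataset below is made up and only stands in for tutorial3.dat), the same coefficients fall out of sklearn's LinearRegression:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# made-up data standing in for tutorial3.dat
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([1.9, 4.1, 6.0, 8.1])

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_[0])
```

Under the hood it solves the same least squares problem as our closed-form betas, so the intercept and coefficient match beta0_hat and beta1_hat.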