Linear Regression using Matrices
Introduction
For points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, the least squares regression line is given by:
$$f(x) = a_0 + a_1 x$$
where the intercept $a_0$ and the slope $a_1$ are chosen to minimize the sum of the squared errors made when the regression function $f(x)$ is used to estimate the true $y$ values:
$$\sum_{i=1}^{n} e_i^2$$
where $e_i = y_i - f(x_i)$ is the error in approximating $y_i$.
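To make the quantity being minimized concrete, here is a minimal Python sketch that evaluates the sum of squared errors for two candidate lines. The sample points and the function name `sum_squared_errors` are invented for illustration and do not come from the original tutorial.

```python
# Hypothetical sample points (x_i, y_i), invented for illustration only.
points = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]


def sum_squared_errors(points, intercept, slope):
    """Sum of e_i^2, where e_i = y_i - f(x_i) and f(x) = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)


# Two candidate lines with the same slope but different intercepts.
sse_through_origin = sum_squared_errors(points, 0.0, 1.5)
sse_shifted = sum_squared_errors(points, 1.0 / 3.0, 1.5)

print(sse_through_origin, sse_shifted)
```

The least squares line is the candidate for which this value is as small as possible; here the second line gives the smaller total.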
Let's see if we can set this up as a system of equations and then solve using matrices.
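Using our points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, with intercept $a_0$, slope $a_1$, and errors $e_i$, we would have the following system of equations:
$$\begin{aligned}
y_1 &= a_0 + a_1 x_1 + e_1 \\
y_2 &= a_0 + a_1 x_2 + e_2 \\
&\;\;\vdots \\
y_n &= a_0 + a_1 x_n + e_n
\end{aligned}$$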
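Now let's set up a matrix equation. Let:
$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad
A = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}, \quad
E = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
where $a_0$ is the intercept and $a_1$ is the slope of the regression line.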
This gives us the matrix equation: Y = XA + E.
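As a rough illustration of this setup, the sketch below builds X and Y from a few points and computes E for one candidate A, so that Y = XA + E holds by construction. The data are invented, and the column order (a column of ones first, so that the intercept comes first in A) is an assumption of this sketch.

```python
# Hypothetical sample points (x_i, y_i), invented for illustration only.
points = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]

# X is n x 2: a column of ones (for the intercept) next to the x values.
X = [[1.0, x] for x, _ in points]
# Y is n x 1: the observed y values.
Y = [[y] for _, y in points]

# A candidate coefficient vector (intercept first, then slope); not yet optimal.
A = [[0.0], [1.5]]

# XA is the n x 1 column of predicted values.
XA = [[sum(x_ij * a_j[0] for x_ij, a_j in zip(row, A))] for row in X]
# E is whatever is left over, so that Y = XA + E.
E = [[y[0] - p[0]] for y, p in zip(Y, XA)]

print(X)
print(E)
```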
We now just need to solve this for A.
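One standard route, sketched here under the assumption that $X^T X$ is invertible, is to write the sum of squared errors as a function of $A$, differentiate, and set the derivative to zero:

```latex
\begin{aligned}
E^T E &= (Y - XA)^T (Y - XA) \\
      &= Y^T Y - 2 A^T X^T Y + A^T X^T X A \\
\frac{\partial\,(E^T E)}{\partial A} &= -2 X^T Y + 2 X^T X A = 0 \\
X^T X A &= X^T Y
\end{aligned}
```

Left-multiplying both sides of the last line by $(X^T X)^{-1}$ isolates $A$.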
The least squares solution to the regression equation $Y = XA + E$ is:
$$A = (X^T X)^{-1} X^T Y$$
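A minimal pure-Python sketch of this solution for a straight-line fit follows. The helper names (`transpose`, `matmul`, `inverse_2x2`, `least_squares`) and the sample points are hypothetical, and the sketch assumes X has exactly two columns, with the column of ones first.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def inverse_2x2(M):
    """Invert a 2 x 2 matrix using the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("X^T X is singular; the x values must not all be equal")
    return [[d / det, -b / det], [-c / det, a / det]]


def least_squares(X, Y):
    """Return A = (X^T X)^(-1) X^T Y for an n x 2 matrix X and n x 1 matrix Y."""
    Xt = transpose(X)
    return matmul(inverse_2x2(matmul(Xt, X)), matmul(Xt, Y))


# Hypothetical sample points, invented for illustration only.
points = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]
X = [[1.0, x] for x, _ in points]
Y = [[y] for _, y in points]

A = least_squares(X, Y)
intercept, slope = A[0][0], A[1][0]
print(intercept, slope)
```

Forming the inverse explicitly mirrors the formula and is fine for a 2 x 2 matrix. For larger or poorly conditioned problems, a dedicated least squares routine from a numerical library is usually a safer choice.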
The sum of the squared errors (SSE) is:
$$SSE = E^T E = (Y - XA)^T (Y - XA)$$
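In matrix form, the SSE is the product of the error column with its own transpose, which reduces to adding up the squared entries of Y - XA. A short sketch, assuming an invented data set and a coefficient vector A that has already been computed (the function name `sse` is hypothetical):

```python
def sse(X, Y, A):
    """Return E^T E, where E = Y - XA, for matrices stored as lists of rows."""
    residuals = [
        y[0] - sum(x_ij * a_j[0] for x_ij, a_j in zip(row, A))
        for row, y in zip(X, Y)
    ]
    # E^T E for a single column is just the sum of its squared entries.
    return sum(e * e for e in residuals)


# Hypothetical data and coefficients, invented for illustration only.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [[2.0], [3.0], [5.0]]
A = [[1.0 / 3.0], [1.5]]  # intercept first, then slope

total = sse(X, Y, A)
print(total)
```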
Example
Determine the least squares regression line using matrices, where x is the price in dollars and y is the monthly sales. Then find the sum of the squared errors.
The solution is to work out $A$ using:
$$A = (X^T X)^{-1} X^T Y$$
Step 1: Get the matrices Y and X.
Step 2: Work out $X^T X$.
Step 3: Find the inverse of $X^T X$. (A separate tutorial covers the matrix inverse.)
Step 4: Find $X^T Y$.
Step 5: Finally, the result
and the sum of the squared errors (SSE) is
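The five steps above can be traced in a short Python sketch. The price and sales figures below are made up for illustration and are not the data from this example. The sketch also assumes the column of ones comes first in X, in which case $X^T X$ and $X^T Y$ reduce to simple sums over the data.

```python
# Hypothetical data, invented for illustration (not the example's figures).
prices = [2.0, 3.0, 4.0, 5.0]      # x: price in dollars
sales = [40.0, 33.0, 28.0, 19.0]   # y: monthly sales

# Step 1: the matrices Y and X.
X = [[1.0, x] for x in prices]
Y = [[y] for y in sales]

# Step 2: X^T X = [[n, sum(x)], [sum(x), sum(x^2)]] when X = [1 | x].
n = float(len(prices))
sum_x = sum(prices)
sum_xx = sum(x * x for x in prices)
XtX = [[n, sum_x], [sum_x, sum_xx]]

# Step 3: inverse of the 2 x 2 matrix X^T X.
det = n * sum_xx - sum_x * sum_x
XtX_inv = [[sum_xx / det, -sum_x / det], [-sum_x / det, n / det]]

# Step 4: X^T Y = [[sum(y)], [sum(x * y)]].
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(prices, sales))
XtY = [[sum_y], [sum_xy]]

# Step 5: A = (X^T X)^(-1) X^T Y, giving the intercept and the slope.
intercept = XtX_inv[0][0] * sum_y + XtX_inv[0][1] * sum_xy
slope = XtX_inv[1][0] * sum_y + XtX_inv[1][1] * sum_xy

# Sum of the squared errors for the fitted line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(prices, sales))

print(intercept, slope, sse)
```

For these made-up numbers the fitted line is approximately f(x) = 53.8 - 6.8x, with an SSE of about 2.8.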
References & Resources
- http://www.youtube.com/watch?v=Qa_FI92_qo8