Introduction to Linear Regression
Linear regression is a widely used statistical technique for modeling the relationship between a dependent variable and one or more independent variables. It is a simple yet powerful tool for understanding and predicting trends and patterns in data.
In C++, you can implement linear regression with a numerical library or directly with the standard library. The following example takes the second route for the simplest case, a single independent variable:
#include <cstddef>
#include <iostream>
#include <vector>

// Fit y = slope * x + intercept by ordinary least squares and print the result
void linearRegression(const std::vector<double>& x, const std::vector<double>& y) {
    // The fit needs non-empty inputs of matching length
    if (x.empty() || x.size() != y.size()) {
        std::cerr << "Error: x and y must be non-empty and the same size" << std::endl;
        return;
    }

    // Calculate the mean of x and y
    double sumX = 0;
    double sumY = 0;
    for (const double value : x) {
        sumX += value;
    }
    for (const double value : y) {
        sumY += value;
    }
    double meanX = sumX / x.size();
    double meanY = sumY / y.size();

    // Calculate the coefficients from the deviations around the means
    double numerator = 0;
    double denominator = 0;
    for (std::size_t i = 0; i < x.size(); i++) {
        numerator += (x[i] - meanX) * (y[i] - meanY);
        denominator += (x[i] - meanX) * (x[i] - meanX);
    }

    // If every x value is identical, the slope is undefined
    if (denominator == 0) {
        std::cerr << "Error: all x values are identical" << std::endl;
        return;
    }
    double slope = numerator / denominator;
    double intercept = meanY - slope * meanX;

    std::cout << "Slope: " << slope << std::endl;
    std::cout << "Intercept: " << intercept << std::endl;
}

int main() {
    std::vector<double> x = {1, 2, 3, 4, 5};
    std::vector<double> y = {2, 3, 4, 5, 6};
    linearRegression(x, y);

    return 0;
}
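In the code above, we define a function linearRegression that takes two vectors, x and y, representing the independent and dependent variables, respectively. It calculates the slope and intercept of the regression line using the least squares method: the slope is the sum of the products of the deviations of x and y from their means, divided by the sum of the squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x.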
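Written out, the quantities the function computes are the standard closed-form least squares estimates for a single independent variable. The symbols m (slope), b (intercept), and n (number of data points) are notation introduced here for illustration and do not appear in the code:

```latex
\[
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - m\,\bar{x}
\]
```

Here \bar{x} and \bar{y} correspond to meanX and meanY, and the two sums correspond to the numerator and denominator variables in the loop.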
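You can customize the input data and observe how the slope and intercept change based on the relationship between x and y. With the sample data, where every y value equals x + 1, the program prints a slope of 1 and an intercept of 1.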
Linear regression is commonly used in various fields, including finance, economics, and data analysis, to model and predict linear relationships between variables. It can help you make informed decisions and predictions based on the available data.