Question Regarding Linear Regression (Statistics)

jofra49
Emu Egg

Joined: 5 Apr 2023
Age: 37
Gender: Male
Posts: 1
Location: USA

05 Apr 2023, 7:01 am

Can linear regression be used to determine how rates change over time?

I am trying to do a project that determines how employment rates change each year for 5 years, for each industry, for certain geographical regions, and how to predict said rates for the next 5 years.

How can this be done with Python?



Jono
Veteran

Joined: 10 Jul 2008
Age: 43
Gender: Male
Posts: 5,606
Location: Johannesburg, South Africa

06 Apr 2023, 3:32 am

I'm not sure. I haven't worked much with statistical methods. The relationship between variables may be nonlinear, in which case I'm not sure that linear regression would be suitable. I've found an article about using Python code to find nonlinear correlations in data but I don't know if that's what you're looking for:

https://www.freecodecamp.org/news/how-machines-make-predictions-finding-correlations-in-complex-data-dfd9f0d87889/



stratozyck
Deinonychus

Joined: 28 Jun 2022
Age: 41
Gender: Male
Posts: 366
Location: US

06 Apr 2023, 6:29 pm

jofra49 wrote:
Can linear regression be used to determine how rates change over time?

I am trying to do a project that determines how employment rates change each year for 5 years, for each industry, for certain geographical regions, and how to predict said rates for the next 5 years.

How can this be done with Python?


You could do it in MS Excel if you wanted to.

Most regression techniques assume that all variables have the same order of "integration."

If you violate this assumption you get bad predictions.

For example, let's say you ran a regression of unemployment rates on GDP. Well, unemployment rates are between 0 and 1, and GDP trends upwards over a long horizon. So let's say over a short time you found a correlation. Let's say it was negative: higher GDP -> lower unemployment rate. You would probably find this, depending on where you took the data.

But if you started plugging in numbers for the future, you'd get a result where, as GDP goes higher, predicted unemployment rates go to zero, which doesn't make sense. In fact, with linear regression you'd eventually get negative unemployment rate predictions.
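Here is a minimal sketch of that problem in Python. The GDP and unemployment numbers are made up purely for illustration, and fit_line is a hypothetical helper that does plain ordinary least squares, not anything from a particular library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up numbers for illustration only (GDP level, unemployment as a fraction).
gdp = [20.0, 21.0, 22.0, 23.0, 24.0]
unemployment = [0.060, 0.055, 0.050, 0.046, 0.040]

intercept, slope = fit_line(gdp, unemployment)

# Extrapolate well past the observed GDP range.
for future_gdp in (26.0, 30.0, 40.0):
    predicted = intercept + slope * future_gdp
    print(future_gdp, round(predicted, 4))
```

With these made-up inputs the fitted slope is negative, so the prediction keeps falling as GDP rises and ends up below zero once GDP is far enough outside the sample.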

Instead what you do is run a regression on another variable that is bounded or "stationary." So instead of GDP, you use GDP growth rates. While they can be negative infinity to positive infinity, for practical uses in the data set they tend to vary in a bounded range.
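A rough sketch of that transformation, again with made-up GDP levels. The growth_rates helper is hypothetical, and the pairing at the end assumes each unemployment figure belongs to the same year as the GDP level at the same position.

```python
def growth_rates(levels):
    """Period-over-period growth: (current - previous) / previous."""
    return [(cur - prev) / prev for prev, cur in zip(levels, levels[1:])]


# Made-up numbers for illustration only.
gdp = [20.0, 21.0, 22.0, 23.0, 24.0]
unemployment = [0.060, 0.055, 0.050, 0.046, 0.040]

rates = growth_rates(gdp)
print([round(r, 4) for r in rates])

# The first year has no growth rate, so drop it from the other series
# before running a regression on the pairs.
pairs = list(zip(rates, unemployment[1:]))
print(pairs)
```

The levels keep climbing, while the growth rates stay inside a narrow band, which is the property you want in a regressor.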

To make up for the shortcomings of linear regression, you could use logistic regression instead, which bounds the output between 0 and 1. That said, linear regression won't necessarily give you biased or incorrect estimates, and in practice it might be preferable to logistic regression. A problem with logistic regression is that as estimates get close to 0 or 1, they tend to get unstable. By that I mean a small change in the inputs can lead to a large swing in outputs.
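One simple way to get bounded predictions, sketched below, is to fit an ordinary straight line to the logit of the rate and transform the prediction back. Note this is a logit-transformed linear fit rather than a full maximum-likelihood logistic regression, and the years, rates, and helper names (fit_line, logit, inv_logit) are all assumptions for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def logit(p):
    """Map a rate in (0, 1) onto the whole real line."""
    return math.log(p / (1 - p))


def inv_logit(z):
    """Map any real number back into (0, 1)."""
    return 1 / (1 + math.exp(-z))


# Made-up numbers for illustration only: 5 observed years, rate as a fraction.
years = [0, 1, 2, 3, 4]
rate = [0.060, 0.055, 0.050, 0.046, 0.040]

intercept, slope = fit_line(years, [logit(p) for p in rate])

# Predict the next 5 years; every prediction stays strictly between 0 and 1.
for year in range(5, 10):
    predicted = inv_logit(intercept + slope * year)
    print(year, round(predicted, 4))
```

For the original question, you would repeat a fit like this for each industry and region, and with only 5 data points per series the predictions should be treated as rough.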



DanielW
Veteran

Joined: 17 Jan 2019
Age: 35
Gender: Male
Posts: 1,873
Location: PNW USA

07 Apr 2023, 11:34 am

^^^Very well said ^^^