Highest scored 'linear-regression' questions

324 votes

10 answers

480k views

Add regression line equation and R^2 on graph

I wonder how to add regression line equation and R^2 on the ggplot. My code is: library(ggplot2) df <- data.frame(x = c(1:100)) df$y <- 2 + 3 * df$x + rnorm(100, sd = 40) p <- ggplot(data = ...

MYaseen208

23.3k

asked Sep 26, 2011 at 0:52

279 votes

15 answers

299k views

What is the difference between linear regression and logistic regression? [closed]

When we have to predict the value of a categorical (or discrete) outcome we use logistic regression. I believe we use linear regression to also predict the value of an outcome given the input values. ...

London guy

27.7k

asked Aug 27, 2012 at 17:49

238 votes

7 answers

472k views

How to do exponential and logarithmic curve fitting in Python? I found only polynomial fitting

I have a set of data and I want to compare which line describes it best (polynomials of different orders, exponential or logarithmic). I use Python and Numpy and for polynomial fitting there is a ...

Tomas Novotny

7,957

asked Aug 8, 2010 at 7:36

193 votes

6 answers

620k views

Adding a regression line on a ggplot

I'm trying hard to add a regression line on a ggplot. I first tried with abline but I didn't manage to make it work. Then I tried this... data = data.frame(x.plot=rep(seq(1,5),10),y.plot=rnorm(50)) ...

Remi.b

17.8k

asked Mar 26, 2013 at 9:40

166 votes

6 answers

338k views

How to force R to use a specified factor level as reference in a regression?

How can I tell R to use a certain level as reference if I use binary explanatory variables in a regression? It's just using some level by default. lm(x ~ y + as.factor(b)) with b {0, 1, 2, 3, 4}. ...

Matt Bannert

28k

asked Oct 6, 2010 at 11:46

155 votes

15 answers

243k views

Multiple linear regression in Python

I can't seem to find any python libraries that do multiple regression. The only things I find only do simple regression. I need to regress my dependent variable (y) against several independent ...

Zach

4,654

asked Jul 13, 2012 at 22:14

136 votes

10 answers

137k views

Linear Regression and group by in R

I want to do a linear regression in R using the lm() function. My data is an annual time series with one field for year (22 years) and another for state (50 states). I want to fit a regression for ...

JD Long

60.3k

asked Jul 23, 2009 at 4:00

117 votes

8 answers

396k views

Linear regression with matplotlib / numpy

I'm trying to generate a linear regression on a scatter plot I have generated, however my data is in list format, and all of the examples I can find of using polyfit require using arange. arange doesn'...

user771224

asked May 27, 2011 at 5:32

113 votes

8 answers

330k views

Accuracy Score ValueError: Can't Handle mix of binary and continuous target

I'm using linear_model.LinearRegression from scikit-learn as a predictive model. It works and it's perfect. I have a problem to evaluate the predicted results using the accuracy_score metric. This is ...

Arij SEDIRI

2,108

asked Jun 24, 2016 at 13:57

83 votes

8 answers

234k views

How to overplot a line on a scatter plot in python?

I have two vectors of data and I've put them into pyplot.scatter(). Now I'd like to overplot a linear fit to these data. How would I do this? I've tried using scikit-learn and np.polyfit().

goldisfine

4,800

asked Sep 28, 2013 at 16:05

77 votes

4 answers

148k views

Linear regression analysis with string/categorical features (variables)?

Regression algorithms seem to be working on features represented as numbers. For example: This data set doesn't contain categorical features/variables. It's quite clear how to do regression on this ...

Erba Aitbayev

4,303

asked Nov 30, 2015 at 20:21

76 votes

6 answers

149k views

How to get a regression summary in scikit-learn like R does?

As an R user, I wanted to also get up to speed on scikit. Creating a linear regression model(s) is fine, but can't seem to find a reasonable way to get a standard summary of regression output. ...

mpg

3,779

asked Oct 11, 2014 at 21:04

75 votes

4 answers

27k views

why gradient descent when we can solve linear regression analytically

What is the benefit of using gradient descent for linear regression? It looks like we can solve the problem (finding the theta0-n that minimize the cost function) with an analytical method, so why do we ...

John

2,127

asked Aug 12, 2013 at 16:18
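
A minimal sketch of the two routes this question contrasts, for the single-feature case only. The function names, the toy data, and the learning rate and iteration count are illustrative assumptions, not taken from the question or its answers. The closed-form fit uses the usual least-squares formulas; the iterative fit runs plain batch gradient descent on the mean squared error.

```python
def closed_form_fit(x, y):
    """Least-squares intercept and slope from the analytical formulas."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def gradient_descent_fit(x, y, learning_rate=0.05, n_iterations=5000):
    """Batch gradient descent on the cost (1 / 2m) * sum((theta0 + theta1*x - y)^2)."""
    m = len(x)
    theta0 = theta1 = 0.0
    for _ in range(n_iterations):
        residuals = [theta0 + theta1 * xi - yi for xi, yi in zip(x, y)]
        grad0 = sum(residuals) / m
        grad1 = sum(r * xi for r, xi in zip(residuals, x)) / m
        # Update both parameters from gradients computed at the same point.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Hypothetical noise-free data lying on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

cf_intercept, cf_slope = closed_form_fit(x, y)
gd_intercept, gd_slope = gradient_descent_fit(x, y)
print(cf_intercept, cf_slope)
print(gd_intercept, gd_slope)
```

On a problem this small both approaches land on the same line. With many features the analytical route requires solving a linear system in all of them at once, which is one common reason iterative methods are considered.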

65 votes

6 answers

209k views

gradient descent using python and numpy

def gradient(X_norm,y,theta,alpha,m,n,num_it): temp=np.array(np.zeros_like(theta,float)) for i in range(0,num_it): h=np.dot(X_norm,theta) #temp[j]=theta[j]-(alpha/m)*( np.sum( ...

Madan Ram

876

asked Jul 22, 2013 at 9:55

61 votes

1 answer

233k views

How to calculate the 95% confidence interval for the slope in a linear regression model in R

Here is an exercise from Introductory Statistics with R: With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression model to the relation. According to the fitted model, ...

Yu Fu

1,171

asked Mar 2, 2013 at 22:09

58 votes

10 answers

66k views

Cost Function, Linear Regression, trying to avoid hard coding theta. Octave.

I'm in the second week of Professor Andrew Ng's Machine Learning course through Coursera. We're working on linear regression and right now I'm dealing with coding the cost function. The code I've ...

OhNoNotScott

824

asked Mar 25, 2014 at 4:22

57 votes

6 answers

82k views

Why do I get only one parameter from a statsmodels OLS fit

Here is what I am doing: $ python Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin >>> import statsmodels.api as sm >>> ...

Tom

2,859

asked Dec 20, 2013 at 10:27

55 votes

1 answer

6k views

Is there a better alternative than string manipulation to programmatically build formulas?

Everyone else's functions seem to take formula objects and then do dark magic to them somewhere deep inside and I'm jealous. I'm writing a function that fits multiple models. Parts of the formulas ...

bokov

3,484

asked Oct 19, 2012 at 5:17

54 votes

2 answers

36k views

How to make seaborn regplot partially see through (alpha)

When using seaborn barplot, I can specify an alpha to make the bars semi-translucent. However, when I try this with seaborn regplot, I get an error saying this is an unexpected argument. I read the ...

qwertylpc

2,036

asked Oct 8, 2015 at 3:01

51 votes

6 answers

109k views

TensorFlow: "Attempting to use uninitialized value" in variable initialization

I am trying to implement multivariate linear regression in Python using TensorFlow, but have run into some logical and implementation issues. My code throws the following error: Attempting to use ...

NEW USER

797

asked Mar 15, 2016 at 9:55

49 votes

3 answers

128k views

predict.lm() in a loop. warning: prediction from a rank-deficient fit may be misleading

This R code throws a warning # Fit regression model to each cluster y <- list() length(y) <- k vars <- list() length(vars) <- k f <- list() length(f) <- k for (i in 1:k) { vars[...

Mahsa

561

asked Oct 25, 2014 at 1:56

46 votes

5 answers

127k views

How to extract the regression coefficient from statsmodels.api?

result = sm.OLS(gold_lookback, silver_lookback ).fit() After I get the result, how can I get the coefficient and the constant? In other words, if y = ax + c how to get the values a and c?

JOHN

1,461

asked Nov 20, 2017 at 9:01

46 votes

3 answers

70k views

Linear Regression with a known fixed intercept in R

I want to calculate a linear regression using the lm() function in R. Additionally I want to get the slope of a regression, where I explicitly give the intercept to lm(). I found an example on the ...

R_User

10.9k

asked Sep 7, 2011 at 11:38

44 votes

8 answers

158k views

Error in Confusion Matrix : the data and reference factors must have the same number of levels

I've trained a Linear Regression model with R caret. I'm now trying to generate a confusion matrix and keep getting the following error: Error in confusionMatrix.default(pred, testing$Final) : the ...

abcd

441

asked May 2, 2015 at 11:57

42 votes

5 answers

65k views

Linear Regression :: Normalization (Vs) Standardization

I am using Linear regression to predict data. But, I am getting totally contrasting results when I Normalize (Vs) Standardize variables. Normalization = (x - xmin) / (xmax - xmin) Zero ...

Santosh Kumar

521

asked Aug 20, 2015 at 1:32
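
A short sketch of the two rescalings the title of this question names, assuming the usual definitions: min-max normalization, (x - xmin) / (xmax - xmin), and z-score standardization, (x - mean) / std. The function names and sample values are hypothetical, and the population standard deviation is an assumed choice; the truncated excerpt does not show which definition the asker used.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values to the range [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("standardization is undefined for constant data")
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column.
sample = [2.0, 4.0, 6.0, 8.0]
normalized = min_max_normalize(sample)
standardized = standardize(sample)
print(normalized)
print(standardized)
```

Both are affine transformations of the same column, so they differ only in the shift and scale applied to it.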

41 votes

7 answers

62k views

Linear Regression in Javascript [closed]

I want to do Least Squares Fitting in Javascript in a web browser. Currently users enter data point information using HTML text inputs and then I grab that data with jQuery and graph it with Flot. ...

Chris W.

38.4k

asked Jun 1, 2011 at 1:28

40 votes

4 answers

68k views

How to force zero interception in linear regression?

I have some more or less linear data of the form: x = [0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 20.0, 40.0, 60.0, 80.0] y = [0.50505332505407008, 1.1207373784533172, 2.1981844719020001, ...

Kyra Tafar

403

asked Apr 3, 2012 at 9:45
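
One way to sketch a fit with the intercept held at zero, assuming ordinary least squares on a single predictor: minimizing sum((y - a*x)^2) over a gives a = sum(x*y) / sum(x*x). The function name and the data are hypothetical; the question's own y values are truncated in the excerpt and are not reproduced here.

```python
def slope_through_origin(x, y):
    """Least-squares slope for the model y = a * x (no intercept term)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("slope is undefined when every x is zero")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical data lying exactly on y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
slope = slope_through_origin(x, y)
print(slope)
```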

40 votes

7 answers

40k views

predict.lm() with an unknown factor level in test data

I am fitting a model to factor data and predicting. If the newdata in predict.lm() contains a single factor level that is unknown to the model, all of predict.lm() fails and returns an error. Is ...

Stephan Kolassa

8,219

asked Nov 26, 2010 at 12:15

39 votes

5 answers

158k views

Linear Regression on Pandas DataFrame using Sklearn ( IndexError: tuple index out of range)

I'm new to Python and trying to perform linear regression using sklearn on a pandas dataframe. This is what I did: data = pd.read_csv('xxxx.csv') After that I got a DataFrame of two columns, let's ...

Dinosaur

665

asked Apr 29, 2015 at 3:58

39 votes

1 answer

48k views

In the LinearRegression method in sklearn, what exactly is the fit_intercept parameter doing? [closed]

In the sklearn.linear_model.LinearRegression method, there is a parameter that is fit_intercept = TRUE or fit_intercept = FALSE. I am wondering if we set it to TRUE, does it add an additional ...

user321627

2,430

asked Oct 16, 2017 at 21:47

37 votes

9 answers

72k views

How to find the features names of the coefficients using scikit linear regression?

I use scikit linear regression and if I change the order of the features, the coef are still printed in the same order, hence I would like to know the mapping of the feature with the coeff. #training ...

amehta

1,327

asked Jan 7, 2016 at 7:58

37 votes

2 answers

30k views

How (and why) do you use contrasts?

Under what cases do you create contrasts in your analysis? How is it done and what is it used for? I checked ?contrasts and ?C - both lead to "Chapter 2 of Statistical Models in S", which is not ...

Tal Galili

25k

asked Feb 28, 2010 at 20:51

35 votes

9 answers

142k views

ValueError: Expected 2D array, got 1D array instead:

While practicing Simple Linear Regression Model I got this error, I think there is something wrong with my data set. Here is my data set: Here is independent variable X: Here is dependent variable ...

danyialKhan

697

asked Jul 3, 2018 at 8:40

35 votes

2 answers

15k views

Pandas rolling regression: alternatives to looping

I got good use out of pandas' MovingOLS class (source here) within the deprecated stats/ols module. Unfortunately, it was gutted completely with pandas 0.20. The question of how to run rolling OLS ...

Brad Solomon

39.6k

asked Jun 6, 2017 at 1:31

34 votes

2 answers

56k views

How does predict.lm() compute confidence interval and prediction interval?

I ran a regression: CopierDataRegression <- lm(V1~V2, data=CopierData1) and my task was to obtain a 90% confidence interval for the mean response given V2=6 and 90% prediction interval when V2=...

Mitty

495

asked Jun 29, 2016 at 20:30

34 votes

6 answers

75k views

python linear regression predict by date

I want to predict a value at a date in the future with simple linear regression, but I can't due to the date format. This is the dataframe I have: data_df = date value 2016-01-15 1555 ...

jeangelj

4,428

asked Oct 24, 2016 at 11:35

32 votes

8 answers

69k views

Are there any Linear Regression Function in SQL Server?

Is there a linear regression function in SQL Server 2005/2008, similar to the linear regression functions in Oracle?

rao

1,034

asked Mar 29, 2010 at 9:31

32 votes

3 answers

62k views

How to add interaction term in Python sklearn

If I have independent variables [x1, x2, x3] If I fit linear regression in sklearn it will give me something like this: y = a*x1 + b*x2 + c*x3 + intercept Polynomial regression with poly =2 will ...

Dylan

915

asked Aug 23, 2017 at 0:47

32 votes

3 answers

52k views

Python scikit learn Linear Model Parameter Standard Error

I am working with sklearn and specifically the linear_model module. After fitting a simple linear as in import pandas as pd import numpy as np from sklearn import linear_model randn = np.random....

Ryan

745

asked Mar 13, 2014 at 14:20

31 votes

8 answers

132k views

Scikit-Learn Linear Regression how to get coefficient's respective features?

I'm trying to perform feature selection by evaluating my regressions coefficient outputs, and select the features with the highest magnitude coefficients. The problem is, I don't know how to get the ...

jeffrey

3,254

asked Nov 15, 2014 at 23:14

31 votes

2 answers

84k views

lme4::lmer reports "fixed-effect model matrix is rank deficient", do I need a fix and how to?

I am trying to run a mixed-effects model that predicts F2_difference with the rest of the columns as predictors, but I get an error message that says fixed-effect model matrix is rank deficient so ...

Lisa

939

asked May 7, 2016 at 16:06

31 votes

3 answers

35k views

OLS Regression: Scikit vs. Statsmodels? [closed]

Short version: I was using the scikit LinearRegression on some data, but I'm used to p-values so put the data into the statsmodels OLS, and although the R^2 is about the same the variable coefficients ...

Nat Poor

451

asked Feb 26, 2014 at 22:34

30 votes

2 answers

61k views

How to plot statsmodels linear regression (OLS) cleanly

Problem Statement: I have some nice data in a pandas dataframe. I'd like to run simple linear regression on it: Using statsmodels, I perform my regression. Now, how do I get my plot? I've tried ...

Alex Lenail

13.7k

asked Feb 15, 2017 at 23:20

28 votes

1 answer

25k views

Linear Regression and Gradient Descent in Scikit learn?

In this Coursera course for machine learning, it says gradient descent should converge. I'm using Linear regression from scikit learn. It doesn't provide gradient descent info. I have seen many ...

Netro

7,229

asked Dec 26, 2015 at 6:57

28 votes

3 answers

50k views

Why is numpy.linalg.pinv() preferred over numpy.linalg.inv() for creating inverse of a matrix in linear regression

If we want to search for the optimal parameters theta for a linear regression model by using the normal equation with: theta = inv(X^T * X) * X^T * y one step is to calculate inv(X^T*X). Therefore ...

2Obe

3,640

asked Mar 19, 2018 at 7:04

27 votes

2 answers

58k views

geom_smooth in ggplot2 not working/showing up

I am trying to add a linear regression line to my graph, but when it's run, it's not showing up. The code below is simplified. There are usually multiple points on each day. The graph comes out fine ...

E Phillips

287

asked Feb 22, 2016 at 17:49

27 votes

3 answers

66k views

How to get the P Value in a Variable from OLSResults in Python?

The OLSResults of df2 = pd.read_csv("MultipleRegression.csv") X = df2[['Distance', 'CarrierNum', 'Day', 'DayOfBooking']] Y = df2['Price'] X = add_constant(X) fit = sm.OLS(Y, X).fit() print(fit....

Addzy K

715

asked Dec 10, 2016 at 11:38

25 votes

5 answers

28k views

Can scipy.stats identify and mask obvious outliers?

With scipy.stats.linregress I am performing a simple linear regression on some sets of highly correlated x,y experimental data, and initially visually inspecting each x,y scatter plot for outliers. ...

a different ben

3,950

asked Apr 19, 2012 at 15:14

25 votes

3 answers

9k views

Comparing Results from StandardScaler vs Normalizer in Linear Regression

I'm working through some examples of Linear Regression under different scenarios, comparing the results from using Normalizer and StandardScaler, and the results are puzzling. I'm using the boston ...

Jonathan Bechtel

3,527

asked Jan 7, 2019 at 1:12

25 votes

3 answers

44k views

Efficient Cointegration Test in Python

I am wondering if there is a better way to test if two variables are cointegrated than the following method: import numpy as np import statsmodels.api as sm import statsmodels.tsa.stattools as ts y =...

Akavall

84.3k

asked Jul 6, 2012 at 13:16
