Questions tagged [linear-regression]

For issues related to the linear regression modelling approach.

324 votes
10 answers
480k views

Add regression line equation and R^2 on graph

I wonder how to add regression line equation and R^2 on the ggplot. My code is: library(ggplot2) df <- data.frame(x = c(1:100)) df$y <- 2 + 3 * df$x + rnorm(100, sd = 40) p <- ggplot(data = ...
MYaseen208
  • 23.3k
279 votes
15 answers
299k views

What is the difference between linear regression and logistic regression? [closed]

When we have to predict the value of a categorical (or discrete) outcome we use logistic regression. I believe we use linear regression to also predict the value of an outcome given the input values. ...
London guy
  • 27.7k
238 votes
7 answers
472k views

How to do exponential and logarithmic curve fitting in Python? I found only polynomial fitting

I have a set of data and I want to compare which line describes it best (polynomials of different orders, exponential or logarithmic). I use Python and Numpy and for polynomial fitting there is a ...
Tomas Novotny
193 votes
6 answers
620k views

Adding a regression line on a ggplot

I'm trying hard to add a regression line on a ggplot. I first tried with abline but I didn't manage to make it work. Then I tried this... data = data.frame(x.plot=rep(seq(1,5),10),y.plot=rnorm(50)) ...
Remi.b
  • 17.8k
166 votes
6 answers
338k views

How to force R to use a specified factor level as reference in a regression?

How can I tell R to use a certain level as reference if I use binary explanatory variables in a regression? It's just using some level by default. lm(x ~ y + as.factor(b)) with b {0, 1, 2, 3, 4}. ...
Matt Bannert
155 votes
15 answers
243k views

Multiple linear regression in Python

I can't seem to find any python libraries that do multiple regression. The only things I find only do simple regression. I need to regress my dependent variable (y) against several independent ...
Zach
  • 4,654
136 votes
10 answers
137k views

Linear Regression and group by in R

I want to do a linear regression in R using the lm() function. My data is an annual time series with one field for year (22 years) and another for state (50 states). I want to fit a regression for ...
JD Long
  • 60.3k
117 votes
8 answers
396k views

Linear regression with matplotlib / numpy

I'm trying to generate a linear regression on a scatter plot I have generated, however my data is in list format, and all of the examples I can find of using polyfit require using arange. arange doesn'...
113 votes
8 answers
330k views

Accuracy Score ValueError: Can't Handle mix of binary and continuous target

I'm using linear_model.LinearRegression from scikit-learn as a predictive model. It works and it's perfect. I have a problem to evaluate the predicted results using the accuracy_score metric. This is ...
Arij SEDIRI
  • 2,108
83 votes
8 answers
234k views

How to overplot a line on a scatter plot in python?

I have two vectors of data and I've put them into pyplot.scatter(). Now I'd like to over plot a linear fit to these data. How would I do this? I've tried using scikitlearn and np.polyfit().
goldisfine
  • 4,800
77 votes
4 answers
148k views

Linear regression analysis with string/categorical features (variables)?

Regression algorithms seem to be working on features represented as numbers. For example: This data set doesn't contain categorical features/variables. It's quite clear how to do regression on this ...
Erba Aitbayev
76 votes
6 answers
149k views

How to get a regression summary in scikit-learn like R does?

As an R user, I wanted to also get up to speed on scikit. Creating a linear regression model(s) is fine, but can't seem to find a reasonable way to get a standard summary of regression output. ...
mpg
  • 3,779
75 votes
4 answers
27k views

Why gradient descent when we can solve linear regression analytically

What is the benefit of using gradient descent in the linear regression space? It looks like we can solve the problem (finding theta0-n that minimize the cost function) with an analytical method, so why do we ...
John
  • 2,127
65 votes
6 answers
209k views

gradient descent using python and numpy

def gradient(X_norm,y,theta,alpha,m,n,num_it): temp=np.array(np.zeros_like(theta,float)) for i in range(0,num_it): h=np.dot(X_norm,theta) #temp[j]=theta[j]-(alpha/m)*( np.sum( ...
Madan Ram
  • 876
61 votes
1 answer
233k views

How to calculate the 95% confidence interval for the slope in a linear regression model in R

Here is an exercise from Introductory Statistics with R: With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression model to the relation. According to the fitted model, ...
Yu Fu
  • 1,171
58 votes
10 answers
66k views

Cost Function, Linear Regression, trying to avoid hard coding theta. Octave.

I'm in the second week of Professor Andrew Ng's Machine Learning course through Coursera. We're working on linear regression and right now I'm dealing with coding the cost function. The code I've ...
OhNoNotScott
57 votes
6 answers
82k views

Why do I get only one parameter from a statsmodels OLS fit

Here is what I am doing: $ python Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin >>> import statsmodels.api as sm >>> ...
Tom
  • 2,859
55 votes
1 answer
6k views

Is there a better alternative than string manipulation to programmatically build formulas?

Everyone else's functions seem to take formula objects and then do dark magic to them somewhere deep inside and I'm jealous. I'm writing a function that fits multiple models. Parts of the formulas ...
bokov
  • 3,484
54 votes
2 answers
36k views

How to make seaborn regplot partially see through (alpha)

When using seaborn barplot, I can specify an alpha to make the bars semi-translucent. However, when I try this with seaborn regplot, I get an error saying this is an unexpected argument. I read the ...
qwertylpc
  • 2,036
51 votes
6 answers
109k views

TensorFlow: "Attempting to use uninitialized value" in variable initialization

I am trying to implement multivariate linear regression in Python using TensorFlow, but have run into some logical and implementation issues. My code throws the following error: Attempting to use ...
NEW USER
  • 797
49 votes
3 answers
128k views

predict.lm() in a loop. warning: prediction from a rank-deficient fit may be misleading

This R code throws a warning # Fit regression model to each cluster y <- list() length(y) <- k vars <- list() length(vars) <- k f <- list() length(f) <- k for (i in 1:k) { vars[...
Mahsa
  • 561
46 votes
5 answers
127k views

How to extract the regression coefficient from statsmodels.api?

result = sm.OLS(gold_lookback, silver_lookback ).fit() After I get the result, how can I get the coefficient and the constant? In other words, if y = ax + c how to get the values a and c?
JOHN
  • 1,461
46 votes
3 answers
70k views

Linear Regression with a known fixed intercept in R

I want to calculate a linear regression using the lm() function in R. Additionally I want to get the slope of a regression, where I explicitly give the intercept to lm(). I found an example on the ...
R_User
  • 10.9k
44 votes
8 answers
158k views

Error in Confusion Matrix : the data and reference factors must have the same number of levels

I've trained a Linear Regression model with R caret. I'm now trying to generate a confusion matrix and keep getting the following error: Error in confusionMatrix.default(pred, testing$Final) : the ...
abcd
  • 441
42 votes
5 answers
65k views

Linear Regression :: Normalization (Vs) Standardization

I am using linear regression to predict data, but I am getting totally contrasting results when I normalize versus standardize the variables. Normalization = (x - xmin) / (xmax - xmin)   Zero ...
Santosh Kumar
41 votes
7 answers
62k views

Linear Regression in Javascript [closed]

I want to do Least Squares Fitting in Javascript in a web browser. Currently users enter data point information using HTML text inputs and then I grab that data with jQuery and graph it with Flot. ...
Chris W.
  • 38.4k
40 votes
4 answers
68k views

How to force zero interception in linear regression?

I have some more or less linear data of the form: x = [0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 20.0, 40.0, 60.0, 80.0] y = [0.50505332505407008, 1.1207373784533172, 2.1981844719020001, ...
Kyra Tafar
40 votes
7 answers
40k views

predict.lm() with an unknown factor level in test data

I am fitting a model to factor data and predicting. If the newdata in predict.lm() contains a single factor level that is unknown to the model, all of predict.lm() fails and returns an error. Is ...
Stephan Kolassa
39 votes
5 answers
158k views

Linear Regression on Pandas DataFrame using Sklearn ( IndexError: tuple index out of range)

I'm new to Python and trying to perform linear regression using sklearn on a pandas dataframe. This is what I did: data = pd.read_csv('xxxx.csv') After that I got a DataFrame of two columns, let's ...
Dinosaur
  • 665
39 votes
1 answer
48k views

In the LinearRegression method in sklearn, what exactly is the fit_intercept parameter doing? [closed]

In the sklearn.linear_model.LinearRegression method, there is a parameter that is fit_intercept = TRUE or fit_intercept = FALSE. I am wondering if we set it to TRUE, does it add an additional ...
user321627
  • 2,430
37 votes
9 answers
72k views

How to find the features names of the coefficients using scikit linear regression?

I use scikit linear regression and if I change the order of the features, the coef are still printed in the same order, hence I would like to know the mapping of the feature with the coeff. #training ...
amehta
  • 1,327
37 votes
2 answers
30k views

How (and why) do you use contrasts?

Under what cases do you create contrasts in your analysis? How is it done and what is it used for? I checked ?contrasts and ?C - both lead to "Chapter 2 of Statistical Models in S", which is not ...
Tal Galili
35 votes
9 answers
142k views

ValueError: Expected 2D array, got 1D array instead:

While practicing Simple Linear Regression Model I got this error, I think there is something wrong with my data set. Here is my data set: Here is independent variable X: Here is dependent variable ...
danyialKhan
35 votes
2 answers
15k views

Pandas rolling regression: alternatives to looping

I got good use out of pandas' MovingOLS class (source here) within the deprecated stats/ols module. Unfortunately, it was gutted completely with pandas 0.20. The question of how to run rolling OLS ...
Brad Solomon
  • 39.6k
34 votes
2 answers
56k views

How does predict.lm() compute confidence interval and prediction interval?

I ran a regression: CopierDataRegression <- lm(V1~V2, data=CopierData1) and my task was to obtain a 90% confidence interval for the mean response given V2=6 and 90% prediction interval when V2=...
Mitty
  • 495
34 votes
6 answers
75k views

python linear regression predict by date

I want to predict a value at a date in the future with simple linear regression, but I can't due to the date format. This is the dataframe I have: data_df = date value 2016-01-15 1555 ...
jeangelj
  • 4,428
32 votes
8 answers
69k views

Are there any Linear Regression Functions in SQL Server?

Are there any linear regression functions in SQL Server 2005/2008, similar to the linear regression functions in Oracle?
rao
  • 1,034
32 votes
3 answers
62k views

How to add interaction term in Python sklearn

If I have independent variables [x1, x2, x3] If I fit linear regression in sklearn it will give me something like this: y = a*x1 + b*x2 + c*x3 + intercept Polynomial regression with poly =2 will ...
Dylan
  • 915
32 votes
3 answers
52k views

Python scikit learn Linear Model Parameter Standard Error

I am working with sklearn and specifically the linear_model module. After fitting a simple linear as in import pandas as pd import numpy as np from sklearn import linear_model randn = np.random....
Ryan
  • 745
31 votes
8 answers
132k views

Scikit-Learn Linear Regression how to get coefficient's respective features?

I'm trying to perform feature selection by evaluating my regressions coefficient outputs, and select the features with the highest magnitude coefficients. The problem is, I don't know how to get the ...
jeffrey
  • 3,254
31 votes
2 answers
84k views

lme4::lmer reports "fixed-effect model matrix is rank deficient", do I need a fix and how to?

I am trying to run a mixed-effects model that predicts F2_difference with the rest of the columns as predictors, but I get an error message that says fixed-effect model matrix is rank deficient so ...
Lisa
  • 939
31 votes
3 answers
35k views

OLS Regression: Scikit vs. Statsmodels? [closed]

Short version: I was using the scikit LinearRegression on some data, but I'm used to p-values so put the data into the statsmodels OLS, and although the R^2 is about the same the variable coefficients ...
Nat Poor
  • 451
30 votes
2 answers
61k views

How to plot statsmodels linear regression (OLS) cleanly

Problem Statement: I have some nice data in a pandas dataframe. I'd like to run simple linear regression on it: Using statsmodels, I perform my regression. Now, how do I get my plot? I've tried ...
Alex Lenail
  • 13.7k
28 votes
1 answer
25k views

Linear Regression and Gradient Descent in Scikit learn?

In this Coursera course for machine learning, it says gradient descent should converge. I'm using Linear regression from scikit learn. It doesn't provide gradient descent info. I have seen many ...
Netro
  • 7,229
28 votes
3 answers
50k views

Why is numpy.linalg.pinv() preferred over numpy.linalg.inv() for creating inverse of a matrix in linear regression

If we want to search for the optimal parameters theta for a linear regression model by using the normal equation with: theta = inv(X^T * X) * X^T * y one step is to calculate inv(X^T*X). Therefore ...
2Obe
  • 3,640
27 votes
2 answers
58k views

geom_smooth in ggplot2 not working/showing up

I am trying to add a linear regression line to my graph, but when it's run, it's not showing up. The code below is simplified. There are usually multiple points on each day. The graph comes out fine ...
E Phillips
27 votes
3 answers
66k views

How to get the P Value in a Variable from OLSResults in Python?

The OLSResults of df2 = pd.read_csv("MultipleRegression.csv") X = df2[['Distance', 'CarrierNum', 'Day', 'DayOfBooking']] Y = df2['Price'] X = add_constant(X) fit = sm.OLS(Y, X).fit() print(fit....
Addzy K
  • 715
25 votes
5 answers
28k views

Can scipy.stats identify and mask obvious outliers?

With scipy.stats.linregress I am performing a simple linear regression on some sets of highly correlated x,y experimental data, and initially visually inspecting each x,y scatter plot for outliers. ...
a different ben
25 votes
3 answers
9k views

Comparing Results from StandardScaler vs Normalizer in Linear Regression

I'm working through some examples of Linear Regression under different scenarios, comparing the results from using Normalizer and StandardScaler, and the results are puzzling. I'm using the boston ...
Jonathan Bechtel
25 votes
3 answers
44k views

Efficient Cointegration Test in Python

I am wondering if there is a better way to test if two variables are cointegrated than the following method: import numpy as np import statsmodels.api as sm import statsmodels.tsa.stattools as ts y =...
Akavall
  • 84.3k
