Analysis for Home Prices for Austin, Texas

 
 

Introduction

Multiple regression is used to explain situations where several explanatory variables work together to explain a response. In this project, a multiple regression model was used to analyse home prices in Austin, which are thought to depend on various independent factors. A sample of 34 homes in Austin was chosen, and multiple regression analysis was used to determine how various explanatory factors affect their prices. The analysis yields a regression equation that can be used to predict future home prices in Austin or to explain why home prices in Austin change from one place to another.

Austin Background

Austin is the capital city of Texas and the seat of Travis County. It is the thirteenth most populous city in the United States of America and was among the three fastest-growing cities in the USA from 2000 to 2006 (AustinTexas, 2012). The population of Austin is about 820,611. The city was settled in the 1830s, when immigrants established themselves at the center of present-day Austin, and it was made the capital of Texas by 1839. It was initially known as Waterloo, but the name was later changed to Austin (AustinTexas, 2012).

Austin is a center of high technology and innovation and provides employment for graduates from all over the USA. Many multinational companies are located in Austin, including Dell, Apple, Google and PayPal, as well as pharmaceutical and biotechnology firms. As a result, Austin is a large provider of employment for Americans. Home prices in Austin differ from one place to another, and the cost is determined by various factors. The cost depends on whether the home is a single-family home, a condo or a townhouse. Home prices also depend on geographical location within Austin, the number of bedrooms and bathrooms, and whether the house has an attached garage. Last but not least, home prices are strongly influenced by the size of the plot on which the home is built (Zillow, 2012).

Model Specification and Data

The purpose of a multiple regression model is to analyse the relationship between metric or dichotomous independent variables and a metric dependent variable. Once a relationship has been established between the independent variables and the dependent variable, it can be used to predict values of the dependent variable. The relationship takes the form given below:

Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + … + βkXk + ε

Where

β0, β1, β2, β3 and β4 are parameters

Y = Response or the dependent variable

X1, X2, X3 and X4 are the independent variables/explanatory variables

ε = Random error variable/error term

The values of X1, X2, X3 and X4 are known constants, but the values of β0, β1, β2, β3 and β4 have to be estimated to obtain the regression equation. Once they are known, one can predict how the independent variables affect the dependent variable Y, and one can determine how a single independent variable affects the dependent variable while the other variables are held constant. For instance, the value of β1 indicates the change in the mean response per unit increase in X1 when the rest of the independent variables in the model are held constant.

The parameters β1, β2, β3 and β4 are frequently called partial regression coefficients because each reflects the partial effect of one independent variable when the rest of the independent variables in the model are held constant. Least squares estimation minimizes the sum of squared errors, and this sum never increases as more independent variables are included in the analysis (Lind et al., 2012). Regression estimates are more reliable when the number of observations is large, when the variance of a given explanatory variable is high, and when the independent variables are not closely related to one another.

To analyse home prices in Austin using multiple regression analysis, data was first collected to obtain a representative sample. The data included house type, house plot size in square feet, price per square foot, number of bedrooms, number of bathrooms, availability of an attached garage, house views, type of cooling system, whether the house has a swimming pool, and location in Austin. All these factors affect home prices in Austin, making some homes costlier than others.

The data was obtained online from Zillow, a real estate advertising company, using probability sampling whereby a convenient sample was obtained randomly (Zillow, 2012). A sample describing 34 homes was obtained and is tabulated in Appendix 1. Four independent variables were selected: square feet (X1), number of bedrooms (X2), number of bathrooms (X3) and availability of an attached garage (X4, recorded as 1 if present and 0 otherwise). These were entered into the multiple regression model to determine how they affect home prices in Austin. Once the values of β0, β1, β2, β3 and β4 were estimated, the regression equation was formed.
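The estimation step can be illustrated in code. The following is a minimal sketch, not the procedure used in this study (the analysis itself was run in Excel): it fits the four-variable model to the 34 rows of Appendix 1 by solving the least-squares normal equations with the standard library only. The names HOMES, solve and fit_ols are chosen here purely for illustration. If the rows match the data used in the analysis, the printed coefficients should come out close to those reported in Table 3.

```python
# Rows from Appendix 1: (price, square feet, bedrooms, bathrooms, garage).
HOMES = [
    (303500, 2264, 3, 2, 1), (219000, 1705, 3, 2, 1), (280000, 2673, 3, 3, 1),
    (448000, 3601, 5, 5.5, 0), (210000, 968, 2, 1.5, 0), (259900, 2656, 4, 2.5, 1),
    (139900, 1348, 3, 2, 1), (725000, 3023, 3, 2.5, 1), (124000, 1080, 3, 1, 0),
    (180920, 1918, 3, 2, 0), (170000, 720, 1, 1, 0), (89900, 1049, 2, 2, 0),
    (99900, 1352, 3, 2, 1), (239900, 2214, 3, 2.5, 0), (329900, 1741, 3, 2, 0),
    (635000, 2655, 3, 3, 1), (165000, 1743, 3, 2.5, 1), (199900, 2000, 4, 2, 1),
    (139000, 1300, 2, 1, 0), (475000, 2694, 4, 2, 1), (294000, 1706, 3, 2, 0),
    (400000, 1892, 4, 3, 1), (299000, 2059, 3, 2.5, 1), (124000, 882, 2, 1.5, 1),
    (549000, 3167, 3, 3, 1), (299000, 2059, 4, 3, 1), (400000, 1892, 3, 2.5, 1),
    (407000, 3572, 4, 3, 1), (324000, 1780, 3, 2, 1), (350000, 3529, 2, 2, 0),
    (114900, 1043, 2, 2, 1), (529000, 3502, 4, 4, 1), (235000, 2143, 4, 2, 1),
    (675000, 4040, 4, 4.5, 1),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows):
    """Return [b0, b1, b2, b3, b4] from the normal equations (X'X) b = X'y."""
    X = [[1.0, *r[1:]] for r in rows]  # leading 1.0 is the intercept column
    y = [r[0] for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


beta = fit_ols(HOMES)
residuals = [r[0] - sum(b * x for b, x in zip(beta, [1.0, *r[1:]])) for r in HOMES]

for name, value in zip(["b0", "b1", "b2", "b3", "b4"], beta):
    print(f"{name} = {value:.4f}")
```

Solving the normal equations directly is adequate for a small, well-behaved problem like this one; statistical packages usually rely on more numerically stable factorizations.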

Presentation and Discussion of Results

The data in Appendix 1 was subjected to multiple regression analysis using the Microsoft Excel 2007 data analysis tool. The significance level used in the analysis was 5% (alpha = 0.05). The calculated values of R2 and adjusted R2, the standard error of the regression, the F statistic and its significance, the regression coefficients β0, β1, β2, β3 and β4 with their standard errors, the t-test values of the regression coefficients and their significance, as well as the 95% confidence intervals for the regression coefficients, are tabulated in Table 1, Table 2 and Table 3 below.

Table 1: Regression statistics.

Regression Statistics
Statistic | Value | Description
Multiple R | 0.811315554 | Square root of R2
R Square | 0.658232928 | R2
Adjusted R Square | 0.611092643 | Adjusted R2, used when there are several X's
Standard Error | 105815.3112 | Standard deviation of the regression error (ε)
Observations | 34 | Total number of observations used in the regression (n)

Alpha =0.05

Table 2: Analysis of Variance.

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 4 | 6.25381E+11 | 1.56345E+11 | 13.96327829 | 1.82856E-06
Residual | 29 | 3.2471E+11 | 11196880074
Total | 33 | 9.5009E+11

Alpha =0.05

Regression Sum of Squares = 6.25381E+11; Regression Mean Square = 1.56345E+11

Residual Sum of Squares = 3.2471E+11; Residual (Error) Mean Square = 11196880074

Total Sum of Squares = 9.5009E+11; Calculated F = 13.9633
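The summary statistics in Table 1 follow from the sums of squares in Table 2. As a rough check, the sketch below recomputes R2, adjusted R2, the standard error of the regression and the F statistic from the rounded ANOVA values; the variable names are illustrative, and small differences from the tabulated figures are expected because the inputs are rounded.

```python
import math

# Sums of squares copied (rounded) from Table 2.
ss_regression = 6.25381e11
ss_residual = 3.2471e11
ss_total = 9.5009e11
n, k = 34, 4  # observations and independent variables

df_regression = k          # 4
df_residual = n - k - 1    # 29
df_total = n - 1           # 33

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - ms_residual / (ss_total / df_total)
standard_error = math.sqrt(ms_residual)
f_statistic = ms_regression / ms_residual

print(f"R2             = {r_squared:.4f}")
print(f"Adjusted R2    = {adjusted_r_squared:.4f}")
print(f"Standard error = {standard_error:.1f}")
print(f"F              = {f_statistic:.3f}")
```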

Table 3: Regression Coefficients (β0, β1, β2, β3, β4), Standard error of the regression coefficients (Sb0, Sb1, Sb2, Sb3, Sb4), regression coefficient t-values, t-test p-values (test for β0, β1, β2, β3, β4=0) and 95% confidence intervals for the regression coefficients.

Variable | Coefficient | Standard Error | t Statistic | P-value | Lower 95% | Upper 95%
Intercept | 16889.74659 | 70954.26376 | 0.238037092 | 0.81352621 | -128228.0147 | 162007.508
X1 (Square feet) | 144.6338244 | 33.12051557 | 4.366895319 | 0.00014659 | 76.89476523 | 212.372884
X2 (Bedrooms) | -37966.89113 | 33055.33753 | -1.148585795 | 0.26011288 | -105572.6463 | 29638.864
X3 (Bathrooms) | 27704.99678 | 34621.05523 | 0.800235481 | 0.43008142 | -43103.01054 | 98513.0041
X4 (Garage) | 51847.43925 | 42007.26342 | 1.23424939 | 0.22701869 | -34067.05978 | 137761.938

Alpha =0.05
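The t statistics and confidence limits in Table 3 can be recovered from each coefficient and its standard error: t = coefficient / standard error, and the 95% interval is the coefficient plus or minus the critical t value times the standard error. The sketch below assumes a two-tailed 5% critical value of about 2.04523 for 29 degrees of freedom, taken from a standard t table rather than from the source; the names are illustrative.

```python
# (coefficient, standard error) pairs copied from Table 3.
TABLE_3 = {
    "Intercept": (16889.74659, 70954.26376),
    "X1 (Square feet)": (144.6338244, 33.12051557),
    "X2 (Bedrooms)": (-37966.89113, 33055.33753),
    "X3 (Bathrooms)": (27704.99678, 34621.05523),
    "X4 (Garage)": (51847.43925, 42007.26342),
}

# Assumed two-tailed 5% critical value of Student's t with n - k - 1 = 29 df.
T_CRITICAL = 2.04523


def t_and_interval(coefficient, standard_error, t_critical=T_CRITICAL):
    """Return (t statistic, lower 95% limit, upper 95% limit)."""
    t_statistic = coefficient / standard_error
    half_width = t_critical * standard_error
    return t_statistic, coefficient - half_width, coefficient + half_width


results = {name: t_and_interval(c, se) for name, (c, se) in TABLE_3.items()}

for name, (t_statistic, lower, upper) in results.items():
    print(f"{name}: t = {t_statistic:.4f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

An interval that contains zero corresponds to a coefficient whose t-test is not significant at the 5% level.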

Y-intercept = β0 = 16889.746. This is the predicted value of Y when all the independent variables in the model take the value of zero. The regression equation is given below.

Y = 16889.74+144.634 X1-37966.89 X2+27704.99 X3+51847 X4

HOME PRICE = 16889.74 + 144.634 SQUARE FEET – 37966.89 BEDROOMS + 27704.99 BATHROOMS + 51847 GARAGE

As a result, this multiple regression model can be used to produce a prediction interval for a particular value of Y for a given set of values of the X's. It can also be used to produce an interval estimate for the expected value of Y for a given set of values of the X's. Last but not least, the multiple regression model can be used to elucidate the relationship between the various explanatory variables and the dependent variable through interpretation of the regression coefficients of the X's.

Discussion

Adjusted R-squared (adjusted R2) represents the proportion of the variability of home price explained by the independent variables (X's), adjusted for the number of variables so that models with different numbers of variables can be compared. Adjusted R2 is used in multiple regression analysis because, unlike R2, it does not automatically rise when an extra independent variable is added to the model. However, when a new variable is added in multiple regression, it affects the coefficients of the existing variables. Adjusted R2 depends on the t-statistic of the added explanatory variable: when the absolute value of the t-statistic of an extra variable exceeds 1, adjusted R2 rises, but this does not necessarily imply that the extra independent variable is significant (Lind et al., 2012). From the results in Table 1, the adjusted R2 value is 0.6111. This denotes that 61.11% of the total variability of home prices in Austin, Texas can be attributed to the four independent variables (the regressors) considered in the model, after adjusting for the number of regressors.

The F-test arises from the analysis of variance of the regression and is used to test the significance of the group of four independent variables used in the analysis. The F-test is a test of the goodness of fit of the regression line, and it tests the joint significance of the explanatory variables (Lind et al., 2012). The hypotheses are H0: β1 = β2 = β3 = β4 = 0 versus Ha: at least one of β1, β2, β3 and β4 does not equal zero. The overall F-statistic from the ANOVA in Table 2 is 13.963 and the associated significance level is 1.82856E-06 (about 0.0000018). This value is far smaller than 0.05, the chosen level of significance. Hence the F statistic is significant and H0 is rejected, implying that β1, β2, β3 and β4 are not all equal to zero. As a result, there is a linear relationship between price and at least one of the four independent variables. The F-test does not by itself show which variables matter or the direction of their effects; those questions are addressed by the t-tests on the individual coefficients.

The t-test values of each partial regression coefficient are tabulated in Table 3 together with the associated levels of significance (p-values), which are 0.00015, 0.2601, 0.4301 and 0.2270 for β1, β2, β3 and β4 respectively. Each t-test considers H0: βi = 0 versus Ha: βi ≠ 0 for a single coefficient. The p-value for β1 is less than 0.05, so the coefficient on square feet is significant, while the p-values for β2, β3 and β4 are greater than 0.05, so those coefficients are not significant at the 5% level. A significant t-test indicates that a given independent variable in the model influences the response of the dependent variable while controlling for the other independent variables (Lind et al., 2012). This implies that square footage individually affects home price in Austin, whereas the data do not show a separate effect of the number of bedrooms, the number of bathrooms or the availability of an attached garage once square footage is accounted for.
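The decision rule behind both the F-test and the t-tests is the same: reject the null hypothesis when the p-value is below alpha. A minimal sketch of that rule, applied to the p-values reported in Table 2 and Table 3, is given here; the function and variable names are illustrative only.

```python
ALPHA = 0.05

# Significance F from Table 2 and coefficient p-values from Table 3.
SIGNIFICANCE_F = 1.82856e-06
P_VALUES = {
    "X1 (Square feet)": 0.00014659,
    "X2 (Bedrooms)": 0.26011288,
    "X3 (Bathrooms)": 0.43008142,
    "X4 (Garage)": 0.22701869,
}


def is_significant(p_value, alpha=ALPHA):
    """Reject H0 (coefficient(s) equal zero) when the p-value is below alpha."""
    return p_value < alpha


model_significant = is_significant(SIGNIFICANCE_F)
significant_variables = [
    name for name, p_value in P_VALUES.items() if is_significant(p_value)
]

print("Overall model significant:", model_significant)
print("Individually significant variables:", significant_variables)
```

With these inputs the overall model is flagged as significant, and square feet is the only variable flagged individually.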

Conclusion

Multiple regression models are useful for predicting and forecasting events such as future prices of goods and services with the help of other variables that are likely to affect those prices. Multiple regression models can be used in analyzing seasonal effects, where the F-test is mostly used in making inferences. Changes in demand and the total production cost of companies in future years can also be predicted by carrying out multiple regression analysis. Multiple regression models are also used to explain the current state of affairs and even in theory building. In multiple regression analysis, one is interested in the number and significance of the relationships between the independent variables and the dependent variable; however, correlation among the independent variables also needs to be considered. This is because in many non-experimental situations, such as in business, economics, and the social and biological sciences, most of the explanatory variables used in multiple regression are correlated with one another.

In this context, the multiple regression model can be used to tell why home prices in Austin differ from one place to another, which factors positively or negatively influence home prices, and which factors do not affect home prices in Austin. For instance, the price of a single-family house for sale in Austin with a plot size of 3000 square feet, 5 bedrooms, 3 bathrooms and an attached garage can be estimated using the regression equation obtained above:

Y = 16889.74 + 144.634 X1 – 37966.89 X2 + 27704.99 X3 + 51847 X4

Home Price in USD = 16889.74 + 144.634(3000) – 37966.89(5) + 27704.99(3) + 51847(1) = $395,919.26
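The same arithmetic can be wrapped in a small function. This sketch uses the rounded coefficients of the regression equation, including the garage coefficient of 51847; the function name predict_price and the dictionary keys are illustrative.

```python
# Rounded coefficients taken from the regression equation.
COEFFICIENTS = {
    "intercept": 16889.74,
    "square_feet": 144.634,
    "bedrooms": -37966.89,
    "bathrooms": 27704.99,
    "garage": 51847.0,
}


def predict_price(square_feet, bedrooms, bathrooms, garage):
    """Estimated price in USD; garage is 1 if attached, 0 otherwise."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["square_feet"] * square_feet
        + COEFFICIENTS["bedrooms"] * bedrooms
        + COEFFICIENTS["bathrooms"] * bathrooms
        + COEFFICIENTS["garage"] * garage
    )


estimate = predict_price(square_feet=3000, bedrooms=5, bathrooms=3, garage=1)
print(f"Estimated price: ${estimate:,.2f}")  # Estimated price: $395,919.26
```

This is a point estimate only; given the standard error of roughly 105,815 in Table 1, an individual home's price could differ from it substantially.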

References

AustinTexas. (2012). The official Website for the city of Austin. Web.

Lind, D.A., Marchal, W.G. & Wathen, S. (2012). Basic Statistics for Business & Economics. 8th edition. McGraw-Hill Irwin. ISBN 978-0-07-352147-3.

Zillow. (2012). Online Real Estate Advertising Company. Web.

Appendices

Appendix 1: Description of Austin Homes

PRICE (P) IN USD | SQUARE FEET (SQFT) | P/SQFT IN USD | BEDROOMS | BATHROOMS | GARAGE | LOCATION AT AUSTIN, TEXAS | HOUSE TYPE
303,500 | 2264 | 134 | 3 | 2 | 1 | 10309 THISTLE MOSS COVE 78739 | SINGLE FAMILY (S.F)
219,000 | 1705 | 98 | 3 | 2 | 1 | 13318 PERTHSHIRE ST, 78729 | SINGLE FAMILY (S.F)
280,000 | 2673 | 89 | 3 | 3 | 1 | 8005 WELDON SPRINGS ST. | S.F
448,000 | 3601 | 104 | 5 | 5.5 | 0 | 2724 COLLYWOOD DR. | S.F
210,000 | 968 | 216 | 2 | 1.5 | 0 | 3601 MANCHACA RD APT 106 | CONDO (C)
259,900 | 2656 | 97 | 4 | 2.5 | 1 | 8822 WHITEWORTH LOOP | S.F
139,900 | 1348 | 103 | 3 | 2 | 1 | 1106 JORDAN LN | S.F
725,000 | 3023 | 239 | 3 | 2.5 | 1 | 3312 WESTHILL DR. | CONDO (C)
124,000 | 1080 | 115 | 3 | 1 | 0 | 6706 TAMPA CV. | S.F
180,920 | 1918 | 94 | 3 | 2 | 0 | 14416 LAKE VICTOR | C
170,000 | 720 | 236 | 1 | 1 | 0 | 3601 MANCHACA RD APT 111 | S.F
89,900 | 1049 | 85 | 2 | 2 | 0 | 1015E YAGER LN UNIT 50 | S.F
99,900 | 1352 | 73 | 3 | 2 | 1 | 901E MENDOWMERE LN | S.F
239,900 | 2214 | 108 | 3 | 2.5 | 0 | 10608 DESERT WILLOW LOOP | TOWNHOUSE
329,900 | 1741 | 189 | 3 | 2 | 0 | 400 E 33RD ST. | C
635,000 | 2655 | 239 | 3 | 3 | 1 | 1036 LIBERTY PARK DR.#901 | S.F
165,000 | 1743 | 94 | 3 | 2.5 | 1 | 1704 CANNON YEOMANS TRL. | S.F
199,900 | 2000 | 99 | 4 | 2 | 1 | 10400 YELLOWSTONE DR. | S.F
139,000 | 1300 | 106 | 2 | 1 | 0 | 2731 LYONS RD. | S.F
475,000 | 2694 | 176 | 4 | 2 | 1 | 5301 MARSH CREEK DR. | S.F
294,000 | 1706 | 172 | 3 | 2 | 0 | 1415 SUFFOLK DR. | S.F
400,000 | 1892 | 211 | 4 | 3 | 1 | 1311 PALO DURO RD. | S.F
299,000 | 2059 | 145 | 3 | 2.5 | 1 | 3007 SUN RIDGE DR. | C
124,000 | 882 | 141 | 2 | 1.5 | 1 | 6700 COOPER LN APT 42 | S.F
549,000 | 3167 | 173 | 3 | 3 | 1 | 11820 GRANITE BAY PI | S.F
299,000 | 2059 | 145 | 4 | 3 | 1 | 3007 SUNRIDGE DR. | S.F
400,000 | 1892 | 211 | 3 | 2.5 | 1 | 1311 PALO DURO RD. | S.F
407,000 | 3572 | 113 | 4 | 3 | 1 | 9509 BUNGALOW LN | S.F
324,000 | 1780 | 182 | 3 | 2 | 1 | 2202 ALEXANDER AVE. | C
350,000 | 3529 | 98 | 2 | 2 | 0 | 804 E 45TH ST. | C
114,900 | 1043 | 110 | 2 | 2 | 1 | 1417 W BAKER # B | S.F
529,000 | 3502 | 148 | 4 | 4 | 1 | 5732 MISTY HILL CV. | S.F
235,000 | 2143 | 109 | 4 | 2 | 1 | 5637 WAGON TRAN RD. | S.F
675,000 | 4040 | 167 | 4 | 4.5 | 1 | 7201 CERCIS CV. | S.F

Appendix 2: Regression statistics, ANOVA, Regression coefficient statistics and Residual output

Regression Statistics
Statistic | Value
Multiple R | 0.811315554
R Square | 0.658232928
Adjusted R Square | 0.611092643
Standard Error | 105815.3112
Observations | 34

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 4 | 6.25381E+11 | 1.56345E+11 | 13.96327829 | 1.82856E-06
Residual | 29 | 3.2471E+11 | 11196880074
Total | 33 | 9.5009E+11

Regression Coefficients
Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 16889.74659 | 70954.26376 | 0.238037092 | 0.81352621 | -128228.0147 | 162007.508
X1 (Square feet) | 144.6338244 | 33.12051557 | 4.366895319 | 0.00014659 | 76.89476523 | 212.372884
X2 (Bedrooms) | -37966.89113 | 33055.33753 | -1.148585795 | 0.26011288 | -105572.6463 | 29638.864
X3 (Bathrooms) | 27704.99678 | 34621.05523 | 0.800235481 | 0.43008142 | -43103.01054 | 98513.0041
X4 (Garage) | 51847.43925 | 42007.26342 | 1.23424939 | 0.22701869 | -34067.05978 | 137761.938

RESIDUAL OUTPUT
Observation | Predicted Y (Price) | Residuals | Standard Residuals
1 | 337697.4845 | -34197.48447 | -0.344749479
2 | 256847.1766 | -37847.17663 | -0.381542521
3 | 424557.7154 | -144557.7154 | -1.457305935
4 | 500259.1749 | -52259.17493 | -0.52683183
5 | 122519.0015 | 87480.99847 | 0.881907811
6 | 370279.5509 | -110379.5509 | -1.112751223
7 | 205212.9013 | -65312.90131 | -0.658428216
8 | 461327.0556 | 263672.9444 | 2.658122714
9 | 86898.60034 | 37101.39966 | 0.374024242
10 | 235806.742 | -54886.74198 | -0.55332069
11 | 110764.2058 | 59235.79418 | 0.597164075
12 | 148086.8397 | -58186.8397 | -0.586589423
13 | 205791.4366 | -105891.4366 | -1.067505934
14 | 292470.8524 | -52570.85239 | -0.529973893
15 | 210206.5551 | 119693.4449 | 1.206645852
16 | 421954.3066 | 213045.6934 | 2.14774253
17 | 276195.7603 | -111195.7603 | -1.120979541
18 | 261547.2637 | -61647.26369 | -0.621474426
19 | 156684.9328 | -17684.93284 | -0.178284207
20 | 361923.1378 | 113076.8622 | 1.139943184
21 | 205144.3712 | 88855.6288 | 0.895765645
22 | 273631.8074 | 126368.1926 | 1.273934888
23 | 321900.0489 | -22900.04886 | -0.230858498
24 | 161927.9319 | -37927.93188 | -0.382356625
25 | 496006.8247 | 52993.17531 | 0.534231388
26 | 297785.6561 | 1214.343883 | 0.012241965
27 | 297746.2002 | 102253.7998 | 1.030834425
28 | 516616.6324 | -109616.6324 | -1.105060139
29 | 267694.7135 | 56305.28654 | 0.56762123
30 | 506778.7242 | -156778.7242 | -1.580507583
31 | 199066.476 | -84166.476 | -0.84849366
32 | 534197.2615 | -5197.261516 | -0.052394298
33 | 282229.9006 | -47229.90058 | -0.476131034
34 | 625862.7574 | 49137.24256 | 0.495359207
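
The standard residuals in the output above appear consistent with dividing each residual by the sample standard deviation of the residuals, sqrt(SSE / (n - 1)). This is an inference from the numbers rather than something stated in the source, and the sketch below checks it for three observations only; the names are illustrative.

```python
import math

SS_RESIDUAL = 3.2471e11  # residual sum of squares from the ANOVA table (rounded)
N = 34                   # number of observations

# A few residuals copied from the residual output, keyed by observation number.
RESIDUALS = {1: -34197.48447, 8: 263672.9444, 26: 1214.343883}

# Assumed scaling: sample standard deviation of the residuals (n - 1 divisor).
residual_sd = math.sqrt(SS_RESIDUAL / (N - 1))
standard_residuals = {obs: r / residual_sd for obs, r in RESIDUALS.items()}

for obs, value in standard_residuals.items():
    print(f"Observation {obs}: standard residual = {value:.4f}")
```

Standard residuals larger than about 2 in absolute value, such as those for observations 8 and 16, mark homes whose prices the model fits poorly.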
 