Harvard University DATA SCIENCE DATA SCIEN PH12.7X Linear Regression - 15 Q&A. 100% Score

Document Content and Description Below

1. Which of the following is NOT true about linear regression?
   - Linear regression allows us to predict new values of the independent variable.
   - Linear regression allows us to model how the target variable changes with the independent variables.
   - In linear regression, the target variable is a continuous quantity.
   - Linear regression is used to predict new values of the target variable.

2. The ordinary least squares (OLS) algorithm ________________.
   - Maximizes the sum of square residuals
   - Minimizes the sum of square residuals
   - Minimizes the square of the sum of residuals
   - Maximizes the square of the sum of residuals

3. Overfitting occurs when _____________.
   - The sum of square residuals is too large
   - Our model does not have enough complexity
   - The average of the errors is positive
   - Our model becomes too specific to the training data

4. Using multiple linear regression to add in more independent variables ___________.
   - can help explain more variation in the target variable
   - allows us to fit a non-linear model to the data
   - allows us to add more observational data to the model
   - reduces the overfitting of the data

5. Multicollinearity is the phenomenon where _________________.
   - the independent variables are strongly correlated with the residuals
   - the target variable is strongly correlated with the residuals
   - the independent variables are strongly correlated with other independent variables
   - the target variable is strongly correlated with an independent variable

6. Which of the following is NOT an assumption of ordinary least squares (OLS)?
   - Homoscedasticity of Errors
   - Endogeneity
   - Random Sampling
   - Linearity

7. Which assumption of OLS assumes that there is no correlation between the error and the independent variables?
   - Zero Mean Errors
   - Multicollinearity
   - Endogeneity
   - Autocorrelation of Errors

8. A regression analysis between sales (S) (in $1000) and price (P) (in dollars) resulted in the following equation:

   S = 50,000 - 8P

   The above equation implies that an ___________.
   - increase of $1 in price is associated with a decrease of $8 in sales
   - increase of $1 in price is associated with a decrease of $8000 in sales
   - increase of $1 in price is associated with a decrease of $42,000 in sales
   - increase of $8 in price is associated with an increase of $8,000 in sales

9. Which of the following is the formula for the mean square error?

10. Suppose we build a model to predict a store's sales with three independent variables: customers per day, average daily temperature, and number of products available. If we calculate the p-values for these variables as below, which variables are significant and should be kept in the model? Select all that apply.

    Variable                             p-Value
    Customers per day (I)                0.0
    Average daily temperature (II)       0.54
    Number of products available (III)   0.03

    - Variable I
    - Variable II
    - Variable III

11. Suppose we have produced a simple linear regression model with the following form:

    y = 0.65x + 2.9

    We then calculate the coefficient of determination as 0.92 and a p-value of 0.1. Which of the following best describes our model?
    - The model explains a high amount of variance, and the slope is statistically significant
    - The model explains a high amount of variance but the slope is statistically insignificant
    - The model explains a low amount of variance, but the slope is statistically significant
    - The model explains a low amount of variance but the slope is statistically significant

12. Which of the following evaluation metrics is relative to the total error?
    - Mean absolute error
    - Mean square error
    - Root mean square error
    - Coefficient of determination

13. Which method of regression produces a probability distribution as opposed to a point estimate?
    - Bayesian Regression
    - Poisson Regression
    - LASSO Regression
    - Logistic Regression

14. You are given a dataset of air pollution readings from several locations in an urban setting. The measurements are taken every hour and include information about traffic flow. To perform regression on this longitudinal data, what kind of regression technique would you use?
    - Repeated Measures Regression
    - LASSO Regression
    - Log-Log Regression
    - Polynomial Regression

15. You are working with customer data from a large video-on-demand provider, which contains numerical fields with information such as average number of hours watched per month, number of logins per month, time spent browsing per month etc. In this data, there is a flag that indicates whether the customer canceled the service or not (1 for yes, 0 for no). You are looking to build a model from this data to classify what current customers will cancel. What type of model would you use?
    - Random Effects
    - Poisson Regression
    - Logistic Regression
    - Bayesian Regression
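Several of the questions above refer to least-squares fitting and to the standard error metrics (mean absolute error, mean square error, root mean square error, coefficient of determination). The following is a minimal sketch of how these quantities relate for a simple one-variable model. It is not taken from the course material: the function names `fit_simple_ols` and `regression_metrics` and the sample data are assumptions made up for illustration.

```python
from math import sqrt


def fit_simple_ols(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept.

    Hypothetical helper for illustration; it picks the slope and
    intercept that make the sum of squared residuals as small as possible.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def regression_metrics(xs, ys, slope, intercept):
    """Return common error metrics for a fitted line (hypothetical helper)."""
    n = len(xs)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)       # sum of squared residuals
    y_mean = sum(ys) / n
    ss_tot = sum((y - y_mean) ** 2 for y in ys)   # total variation in y
    mse = ss_res / n
    return {
        "ss_res": ss_res,
        "mae": sum(abs(r) for r in residuals) / n,
        "mse": mse,
        "rmse": sqrt(mse),
        "r2": 1 - ss_res / ss_tot,                # share of variation explained
    }


# Made-up sample data, not from the source document.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.4, 4.3, 4.8, 5.6, 6.1]

slope, intercept = fit_simple_ols(xs, ys)
metrics = regression_metrics(xs, ys, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this sketch the first three metrics are expressed in the units of the target variable (or its square), while the coefficient of determination divides the residual sum of squares by the total sum of squares, so it is a unitless ratio.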

About the document


Uploaded on: Aug 16, 2022
Number of pages: 5
Seller: QuizMaster (member for 4 years, 1087 documents sold)