University of California, Berkeley DATA MISC Homework 10: Linear Regression

Reading: Prediction (https://www.inferentialthinking.com/chapters/15/prediction.html)

1. Triple Jump Distances vs. Vertical Jump Heights

Does skill in one sport imply skill in a related sport? The answer might be different for different activities. Let us find out whether it's true for the triple jump (https://en.wikipedia.org/wiki/Triple_jump) (a horizontal jump similar to a long jump) and the vertical jump. Since we're learning about linear regression, we will look specifically for a linear association between skill level in the two sports.

The following data was collected by observing 40 collegiate-level soccer players. Each athlete's distances in both jump activities were measured in centimeters. Run the cell below to load the data.

Question 1
Before running a regression, it's important to see what the data look like, because our eyes are good at picking out unusual patterns in data. Draw a scatter plot with the triple jump distances on the horizontal axis and the vertical jump heights on the vertical axis that also shows the regression line. See the documentation on scatter here (http://data8.org/datascience/_autosummary/datascience.tables.Table.scatter.html#datascience.tables.Table.scatt) for instructions on how to have Python draw the regression line automatically.

Question 2
Does the correlation coefficient r look closest to 0, .5, or -.5? Explain.

Answer: The correlation coefficient r looks closest to 0.5. The fitted line trends upward, so r is positive, which rules out 0 and -0.5. Note that r equals the slope of the regression line only when both variables are in standard units, so the slope in centimeters is not by itself evidence for the value of r.

Question 3
Create a function called regression_parameters. It takes as its argument a table with two columns. The first column is the x-axis, and the second column is the y-axis. It should compute the correlation between the two columns, then compute the slope and intercept of the regression line that predicts the second column from the first, in original units (centimeters). It should return an array with three elements: the correlation coefficient of the two columns, the slope of the regression line, and the intercept of the regression line.

Question 4
Let's use parameters to predict what certain athletes' vertical jump heights would be given their triple jump distances. The world record for the triple jump distance is 18.29 meters, set by Jonathan Edwards. What's our prediction for what Edwards' vertical jump would be?
Hint: Make sure to convert from meters to centimeters!

Question 5
Do you expect this estimate to be accurate within a few centimeters? Why or why not?
Hint: Compare Edwards' triple jump distance to the triple jump distances in jumps. Is it relatively similar to the rest of the data?

Answer: No. Edwards' triple jump distance is much greater than any of the triple jump distances in the table. Because it lies far outside the range of the observed data, the prediction is an extrapolation, and estimating his vertical jump height from the data in the table will not be accurate within a few centimeters.

2. Cryptocurrencies

Imagine you're an investor in December 2017. Cryptocurrencies, online currencies backed by secure software, are becoming extremely valuable, and you want in on the action! The two most valuable cryptocurrencies are Bitcoin (BTC) and Ethereum (ETH). Each one has a dollar price attached to it at any given moment in time. For example, on December 1st, 2017, one BTC costs $10859.56 and one ETH costs $424.64.

You want to predict the price of ETH at some point in time based on the price of BTC. Below, we load (https://www.kaggle.com/jessevent/all-crypto-currencies/data) two tables called btc and eth. Each has 5 columns:
  date: the date
  open: the value of the currency at the beginning of the day
  close: the value of the currency at the end of the day
  market: the market cap, or total dollar value invested in the currency
  day: the number of days since the start of our data

Question 1
In the cell below, make one or two plots to investigate the opening prices of BTC and ETH as a function of time. Then comment on whether you think the values roughly move together.

Answer: The values roughly moved together when they were lower in value, but started spreading out once they were higher in value. Overall, they show a positive relationship.

Question 2
Now, calculate the correlation coefficient between the opening prices of BTC and ETH.
Hint: It may be helpful to define and use the function std_units.

Question 3
Regardless of your conclusions above, write a function eth_predictor which takes an opening BTC price and predicts the price of ETH. Again, it will be helpful to use the function regression_parameters that you defined earlier in this homework.
Note: Make sure that your eth_predictor is using linear regression.

Question 4
Now, using the eth_predictor you defined in the previous question, make a scatter plot with BTC prices along the x-axis and both real and predicted ETH prices along the y-axis. The color of the dots for the real ETH prices should be different from the color for the predicted ETH prices.
Hints: An example of such a scatter plot is generated here (https://www.inferentialthinking.com/chapters/15/2/regression-line.html). Think about the table that must be produced and used to generate this scatter plot. What data should the columns represent? Based on the data that you need, how many columns should be present in this table? Also, what should each row represent? Constructing the table will be the main part of this question; once you have this table, generating the scatter plot should be straightforward as usual.
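The regression steps described in the questions above (standard units, correlation, slope and intercept in original units, then prediction) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the homework's reference solution: it takes two lists of numbers instead of a two-column table, the helper names standard_units, correlation, and predict are hypothetical, and the data are made up for the example rather than taken from the jumps, btc, or eth tables.

```python
from statistics import fmean, pstdev


def standard_units(values):
    """Convert numbers to standard units: (value - mean) / population SD."""
    mean, sd = fmean(values), pstdev(values)
    return [(v - mean) / sd for v in values]


def correlation(x, y):
    """Correlation r: the mean of the products of x and y in standard units."""
    return fmean(a * b for a, b in zip(standard_units(x), standard_units(y)))


def regression_parameters(x, y):
    """Return [r, slope, intercept] for predicting y from x in original units.

    Assumption: takes two sequences rather than a two-column table.
    """
    r = correlation(x, y)
    slope = r * pstdev(y) / pstdev(x)
    intercept = fmean(y) - slope * fmean(x)
    return [r, slope, intercept]


def predict(parameters, x_value):
    """Apply the fitted line to one x value (hypothetical helper)."""
    _, slope, intercept = parameters
    return slope * x_value + intercept


# Made-up, perfectly linear toy data in centimeters (NOT the homework data).
triple = [400.0, 500.0, 600.0, 700.0]
vertical = [45.0, 50.0, 55.0, 60.0]

parameters = regression_parameters(triple, vertical)

# Convert the 18.29 m record to centimeters before predicting.
edwards_cm = 18.29 * 100
prediction = predict(parameters, edwards_cm)

# Extrapolation check: is the new x inside the range of the observed x values?
within_range = min(triple) <= edwards_cm <= max(triple)
```

Here within_range is False, which is the reason for distrusting the prediction: the line is being applied far beyond the data it was fitted on. The same predict helper could serve as a basis for eth_predictor if the parameters were fitted on BTC and ETH opening prices instead.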


Uploaded on: Oct 02, 2022
Number of pages: 25
Seller: QuizMaster

