Georgia Tech ISYE - 6501 Homework 2, Due Date: Thursday, September 3rd, 2020



Contents
1 ISYE - 6501 Homework 2
2 Homework Analysis
 2.1 Analysis 3.1
 2.2 Analysis 4.1
 2.3 Analysis 4.2

1 ISYE - 6501 Homework 2

This document contains my analysis for ISYE - 6501 Homework 2, which is due on Thursday, September 3rd, 2020. Enjoy!

2 Homework Analysis

2.1 Analysis 3.1

Q: Using the same data set (credit_card_data.txt or credit_card_data-headers.txt) as in Question 2.2, use the ksvm or kknn function to find a good classifier:

(a) using cross-validation (do this for the k-nearest-neighbors model; SVM is optional)

RESULTS

Using 10-fold cross-validation on a k-nearest-neighbors (KNN) model with k = 15 and a rectangular kernel, we achieved an accuracy score of roughly 85% (85.47009%). This means that about 85 out of every 100 applicants are predicted correctly!
THE CODE:

# needed libraries
rm(list = ls())
library(kknn)
library(dplyr)
set.seed(12345)

# read data into R
data_path <- "data 3.1/"
data_filename <- "credit_card_data-headers.txt"
credit_data <- read.delim(paste0(data_path, data_filename), header = TRUE)

# train-valid-test split
sample_split <- sample(1:3, size = nrow(credit_data), prob = c(0.7, 0.15, 0.15), replace = TRUE)
train_credit <- credit_data[sample_split == 1, ]
valid_credit <- credit_data[sample_split == 2, ]
test_credit  <- credit_data[sample_split == 3, ]

# training our model
# (note: train.kknn itself performs leave-one-out cross-validation;
#  the kcv argument belongs to cv.kknn and is silently ignored here)
train_model <- train.kknn(R1 ~ ., train_credit, kmax = 100, scale = TRUE, kcv = 10,
                          kernel = c("rectangular", "triangular", "epanechnikov",
                                     "gaussian", "rank", "optimal"))
train_model

##
## Call:
## train.kknn(formula = R1 ~ ., data = train_credit, kmax = 100, kernel = c("rectangular", "triangul
##
## Type of response variable: continuous
## Minimal mean absolute error: 0.221968
## Minimal mean squared error: 0.1175795
## Best kernel: rectangular
## Best k: 15

Using cross-validation at 10 folds, we can see that the best kernel for our model is rectangular with a k of 15. Now that we have the best parameters for our model, let's use them to train on our validation data.

# validating our model
valid_model <- train.kknn(R1 ~ ., valid_credit, ks = 15, kernel = "rectangular", scale = TRUE)
valid_pred <- round(predict(valid_model, valid_credit))
accuracy_score <- sum(valid_pred == valid_credit[, 11]) / nrow(valid_credit)
accuracy_score * 100

## [1] 91

Our validation model provides an accuracy score of 91%. Now, let's run the model through our test data to measure its true performance on data it hasn't seen before.

# run test data through model
test_pred <- round(predict(valid_model, test_credit))
accuracy_score <- sum(test_pred == test_credit[, 11]) / nrow(test_credit)
accuracy_score * 100

## [1] 85.47009

Our test data provides an accuracy score of roughly 85%, which is lower than the validation accuracy score (91%).
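Because train.kknn treats the 0/1 response as continuous, the fitted values must be rounded to class labels before scoring, which is what round(predict(...)) does in the code above. A minimal base-R sketch of that thresholding and accuracy computation (the fitted values here are illustrative, not taken from the real credit data):

```r
# Illustrative fitted values (not from the actual credit data)
fitted_vals <- c(0.12, 0.87, 0.49, 0.93)
actual      <- c(0,    1,    1,    1)

# round() maps values above 0.5 to 1 and below 0.5 to 0,
# mirroring the round(predict(...)) step above
labels <- round(fitted_vals)          # c(0, 1, 0, 1)

# accuracy = share of predicted labels that match the actual labels
accuracy <- sum(labels == actual) / length(actual)
accuracy                              # 0.75
```

The same sum-and-divide pattern generalizes to any 0/1 prediction vector of equal length.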
We can conclude that the model overfit the randomness of the data it was validated on (valid_credit); the test-set accuracy, measured on data the model had not seen before (test_credit), is closer to the model's true performance.

(b) splitting the data into training, validation, and test data sets (pick either KNN or SVM; the other is optional).

RESULTS

A train-valid-test split allows us to produce a Support Vector Machine (SVM) with a splinedot kernel and C value of 1 that achieves an accuracy score of roughly 82% (82.05128%).

OUR CLASSIFIER EQUATION

The SVM model's classifier takes the form:

classifier = β0 + β1x1 + β2x2 + ... + βpxp

With an error margin (C) of 1 and the splinedot kernel, we produced the following classifier equation:

classifier = -0.1994897 + 0.040093784x1 + 0.105580560x2 - 0.023190123x3 + 0.019249666x4 + 0.379332434x5 - 0.118072986x6 - 0.001239596x7 - 0.025400298x8 + 0.026589464x9 + 0.101784342x10

THE CODE:

# needed libraries
library(kernlab)
library(magicfor)
library(ggplot2)
library(hrbrthemes)
set.seed(12345)

# train-valid-test split
sample_split <- sample(1:3, size = nrow(credit_data), prob = c(0.7, 0.15, 0.15), replace = TRUE)
train_credit <- credit_data[sample_split == 1, ]
valid_credit <- credit_data[sample_split == 2, ]
test_credit  <- credit_data[sample_split == 3, ]

# training our model across candidate kernels
magic_for(print, silent = TRUE)
kerns <- list("rbfdot", "polydot", "vanilladot", "tanhdot",
              "laplacedot", "besseldot", "anovadot", "splinedot")
for (kern in kerns) {
  train_model <- ksvm(R1 ~ ., data = train_credit, type = "C-svc",
                      kernel = kern, C = 1, scaled = TRUE)
  train_pred <- predict(train_model, train_credit[, 1:10])
  accuracy <- sum(train_pred == train_credit[, 11]) / nrow(train_credit)
  print(accuracy)
}
kern_accuracy <- magic_result_as_dataframe()

# displaying our model's best kernel
ggplot(kern_accuracy, aes(x = kern, y = accuracy, color = accuracy)) +
  geom_point(size = 8) +
  ylim(c(0.7, 1)) +
  labs(title = "Accuracy Scores vs Kernel Function",
       y = "Accuracy Scores", x = "Kernel Function") +
  theme(plot.title = element_text(hjust = 0.5),
        axis.title.x = element_text(hjust = 0.5, size = 14),
        axis.title.y = element_text(hjust = 0.5, size = 14),
        legend.position = "none",
        axis.line = element_line(size = 1, linetype = "solid")) +
  geom_vline(xintercept = "splinedot", linetype = "dotted", color = "blue", size = 2)
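As a sanity check on the classifier equation reported above, the coefficients can be plugged in directly: a point is classified as 1 when the linear expression is non-negative and 0 otherwise (the threshold at 0 is the usual SVM decision rule; note that this linear read-off is exact only for a linear kernel such as vanilladot, so for splinedot it should be treated as an approximation). A minimal base-R sketch:

```r
# Coefficients copied from the classifier equation above
beta0 <- -0.1994897
beta  <- c( 0.040093784,  0.105580560, -0.023190123,  0.019249666,
            0.379332434, -0.118072986, -0.001239596, -0.025400298,
            0.026589464,  0.101784342)

# Classify a (scaled) predictor vector x: 1 if the expression is >= 0, else 0
classify <- function(x) as.integer(beta0 + sum(beta * x) >= 0)

classify(rep(0, 10))   # 0 (the intercept alone is negative)
classify(rep(1, 10))   # 1 (the coefficients sum past the intercept's magnitude)
```

Any input vector would need the same scaling applied during training before being passed to classify().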



