UIUC-CS412 An Introduction to Data Mining, University of Illinois (Fall 2020) Final Exam. 100 marks.
1 Minkowski Distance [10 points]
Given three data points in 2-D space: $x_1 = (1, 0)'$, $x_2 = (-1, 0)'$ and $x_3 = (a, b)'$, where $a$ and $b$ are two unknown numbers. Let $d_1$ be the distance between $x_1$ and $x_3$, and $d_2$ be the distance between $x_2$ and $x_3$.
(a) [6 pts] What are the $L_2$, $L_1$ and $L_\infty$ distances between $x_1$ and $x_2$, respectively?
(b) [1.5 pts] If we use the $L_2$ distance, under which condition does $d_1 = d_2$?
(c) [2.5 pts] If we use the $L_1$ distance, under which condition does $d_1 = d_2$?
Half a point for each part.

2 Basic Statistics and Normalization [10 points]
Table 1 provides the final exam scores of 9 randomly sampled students in an online course.
Table 1: Final Exam Scores of 9 Students.
(a) [3 pts] What is the median score?
(b) [3 pts] [True or False] If one student's score improves, the sample mean will definitely increase as well.
(c) [2 pts] [True or False] If the scores of six students improve, the median will definitely increase as well.
(d) [2 pts] Suppose the scores of $k$ ($1 \le k \le 9$) students improve and the remaining $(9 - k)$ students' scores remain the same. What is the minimal $k$ such that the median will definitely increase?

3 Data Warehouse [10 points]
(a) [4 pts] Suppose we build a data warehouse with three dimensions: location, supplier, and time. If we do not consider the concept hierarchy, how many cuboids are there in total?
(b) [6 pts] Suppose the location dimension has three distinct values (Urbana, Chicago and New York City), the supplier dimension has two distinct values (Dairy Land and Land O'Lakes), and the time dimension has twelve distinct values (January through December). How many base cells are there in total (3 pts)? How many aggregated cells are there in total (3 pts)?

4 Pattern Evaluations [10 points]
Given two itemsets A and B and the following contingency table (Table 2).
            A        ¬A       Σ_row
B           a        b        a + b
¬B          c        d        c + d
Σ_col       a + c    b + d    a + b + c + d
Table 2: Contingency Table of Problem 4
(a) [4 pts] If we use lift as the interestingness measure, under which condition will we conclude that A and B are positively correlated?
Solution: $ad > bc$
(b) [3 pts] Suppose we conclude that A and B are positively correlated based on lift. [True or False] Now suppose we increase $d$ while keeping $a$, $b$ and $c$ unchanged. A and B will still be positively correlated based on lift.
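For Problem 1(a), the three distances can be checked numerically. The following is a minimal sketch, not part of the exam: the helper name `minkowski` is hypothetical, and the third distance is assumed to be the L-infinity (maximum-coordinate) distance.

```python
import math

def minkowski(u, v, p):
    """L_p distance between two equal-length points; p = math.inf gives L_infinity."""
    diffs = [abs(ui - vi) for ui, vi in zip(u, v)]
    if math.isinf(p):
        # L_infinity is the largest absolute coordinate difference.
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

# The two fixed points from Problem 1.
x1, x2 = (1, 0), (-1, 0)

distances = {
    name: minkowski(x1, x2, p)
    for name, p in [("L2", 2), ("L1", 1), ("Linf", math.inf)]
}
print(distances)
```

Evaluating the same function at trial points x3 = (a, b) is one way to test a candidate condition for parts (b) and (c) before proving it.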
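Claims like those in Problem 2(b) to 2(d) can be probed on made-up data. The sketch below uses hypothetical scores, since the values of Table 1 are not reproduced here, and a hypothetical helper `improve`; it only shows how a single counterexample can be searched for, and is not a proof.

```python
from statistics import median

# Hypothetical scores: the actual values of Table 1 are not available here.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95]

def improve(values, indices, delta=1):
    """Return a copy of values with delta added at the given indices."""
    chosen = set(indices)
    return [v + delta if i in chosen else v for i, v in enumerate(values)]

before = median(scores)

# Six students improve by one point: the two lowest and the four highest.
six_improved = improve(scores, [0, 1, 5, 6, 7, 8])
after_six = median(six_improved)

# All nine students improve by one point.
after_all = median(improve(scores, range(len(scores))))

print(before, after_six, after_all)
```

In this particular made-up sample the median is unchanged when those six scores rise, because the middle score itself did not move.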
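The counts asked for in Problem 3 follow from simple products. This sketch assumes a full cube (every combination of dimension values actually occurs) and no concept hierarchy; the variable names are illustrative only.

```python
from math import prod

# Number of distinct values per dimension, as stated in Problem 3(b).
cardinalities = {"location": 3, "supplier": 2, "time": 12}

# Each dimension is either kept or aggregated away, so there are 2^n cuboids.
n_cuboids = 2 ** len(cardinalities)

# A base cell fixes a concrete value in every dimension.
base_cells = prod(cardinalities.values())

# Allowing the extra "*" (all) value per dimension counts every cell in the cube.
all_cells = prod(c + 1 for c in cardinalities.values())

# Aggregated cells are all cells that are not base cells (the apex cell included).
aggregate_cells = all_cells - base_cells

print(n_cuboids, base_cells, aggregate_cells)
```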
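The stated solution to Problem 4(a) can be sanity-checked by computing lift directly from the cells of Table 2 and comparing it with the condition ad > bc. This is a rough sketch with made-up counts; the function names `lift` and `positively_correlated` are hypothetical.

```python
def lift(a, b, c, d):
    """Lift of A and B from counts laid out as in Table 2.

    a = A and B, b = not-A and B, c = A and not-B, d = not-A and not-B.
    """
    n = a + b + c + d
    p_ab = a / n          # P(A and B)
    p_a = (a + c) / n     # column total for A
    p_b = (a + b) / n     # row total for B
    return p_ab / (p_a * p_b)

def positively_correlated(a, b, c, d):
    """Condition from the solution to part (a): lift > 1 exactly when ad > bc."""
    return a * d > b * c

# Made-up counts for illustration only.
example = (40, 10, 20, 30)
print(lift(*example), positively_correlated(*example))
```

Re-running `lift` with a larger `d` and the other three counts fixed is a quick way to explore part (b) numerically.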
About the document: uploaded on Apr 02, 2023; 13 pages.