Algorithms & Data Structures Question

Description

in the file


College of Computing and Informatics
Assignment 1
Deadline: Day 05/03/2024 @ 23:59
[Total Mark for this Assignment is 8]
Student Details:
Name:
ID:
CRN:
Instructions:
• You must submit two separate copies (one Word file and one PDF file) using the Assignment Template on
Blackboard via the allocated folder. These files must not be in compressed format.
• It is your responsibility to check and make sure that you have uploaded both the correct files.
• A zero mark will be given if you try to bypass SafeAssign (e.g. misspelling words, removing spaces between
words, hiding characters, using different character sets, converting text into images or languages other than
English, or any other kind of manipulation).
• Email submission will not be accepted.
• You are advised to make your work clear and well-presented. This includes filling in your information on the
cover page.
• You must use this template; failure to do so will result in a zero mark.
• You MUST show all your work, and text must not be converted into an image, unless specified otherwise by
the question.
• Late submission will result in ZERO mark.
• The work should be your own; copying from students or other resources will result in a ZERO mark.
• Use Times New Roman font for all your answers.
Question One
2 Marks
Learning Outcome(s): 4
Evaluate the performance of data mining algorithms.
An online streaming platform wants to analyze whether there is a significant association
between users’ subscription plans (Basic, Premium, Family) and their preferred device
for streaming (Smartphone, Tablet, Laptop). The platform samples 300 subscribers
randomly and records their subscription plans and preferred streaming devices.
Based on the data, can the streaming platform conclude that there is a relationship
between subscription plans and preferred streaming devices among its users? (Use the
chi-square test with a significance level (α) of 0.05.)
            Smartphone   Tablet   Laptop
Basic           50         30       20
Premium         60         40       20
Family          30         20       30
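The chi-square test of independence for the counts above can be sketched as follows. This is a minimal illustration, not a model answer: the function name `chi_square_independence` is made up for this example, and the critical value 9.488 is the standard chi-square table entry for 4 degrees of freedom at α = 0.05.

```python
# Observed counts from the question: rows are plans, columns are devices.
observed = {
    "Basic":   {"Smartphone": 50, "Tablet": 30, "Laptop": 20},
    "Premium": {"Smartphone": 60, "Tablet": 40, "Laptop": 20},
    "Family":  {"Smartphone": 30, "Tablet": 20, "Laptop": 30},
}


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a two-way contingency table."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())

    statistic = 0.0
    for r in rows:
        for c in cols:
            # Expected count under independence: row total * column total / N
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (table[r][c] - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(cols) - 1)
    return statistic, dof


# Standard table value for df = 4, alpha = 0.05 (hard-coded assumption).
CRITICAL_VALUE_DF4_ALPHA_005 = 9.488

statistic, dof = chi_square_independence(observed)
reject_null = statistic > CRITICAL_VALUE_DF4_ALPHA_005
print(f"chi-square = {statistic:.3f}, df = {dof}, reject H0: {reject_null}")
```

The null hypothesis is that plan and device are independent. It is rejected when the statistic exceeds the critical value.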
Question Two
2 Marks
Learning Outcome(s): 2
Demonstrate a wide range of clustering, estimation, prediction, and classification
algorithms to solve a specific program or application.
It is important to define/select similarity measures in data analysis. However, there is no
commonly accepted subjective similarity measure. Results can vary depending on the
similarity measures used. Nonetheless, seemingly different similarity measures may be
equivalent after some transformation.
Suppose we have the following 2-D data set:
Consider the data as a pair of data points. Given a new data point, x = (1.4, 1.6) as a
query, rank the database points based on similarity with the query using Euclidean
distance, Manhattan distance, supremum distance, and cosine similarity.
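The four measures named in the question can be sketched as below. The query `(1.4, 1.6)` comes from the question. The entries in `points` are hypothetical placeholders, because the question's data set is not reproduced in this text. The helper `rank` is also an illustrative name, not part of any library.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def supremum(p, q):
    # Also called the Chebyshev or L-infinity distance.
    return max(abs(a - b) for a, b in zip(p, q))


def cosine_similarity(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def rank(points, query, measure, larger_is_closer=False):
    """Return point names ordered from most to least similar to the query."""
    return sorted(
        points,
        key=lambda name: measure(points[name], query),
        reverse=larger_is_closer,
    )


query = (1.4, 1.6)  # from the question

# HYPOTHETICAL placeholder points, not the assignment's data set.
points = {"p1": (1.0, 1.5), "p2": (2.0, 1.0), "p3": (3.0, 2.0)}

print("Euclidean:", rank(points, query, euclidean))
print("Manhattan:", rank(points, query, manhattan))
print("Supremum: ", rank(points, query, supremum))
print("Cosine:   ", rank(points, query, cosine_similarity, larger_is_closer=True))
```

For the three distances a smaller value means more similar, whereas for cosine similarity a larger value does, hence the `larger_is_closer` flag. With these placeholder points the cosine ranking differs from the distance rankings, because cosine compares direction only and ignores magnitude.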
Question Three
2 Marks
Learning Outcome(s): 3
Employ data mining and data warehousing techniques to solve real-world problems.
Suppose a group of 12 sales price records has been sorted as follows:
5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215.
Partition them into three bins by each of the following methods:
(a) equal-frequency (equal-depth) partitioning
(b) equal-width partitioning
Let’s partition the sales price records into three bins using the given methods:
(a) Equal-Frequency (Equal-Depth) Partitioning:
– Equal-frequency partitioning divides the data into bins such that each bin
contains an equal number of data points.
1. Divide the total number of records (12) by the number of bins (3) to
determine the frequency per bin:
– Frequency per bin = 12 / 3 = 4
2. Assign each record to its corresponding bin based on its position in the
sorted list:
– Bin 1: {5, 10, 11, 13}
– Bin 2: {15, 35, 50, 55}
– Bin 3: {72, 92, 204, 215}
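The equal-frequency steps above can be sketched in a few lines. This is a minimal illustration that assumes the input is already sorted and that the number of records divides evenly by the number of bins. The function name `equal_frequency_bins` is made up for this example.

```python
def equal_frequency_bins(sorted_values, n_bins):
    """Split already-sorted values into n_bins consecutive bins of equal size."""
    size, remainder = divmod(len(sorted_values), n_bins)
    if remainder:
        raise ValueError("this sketch assumes the count divides evenly")
    return [sorted_values[i:i + size] for i in range(0, len(sorted_values), size)]


prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
for number, contents in enumerate(equal_frequency_bins(prices, 3), start=1):
    print(f"Bin {number}: {contents}")
# Bin 1: [5, 10, 11, 13]
# Bin 2: [15, 35, 50, 55]
# Bin 3: [72, 92, 204, 215]
```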
(b) Equal-Width Partitioning:
– Equal-width partitioning divides the data into bins such that each bin covers the
same range of values.
1. Determine the width of each bin:
– Total range = Max value – Min value = 215 – 5 = 210
– Width per bin = Total range / Number of bins = 210 / 3 = 70
– Bin intervals: [5, 75), [75, 145), [145, 215]
2. Assign each record to the bin whose interval contains its value:
– Bin 1: {5, 10, 11, 13, 15, 35, 50, 55, 72}
– Bin 2: {92}
– Bin 3: {204, 215}
In summary:
(a) Equal-Frequency Partitioning:
– Bin 1: {5, 10, 11, 13}
– Bin 2: {15, 35, 50, 55}
– Bin 3: {72, 92, 204, 215}
(b) Equal-Width Partitioning:
– Bin 1: {5, 10, 11, 13, 15, 35, 50, 55, 72}
– Bin 2: {92}
– Bin 3: {204, 215}
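Equal-width binning for the same twelve prices can be sketched as below. The sketch assumes half-open intervals that start at the minimum value, with the maximum placed in the last bin; other boundary conventions exist. The function name `equal_width_bins` is made up for this example.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of identical width."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    bins = [[] for _ in range(n_bins)]
    for value in values:
        # Interval i covers [low + i*width, low + (i+1)*width); the maximum
        # would land at index n_bins, so it is clamped into the last bin.
        index = min(int((value - low) / width), n_bins - 1)
        bins[index].append(value)
    return width, bins


prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
width, bins = equal_width_bins(prices, 3)
print(f"width = {width}")
for number, contents in enumerate(bins, start=1):
    print(f"Bin {number}: {contents}")
```

Because the interval width is fixed by the data range rather than by the number of records, the bins can hold very different numbers of values when the data is skewed.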
Question Four
2 Marks
Learning Outcome(s): 1
Define different data mining tasks, problems, and the algorithms most appropriate for
addressing them.
Suppose that a data warehouse consists of four dimensions (date, spectator, location,
and game) and two measures (count and charge) where charge is the fare that a spectator
pays when watching a game on a given date.
Spectators may be students, adults, or seniors, with each category having its own
charge rate.
(a) Draw a star schema diagram for the data warehouse.
(b) Starting with the base cuboid [date, spectator, location, game], what specific OLAP
operations should be performed to list the total charge paid by student spectators at GM
Place in 2010?
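One common way to read part (b) is as a sequence of roll-ups followed by a selection: date to year, spectator to status, location to location name, and game to all, then picking the cell for students at GM Place in 2010. The sketch below imitates the last two steps on a handful of hypothetical fact rows. The rows, the game labels, "Other Arena", and the function names are all invented for illustration and are not taken from the assignment.

```python
from collections import defaultdict

# HYPOTHETICAL fact rows, assumed to be already rolled up to year, spectator
# status, and location name. Each row:
# (year, spectator_status, location_name, game, count, charge)
facts = [
    (2010, "student", "GM Place", "game A", 1, 10.0),
    (2010, "student", "GM Place", "game B", 1, 12.0),
    (2010, "adult", "GM Place", "game A", 1, 25.0),
    (2009, "student", "GM Place", "game A", 1, 9.0),
    (2010, "student", "Other Arena", "game A", 1, 11.0),
]


def roll_up_game_to_all(rows):
    """Roll up: aggregate away the game dimension, summing both measures."""
    totals = defaultdict(lambda: [0, 0.0])
    for year, status, location, _game, count, charge in rows:
        cell = totals[(year, status, location)]
        cell[0] += count
        cell[1] += charge
    return totals


def select_cell(cuboid, year, status, location):
    """Slice/dice: pick the single cell matching the three fixed values."""
    return cuboid.get((year, status, location), [0, 0.0])


cuboid = roll_up_game_to_all(facts)
count, total_charge = select_cell(cuboid, 2010, "student", "GM Place")
print(f"count = {count}, total charge = {total_charge}")
```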
Question Four
Select a company or organization of your preference and examine its official
website to discern the type of privacy policy in place, particularly concerning the
protection of employee data. Evaluate the information provided on the website
to identify the specific measures and guidelines implemented to ensure the
privacy and security of employee information.
