Buy a Vineyard Assignment (Part 1)
Introduction
As Mr Mahi’s wealth grows, he has taken an interest in the finer things in life. Wine is one of
them. He enjoys drinking wine so much that he has decided to buy a vineyard. He is currently
looking at different vineyards that are for sale and will decide which one to purchase.
This assignment will have 2 parts. One is due this week and one is due next week.
To decide which vineyard to purchase we need to know the price, size and quality of the wine
that it produces. The size and price are easy to figure out. Quality is another story. Figuring out
the quality of wine is not easy.
Here is what we are going to do. We have a wine data set. In the data set we know 11 attributes
about almost 5,000 wines. These are easy to measure, like pH level. There is a simple test for
that. The people who made the data set hired experienced wine judges to judge each wine. Their
quality score (from 0 to 10) is in column 12.
1. Fixed acidity
2. Volatile acidity
3. Citric acid
4. Residual sugar
5. Chlorides
6. Free sulfur dioxide
7. Total sulfur dioxide
8. Density
9. pH
10. Sulphates
11. Alcohol
12. Quality – this is the value that we are trying to predict.
Before we try to create a wine classification solution, we see a problem with the data. The
values have very different ranges and scales. For example, Chlorides is normally less than 0.05,
while Total sulfur dioxide is normally over 100.
You and Mai decided that you need some pre-processing of the data. For each column you
would like the minimum value to scale to 0 and the maximum value to scale to 1. In other words,
each value x becomes (x - min) / (max - min). For example, let’s say that the values in a column
of data are [0, 2, 3, 4, 5, 10]. 0 is the smallest, so it stays 0. 10 is the largest, so it becomes 1.
Your scaled data would be [0, .2, .3, .4, .5, 1.0].
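As a quick check of the example above, here is a minimal sketch that runs the same six values through scikit-learn's MinMaxScaler. The variable names are only illustrative; the reshape is needed because scikit-learn scalers expect a 2-D array with one column per attribute.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# The example column from the text, shaped as (n_samples, 1).
column = np.array([0, 2, 3, 4, 5, 10], dtype=float).reshape(-1, 1)

scaler = MinMaxScaler()  # default feature_range is (0, 1)
scaled = scaler.fit_transform(column)

# Flatten back to one dimension for printing.
print(scaled.ravel())
```

The printed values should match the hand-worked list [0, .2, .3, .4, .5, 1.0], up to floating-point rounding.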
Instructions
Create a Python program to load the wine quality data file. Scale every column so that its
values fall between 0 and 1. Then print a few lines of the original data, followed by the same
lines of the scaled data. You must use Scikit Learn; you cannot just hand-write a bunch of Python
code to do it.
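One possible shape for such a program is sketched below. It assumes that winedata.csv is comma-separated and has a header row of column names, which the assignment does not state, and it uses pandas for loading, which is also a choice made here for illustration. The function names scale_dataframe and print_original_and_scaled are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def scale_dataframe(data):
    """Return a copy of `data` with every column min-max scaled to [0, 1]."""
    scaler = MinMaxScaler()  # default feature_range is (0, 1)
    scaled_values = scaler.fit_transform(data)
    # Wrap the NumPy result so the column names survive for printing.
    return pd.DataFrame(scaled_values, columns=data.columns, index=data.index)


def print_original_and_scaled(csv_path, n_rows=5):
    """Load the CSV, then print the first rows before and after scaling."""
    # Assumption: comma-separated file with a header row of column names.
    original = pd.read_csv(csv_path)
    scaled = scale_dataframe(original)
    print("Original data:")
    print(original.head(n_rows))
    print("\nScaled data:")
    print(scaled.head(n_rows))
```

Calling print_original_and_scaled("winedata.csv") would then print the first five rows twice, once raw and once scaled. If the file turns out to use a different separator, pandas.read_csv accepts a sep argument.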
Resources
Your resources are Python Scikit Learn and the data file winedata.csv.
Submit
For this assignment just submit 1 .py Python file.
Rubric
• On-time 10%
• Code correctly uses Scikit Learn to create the program. 40%
• Program produces the correct output. 50%
Buy a Vineyard Assignment (Part 2)
Introduction
In the previous week you did preprocessing on the data. Now you need to finish the assignment
using the preprocessed data from the previous week. Basically, you’ll just add on to the Python
program that you created last week.
Mr Mahi has identified 3 vineyards that are for sale. They are listed below. All 3 are basically
the same price ($1 million). Each vineyard produces only 1 wine, and we have decided to buy
the vineyard with the highest score from the equation # acres * quality of wine (equ. 1). For
example, if a vineyard has 80 acres with a wine quality of 6, that would give a score of 480.
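Equation 1 is simple enough to write as a one-line helper. The sketch below reproduces the 80-acre example; the function name vineyard_score is hypothetical.

```python
def vineyard_score(acres, quality):
    """Equation 1: number of acres multiplied by the wine quality score."""
    return acres * quality


print(vineyard_score(80, 6))  # 480, matching the example in the text
```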
Happy Valley Vineyard: 100 acres
Hilltop Circle Vineyard: 75 acres
Old Line Vineyard: 100 acres
Mr Mahi purchased a bottle of wine from each vineyard and had someone run the different tests to
determine the values for the attributes. Here is what they found.
Attribute               Happy Valley Vineyard   Hilltop Circle Vineyard   Old Line Vineyard
Fixed acidity           7.9                     6.6                       8.1
Volatile acidity        0.18                    0.22                      0.35
Citric acid             0.04                    0.15                      0.14
Residual sugar          19.5                    4.2                       1.5
Chlorides               0.044                   0.35                      0.37
Free sulfur dioxide     47                      35                        45
Total sulfur dioxide    97                      138                       132
Density                 0.9938                  0.8750                    0.8850
pH                      3.14                    3.05                      3.75
Sulphates               0.42                    0.63                      0.55
Alcohol                 10.1                    8.1                       9.5
Your next task is to create a K Nearest Neighbor (KNN) algorithm (which you already did in the
videos from week 2) with K set to 5 and determine the predicted quality of each wine. Then, using
equation 1 (above), you can produce a score for each vineyard and let Mr Mahi know which
vineyard to buy.
Instructions
Extend the Python program from last week. It must build a KNN algorithm with K set to 5.
Have it predict the quality (1 to 10) of each of the 3 bottles of wine. Then create a report using
the report template in week 3. Please remember that this report needs to be read and understood
by business people. Make it clear which vineyard to buy.
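The steps above can be sketched roughly as follows. This is one possible approach under stated assumptions, not a reference solution: it treats the quality score as a class label and uses KNeighborsClassifier (a KNeighborsRegressor would be an alternative), it assumes the target column is named "quality", and it scales only the 11 attribute columns so that predictions stay on the original quality scale. The function names are hypothetical.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler


def predict_quality(wine_data, bottles, target="quality", k=5):
    """Predict a quality score for each row of `bottles`.

    wine_data: DataFrame with the attribute columns plus the `target` column.
    bottles:   DataFrame with the same attribute columns, one row per bottle.
    """
    features = wine_data.drop(columns=[target])
    labels = wine_data[target]

    # Fit the scaler on the data set, then reuse it for the new bottles so
    # both are scaled with the same per-column minimum and maximum.
    scaler = MinMaxScaler()
    scaled_features = scaler.fit_transform(features)
    scaled_bottles = scaler.transform(bottles[features.columns])

    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(scaled_features, labels)
    predictions = knn.predict(scaled_bottles)
    return pd.Series(predictions, index=bottles.index, name="predicted_quality")


def score_vineyards(predicted_quality, acres):
    """Apply equation 1 (acres * quality) and sort with the best score first."""
    scores = predicted_quality * pd.Series(acres)
    return scores.sort_values(ascending=False)
```

To use it, wine_data would come from winedata.csv, bottles would be a three-row DataFrame built from the attribute table above and indexed by vineyard name, and acres would be a dictionary mapping those same names to 100, 75 and 100. Note that any bottle value outside the range seen in the data set will scale to a number below 0 or above 1, because MinMaxScaler does not clip by default.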
Resources
Your resources are Python Scikit Learn and the data file winedata.csv.
Submit
For this assignment submit:
1. .py Python file.
2. A copy of the report
Rubric
• On-time 10%
• Well-written (grammar, spelling, etc) 10%
• Report can be understood by business person 30%
• Code correctly uses Scikit Learn to create a KNN. 25%
• Program produces the correct output. 15%