Description
Clustering or PCA Project Implementation: Please implement any one algorithm and submit the deliverables as specified in the project rubric.
Please submit: the .ipynb file, an HTML or PDF file, and the Excel file used for the project.
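As a rough illustration of what implementing one of the two techniques might look like, the sketch below runs both PCA and k-means on synthetic data (the project itself needs only one of them). It assumes scikit-learn is installed; the dataset, the choice of 2 components, and the choice of 3 clusters are all made up for the example and are not part of the assignment.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the project data (assumption: 300 rows, 5 numeric features).
X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=0)

# Both techniques are scale-sensitive, so standardize the features first.
X_scaled = StandardScaler().fit_transform(X)

# Option A: PCA, projecting the data onto its first two principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()
print(f"Variance explained by 2 components: {explained:.2%}")

# Option B: k-means clustering with an assumed k of 3.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Silhouette score (range -1 to 1) is one common way to judge cluster separation.
silhouette = silhouette_score(X_scaled, labels)
print(f"Silhouette score for k=3: {silhouette:.3f}")
```

With real data, the number of components or clusters would normally be chosen by comparing several values rather than fixed in advance.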
1. Concise overview of background information related to the data has been stated.
2. Research objectives have been clearly stated in context of the scenario.
3. Research questions are aligned with the overall objective of the project.
1. Generate summaries in context of the research questions to extract patterns.
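A minimal sketch of such summaries, assuming pandas is available. The column names (`segment`, `spend`, `visits`) and the values are hypothetical placeholders for whatever variables the research questions involve.

```python
import pandas as pd

# Hypothetical toy data; replace with the project's own dataset.
df = pd.DataFrame({
    "segment": ["A", "A", "B", "B", "B"],
    "spend": [10.0, 20.0, 30.0, 40.0, 50.0],
    "visits": [1, 2, 3, 4, 5],
})

# Overall descriptive statistics for the numeric columns.
overall = df[["spend", "visits"]].describe()
print(overall)

# Summary per group, useful when a research question compares groups.
by_segment = df.groupby("segment")["spend"].agg(["count", "mean", "median"])
print(by_segment)

# Pearson correlation between two numeric variables.
correlation = df["spend"].corr(df["visits"])
print(f"Correlation between spend and visits: {correlation:.2f}")
```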
1. Able to implement the assigned ML algorithmic technique.
2. Conducted correlation test, t-test, chi-square test, and ANOVA test, if required.
3. Conducted the stated tests by implementing accurate and cogent Hypothesis framework.
4. Articulated accurate interpretation of p-value. Conclusions are valid and warranted.
5. Implications of test conclusion with regards to the scenario are clearly stated.
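The hypothesis framework described in the items above could be laid out as in the following sketch, which uses Welch's two-sample t-test from SciPy as one example of the listed tests. The two groups are synthetic, and the significance level of 0.05 is an assumed convention, not something the rubric prescribes.

```python
import numpy as np
from scipy import stats

# Synthetic samples standing in for a numeric variable measured in two groups.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=50.0, scale=5.0, size=40)
group_b = rng.normal(loc=55.0, scale=5.0, size=40)

# Hypothesis framework:
#   H0: the two group means are equal.
#   H1: the two group means differ (two-sided).
alpha = 0.05  # assumed significance level

# Welch's t-test does not assume equal variances in the two groups.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

# The p-value is the probability of a difference at least this extreme
# if H0 were true; it is not the probability that H0 is true.
reject_null = p_value < alpha
decision = "reject H0" if reject_null else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, decision at alpha={alpha}: {decision}")
```

The final step, which code cannot do, is stating what the decision implies for the scenario being studied.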
1. Conducted evaluation of the model performance using prediction accuracy, ROC scores etc.
2. Changed, or attempted to change, different parameters to improve performance.
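One way to show both items, evaluation and parameter changes, is sketched below. It assumes a supervised classification setting, since accuracy and ROC scores need true labels. Logistic regression and the grid of `C` values are illustrative choices only, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption for the example).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Try several values of the regularization parameter C and record both metrics.
results = {}
for C in (0.01, 0.1, 1.0, 10.0):
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    # ROC AUC is computed from predicted probabilities of the positive class.
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    results[C] = (float(accuracy), float(auc))
    print(f"C={C:<5} accuracy={accuracy:.3f} ROC AUC={auc:.3f}")
```

Comparing settings on the test set, as done here for brevity, risks overfitting to that set; a validation split or cross-validation is the safer place to tune.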
1. The scenario addressed and analyzed by the algorithm is contextually appropriate.
2. The code is self-explanatory, with comments within its structure.
1. Submitted a write-up file, a Python code file, and a PDF, HTML, or Word file on Canvas.
2. Additionally submitted the .csv or Excel data file.
1. Adequate, holistic, and insightful conclusion of the project findings.
2. Short discussion of the shortcomings of the data and findings.
3. Optionally, comments on previously conducted research that is similar in terms of scope and/or data used.
4. Reflection on a planned future pathway for conducting further exploratory and investigative research.
Define the Research problem and questions.
Analyze data by descriptive statistics and graphical visualization.
Prepare data by using relevant preprocessing transformations: data cleaning, data standardization, and dealing with null and outlier values. Divide the data into training and test sets.
Fit the model on the training data.
Predict on the test data.
Evaluate the first algorithm and its model performance.
Evaluate the current algorithm and a variety of other algorithms by creating a test harness for diverse algorithms, in conjunction with techniques such as cross-validation, bootstrapping, and variable importance. Improve results by tuning hyperparameters and using methods such as ensembles.
Choose the best model and present the results.
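The steps above can be sketched end to end as follows. This is only one possible arrangement, assuming scikit-learn and a classification target. The synthetic data, the injected missing values, the three candidate algorithms, and the small hyperparameter grid are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GridSearchCV, StratifiedKFold, cross_val_score, train_test_split,
)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with about 2% of values blanked out to imitate nulls.
X, y = make_classification(n_samples=400, n_features=8, n_informative=4, random_state=42)
rng = np.random.default_rng(42)
X[rng.random(X.shape) < 0.02] = np.nan

# Divide the data into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

def make_model(estimator):
    # Imputation and standardization live inside the pipeline, so they are
    # fitted only on the training folds and never see held-out data.
    return make_pipeline(SimpleImputer(strategy="median"), StandardScaler(), estimator)

# Test harness: compare several algorithms with the same cross-validation folds.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(random_state=42),  # an ensemble method
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_scores = {
    name: float(cross_val_score(make_model(est), X_train, y_train, cv=cv).mean())
    for name, est in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)
print("Mean CV accuracy:", cv_scores, "-> best:", best_name)

# Hyperparameter tuning, shown here for the random forest only.
search = GridSearchCV(
    make_model(RandomForestClassifier(random_state=42)),
    param_grid={
        "randomforestclassifier__n_estimators": [50, 100],
        "randomforestclassifier__max_depth": [None, 5],
    },
    cv=cv,
)
search.fit(X_train, y_train)  # fit on the training data

# Predict on the test data and report the final held-out accuracy.
y_pred = search.predict(X_test)
test_accuracy = float((y_pred == y_test).mean())
print("Best parameters:", search.best_params_, "test accuracy:", round(test_accuracy, 3))
```

The test set is touched only once, at the very end, so the reported accuracy is not inflated by the model selection that preceded it.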