Exploring the Hadoop Environment

Unit 9 Assignment: Exploring the Hadoop Environment

Outcomes addressed in this activity:

Unit Outcomes:

Migrate structured data from a MySQL database into the Apache Hadoop Distributed File System (HDFS).
Perform data analysis using Apache Hive.

Course Outcome:

IT350-6: Explore non-relational database alternatives.

Purpose

Structured data is data in a standardized format: it has a well-defined structure, conforms to a data model, follows a persistent order, and is easily accessed by both humans and programs. It consists of clearly defined data types with patterns that make it easily searchable, and it is generally stored in a relational database.
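As a small illustration of what "clearly defined data types" and "easily searchable" mean in practice, the sketch below builds a tiny relational table and queries it. It is only a sketch under stated assumptions: Python's built-in SQLite module stands in for a relational database such as MySQL, and the `employees` table and its rows are hypothetical, not part of the lab.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a relational database such as MySQL.
conn = sqlite3.connect(":memory:")

# Hypothetical table: every column has a declared name and data type.
conn.execute(
    """
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department TEXT NOT NULL,
        salary REAL NOT NULL
    )
    """
)

rows = [
    (1, "Ada", "Engineering", 95000.0),
    (2, "Grace", "Engineering", 105000.0),
    (3, "Linus", "Support", 60000.0),
]
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", rows)

# Because the structure is known in advance, a search is a short declarative query.
engineers = conn.execute(
    "SELECT name FROM employees WHERE department = ? ORDER BY id",
    ("Engineering",),
).fetchall()
print(engineers)  # [('Ada',), ('Grace',)]
```

The query works without any parsing logic because the schema already says which column holds the department and what type each value has.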

You will migrate structured data from an existing MySQL relational database to the Apache Hadoop Distributed File System (HDFS). You will perform basic data analysis by querying the migrated data in Apache Hive. Apache Hive is a data warehouse software project built on top of Apache Hadoop that provides data query and analysis functionality.

Assignment Instructions

Navigate to the Academic Tools area of this course and select Library, then Required Readings, to review the Unit 9 videos on Hadoop. It is very important that you watch the Unit 9 videos before completing the assignment.
The assignment work will be performed within Codio’s cloud-based learning environment. Navigate to this course’s main menu and select Codio to access this platform.

Your course instructor will provide you with the Codio connection details for accessing the specific online lab environment. The lab environment consists of a Linux virtual machine that has MySQL, Apache Hadoop, and Apache Hive. The work will be performed using a command line interface (CLI) within a Linux Terminal window.
Complete Lab Exercise 1 contained in the following lab document:

IT350 Codio Big Data Labs

In a Microsoft Word document, describe your experience of completing this lab exercise in 250–300 words.
In addition to the Word document, you are required to submit the screen.log file and two comma-separated values (CSV) files. Details on the screen.log and CSV files are contained in the lab document. The submitted screen.log and CSV files verify that the lab work was completed.
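
For orientation only, the overall flow described under Purpose (relational rows exported to flat files, then queried) can be sketched in Python. This is not the lab procedure; follow the lab document for the actual tools and commands. The sketch assumes an in-memory SQLite database as a stand-in for MySQL, an in-memory text buffer as a stand-in for a file in HDFS, and a hypothetical `orders` table.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Source side: in-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 20.0), (3, "east", 30.0)],
)

# Migration step: dump the table to CSV text, the flat form a file store holds.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
cursor = conn.execute("SELECT id, region, amount FROM orders ORDER BY id")
writer.writerow([column[0] for column in cursor.description])  # header row
writer.writerows(cursor)
csv_text = buffer.getvalue()

# Analysis step: group and sum straight from the exported rows, roughly what
# SELECT region, SUM(amount) FROM orders GROUP BY region would compute.
totals = defaultdict(float)
for record in csv.DictReader(io.StringIO(csv_text)):
    totals[record["region"]] += float(record["amount"])

print(csv_text)
print(dict(totals))  # {'east': 40.0, 'west': 20.0}
```

The point of the sketch is that the exported file keeps the column structure of the source table, which is what allows a SQL-like engine to query it after the migration.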