HDP Analyst: Data Science
New
3-day virtual classroom

This course introduces you to the processes involved in data science on Hadoop, including machine learning and natural language processing. You will work with a range of programming languages and tools, including Python, IPython, Mahout, Pig, NumPy, pandas, SciPy and scikit-learn.

Virtual course

This virtual course takes place live on your own computer via GoToMeeting with an English-speaking instructor. During the course you can ask questions, take part in discussions, view the whiteboard on your screen and complete lab exercises.

Get introduced to data science

This course provides instruction on the processes and practice of data science, including machine learning and natural language processing. It covers tools and programming languages (Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn), the Natural Language Toolkit (NLTK), and Spark MLlib.

Prerequisites

Please note:

Hortonworks courses are delivered using electronic courseware. Delegates attending remotely (Virtual classes or Attend from Anywhere) must have dual monitors or a single monitor plus a tablet device, so that labs and lab instructions can be viewed on separate screens.

Technical prerequisites

Students must have experience with at least one programming or scripting language, knowledge of statistics and/or mathematics, and a basic understanding of big data and Hadoop principles. Students new to Hadoop are encouraged to attend the HDP Overview: Apache Hadoop Essentials course.

Target Audience

Architects, software developers, analysts and data scientists who need to apply data science and machine learning on Hadoop.

Course overview

Delegates will learn how to:

  • Recognize use cases for data science on Hadoop
  • Describe the Hadoop and YARN architecture
  • Describe the differences between supervised and unsupervised learning
  • Use Mahout to run a machine learning algorithm on Hadoop
  • Describe the data science life cycle
  • Use Pig to transform and prepare data on Hadoop
  • Write a Python script
  • Describe options for running Python code on a Hadoop cluster
  • Write a Pig User-Defined Function in Python
  • Use Pig streaming on Hadoop with a Python script
  • Use machine learning algorithms
  • Describe use cases for Natural Language Processing (NLP)
  • Use the Natural Language Toolkit (NLTK)
  • Describe the components of a Spark application
  • Write a Spark application in Python
  • Run machine learning algorithms using Spark MLlib
  • Take data science into production
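
As a rough illustration of the Pig streaming objective above, the sketch below shows the general shape of a Python script that Pig could stream records through. It is not taken from the course materials; the record layout (a name followed by a count) and the function name `transform_line` are assumptions made for this example.

```python
import sys


def transform_line(line):
    """Upper-case the first field of a tab-delimited record.

    The field layout (name, count) is hypothetical, chosen for illustration.
    """
    fields = line.rstrip("\n").split("\t")
    fields[0] = fields[0].upper()
    return "\t".join(fields)


def main(stream_in=sys.stdin, stream_out=sys.stdout):
    # Pig's STREAM operator writes one tab-delimited tuple per line to the
    # script's stdin and reads tab-delimited tuples back from its stdout.
    for line in stream_in:
        if line.strip():  # skip blank lines
            stream_out.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

In a Pig script, a file like this (hypothetically saved as upper.py) would typically be declared with DEFINE and SHIP and then applied to a relation with STREAM ... THROUGH.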

Hands-On Labs:

  • Lab: Setting Up a Development Environment
  • Demo: Block Storage
  • Lab: Using HDFS Commands
  • Demo: MapReduce
  • Lab: Using Apache Mahout for Machine Learning
  • Demo: Apache Pig
  • Lab: Getting Started with Apache Pig
  • Lab: Exploring Data with Pig
  • Lab: Using the IPython Notebook
  • Demo: The NumPy Package
  • Demo: The pandas Library
  • Lab: Data Analysis with Python
  • Lab: Interpolating Data Points
  • Lab: Defining a Pig UDF in Python
  • Lab: Streaming Python with Pig
  • Demo: Classification with Scikit-Learn
  • Lab: Computing K-Nearest Neighbor
  • Lab: Generating a K-Means Clustering
  • Lab: POS Tagging Using a Decision Tree
  • Lab: Using NLTK for Natural Language Processing
  • Lab: Classifying Text using Naive Bayes
  • Lab: Using Spark Transformations and Actions
  • Lab: Using Spark MLlib
  • Lab: Creating a Spam Classifier with MLlib
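
To give a feel for the kind of algorithm named in the k-nearest neighbor lab, here is a minimal sketch in plain Python. It is not the lab's own code; the function name `knn_predict` and the toy data points and labels are assumptions made for illustration, and a real exercise would more likely use NumPy or scikit-learn.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

With k=3 and two classes the vote cannot tie, which keeps the sketch simple; other settings would need a tie-breaking rule.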

Read more about our virtual courses and find answers to your questions (FAQ).

Looking for a different virtual course?

We offer virtual courses in many different fields. Contact us by phone at 72203000 or by email at kurser@teknologisk.dk so we can help meet your needs.
