Big Data and Data Analytics

Overview
This course provides an introduction to the science and technologies behind large-scale data analytics, including statistical inference, machine learning, data mining, data visualization and distributed computing techniques, and shows how they work together to generate actionable knowledge for the data analyst.

The course introduces the most successful techniques from these fields and exposes participants to leading commercial and open-source tools and solutions for data analytics, along with the theoretical underpinnings of each technology.

Objectives

1. To introduce students to the science and technologies that drive Big Data and Data Analytics in particular.

2. To enable students to think critically about techniques and architectures in Data Analytics and Business Intelligence, to understand the advantages and shortcomings of any proposed solution, and to perform informed evaluations.

3. To give students a thorough understanding of how the field connects with Databases, Data Warehousing, Statistics, Machine Learning and Data Mining, Optimization, and Distributed Computing (“Big CPU, Big Data”).

4. To introduce students to the mathematics behind each technique taught, so that they understand why a technique works, when it does.

Who should attend
  • Middle Managers who wish to understand opportunities associated with Big Data technologies
  • Researchers and Young Entrepreneurs interested in building high-tech Big Data products and services
  • ICT Developers and Solution Providers wishing to get started with Big Data applications


Topics

1. Review of Relational Database/Warehouse Technologies
a. Characterizations of Big-Datasets (four Vs)
b. Data Cubes
2. Review of Parallel/Distributed Computing Infrastructures: From multi-threading to Map/Reduce to Apache Spark
a. Shared-memory parallel processing & multi-threading
b. The Java Memory Model
c. Distributed Computing: From PVM and MPI to Apache Hadoop to Apache Spark
d. Contemporary Client/Server Models: Web Services
3. Knowledge Discovery in Databases: Unsupervised Methods
a. Clustering
b. Mining Frequent Item-sets and Association Rules in Big Datasets
c. Mining Quantitative Association Rules in Big Datasets
d. Mining Frequent Episodes in Large-Scale Time-Series Data
e. Recommender Systems: Collaborative Filtering
4. Knowledge Discovery in Databases: Supervised Methods
a. Introduction to Classical Statistical Inference and Naïve Bayesian Classification
b. Performance Metrics: Precision, Recall, etc.
c. Performance Evaluation Methodologies: leave-one-out, k-fold cross-validation, etc.
d. Decision Trees and Regression
e. Support Vector Machines and Neural Networks
f. Implementation Issues in Distributed Computing Environments
5. Data Visualization
a. From pie-charts to pivot tables and cubes
b. Programming Libraries: From Excel to JFreeChart
6. Tools of the Trade
a. Oracle BI Suite
b. Weka/Knime
7. Advanced Topics
a. Matrix Factorization and its Connections to Clustering & Recommender Systems
b. Support Vector Clustering

Dates & Schedule

6-9 June, 2016

Instructors

Ioannis T. Christou

 




Affiliated with Aalborg University-CTiF, Harvard-Kennedy School Of Government © ATHENS INFORMATION TECHNOLOGY designed by {Linakis+Associates}