KNIME Software Pieces 2017 KNIME.com AG. All Rights Reserved. 1
A Peek into KNIME Big Data Labs
  The Big Data Team, KNIME
KNIME Big Data Connectors
  Packages the required drivers/libraries for HDFS, Hive, and Impala access
  Runs on Hadoop
  Preconfigured connectors
    Hive
    Cloudera Impala
    (secured) HDFS, webhdfs, httpfs
  Support for Kerberos-secured clusters
  Extends the open source database and remote file handling integration
KNIME Spark Executor
  Based on Spark MLlib
    Scalable machine learning library
    Runs on Hadoop
    Algorithms for
      Classification (decision tree, naïve Bayes, ...)
      Regression (logistic regression, linear regression, ...)
      Clustering (k-means)
      Collaborative filtering (ALS)
      Dimensionality reduction (SVD, PCA)
  Supports Spark versions 1.2, 1.3, 1.5 and 1.6
  Support for Kerberos-secured clusters
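
To make the "Clustering (k-means)" entry concrete, here is a minimal single-machine sketch of the k-means idea in plain Python. It is not MLlib's implementation and not what the KNIME Spark nodes execute; MLlib distributes the same two steps across a cluster. The toy delay values and the choice to seed the centroids with the first k points are assumptions made for illustration only.

```python
from math import dist  # available in Python 3.8+


def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


# Hypothetical toy data: (departure delay, arrival delay) in minutes.
delays = [(0.0, 2.0), (60.0, 65.0), (2.0, 0.0),
          (62.0, 63.0), (1.0, 1.0), (58.0, 61.0)]
centers = sorted(kmeans(delays, k=2))
print(centers)
```

On this toy input the two centroids settle on an "on time" group and a "roughly one hour late" group.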
The Question
  Wouldn't it be great to know if your flight will be delayed?
  Image: https://pixabay.com/en/staircase-airport-modern-technology-1149599/
The Answer
  Of course, so let's learn a model that does!
  Image: https://pixabay.com/en/banner-yes-no-decision-choice-1183407/
The Airport
  Chicago O'Hare International Airport
  Image: https://commons.wikimedia.org/wiki/file:o%27hare_with_aa_plane.jpg (Photo: Ad Meskens)
The Airport
  Flughafen Berlin Brandenburg
  Image: https://de.wikipedia.org/wiki/datei:bbi_2010-07-23_5.jpg
The Data
  Historical Flight Data
  Airport and City Information
  Geo Coordinates
  Airplane Data
  Radar Images
  Textual Weather Reports
  Image: https://commons.wikimedia.org/wiki/file:world-airline-routemap-2009.png
The Challenges
  What's new
    Many Data Sources and Formats
    Analyze the Data
  What's cooking
    Computing Constraints
    Large Unstructured Data
New Spark Reader and Writer Nodes
  Read and write various data formats from scalable storage, e.g. HDFS
  Data preview in the node dialog
Virtual Data Warehouse
Different Data Sources and Formats
The Challenges
  What's new
    Many Data Sources and Formats
    Analyze the Data
  What's cooking
    Computing Constraints
    Large Unstructured Data
  Image: https://pixabay.com/en/ball-binary-magnifying-glass-hand-958950/
Model Learning on Spark
Do you speak SQL?
  Spark SQL Query node with syntax highlighting and query completion
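
As a rough illustration of the kind of ad-hoc query one might type into such a node, the sketch below runs an aggregation over a tiny flight table. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; Spark SQL has its own dialect and engine. The table name `flights`, its columns, and the rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for a table registered in Spark.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (origin TEXT, dest TEXT, dep_delay_min INTEGER)"
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [
        ("ORD", "JFK", 35),
        ("ORD", "SFO", 5),
        ("ORD", "JFK", 50),
        ("BER", "ORD", 0),
        ("BER", "MUC", 20),
    ],
)

# Ad-hoc question: which origin airport has the worst average departure delay?
query = """
    SELECT origin,
           COUNT(*)           AS flights,
           AVG(dep_delay_min) AS avg_delay_min
    FROM flights
    GROUP BY origin
    ORDER BY avg_delay_min DESC
"""
rows = conn.execute(query).fetchall()
for origin, flights, avg_delay in rows:
    print(origin, flights, avg_delay)
conn.close()
```

The same SELECT / GROUP BY / ORDER BY shape is standard SQL, so a query like this should carry over to Spark SQL with little or no change, although that is not verified here.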
Ad-hoc Analysis on Spark
Ad-hoc Analysis and Model Learning on Spark
The Challenges
  What's new
    Many Data Sources and Formats
    Analyze the Data
  What's cooking
    Computing Constraints
    Large Unstructured Data
  Image: https://www.flickr.com/photos/76657755@n04/7027596629
Moving to the Cloud
  Amazon EMR Cluster support
  Microsoft Azure HDInsight support
  Support for Cloud Connectors
Automatic Cluster Management
  Run the cluster only when it is needed
  Image: https://commons.wikimedia.org/wiki/file:stopwatch_a.jpg
Automatic Cluster Management
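
The "run the cluster only when it is needed" idea can be sketched as a scoped resource: start the cluster right before the work and shut it down right after, even if the work fails. The sketch below is a generic Python pattern, not KNIME's mechanism and not a cloud provider API; `FakeCluster` and `cluster_session` are hypothetical names, and the fake cluster only records what would happen.

```python
from contextlib import contextmanager


class FakeCluster:
    """Hypothetical stand-in for a cloud cluster; it only records state changes."""

    def __init__(self):
        self.running = False
        self.events = []

    def start(self):
        self.running = True
        self.events.append("start")

    def stop(self):
        self.running = False
        self.events.append("stop")


@contextmanager
def cluster_session(cluster):
    """Start the cluster on entry and always stop it on exit, even on errors."""
    cluster.start()
    try:
        yield cluster
    finally:
        cluster.stop()


cluster = FakeCluster()
with cluster_session(cluster) as c:
    # The cluster is up only inside this block.
    c.events.append("run job")

print(cluster.events, cluster.running)
```

Tying the cluster's lifetime to the block that needs it means no separate cleanup step can be forgotten, which is the point of paying for the cluster only while it is in use.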
The Challenges
  What's new
    Many Data Sources and Formats
    Analyze the Data
  What's cooking
    Computing Constraints
    Large Unstructured Data
  Image: https://pixabay.com/en/files-paper-office-paperwork-stack-1614223/
Image Data
Image Data -> KNIME Image Processing
Textual Data
  Image: https://pixabay.com/en/emotions-man-happy-sad-face-adult-371238/
Textual Data -> KNIME Text Processing
Chemical Data
Chemical Data -> Chemistry Extensions
More than 1500 native KNIME Nodes
What if you could use all KNIME nodes on Spark?
You Could Analyse Radar Data!
KNIME Image Processing on Spark
You Could Do Sentiment Analysis!
  Image: https://pixabay.com/en/emotions-man-happy-sad-face-adult-371238/
KNIME Text Processing on Spark
You Could Mine Chemical Structures!
Go to Greg's Talk Tomorrow
Behind the Scenes
  Execute KNIME workflow on Spark
  [Diagram: KNIME Analytics Platform and KNIME Server hand a KNIME workflow to the cluster. Each cluster worker node hosts a Spark executor JVM that runs a workflow replica (OSGi) on its RDD partitions, turning partitions of the input RDD into partitions of the output RDD.]
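
The partition-wise execution named on this slide (input RDD partitions, one workflow replica per executor, output RDD partitions) can be mimicked on a single machine. The sketch below is plain Python and runs sequentially; in Spark the analogous operation would be something like `mapPartitions`, and how KNIME actually ships and runs the workflow inside the executor is not shown. `split_into_partitions`, `workflow_replica`, the 15-minute threshold, and the sample rows are all hypothetical.

```python
def split_into_partitions(rows, num_partitions):
    """Deal rows round-robin into partitions, mimicking a partitioned data set."""
    return [rows[i::num_partitions] for i in range(num_partitions)]


def workflow_replica(partition):
    """Hypothetical 'workflow': flag flights delayed by more than 15 minutes."""
    return [(flight, delay, delay > 15) for flight, delay in partition]


# Input data set: (flight id, delay in minutes).
input_rows = [("AA1", 5), ("AA2", 40), ("LH3", 0), ("LH4", 16)]
input_partitions = split_into_partitions(input_rows, 2)

# Each partition is processed independently by its own replica of the same
# logic; on a cluster these calls would run in parallel on different workers.
output_partitions = [workflow_replica(p) for p in input_partitions]
print(output_partitions)
```

Because each replica sees only its own partition, this style fits row-wise or partition-local logic; anything that needs a global view of the data would require an extra aggregation step.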
What about the cluster?
Let KNIME handle it!
The KNIME trademark and logo and OPEN FOR INNOVATION trademark are used by KNIME.com AG under license from KNIME GmbH, and are registered in the United States. KNIME is also registered in Germany.