Sinfonia: a new paradigm for building scalable distributed systems

CS848 Paper Presentation
Sinfonia: a new paradigm for building scalable distributed systems
Aguilera, Merchant, Shah, Veitch, Karamanolis (SOSP 2007)
Presented by Somayyeh Zangooei
David R. Cheriton School of Computer Science, University of Waterloo
22 February 2010

Motivation
- Increasing need for scalable distributed systems and applications
  - Large data centers (1000s of servers)
  - Serve billions of users around the world
  - Sharing data
- Current solution: message passing
  - Complex protocols
  - Error prone
  - Hard to use

Outline
- Sinfonia Structure
- Minitransactions
- Design Choices
- Two Applications
- Evaluation
- Conclusion
- Questions & Discussions

Focus of Sinfonia
- Data center environment
  - Small and predictable network latencies
  - Trustworthy applications
  - Nodes may crash
- Target: infrastructure applications
  - Applications that support other applications
  - Examples: lock managers, cluster file systems, and group communication services
  - Need to provide reliability, consistency, and scalability

Sinfonia
[Figure: Sinfonia architecture. Four application nodes, each linked with the Sinfonia user library, issue minitransactions to three memory nodes.]

Minitransactions
- Minitransactions atomically update data at multiple memory nodes
- A minitransaction consists of:
  - a set of compare items (mem-id, addr, len, data)
  - a set of read items (mem-id, addr, len)
  - a set of write items (mem-id, addr, len, data)
- Semantics:
  - Check the data named by the compare items (equality comparison)
  - If all compare items match, apply the read and write items

Minitransactions (example)

API:

    class Minitransaction {
    public:
        void cmp(memid, addr, len, data);    // add a compare item
        void read(memid, addr, len, buf);    // add a read item
        void write(memid, addr, len, data);  // add a write item
        int exec_and_commit();               // execute and commit
    };

Example:

    t = new Minitransaction();
    t->cmp(2, 3, 1, 70);          // memnode 2, address 3 must equal 70
    t->write(1, 2, 1, 45);        // memnode 1, address 2 := 45
    t->write(3, 4, 2, 37, 848);   // memnode 3, addresses 4-5 := 37, 848
    status = t->exec_and_commit();

[Figure, animated over slides 8-14: contents of Memnode 1, Memnode 2, and Memnode 3. Memnode 2 holds 70 at address 3, so the compare succeeds. Memnode 1 address 2 then changes from 234 to 45, and Memnode 3 addresses 4-5 change from 123, 37 to 37, 848.]
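The semantics and the example above can be tried out in a small single-process model. This is only a sketch under stated assumptions: memory nodes are plain Python lists, there is no networking, locking, or logging, and the `len` argument is implied by the length of the data list. The class mirrors the method names on the slide but is not Sinfonia's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Minitransaction:
    """Toy single-process model of a minitransaction (no network, no locks)."""
    cmp_items: list = field(default_factory=list)    # (mem_id, addr, data)
    read_items: list = field(default_factory=list)   # (mem_id, addr, length)
    write_items: list = field(default_factory=list)  # (mem_id, addr, data)

    def cmp(self, mem_id, addr, data):
        self.cmp_items.append((mem_id, addr, list(data)))

    def read(self, mem_id, addr, length):
        self.read_items.append((mem_id, addr, length))

    def write(self, mem_id, addr, data):
        self.write_items.append((mem_id, addr, list(data)))

    def exec_and_commit(self, memnodes):
        """Return (committed, read_results); memnodes maps mem_id -> list."""
        # Compare phase: every compare item must match, else nothing is applied.
        for mem_id, addr, data in self.cmp_items:
            if memnodes[mem_id][addr:addr + len(data)] != data:
                return False, []
        # All compares matched: collect the read items, then apply the writes.
        results = [memnodes[m][a:a + n] for m, a, n in self.read_items]
        for mem_id, addr, data in self.write_items:
            memnodes[mem_id][addr:addr + len(data)] = data
        return True, results


# Memory contents taken from the slide where they can be read off; the
# remaining cells are filled with zeros as an assumption.
memnodes = {1: [0] * 6, 2: [0] * 6, 3: [0] * 6}
memnodes[1][2] = 234
memnodes[2][3] = 70
memnodes[3][4:6] = [123, 37]

t = Minitransaction()
t.cmp(2, 3, [70])
t.write(1, 2, [45])
t.write(3, 4, [37, 848])
committed, _ = t.exec_and_commit(memnodes)
```

If the compare item named a value other than 70, `exec_and_commit` would return `False` and leave all three lists untouched, which is the all-or-nothing behavior the slide describes.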

Minitransactions
Balance between:
- Functionality (power): powerful enough, general-purpose, easy to use
- Efficiency: can be executed and committed efficiently, with a small number of network round trips

Minitransaction Efficiency
[Figure: two message diagrams. Traditional transactions: the coordinator first executes the transaction at participants p1, p2, p3 and then runs two-phase commit. Sinfonia minitransactions: the application node piggybacks execution onto two-phase commit with memory nodes m1, m2, m3.]
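The piggybacking idea can be illustrated with a toy protocol in which the first message to each memory node both executes the items and collects the vote. Everything below is a hypothetical sketch: `MemNode`, `exec_and_prepare`, and `run_minitransaction` are illustrative names, calls stand in for network messages, and locking and logging are left out.

```python
class MemNode:
    """Toy memory node: a list of values plus staged writes awaiting commit."""

    def __init__(self, size):
        self.data = [0] * size
        self.staged = {}  # tid -> list of (addr, values)

    def exec_and_prepare(self, tid, cmps, reads, writes):
        """Phase 1: run compare and read items and vote, all in one message."""
        ok = all(self.data[a:a + len(v)] == v for a, v in cmps)
        if ok:
            self.staged[tid] = writes
        read_results = [self.data[a:a + n] for a, n in reads] if ok else []
        return ok, read_results

    def commit(self, tid, decision):
        """Phase 2: apply staged writes on commit, discard them on abort."""
        writes = self.staged.pop(tid, [])
        if decision:
            for addr, values in writes:
                self.data[addr:addr + len(values)] = values


def run_minitransaction(tid, nodes, items):
    """items maps node id -> (cmps, reads, writes).

    Returns (committed, round_trips), counting one round trip per phase.
    """
    round_trips = 0
    # Execution rides on the prepare message, so there is no separate
    # execute round before two-phase commit starts.
    votes = [nodes[n].exec_and_prepare(tid, *items[n])[0] for n in items]
    round_trips += 1
    decision = all(votes)
    for n in items:
        nodes[n].commit(tid, decision)
    round_trips += 1
    return decision, round_trips


nodes = {1: MemNode(6), 2: MemNode(6)}
nodes[2].data[3] = 70
items = {
    2: ([(3, [70])], [], []),   # compare item on memnode 2
    1: ([], [], [(2, [45])]),   # write item on memnode 1
}
committed, round_trips = run_minitransaction("t1", nodes, items)
```

A traditional transaction in the same model would need at least one extra round to execute its operations before the prepare round begins.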

Caching and Load Balancing
- Caching
  - Sinfonia does not cache data at application nodes
  - Caching is left to application nodes
- Load balancing
  - Sinfonia does not balance data across memory nodes
  - Load balancing is left to application nodes
  - Sinfonia provides per-memory-node load information

Fault Tolerance
- Mechanisms for fault tolerance:
  - Disk image
  - Logging
  - Replication
  - Backup
- Trade-off between fault tolerance and amount of resources

Sinfonia Modes
[Figure: Sinfonia modes]

Application: Cluster File System
- SinfoniaFS
  - Fault tolerant
  - Scalable
  - Exports NFS v2
- Each NFS function is a single minitransaction. For each function:
  - Validate cache
  - Modify data
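The "validate cache, then modify data" pattern can be sketched as one compare-and-write minitransaction. The layout below (a per-inode version number, a flat dictionary as the memory node, and the `set_mode` helper) is an assumption made for illustration and is not taken from SinfoniaFS itself.

```python
def exec_and_commit(memory, cmps, writes):
    """Toy minitransaction on one dict: apply writes only if all compares match."""
    if any(memory.get(addr) != expected for addr, expected in cmps):
        return False
    memory.update(writes)
    return True


def set_mode(memory, cache, inode, new_mode):
    """Hypothetical setattr: validate the cached inode version, then modify it."""
    while True:
        version = cache[inode]["version"]
        committed = exec_and_commit(
            memory,
            cmps=[((inode, "version"), version)],        # validate cache
            writes={(inode, "mode"): new_mode,           # modify data
                    (inode, "version"): version + 1},
        )
        if committed:
            cache[inode] = {"version": version + 1, "mode": new_mode}
            return
        # The cached copy was stale: refresh it from the memory node and retry.
        cache[inode] = {"version": memory[(inode, "version")],
                        "mode": memory[(inode, "mode")]}


# The application node holds a stale cached copy of inode 7.
memory = {(7, "version"): 3, (7, "mode"): 0o644}
cache = {7: {"version": 2, "mode": 0o600}}
set_mode(memory, cache, 7, 0o755)
```

Because validation and update travel in the same minitransaction, a stale cache costs one failed attempt and a refresh rather than a separate validation round before every operation.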

Application: Group Communication Service
- GCS: chat room
  - Join and leave
  - Broadcast messages
- SinfoniaGCS
  - Messages stored in memory nodes
  - Private queue for each member
  - Global list
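One way to picture the private queues and the global list is sketched below. The slide gives only the three bullet points, so the details here are assumptions: a broadcaster first stores the message in its own queue, then links it onto the global list with a minitransaction that compares the current tail pointer. The key names and the `broadcast` and `global_order` helpers are hypothetical.

```python
def exec_and_commit(memory, cmps, writes):
    """Toy minitransaction on one dict: apply writes only if all compares match."""
    if any(memory.get(addr) != expected for addr, expected in cmps):
        return False
    memory.update(writes)
    return True


def broadcast(memory, member, text):
    """Store a message privately, then thread it onto the global list."""
    # Step 1: append to the sender's private queue (no other member writes it).
    queue = memory.setdefault(("queue", member), [])
    msg_id = (member, len(queue))
    queue.append(text)
    # Step 2: link the message after the current tail of the global list.
    # The compare item makes this fail, and retry, if the tail has moved.
    while True:
        tail = memory.get("tail")
        writes = {"tail": msg_id, ("next", msg_id): None}
        if tail is not None:
            writes[("next", tail)] = msg_id
        if exec_and_commit(memory, cmps=[("tail", tail)], writes=writes):
            return msg_id


def global_order(memory, head):
    """Walk the global list from head and return the message texts in order."""
    order, cur = [], head
    while cur is not None:
        member, index = cur
        order.append(memory[("queue", member)][index])
        cur = memory[("next", cur)]
    return order


memory = {}
first = broadcast(memory, "alice", "hi")
broadcast(memory, "bob", "hello")
last = broadcast(memory, "alice", "bye")
messages = global_order(memory, first)
```

In this single-process sketch the retry loop never fires; with several application nodes broadcasting at once, the compare on the tail pointer is what would give every member the same message order.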

Evaluation: Ease of Use

                  SinfoniaFS    LinuxNFS    SinfoniaGCS   Spread Toolkit
lines of code     3,855 (C++)   5,900 (C)   2,492 (C++)   22,148 (C)
develop time      1 month       unknown     2 months      years
major versions    1             2           1             4

Evaluation: Scalability
[Figure: scalability plots]
- spread = 2: scalable
- spread = number of memory nodes: not scalable

Evaluation: SinfoniaFS
[Figure: SinfoniaFS evaluation results]

Evaluation: SinfoniaGCS
[Figure: SinfoniaGCS evaluation results]

Conclusion
- Sinfonia: a service for building scalable distributed systems
- Protocol design becomes data structure design
- A sequence of minitransactions over unstructured data
- Effective in building infrastructure applications
- Extensions

Thanks

Coordinator Crash
- Traditional 2PC blocks on coordinator crash
- Not desirable in Sinfonia: Sinfonia has no control over coordinators
- Traditional solution: 3PC
- Sinfonia solution: modified 2PC + recovery coordinator

Coordinator Crash
[Figure: where the log is kept in traditional 2PC (coordinator and participants p1, p2, p3) versus Sinfonia 2PC (application node and memory nodes m1, m2, m3)]
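A recovery coordinator can be sketched as follows. The slides name the mechanism but not its details, so this toy assumes the rule described in the Sinfonia paper: votes are logged at the participants, and a minitransaction counts as committed only if every participant has logged a yes vote. `Participant`, `recovery_vote`, and `recover` are hypothetical names.

```python
class Participant:
    """Toy memory node that records its vote for each transaction id in a log."""

    def __init__(self):
        self.log = {}  # tid -> "yes" or "abort"

    def vote(self, tid, can_commit):
        # Normal path: log the vote. A vote that is already logged never changes.
        return self.log.setdefault(tid, "yes" if can_commit else "abort")

    def recovery_vote(self, tid):
        # Asked by the recovery coordinator: report the logged vote, or log
        # "abort" if this node has not voted on tid yet.
        return self.log.setdefault(tid, "abort")


def recover(tid, participants):
    """Decide an in-doubt transaction from the participants' logs alone."""
    votes = [p.recovery_vote(tid) for p in participants]
    return "commit" if all(v == "yes" for v in votes) else "abort"


p1, p2, p3 = Participant(), Participant(), Participant()

# Case 1: every participant voted yes before the coordinator crashed.
for p in (p1, p2, p3):
    p.vote("t1", True)
outcome_t1 = recover("t1", [p1, p2, p3])

# Case 2: the coordinator crashed before its prepare message reached p3.
p1.vote("t2", True)
p2.vote("t2", True)
outcome_t2 = recover("t2", [p1, p2, p3])
late_vote = p3.vote("t2", True)  # a delayed prepare arrives after recovery
```

Because the outcome depends only on what participants have logged, recovery does not need the crashed coordinator to come back, which is the blocking problem of traditional 2PC that the slide points out.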