ICPP 2011 Program Guide

ICPP 2011 Schedule plan

13 Sep, 2011 ( Tuesday )

ROOM        | B2-1     | B2-2                   | B2-3 | B2-4
08:00-09:00 | Registration
09:00-12:00 | CloudSec | P2S2 (starts at 08:45) | EMS  | AWASN
12:00-13:30 | Lunch
13:30-17:00 | SRMPDS   | P2S2                   | PSTI | AWASN

14 Sep, 2011 ( Wednesday )

ROOM        | Howard hall        | B2-1         | B2-2     | B2-3                     | B2-4
08:20-08:50 | Registration
08:50-09:00 | Opening
09:00-10:00 | Keynote 1
10:00-10:30 | Coffee Break
10:30-12:00 | Panel 1 (to 11:30) | Architecture | Wireless |                          | Panel 2 (to 12:30)
12:00-13:30 | Lunch
13:30-15:00 |                    | Architecture | Wireless | Performance and Modeling | Compilers
15:00-15:30 | Coffee Break
15:30-17:00 |                    | Architecture | Wireless | Performance and Modeling | Compilers
18:00-20:00 | Reception


15 Sep, 2011 ( Thursday )

ROOM        | Howard hall     | B2-1                       | B2-2       | B2-3                            | B2-4
08:30-09:00 | Registration
09:00-10:00 | Keynote 2
10:00-10:30 | Coffee Break
10:30-12:30 | Cloud Computing | Cluster and Grid Computing | Algorithms | Multi-core and Parallel Systems |
12:30-13:30 | Lunch
13:30-15:00 |                 | Cluster and Grid Computing | Algorithms | Multi-core and Parallel Systems | Mobile Computing and Networks (to 16:00)
15:00-15:30 | Coffee Break
15:30-17:00 |                 | Cluster and Grid Computing | Algorithms | Multi-core and Parallel Systems |
18:00-20:30 | Banquet

16 Sep, 2011 ( Friday )

ROOM        | Howard hall     | B2-1                      | B2-2       | B2-3
08:30-09:00 | Registration
09:00-10:00 | Keynote 3
10:00-10:30 | Coffee Break
10:30-12:30 | Cloud Computing | OS and Runtime Technology | Algorithms | P2P Computing and Services

Keynote 1 The Cloud, the Client and Big Data

September 14

Dr. Dennis Gannon is Director of Cloud Research Engagements for the Microsoft Technical Policy Group. Dr. Gannon's research interests include cloud computing, data analytics and “big data” platforms, large-scale cyberinfrastructure, distributed computing, parallel programming, computational science and problem solving environments. At Microsoft he and his team are working with the research community to demonstrate the potential of cloud computing to enable broad access to data-intensive scientific research.

Prior to coming to Microsoft, Dr. Gannon was a professor and former chair of computer science at Indiana University and the Science Director for the Indiana Pervasive Technology Labs.  He has published over 100 refereed articles and he has co-edited 3 books.  Dr. Gannon received his Ph.D. in Computer Science from the University of Illinois Urbana-Champaign  after receiving a Ph.D. in Mathematics from the University of California, Davis.

Dennis Gannon
Director of Cloud Research Engagements
eXtreme Computing Group
Microsoft Research

Abstract
The first paradigm of science was experimental. This was quickly followed by the second paradigm, theory, to explain the results of experiments. The third paradigm was computation, which allows us to explore theory where experimentation is difficult or impossible. There is a fourth paradigm that can be described as deriving new knowledge from massive amounts of data even in cases where we may have very little theory to guide us. This is important because almost every branch of academic research is inundated by the data deluge and basic research methods have to evolve rapidly to cope with it. Access to massive amounts of digital data has already transformed the IT industry. Massive scale data clouds designed to index the web have transformed the advertising and publishing industries. We have mobile client devices that have applications that give us total information about where we are at any given instant including where to eat and where to catch a cab. Our computers are learning to see and recognize us as we walk through instrumented spaces. That capability is the result of machine learning applied to massive data collections. However, the academic research community is lagging behind in this revolution. While the most adventurous researchers have access to massive supercomputing facilities and communities like high energy physics have well established data analysis pipelines, the majority of researchers limit the scope of their research to what they can do with the computer on their desk. In this talk we will discuss an approach to removing this limitation by building cloud-based data analytics services that are easy to use from the researcher's desktop.

◆Keynote 2  Execution Models without Borders

September 15

Dr. Thomas Sterling holds the position of Professor of Informatics and Computing at the Indiana University (IU) School of Informatics and Computing as well as serving as Director of the Laboratory for System Science and Engineering at the IU Center for Research in Extreme Scale Technology (CREST). He also is an Adjunct Professor at the Louisiana State University (LSU) Center for Computation and Technology (CCT) and CSRI Fellow at Sandia National Laboratories. Since receiving his Ph.D. from MIT in 1984 as a Hertz Fellow he has engaged in applied research in related fields associated with parallel computing system structures, semantics, and operation in industry, government labs, and academia. Dr. Sterling is best known as the "father of Beowulf" for his pioneering research in commodity/Linux cluster computing. He was awarded the Gordon Bell Prize in 1997 with his collaborators for this work. Thomas Sterling currently leads the ParalleX Research Group to devise a new model of computation establishing the foundation principles to guide the co-design for the development of future generation Exascale computing systems by the end of this decade. His research has been sponsored by NSF, NASA, NSA, DOE, DARPA, Army Corps of Engineers, and Microsoft. He is the co-author of six books and holds six patents.

Thomas Sterling
School of Informatics and Computing
and Pervasive Technology Institute
Indiana University
Fellow, Computer Science Research Institute
Sandia National Laboratories

Abstract
Technology trends have forced parallel processing to new architecture structures that stress conventional usage practices beyond effective applications. The bounds on increased clock rates due to power constraints and increased processor complexity due to ILP limitations have forced multi/manycore structures in combination with GPU accelerators as typified by three of the top four MPP systems in the world. Where once the Communicating Sequential Process (CSP) execution model as reflected by the MPI programming interface dominated both MPP and commodity cluster systems, now the parallel processing community struggles to find alternative methods to effectively program and manage such heterogeneous parallel systems. Historically, the field of HPC has experienced 5 previous phase changes in which technology drove the need for new classes of architecture and programming methods to exploit it, the most recent more than two decades ago. The fundamental issue that challenges the international parallel processing community is the new HPC phase change that will push us into the era of nano-scale technology and multi-billion-way parallelism by the end of this decade. Such a paradigm shift is critical and its lack of adoption is already impeding progress in hardware and software system codesign and extreme scale application development. The critical performance factors that must be addressed are the interrelated behavior properties of starvation, latency, overhead, and contention. As in the past, any new execution model must mitigate their effects of performance degradation. Also, like the past vector, SIMD, and CSP execution models, the future model must be international in its adoption, general in its breadth of application, and a powerful tool across industrial manufacturers and ISV suppliers. Essentially all borders must be crossed by the future generation execution model.
This Keynote Address will focus on this challenge that affects every nation, every user community, every computational challenge, and every producer. The ParalleX experimental execution model will be described as a concrete exemplar of one possible new paradigm that serves as a framework to synthesize prior art while incorporating innovation in order to provide a way forward, eliminating borders and empowering future cooperation among peoples, domains, disciplines, and standards.

◆Keynote 3 The "Single-chip Cloud Computer", an IA Tera-scale Research Processor

September 16

Jim Held is an Intel Fellow who leads a virtual team of architects conducting Tera-Scale Computing Research in Intel's Labs. Since joining Intel in 1990, he has led research and development in a variety of Intel's architecture labs concerned with media and interconnect technology, systems software, multi-core processor architecture and virtualization. Before coming to Intel, Jim worked in research and teaching capacities in the Medical School and Department of Computer Science at the University of Minnesota where he earned a Ph.D. (1988) in Computer and Information Science.

Jim Held
Intel Fellow
Director, Tera-Scale Computing Research Intel Labs

Abstract
Intel Labs has created a second generation experimental "Single-chip Cloud Computer," (SCC) that contains the most Intel Architecture cores ever integrated on a silicon CPU chip - 48 cores. SCC is a concept vehicle, incorporating technologies intended to scale multi-core processors to 100 cores and beyond, such as an on-chip network, advanced power management technologies and support for message-passing. Architecturally, SCC is a microcosm of a cloud datacenter. Each core can run a separate OS and software stack and act like an individual compute node that communicates with other compute nodes over the on-die network fabric, thus supporting the "scale-out" message-passing programming models that have been proven to scale to 1000s of processors. The SCC also serves as an experimental platform for a wide range of parallel computing research on scalable programming models and architectures and increasing our understanding of how to build better processors for the Cloud. It is currently being used by a worldwide community of academic and industry co-travelers.
This talk will describe the architecture of the SCC platform and discuss its role in the broader context of our Tera-scale research.

Panel 1

Programming Environments at Extreme Scale
Exa-scale systems are expected to present an extremely large challenge to computational scientists attempting to leverage the full extent of the capabilities such systems will provide. The systems are expected to have on the order of ten million computational elements, with on the order of several hundred such cores per node, to include multiple types of computational elements, and to have non-uniform memory access characteristics with deep memory hierarchies. Applications are also expected to need to use on the order of ten- to one-hundred-way concurrency per core, provided by fine-grain multithreading. The challenge for application developers trying to use such systems is to understand in detail how their codes perform on such systems, and how these codes need to be changed to make better use of such systems. This panel will discuss current and future research directions aimed at helping application developers develop applications for such systems, understand the performance of such applications on these systems, and transform these applications to better utilize such systems.

Panelists:
Barbara Chapman, University of Houston
Robert Harrison, Oak Ridge National Laboratory
Richard L. Graham, Oak Ridge National Laboratory
Martin Schulz, Lawrence Livermore National Laboratory
Wolfgang Nagel, The Technical University of Dresden

Panel 2

Opportunities and Challenges in Multi-core Era: A Cross-Layer Dialogue
Processors have been moving to multi-core architectures for many years. However, it is still very challenging today to build a scalable, high performance, low-power, and cost-effective multi-core system. The shift towards multi-core not only poses challenges for computer architectures, but also brings new issues distributed across several communities such as applications and algorithms, programming models, languages and compilers, virtual machines, operating systems and run-time supports. This panel gives us an opportunity to stimulate discussions and dialogue across the spectrum of different research communities. We would like to invite researchers from parallel algorithms and applications, programming languages and compiler designs, OSs and run-time supports, and computer architectures to share their ideas from different points of view, and join our discussion on challenges and research directions of multi-core systems.

Panelists:
Tien-Fu Chen, National Chiao Tung University, Taiwan
Jürg Gutknecht, ETH Zurich, Switzerland
Jim Held, Intel Labs, USA
Chung-Ta King, National Tsing Hua University, Taiwan
Jane Win-Shih Liu, Academia Sinica, Taiwan
Shiao-Li Tsao, National Chiao Tung University, Taiwan

Technical Program

  • Architecture
Wednesday, 14 Sep, 10:30-12:00
Location: B2-1
Session Chair: Prof. Krishna Kavi
University of North Texas
A DFA with Extended Character-set for Fast Deep Packet Inspection
Cong Liu, Ai Chen, Di Wu, Jun Zhang and Jie Wu
Symbiotic Scheduling for Shared Caches in Multi-Core Systems Using Memory Footprint Signature
Mrinmoy Ghosh, Ripal Nathuji, Min Lee, Karsten Schwan and Hsien-Hsin Lee
A Distributed Switch Architecture for On-Chip Networks
Antoni Roca, Carles Hernandez, Jose Flich, Federico Silla and Jose Duato
Wednesday, 14 Sep, 13:30-15:00
Location: B2-1
Session Chair: Prof. Chung-Ping Chung
National Chiao Tung University
Evaluation of Techniques to Improve Cache Access Uniformities
Izuchukwu Nwachukwu, Krishna Kavi, Ademola Fawibe and Chris Yan
Energy- and Performance-Efficient Thread Mapping in NoC-based CMPs under Process Variations
Carles Hernandez, Federico Silla and Jose Duato
Energy-Efficient Cache Coherence Protocols in Chip-Multiprocessors for Server Consolidation
Antonio Garcia-Guirado, Ricardo Fernandez-Pascual, Alberto Ros and Jose M. Garcia
Wednesday, 14 Sep, 15:30-17:00
Location: B2-1

Session Chair: Prof. Kuo-Wei Hsu
National Chengchi University

PEPCP: A Power-Efficient Parallel Coherence Protocol for Large-Scale Network-on-Chip
Fucen Zeng, Lin Qiao and Wei Wang
Eager meets Lazy: The Impact of Write-Buffering on Hardware Transactional Memory
Anurag Negi, Ruben Titos-Gil, Manuel E. Acacio, Jose M. Garcia and Per Stenstrom
Tolerating Load Miss Latency by Extending Effective Instruction Window with Low Complexity
Walter Li, Chin-Ling Huang and Chung-Ping Chung

  • Wireless/Sensor Networks and Pervasive Computing
Wednesday, 14 Sep, 10:30-12:00
Location: B2-2

Session Chair: Prof. Meng-Shiuan Pan
Tamkang University

Patrolling Mechanisms for Disconnected Targets in Wireless Mobile Data Mules Networks
Chih-Yung Chang, Chih-Yu Lin, Chen-Yu Hsieh and Yi-Jung Ho
Peer-to-Peer Object Tracking in the Internet of Things
Yanbo Wu, Quan Z. Sheng and Damith Ranasinghe
An Innovative Scheme for Increasing Connectivity in ZigBee Networks
Chia-Ming Wu, Ruay-Shiung Chang and Pu-I Lee
Wednesday, 14 Sep, 13:30-15:00
Location: B2-2

Session Chair: Prof. Tomotaka Wada
Kansai University

A Distributed Flow-Based Guiding Protocol in Wireless Sensor Networks
Po-Yu Chen, Zan-Feng Kao, Wen-Tsuen Chen and Chi-Han Lin
Efficient Bandwidth Allocation with QoS Guarantee for IEEE 802.16 Systems
Da-Nung Lai, Tsung-Chuan Huang and Hung-Yi Chi
Gradient-based Aggregation in Forest of Sensors (GrAFS)
Ravi Prakash and Ehsan Nourbakhsh
Wednesday, 14 Sep, 15:30-17:00
Location: B2-2
Session Chair: Prof. Shikharesh Majumdar
Carleton University
Unilateral Wakeup for Mobile Ad Hoc Networks
Shan-Hung Wu, Jang-Ping Sheu and Chung-Ta King
A Secure Data Aggregation based Trust Management Approach for Dealing with Untrustworthy Motes in Sensor Network
Sanjay Madria
Video-Like Compression for High Efficiency Database Storage of Wireless Sensor Networks
Niang-Ying Huang, Chi-Cheng Chuang and Ray-I Chang

  • Performance and Modeling
Wednesday, 14 Sep, 13:30-15:00
Location: B2-3
Session Chair: Dr. Martin Schulz
Lawrence Livermore National Laboratory
Unveiling Internal Evolution of Parallel Application Computation Phases
Harald Servat, German Llort, Judit Gimenez, Kevin Huck and Jesus Labarta
Cache Pirating: Measuring the Curse of the Shared Cache
David Eklov, Nikos Nikoleris, David Black-Schaffer and Erik Hagersten
Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs
Allen Malony, Scott Biersdorff, Sameer Shende, Heike Jagode, Stanimire Tomov, Guido Juckeland, Robert Dietrich, Duncan Poole and Christopher Lamb
Wednesday, 14 Sep, 15:30-17:00
Location: B2-3
Session Chair: Prof. Gul Agha
University of Illinois
Exposing Complex Bug-Triggering Conditions in Distributed Systems via Graph Mining
Eunsoo Seo, Mohammad Maifi Hasan Khan, Prasant Mohapatra, Jiawei Han and Tarek Abdelzaher
Probabilistic Communication and I/O Tracing with Deterministic Replay at Scale
Xing Wu, Karthik Vijayakumar, Frank Mueller, Xiaosong Ma and Philip Roth
Interpreting Performance Data Across Intuitive Domains
Martin Schulz, Joshua A. Levine, Peer-Timo Bremer, Todd Gamblin and Valerio Pascucci

  • Compilers, Programming Models and Languages
Wednesday, 14 Sep, 13:30-15:00
Location: B2-4
Session Chair: Dr. Brice Goglin
INRIA
GStream: A General-Purpose Data Streaming Framework on GPU Clusters
Yongpeng Zhang and Frank Mueller
CAB: Cache Aware Bi-tier Task-stealing in Multi-socket Multi-core Architecture
Quan Chen, Zhiyi Huang, Minyi Guo and Jingyu Zhou
Virtual Topologies for Scalable Resource Management and Contention Attenuation in a Global Address Space Model on the Cray XT5
Weikuan Yu, Vinod Tipparaju, Xinyu Que and Jeffrey Vetter
Wednesday, 14 Sep, 15:30-17:00
Location: B2-4
Session Chair: Prof. Kyoung-Woo Lee
Yonsei University
A Comprehensive Performance Comparison of CUDA and OpenCL
Jianbin Fang, Ana Lucia Varbanescu and Henk Sips
Enabling Multithreading on CGRAs
Jared Pager, Reiley Jeyapaul, Mahdi Hamzeh, Aviral Shrivastava and Sarma Vrudhula
Enhancing the Role of Inlining in Effective Interprocedural Parallelization
Jichi Guo, Mike Stiles, Qing Yi and Kleanthis Psarris

  • Cloud Computing
Thursday, 15 Sep, 10:30-12:30
Location: Howard Hall
Session Chair: Prof. Yili Gong
Wuhan University
Location-aware MapReduce in Virtual Cloud
Yifeng Geng, Shimin Chen, Yongwei Wu, Ryan Wu, Guangwen Yang and Weimin Zheng

WAVNet: Wide-Area Network Virtualization Technique for Virtual Private Cloud
Zheming Xu, Sheng Di, Weida Zhang, Luwei Cheng and Cho-Li Wang

Virtual Machine Provisioning Based on Analytical Performance and QoS in Cloud Computing Environments
Rodrigo Calheiros, Rajiv Ranjan and Rajkumar Buyya
CSR: A Cloud-assisted Speech Recognition Service for Personal Mobile Device
Yu-Shuo Chang, Shih-Hao Hung, Nick Wang and Bor-Shen Lin
Friday, 16 Sep, 10:30-12:30
Location: Howard Hall
Session Chair: Prof. Jen-Jee Chen
National University of Tainan
SQLMR : A Scalable Database Management System for Cloud Computing
Meng-Ju Hsieh, Chao-Rui Chang, Li-Yung Ho, Jan-Jan Wu, Pangfeng Liu and Yeh-Ching Chung
S3: An Efficient Shared Scan Scheduler on MapReduce Framework
Lei Shi, Xiaohui Li and Kian-Lee Tan
Adaptive Disk I/O scheduling for MapReduce in Virtualized Environment
Shadi Ibrahim, Hai Jin, Lu Lu, Bingsheng He and Song Wu
aMOSS: Automated Multi-Objective Server Provisioning with Stress-Strain Curving
Palden Lama and Xiaobo Zhou

  • Cluster and Grid Computing
Thursday, 15 Sep, 10:30-12:00
Location: B2-1
Session Chair: Dr. Richard Graham
Oak Ridge National Laboratory
IDEA - An API for Parallel Computing with Large Spatial Datasets
Baoqiang Yan and Philip Rhodes
Performance of CUDA Virtualized Remote GPUs in High Performance Clusters
Jose Duato, Antonio J. Pena, Federico Silla, Rafael Mayo and Enrique S. Quintana-Orti
CRFS: a Lightweight User-Level Filesystem for Generic Checkpoint/Restart
Xiangyong Ouyang, Raghunath Rajachandrasekar, Xavier Besseron, Hao Wang, Jian Huang and Dhabaleswar Panda
Thursday, 15 Sep, 13:30-15:00
Location: B2-1
Session Chair: Prof. Phillip M. Dickens
University of Maine
Efficient Energy Management using Adaptive Reinforcement Learning-based Scheduling in Large-Scale Distributed Systems
Masnida Hussin, Young Choon Lee and Albert Y. Zomaya
QoS Preference-Aware Replica Selection Strategy Using MapReduce-Based PGA in Data Grids
Runqun Xiong, Junzhou Luo, Aibo Song, Bo Liu and Fang Dong
Optimizing Process-to-Core Mappings for Two Dimensional Broadcast/Reduce on Multicore Architectures
Christer Karlsson, Teresa Davies, Chong Ding, Hui Liu and Zizhong Chen
Thursday, 15 Sep, 15:30-17:00
Location: B2-1
Session Chair: Prof. Kuan-Chou Lai
National Taichung University
An Efficient Programming Paradigm for Shared-Memory Master-Worker Video Decoding on TILE64 Many-Core Platform
Xuan-Yi Lin, Kuan-Chou Lai, Kuan-Ching Li, Shau-Yin Tseng and Yeh-Ching Chung
MiF: Mitigating the Intra-file Fragmentation in Parallel File System
Letian Yi, Jiwu Shu, Youyou Lu, Wei Wang and Weimin Zheng
Checkpoint and Run-Time Adaptation with Pluggable Parallelisation
Bruno Medeiros and Joao Sobral

  • Algorithms Design and Parallelization
Thursday, 15 Sep, 10:30-12:30
Location: B2-2
Session Chair: Dr. Hsi-Ya Chang
National Center for High-Performance Computing
A Scalable Tridiagonal Solver for GPUs
Hee-Seok Kim, Shengzhao Wu, Li-Wen Chang and Wen-Mei Hwu
On the Performance of Greedy Algorithms for Power Consumption Minimization
Anne Benoit, Paul Renaud-Goud and Yves Robert
Optimal Data Allocation for Scratch-Pad Memory on Embedded Multi-core Systems
Yibo Guo, Qingfeng Zhuge, Jingtong Hu, Meikang Qiu, Wei-Che Tseng and Edwin H.-M. Sha
Energy-aware Mappings of Series-parallel Workflows onto Chip Multiprocessors
Anne Benoit, Paul Renaud-Goud, Yves Robert and Rami Melhem
Thursday, 15 Sep, 13:30-15:00
Location: B2-2
Session Chair: Prof. Wen-Chih Peng
National Chiao Tung University
Modeling and Practical Evaluation of a Service Location Problem in Large Scale Networks
Olivier Beaumont, Nicolas Bonichon and Hubert Larcheveque
Optimizing SpMV for Diagonal Sparse Matrices on GPU
Xiangzheng Sun, Yunquan Zhang, Ting Wang, Xianyi Zhang, Liang Yuan and Li Rao
Parallel Discovery of Direct Causal Relations and Markov Boundaries with Applications to Gene Networks
Olga Nikolova and Srinivas Aluru
Thursday, 15 Sep, 15:30-17:00
Location: B2-2
Session Chair: Prof. Philip Wilsey
University of Cincinnati
Bloom Filter Performance on Graphics Engines
Lin Ma, Roger Chamberlain, Jeremy Buhler and Mark Franklin
Kernel Assisted Collective Intra-node MPI Communication Among Multi-core and Many-core CPUs
Teng Ma, George Bosilca, Aurelien Bouteiller, Brice Goglin, Jeffrey Squyres and Jack Dongarra
OCL-BodyScan: A Case Study for Application-centric Programming of Many-Core Processors
Ana Lucia Varbanescu, Milos Raskovic, Henk Sips, Maarten Ditzel and Wouter Vlothuizen
Friday, 16 Sep, 10:30-12:00
Location: B2-2
Session Chair: Prof. Ching-Hsien Hsu
Chung Hua University
Memory Mapping and Task Scheduling Techniques for Computation Models of Image Processing on Many-Core Platform
Ang-Chih Hsieh, Yi-Ta Wu, Shau-Yin Tseng and TingTing Hwang
On The Energy Complexity of Parallel Algorithms
Vijay Anand Korthikanti, Gul Agha and Mark Greenstreet
Cache Accurate Time Skewing in Iterative Stencil Computations
Robert Strzodka, Mohammed Shaheen, Dawid Pajak and Hans-Peter Seidel

  • Mobile Computing and Networks
Thursday, 15 Sep, 13:30-16:00
Location: B2-4
Session Chair: Prof. Sanjay Madria
Missouri University of Science and Technology
Understanding the Flooding in Low-Duty-Cycle Wireless Sensor Networks
Zhenjiang Li, Mo Li, Junliang Liu, Yunhao Liu and Shaojie Tang
On Using Contact Expectation for Routing in Delay Tolerant Networks
Honglong Chen and Wei Lou
Making Many People Happy: Greedy Solutions for Content Distribution
Yunsheng Wang, Yuhong Guo and Jie Wu

ALERT: An Anonymous Location-based Efficient Routing Protocol in MANETs
Lianyu Zhao and Haiying Shen

Privacy Leakage in Access Mode: Revisiting Private RFID Authentication Protocols
Qingsong Yao, Jinsong Han, Yong Qi, Lei Yang and Yunhao Liu

  • Multi-core and Parallel Systems
Thursday, 15 Sep, 10:30-12:00
Location: B2-3
Session Chair: Prof. Taisuke Boku
University of Tsukuba
Moving Database Systems to Multicore - An Auto-Tuning Approach
Victor Pankratius and Martin Heneka
GSNP: A DNA Single-Nucleotide Polymorphism Detection System with GPU Acceleration
Mian Lu, Jiuxin Zhao, Qiong Luo, Bingqiang Wang, Shaohua Fu and Zhe Lin
Understanding Off-chip Memory Contention of Parallel Programs in Multicore Systems
B.M. Tudor, Y.M. Teo and Simon See
Thursday, 15 Sep, 13:30-15:00
Location: B2-3
Session Chair: Dr. Jan-Jan Wu
Academia Sinica
Accelerating Sparse Matrix Vector Multiplication in Iterative Methods Using GPU
Kiran Kumar Matam and Kishore Kothapalli
Implications of Merging Phases on Scalability of Multi-core Architectures
Madhavan Manivannan, Ben Juurlink and Per Stenstrom
UnSync: A Soft Error Resilient Redundant Multicore Architecture
Aviral Shrivastava, Reiley Jeyapaul, Fei Hong, Abhishek Rhisheekesan and Kyoung Lee
Thursday, 15 Sep, 15:30-17:30
Location: B2-3
Session Chair: Prof. Tarek El-Ghazawi
The George Washington University
PC-Mesh: A Dynamic Parallel Concentrated Mesh
Jesus Camacho, Jose Flich, Antoni Roca and Jose Duato
Data-Driven Tasks and their Implementation
Vivek Sarkar and Sagnak Tasirlar
Combining Congested-Flow Isolation and Injection Throttling in HPC Interconnection Networks
Jesus Escudero-Sahuquillo, Ernst Gunnar Gran, Pedro Javier Garcia, Jose Flich, Tor Skeie, Olav Lysne, Francisco Jose Quiles and Jose Duato
GPU Resource Sharing and Virtualization on High Performance Computing Systems
Teng Li, Vikram Narayana, Esam El-Araby and Tarek El-Ghazawi

  • OS and Runtime Technology
Friday, 16 Sep, 10:30-12:00
Location: B2-1
Session Chair: Prof. Shanq-Jang Ruan
National Taiwan University of Science and Technology
LnQ: Building High Performance Dynamic Binary Translators with Existing Compiler Backends
Chun-Chen Hsu, Pangfeng Liu, Chien-Min Wang, Jan-Jan Wu, Ding-Yong Hong, Pen-Chung Yew and Wei-Chung Hsu
Memcached Design on High Performance RDMA Capable Interconnects
Jithin Jose, Hari Subramoni, Miao Luo, Minjia Zhang, Jian Huang, Md. Wasi-ur Rahman, Nusrat Islam, Xiangyong Ouyang, Hao Wang, Sayantan Sur and Dhabaleswar Panda
Combining Hard Periodic and Soft Aperiodic Real-Time Task Scheduling on Heterogeneous Compute Resources
Hsiang-Kuo Tang, Parmesh Ramanathan and Katherine Compton

  • P2P Computing and Services
Friday, 16 Sep, 10:30-12:00
Location: B2-3
Session Chair: Dr. Rodrigo Calheiros
University of Melbourne
Probabilistic Best-fit Multi-dimensional Range Query in Self-Organizing Cloud
Sheng Di, Cho-Li Wang, Weida Zhang and Luwei Cheng
ShareStorm: a High-Performance and ISP-Friendly P2P Content Distribution Protocol
Yingchun Lei, Litang Yang, Yili Gong and Wenjie Wang
On the QoS of Offline Download in Retrieving Peer-side File Resource
Yuanjian Xing, Zhi Yang, Chi Chen, Jilong Xue and Yafei Dai

Workshops

  • PSTI
Tuesday, 13 Sep Location: B2-3
13:30-13:35 Welcome and Introduction
Session I

Session Chair : Karl Fuerlinger 
Ludwig Maximilian University Munich

13:35-14:10

Invited Keynote Talk
Building specialized tools using tool component frameworks
Martin Schulz, Lawrence Livermore National Laboratory, USA

14:10-14:30 Critical-path-guided interactive parallelisation
Jonathan Mak and Alan Mycroft
14:30-14:50 Pre-computing Function Results in Multi-Core and Many-Core Processors
Edward C. Herrmann, Prudhvi Janga, and Philip A. Wilsey
14:50-15:20 Coffee Break
Session II

Session Chair : Karl Fuerlinger 
Ludwig Maximilian University Munich

15:20-15:40 Assessing the Performance of MPI Applications Through Time-Independent Trace Replay
Frederic Desprez, George S. Markomanolis, Martin Quinson, and Frederic Suter
15:40-16:00 Simulation of Large-Scale HPC Architectures
Ian S. Jones and Christian Engelmann
16:00-16:20 Scalable Control and Monitoring of Supercomputer Applications using an Integrated Tool Framework
Gregory R. Watson, Wolfgang Frings, Claudia Knobloch, Carsten Karbach, and Albert L. Rossi

  • P2S2
Tuesday, 13 Sep Location: B2-2
8:45 - 9:00 Opening Remarks

Session Chair : Pavan Balaji
Argonne National Laboratory

Session I

Session Chair : Pavan Balaji
Argonne National Laboratory

9:00-10:00

Invited Keynote Talk
Exascale: Why It Is Different
Dr. Barbara Chapman, University of Houston

10:00-10:30

Coffee Break
Session II Programming Models and Runtime Systems

Session Chair : Martin Schulz
Lawrence Livermore National Laboratory

10:30-12:20 Recomposing An Irregular Algorithm Using a Novel Low-Level PGAS Model
Megan Vance and Peter Kogge
A Middleware for Concurrent Programming in MPI Applications
Tobias Berka, Helge Hagenauer and Marian Vajtersic
Kangaroo: Reliable execution of scientific applications with DAG programming model
Kai Zhang, Kang Chen, Wei Xue
JETS: Language and System Support for Many Parallel Task Computing
Justin Wozniak and Michael Wilde

12:20-13:30

Lunch
Session III Scheduling and Workflows

Session Chair : Vinod Tipparaju
Oak Ridge National Laboratory

13:30-15:20 Energy-Constrained Dynamic Resource Allocation in a Heterogeneous Computing Environment
B. Dalton Young, Jonathan Apodaca, Luis Diego Briceno, Jay Smith, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel, Bhavesh Khemka, Shirish Bahirat, Adrian Ramirez, and Yong Zou
Job Co-Scheduling on Coupled High-End Computing Systems
Wei Tang, Narayan Desai, Venkatram Vishwanath, Daniel Buettner, Zhiling Lan
Integrating Scientific Workflows and Large Tiled Display Walls: Bridging the Visualization Divide
Hoang Nguyen, David Abramson, Blair Bethwaite, Minh Ngoc Dinh, Colin Enticott, Stephen Firth, Slavisa Garic, Ian Harper, Martin Lackmann, A.B.M. Russel, Stefan Schek, Mary Vail
Restricted Admission Control in View-Oriented Transactional Memory
K. Leung and Z. Huang

15:20-15:40

Coffee Break

Session IV Communication and I/O

Session Chair : Brice Goglin
INRIA

15:40-17:00 CellPilot: A Seamless Communication Solution for Hybrid Cell Clusters
Natalie Girard, William Gardner, John Carter and Gary Grewal
Interval Based I/O: A New Approach to Providing High Performance Parallel I/O
Jeremy Logan and Phillip Dickens

Improving Performance of the Irregular Data Intensive Application with Small Computation Workload for CMPs
Zhimin Gu

  • CloudSec
Tuesday, 13 Sep Location: B2-1
Session I

Session Chair : Chun-Chieh Huang
Minghsin University of Science and Technology, Taiwan

09:00-10:00 A Generic Scheme for Data Sharing in Cloud
Yanjiang Yang
Secure Connectivity for Intra-Cloud and Inter-Cloud Communication
Shiping Chen, Surya Nepal, Ren Liu
10:00-10:20 Coffee Break
Session II

Session Chair : Chun-Chieh Huang
Minghsin University of Science and Technology, Taiwan

10:20-12:00 A Secure Cloud Backup System with Assured Deletion and Version Control
Arthur Rahumed, Henry C. H. Chen, Yang Tang, Patrick P. C. Lee, and John C. S. Lui

A Secure File Allocation Algorithm for Heterogeneous Distributed Systems
Yun Tian, Mohammed I. Alghamdi, Jiong Xie, Shu Yin, Ji Zhang, Meikang Qiu, Yiming Yang, Xiao Qin

Implications of Recovery Schemes for Virtualization Platform
Guanhua Tian, Dan Meng
A Security Framework of Group Location-Based Mobile Applications in Cloud Computing
Yu-Jia Chen, Li-Chun Wang
*Note: each presenter has 20 minutes to present the paper, and 5 minutes for Q&A

  • AWASN
Tuesday, 13 Sep Location: B2-4
Session I WiMAX
Session Chair : Yuh-Shyan Chen
09:00-10:00 Handling the Backhaul Link Failure Problem for Femto ABSs in IEEE 802.16m Environments
Yu-Chan Lin, Whai-En Chen, and Meng-Hsuan Lin
A Study for Connection Establishment in Femtocell Network
Yu-Ching Hsu, Show-Shiow Tzeng, and Ching-Wen Huang
Channel-aware Slot Assignment in OFDMA-based Mobile WiMAX Networks
I-Shyan Hwang, Bor-Jiunn Hwang, and Chien-Yao Chiu
A Femtocell-Assisted Data Forwarding Protocol in Relay Enhanced LTE Networks
Yuh-Shyan Chen, Chao-Chun Li, Wen-Lin Chiang
Session II Wireless sensor networks Session Chair : Tomotaka Wada
10:20-12:00 A Controller-Assisted Distributed (CAD) Load Balancing Scheme for ZigBee Networks
Kuei-Li Huang, Chien-Chao Tseng, Jui-Tang Wang, and Tsung-Hsi Yang
Time-Synchronized versus Self-Organized K-Coverage Configuration in WSNs
Meng-Chun Wueng, Prasan Kumar Sahoo, and I-Shyan Hwang
i-Mace: Protecting Females from Sexual and Violent Offenders in a Community via Smartphones
Jou-Chih Chang, Pi-Shih Wang, Kang-Hsuan Fan, Shih-Rong Yang, De-Yuan Su, Min-Shiung Lin, Min-Te Sun, and Yu-Chee Tseng
A Lightweight Secure Data Aggregation Protocol for Wireless Sensor Networks
Hung-Min Sun, Chiung-Hsun Chen, and Po-Chi Li
Barrier Coverage Constructions for Border Security Systems using Wireless Sensors
Koji Yamamoto, Hayato Ozaki, Takuya Suzuki, Tomotaka Wada, Koichi Mutsuura, and Hiromi Okada
Session III Mobile networks and services Session Chair : Chih-Wei Yi
13:30-14:50 End to End Security and Path Security in Network Mobility
Long-Sheng Li, Shr-Shiuan Tzeng, and Rui-Chung Bai
Connectivity Modeling of Vehicular Ad Hoc Networks in Signalized City Roads
Prasan Kumar Sahoo, Ming-Jer Chiang, and Shih-Lin Wu
From Spatial Reuse to Transmission Power Control for CSMA/CA Based Wireless Ad Hoc Networks
Han-Chiuan Luo, Eric Hsiao-Kuang Wu, and Gen-Huey Chen
A Runtime Partitioning Technique for Mobile Web Services
Muhammad Asif and Shikharesh Majumdar
Routing and Buffering Strategies in Delay-Tolerant Networks: Survey and Evaluation
Shou-Chih Lo, Min-Hua Chiang, Jhan-Hua Liou, and Jhih-Siao Gao
Session IV Localization and detection Session Chair : Jehn-Ruey Jiang
15:20-17:00 AR-based Positioning for Mobile Devices
Yaun-Chou Cheng, Ju-Yi Lin, Chih-Wei Yi, Yu-Chee Tseng, Lun-Chia Kuo, Yu-Jung Yeh, and Chung-Wei Lin
Parallel Response Query Tree Splitting for RFID Tag Anti-Collision
Ming-Kuei Yeh, Jehn-Ruey Jiang, and Shing-Tsaan Huang
An Accurate GPS-based Localization In Wireless Sensor Networks: A GM-WLS Method
Bo Cheng, Rong Du, Bo Yang, Wenbin Yu, Cailian Chen, and Xinping Guan

Sliding-Typed Communication Range Recognition Method for Indoor Position Estimation in Passive RFID Systems
Atsuki Inada, Yuki Oda, Emi Nakamori, Manato Fujimoto, Tomotaka Wada, Kouichi Mutsuura, and Hiromi Okada

Optimal Multipath Planning for Neyman-Pearson Detection in Wireless Sensor Networks
Yung-Liang Lai and Jehn-Ruey Jiang

  • SRMPDS
Tuesday, 13 Sep Location: B2-1
Session I GPU and Multi-core Systems
13:30-14:50 Analyzing the Effects of Multicore Architectures and On-host Communication Characteristics on Collective Communications
Joshua Ladd, Manjunath Gorentla Venkata, Richard Graham and Pavel Shamis

Energy-Aware Workload Consolidation on GPU
Dong Li, Surendra Byna and Srimat Chakradhar

The Power Efficiency of GPUs in Multi Nodes Environment with Molecular Dynamics
Takuro Udagawa and Masakazu Sekijima
An Efficient I/O Aggregator Assignment Scheme for Collective I/O Considering Processor Affinity
Kwangho Cha and Seungryoul Maeng
14:50-15:20 Coffee Break
Session II Cloud and Distributed Systems
15:20-17:00 Agent-based Adaptive Resource Allocation on the Cloud Computing Environment
Gihun Jung and Kwang Mong Sim
A Simulation Framework for Reconfigurable Processors in Large-scale Distributed Systems
M. Faisal Nadeem, S. Arash Ostadzadeh, M. Nadeem, J.S.S.M. Wong and K.L.M. Bertels
An Extensible Design of a Load-Aware Virtual Router Monitor in User Space
Harry F. W. Choi and Patrick P. C. Lee
Can MPI Benefit Hadoop and MapReduce Applications?
Xiaoyi Lu, Bing Wang, Li Zha and Zhiwei Xu
P2G: A Framework for Distributed Real-Time Processing of Multimedia Data
Håvard Espeland, Paul B. Beskow, Håkon K. Stensland, Preben N. Olsen, Ståle Kristoffersen, Carsten Griwodz and Pål Halvorsen

  • EMS
Tuesday, 13 Sep Location: B2-3
Session I Keynote Session Chair: Prof. Shang-Hong Lai
09:00-10:00 Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU
Dr. Yen-Kuang Chen, Intel, USA
10:00-10:20 Coffee Break
Session II Session Chair: Prof. Pao-Ann Hsiung and Dr. Yanqin Yang
10:20-12:00 Parallelized Face Based RMS System on a Multi-core Embedded Computing Platform
Te-Feng Su, Jia-Jhe Li, Chih-Hsueh Duan, Shu-Fan Wang and Shang-Hong Lai
Adaptive Performance Monitoring for Embedded Multicore Systems
Chun-Yi Shih, Ming-Chih Li, Chao-Sheng Lin, Pao-Ann Hsiung, Chih-Hung Chang, William C. Chu, Nien-Lin Hsueh, Chihhsiong Shih, Chao-Tung Yang and Chorng-Shiuh Koong
C++ Compiler Supports for Embedded Multicore DSP Systems
Chi-Bang Kuan, Jia-Jhe Li, Chung-Kai Chen and Jenq Kuen Lee
Embedded Network Intrusion Detection Systems with a Multi-Core Aware Packet Capture Module
Chia-Hao Hsu and Sheng-De Wang
An Efficient Approach of Power Reducing for Scratch-pad Memory based Embedded Systems
Yanqin Yang, Wenchao Xu, Minyi Guo and Zili Shao
Accelerating the Near Non-bonded Force Computation in Desmond with Graphic Processing Units
Hualiang Deng, Xin Li, Xiaoguang Liu and Gang Wang
*Note: each presenter has 12 minutes to present the paper, and 3 minutes for Q&A