The diversity and complexity of data-processing architectures, as used in current large-scale e-commerce and social media applications, hinder the integration of new data analytics features and increase maintenance and operational costs. These applications typically span three different layers, On-line Transaction Processing (OLTP), On-line Analytical Processing (OLAP) and Off-line Analysis, each of which adopts a different processing model. The fragmentation of application logic across the transaction, stream and batch processing models increases operational costs and application development time. Substantial engineering effort is required to export, move, convert, and import data across different systems when performing exploratory analytics for developing new applications. Developers have no reliable way of mapping application features onto a specific processing model and instead rely on past experience and rules of thumb. Re-implementing a complex processing pipeline in a system that adopts a different processing model is costly, and the performance trade-offs among the models are difficult to estimate.
In this project, we propose to unify two of these models, stream and transaction processing, in a single processing engine. Current data management systems fall mainly into two categories: Database Management Systems (DBMS) and Data Stream Management Systems (DSMS). Recent and current approaches attempt this unification by extending an OLTP DBMS to support streaming sources, but they provide only very limited analysis capacity and scalability, which is where the "no one size fits all" curse of the DBMS arises. We argue instead that transaction processing is a relevant concept for a DSMS. As a first step towards a "One Size Fits All" data management system, we present StreamDB, which is based on the idea of "make the DSMS support transaction processing" rather than "integrate stream processing into the DBMS". First, we describe how StreamDB processes transactions in a streaming way. Then we compare StreamDB with a popular in-memory RDBMS, VoltDB, on the standard transactional benchmark TPC-C. The experiments show that StreamDB outperforms VoltDB in terms of throughput, scalability and latency. Finally, we conclude that the ideas presented here provide insight into the development of next-generation data management systems and allow for further study of the challenges in unifying stream and transaction processing.
Contact: Michael Kampouridis