Big Data/MonetDB
MonetDB is a column store: for each column of a relational table it creates a binary association table (BAT) that maps an object identifier to the corresponding value. It works primarily in main memory, but the database is still persisted on disk. MonetDB can also be used as a distributed database. Its design focuses on read-dominated workloads in which updates consist of appending large chunks of data at a time.
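The decomposition of a table into BATs can be illustrated with a minimal Python sketch. It is not MonetDB code: the function names `decompose` and `reconstruct` and the sample data are assumptions made for illustration, and the object identifier is simply taken to be the row position.

```python
from typing import Any

Bat = list[tuple[int, Any]]  # a BAT modelled as (oid, value) pairs


def decompose(rows: list[dict[str, Any]]) -> dict[str, Bat]:
    """Split a row-oriented table into one BAT-like list per column."""
    bats: dict[str, Bat] = {}
    for oid, row in enumerate(rows):  # assumption: oid = row position
        for column, value in row.items():
            bats.setdefault(column, []).append((oid, value))
    return bats


def reconstruct(bats: dict[str, Bat], oid: int) -> dict[str, Any]:
    """Reassemble one row by looking up the same oid in every BAT."""
    return {column: dict(bat)[oid] for column, bat in bats.items()}


# Hypothetical sample table with two columns.
people = [
    {"name": "Ada", "age": 36},
    {"name": "Linus", "age": 28},
]
bats = decompose(people)
```

A query that only touches `age` reads `bats["age"]` and never the `name` column, which is the main benefit of this layout for read-dominated workloads.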
MonetDB consists of three layers:
- Front-end
  - provides the user-level data model and the query languages SQL, XQuery (for XML), SciQL (for arrays) and SPARQL (for RDF).
  - First, the query is translated into relational algebra. Then a domain-specific strategic optimization is applied, which tries to reduce the amount of data to be processed. The resulting optimized plan is finally translated into the MonetDB Assembly Language (MAL).
- Back-end
  - consists of the MAL optimizer and interpreter.
  - The tactical optimization performed here is inspired by programming-language optimization and ranges from symbolic processing up to just-in-time data distribution and execution.
- Kernel
  - provides BATs and a library of optimized implementations of the binary relational algebra operators.
  - The operational optimization chooses, at runtime, the best algorithm and implementation for each operator, depending on the actual input data.
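The idea behind the kernel's operational optimization, choosing an implementation at runtime based on the input, can be sketched as follows. This is an illustration only and not MonetDB's kernel: the function `select_equal` and the `is_sorted` flag are hypothetical, and the two strategies (binary search versus linear scan) merely stand in for whatever algorithms the kernel actually selects between.

```python
from bisect import bisect_left, bisect_right


def select_equal(values: list[int], needle: int, is_sorted: bool) -> list[int]:
    """Return the positions (oids) whose value equals needle.

    The algorithm is picked at call time from a property of the input:
    binary search if the column is known to be sorted, a full scan otherwise.
    """
    if is_sorted:
        lo = bisect_left(values, needle)
        hi = bisect_right(values, needle)
        return list(range(lo, hi))
    return [oid for oid, value in enumerate(values) if value == needle]
```

Both branches return the same kind of result, so the caller does not need to know which implementation was used.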
MonetDB/RDF
RDF triples consist of three object identifiers: S (subject), P (property) and O (object). Each triple is stored in six triple tables, one per ordering of the three components: SPO, SOP, PSO, POS, OPS and OSP. With three columns per table, this leads to 18 BATs. Before a triple is inserted into the database, a dictionary module decomposes each URI into the largest common prefix, which is stored only once, and the unique ID of the subject, property or object.
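The six orderings and the dictionary encoding can be sketched in Python. This is a simplified illustration under stated assumptions: the class `Dictionary`, the function `build_triple_tables` and the `ex:` sample terms are hypothetical, the dictionary maps whole terms to integer IDs, and the common-prefix handling described above is left out.

```python
from itertools import permutations


class Dictionary:
    """Maps each distinct term to an integer ID (prefix handling omitted)."""

    def __init__(self) -> None:
        self.ids: dict[str, int] = {}
        self.terms: list[str] = []

    def encode(self, term: str) -> int:
        if term not in self.ids:
            self.ids[term] = len(self.terms)
            self.terms.append(term)
        return self.ids[term]


# The six orderings of subject, property and object.
ORDERS = ["".join(p) for p in permutations("SPO")]


def build_triple_tables(triples: list[tuple[int, int, int]]) -> dict[str, list[tuple[int, ...]]]:
    """Store every (S, P, O) triple once per ordering, sorted in that order."""
    tables = {}
    for order in ORDERS:
        positions = ["SPO".index(c) for c in order]
        tables[order] = sorted(tuple(t[i] for i in positions) for t in triples)
    return tables


# Hypothetical sample data.
d = Dictionary()
raw = [("ex:alice", "ex:knows", "ex:bob"), ("ex:bob", "ex:knows", "ex:alice")]
triples = [(d.encode(s), d.encode(p), d.encode(o)) for s, p, o in raw]
tables = build_triple_tables(triples)
```

Each of the six tables has three columns, which corresponds to the 18 BATs mentioned above. Keeping every ordering means a lookup with any combination of bound components can use a table sorted on exactly those components.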
Current research: use characteristic sets (CSs) to derive relational tables. A characteristic set is the set of properties that occur together for a subject. For each CS a separate table is formed with one row per subject: each column represents a property, and its values are the corresponding objects. An object that is itself a subject is expressed via a foreign key. Irregular triples (those belonging to no CS) are stored separately in a basic triple store. The relational schema is meant to adapt at runtime.
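A minimal Python sketch of this derivation is shown below. It is not the research prototype: the function names, the `min_subjects` threshold that decides when a CS is kept as a table, and the sample triples are all assumptions, and every property is assumed to be single-valued per subject.

```python
from collections import defaultdict


def characteristic_sets(triples):
    """Group subjects by the exact set of properties they use."""
    props = defaultdict(set)
    for s, p, _ in triples:
        props[s].add(p)
    groups = defaultdict(list)
    for s, ps in props.items():
        groups[frozenset(ps)].append(s)
    return groups


def derive_tables(triples, min_subjects=2):
    """Build one table per frequent CS; other triples stay irregular.

    min_subjects is a hypothetical threshold, not taken from the source.
    """
    table_of = {}
    for cs, subjects in characteristic_sets(triples).items():
        if len(subjects) >= min_subjects:
            for s in subjects:
                table_of[s] = cs
    tables = defaultdict(dict)
    irregular = []
    for s, p, o in triples:
        if s in table_of:
            # One row per subject, one column per property.
            tables[table_of[s]].setdefault(s, {})[p] = o
        else:
            irregular.append((s, p, o))
    return dict(tables), irregular


# Hypothetical sample data.
triples = [
    ("a", "name", "Ada"), ("a", "age", 36),
    ("b", "name", "Bob"), ("b", "age", 40),
    ("c", "color", "red"),
]
tables, irregular = derive_tables(triples)
```

Here subjects `a` and `b` share the characteristic set {name, age} and end up as two rows of one table, while the single triple about `c` remains in the irregular triple storage.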
To reduce the number of CSs, attributes of kind 0..n are allowed, i.e., a property may be absent or occur several times for a subject. Since a MonetDB column must have exactly one type, a separate CS is created for each distinct object type of a property. In addition, the schema is fine-tuned, for example by unifying CSs that are related 1..1.
References
- MonetDB - official web site
- Wikipedia Article - MonetDB
- P. A. Boncz "Monet: A Next-Generation Database Kernel For Query-Intensive Applications". Ph.D. Thesis (Universiteit van Amsterdam). May 2002.
- P. A. Boncz and S. Manegold and M. L. Kersten "Database Architecture Optimized For The New Bottleneck: Memory Access" Proceedings of the International Conference on Very Large Data Bases (VLDB), 1999, Pages 54-65, Very Large Data Base Endowment, Edinburgh, UK.
- S. Idreos and F. E. Groffen and N. J. Nes and S. Manegold and K. S. Mullender and M. L. Kersten (March 2012). "MonetDB: Two Decades of Research in Column-oriented Database Architectures" IEEE Data Engineering Bulletin (IEEE): 40–45.
- M. Antonelli "A SPARQL front-end for MonetDB". Master Thesis (University Roma Tre). 2008.
- P. Boncz and I. Fundulaki and P. Minh Duc and P. Tsialiamanis and V. Christophides. "Deliverable 2.5 MonetDB Release with Optimized Graph Path Processing" Rep. University of Crete, 1 Sept. 2012.
- P. Minh Duc "Self-Organizing Structured RDF In MonetDB" Proceedings of ICDE/PHD Symposium 2013 (ICDE), 2013.
- E. Sidirourgos and R. A. Goncalves and M. L. Kersten and N. J. Nes and S. Manegold "Column-Store Support For RDF Data Management: Not All Swans Are White" Proceedings of the International Conference on Very Large Data Bases (VLDB, 2008), 2008.
- D. J. Abadi and A. Marcus and S. R. Madden and K. Hollenbach "Scalable Semantic Web Data Management Using Vertical Partitioning" Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB '07), 2007, Pages 411-422, VLDB Endowment.
- P. Tsialiamanis and E. Sidirourgos and I. Fundulaki and V. Christophides and P. A. Boncz "Heuristic-Based Query Optimisation For SPARQL" Proceedings of EDBT 2012 (EDBT 2012), 2012, Springer.