Efficient and scalable data evolution with column oriented databases
|Author(s)||Liu Z., He B., Hsiao H.-I., Chen Y.|
|Published in||ACM International Conference Proceeding Series|
|Keyword(s)||Bitmap index, Column oriented database, Data evolution, Schema (Extra: Bitmap indexes, Column oriented database, Data evolution, Experimental evaluation, Intermediate results, Schema, Schema evolution, SQL query, Synthetic data, Wikipedia, Data warehouses, Technology, Query processing)|
Efficient and scalable data evolution with column oriented databases is a 2011 conference paper written in English by Liu Z., He B., Hsiao H.-I., and Chen Y., published in the ACM International Conference Proceeding Series.
Database evolution is the process of updating the schema of a database or data warehouse (schema evolution) and evolving the data to the updated schema (data evolution). It is often desired or required when the data or the query workload changes, when the initial schema was not carefully designed, or when more is learned about the database and a better schema becomes apparent. The Wikipedia database, for example, has had more than 170 versions in the past 5 years. Although much research has addressed schema evolution, data evolution has long been a prohibitively expensive process that essentially evolves the data by executing SQL queries and reconstructing indexes. This cost prevents databases from being changed flexibly and frequently as needs arise, and it forces schema designers, who cannot afford mistakes, to be highly cautious. Techniques that enable efficient data evolution would ease this burden considerably. In this paper, we study the efficiency of data evolution and discuss techniques for data evolution on column oriented databases, which store each attribute, rather than each tuple, contiguously. We show that column oriented databases have better potential than traditional row oriented databases for supporting data evolution, and we propose a novel data-level data evolution framework on column oriented databases. Experimental evaluations on real and synthetic data suggest that our approach is much more efficient than query-level data evolution on both row and column oriented databases, which involves unnecessary access to irrelevant data, materialization of intermediate results, and reconstruction of indexes.
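The contrast drawn in the abstract between storing tuples contiguously and storing attributes contiguously can be illustrated with a minimal sketch. This is not the framework proposed in the paper: the table contents and the names `rows`, `columns`, `split_rows`, and `split_columns` are hypothetical, and the schema change shown (keeping a subset of attributes in a new table) is only one simple case. Under those assumptions, the sketch shows why a row layout must read and rewrite every tuple, while a column layout can reuse the untouched attribute columns as they are.

```python
from typing import Dict, List, Tuple

# Row-oriented layout (hypothetical data): each tuple is stored contiguously.
rows: List[Tuple[int, str, str]] = [
    (1, "Ann", "Paris"),
    (2, "Bob", "Rome"),
    (3, "Cy", "Oslo"),
]

# Column-oriented layout: each attribute is stored contiguously.
columns: Dict[str, List] = {
    "id": [1, 2, 3],
    "name": ["Ann", "Bob", "Cy"],
    "city": ["Paris", "Rome", "Oslo"],
}


def split_rows(table: List[Tuple], keep_idx: List[int]) -> List[Tuple]:
    """Row layout: every tuple is read in full and a new table is materialized."""
    return [tuple(row[i] for i in keep_idx) for row in table]


def split_columns(table: Dict[str, List], keep_names: List[str]) -> Dict[str, List]:
    """Column layout: the new table references the existing columns; nothing is copied."""
    return {name: table[name] for name in keep_names}


people_rows = split_rows(rows, [0, 1])
people_cols = split_columns(columns, ["id", "name"])
```

In this sketch `people_cols["name"]` is the very same list object as `columns["name"]`, whereas `people_rows` is a freshly built copy of the data, including a pass over the irrelevant `city` values.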