This work was supported by grants HKUST 6081/01E and 6070/00E from Hong Kong RGC. Indexing Spatio-Temporal Data Warehouses Dimitris Papadias, Yufei Tao, Panos Kalnis, Jun Zhang Department of Computer Science Hong Kong University of Science and Technology Clear Water Bay, Hong Kong 26, Feb, 2002 Outline Preliminary Spatial data warehouses and aggregate trees Applications and motivation Solution for static objects Solution for dynamic objects Performance study Conclusion 2 Preliminary Spatial Data Warehouses Each spatial object carries some sort of aggregate information (i.e., each landscape may involve the population). A common query is the window aggregate query, which specifies a window query and retrieves the aggregate sum of all objects intersecting it. Analogy of the group-by in conventional data warehouses. Materialization techniques common in traditional data warehouses are of limited use since possible positions of queries are infinite. Ad-hoc group-by 3 R2 R1 150
qs 75 R3 132 R4 12 Preliminaries Spatial Data Warehouse A better approach is to deploy aggregate trees to introduce the spatial hierarchy [Kline and Snodgrass, 1995, Papadias, et al, 2001, Lazaridis and Mehrotra, 2001]. Aggregation R-tree R5 R1 150 qs R2 75 R3 132 R5 225 R4 12 R1 150 R6 Retrieve the sum of aggregate of objects intersecting qs 4 R2 75 R6 144 R3 132 R4 12
Spatio-Temporal DW: Applications and Motivation Spatio-temporal databases deal with objects whose properties may change with time. Traditional studies in spatio-temporal databases focus on retrieving the actual objects that satisfy the query predicates. Retrieve all vehicles that appear in the north district during 3pm to 5pm yesterday. A more useful type of queries may be to retrieve, instead of the actual object IDs, the number of objects that satisfy the query conditions. Retrieve the (approximate) number of vehicles in the north district during 3pm-5pm yesterday. In the above example, the spatial objects (i.e., streets in the north district) that carry aggregate information (i.e., number of cars) are static. Other queries may involve dynamic objects. The mobile phone antenna (i.e., the aggregate information = # of users served by the antenna) whose spatial extents (i.e., covering areas) may change over time. 5 Example (Static Objects) R2 R1 q R3 R4 aggregate results over timestamps s Query qs retrieve the aggregate sum (during time T1-T4) of all rectangles that intersect it. 369 369 367 364 359
R4 regions R2 T4 T5 now FACT TABLE 6 total sum aggregate results over regions Traditional Methods Pre-materialization Even more difficult than spatial DW due to the inclusion of the temporal dimension. Use an aggregation tree. When the aggregate of a region changes, create a 3D box. An aggregate 3D R-tree is used to index all these boxes. Problem: The spatial extent of a region must be duplicated many times although it does not change. 7 3D boxes for region R1 130 T5 T4 135 145 T3 150 T1 Aggregate RB-tree Spatial extents are stored only once. R5 R2 R1 q R3 R4
B-tree for the whole space R6 s 1 1105 3 723 B-tree for R5 1 685 4 445 1 369 3 367 4 364 5 359 B-tree for R6 1 283 3 405 aggregate results over timestamps 369 R4 12 R3 132 R2 75 R1 150 T1 369 367 364 359 12 60 127 125 127 127 638 90 420 150 145 135 130 710 T2 85 12
T3 90 T4 T5 now FACT TABLE 1 225 2 230 1828 12 80 12 total sum 4 225 5 220 R-tree for spatial dimension R5 1130 R6 698 B-tree for R1 1 445 4 265 1 150 3 145 aggregate results over regions R1 710 R2 420 1 155 3 265 8 3 85 4 90 3 137 4 139 B-tree for R4 R3 638 R4 60 4 135 5 130 B-tree for R2 1 75 2 80 1 144 2 139 1 12 B-tree for R3
1 259 3 379 1 132 2 127 3 125 4 127 Example (Dynamic Objects) Situation during timestamps 1-4 qs R2 R1 R3 R4 aggregate results over timestamps Query qs retrieve the aggregate sum (during time T1-T4) of all rectangles that intersect it. regions 369 369 367 364 359 1828 R 4 12 12 12 12 12 60 R3 132
total sum over regions Example (cont.) change position at timestamp 5 R2 R1 qs R3 R4 Query qs retrieve the aggregate sum (during time T1-T4) of all rectangles that intersect it. aggregate results over timestamps regions 369 369 367 364 359 1828 R 4 12 12 12 12 12 60 R3 132
127 125 127 127 638 R2 75 80 85 90 90 420 R 1 150 150 145 135 130 710 T1 T2 T3 T4 T5 now aggregate results FACT TABLE 10 total sum
over regions Aggregate HRB-tree Integrates the previous idea with the spatio-temporal access method HR-trees. timestamp 1-4 R5 timestamp 5 R2 R1 R' 5 R2 R4 R3 R6 qs R3 R' 1 B-tree forR5 B-tree forR1 R-tree for spatial dimension timestamps 1-4 A B R1 B-tree forR2 R6 qs R-tree for spatial dimension R2
R5 timestamp 5 D R6 C R3 R4 E R4 B-tree forR3 R'1 R6 R2 B-tree forR4 11 R'5 B-tree forR6 B-tree forR'1 B-tree forR'5 Aggregate 3D RB-tree Creates a 3D box only when the spatial extent of an object changes. R5 B-tree forR 5 R2 R4 R'5 R3 R1
B-tree forR' 1 B-tree forR6 R6 R'1 B-tree forR'5 5 time B-tree forR1 B-tree forR2 B-tree forR3 12 B-tree forR4 Managing Numerous B-trees If each B-tree is too small (i.e., the rates of spatial extent and aggregate changes are similar) A block contains too few entries and much space is wasted. Not suitable for caching. Our solution is to use a B-File, which packs numerous B-trees into a single file Avoiding empty spaces in a disk page. Maintaining the same query performance. 13 Performance Dataset settings Number of spatial objects = 10,000 History length = 1,000 timestamps Aggregate agility describes how fast the aggregate information changes (4%, 8%, 16%, 32%, 64%) Region agility describes how fast the spatial extents change 0% for static objects 0.01% for dynamic objects (capturing the fact that spatial dimension changes much slower than the aggregate data) Datasets include 500,000 to 6,500,5000 records. Each query contains 2 parameters: (spatial extents and interval length). 14 Results (Static Objects)
0 1 25 50 query length 75 0 100 18 a3DRB 1 3 5 query extent (%) 7 9 Conclusion We propose indexing techniques that replace the data cube in spatiotemporal data warehouses and answer ad-hoc group-by queries very efficiently. Both static and dynamic spatial dimensions are discussed. Extensions Cost models that predict the performance of alternative structures. Query optimization based on the cost models. Complex query evaluation 19
The City car is NOT a replacement for personal vehicles, taxis, buses, or trucks; it is a NEW vehicle type that promotes a socially responsible and more effective means of urban mobility.The CityCar utilizes fully integrated in-wheel electric motors and...
In 2019/20 Long term Plan funding for mental health will start to flow into CCG baselines and they must, in association with STPs and ICSs, commission services that deliver improved services set out in the plan such as community mental...
The period of a periodic orbit is the number of bounces it makes until it starts to retrace itself. Definitions (cont.) A periodic orbit is stable if its period is independent of initial position Otherwise it is unstable The Problem...
Centralised heating and hot water systems can lead to unwanted heat gains due to pipework. Lack of investment in features that would enable better heat management, particularly with regard to ventilation and solar shading. Care provision. fit for a future...
Consultant Cardiovascular Physician, UHCW NHS Trust, Coventry ... regardless of office BP values. The results of this study demonstrate that in patients being treated for hypertension, higher ambulatory BP values predict CV events, even after adjustment for other risk factors...
Types of Bills. Two Types of Bills. Private Bills: concern individual people or places . Public bills: apply to the entire nation and general matters like taxation, etc. Congress considers many . resolutions (formal statements of opinions from lawmakers) Joint...
Ready to download the document? Go ahead and hit continue!