Database Management > eBook-PDF > Principles of Distributed Database Systems 4th Edition  by M. Tamer Özsu · Patrick Valduriez (All)

Principles of Distributed Database Systems 4th Edition  by M. Tamer Özsu · Patrick Valduriez

Document Content and Description Below

Contents 1 Introduction . 1 1.1 What Is a Distributed Database System? 1 1.2 History of Distributed DBMS . 3 1.3 Data Delivery Alternatives . 5 1.4 Promises of Distributed DBMSs 7 1.4.1 Trans... parent Management of Distributed and Replicated Data 1.4.2 Reliability Through Distributed Transactions 10 1.4.3 Improved Performance 11 1.4.4 Scalability . 13 1.5 Design Issues . 13 1.5.1 Distributed Database Design . 13 1.5.2 Distributed Data Control 14 1.5.3 Distributed Query Processing. 14 1.5.4 Distributed Concurrency Control . 14 1.5.5 Reliability of Distributed DBMS . 15 1.5.6 Replication. 15 1.5.7 Parallel DBMSs . 16 1.5.8 Database Integration 16 1.5.9 Alternative Distribution Approaches . 16 1.5.10 Big Data Processing and NoSQL. 16 1.6 Distributed DBMS Architectures 17 1.6.1 Architectural Models for Distributed DBMSs . 17 1.6.2 Client/Server Systems. 20 1.6.3 Peer-to-Peer Systems 22 1.6.4 Multidatabase Systems 25 1.6.5 Cloud Computing . 27 1.7 Bibliographic Notes 31 xi xii Contents 2 Distributed and Parallel Database Design . 33 2.1 Data Fragmentation 35 2.1.1 Horizontal Fragmentation. 37 2.1.2 Vertical Fragmentation 52 2.1.3 Hybrid Fragmentation. 65 2.2 Allocation. 66 2.2.1 Auxiliary Information . 68 2.2.2 Allocation Model 69 2.2.3 Solution Methods . 72 2.3 Combined Approaches . 72 2.3.1 Workload-Agnostic Partitioning Techniques 73 2.3.2 Workload-Aware Partitioning Techniques . 74 2.4 Adaptive Approaches 78 2.4.1 Detecting Workload Changes . 79 2.4.2 Detecting Affected Items . 79 2.4.3 Incremental Reconfiguration 80 2.5 Data Directory 82 2.6 Conclusion 83 2.7 Bibliographic Notes 84 3 Distributed Data Control 91 3.1 View Management . 92 3.1.1 Views in Centralized DBMSs. 92 3.1.2 Views in Distributed DBMSs . 95 3.1.3 Maintenance of Materialized Views 96 3.2 Access Control . 102 3.2.1 Discretionary Access Control. 103 3.2.2 Mandatory Access Control . 106 3.2.3 Distributed Access Control . 108 3.3 Semantic Integrity Control . 110 3.3.1 Centralized Semantic Integrity Control 111 3.3.2 Distributed Semantic Integrity Control. 116 3.4 Conclusion 123 3.5 Bibliographic Notes 123 4 Distributed Query Processing 129 4.1 Overview 130 4.1.1 Query Processing Problem 130 4.1.2 Query Optimization . 133 4.1.3 Layers Of Query Processing 136 4.2 Data Localization. 140 4.2.1 Reduction for Primary Horizontal Fragmentation . 141 4.2.2 Reduction with Join . 142 4.2.3 Reduction for Vertical Fragmentation 143 4.2.4 Reduction for Derived Fragmentation 145 4.2.5 Reduction for Hybrid Fragmentation. 148 Contents xiii 4.3 Join Ordering in Distributed Queries 149 4.3.1 Join Trees 149 4.3.2 Join Ordering 151 4.3.3 Semijoin-Based Algorithms 153 4.3.4 Join Versus Semijoin 156 4.4 Distributed Cost Model 157 4.4.1 Cost Functions. 157 4.4.2 Database Statistics 159 4.5 Distributed Query Optimization . 161 4.5.1 Dynamic Approach 161 4.5.2 Static Approach . 165 4.5.3 Hybrid Approach 169 4.6 Adaptive Query Processing 173 4.6.1 Adaptive Query Processing Process 174 4.6.2 Eddy Approach 176 4.7 Conclusion 177 4.8 Bibliographic Notes 178 5 Distributed Transaction Processing 183 5.1 Background and Terminology . 184 5.2 Distributed Concurrency Control 188 5.2.1 Locking-Based Algorithms . 189 5.2.2 Timestamp-Based Algorithms 197 5.2.3 Multiversion Concurrency Control . 203 5.2.4 Optimistic Algorithms 205 5.3 Distributed Concurrency Control Using Snapshot Isolation . 206 5.4 Distributed DBMS Reliability . 209 5.4.1 Two-Phase Commit Protocol . 211 5.4.2 Variations of 2PC 217 5.4.3 Dealing with Site Failures 220 5.4.4 Network Partitioning 227 5.4.5 Paxos Consensus Protocol 231 5.4.6 Architectural Considerations . 234 5.5 Modern Approaches to Scaling Out Transaction Management 236 5.5.1 Spanner 237 5.5.2 LeanXcale . 237 5.6 Conclusion 239 5.7 Bibliographic Notes 241 6 Data Replication . 247 6.1 Consistency of Replicated Databases . 249 6.1.1 Mutual Consistency . 249 6.1.2 Mutual Consistency Versus Transaction Consistency . 251 6.2 Update Management Strategies 252 6.2.1 Eager Update Propagation 253 6.2.2 Lazy Update Propagation . 254 xiv Contents 6.2.3 Centralized Techniques . 254 6.2.4 Distributed Techniques 255 6.3 Replication Protocols 255 6.3.1 Eager Centralized Protocols 256 6.3.2 Eager Distributed Protocols. 262 6.3.3 Lazy Centralized Protocols . 262 6.3.4 Lazy Distributed Protocols . 268 6.4 Group Communication . 269 6.5 Replication and Failures . 272 6.5.1 Failures and Lazy Replication 273 6.5.2 Failures and Eager Replication . 273 6.6 Conclusion 276 6.7 Bibliographic Notes 277 7 Database Integration—Multidatabase Systems . 281 7.1 Database Integration . 282 7.1.1 Bottom-Up Design Methodology 283 7.1.2 Schema Matching . 287 7.1.3 Schema Integration 296 7.1.4 Schema Mapping 298 7.1.5 Data Cleaning . 306 7.2 Multidatabase Query Processing 307 7.2.1 Issues in Multidatabase Query Processing . 308 7.2.2 Multidatabase Query Processing Architecture . 309 7.2.3 Query Rewriting Using Views 311 7.2.4 Query Optimization and Execution 317 7.2.5 Query Translation and Execution. 329 7.3 Conclusion 332 7.4 Bibliographic Notes 334 8 Parallel Database Systems . 349 8.1 Objectives. 350 8.2 Parallel Architectures 352 8.2.1 General Architecture 353 8.2.2 Shared-Memory . 355 8.2.3 Shared-Disk . 357 8.2.4 Shared-Nothing 358 8.3 Data Placement . 359 8.4 Parallel Query Processing 362 8.4.1 Parallel Algorithms for Data Processing . 362 8.4.2 Parallel Query Optimization 369 8.5 Load Balancing . 374 8.5.1 Parallel Execution Problems 374 8.5.2 Intraoperator Load Balancing. 376 8.5.3 Interoperator Load Balancing. 378 8.5.4 Intraquery Load Balancing . 378 Contents xv 8.6 Fault-Tolerance . 383 8.7 Database Clusters 384 8.7.1 Database Cluster Architecture 385 8.7.2 Replication. 386 8.7.3 Load Balancing 386 8.7.4 Query Processing 387 8.8 Conclusion 390 8.9 Bibliographic Notes 390 9 Peer-to-Peer Data Management. 395 9.1 Infrastructure . 398 9.1.1 Unstructured P2P Networks 399 9.1.2 Structured P2P Networks . 402 9.1.3 Superpeer P2P Networks . 406 9.1.4 Comparison of P2P Networks 408 9.2 Schema Mapping in P2P Systems . 408 9.2.1 Pairwise Schema Mapping 408 9.2.2 Mapping Based on Machine Learning Techniques 409 9.2.3 Common Agreement Mapping . 410 9.2.4 Schema Mapping Using IR Techniques 411 9.3 Querying Over P2P Systems. 411 9.3.1 Top-k Queries . 412 9.3.2 Join Queries . 424 9.3.3 Range Queries . 425 9.4 Replica Consistency 428 9.4.1 Basic Support in DHTs . 429 9.4.2 Data Currency in DHTs . 431 9.4.3 Replica Reconciliation 432 9.5 Blockchain 436 9.5.1 Blockchain Definition . 437 9.5.2 Blockchain Infrastructure . 438 9.5.3 Blockchain 2.0. 442 9.5.4 Issues. 443 9.6 Conclusion 444 9.7 Bibliographic Notes 445 10 Big Data Processing 449 10.1 Distributed Storage Systems . 451 10.1.1 Google File System . 453 10.1.2 Combining Object Storage and File Storage . 454 10.2 Big Data Processing Frameworks . 455 10.2.1 MapReduce Data Processing . 456 10.2.2 Data Processing Using Spark . 466 10.3 Stream Data Management . 470 10.3.1 Stream Models, Languages, and Operators 472 10.3.2 Query Processing over Data Streams. 476 10.3.3 DSS Fault-Tolerance 483 xvi Contents 10.4 Graph Analytics Platforms. 486 10.4.1 Graph Partitioning. 489 10.4.2 MapReduce and Graph Analytics 494 10.4.3 Special-Purpose Graph Analytics Systems 495 10.4.4 Vertex-Centric Block Synchronous. 498 10.4.5 Vertex-Centric Asynchronous 501 10.4.6 Vertex-Centric Gather-Apply-Scatter 503 10.4.7 Partition-Centric Block Synchronous Processing 504 10.4.8 Partition-Centric Asynchronous 506 10.4.9 Partition-Centric Gather-Apply-Scatter 506 10.4.10 Edge-Centric Block Synchronous Processing . 507 10.4.11 Edge-Centric Asynchronous 507 10.4.12 Edge-Centric Gather-Apply-Scatter 507 10.5 Data Lakes 508 10.5.1 Data Lake Versus Data Warehouse . 508 10.5.2 Architecture . 510 10.5.3 Challenges . 511 10.6 Conclusion 512 10.7 Bibliographic Notes 512 11 NoSQL, NewSQL, and Polystores 519 11.1 Motivations for NoSQL 520 11.2 Key-Value Stores . 521 11.2.1 DynamoDB 522 11.2.2 Other Key-Value Stores . 524 11.3 Document Stores . 525 11.3.1 MongoDB . 525 11.3.2 Other Document Stores . 528 11.4 Wide Column Stores . 529 11.4.1 Bigtable 529 11.4.2 Other Wide Column Stores . 531 11.5 Graph DBMSs 531 11.5.1 Neo4j. 532 11.5.2 Other Graph Databases . 535 11.6 Hybrid Data Stores . 535 11.6.1 Multimodel NoSQL Stores . 536 11.6.2 NewSQL DBMSs . 537 11.7 Polystores . 540 11.7.1 Loosely Coupled Polystores 540 11.7.2 Tightly Coupled Polystores . 544 11.7.3 Hybrid Systems . 549 11.7.4 Concluding Remarks 553 11.8 Conclusion 554 11.9 Bibliographic Notes 555 Contents xvii 12 Web Data Management 559 12.1 Web Graph Management. 560 12.2 Web Search . 562 12.2.1 Web Crawling . 563 12.2.2 Indexing . 566 12.2.3 Ranking and Link Analysis . 567 12.2.4 Evaluation of Keyword Search . 568 12.3 Web Querying 569 12.3.1 Semistructured Data Approach . 570 12.3.2 Web Query Language Approach . 574 12.4 Question Answering Systems 580 12.5 Searching and Querying the Hidden Web . 584 12.5.1 Crawling the Hidden Web 585 12.5.2 Metasearching . 586 12.6 Web Data Integration. 588 12.6.1 Web Tables/Fusion Tables 589 12.6.2 Semantic Web and Linked Open Data . 590 12.6.3 Data Quality Issues in Web Data Integration 608 12.7 Bibliographic Notes 615 A Overview of Relational DBMS 619 B Centralized Query Processing. 621 C Transaction Processing Fundamentals . 623 D Review of Computer Networks 625 References. 627 [Show More]

Last updated: 1 year ago

Preview 1 out of 681 pages

Add to cart

Instant download

document-preview

Buy this document to get the full access instantly

Instant Download Access after purchase

Add to cart

Instant download

Reviews( 0 )

$19.00

Add to cart

Instant download

Can't find what you want? Try our AI powered Search

OR

REQUEST DOCUMENT
93
0

Document information


Connected school, study & course


About the document


Uploaded On

Oct 05, 2022

Number of pages

681

Written in

Seller


seller-icon
Kirsch

Member since 4 years

905 Documents Sold


Additional information

This document has been written for:

Uploaded

Oct 05, 2022

Downloads

 0

Views

 93

Document Keyword Tags

Recommended For You

Get more on eBook-PDF »

$19.00
What is Browsegrades

In Browsegrades, a student can earn by offering help to other student. Students can help other students with materials by upploading their notes and earn money.

We are here to help

We're available through e-mail, Twitter, Facebook, and live chat.
 FAQ
 Questions? Leave a message!

Follow us on
 Twitter

Copyright © Browsegrades · High quality services·