Algorithms and Data Structures for External Memory by Jeffrey Scott Vitter

By Jeffrey Scott Vitter

Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. Algorithms and Data Structures for External Memory surveys the state of the art in the design and analysis of external memory (or EM) algorithms and data structures, where the goal is to exploit locality and parallelism in order to reduce the I/O costs. A variety of EM paradigms are considered for solving batched and online problems efficiently in external memory. Algorithms and Data Structures for External Memory describes several useful paradigms for the design and implementation of efficient EM algorithms and data structures. The problem domains considered include sorting, permuting, FFT, scientific computing, computational geometry, graphs, databases, geographic information systems, and text and string processing. Algorithms and Data Structures for External Memory is an invaluable reference for anybody interested in, or conducting research in, the design, analysis, and implementation of algorithms and data structures.



Best algorithms and data structures books

Problems on Algorithms

Too often the problem sets in standard algorithm texts are composed of small, idiosyncratic units of busy-work and irrelevant questions, forcing instructors into the time-consuming task of finding or composing additional problems. Designed to fill that gap, this supplement provides an extensive and varied collection of useful, practical problems on the design, analysis, and verification of algorithms.

Practical Handbook of Genetic Algorithms: Applications

The mathematics employed by genetic algorithms (GAs) are among the most exciting discoveries of the last few decades. From the construction of a basic GA through to advanced implementation, the Practical Handbook of Genetic Algorithms stands as a vital source of compiled knowledge from respected experts around the world.

Multimedia Database Management Systems

A comprehensive, systematic approach to multimedia database management systems. It presents methods for managing the increasing demands of multimedia databases and their inherent design and architecture issues, and covers how to create an effective multimedia database by integrating the various information indexing and retrieval methods available.

Additional resources for Algorithms and Data Structures for External Memory

Sample text

Then blocks of different buckets may “collide,” meaning that they want to be output to the same disk at the same time, and since the buckets use the same round-robin ordering, subsequent blocks in those same buckets will also tend to collide. Vitter and Hutchinson [342] solve this problem by the technique of randomized cycling. For each of the S buckets, they determine the ordering of the disks in the stripe for that bucket via a random permutation of {1, 2, . . . , D}. The S random permutations are chosen independently.
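The disk assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `randomized_cycling_orders` and `disk_for_block`, the 0-based bucket index, and the `seed` parameter are all hypothetical.

```python
import random


def randomized_cycling_orders(num_buckets, num_disks, seed=None):
    """Draw one independent random permutation of disks 1..D per bucket.

    Hypothetical helper: num_buckets plays the role of S and
    num_disks the role of D in the text.
    """
    rng = random.Random(seed)
    orders = []
    for _ in range(num_buckets):
        perm = list(range(1, num_disks + 1))
        rng.shuffle(perm)  # uniform random permutation of {1, ..., D}
        orders.append(perm)
    return orders


def disk_for_block(orders, bucket, block_index):
    """Disk that receives the block_index-th block (0-based) of a bucket.

    Each bucket cycles through its own permutation, so two buckets that
    collide on one block do not keep colliding in lockstep, as they
    would under a shared round-robin order.
    """
    perm = orders[bucket]
    return perm[block_index % len(perm)]


if __name__ == "__main__":
    orders = randomized_cycling_orders(num_buckets=3, num_disks=4, seed=0)
    for b, perm in enumerate(orders):
        first_six = [disk_for_block(orders, b, j) for j in range(6)]
        print(f"bucket {b}: order {perm}, first six blocks go to {first_six}")
```

Because every bucket has its own cyclic order, a collision between two buckets on one output step says little about whether they will collide on the next.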

1 ([202]). 1) D B log m + 2 log N N if B log m = o(log N). 1 is at least 2. The second case in the theorem is the pathological case in which the block size B and internal memory size M are so small that the optimum way to permute the items is to move them one at a time in the naive manner, not making use of blocking. 1. For the lower bound calculation, we can assume without loss of generality that there is only one disk, namely, D = 1. The I/O lower bound for general D follows by dividing the lower bound for one disk by D.
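To make the two cases concrete, the sketch below compares the two asymptotic candidates for permuting, moving items one at a time (about N/D I/Os) versus sorting-based permuting (about (n/D) log_m n I/Os), where n = N/B and m = M/B. It relies on the well-known Θ(min{N/D, (n/D) log_m n}) bound for permuting and drops all constant factors, so the numbers are orders of magnitude rather than exact counts. The function name `permute_io_estimates` is hypothetical.

```python
import math


def permute_io_estimates(N, M, B, D=1):
    """Order-of-magnitude I/O estimates for permuting N items.

    Assumptions (constants omitted): the naive method costs about N/D
    I/Os, the sorting-based method about (n/D) * log_m(n) I/Os, with
    n = N/B blocks of data and m = M/B blocks of internal memory.
    """
    n = N / B
    m = M / B
    if m <= 1:
        raise ValueError("internal memory must hold more than one block")
    naive = N / D
    by_sorting = (n / D) * (math.log2(n) / math.log2(m))
    return {"naive": naive, "by_sorting": by_sorting,
            "better": "naive" if naive < by_sorting else "by_sorting"}


if __name__ == "__main__":
    # Typical parameters: blocking pays off.
    print(permute_io_estimates(N=2**20, M=2**10, B=2**5))
    # Pathological parameters: tiny B and M, item-at-a-time moves win.
    print(permute_io_estimates(N=2**20, M=4, B=2))
```

Both estimates scale as 1/D, which mirrors the remark that the bound for general D is the one-disk bound divided by D.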

As long as there is a small pool of D/ε block-sized output buffers to temporarily cache the blocks, Vitter and Hutchinson [342] show analytically that with high probability the output proceeds optimally in (1 + ε)n I/Os. There may be some blocks left in internal memory at the end of a distribution pass. In the pathological case, they may all belong to the same bucket. This situation can be turned to advantage by choosing the bucket to recursively process next to be the one with the most blocks in memory.
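The quantities in this passage are simple to compute. The following sketch is illustrative only: the helper names and the dictionary that maps a bucket id to its count of blocks still in memory are assumptions, not part of the book.

```python
import math


def output_buffer_pool_size(num_disks, eps):
    """Number of block-sized output buffers, D / eps, rounded up."""
    return math.ceil(num_disks / eps)


def io_bound(num_blocks, eps):
    """The (1 + eps) * n I/O figure quoted for the output phase."""
    return (1 + eps) * num_blocks


def pick_next_bucket(blocks_in_memory):
    """Pick the bucket with the most blocks still cached in memory.

    blocks_in_memory is a hypothetical dict: bucket id -> block count.
    Processing that bucket next avoids re-reading the cached blocks.
    """
    if not blocks_in_memory:
        raise ValueError("no buckets to choose from")
    return max(blocks_in_memory, key=blocks_in_memory.get)


if __name__ == "__main__":
    print(output_buffer_pool_size(num_disks=4, eps=0.5))
    print(io_bound(num_blocks=100, eps=0.5))
    print(pick_next_bucket({0: 2, 1: 7, 2: 0}))
```

A smaller ε tightens the I/O figure toward n at the price of a larger buffer pool.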

