The Mostly Color Channel: scale-up and scale-out

Thursday, March 10, 2016

scale-up and scale-out

In the post on big data we mentioned Gunther's universal scalability model.
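
For reference, the universal scalability model is commonly written in the following form. This restates the standard published formula and is not a new result; the symbols are the ones used throughout this post.

$$
S_p = \frac{p}{1 + \sigma\,(p - 1) + \kappa\,p\,(p - 1)}
$$

With κ = 0 the expression reduces to Amdahl's law, and with σ = κ = 0 it reduces to ideal linear speedup, S_p = p.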

In this model, p is the number of processors or cluster nodes and S_p is the speedup with p nodes. σ represents the degree of contention in the system and κ the lack of coherency in the distributed data. An example of contention is waiting for message queueing (bottleneck saturation), and an example of incoherency is updating the processor caches (non-local data exchange).

When we do measurements, T_p is the runtime on p nodes, and S_p = T₁ / T_p is the speedup with p nodes. When we have enough data, we can estimate σ and κ for our system and dataset using nonlinear statistical regression.
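
As a rough illustration of the estimation step, here is a minimal sketch in Python. It does not use nonlinear regression: it uses the fact that the model can be rewritten as p / S_p - 1 = σ(p - 1) + κ p(p - 1), which is linear in σ and κ, and solves the resulting ordinary least-squares problem. This weights measurement errors differently than a true nonlinear fit would. The function names and the runtimes are hypothetical; the data are synthetic, generated from assumed parameter values only to exercise the code.

```python
def usl_speedup(p, sigma, kappa):
    """Speedup predicted by the universal scalability model for p nodes."""
    return p / (1.0 + sigma * (p - 1) + kappa * p * (p - 1))


def fit_usl(nodes, runtimes):
    """Estimate (sigma, kappa) from measured runtimes T_p.

    Uses the linearized form p/S_p - 1 = sigma*a + kappa*b with
    a = p - 1 and b = p*(p - 1), solved via the 2x2 normal equations.
    Needs a measurement for p = 1 and at least two distinct p > 1.
    """
    t1 = runtimes[nodes.index(1)]          # baseline runtime T_1
    saa = sab = sbb = say = sby = 0.0
    for p, t in zip(nodes, runtimes):
        s = t1 / t                         # measured speedup S_p = T_1 / T_p
        a = p - 1
        b = p * (p - 1)
        y = p / s - 1
        saa += a * a
        sab += a * b
        sbb += b * b
        say += a * y
        sby += b * y
    det = saa * sbb - sab * sab
    sigma = (say * sbb - sby * sab) / det
    kappa = (saa * sby - sab * say) / det
    return sigma, kappa


# Synthetic data (assumed values, not measurements): T_1 = 100 s,
# sigma = 0.05, kappa = 0.0004.
nodes = [1, 2, 4, 8, 16, 32]
runtimes = [100.0 / usl_speedup(p, 0.05, 0.0004) for p in nodes]

sigma_hat, kappa_hat = fit_usl(nodes, runtimes)
print(f"sigma = {sigma_hat:.4f}, kappa = {kappa_hat:.6f}")
```

With real, noisy measurements a dedicated nonlinear least-squares routine is likely the better choice; the linearized fit is mainly useful as a quick check or as a starting point.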

The model makes it easy to understand the difference between scale-up and scale-out architectures. In a scale-up system, you can increase the speedup by reducing the contention, for example by adding memory or by bonding network ports. When you play with σ, you will learn that you can increase the speedup, but not the number of processors at which the speedup reaches its maximum, which remains at 48 nodes in the example in Gunther's paper.

In a scale-out architecture, you play with κ and you learn that you can additionally move the maximum over the number of nodes. In Gunther's paper, they can move the maximum to 95 nodes by optimizing the system to exchange less data.
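
The two kinds of tuning can be compared numerically with a small sketch. Setting the derivative of the model to zero gives the node count at the peak, p* = sqrt((1 - σ) / κ), so σ enters the peak position only through the factor sqrt(1 - σ) and barely moves it, while κ moves it directly. The parameter values below are illustrative assumptions, not the ones from Gunther's paper, and the function names are hypothetical.

```python
import math


def usl_speedup(p, sigma, kappa):
    """Speedup predicted by the universal scalability model for p nodes."""
    return p / (1.0 + sigma * (p - 1) + kappa * p * (p - 1))


def peak_nodes(sigma, kappa):
    """Node count p* = sqrt((1 - sigma) / kappa) where the speedup peaks."""
    return math.sqrt((1.0 - sigma) / kappa)


# Illustrative (sigma, kappa) pairs; assumed values for demonstration only.
scenarios = {
    "baseline":         (0.05, 0.0004),
    "less contention":  (0.02, 0.0004),   # scale-up style tuning of sigma
    "less incoherency": (0.05, 0.0001),   # scale-out style tuning of kappa
}

for label, (sigma, kappa) in scenarios.items():
    p_star = peak_nodes(sigma, kappa)
    s_max = usl_speedup(p_star, sigma, kappa)
    print(f"{label:17s} peak at about {p_star:5.1f} nodes, speedup {s_max:5.1f}")
```

In this toy setup, lowering σ raises the maximum speedup while the peak stays at nearly the same node count, and lowering κ pushes the peak out to roughly twice as many nodes.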

This shows that scale-up and scale-out are not simply about using faster system components vs. using more components in parallel. In both cases, you have a plurality of nodes, but you optimize the system differently. In scale-up, you find bottlenecks and then mitigate them. In scale-out, you also work on the algorithms to reduce data exchange.

Since the incoherency term is quadratic in the number of nodes, you get more bang for the buck by reducing the coherency workload. This favors adding more nodes over increasing the performance of each node, the latter usually being a much more expensive proposition.

In big data, scale-out or horizontal scaling is the key approach to achieve scalability. While this is obvious to anybody who has done GP-GPU programming, it is less so for those whose experience is limited to monolithic apps.
