Megahertz myth



There is in fact a "Megahertz Myth", and it exists in the minds of those who think that the only factor that matters is raw chip speed, as defined in megahertz ratings. That thinking is especially misleading when comparing different CPU designs, even among products in the same family. When you start to compare different classes of chips, the mythological 1:1 relationship of MHz to "speed" becomes even more difficult to cling to. The megahertz myth is a name for the widely held misconception that the computing power of a CPU is strictly a function of its clock speed. In reality, clock speed is only one of many factors that determine the speed at which a CPU can execute instructions. The myth is largely a creation of computer and hardware manufacturers' marketing departments, who for a while highlighted clock speed as one of the primary features in their advertising, playing up the innate assumption that big numbers = MOAR POWER!!!11

What affects speed
For many years, the number of times a computer's clock (the PC's "heart" to its processor's "brain") ticked each second was a direct indication of how many calculations a processor could perform. One clock tick, one instruction, was the design rule. Although CPU clock speeds have not significantly changed in the last decade, many improvements to hardware efficiency and software optimization have kept actual performance steadily growing.

Multiple cores
Modern CPUs include multiple independent processing units, or cores, within the same chip. Each one of these cores has all of the capabilities of a typical CPU, and these cores are able to work together on the same piece of software. This is known as parallel processing. While modern multi-core CPUs may run at a lower megahertz speed than past processors, their multiple cores allow them to process more workloads simultaneously. Therefore, the same amount of work can be achieved at lower speeds. A lower clock speed means lower power consumption, so multi-core CPUs are not only more powerful, they are also significantly more energy efficient. The multi-core CPUs in modern smartphones are able to achieve greater performance on fewer than five watts of power than desktop computers from less than a decade earlier achieved on hundreds of watts. Desktop and laptop computers sold today typically have two- or four-core CPUs; enthusiast, academic and professional computers can have CPUs with dozens of cores, or even hundreds in the case of corporate mainframes.

To take advantage of parallel processing, developers must write their code accordingly in order to distribute the software's workload among the available cores. Some algorithms, especially those with many interdependencies between intermediate results, can be hard or impossible to parallelize.
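The difference between work that splits cleanly and work that does not can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and a thread pool is used to show the structure of the split. In CPython, threads share one interpreter lock, so a real speed-up for this kind of arithmetic would need `ProcessPoolExecutor` or a language with native threads.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a list into n_chunks contiguous slices of nearly equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(values, n_workers=4):
    """Independent work: each chunk is summed without looking at the others."""
    chunks = split_into_chunks(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = pool.map(lambda c: sum(x * x for x in c), chunks)
        return sum(partial_sums)


def iterate_map(x0, steps):
    """Dependent work: step k needs the result of step k-1,
    so the loop cannot be handed out to several workers."""
    x = x0
    for _ in range(steps):
        x = (x * x + 1) % 1000003
    return x
```

The first function scales with the number of workers because the partial sums only meet at the very end; the second has to run one step after another no matter how many cores are available.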

Multiple CPUs
In the same way that a single CPU with multiple cores can distribute its workload, multiple CPUs, each with multiple cores, can further parallelize the task at hand. These multiple CPUs may all be housed in the same case on a common motherboard, or they may be spread out across multiple racks of cases that then communicate with each other over high-speed network links. Supercomputers use this technique, having sometimes hundreds of thousands of CPUs (and GPUs) working together on one task to achieve their huge processing power. Similarly, the vast server and render "farms", employed by companies like Google and Industrial Light and Magic, utilize hundreds of thousands of cooperating CPUs.

Because these techniques make many CPUs act as one, the only theoretical limit to a supercomputer's processing power is the physical space it can occupy. The practical limits are energy usage and heat dissipation. At its peak, the Oak Ridge National Laboratory Titan's 37,376 CPUs and GPUs require 8.2 megawatts of power (enough to power about 8000 homes). The Titan's cooling system has 6600 tons of capacity (a large house has about 3 tons).

Distributed computing platforms, like Folding@home and BOINC, allow home PC users to volunteer their unused processing power, through the internet, to globally networked projects. With distributed computing software, a home PC downloads and then calculates a small chunk of a much more massive shared data set. Once work on that small chunk has been completed, the result is uploaded into the global data set, and a new small chunk is downloaded to be worked on. In this manner, the unused power of millions of home PCs can be leveraged into solving extremely complex problems, like protein folding and the cataloguing of astronomical data. Bitcoin mining works in a broadly similar way.
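The download, compute, upload cycle described above can be sketched as a toy simulation. Nothing here reflects the actual Folding@home or BOINC client software: `ProjectServer`, `download_chunk`, `upload_result` and `volunteer` are hypothetical names, and the "server" is just an in-memory queue with no networking.

```python
from collections import deque


class ProjectServer:
    """Hypothetical stand-in for a project's central server (no network)."""

    def __init__(self, data, chunk_size):
        # Cut the shared data set into small work units.
        self.pending = deque(
            data[i:i + chunk_size] for i in range(0, len(data), chunk_size)
        )
        self.results = []

    def download_chunk(self):
        """Hand out the next unprocessed chunk, or None when none are left."""
        return self.pending.popleft() if self.pending else None

    def upload_result(self, result):
        """Store a finished result in the global result set."""
        self.results.append(result)


def volunteer(server, compute):
    """One home PC: download, compute, upload, repeat until work runs out."""
    completed = 0
    while (chunk := server.download_chunk()) is not None:
        server.upload_result(compute(chunk))
        completed += 1
    return completed
```

For example, `volunteer(ProjectServer(list(range(10)), 3), sum)` works through four chunks. In a real project many volunteers would run this loop at the same time against the same server.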

GPU acceleration
Graphics processing units (GPUs) are specialized processors that can quickly perform a limited set of operations on a large series of values at the same time (for example, pixels on a screen). This differs from the CPU, which is able to perform a much broader array of functions, but at a slower rate. As their name implies, GPUs are usually used for rendering graphic elements (as in video games) efficiently, but can also be used for machine learning or other data-driven tasks that are hugely parallelizable, yet mathematically simple. GPUs currently account for the majority of the processing power in nearly all of the world's supercomputers. Two of the fastest supercomputers, China's Tianhe-2 and the US's Titan, both utilize tens of thousands of GPUs, most of which are little more than scaled-up versions of the same add-in cards used in home PCs.

Miniaturization
Miniaturization of components makes it possible to have more components in the same chip, which in turn can be used to parallelize more tasks. There are some physical limits on miniaturization, because as the size of components decreases, quantum effects become more apparent.

Processors
Instructions connect computer software to hardware: they are the information sent to the processor to be interpreted. The types of processors include scalar processors, which interpret one instruction at a time, and superscalar processors, which can interpret several at once.

Superscalar is superior to scalar because more instructions can be interpreted at a time. Being able to process more instructions per machine cycle means that processes are performed relatively more quickly.
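A rough back-of-the-envelope calculation shows why instructions per cycle matter as much as the clock. The figures below are invented for illustration and do not describe any real chip; the result is a peak rate, which real programs rarely sustain.

```python
def instructions_per_second(clock_hz, instructions_per_cycle):
    """Peak throughput: clock ticks per second times instructions per tick."""
    return clock_hz * instructions_per_cycle


# Assumed, illustrative figures only.
scalar_chip = instructions_per_second(3.0e9, 1)       # 3.0 GHz, 1 per cycle
superscalar_chip = instructions_per_second(2.0e9, 3)  # 2.0 GHz, up to 3 per cycle

# The chip with the lower clock speed comes out ahead.
speedup = superscalar_chip / scalar_chip
```

Under these assumptions the chip running 1000 MHz slower has twice the peak throughput, which is the megahertz myth in miniature.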

Bus and memory speed
RAM is orders of magnitude slower than the CPU, and peripherals like drives and networking adapters are slower still. In modern computers, the CPU spends a good deal of time waiting for data to be read from or written to other components. Faster memories, more and bigger CPU caches, and faster buses are used to try to chip away at this delay.
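To get a feel for the cost of waiting, the delay can be converted into clock cycles. The clock speed and latencies below are assumed round numbers for illustration, not measurements of any particular machine.

```python
def stall_cycles(clock_mhz, latency_ns):
    """Clock cycles that tick by while the CPU waits latency_ns for data.

    One MHz is one cycle per microsecond, i.e. 1/1000 of a cycle per ns.
    """
    return clock_mhz * latency_ns / 1000


# Assumed, illustrative figures only.
ASSUMED_CLOCK_MHZ = 3000       # a 3 GHz CPU
ASSUMED_RAM_LATENCY_NS = 100   # one trip to main memory
ASSUMED_CACHE_LATENCY_NS = 1   # one hit in a fast on-chip cache

ram_wait = stall_cycles(ASSUMED_CLOCK_MHZ, ASSUMED_RAM_LATENCY_NS)
cache_wait = stall_cycles(ASSUMED_CLOCK_MHZ, ASSUMED_CACHE_LATENCY_NS)
```

With these numbers a single trip to RAM costs 300 cycles against 3 for a cache hit, and raising the clock speed only makes the RAM figure larger.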

Other techniques
Other techniques that have been employed to improve computing performance are:
 * Register renaming, to overcome the difficulty of exploiting instruction-level parallelism caused by the scarcity of registers in some instruction sets (like those of the x86 architecture).
 * Out-of-order execution, to avoid wasting cycles by executing instructions in a different order while making sure the semantics of the program remain the same.
 * Addition of specialized instructions that are optimized for performing some tasks more quickly than they would be if these operations were manually performed by the binary code. Compilers can recognize these patterns in programs and generate machine code that makes use of these specialized instructions. For example, POPCNT, introduced alongside SSE4, counts the number of 1 bits in a numeric value. Other instructions can speed up common cryptographic or media encoding/decoding operations.
 * Cooling, to get rid of the waste heat produced by the components, which may damage them at worst or reduce their lifespan at best. This is a particular problem in mobile devices such as smartphones and tablets, whose hardware is designed for power efficiency and crammed into a small space, and which, except for some niche devices designed for gaming, generally lack active cooling systems.
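As an illustration of the specialized-instruction point in the list above, here is what counting 1 bits "manually" looks like, next to a version that hands the job to the runtime. This is a sketch for non-negative integers only. Python itself makes no promise to use POPCNT; the point is the shape of the loop that a single instruction replaces.

```python
def popcount_loop(value):
    """Count 1 bits one at a time, as code without a dedicated
    instruction would: one shift, mask and add per bit."""
    count = 0
    while value:
        count += value & 1
        value >>= 1
    return count


def popcount_builtin(value):
    """Delegate the count to built-in string routines instead of a
    hand-written loop (not necessarily a hardware POPCNT)."""
    return bin(value).count("1")
```

For a 64-bit value the loop may run up to 64 iterations, while a hardware population-count instruction produces the same answer in a single operation.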