Server Tech

Server Processor basics – Multi Processors Multi Cores and Multi Threads

When it comes to server processors, vendors use a number of easily confused terms – multiple processors, multiple sockets, multiple cores, multiple threads, etc. In this article, let us decipher what these terms mean, along with their implications for the applications that run on servers.

The necessity for Multiple Processors/ Cores:

A server processor (CPU) is one of the main components of a server: it performs all the computations required to complete the various tasks assigned to the machine. As you can guess, processor performance (speed, determined by the clock frequency at which it works – for example, 1.5 GHz) plays an important role as applications become more and more demanding.

But a single processor has its performance limits. Previously, performance was increased by raising the clock frequency. But when vendors tried to push clock frequencies much beyond 3 GHz, the heat generated inside the processor made normal operation impossible.

So, the practical way to increase the performance capacity of a server was to add more processing units and make them all work in tandem. This brought about three different innovations – multiple processors (in multiple sockets), multiple cores (within each socket) and multiple threads (within each core).

Multiple Processors (In multiple sockets):

This is the simplest term. Multiple processors mean just that – multiple physical processors in a single server that work together to complete computational tasks. Each processor sits in its own socket on the motherboard and has its own integrated circuit, cache memory and bus interface. One socket holds one processor, and the processors are connected together by high-speed bus circuitry. So, a dual-processor server is a server that can hold a maximum of two processors.

Every server has a limit on the maximum number of processors it can accommodate (single socket, 2 sockets, 4 sockets, etc.). Adding processors is a direct way to increase server performance, though it is not the most cost-effective one. Still, the number of servers required for an application comes down when multiple processors are inserted into the same server.

Multiple Cores (Within each socket):

Every processor is made of integrated circuits. With improvements in integrated-circuit technology, it became possible to pack more and more components (transistors, etc.) into a processor die. So, the circuitry required for two (or more) separate processors was integrated into one die to make multi-core processors. Each core is logically separate from the others, so the components belonging to individual cores within a single die do not functionally overlap.

The main difference between multiple cores (on a single processor die) and multiple physical processors, despite their similar processing capacity, is that multiple cores fit into a single socket and share certain processor resources such as cache memory and the bus interface. With multi-core processors, servers save a lot of space while still increasing performance.

Though the performance of multiple cores does not quite match that of the same number of individual processors, it is considerably better than a single-core processor (at least a 50% improvement for each additional core). Another advantage is that the combined cores achieve higher performance using less power per core than separate processors would. But some processing power of a multi-core processor is reserved for managing communication between the cores. Dual-core, quad-core, 6-core, 12-core (and even 96-core) processors are available today on a single die.
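The diminishing returns described above are often quantified with Amdahl's law (a standard model, not named in the article): if a fraction p of a program can run in parallel, n cores give a speedup of 1 / ((1 - p) + p / n), not a full n-fold improvement.

```python
def amdahl_speedup(p, n):
    """Speedup on n cores when fraction p of the work is parallelizable (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n)

# With 90% of the work parallelizable, 4 cores give roughly 3.1x, not 4x:
print(round(amdahl_speedup(0.9, 4), 2))  # 3.08
```

The serial fraction (1 - p) includes exactly the kind of inter-core coordination overhead the paragraph mentions.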

Multiple Threads (Within each core):

When an individual core within a processor die executes instructions, it normally does not utilize the core's entire capacity. So, a concept called multi-threading was introduced. With multi-threading, each core can execute two (or more) instruction streams (threads) simultaneously. By multi-threading, the otherwise idle resources within each core are utilized more effectively, and the performance of the processor increases.

So, if a server has two processors, each processor is dual-core, and each core supports two threads, the server can execute eight threads simultaneously at any given point in time. Though more threads per core are possible, the most common configuration in the industry is two threads per core (dual threading).
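The arithmetic above can be checked in a few lines (the socket/core/thread counts are the example's values, not queried from hardware):

```python
import os

# The example configuration from the text: 2 sockets x 2 cores x 2 threads
sockets, cores_per_socket, threads_per_core = 2, 2, 2
logical_cpus = sockets * cores_per_socket * threads_per_core
print(logical_cpus)      # 8 threads executable simultaneously

# On a real machine, os.cpu_count() reports the logical CPU total directly:
print(os.cpu_count())
```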

Software/ Applications for Multiple Processors/ Cores/ Threads:

One important requirement to utilize the full capacity of multiple processors, cores and threads is that the operating system should support Symmetric Multi-Processing (SMP) and the applications should be designed to take advantage of parallel processing. If not, the applications behave as if they were running on a single thread/core/processor.

It's important for application developers to write applications that can allocate individual processes to individual threads/cores, and many programming languages support this. There are also libraries and compilers provided by the server processor manufacturers that help create such applications.
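As a minimal sketch of such parallelism (using Python's standard library for illustration), a CPU-bound task can be fanned out across cores with a process pool; the operating system then spreads the worker processes over the available cores:

```python
from concurrent.futures import ProcessPoolExecutor

def sum_of_squares(n):
    """CPU-bound work unit that can run independently on any core."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    inputs = [100_000, 200_000, 300_000, 400_000]
    # One worker per input; with 4+ cores these run genuinely in parallel.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(sum_of_squares, inputs))
    print(results)  # same answers as a serial loop, computed concurrently
```

Without such an explicit parallel structure, the program would run on a single core no matter how many the server has – exactly the situation the previous paragraph warns about.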

With some applications, users (or operating systems) can allocate (or dedicate) resources (processor capacity, RAM, etc.) to particular processes. Even individual cores can be dedicated to certain applications. For example, an operating system can allocate a separate core for virus scanning while the main application runs on the other three cores. Even if the virus scan runs into problems, only that core hangs, without affecting the application running on the other cores.
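Dedicating cores like this is done at the operating-system level. On Linux, for instance, a process's CPU affinity can be set with os.sched_setaffinity (a Linux-only call, so it is guarded below; this is one possible mechanism, not the only one):

```python
import os

def pin_to_cores(cores):
    """Restrict the current process to the given set of core numbers (Linux only)."""
    if hasattr(os, "sched_setaffinity"):   # not available on Windows/macOS
        os.sched_setaffinity(0, cores)     # 0 = the calling process
        return os.sched_getaffinity(0)     # report the cores now allowed
    return None                            # affinity control unsupported here

# Pin this process to core 0, leaving the other cores for the main application:
print(pin_to_cores({0}))
```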

Processor/ Core Interconnect Fabric/ L2 Cache Optimization:

Since multiple cores (and even multiple processors) need to communicate with each other constantly, it is important to optimize the interconnect fabric (for example, by increasing its capacity or connecting the units in a mesh configuration) that links the various cores/processors with the memory units.

Even the L2 cache (the memory unit that stores frequently used data and instructions close to the processor during computation) is generally optimized by processor vendors. For example, data can be stored in a common L2 cache accessible to all the cores/processors, so that when the value of a variable changes, the new value is available to the next core or processor that accesses it. The amount of L2 cache in such designs is increased to accommodate the needs of multiple cores/processors.
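The "new value is visible to the next accessor" property has a direct software-level analogue (this is an analogy in Python threads, not the hardware cache-coherence protocol itself): several threads update one shared variable, and synchronization guarantees each reader sees the latest complete update.

```python
import threading

counter = 0
lock = threading.Lock()

def bump(times):
    """Each thread updates the shared value; the lock ensures every update is
    applied whole and is visible to whichever thread accesses it next."""
    global counter
    for _ in range(times):
        with lock:
            counter += 1

threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000 - no updates lost
```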

64 bit Vs 32 bit Applications:

Certain applications are written for 64-bit processors. A 64-bit processor can operate on much larger numbers in a single instruction and can address far more memory, so the capacity and speed of application processing improve when using a 64-bit application. But to take advantage of 64-bit applications, the operating system version must also support 64-bit operation, and so must the processor and the device drivers.

Most server processors support 64-bit by default and are backward compatible with 32-bit applications as well. But not every 32-bit application runs smoothly on a 64-bit processor. A 64-bit processor/application/OS can take advantage of practically any amount of RAM, while most 32-bit applications support a maximum of 4 GB of RAM (in practice, slightly less).
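The 4 GB ceiling follows directly from pointer width: a 32-bit address can name at most 2^32 distinct bytes, while a 64-bit address can name 2^64.

```python
GiB = 2 ** 30

addressable_32 = 2 ** 32   # bytes reachable with a 32-bit pointer
addressable_64 = 2 ** 64   # bytes reachable with a 64-bit pointer

print(addressable_32 // GiB)   # 4  -> the familiar 4 GB limit
print(addressable_64 // GiB)   # 17179869184 GiB (16 EiB, a theoretical ceiling)
```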

You could stay up to date on the various computer networking and related IT technologies by subscribing to this blog with your email address in the sidebar box that says, ‘Get email updates when new articles are published’


  • Terry Holmes

    I haven’t found any good explanation about memory buses on multiple processor systems. Do all the chips share the system RAM in one contiguous memory space? Is only one processor at a time allowed to read/write on a common memory bus? This makes a big difference for jobs that need to process many GBs of data which will be hampered if the processors are contending for bus time.

    • jeremy

      Before roughly 2008, both processors would access the same memory bank, but even then it was through a chipset. The chipset could have up to 4 channels, and on a high-end chipset each channel could have 4 DIMMs (RAM sticks). Each RAM stick can be read or written concurrently by each CPU, just not at the same address by both CPUs (if one tries to write and the other tries to read the same location, they'll basically stall on each other).