Dali Ismail, firstname.lastname@example.org (A paper written under the guidance of Prof. Raj Jain)
Multicore Central Processing Units (CPUs) are becoming the standard for the current era of processors because of the significant level of performance they offer. Multicore architectures differ in design and in the level of performance they deliver, so it becomes necessary to compare them to make sure that the performance aligns with the expected specifications. This paper surveys the proper ways of selecting multicore CPU performance evaluation techniques, examples of metrics used in multicore CPU performance studies, factors that affect the performance of multicore CPUs, and the benchmarks that target multicore CPU aspects of performance.
Keywords: Multicore CPU, Performance analysis, Performance evaluation, Benchmarks, Evaluation techniques, Server virtualization, Performance measurement.
Moore’s law of on-chip performance doubling has become the standard for progress in the computer and semiconductor industry. Memory capacity and CPU speed are two of the many characteristics of digital electronic devices that Moore’s law expects to improve approximately every 18 months [Moore's Law]. Multicore CPUs have evolved in this process and have become an essential part of our daily life. They are implemented in a variety of devices, and their performance varies considerably by application. This situation draws the attention of researchers to evaluating CPU performance rigorously in order to obtain higher performance at lower cost. The following section discusses the need for multicore processors and the need for performance analysis.
The high processing speed achieved by multiprocessors (multiple CPUs on different chips attached to the same motherboard) comes with undesirably high power consumption. As a result, research trends encouraged the production of multicore CPUs in order to reduce power consumption while simultaneously increasing processing speed. The architecture of multicore CPUs gives demanding applications and devices speed and performance at lower power consumption.
Performance analysis characterizes the performance of a system and is required at every stage of the computer system life-cycle to ensure high performance at a given cost. The demand for performance analysis was driven by radical changes in a number of elements, including: (a) present-day computer users are more demanding than computer users 20 years ago; (b) the popularity of computer technology has flooded the computer market with different manufacturers, whose products differ in performance. Such changes require performance analysis that meets users' demands and helps select the alternative that provides the highest performance at a given cost, weighing the trade-offs between what each alternative provides and the required criteria.
In this paper we present the different ways used to evaluate multicore CPU performance. The goal of the paper is to help the reader understand the proper method for selecting suitable evaluation techniques, metrics, and measurements for multicore CPUs. Section 2 covers the techniques used to evaluate multicore CPUs and the considerations that should guide their selection, along with some of the metrics used in the analysis of multicore CPUs and the factors that commonly affect multicore CPU performance. Section 3 explores the benchmarks used in measuring multicore CPU performance, and section 4 provides an example of performance analysis of multicore CPUs.
This section explores the different methods used in evaluating multicore CPU performance, along with the metrics and the factors that cause performance to vary with requirements.
The first step in performance evaluation is to select the proper evaluation technique. The main techniques are analytical modeling, simulation, and measurement. In evaluating multicore CPU performance, the choice of technique depends on several considerations. However, we cannot trust the result of one technique unless we validate it with another. For example, we cannot trust the result of analytical modeling without validating it with simulation or measurement. That is, we require at least two techniques to get an accurate result. The considerations for selecting the appropriate technique are listed in Table 1 [Jain91].
Table 1: Evaluation Techniques Criteria
Table 1 shows the considerations in order of importance; in all cases, the result may be wrong or misleading. For example, analytical modeling can be done at any stage of the system life-cycle. It typically takes less time than simulation or measurement and needs no dedicated tools for analysis; unfortunately, it may give less accurate results. However, trade-off evaluation is easy, and it costs less in terms of capital than the other techniques. Unfortunately, the saleability of products backed only by analytical modeling results is low. Once the evaluation techniques and the criteria for each technique are known, the selection should be based on those criteria and on the trade-offs that can be made among them to get the required result.
Performance metrics are measurements of system performance or activity [Metrics]. The selection of metrics depends on the services provided by the system, because metrics quantify the required output of the system. Metrics can be classified in three main classes:
This list is just an example of the metrics used in multicore CPU performance evaluation. In general, metrics are related to three criteria, (1) time, (2) rate, and (3) resources, which can be measured and used to determine system performance. Performance metrics vary with the services provided by the system. In multicore CPU systems, metrics can be chosen based on the purpose of the performance analysis and on the performance a program requires in order to run efficiently and take advantage of the multicore architecture.
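The three criteria above (time, rate, and resources) can be illustrated with a minimal Python sketch. The record type `RunSample`, its field names, and the sample values are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class RunSample:
    """One measured run. All field names are hypothetical, chosen for illustration."""
    jobs_completed: int   # jobs finished during the run
    elapsed_s: float      # wall-clock duration of the run, in seconds
    busy_core_s: float    # core-seconds spent doing work, summed over all cores
    cores: int            # number of cores available to the workload


def time_per_job(s: RunSample) -> float:
    """Time metric: average wall-clock seconds per completed job."""
    return s.elapsed_s / s.jobs_completed


def throughput(s: RunSample) -> float:
    """Rate metric: jobs completed per second."""
    return s.jobs_completed / s.elapsed_s


def utilization(s: RunSample) -> float:
    """Resource metric: fraction of the available core time that was busy."""
    return s.busy_core_s / (s.cores * s.elapsed_s)


# Made-up numbers: 120 jobs in 60 s on 4 cores, 180 core-seconds of work.
sample = RunSample(jobs_completed=120, elapsed_s=60.0, busy_core_s=180.0, cores=4)
print(time_per_job(sample), throughput(sample), utilization(sample))
```

Note that `time_per_job` here is an average over the whole run, not the response time of an individual job; a study that cares about per-job latency would need per-job timestamps instead.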
Factors are the performance parameters whose effects on the system we want to study. The choice of factors also depends on the performance required to utilize the CPU and get the expected outcome from it. This subsection gives examples of factors that affect multicore CPU performance.
These are some examples of the factors affecting multicore CPU performance. For each factor under study, we define ways to optimize performance by analyzing the effect of the factor and interpreting the result to reach the expected optimal performance.
Multicore CPUs are designed for a variety of applications (virtualization, games, and embedded systems). With this kind of diversity, measuring the performance of multicore systems has become a necessity to ensure that the performance is delivered as required by the system. Different tools are used to measure multicore CPU performance. Profiling tools, for example, are used to monitor and observe the behavior of system performance rather than to measure it. Measuring the elapsed time of processes with multiple threads on multicore architectures yields high-level information about processing speed and about the performance gained from parallelization [Prinslow11]. Benchmark tools often produce measurements that are more relevant and accurate for system profiling. In this section we introduce the benchmarking approaches used to measure multicore CPU performance.
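The elapsed-time measurement described above can be sketched as follows. This is a minimal illustration, not a benchmark: the workload `burn` and the task sizes are hypothetical, and the sketch uses worker processes rather than threads because CPU-bound threads in standard CPython are serialized by the global interpreter lock.

```python
import os
import time
from multiprocessing import Pool


def burn(n: int) -> int:
    """CPU-bound stand-in workload: sum of squares below n (illustrative only)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def elapsed(fn, *args) -> float:
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def run_serial(sizes):
    """Run every task on one core, one after another."""
    return [burn(n) for n in sizes]


def run_parallel(sizes, workers):
    """Spread the same tasks over a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(burn, sizes)


def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of serial to parallel elapsed time; above 1 means the parallel run was faster."""
    return t_serial / t_parallel


if __name__ == "__main__":
    sizes = [300_000] * 8            # hypothetical task sizes
    workers = os.cpu_count() or 1    # one worker per reported core
    t1 = elapsed(run_serial, sizes)
    tn = elapsed(run_parallel, sizes, workers)
    print(f"serial {t1:.3f}s, {workers} workers {tn:.3f}s, speedup {speedup(t1, tn):.2f}")
```

For tasks this small, the cost of starting the worker processes can outweigh the gain, so the measured speedup may be below 1. That overhead is part of what elapsed time captures and is one reason a single timing run gives only high-level information.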
Multicore CPU benchmarks must target the aspect of concurrency from the parallelism perspective, which can be represented by the throughput of data and computational workloads [Levy09]. This subsection lists two general benchmarks that have been used to measure multicore CPU performance.
More recently, following the evolution of mobile devices and computers, power efficiency has become an important aspect that requires new approaches to measuring multicore CPU performance. Benchmark companies have developed new benchmarks that take power into account when measuring performance. This subsection lists two power benchmarks, which can be considered the industry-standard power benchmarks.
Benchmark results for multicore CPU performance depend on the tests that the benchmark runs to measure performance for specific applications. By stating the reason for each measurement, we can relate the performance of different multicore CPUs to each other based on the benchmark used to exercise and measure them.
This section introduces an example of the performance analysis process for multicore CPUs that can assist in selecting the proper CPU for a machine specification.
Server virtualization of multicore CPUs: The Intel IT (Information Technology) team evaluated server performance on three servers based on Intel multicore CPUs (a four-socket server based on the Quad-Core Intel Xeon CPU X7350 with 16 cores, a dual-socket server based on the Quad-Core Intel Xeon CPU X5355 with 8 cores, and a dual-socket server based on the Dual-Core Intel Xeon CPU 5160 with four cores) [Carpenter07]. In comparing the performance of the multicore CPUs, the Intel IT team targeted the speed of the CPUs and their power efficiency, which has become a major concern, as mentioned earlier in section one of this paper [Carpenter07]. Because of differences in CPU clock speed, run time was used to measure performance on each CPU, and the data was normalized. The normalized workload consists of VMs (Virtual Machines) with a copy of a synthetic CPU-intensive DB application in each VM. The W-M/Job (watt-minutes per job) metric was used to measure CPU power efficiency under an increasing workload in order to test the scalability of the CPUs [Carpenter07]. The results from [Carpenter07] show that the three servers exhibit different levels of scalability in terms of power consumption. As the number of VMs increases, the run time remains constant until the number of VMs equals the number of cores. Once the number of VMs exceeds the number of cores, the run time begins to increase. Figure 1 shows the run times of the servers based on Intel multicore CPUs.
Figure 1: Run Time
Figure 2: Power Consumption
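The W-M/Job metric and the run-time behavior described in this example can be sketched in Python. This is a rough illustration under stated assumptions, not the method used in [Carpenter07]: it assumes W-M/Job is average power (watts) multiplied by run time (minutes) and divided by the number of jobs, it assumes one job per VM, and it models run time with an idealized rule that is flat until the number of VMs reaches the number of cores and grows proportionally after that. The power and run-time values are hypothetical.

```python
def watt_minutes_per_job(avg_power_w: float, run_time_min: float, jobs: int) -> float:
    """Assumed definition: energy in watt-minutes divided by the jobs completed."""
    return avg_power_w * run_time_min / jobs


def modeled_run_time(base_min: float, vms: int, cores: int) -> float:
    """Idealized model (an assumption): flat while vms <= cores, then proportional growth."""
    return base_min * max(1.0, vms / cores)


if __name__ == "__main__":
    cores = 8              # e.g. a dual-socket quad-core server
    base_min = 10.0        # hypothetical run time of one job on a free core
    avg_power_w = 400.0    # hypothetical average power, held constant for simplicity
    for vms in (4, 8, 16):
        run_time = modeled_run_time(base_min, vms, cores)
        wm_job = watt_minutes_per_job(avg_power_w, run_time, jobs=vms)
        print(f"{vms:2d} VMs: run time {run_time:5.1f} min, {wm_job:7.1f} W-M/Job")
```

Holding average power constant is a simplification; on a real server, power draw changes with load, so measured W-M/Job values would follow a different curve than this model produces.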
This section introduced an example of how performance analysis works. By comparing multicore CPUs, we can use the results to make proper decisions when selecting the appropriate CPU for a required performance level.
As a result of Moore’s law [Moore's Law], CPU performance is increasing rapidly. The number of cores on the chip increases with each new generation of CPU. Multicore CPUs are becoming not only faster but also more power efficient, and which of the two matters more depends on the demand. To keep the profiling of these systems relevant, we need to evaluate the CPU according to the workload we expect it to process. We have addressed the techniques used to evaluate multicore CPU performance, the metrics, the factors, and the benchmark tools used to measure multicore CPU performance, and we have provided an example of performance evaluation for multicore CPUs. Different approaches are used in multicore CPU performance analysis depending on the purpose of the study. We expect to see new approaches in multicore CPU performance analysis as multicore CPU production increases to new levels.