What is system performance? Explain its attributes in detail.
System Performance: Meaning and Key Attributes
System performance in computer architecture describes how effectively a computer system completes work for a given workload. It focuses on how fast tasks finish, how many tasks can be handled at once, how consistently the system responds, and how efficiently resources like CPU, memory, and power are used. In simple terms, higher performance means shorter execution time, higher capacity, and better user experience.
Basic Idea (Time vs. Work)
- Execution time (latency): Time taken to finish one task or program.
- Throughput (work rate): Number of tasks completed per unit time.
- General rule: Higher performance = lower time per task and/or more tasks per second.
Performance ∝ 1 / Execution time
1) Latency (Response Time)
Latency is the total time from when a request is made to when the result is delivered. It matters most for interactive tasks (e.g., typing, gaming, browsing).
- Includes CPU time, memory access delays, I/O waits, and queuing delays.
- Tail latency (p95/p99) captures the slowest responses, i.e., the latency experienced by the worst 5% or 1% of requests; it often determines perceived responsiveness.
- Jitter is variability in response time; lower jitter means smoother performance.
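A quick sketch of how mean and tail latency can diverge, using made-up latency samples and a simple nearest-rank percentile (illustrative helper, not a standard library function):

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: smallest value >= p% of the samples."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * p // 100))  # ceiling division, 1-based rank
    return ordered[int(rank) - 1]

# Hypothetical latency samples in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 14, 13, 250, 12, 16, 13, 12]

mean = statistics.mean(latencies_ms)
p95 = percentile(latencies_ms, 95)
print(f"mean = {mean:.1f} ms, p95 = {p95} ms")  # mean = 36.8 ms, p95 = 250 ms
```

The mean looks acceptable, but the p95 reveals that some users wait far longer, which is why tail latency is tracked separately.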
2) Throughput (Bandwidth)
Throughput is how many operations the system completes per unit time (e.g., requests/second, transactions/second).
- Improved by parallelism, pipelining, batching, and higher resource capacity.
- Latency and throughput are related but not identical; a system can have high throughput but still high latency if requests queue up.
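One standard way to relate the two metrics (an addition for illustration, not stated above) is Little's law, L = λ × W: the average number of requests in the system equals throughput times average latency.

```python
# Little's law: requests in flight (L) = throughput (λ) × average latency (W).
# Illustrative numbers: 200 req/s completed with 50 ms average latency.
throughput_rps = 200.0   # completed requests per second (λ, steady state)
avg_latency_s = 0.050    # average time a request spends in the system (W)

in_flight = throughput_rps * avg_latency_s
print(in_flight)  # 10.0 requests in the system at any instant, on average
```

If requests queue up, W grows even while λ stays high, which is exactly the "high throughput but high latency" case described above.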
3) CPU Execution Metrics
CPU performance depends on instruction count, cycles per instruction (CPI), and clock rate.
CPU time = Instruction count × CPI × Clock cycle time
= (Instruction count × CPI) / Clock rate
IPC (Instructions Per Cycle) = 1 / CPI
- Instruction count depends on the compiler and ISA.
- CPI depends on microarchitecture (pipelining, superscalar, caches, branch prediction).
- Clock rate alone does not guarantee better performance if CPI or memory stalls increase.
- IPS/MIPS are only meaningful for the same program and ISA; do not compare across architectures.
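The CPU time equation above can be checked directly. A minimal sketch with illustrative numbers, showing why a higher clock rate alone does not guarantee better performance:

```python
def cpu_time(instruction_count, cpi, clock_hz):
    """CPU time = (instruction count × CPI) / clock rate."""
    return instruction_count * cpi / clock_hz

# Same program (same instruction count and ISA) on two hypothetical machines:
t_a = cpu_time(2e9, 1.5, 2e9)  # 2 GHz with CPI 1.5 -> 1.5 s
t_b = cpu_time(2e9, 2.5, 3e9)  # 3 GHz but CPI 2.5  -> ~1.67 s (slower!)
print(t_a, t_b)
```

Machine B has a 50% faster clock but loses because its CPI is much worse, matching the caution above.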
4) Memory System Performance
Memory speed often limits overall performance due to the CPU–memory gap.
- Cache hit rate: Fraction of accesses served by cache.
- Miss penalty: Extra time when data is not in the cache.
- Average Memory Access Time (AMAT):
AMAT = Hit time + Miss rate × Miss penalty
- Improve by increasing locality, larger or smarter caches, prefetching, and reducing miss penalty (e.g., higher memory bandwidth).
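The AMAT formula in use, with illustrative cache parameters:

```python
def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    """AMAT = hit time + miss rate × miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# Hypothetical L1 cache: 1 ns hit time, 5% miss rate, 100 ns miss penalty.
print(amat(1.0, 0.05, 100.0))  # 6.0 ns
```

Even a small miss rate dominates the average because the penalty is two orders of magnitude larger than the hit time, which is why reducing misses pays off so well.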
5) I/O and Storage Performance
- Latency: Time to start and complete a read/write.
- IOPS: Input/Output operations per second (small random accesses).
- Throughput: MB/s for large sequential transfers.
- Queue depth and scheduling affect both latency and throughput.
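IOPS and MB/s describe the same device differently depending on access size. A sketch with made-up device figures:

```python
def mb_per_s(iops, block_size_kb):
    """Throughput implied by an IOPS figure at a given block size."""
    return iops * block_size_kb / 1024.0

# Hypothetical SSD: 20,000 IOPS of 4 KiB random reads is only ~78 MB/s,
# while 500 IOPS of 1 MiB sequential reads is 500 MB/s.
print(mb_per_s(20000, 4))   # 78.125
print(mb_per_s(500, 1024))  # 500.0
```

This is why random-access workloads are quoted in IOPS and large sequential transfers in MB/s.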
6) Parallel Performance and Scalability
How well performance improves when we add more cores/nodes.
- Speedup: How many times faster a parallel version is than the serial version.
Speedup(N) = T_serial / T_parallel(N)
Efficiency(N) = Speedup(N) / N (ideal = 1)
Amdahl's Law: Speedup_max = 1 / (f_serial + (f_parallel / N))
- Limits come from non-parallel parts, communication, synchronization, and load imbalance.
- Strong scaling: same problem size, more processors. Weak scaling: larger problem with more processors.
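Amdahl's law from above in code, with an assumed 10% serial fraction, showing the diminishing returns from non-parallel parts:

```python
def speedup_amdahl(f_serial, n):
    """Amdahl's law: Speedup_max = 1 / (f_serial + (1 - f_serial) / N)."""
    return 1.0 / (f_serial + (1.0 - f_serial) / n)

def efficiency(speedup, n):
    """Efficiency = Speedup / N (ideal = 1)."""
    return speedup / n

# With 10% serial work, speedup saturates well below the core count.
for n in (4, 16, 1024):
    s = speedup_amdahl(0.10, n)
    print(f"N={n:5d}  speedup={s:6.2f}  efficiency={efficiency(s, n):.3f}")
```

Even with 1024 processors the speedup stays below 10, because the serial fraction caps it at 1 / f_serial.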
7) Resource Utilization
Measures how busy resources are (CPU, memory, disk, network).
- High utilization can increase throughput but may increase queuing delays and latency.
- Good performance balances utilization without saturating bottlenecks.
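The latency cost of high utilization can be illustrated with the classic M/M/1 queue model (an assumption added here for illustration, not part of the text above), where average response time is W = 1/(μ − λ):

```python
# M/M/1 queue: as utilization ρ = λ/μ approaches 1, average response time
# W = 1/(μ - λ) grows without bound, even though throughput keeps rising.
service_rate = 100.0  # requests/s the server can complete (μ), assumed

for arrival_rate in (50.0, 90.0, 99.0):  # offered load λ, illustrative
    utilization = arrival_rate / service_rate
    latency_ms = 1000.0 / (service_rate - arrival_rate)
    print(f"rho={utilization:.2f}  avg latency={latency_ms:.0f} ms")
```

Going from 50% to 99% utilization doubles throughput but multiplies average latency fifty-fold, which is the queuing-delay trade-off described above.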
8) Reliability, Availability, and Serviceability (RAS)
- Reliability: How rarely the system fails (fewer faults, longer Mean Time Between Failures).
- Availability: Fraction of time the system is up and usable.
Availability = MTBF / (MTBF + MTTR)
- Serviceability: Ease of diagnosing and fixing faults.
- Techniques: redundancy, error correction, failover, graceful degradation.
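The availability formula in use, with illustrative failure and repair times:

```python
def availability(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical system: fails once every 1000 hours, takes 1 hour to repair.
a = availability(1000.0, 1.0)
print(f"{a:.5f}")  # 0.99900, i.e. roughly "three nines"
```

Note that availability improves either by failing less often (larger MTBF) or by repairing faster (smaller MTTR), which is where serviceability ties in.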
9) Power and Energy Efficiency
Important for mobile and large-scale systems.
- Power: rate of energy use (Watts); Energy: total consumption (Joules) per task.
- Performance/Watt and Energy per operation (J/op) are key metrics.
Dynamic power ∝ C × V^2 × f
Energy = Power × Time
Energy-Delay Product (EDP) = Energy × Delay (balances speed and efficiency)
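A sketch with made-up constants showing how the three relations interact: halving the frequency at proportionally lower voltage cuts dynamic power about 8×, but the task takes twice as long, so energy per task drops only about 4× and EDP only 2×.

```python
def dynamic_power(c, v, f):
    """Dynamic power ∝ C × V^2 × f (proportionality constant folded into C)."""
    return c * v**2 * f

c = 1e-9  # effective switched capacitance in farads, assumed for illustration
p_fast = dynamic_power(c, 1.0, 3e9)    # 1.0 V at 3.0 GHz
p_slow = dynamic_power(c, 0.5, 1.5e9)  # 0.5 V at 1.5 GHz

t_fast, t_slow = 1.0, 2.0                       # task time doubles at half clock
e_fast, e_slow = p_fast * t_fast, p_slow * t_slow   # Energy = Power × Time
edp_fast, edp_slow = e_fast * t_fast, e_slow * t_slow  # EDP = Energy × Delay
print(p_fast / p_slow, e_fast / e_slow, edp_fast / edp_slow)  # 8.0 4.0 2.0
```

The fast setting still wins on EDP here, which is why EDP is used when raw energy savings must be weighed against lost speed.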
10) Cost and Price–Performance
- Cost per unit performance (e.g., Rs per transaction/sec) matters for practical deployments.
- Total Cost of Ownership (hardware, power, cooling, maintenance) should be considered.
11) Quality of Service (QoS)
- Meeting deadlines, maintaining tail-latency targets, and ensuring fairness across users.
- Important for real-time and multi-tenant systems.
Putting It Together (Example)
- If a program has 1×10^9 instructions, CPI = 1.5, and clock = 3 GHz:
CPU time = (1e9 × 1.5) / (3e9) = 0.5 seconds
- Reducing CPI to 1.0 lowers CPU time to 0.33 s, but if cache misses increase AMAT, the real gain may be smaller. Always measure with realistic workloads.
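The worked example above, computed directly (CPU time only; memory effects such as a worse AMAT are deliberately not modeled):

```python
def cpu_time(instructions, cpi, clock_hz):
    """CPU time = (instruction count × CPI) / clock rate."""
    return instructions * cpi / clock_hz

print(cpu_time(1e9, 1.5, 3e9))  # 0.5 s
print(cpu_time(1e9, 1.0, 3e9))  # ~0.333 s
```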
Key Takeaways
- System performance is multi-dimensional: latency, throughput, CPU efficiency, memory and I/O behavior, scalability, reliability, and energy all matter.
- Optimizing one attribute may hurt another; choose metrics that match the workload and user goals.
- Use clear formulas and representative benchmarks to evaluate changes accurately.
