Computer Architecture

2025/9/23 20:02:22 / Source: https://www.cnblogs.com/xcyle/p/19107954

System Evaluation Metrics

Cost Metrics

The cost of a chip includes:

  • Design cost: non-recurring engineering (NRE) cost, which amortizes well at high volume;
  • Manufacturing cost: depends on area;
    • Manufacturing semiconductor chips: Ingot → Wafer → Die (unpackaged chip) → Chip
    • To measure the production efficiency of semiconductor manufacturing, we use the metric yield: the fraction of dies on a wafer that work correctly.
  • Testing cost: depends on yield and test time;
  • Packaging cost: depends on die size, number of pins, power delivery, ...
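
As a rough illustration of how these terms combine, the sketch below amortizes NRE over the production volume and spreads the wafer cost over the good dies only. The function names and all numbers are hypothetical, and the cost model (no testing or packaging cost, yield given as an input) is a simplifying assumption rather than something stated above.

```python
def cost_per_good_die(wafer_cost, dies_per_wafer, die_yield):
    """Manufacturing cost of one working die: wafer cost spread over good dies only."""
    good_dies = dies_per_wafer * die_yield
    return wafer_cost / good_dies


def cost_per_chip(nre_cost, volume, wafer_cost, dies_per_wafer, die_yield):
    """Amortized design (NRE) cost plus manufacturing cost per chip."""
    return nre_cost / volume + cost_per_good_die(wafer_cost, dies_per_wafer, die_yield)


# Hypothetical numbers: $10M NRE, $5,000 wafer, 500 dies per wafer, 80% yield.
low_volume = cost_per_chip(10e6, 10_000, 5_000, 500, 0.8)
high_volume = cost_per_chip(10e6, 10_000_000, 5_000, 500, 0.8)
print(f"{low_volume:.2f} vs {high_volume:.2f} dollars per chip")
```

At low volume the NRE term dominates the per-chip cost; at high volume the manufacturing term does.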

The cost of a system includes:

  • Power cost;
  • Cooling cost;
  • Total Cost of Ownership (TCO) of datacenters:
    • Capital expenses (CAPEX): facilities, assembly & installation, compute, storage,
      networking, software, …
    • Operational expenses (OPEX): energy, rent, maintenance, employee salaries, …
  • System availability: downtime is expensive and results in a direct loss of revenue. Redundancy (adding backup components) improves availability but also increases the initial capital cost.

Performance Metrics

Performance metrics:

  • Latency: time to complete a task;
  • Throughput: tasks completed per unit time;

Improving latency often improves throughput, but not vice versa. For example, inter-task parallelization improves throughput but not the latency of any single task, while intra-task parallelization improves both.

Buffering, queuing, and batching improve throughput but may hurt latency, which creates a tradeoff between latency and throughput.
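
A toy model makes this tradeoff concrete. It assumes, hypothetically, that each batch pays a fixed overhead plus a per-item cost; neither the model nor the numbers come from a real system.

```python
def batch_latency(batch_size, overhead_s, per_item_s):
    """Time until a batch finishes; every item in the batch waits this long."""
    return overhead_s + batch_size * per_item_s


def batch_throughput(batch_size, overhead_s, per_item_s):
    """Items completed per second when batches run back to back."""
    return batch_size / batch_latency(batch_size, overhead_s, per_item_s)


# Hypothetical costs: 10 ms fixed overhead per batch, 1 ms per item.
for b in (1, 8, 64):
    lat = batch_latency(b, 0.010, 0.001)
    thr = batch_throughput(b, 0.010, 0.001)
    print(f"batch={b:3d}  latency={lat * 1e3:6.1f} ms  throughput={thr:7.1f} items/s")
```

Larger batches amortize the fixed overhead, so throughput rises toward the per-item limit while the latency seen by each item keeps growing.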

Digital systems (e.g., processors) operate using a constant-rate clock:

  • Clock cycle time (CCT): duration of a clock cycle;
  • Clock frequency (rate): cycles per second.

To compute the execution time of a program, we first count the instructions it executes (the instruction count, IC), which is fixed for a given compiled program and input. Then we find the average number of cycles per instruction (CPI), which depends on the system architecture and implementation. Altogether, we have

\[\text{Execution Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}= \text{IC} \times \text{CPI} \times \text{CCT}. \]

Roughly speaking, software (the program and compiler) together with the ISA determines IC, the ISA and microarchitecture determine CPI, and the microarchitecture and circuit technology determine CCT.
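
The execution time equation can be evaluated directly. The sketch below is only an illustration: the function name and the two hypothetical designs are assumptions, chosen to show that a lower CPI can outweigh a slower clock.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """Execution time in seconds: IC x CPI x CCT, with CCT = 1 / clock frequency."""
    cycle_time = 1.0 / clock_hz
    return instruction_count * cpi * cycle_time


# Hypothetical program: 2 billion instructions, CPI of 1.5, 3 GHz clock.
base = execution_time(2e9, 1.5, 3e9)
# Same program on a design with a lower CPI but a slower clock.
alt = execution_time(2e9, 1.2, 2.5e9)
print(f"base: {base:.3f} s, alternative: {alt:.3f} s")
```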

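So far we have only discussed processor performance. What about memory? Memory stalls can be reflected in CPI. More directly, assuming computation and memory transfers overlap perfectly, the runtime is set by the slower of the two: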

\[\text{Runtime}=\max(\text{#ops}/\text{processor throughput},\text{#bytes}/\text{memory bandwidth}). \]

Defining the operational intensity (OI) as \(\frac{\text{#ops}}{\text{#bytes}}\), we have

\[\begin{align*} \text{Perf}=&\ \text{#ops}/\text{Runtime}\\ =&\ \min(\text{processor throughput},\text{memory bandwidth}\times\text{operational intensity}). \end{align*} \]

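Plotting performance against operational intensity gives the roofline model of a given system: performance rises linearly with OI, at a slope equal to the memory bandwidth, until it flattens at the processor's peak throughput.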
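
A minimal sketch of the roofline bound, evaluated at a few operational intensities. The machine parameters are hypothetical, and `ridge_point` is a name introduced here for the OI at which the memory and compute limits meet.

```python
def roofline(operational_intensity, peak_ops_per_s, mem_bandwidth_bytes_per_s):
    """Attainable performance (ops/s) under the roofline model."""
    return min(peak_ops_per_s, mem_bandwidth_bytes_per_s * operational_intensity)


# Hypothetical machine: 1 Tops/s peak compute, 100 GB/s memory bandwidth.
PEAK = 1e12
BANDWIDTH = 100e9
ridge_point = PEAK / BANDWIDTH  # OI at which the two limits meet (ops per byte)

for oi in (0.5, 2.0, 10.0, 50.0):
    bound = "memory-bound" if oi < ridge_point else "compute-bound"
    print(f"OI={oi:5.1f} ops/byte -> {roofline(oi, PEAK, BANDWIDTH):.2e} ops/s ({bound})")
```

Kernels to the left of the ridge point benefit from higher memory bandwidth or better data reuse; kernels to the right benefit only from more compute throughput.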

Power and Energy Metrics

Dynamic/active power: \(C\times V_{dd}^2\times f_{0\to 1}=\alpha C V_{dd}^2 f\), where \(C\) is the capacitance being switched, \(V_{dd}\) is the supply voltage, \(f_{0\to 1}\) is the frequency of 0-to-1 transitions, \(\alpha\) is the activity factor (the fraction of capacitance being switched), and \(f\) is the clock frequency.

Static/leakage power: \(V_{dd}I_{leak}\), where \(I_{leak}\) is the leakage current.

Therefore, total power is

\[\text{Power}=\alpha C V_{dd}^2 f + V_{dd} I_{leak}. \]

The energy consumed over a given time is

\[\text{Energy}=\text{Power}\times \text{Time}. \]
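
The formulas above translate into a few one-line functions. The chip parameters below are hypothetical and chosen only to give round numbers; the second calculation illustrates the quadratic dependence of dynamic power on \(V_{dd}\).

```python
def dynamic_power(alpha, capacitance_f, vdd_v, freq_hz):
    """Dynamic power in watts: alpha * C * Vdd^2 * f."""
    return alpha * capacitance_f * vdd_v ** 2 * freq_hz


def static_power(vdd_v, leak_current_a):
    """Static (leakage) power in watts: Vdd * I_leak."""
    return vdd_v * leak_current_a


def energy(power_w, time_s):
    """Energy in joules: power * time."""
    return power_w * time_s


# Hypothetical chip: alpha = 0.1, C = 100 nF, Vdd = 1.0 V, f = 2 GHz, I_leak = 5 A.
p_total = dynamic_power(0.1, 100e-9, 1.0, 2e9) + static_power(1.0, 5.0)
# Lowering Vdd to 0.8 V at the same frequency scales dynamic power by 0.8^2 = 0.64.
p_dyn_low = dynamic_power(0.1, 100e-9, 0.8, 2e9)
print(f"total: {p_total:.1f} W, dynamic at 0.8 V: {p_dyn_low:.1f} W")
print(f"energy over 10 s: {energy(p_total, 10.0):.0f} J")
```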

Limiting factors of power, energy, and power density:

  • Power is limited by infrastructure, e.g., power supply;
  • Power density is limited by thermal dissipation, e.g., fans, liquid cooling;
  • Energy is limited by battery capacity or the electricity bill.

Power scaling:

  • Dennard scaling (1974-2005): if the feature size scales by \(1/S\), the supply voltage and current can also scale by \(1/S\), which keeps power density constant;
  • Post-Dennard scaling (2006-now): Power limits performance scaling (power wall), so we need to slow down frequency scaling or reduce chip utilization.
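
The constant power density under Dennard scaling can be checked with the dynamic power formula. This is a sketch under the classical scaling rules, which additionally assume that capacitance scales by \(1/S\), frequency by \(S\), and transistor area \(A\) by \(1/S^2\):

\[P' = \alpha \cdot \frac{C}{S} \cdot \left(\frac{V_{dd}}{S}\right)^2 \cdot (S f) = \frac{\alpha C V_{dd}^2 f}{S^2}, \qquad \frac{P'}{A'} = \frac{P/S^2}{A/S^2} = \frac{P}{A}. \]

Once \(V_{dd}\) stops scaling, the \(1/S^2\) factor from the voltage term is lost and power density grows with each generation, which is the power wall mentioned above.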

Normalize performance to power:

\[\text{Energy Efficiency}=\frac{\text{Performance}}{\text{Power}}=\frac{\text{Operations}/\text{Time}}{\text{Energy}/\text{Time}}=1/\frac{\text{Energy}}{\text{Operations}}. \]

For a given task, choose the design that best trades off performance against energy.
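
A small sketch of such a comparison, using the energy efficiency definition above. The two designs and their numbers are hypothetical.

```python
def energy_efficiency(ops, time_s, energy_j):
    """Performance per watt, which equals operations per joule."""
    performance = ops / time_s   # ops per second
    power = energy_j / time_s    # watts
    return performance / power


# Hypothetical designs running the same 1e12-operation task.
designs = {
    "fast": {"time_s": 1.0, "energy_j": 200.0},
    "frugal": {"time_s": 2.5, "energy_j": 100.0},
}
for name, d in designs.items():
    eff = energy_efficiency(1e12, d["time_s"], d["energy_j"])
    print(f"{name}: {d['time_s']} s, {eff:.2e} ops/J")
```

Which design is preferable depends on the constraint: the first finishes sooner, while the second does twice as much work per joule.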

Scalability

Scalability measures the speedup achieved by using \(N\) processors compared to using just \(1\) processor.

Two settings to evaluate scalability:

  • Strong scaling: speedup on \(N\) processors with fixed total workload size
  • Weak scaling: speedup on \(N\) processors with fixed per-processor workload size
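
Both settings reduce to simple ratios of measured runtimes. In the sketch below the timings are hypothetical, and the weak-scaling formula assumes the common "scaled speedup" convention in which \(N\) processors complete \(N\) times the work of one.

```python
def strong_scaling_speedup(t1, tn):
    """Speedup for a fixed total workload: T(1) / T(N)."""
    return t1 / tn


def weak_scaling_speedup(t1, tn, n):
    """Scaled speedup when each processor keeps the same workload.

    N processors do N times the work of one, so speedup = N * T(1) / T(N).
    """
    return n * t1 / tn


def parallel_efficiency(speedup, n):
    """Fraction of ideal linear speedup achieved on n processors."""
    return speedup / n


# Hypothetical timings in seconds on 8 processors.
s = strong_scaling_speedup(t1=100.0, tn=16.0)       # same problem size
w = weak_scaling_speedup(t1=100.0, tn=125.0, n=8)   # 8x larger problem
print(f"strong: {s:.2f}x ({parallel_efficiency(s, 8):.0%} efficient)")
print(f"weak:   {w:.2f}x ({parallel_efficiency(w, 8):.0%} efficient)")
```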

How to balance the workload?

  • Static load balancing: partition the input as evenly as possible before execution
  • Dynamic load balancing: assign work at run time, e.g., work dispatch, work stealing
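
The difference shows up when task durations are uneven. The simulation below is a simplified sketch: it compares a static split into equally sized contiguous chunks against a greedy dispatcher that hands each task to the first free worker. The task durations are hypothetical, and work stealing is not modeled.

```python
import heapq


def static_makespan(task_times, n_workers):
    """Split tasks into contiguous, equally sized chunks decided up front."""
    chunk = -(-len(task_times) // n_workers)  # ceiling division
    loads = [sum(task_times[i:i + chunk]) for i in range(0, len(task_times), chunk)]
    return max(loads)


def dynamic_makespan(task_times, n_workers):
    """Work dispatch: each task goes to whichever worker becomes free first."""
    finish_times = [0.0] * n_workers  # min-heap of worker finish times
    for t in task_times:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + t)
    return max(finish_times)


# Hypothetical task durations: a few long tasks clustered at the front.
tasks = [8.0, 7.0, 6.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(static_makespan(tasks, 4), dynamic_makespan(tasks, 4))
```

The static split gives every worker two tasks but puts both of the longest on one worker, so that worker finishes last. Dynamic dispatch evens out the finish times at the cost of run-time coordination.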

If an optimization accelerates a fraction \(f\) of a program's execution time by a factor of \(S\), the overall speedup is given by Amdahl's Law:

\[\text{Speedup}=\frac{1}{(1-f)+\frac{f}{S}}. \]
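
Evaluating the formula for increasing \(S\) shows how the unaccelerated fraction caps the speedup. The function name and the 90% figure below are illustrative assumptions.

```python
def amdahl_speedup(f, s):
    """Overall speedup when a fraction f of the runtime is accelerated by factor s."""
    return 1.0 / ((1.0 - f) + f / s)


# Hypothetical program in which 90% of the runtime can be accelerated.
for s in (2, 10, 100, 1_000_000):
    print(f"S={s:>9,}: speedup = {amdahl_speedup(0.9, s):.2f}")
# As S grows, the speedup approaches 1 / (1 - f) = 10.
```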

Benchmark

A benchmark is a carefully selected program used to measure performance, and a benchmark suite is a collection of benchmarks.

To report the average performance on a benchmark suite, we may use three types of means: arithmetic (for absolute values such as execution time), harmonic (for rates such as throughput), and geometric (for ratios such as speedup over a baseline).
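
The sketch below applies each mean to the kind of data it is conventionally used for, using Python's standard `statistics` module. The benchmark numbers are hypothetical, and the harmonic mean of rates is meaningful here only under the assumption that every benchmark performs the same amount of work.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Hypothetical results for three benchmarks.
exec_times_s = [2.0, 4.0, 8.0]        # absolute values: arithmetic mean
throughputs = [100.0, 50.0, 25.0]     # rates (ops/s): harmonic mean
speedups = [2.0, 0.5, 4.0]            # ratios vs. a baseline: geometric mean

print(f"mean time:        {mean(exec_times_s):.2f} s")
print(f"mean throughput:  {harmonic_mean(throughputs):.2f} ops/s")
print(f"mean speedup:     {geometric_mean(speedups):.2f}x")
```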
