Computer Organization/Architecture: Key Concepts and Notes (updated periodically; last updated Wed 2024/04/17)

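Preface: I explain things in fairly plain language, hoping that makes some of these concepts easier to understand. The original notes are written mainly in Chinese with technical terms kept in English, so expect the two to be mixed; when I was studying, I found I absorbed things faster in Chinese, which may just be because my English is not great  ̄□ ̄|| Likes, bookmarks, comments and suggestions are welcome, thanks~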

  • 2024/04/17 (Wed), updated

Q1: Why study Computer Architecture (CA)?

A:
In the broadest definition, CA is the abstraction/implementation layer that lets us use available manufacturing technologies to efficiently execute information-processing applications. CA mainly refers to the ISA and the microarchitecture. In essence it is the contract and interface between software and hardware: how software sees the hardware, which parts of the hardware are software-visible, and how the two interact.

Q2: Architecture/ISA vs Microarchitecture/Organization?

A:
1. Key idea: it is enough to be clear about what the ISA defines and what the microarchitecture defines.
2. Defined by the ISA: input/output; data types and sizes; operations (instructions and how they work); execution semantics (e.g., interrupts); programmer-visible state (memory and registers).
3. Defined by the microarchitecture: the tradeoffs made when implementing the ISA for particular metrics (speed, energy, cost); e.g., number of pipelines and pipeline depth, cache size, silicon area, bus widths, ALU widths, execution ordering, peak power.
4. The microarchitecture is a choice of how to implement the ISA. Computers with the same ISA can have many different microarchitectures, depending on whether the computer targets the embedded space or the high-performance space.
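To make the split concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two-instruction toy ISA, the `run` function, and the per-instruction cycle costs are illustrative assumptions, not a model of any real machine. It only shows that two "implementations" can produce the same programmer-visible register state while differing in how many cycles they take.

```python
# Toy illustration only: the instruction set and cycle costs are made up.

def run(program, cycles_per_instr):
    """Execute `program` and return (register state, total cycles).

    The register state is what the ISA defines (programmer-visible);
    `cycles_per_instr` stands in for a microarchitectural choice.
    """
    regs = {"x0": 0, "x1": 0, "x2": 0, "x3": 0}
    cycles = 0
    for op, rd, rs1, operand in program:
        if op == "addi":      # rd = rs1 + immediate
            regs[rd] = regs[rs1] + operand
        elif op == "add":     # rd = rs1 + rs2
            regs[rd] = regs[rs1] + regs[operand]
        else:
            raise ValueError(f"unknown opcode: {op}")
        cycles += cycles_per_instr
    return regs, cycles

program = [
    ("addi", "x1", "x0", 5),
    ("addi", "x2", "x0", 7),
    ("add",  "x3", "x1", "x2"),
]

state_a, cycles_a = run(program, cycles_per_instr=1)  # hypothetical fast design
state_b, cycles_b = run(program, cycles_per_instr=4)  # hypothetical cheap design

print(state_a == state_b)   # same architectural result
print(cycles_a, cycles_b)   # different performance
```

Software written against the toy ISA cannot tell the two designs apart by looking at register values; only timing (and, in a real machine, power and cost) differs.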

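Q3: Pipeline: you must know exactly which instruction is in which stage in every cycle.

Example: from Computer Architecture: A Quantitative Approach, 6th edition, page C-71, Problem C.1 (a) through (g)
[Figure: code listing for the loop used in Problem C.1; image not included]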

a. Data hazards are caused by data dependences in the code. Whether a dependency causes a hazard depends on the machine implementation (i.e., number of pipeline stages). List all of the data dependences in the code above. Record the register, source instruction, and destination instruction; for example, there is a data dependency for register x1 from the ld to the addi.

b. Show the timing of this instruction sequence for the 5-stage RISC pipeline without any forwarding or bypassing hardware but assuming that a register read and a write in the same clock cycle “forwards” through the register file, as between the add and or shown in Figure C.5. Use a pipeline timing chart like that in Figure C.8. Assume that the branch is handled by flushing the pipeline. If all memory references take 1 cycle, how many cycles does this loop take to execute?

c. Show the timing of this instruction sequence for the 5-stage RISC pipeline with full forwarding and bypassing hardware. Use a pipeline timing chart like that shown in Figure C.8. Assume that the branch is handled by predicting it as not taken. If all memory references take 1 cycle, how many cycles does this loop take to execute?

d. Show the timing of this instruction sequence for the 5-stage RISC pipeline with full forwarding and bypassing hardware, as shown in Figure C.6. Use a pipeline timing chart like that shown in Figure C.8. Assume that the branch is handled by predicting it as taken. If all memory references take 1 cycle, how many cycles does this loop take to execute?

e. High-performance processors have very deep pipelines—more than 15 stages. Imagine that you have a 10-stage pipeline in which every stage of the 5-stage pipeline has been split in two. The only catch is that, for data forwarding, data are forwarded from the end of a pair of stages to the beginning of the two stages where they are needed. For example, data are forwarded from the output of the second execute stage to the input of the first execute stage, still causing a 1-cycle delay. Show the timing of this instruction sequence for the 10-stage RISC pipeline with full forwarding and bypassing hardware. Use a pipeline timing chart like that shown in Figure C.8 (but with stages labeled IF1, IF2, ID1, etc.). Assume that the branch is handled by predicting it as taken. If all memory references take 1 cycle, how many cycles does this loop take to execute?

f. Assume that in the 5-stage pipeline, the longest stage requires 0.8 ns, and the pipeline register delay is 0.1 ns. What is the clock cycle time of the 5-stage pipeline? If the 10-stage pipeline splits all stages in half, what is the cycle time of the 10-stage machine?

g. Using your answers from parts (d) and (e), determine the cycles per instruction (CPI) for the loop on a 5-stage pipeline and a 10-stage pipeline. Make sure you count only from when the first instruction reaches the write-back stage to the end. Do not count the start-up of the first instruction. Using the clock cycle time calculated in part (f), calculate the average instruction execute time for each machine.

A:
a.
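[Figure: table of data dependences (register, source instruction, destination instruction); image not included]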
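As a rough cross-check for part (a), the dependences within one iteration can be enumerated mechanically. The sketch below rests on an assumption: the loop body is written out as I recall it from the textbook problem (it is not reproduced in this post), so verify it against the book. The helper name `raw_dependences` and the tuple encoding of instructions are hypothetical.

```python
# ASSUMPTION: loop body recalled from the textbook problem; verify against the book.
# Each entry: (mnemonic, destination register or None, list of source registers).
LOOP = [
    ("ld",   "x1", ["x2"]),        # ld   x1,0(x2)
    ("addi", "x1", ["x1"]),        # addi x1,x1,1
    ("sd",   None, ["x1", "x2"]),  # sd   x1,0(x2)
    ("addi", "x2", ["x2"]),        # addi x2,x2,4
    ("sub",  "x4", ["x3", "x2"]),  # sub  x4,x3,x2
    ("bnez", None, ["x4"]),        # bnez x4,Loop
]

def raw_dependences(instrs):
    """List (register, producer index, consumer index) within one iteration."""
    last_writer = {}   # register -> index of the most recent instruction writing it
    deps = []
    for i, (_, dest, srcs) in enumerate(instrs):
        for reg in srcs:
            if reg in last_writer:
                deps.append((reg, last_writer[reg], i))
        if dest is not None:
            last_writer[dest] = i
    return deps

for reg, src, dst in raw_dependences(LOOP):
    print(f"{reg}: {LOOP[src][0]} (#{src}) -> {LOOP[dst][0]} (#{dst})")
```

This only reports read-after-write dependences inside a single iteration. Loop-carried dependences, such as the x2 written by the second addi and read by the ld of the next iteration, would need the loop to be unrolled once before running the same scan.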
b.
Forwarding is performed only via the register file. Branch outcomes and targets are not known until the end of the execute stage. All instructions introduced to the pipeline prior to this point are flushed.
[Figure: pipeline timing chart for part (b); image not included]
Since the initial value of x3 is x2+396 and each instance of the loop adds 4 to x2, the total number of iterations is 99. It takes 16 cycles between loop instances. The last loop takes two additional cycles since this latency cannot be overlapped with additional loop instances. The total number of cycles is 16×98+18 = 1586.
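The 16-cycle spacing and the 18-cycle last iteration can be reproduced with a very small scheduling model. This is a sketch under stated assumptions, not the textbook's method: the loop body is written out as I recall it from the book (verify it there), and the function `id_cycles_no_forwarding` is a hypothetical helper that models only an in-order 5-stage pipeline (IF ID EX MEM WB) with no bypassing, where a register written in WB can be read by ID in the same cycle.

```python
# ASSUMPTION: loop body recalled from the textbook problem; verify against the book.
# Each entry: (mnemonic, destination register or None, list of source registers).
LOOP = [
    ("ld",   "x1", ["x2"]),        # ld   x1,0(x2)
    ("addi", "x1", ["x1"]),        # addi x1,x1,1
    ("sd",   None, ["x1", "x2"]),  # sd   x1,0(x2)
    ("addi", "x2", ["x2"]),        # addi x2,x2,4
    ("sub",  "x4", ["x3", "x2"]),  # sub  x4,x3,x2
    ("bnez", None, ["x4"]),        # bnez x4,Loop
]

def id_cycles_no_forwarding(instrs, first_if=1):
    """Return the decode (ID) cycle of each instruction in one iteration."""
    wb_of = {}            # register -> WB cycle of its latest producer
    prev_id = first_if    # so the first instruction decodes at first_if + 1
    ids = []
    for _, dest, srcs in instrs:
        in_order = prev_id + 1   # cannot decode before the previous instruction has
        operands_ready = max((wb_of.get(r, 0) for r in srcs), default=0)
        id_cycle = max(in_order, operands_ready)
        if dest is not None:
            wb_of[dest] = id_cycle + 3   # ID -> EX -> MEM -> WB
        ids.append(id_cycle)
        prev_id = id_cycle
    return ids

first_if = 1
ids = id_cycles_no_forwarding(LOOP, first_if)
branch_ex = ids[-1] + 1            # branch outcome known at the end of EX
next_if = branch_ex + 1            # first fetch of the following iteration
period = next_if - first_if        # cycles between loop instances
last_iteration = ids[-1] + 3       # cycle in which the final bnez reaches WB
iterations = 396 // 4              # x3 = x2 + 396, x2 advances by 4 per iteration
total = period * (iterations - 1) + last_iteration
print(ids, period, last_iteration, iterations, total)
```

With these inputs the model gives a period of 16, a last iteration of 18 cycles, 99 iterations, and 16×98+18 evaluating to 1586. The model ignores cross-iteration register hazards, which is safe here because the previous iteration's writes have completed before the next iteration decodes.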

c.
[Figure: pipeline timing chart for part (c); image not included]
Assumes branch resolved in decode stage and no delay slots. Branch outcomes and targets are known now at the end of decode. Resolving branch in decode requires zero detect after bypass. The total number of cycles is 8×98+11=795.

d.
[Figure: pipeline timing chart for part (d); image not included]
Assumes branch resolved in decode stage and no delay slots, and early pre-decode to determine and fetch target of branch in fetch stage. The total number of cycles is 7×98+11=697.

e.
[Figure: pipeline timing chart for part (e); image not included]
The total number of cycles is 12×98+21=1197.
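The totals in parts (c), (d) and (e) all follow one pattern: 98 overlapped iterations at the steady-state spacing, plus one final iteration that runs to completion. A small sketch of that arithmetic follows; the function name `total_cycles` is hypothetical, and the (spacing, last-iteration) pairs are simply the figures quoted in the answers above.

```python
def total_cycles(period, last_iteration, iterations=99):
    """Cycles for the whole loop.

    Every iteration except the last overlaps with its successor and costs
    `period` cycles; the last one costs `last_iteration` cycles.
    """
    return period * (iterations - 1) + last_iteration

# (period, last-iteration length) as quoted in each answer.
cases = {"c": (8, 11), "d": (7, 11), "e": (12, 21)}
totals = {part: total_cycles(p, last) for part, (p, last) in cases.items()}
for part, cycles in totals.items():
    print(part, cycles)
```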

f.
5-stage: 0.8+0.1=0.9(ns); 10-stage: 0.8/2+0.1=0.5(ns)
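The same calculation as a tiny helper, working in integer picoseconds to avoid floating-point noise. The function name `cycle_time_ps` is hypothetical, and it assumes the longest stage splits evenly with the pipeline register delay unchanged, as the problem states.

```python
def cycle_time_ps(longest_stage_ps, register_delay_ps, split=1):
    """Clock period when the longest stage is cut into `split` equal pieces."""
    return longest_stage_ps // split + register_delay_ps

five_stage = cycle_time_ps(800, 100)           # 0.8 ns stage + 0.1 ns register
ten_stage = cycle_time_ps(800, 100, split=2)   # each stage halved
print(five_stage, ten_stage)                   # values in picoseconds
```

Note that halving the stage logic does not halve the cycle time, because the 0.1 ns register overhead is paid in full on every stage.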

g.
5-stage:
The 5th cycle to the 11th cycle took a total of 7 cycles and involved the execution of 6 instructions.
CPI = 7 (cycles) / 6 (instructions) ≈ 1.17
Average Instruction Execution Time = (7/6)×0.9 = 1.05 (ns)

10-stage:
The 10th cycle to the 21st cycle took a total of 12 cycles and involved the execution of 6 instructions.
CPI = 12(cycles) / 6 (instructions) = 2
Average Instruction Execution Time = 2×0.5 = 1 (ns)
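A short sketch of the part (g) arithmetic, carrying the CPI as an exact ratio instead of a rounded decimal so that rounding does not leak into the final time. The helper `cpi_and_time` is a hypothetical name; the cycle and instruction counts are the ones stated above.

```python
from fractions import Fraction

def cpi_and_time(cycles, instructions, cycle_time_ns):
    """Return (CPI, average instruction time in ns) as exact fractions."""
    cpi = Fraction(cycles, instructions)
    return cpi, cpi * cycle_time_ns

cpi5, t5 = cpi_and_time(7, 6, Fraction(9, 10))     # 5-stage, 0.9 ns clock
cpi10, t10 = cpi_and_time(12, 6, Fraction(1, 2))   # 10-stage, 0.5 ns clock

print(float(cpi5), float(t5))     # 5-stage CPI and average time (ns)
print(float(cpi10), float(t10))   # 10-stage CPI and average time (ns)
```

The deeper pipeline has a much worse CPI for this loop, yet its shorter clock period leaves the average time per instruction slightly lower.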

  • 2024/00/00 (weekday x), not yet updated
