Download Digital Design and Computer Architecture, Second Edition PDF

Title: Digital Design and Computer Architecture, Second Edition
File Size: 24.7 MB
Total Pages: 721
Table of Contents
Front Cover
In Praise of Digital Design and Computer Architecture
About the Authors
Digital Design and Computer Architecture
Table of Contents
		Side-by-Side Coverage of SystemVerilog and VHDL
		Classic MIPS Architecture and Microarchitecture
		Real-World Perspectives
		Accessible Overview of Advanced Microarchitecture
		End-of-Chapter Exercises and Interview Questions
	Online Supplements
	How to Use the Software Tools in a Course
		Altera Quartus II
		Microchip MPLAB IDE
		Optional Tools: Synplify Premier and QtSpim
1 From Zero to One
	1.1 The Game Plan
	1.2 The Art of Managing Complexity
		1.2.1 Abstraction
		1.2.2 Discipline
		1.2.3 The Three-Y's
	1.3 The Digital Abstraction
	1.4 Number Systems
		1.4.1 Decimal Numbers
		1.4.2 Binary Numbers
		1.4.3 Hexadecimal Numbers
		1.4.4 Bytes, Nibbles, and All That Jazz
		1.4.5 Binary Addition
		1.4.6 Signed Binary Numbers
			Sign/Magnitude Numbers
			Two's Complement Numbers
			Comparison of Number Systems
	1.5 Logic Gates
		1.5.1 NOT Gate
		1.5.2 Buffer
		1.5.3 AND Gate
		1.5.4 OR Gate
		1.5.5 Other Two-Input Gates
		1.5.6 Multiple-Input Gates
	1.6 Beneath the Digital Abstraction
		1.6.1 Supply Voltage
		1.6.2 Logic Levels
		1.6.3 Noise Margins
		1.6.4 DC Transfer Characteristics
		1.6.5 The Static Discipline
	1.7 CMOS Transistors*
		1.7.1 Semiconductors
		1.7.2 Diodes
		1.7.3 Capacitors
		1.7.4 nMOS and pMOS Transistors
		1.7.5 CMOS NOT Gate
		1.7.6 Other CMOS Logic Gates
		1.7.7 Transmission Gates
		1.7.8 Pseudo-nMOS Logic
	1.8 Power Consumption*
	1.9 Summary and a Look Ahead
	Interview Questions
2 Combinational Logic Design
	2.1 Introduction
	2.2 Boolean Equations
		2.2.1 Terminology
		2.2.2 Sum-of-Products Form
		2.2.3 Product-of-Sums Form
	2.3 Boolean Algebra
		2.3.1 Axioms
		2.3.2 Theorems of One Variable
		2.3.3 Theorems of Several Variables
		2.3.4 The Truth Behind It All
		2.3.5 Simplifying Equations
	2.4 From Logic to Gates
	2.5 Multilevel Combinational Logic
		2.5.1 Hardware Reduction
		2.5.2 Bubble Pushing
	2.6 X’s and Z’s, Oh My
		2.6.1 Illegal Value: X
		2.6.2 Floating Value: Z
	2.7 Karnaugh Maps
		2.7.1 Circular Thinking
		2.7.2 Logic Minimization with K-Maps
		2.7.3 Don't Cares
		2.7.4 The Big Picture
	2.8 Combinational Building Blocks
		2.8.1 Multiplexers
			2:1 Multiplexer
			Wider Multiplexers
			Multiplexer Logic
		2.8.2 Decoders
			Decoder Logic
	2.9 Timing
		2.9.1 Propagation and Contamination Delay
		2.9.2 Glitches
	2.10 Summary
	Interview Questions
3 Sequential Logic Design
	3.1 Introduction
	3.2 Latches and Flip-Flops
		3.2.1 SR Latch
		3.2.2 D Latch
		3.2.3 D Flip-Flop
		3.2.4 Register
		3.2.5 Enabled Flip-Flop
		3.2.6 Resettable Flip-Flop
		3.2.7 Transistor-Level Latch and Flip-Flop Designs*
		3.2.8 Putting It All Together
	3.3 Synchronous Logic Design
		3.3.1 Some Problematic Circuits
		3.3.2 Synchronous Sequential Circuits
		3.3.3 Synchronous and Asynchronous Circuits
	3.4 Finite State Machines
		3.4.1 FSM Design Example
		3.4.2 State Encodings
		3.4.3 Moore and Mealy Machines
		3.4.4 Factoring State Machines
		3.4.5 Deriving an FSM from a Schematic
		3.4.6 FSM Review
	3.5 Timing of Sequential Logic
		3.5.1 The Dynamic Discipline
		3.5.2 System Timing
			Setup Time Constraint
			Hold Time Constraint
			Putting It All Together
		3.5.3 Clock Skew*
		3.5.4 Metastability
			Metastable State
			Resolution Time
		3.5.5 Synchronizers
		3.5.6 Derivation of Resolution Time*
	3.6 Parallelism
	3.7 Summary
	Interview Questions
4 Hardware Description Languages
	4.1 Introduction
		4.1.1 Modules
		4.1.2 Language Origins
		4.1.3 Simulation and Synthesis
	4.2 Combinational Logic
		4.2.1 Bitwise Operators
		4.2.2 Comments and White Space
		4.2.3 Reduction Operators
		4.2.4 Conditional Assignment
		4.2.5 Internal Variables
		4.2.6 Precedence
		4.2.7 Numbers
		4.2.8 Z’s and X’s
		4.2.9 Bit Swizzling
		4.2.10 Delays
	4.3 Structural Modeling
	4.4 Sequential Logic
		4.4.1 Registers
		4.4.2 Resettable Registers
		4.4.3 Enabled Registers
		4.4.4 Multiple Registers
		4.4.5 Latches
	4.5 More Combinational Logic
		4.5.1 Case Statements
		4.5.2 If Statements
		4.5.3 Truth Tables with Don’t Cares
		4.5.4 Blocking and Nonblocking Assignments
			Combinational Logic*
			Sequential Logic*
	4.6 Finite State Machines
	4.7 Data Types*
		4.7.1 SystemVerilog
		4.7.2 VHDL
	4.8 Parameterized Modules*
	4.9 Testbenches
	4.10 Summary
	Interview Questions
5 Digital Building Blocks
	5.1 Introduction
	5.2 Arithmetic Circuits
		5.2.1 Addition
			Half Adder
			Full Adder
			Carry Propagate Adder
			Ripple-Carry Adder
			Carry-Lookahead Adder
			Prefix Adder*
			Putting It All Together
		5.2.2 Subtraction
		5.2.3 Comparators
		5.2.4 ALU
		5.2.5 Shifters and Rotators
		5.2.6 Multiplication*
		5.2.7 Division*
		5.2.8 Further Reading
	5.3 Number Systems
		5.3.1 Fixed-Point Number Systems
		5.3.2 Floating-Point Number Systems*
			Special Cases: 0, ±∞, and NaN
			Single- and Double-Precision Formats
			Floating-Point Addition
	5.4 Sequential Building Blocks
		5.4.1 Counters
		5.4.2 Shift Registers
			Scan Chains*
	5.5 Memory Arrays
		5.5.1 Overview
			Bit Cells
			Memory Ports
			Memory Types
		5.5.2 Dynamic Random Access Memory (DRAM)
		5.5.3 Static Random Access Memory (SRAM)
		5.5.4 Area and Delay
		5.5.5 Register Files
		5.5.6 Read Only Memory
		5.5.7 Logic Using Memory Arrays
		5.5.8 Memory HDL
	5.6 Logic Arrays
		5.6.1 Programmable Logic Array
		5.6.2 Field Programmable Gate Array
		5.6.3 Array Implementations*
	5.7 Summary
	Interview Questions
6 Architecture
	6.1 Introduction
	6.2 Assembly Language
		6.2.1 Instructions
		6.2.2 Operands: Registers, Memory, and Constants
			The Register Set
	6.3 Machine Language
		6.3.1 R-Type Instructions
		6.3.2 I-Type Instructions
		6.3.3 J-Type Instructions
		6.3.4 Interpreting Machine Language Code
		6.3.5 The Power of the Stored Program
	6.4 Programming
		6.4.1 Arithmetic/Logical Instructions
			Logical Instructions
			Shift Instructions
			Generating Constants
			Multiplication and Division Instructions*
		6.4.2 Branching
			Conditional Branches
		6.4.3 Conditional Statements
			If Statements
			If/Else Statements
			Switch/Case Statements*
		6.4.4 Getting Loopy
			While Loops
			For Loops
			Magnitude Comparison
		6.4.5 Arrays
			Array Indexing
			Bytes and Characters
		6.4.6 Function Calls
			Function Calls and Returns
			Input Arguments and Return Values
			The Stack
			Preserved Registers
			Recursive Function Calls
			Additional Arguments and Local Variables*
	6.5 Addressing Modes
	6.6 Lights, Camera, Action: Compiling, Assembling, and Loading
		6.6.1 The Memory Map
			The Text Segment
			The Global Data Segment
			The Dynamic Data Segment
			The Reserved Segments
		6.6.2 Translating and Starting a Program
			Step 1: Compilation
			Step 2: Assembling
			Step 3: Linking
			Step 4: Loading
	6.7 Odds and Ends*
		6.7.1 Pseudoinstructions
		6.7.2 Exceptions
		6.7.3 Signed and Unsigned Instructions
			Addition and Subtraction
			Multiplication and Division
			Set Less Than
		6.7.4 Floating-Point Instructions
	6.8 Real-World Perspective: x86 Architecture*
		6.8.1 x86 Registers
		6.8.2 x86 Operands
		6.8.3 Status Flags
		6.8.4 x86 Instructions
		6.8.5 x86 Instruction Encoding
		6.8.6 Other x86 Peculiarities
		6.8.7 The Big Picture
	6.9 Summary
	Interview Questions
7 Microarchitecture
	7.1 Introduction
		7.1.1 Architectural State and Instruction Set
		7.1.2 Design Process
		7.1.3 MIPS Microarchitectures
	7.2 Performance Analysis
	7.3 Single-Cycle Processor
		7.3.1 Single-Cycle Datapath
		7.3.2 Single-Cycle Control
		7.3.3 More Instructions
		7.3.4 Performance Analysis
	7.4 Multicycle Processor
		7.4.1 Multicycle Datapath
		7.4.2 Multicycle Control
		7.4.3 More Instructions
		7.4.4 Performance Analysis
	7.5 Pipelined Processor
		7.5.1 Pipelined Datapath
		7.5.2 Pipelined Control
		7.5.3 Hazards
			Solving Data Hazards with Forwarding
			Solving Data Hazards with Stalls
			Solving Control Hazards
			Hazard Summary
		7.5.4 More Instructions
		7.5.5 Performance Analysis
	7.6 HDL Representation*
		7.6.1 Single-Cycle Processor
		7.6.2 Generic Building Blocks
		7.6.3 Testbench
	7.7 Exceptions*
	7.8 Advanced Microarchitecture*
		7.8.1 Deep Pipelines
		7.8.2 Branch Prediction
		7.8.3 Superscalar Processor
		7.8.4 Out-of-Order Processor
		7.8.5 Register Renaming
		7.8.6 Single Instruction Multiple Data
		7.8.7 Multithreading
		7.8.8 Homogeneous Multiprocessors
		7.8.9 Heterogeneous Multiprocessors
	7.9 Real-World Perspective: x86 Microarchitecture*
	7.10 Summary
	Interview Questions
8 Memory and I/O Systems
	8.1 Introduction
	8.2 Memory System Performance Analysis
	8.3 Caches
		8.3.1 What Data is Held in the Cache?
		8.3.2 How is Data Found?
			Direct Mapped Cache
			Multi-way Set Associative Cache
			Fully Associative Cache
			Block Size
			Putting it All Together
		8.3.3 What Data is Replaced?
		8.3.4 Advanced Cache Design*
			Multiple-Level Caches
			Reducing Miss Rate
			Write Policy
		8.3.5 The Evolution of MIPS Caches*
	8.4 Virtual Memory
		8.4.1 Address Translation
		8.4.2 The Page Table
		8.4.3 The Translation Lookaside Buffer
		8.4.4 Memory Protection
		8.4.5 Replacement Policies*
		8.4.6 Multilevel Page Tables*
	8.5 I/O Introduction
	8.6 Embedded I/O Systems
		8.6.1 PIC32MX675F512H Microcontroller
		8.6.2 General-Purpose Digital I/O
		8.6.3 Serial I/O
			Serial Peripheral Interface (SPI)
			Universal Asynchronous Receiver Transmitter (UART)
		8.6.4 Timers
		8.6.5 Interrupts
		8.6.6 Analog I/O
			A/D Conversion
			D/A Conversion
			Pulse-Width Modulation
		8.6.7 Other Microcontroller Peripherals
			Character LCDs
			VGA Monitor
			Bluetooth Wireless Communication
			Motor Control
			DC Motors
			Servo Motor
			Stepper Motor
	8.7 PC I/O Systems
		8.7.1 USB
		8.7.2 PCI and PCI Express
		8.7.3 DDR3 Memory
		8.7.4 Networking
		8.7.5 SATA
		8.7.6 Interfacing to a PC
			Data Acquisition Systems
			USB Links
	8.8 Real-World Perspective: x86 Memory and I/O Systems*
		8.8.1 x86 Cache Systems
		8.8.2 x86 Virtual Memory
		8.8.3 x86 Programmed I/O
	8.9 Summary
	Interview Questions
A Digital System Implementation
	A.1 Introduction
	A.2 74xx Logic
		A.2.1 Logic Gates
		A.2.2 Other Functions
	A.3 Programmable Logic
		A.3.1 PROMs
		A.3.2 PLAs
		A.3.3 FPGAs
	A.4 Application-Specific Integrated Circuits
	A.5 Data Sheets
	A.6 Logic Families
	A.7 Packaging and Assembly
		Printed Circuit Boards
		Putting It All Together
	A.8 Transmission Lines
		A.8.1 Matched Termination
		A.8.2 Open Termination
		A.8.3 Short Termination
		A.8.4 Mismatched Termination
		A.8.5 When to Use Transmission Line Models
		A.8.6 Proper Transmission Line Terminations
		A.8.7 Derivation of Z0*
		A.8.8 Derivation of the Reflection Coefficient*
		A.8.9 Putting It All Together
	A.9 Economics
B MIPS Instructions
C C Programming
	C.1 Introduction
	C.2 Welcome to C
		C.2.1 C Program Dissection
			Header: #include <stdio.h>
			Main function: int main(void)
			Body: printf("Hello world!\n");
		C.2.2 Running a C Program
	C.3 Compilation
		C.3.1 Comments
		C.3.2 #define
		C.3.3 #include
	C.4 Variables
		C.4.1 Primitive Data Types
		C.4.2 Global and Local Variables
		C.4.3 Initializing Variables
	C.5 Operators
	C.6 Function Calls
	C.7 Control-Flow Statements
		C.7.1 Conditional Statements
			if Statements
			if/else Statements
			switch/case Statements
		C.7.2 Loops
			while Loops
			do/while Loops
			for Loops
	C.8 More Data Types
		C.8.1 Pointers
		C.8.2 Arrays
		C.8.3 Characters
		C.8.4 Strings
		C.8.5 Structures
		C.8.6 * typedef
		C.8.7 * Dynamic Memory Allocation
		C.8.8 * Linked Lists
	C.9 Standard Libraries
		C.9.1 stdio
			File Manipulation
			Other Handy stdio Functions
		C.9.2 stdlib
			rand and srand
			Format Conversion: atoi, atol, atof
		C.9.3 math
		C.9.4 string
	C.10 Compiler and Command Line Options
		C.10.1 Compiling Multiple C Source Files
		C.10.2 Compiler Options
		C.10.3 Command Line Arguments
	C.11 Common Mistakes
Further Reading
Document Text Contents
Page 360


32-bit jump target address (JTA) to indicate the instruction address to
execute next.

Unfortunately, the J-type instruction encoding does not have enough
bits to specify a full 32-bit JTA. Six bits of the instruction are used for the
opcode, so only 26 bits are left to encode the JTA. Fortunately, the two
least significant bits, JTA[1:0], should always be 0, because instructions
are word aligned. The next 26 bits, JTA[27:2], are taken from the addr field
of the instruction. The four most significant bits, JTA[31:28], are obtained
from the four most significant bits of PC + 4. This addressing mode is
called pseudo-direct.

Code Example 6.29 illustrates a jal instruction using pseudo-direct
addressing. The JTA of the jal instruction is 0x004000A0. Figure 6.30
shows the machine code for this jal instruction. The top four bits and
bottom two bits of the JTA are discarded. The remaining bits are stored
in the 26-bit address field (addr).
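
The encoding step described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function name encode_jal is hypothetical, and the opcode value 3 is taken from the field values shown in Figure 6.30.

```python
def encode_jal(jta: int) -> int:
    """Build the 32-bit machine word for jal from a jump target address.

    Hypothetical helper for illustration only.
    """
    opcode_jal = 3  # 6-bit opcode 000011, as shown in Figure 6.30
    if jta & 0x3:
        raise ValueError("jump target must be word aligned")
    # Discard the bottom two bits (shift) and the top four bits (mask),
    # leaving bits 27:2 of the JTA in the 26-bit addr field.
    addr = (jta >> 2) & 0x03FFFFFF
    return (opcode_jal << 26) | addr


if __name__ == "__main__":
    word = encode_jal(0x004000A0)
    print(f"0x{word:08X}")  # 0x0C100028
```

The printed word matches the machine code given in the figure, and its low 26 bits are the addr value 0x0100028.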

The processor calculates the JTA from the J-type instruction by
appending two 0’s and prepending the four most significant bits of PC + 4
to the 26-bit address field (addr).
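
As a rough sketch of that calculation, the Python function below rebuilds the JTA from a machine word and the address of the jump instruction. The name jump_target is an assumption made for this example; the input values come from Code Example 6.29 and Figure 6.30.

```python
def jump_target(instr: int, pc: int) -> int:
    """Compute the pseudo-direct jump target address (JTA).

    instr is the 32-bit J-type machine word; pc is the address of that
    instruction. Hypothetical helper for illustration only.
    """
    addr = instr & 0x03FFFFFF          # 26-bit addr field
    upper = (pc + 4) & 0xF0000000      # four most significant bits of PC + 4
    return upper | (addr << 2)         # shifting left appends the two 0's


if __name__ == "__main__":
    jta = jump_target(0x0C100028, 0x0040005C)
    print(f"0x{jta:08X}")  # 0x004000A0
```

For the jal at 0x0040005C, the result is the address of the sum label, 0x004000A0.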

Because the four most significant bits of the JTA are taken from PC + 4,
the jump range is limited. The range limits of branch and jump instructions
are explored in Exercises 6.29 to 6.32. All J-type instructions, j and jal,
use pseudo-direct addressing.
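
To make the range limit concrete, the following Python sketch lists the lowest and highest word addresses a J-type instruction at a given PC can reach. The function name jump_range is hypothetical; the logic only restates the rule above, that the top four bits come from PC + 4 and the bottom two bits are 0.

```python
def jump_range(pc: int) -> tuple[int, int]:
    """Return (lowest, highest) word address reachable by j or jal at pc.

    Hypothetical helper for illustration only.
    """
    # The top four bits are fixed by PC + 4 (wrapped to 32 bits).
    upper = ((pc + 4) & 0xFFFFFFFF) & 0xF0000000
    lowest = upper                     # addr field all 0's
    highest = upper | 0x0FFFFFFC       # addr field all 1's, low two bits 0
    return lowest, highest


if __name__ == "__main__":
    lo, hi = jump_range(0x0040005C)
    print(f"0x{lo:08X} to 0x{hi:08X}")  # 0x00000000 to 0x0FFFFFFC
```

Each jump can therefore reach only the 256 MB region that shares its top four address bits with PC + 4.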

Note that the jump register instruction, jr, is not a J-type instruction.
It is an R-type instruction that jumps to the 32-bit value held in register rs.


MIPS Assembly Code

0x0040005C       jal sum
                 ...
0x004000A0  sum: add $v0, $a0, $a1


Figure 6.30 jal machine code

Assembly code: jal sum
Field values: op = 3 (6 bits), addr = 0x0100028 (26 bits)
Machine code: 000011 00 0001 0000 0000 0000 0010 1000 (0x0C100028)
JTA: 0000 0000 0100 0000 0000 0000 1010 0000 (0x004000A0)
26-bit addr: 00 0001 0000 0000 0000 0010 1000 (0x0100028)


Page 361


6.6 Lights, Camera, Action: Compiling, Assembling, and Loading

Up until now, we have shown how to translate short high-level code snippets
into assembly and machine code. This section describes how to compile
and assemble a complete high-level program and how to load the
program into memory for execution.

We begin by introducing the MIPS memory map, which defines
where code, data, and stack memory are located. We then show the steps
of code execution for a sample program.

6.6.1 The Memory Map

With 32-bit addresses, the MIPS address space spans 2^32 bytes = 4 gigabytes
(GB). Word addresses are divisible by 4 and range from 0 to
0xFFFFFFFC. Figure 6.31 shows the MIPS memory map. The MIPS
architecture divides the address space into four parts or segments: the text
segment, global data segment, dynamic data segment, and reserved
segments. The following sections describe each segment.
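
The address-space arithmetic in this paragraph can be checked with a short Python sketch. The constant and function names here are illustrative assumptions, not taken from the book.

```python
ADDRESS_BITS = 32
ADDRESS_SPACE_BYTES = 2 ** ADDRESS_BITS        # 4,294,967,296 bytes
WORD_BYTES = 4
LAST_WORD_ADDRESS = ADDRESS_SPACE_BYTES - WORD_BYTES


def is_word_address(address: int) -> bool:
    """A word address lies inside the address space and is divisible by 4."""
    return 0 <= address <= LAST_WORD_ADDRESS and address % WORD_BYTES == 0


if __name__ == "__main__":
    print(ADDRESS_SPACE_BYTES // 2 ** 30)      # 4 (gigabytes)
    print(f"0x{LAST_WORD_ADDRESS:08X}")        # 0xFFFFFFFC
    print(is_word_address(0x00400000))         # True
    print(is_word_address(0x00400002))         # False
```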

The Text Segment
The text segment stores the machine language program. It is large enough
to accommodate almost 256 MB of code. Note that the four most significant
bits of the address in the text space are all 0, so the j instruction can
directly jump to any address in the program.
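
A quick Python sketch shows where "almost 256 MB" comes from. It assumes the text segment begins at the initial PC value shown in Figure 6.31 (0x00400000) and ends just below the first address whose four most significant bits are not all 0 (0x10000000); both constant names are hypothetical.

```python
TEXT_START = 0x00400000   # assumed start: the PC value shown in Figure 6.31
TEXT_LIMIT = 0x10000000   # first address whose top four bits are not all 0

text_bytes = TEXT_LIMIT - TEXT_START


def top_four_bits(address: int) -> int:
    """Return bits 31:28 of a 32-bit address."""
    return (address >> 28) & 0xF


if __name__ == "__main__":
    print(text_bytes // 2 ** 20)               # 252 (MB), just under 256
    print(top_four_bits(TEXT_START))           # 0
    print(top_four_bits(TEXT_LIMIT - 4))       # 0
```

Under these assumptions every instruction address in the segment has the same top four bits (0), which is why a single j instruction can reach any of them.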

The Global Data Segment
The global data segment stores global variables that, in contrast to local
variables, can be seen by all functions in a program. Global variables


Figure 6.31 MIPS memory map (labels recovered from the figure: $sp = 0x7FFFFFFC, $gp = 0x10008000, PC = 0x00400000; segments shown include Global Data and Dynamic Data)

