User Guide
Page 2
... 1149.1-1990 "IEEE Standard Test Access Port and Boundary-Scan Architecture," Copyright © 1990 by the Institute of Sale, AMD assumes no representations or warranties with Advanced Micro Devices, Inc. ("AMD") products. AMD makes no liability whatsoever, and disclaims any express or implied ..., FusionE86 is a trademark of Advanced Micro Devices, Inc. Except as components in the described manner. Trademarks AMD, the AMD logo, K6, 3DNow!, and combinations thereof, AMD PowerNow!, E86, and Super7 are for surgical implant into the body, or in other application in this publication...
User Guide
Page 5
...Contents
Revision History xvii
About this Data Sheet xix
1 AMD-K6™-2E+ Embedded Processor 1
1.1 AMD-K6™-2E+ Embedded Processor Features 3
1.2 Process Technology 7
1.3 Super7™ Platform 8
2 Internal Architecture 11
2.1 Microarchitecture Overview 11
2.2 Cache, Instruction Prefetch, and...
... Management Registers 54
3.4 Paging 56
3.5 Descriptors and Gates 59
3.6 Exceptions and Interrupts 62
3.7 Instructions Supported by the AMD-K6™-2E+ Processor 63
4 Logic Symbol Diagram 91
5 Signal Descriptions 93
5.1 Signal Terminology 93
5.2 A20M# (Address Bit ...
User Guide
Page 19
... initiative. Chapter 2, "Internal Architecture" on page 11, describes the AMD-K6-2E+ processor's architecture and design implementation. Chapter 4, "Logic Symbol Diagram" on page 91, contains the AMD-K6-2E+ processor logic symbol diagram. Chapter 5, "Signal Descriptions" on page 93, lists the signals and their descriptions alphabetically and by ... Chapter 7, "Bus Cycles," describes the functional elements of bus cycles. 23542A/0-September 2000 Preliminary Information...
User Guide
Page 26
...AMD-K6™-2E+ Embedded Processor Data Sheet 23542A/0-September 2000 The AMD-K6-2E+ embedded processor is available in order to offer the lowest available power and extended temperature ratings. • The low-power version operates at 450 MHz delivers a maximum peak bandwidth of a large and fast cache design in feeding performance-hungry applications, AMD developed an innovative cache architecture... example, the internal L2 cache of an AMD-K6-2E+/450 processor operates at 100 MHz. technology in all AMD-K6 family processors. The processor's multiport internal cache design enables both the ...
User Guide
Page 29
... experience through six generations of -the-art features, industry-leading performance, high-performance 3DNow! See "Super7™ Platform" on page 8 for AMD-K6-2E+ processor designs. Industry-Standard x86 Architecture The AMD-K6-2E+ processor is an extension to enable compatibility with Windows® 98, Windows 95, Windows 3.x, Windows NT, DOS, Linux, OS/2, Unix, Solaris, NetWare®...
User Guide
Page 33
...-based software. 2 Internal Architecture The AMD-K6-2E+ processor implements advanced design techniques known as the Enhanced RISC86 microarchitecture. The architecture determines what software the processor can run. • Microarchitecture refers to understand the terms architecture, microarchitecture, and design implementation. The architecture of the RISC86 microarchitecture. 2.1 Microarchitecture Overview When...
User Guide
Page 34
... x86 instructions into an aggressive and highly efficient six-stage pipeline. As shown in turn, the decoders feed the scheduler. The AMD-K6-2E+ processor combines the latest in processor microarchitecture to include direct support for today's computational systems. The AMD-K6-2E+ processor offers true sixth-generation performance and x86 binary software compatibility.
User Guide
Page 35
...fly, with the x86 instructions, in length The three types of an x86 instruction on -chip L1 instruction cache is stored, along with no additional latency, up to a processor clock. The AMD-K6-2E+ processor categorizes x86 instructions into RISC86 operations. Predecode logic ...to two x86 instructions per clock into three types of the x86 instructions begins when the on a byte-by the decoders. AMD-K6™-2E+ Processor Block Diagram...
User Guide
Page 36
... x86 instructions. EAX, EBX, ECX, EDX, EBP, ESP, ESI, and EDI. The ICU is capable of operations: • Memory load operation • Memory store operation • Complex integer, MMX or 3DNow! • Long decodes: x86 instructions less than or equal to...
User Guide
Page 37
...Branch Logic 3DNow!™ Technology • An analogous set of one clock cache-fetch penalty. technology, which uses a packed, single-precision, floating-point data format and Single Instruction Multiple Data (SIMD) operations based on the fly during instruction decode. committed or architectural... table • Branch target cache • Return address stack The AMD-K6-2E+ processor implements a two-level branch prediction scheme based on page 35. In summary, the AMD-K6-2E+ processor uses dynamic branch logic to minimize delays due to the ...
User Guide
Page 38
... or from external memory, each instruction byte that later enables the decoders to a tag mismatch, in the same cache state. The two cache lines of each cache line. The required L1 cache line is filled from the L2 cache or from external ... 2.2 Cache, Instruction Prefetch, and Predecode Bits The writeback level-one (L1) cache on the AMD-K6-2E+ processor is organized as invalid. The processor cache design takes advantage of cache misses and associated cache fills can take place: a tag-miss cache fill and a tag-hit...
User Guide
Page 39
For more detailed information, see Figure 3 on a memory-aligned word The predecode bits indicate the number of bytes to the start of instructions lie across a cache line boundary. ... cache alongside each x86 instruction byte as they assist with each instruction byte. Prefetching Predecode Bits The AMD-K6-2E+ processor conditionally performs cache prefetching, which results in Table 15, "3DNow!™ Instructions," on page 89. technology...
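The excerpt above is fragmentary, but it suggests that each instruction byte is stored with predecode information giving a byte count to the start of an instruction. The following Python sketch illustrates that idea under one reading, namely that the count is the distance to the start of the next instruction. That reading, the function names `predecode` and `instruction_starts`, and the plain-integer encoding are all assumptions made for illustration; the actual bit encoding is not given in this excerpt.

```python
def predecode(instruction_lengths):
    """Return, for each instruction byte, the distance in bytes to the start
    of the next instruction (hypothetical encoding, for illustration only)."""
    distances = []
    for length in instruction_lengths:
        if length < 1:
            raise ValueError("instruction length must be at least 1 byte")
        # The first byte of an n-byte instruction is n bytes from the next
        # instruction start; the last byte is 1 byte away.
        distances.extend(range(length, 0, -1))
    return distances


def instruction_starts(distances):
    """Recover instruction start offsets by hopping along the distances."""
    starts, offset = [], 0
    while offset < len(distances):
        starts.append(offset)
        offset += distances[offset]
    return starts


# Three instructions of 1, 3, and 2 bytes occupy six bytes in total.
window = predecode([1, 3, 2])
```

With per-byte distances available, a decoder can locate instruction boundaries without re-parsing variable-length instructions, which is the kind of assistance the text attributes to the predecode bits.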
User Guide
Page 40
...Unit 16 Instruction Bytes plus 16 Sets of Predecode Bits Instruction Buffer Figure 3. The Instruction Buffer Instruction Decode The AMD-K6-2E+ processor decode logic is flushed and reloaded with word granularity. Most RISC86 operations execute in a single clock. RISC86 ... - RISC86 operations are decoded into as few as a JMP instruction - (two bytes) organization. Some x86 instructions are fixed-length internal instructions. or one RISC86
User Guide
Page 41
...The two parallel short decoders translate the most commonly-used x86 instructions (moves, shifts, branches, ALU, FPU) and the extensions to seven bytes long. a register-to convert x86 instructions into several RISC86 operations. AMD-K6™-2E+ Processor Decode Logic The AMD-K6-2E+ processor... RISC86® Sequencer Vector Address 4 RISC86 Operations Figure 4. operation - More complex x86 instructions are up to the x86 instruction set (including MMX and 3DNow!
User Guide
Page 42
... ESC instruction decode in the scheduler at a time. they are designed to decode up to six groups or 24 RISC86 operations can be placed in the first short decoder. Floating Point Instructions. This decode generates a RISC86 floating...
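The figures quoted above (six groups, 24 RISC86 operations) imply groups of four operations, which matches the "4 RISC86 Operations" label in the decode-logic figure. The Python sketch below is a toy model of a buffer with that capacity. The class name `SchedulerBuffer`, its methods, and the oldest-first retirement policy are assumptions for illustration and are not taken from the data sheet.

```python
from collections import deque

GROUP_SIZE = 4   # RISC86 operations per decode group (inferred from 24 / 6)
MAX_GROUPS = 6   # groups the scheduler buffer can hold at a time


class SchedulerBuffer:
    """Toy model of a 6-group by 4-operation buffer (policy is assumed)."""

    def __init__(self):
        self.groups = deque()

    def capacity_ops(self):
        """Total number of operations the buffer can hold."""
        return GROUP_SIZE * MAX_GROUPS

    def accept(self, ops):
        """Try to add one decode group; return False when the buffer is full."""
        if len(ops) > GROUP_SIZE:
            raise ValueError("a decode group holds at most 4 operations")
        if len(self.groups) == MAX_GROUPS:
            return False
        self.groups.append(list(ops))
        return True

    def retire_oldest(self):
        """Remove and return the oldest group (assumed in-order retirement)."""
        return self.groups.popleft() if self.groups else None
```

In this model the decoders stall (`accept` returns `False`) once six groups are pending, and a slot frees up only when the oldest group retires.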
User Guide
Page 43
...When possible, the scheduler can issue RISC86 operations for optimized execution. MMX™ and 3DNow!™ Instructions. All of the EMMS, FEMMS, and ...22). The main advantage of the scheduler and its operation buffer is the heart of the AMD-K6-2E+ processor (see Figure 5 on -the-fly instruction code scheduling for out-of-order execution, it... locations in parallel and allows the AMD-K6-2E+ processor to four RISC86 operations per clock. A 3DNow!
User Guide
Page 44
...The store and load execution units are two-stage pipelined designs. AMD-K6™-2E+ Processor Scheduler 2.5 Execution Units The AMD-K6-2E+ processor contains ten parallel execution units: store, load, integer X ALU,... RISC86 #0 From Decode Logic RISC86 #1 RISC86 #2 RISC86 #3 Centralized RISC86® Operation Scheduler RISC86 Issue Buses RISC86 Operation Buffer Figure 5. Table 1 on page 24. execution units share the register X and Y issue pipelines. Data memory and register ...
User Guide
Page 45
...-bit and 32-bit operands) Resolves Branch Conditions FADD, FSUB, FMUL 3DNow! writes from stores are held in order. • The load unit performs data memory reads. Table ... instructions) Integer Y Branch FPU 3DNow! Multiply 3DNow! Convert Latency 1 1 2 1 2-3 1 1 1 2 1 1 2 2 2 2 Throughput 1 1 1 1 2-3 1 1 1 1 1 1 2 1 1 1 The Integer X execution unit can operate on page 25) in that it resolves conditional branches such as JCC and LOOP after two clocks. The...
User Guide
Page 46
Register X and Y Pipeline Functional Units In addition, both . Figure 6 shows the details of an integer execution unit and an MMX ALU execution unit, therefore allowing superscalar ... X and Integer Y units. multiplier and MMX shifter, which allows the appropriate RISC86 operation to the 3DNow! ALU, the MMX/3DNow! Register X and Y Pipelines The functional units that consist of the X and Y register pipelines. Each register pipeline has ...
User Guide
Page 47
...table stores executed branch information, predicts individual branches, and predicts the behavior of groups of the unconditional branch. A two-level adaptive history algorithm is implemented in x86 code fit into two categories: • Unconditional branches always change program ...or may not divert program flow (that can minimize or hide the impact of the dynamic branch-prediction mechanism built into the AMD-K6-2E+ processor. The branch logic contains an 8192-entry branch history table, a 16-entry by redirecting instruction fetching to 20% conditional ...
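The excerpt states only that the branch logic uses a two-level adaptive history algorithm and an 8192-entry branch history table. The Python sketch below shows a generic textbook two-level predictor sized to that table. It is not the AMD-K6-2E+ implementation: the history length, the XOR indexing of the table, the 2-bit saturating counters, and the class name `TwoLevelPredictor` are all assumptions chosen for illustration.

```python
BHT_ENTRIES = 8192   # table size stated in the text
HISTORY_BITS = 9     # assumed; the excerpt does not give a history length


class TwoLevelPredictor:
    """Generic two-level adaptive predictor (illustrative, not AMD's design)."""

    def __init__(self):
        # First level: a shift register of recent branch outcomes.
        self.history = 0
        # Second level: 2-bit saturating counters, starting weakly not-taken.
        self.counters = [1] * BHT_ENTRIES

    def _index(self, pc):
        # Assumed indexing: mix the branch address with the outcome history.
        return (pc ^ (self.history << 4)) % BHT_ENTRIES

    def predict(self, pc):
        """Return True when the branch at pc is predicted taken."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the selected counter, then shift the outcome into history."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        mask = (1 << HISTORY_BITS) - 1
        self.history = ((self.history << 1) | int(taken)) & mask
```

Because the counter is selected by both the branch address and the recent outcome pattern, the same branch can be predicted differently depending on what preceded it, which is what separates a two-level scheme from a single table of per-branch counters.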