Bulldozer Design Breakdown
* Two tightly coupled, "conventional" x86 out-of-order processing engines, which AMD internally calls a module
(Single-Module ==> Dual-Core, Dual-Module ==> Quad-Core, Quad-Module ==> Octa-Core, etc.)
* Between 8 MB and 16 MB of L3 cache shared among all modules on the same silicon die
* DDR3-1866 and Higher Memory Level Parallelism
* Dual-channel DDR3 integrated memory controller (support for PC3-15000 (DDR3-1866)) for desktop; quad-channel DDR3 integrated memory controller (support for PC3-15000 (DDR3-1866) and registered DDR3) for server/workstation (new Opteron)
* Cluster Multi-threading (CMT) Technology [6]
* Bulldozer module [7] [8] consists of the following:
o up to 2048kB L2 cache inside each module (shared between the cores in a module)
o 16kB 4-way L1 data cache (way-predicted) per core and 64kB 2-way L1 instruction cache per module (L1 cache figures: Fruehe for THW)
o Two dedicated integer cores
- each consists of 2 ALUs and 2 AGUs, capable of a total of 4 independent arithmetic or memory operations per clock per core
- duplicating the integer schedulers and execution pipelines gives each of the two threads dedicated hardware, which significantly increases performance in multithreaded integer applications
- the second integer core increases the Bulldozer module's die area by around 12%, which at chip level adds about 5% of total die space [9]
o Two symmetrical 128-bit FMAC (fused multiply-add (FMA) capable) floating-point pipelines per module, which can be unified into one large 256-bit wide unit when one of the integer cores dispatches an AVX instruction, plus two symmetrical x87/MMX/3DNow!-capable FPPs for backward compatibility with software not optimized for SSE2
* 32nm SOI process with GF's first-generation High-K Metal Gate (HKMG)
* Support for Intel's future Advanced Vector Extensions (AVX) instruction set, which supports 256-bit floating-point operations, and for SSE4.1, SSE4.2, AES and CLMUL, as well as future 128-bit instruction sets proposed by AMD (XOP, FMA4 and CVT16) [10], which have the same functionality as the SSE5 instruction set formerly proposed by AMD but are compatible with the AVX coding scheme.
* HyperTransport Technology rev. 3.1 (3.20 GHz, 6.4 GT/s, 25.6 GB/s aggregate, 16-bit uplink/16-bit downlink) [first implemented in the HY-D1 revision "Magny-Cours" on the Socket G34 Opteron platform in March 2010 and "Lisbon" on the Socket C32 Opteron platform in June 2010]
* Socket AM3+ (AM3r2)
- 938pin(?), DDR3 support
- retains backward compatibility only with previous Socket AM3/AM2 processors ("new AM3+ socket for consumer versions of Bulldozer CPUs. AM2 and AM3 processors will work in the AM3+ socket, but Bulldozer chips will not work in non-AM3+ motherboards"). For the server segment, Socket G34 (LGA1974) will be used.
* Min-Max Power Usage - 10-100 watts
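
The module and memory figures in the list above can be cross-checked with some simple arithmetic. The following is only a rough sketch in Python: the function and constant names are hypothetical (not AMD terminology), a standard 64-bit (8-byte) DDR3 channel is assumed, and the results are theoretical peaks rather than measured numbers.

```python
# Illustrative arithmetic only. All names here are hypothetical helpers,
# not AMD terminology or any real API.

CORES_PER_MODULE = 2     # two dedicated integer cores per module (from the list)
INT_PIPES_PER_CORE = 4   # 2 ALUs + 2 AGUs per core (from the list)
DDR3_CHANNEL_BYTES = 8   # assumption: standard 64-bit DDR3 channel


def cores(modules: int) -> int:
    """Integer cores exposed by a chip with the given number of modules."""
    return modules * CORES_PER_MODULE


def peak_int_ops_per_clock(modules: int) -> int:
    """Upper bound on independent arithmetic/memory operations per clock."""
    return cores(modules) * INT_PIPES_PER_CORE


def peak_memory_bandwidth_gbs(transfers_mt_s: float, channels: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return transfers_mt_s * DDR3_CHANNEL_BYTES * channels / 1000.0


if __name__ == "__main__":
    for modules in (1, 2, 4):
        print(modules, "module(s):", cores(modules), "cores,",
              peak_int_ops_per_clock(modules), "integer ops/clock (peak)")
    # Dual channel (desktop) versus quad channel (server/workstation)
    for channels in (2, 4):
        print(channels, "channels of DDR3-1866:",
              round(peak_memory_bandwidth_gbs(1866, channels), 1), "GB/s (peak)")
```

A single DDR3-1866 channel works out to roughly 14.9 GB/s under these assumptions, which is where the rounded PC3-15000 label comes from; the quad-channel server controller doubles the dual-channel desktop peak.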