CENTRAL PROCESSING UNIT
A Central Processing Unit (CPU), sometimes simply called a processor, is a class of logic machines that can execute computer programs. This broad definition can easily be applied to many early computers that existed long before the term "CPU" came into widespread usage. The term itself and its initialism have been in use in the computer industry at least since the early 1960s.
Discrete transistor and IC CPUs
The first such improvement came with the advent of the transistor. With this improvement, more complex and reliable CPUs were built onto one or several printed circuit boards containing discrete (individual) components. The integrated circuit (IC) allowed a large number of transistors to be manufactured on a single semiconductor-based die, or "chip," which reduced the quantity of individual ICs needed for a complete CPU. MSI and LSI (medium- and large-scale integration) ICs increased transistor counts to hundreds, then thousands.
Microprocessors
Previous generations of CPUs were implemented as discrete components and numerous small integrated circuits (ICs) on one or more circuit boards. Microprocessors, on the other hand, are CPUs manufactured on a very small number of ICs; usually just one.
Advanced Processors
• The processor is the central component of the PC.
• This vital component is responsible for every single thing the PC does.
• It determines which operating systems can be used, which software packages the PC can run, how much energy the PC uses, and how stable the system will be.
• The processor is also a major determinant of overall system cost: the newer and more powerful the processor, the more expensive the machine will be.
Principle of a Microprocessor
• Microprocessors take signals in the form of 0s and 1s, manipulate them according to a set of instructions, and produce output in the form of 0s and 1s.
• The voltage on the line at the time a signal is sent determines whether the signal is a 0 or a 1.
• On a 3.3-volt system, an application of 3.3 volts means that it is a 1, while an application of 0 volts means it's a 0.
The CPU- The Real Computer
CPU (Central Processing Unit)= A complex collection of electronic circuits on one or more integrated circuits (chips) which:
1. Executes the instructions in a software program
2. Communicates with other parts of the computer system, especially RAM
The CPU is the computer!
Some of the parts of the CPU
Arithmetic Logic Unit (ALU) = area of the CPU responsible for the actual processing “The CPU’s calculator”
Control Unit (CU) = area of the CPU responsible for getting data and instructions from RAM
A CPU can be:
1. A series of integrated circuits (chips) on one or more circuit boards
– Older mainframe and minicomputers
2. On a single integrated circuit known as a microprocessor
microprocessor = a CPU on a single chip
microcomputer = older term for a computer with one or more microprocessors (PC, Macintosh)
The Microprocessor
Compatibility
Why can’t I run Windows software on my Macintosh and vice versa?
• Operating system software is designed to run on one specific type of CPU or “family of CPUs”
• Application software is designed to work with a specific operating system software, thus one specific type of CPU or “family of CPUs”
More “later”
Power of the CPU
1. The number of bits processed
2. The size of the CPU’s data bus
3. The math coprocessor
4. Multiprocessing capabilities
5. Virtual Memory capabilities
6. The Speed of the CPU
7. Internal Cache capabilities
1. The number of bits processed
Early microprocessors:
• 4 bit and 8 bit processors
• Intel 8088, 80286: 16 bit processors
• Intel 80386, 80486, Pentium: 32 bit processors
• Motorola (Apple) 68020, 30, 40 and the PowerPC: 32 bit processors
Current processors
• 32 bit processors
Latest:
• 64 & 128 bit processors
Pentium III
Pentium 4
AMD Processors (continued)
VIA C3 Processor
64-Bit Processors
• Intel Itaniums
• AMD 64-bit processors
The Itanium 2 Processor
AMD 64-Bit Processors
Combination Heat Sink and Cooling Fan
2. The CPU’s Data Bus
Data bus = the number of wires between the CPU and RAM
The more wires (lanes), the faster the CPU gets the data and software to process
Older CPUs: 8 and 16 bit data bus
Newer CPUs: 32 and 64 bit data bus
3. The Amount of RAM the CPU can Recognize
• CPUs are designed to be able to recognize a specific amount of RAM memory
• Today’s microprocessors recognize 4 GB of RAM
• However, motherboards support less, about 2 GB
• Earlier Microprocessors
– Intel 8088 -> 1 MB RAM
– Intel 80286 -> 16 MB RAM
– Intel 80386 & Early Pentiums -> 4 GB RAM
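The limits listed above follow from the number of address lines: each extra line doubles the number of addresses the CPU can select. A minimal Python sketch of that arithmetic follows. The function name is made up for illustration, and the 24-line figure for the 80286 is an assumption inferred from its 16 MB limit.

```python
def addressable_bytes(address_lines: int) -> int:
    """Return how many byte addresses can be selected with this many address lines."""
    return 2 ** address_lines

MB = 1024 ** 2
GB = 1024 ** 3

print(addressable_bytes(20) // MB, "MB")  # 8088: 20 address lines
print(addressable_bytes(24) // MB, "MB")  # 80286: 24 address lines (assumed)
print(addressable_bytes(32) // GB, "GB")  # 80386: 32 address lines
```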
4. The Math Co-processor or FPU (Legacy)
For math intensive applications
• Large spreadsheets
• Graphics
• Animation and video
• CAD (Computer Aided Design)
Early microcomputers
• separate Math Co-processor
Later microprocessors
• built-into the CPU
• faster when inside the chip
5. Multiprocessing Abilities
Multiprocessing = ability of the CPU to process more than one task at a time
Example: Sorting a datafile and calculating a spreadsheet at the same time
All new microprocessors can do multiprocessing
6. The Speed of the CPU
Speed measured in
• Megahertz (MHz) - the number of millions of beats per second
• Gigahertz (GHz) - the number of billions of beats per second
Examples:
• Early CPUs: 4 - 33 MHz
• Current Processors: 3 GHz and more
The faster the CPU, the faster the processing
7. Internal Cache
Internal Cache = memory inside the CPU chip which stores instructions and data which the CPU is currently working on or may soon need.
• The CPU must deliver its data at a very high speed.
• The regular RAM cannot keep up with that speed.
• Therefore, a special RAM type called cache is used as a buffer - temporary storage.
• L1 Cache – Same chip as CPU (fastest)
• L2 Cache – Separate chip
Inside the CPU
• The computer can only do one thing at a time.
• Each action must be broken down into the most basic steps.
• One round of steps from getting an instruction back to getting the next instruction is called the Machine Cycle.
The Machine Cycle
• Fetch - get an instruction from Main Memory
• Decode - translate it into computer commands
• Execute - actually process the command
• Store - write the result to Main Memory
For example, to add the numbers 5 and 6 and show the answer on the screen requires the following steps:
1. Fetch instruction: "Get number at an address in memory nnnn"
2. Decode instruction.
3. Execute: ALU finds the number. (which happens to be 5)
4. Store: The number 5 is stored in a temporary spot in Main Memory.
5-8. Repeat steps 1-4 for the second number (= 6)
9. Fetch instruction: "Add those two numbers"
10. Decode instruction.
11. Execute: ALU adds the numbers.
12. Store: The answer is stored in a temporary spot.
13. Fetch instruction: "Display answer on screen."
14. Decode instruction.
15. Execute: Display answer on screen.
The immense speed of the computer enables it to do millions of such steps in a second.
In fact, MIPS, standing for millions of instructions per second, is one way to measure computer speeds.
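The add-and-display example above can be sketched as a toy simulator in Python. This is only an illustration under stated assumptions: the instruction names (LOAD, ADD, DISPLAY), the memory addresses, and the list used as temporary storage are invented here and do not correspond to any real instruction set.

```python
def run(program, memory):
    """Run a toy program, one machine cycle per instruction."""
    pc = 0        # index of the next instruction
    temp = []     # temporary storage for loaded values and results
    screen = []   # stands in for the display
    while pc < len(program):
        instruction = program[pc]        # Fetch
        op, *args = instruction          # Decode
        if op == "LOAD":                 # Execute, then Store
            temp.append(memory[args[0]])
        elif op == "ADD":
            b = temp.pop()
            a = temp.pop()
            temp.append(a + b)
        elif op == "DISPLAY":
            screen.append(temp[-1])
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return screen

memory = {100: 5, 101: 6}   # hypothetical addresses holding 5 and 6
program = [("LOAD", 100), ("LOAD", 101), ("ADD",), ("DISPLAY",)]
screen = run(program, memory)
print(screen)
```

Each pass through the loop is one machine cycle; a real CPU performs the same stages in hardware rather than in an interpreter loop.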
Apple G5
• G5 drives the largest performance gain in the history of the PowerPC.
• The 64-bit G5 offers speeds up to 2.5GHz and can address up to 8GB of main memory.
CPU operation
The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a sequence of stored instructions called a program. The program is represented by a series of numbers that are kept in some kind of computer memory. There are four steps: fetch, decode, execute, and writeback. The first step, fetch, involves retrieving an instruction from program memory. The instruction that the CPU fetches from memory is used to determine what the CPU is to do. In the decode step, the instruction is broken up into parts that have significance to other portions of the CPU.
Often, one group of numbers in the instruction, called the opcode, indicates which operation to perform. The remaining parts of the number usually provide information required for that instruction, such as operands for an addition operation. After the fetch and decode steps, the execute step is performed. During this step, various portions of the CPU are connected so they can perform the desired operation. If, for instance, an addition operation was requested, an arithmetic logic unit (ALU) will be connected to a set of inputs and a set of outputs. The final step, writeback, simply "writes back" the results of the execute step to some form of memory. After the execution of the instruction and writeback of the resulting data, the entire process repeats, with the next instruction cycle normally fetching the next-in-sequence instruction because of the incremented value in the program counter.
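To make the decode step concrete, here is a small Python sketch that splits an instruction word into an opcode and operand fields. The 16-bit layout (four 4-bit fields) is a hypothetical format chosen for illustration; real CPUs each define their own encodings.

```python
def decode(word: int) -> dict:
    """Split a 16-bit instruction word into an opcode and three 4-bit operand fields."""
    return {
        "opcode": (word >> 12) & 0xF,  # top 4 bits: which operation to perform
        "dest": (word >> 8) & 0xF,     # register that receives the result
        "src1": (word >> 4) & 0xF,     # first operand register
        "src2": word & 0xF,            # second operand register
    }

fields = decode(0x1312)
print(fields)
```

With this made-up layout, the word 0x1312 decodes to opcode 1 with destination register 3 and source registers 1 and 2.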
Clock rate
Most CPUs, and indeed most sequential logic devices, are synchronous in nature. That is, they are designed and operate on assumptions about a synchronization signal. This signal, known as a clock signal, usually takes the form of a periodic square wave. By calculating the maximum time that electrical signals can move in various branches of a CPU's many circuits, the designers can select an appropriate period for the clock signal.
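The relationship between signal delays and clock period can be sketched in Python as follows. The path delays are hypothetical numbers, and the sketch ignores safety margins that a real design would add.

```python
def min_clock_period_ns(path_delays_ns):
    """The clock period must be at least as long as the slowest signal path."""
    return max(path_delays_ns)

def max_frequency_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns

delays = [12.0, 25.0, 30.0, 18.5]   # hypothetical path delays, in ns
period = min_clock_period_ns(delays)
print(period, "ns ->", round(max_frequency_mhz(period), 1), "MHz")
```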
Summary about CPU performance
• MOST OBVIOUS: Processor Clock Frequency
• Increased frequency – increased execution rate
• State of the Art: >2GHz (Jan 2002)
• Memory and I/O access times can be performance bottleneck – unless you take some special measures
• ALU register width
– A processor is an N-bit processor, where N represents the precision of the ALU – N can be 4, 8, 16, 32, or 64
– The wider the registers – the more processing per clock
• Data bus width
– The wider the data bus the faster we can transfer data
– Since the memory and I/O device access times are finite, the more bits transferred per cycle the better
• Address bus width
• Increased address width doesn’t provide a ‘speed’ increase as such
• CPU can directly address more memory
• PCs use big programs, which would not fit in a smaller address space
• Overcoming small address space takes time
– Impacts on overall system performance
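The data bus point above can be illustrated with a short Python sketch that counts how many bus transfers are needed to move a block of data. The function name and the 64-byte block size are assumptions for illustration, and the sketch ignores wait states and burst modes.

```python
def bus_transfers(num_bytes: int, bus_width_bits: int) -> int:
    """Return the number of bus transfers needed to move num_bytes of data."""
    total_bits = num_bytes * 8
    # Round up: a partly filled final transfer still costs a full transfer.
    return -(-total_bits // bus_width_bits)

for width in (8, 16, 32, 64):
    print(width, "bit bus:", bus_transfers(64, width), "transfers")
```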
HISTORY
Intel 8088
Intel 8086 (1978)
The 8086 was a true 16-bit processor and talked with its cards via a 16-wire data connection. The chip contained 29,000 transistors and 20 address lines that gave it the ability to talk with up to 1 MB of RAM. It was available in 5, 6, 8, and 10 MHz versions.
Intel 8088 (1979)
The 8088 is nearly identical to the 8086. The main difference is its external data bus, which is 8 bits wide rather than the 16 bits of the 8086. This chip was the one chosen for the first IBM PC and, like the 8086, it is able to work with the 8087 math coprocessor chip.
8086/8088 Functional Units
•8086/8088 consists of two internal units
–The execution unit (EU) - executes the instructions
–The bus interface unit (BIU) - fetches instructions, reads operands and writes results
•The 8086 has a 6-byte prefetch queue
•The 8088 has a 4-byte prefetch queue
NEC V20 and V30 (1981)
Clones of the 8088 and 8086. They are supposed to be about 30% faster than the Intel ones, though.
8086/8088 Summary
•First Generation (introduced June 1978)
•One of the first 16-bit processors on the market
•16-bit internal registers
•16/8-bit external data bus
•20-bit address bus (1MB addressable)
•Used in 1st generation IBM PCs (1981)
Intel 80186 (1980)
The 186 was a popular chip. In 1990, Intel came out with the Enhanced 186 family. They all shared a common core design. They had a 1-micron core design and ran at about 25MHz at 3 volts. The 80186 contained a high level of integration, with the system controller, interrupt controller, DMA controller and timing circuitry right on the CPU.
2nd Generation Processor 286
•P2 (286) = 2nd Generation Processor
•Introduced in 1982
•CPU behind IBM AT
•Throughput of original IBM AT (6MHz) was about 500% of IBM PC (4.77MHz)
•Level of integration: 134k transistors (vs 29k in 8086)
•Still a 16-bit processor…
•Available in higher clock frequencies: up to 25MHz
2nd Generation Processors 286
•Fully backwards compatible to 8086
80286 runs 8086 software without modification
•Improved instruction execution
Average instruction takes 4.5 cycles vs. 12 cycles (8086)
•Improved instruction set
•Real mode and Protected Mode
Multitasking-support. What happens in one area of memory doesn’t affect other programs. Protected mode supported by Windows 3.0.
•16MB addressable physical memory
•On-chip MMU (1GB virtual memory)
•Non-multiplexed address-bus and data-bus
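As a rough worked example of the cycle counts quoted above, the following Python sketch estimates instruction rate from clock frequency and average cycles per instruction. It considers only those two numbers, so it is not expected to reproduce measured system throughput, which also depends on factors such as bus width and memory speed.

```python
def instructions_per_second(clock_hz: float, cycles_per_instruction: float) -> float:
    """Estimate instruction rate from clock frequency and average cycles per instruction."""
    return clock_hz / cycles_per_instruction

at_rate = instructions_per_second(6_000_000, 4.5)   # 6 MHz IBM AT, 4.5 cycles/instruction
pc_rate = instructions_per_second(4_770_000, 12)    # 4.77 MHz IBM PC, 12 cycles/instruction
print(round(at_rate), round(pc_rate), round(at_rate / pc_rate, 2))
```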
Intel 80286 (1982)
A 16-bit, 134,000-transistor processor capable of addressing up to 16 MB of RAM. In addition to the increased physical memory support, this chip is able to work with virtual memory, thereby allowing much room for expandability. The 286 was the first “real” processor. It introduced the concept of protected mode, which provides the ability to multitask, having different programs run separately but at the same time. One drawback, though, was that while the chip could switch from real mode to protected mode, it could not switch back to real mode without a warm reboot. This chip was used by IBM in its Advanced Technology PC/AT and in a lot of IBM-compatibles. It ran at 8, 10, and 12.5 MHz, but later editions of the chip ran as high as
3rd Generation Processor 386
•P3 (386) = 3rd Generation Processor
•Introduced: 10/1985
•Full 32-bit processor
(32-bit registers. 32-bit internal and external databus. 32-bit address bus)
•275k transistors. CMOS. 132-pin PGA package.
(Supply current Icc=400mA. Roughly the same as 8086 !)
•Clock speeds: 16-33MHz
•P3 processors were far ahead of their time:
It took 10 years before 32-bit operating systems became mainstream!
•First 386 PCs early 1987
(COMPAQ)
3rd Generation Processor 386
•Modes of operation:
–Real. Protected. Virtual Real.
•Protected mode of 386 is fully compatible with 286
Protected mode=native mode of operation. Chips are designed for advanced operating systems such as Windows NT
•New virtual real mode
Processor can run with hardware memory protection while simulating the 8086’s real-mode operation. Multiple copies of e.g. DOS can run simultaneously, each in a protected area of memory. If a program in one memory area crashes, the rest of the system is protected.
80386 Operating Modes
•Protected Mode for Multitasking support
•Real Mode (native 8086 mode)
–Processor powers up in Real Mode
•System Management Mode
–Power management or system security
–Processor switches to separate address space, while saving the entire context of the currently running program or task
Intel 386 (1985 - 1990)
The 386 was a 32-bit processor, meaning its data throughput was immediately twice that of the 286. Containing 275,000 transistors, the 80386DX processor came in 16, 20, 25, and 33 MHz versions. The 32-bit address bus allowed the chip to work with a full 4 GB of RAM and a staggering 64 TB of virtual memory. While the chip could run in both real and protected mode, it could also run in virtual real mode, allowing several real mode sessions to be run at a time. In 1988, Intel released the 386SX, which was basically a low-fat version of the 386. It used a 16-bit data bus rather than a 32-bit one, so it was slower, but it used less power, which enabled Intel to promote the chip into desktops and even portables. In 1990, Intel released the 80386SL, which was basically an 855,000-transistor version of the 386SX processor. 386 chips were designed to be user friendly.
80386: Classic CISC Processor
•CISC = Complex Instruction Set Computer
•Complex instructions
•...but code-size efficient
•Micro-encoding of the machine instructions
•Extensive addressing capabilities for memory operations
•Few, but very useful CPU registers
80386 Complex Instructions
•CISC drawback: Most instructions are so complicated, they have to be broken into a sequence of micro-steps
•These steps are called Micro-Code
•Stored in a ROM in the processor core
•Micro-code ROM: Access-time and size...
•They require extra ROM and decode logic
RISC: Less is More
•RISC = Reduced Instruction Set Computer
•20/80 Rule: 20% of the instructions take up 80% of the time
•Sometimes executing a sequence of simple instructions runs quicker than a single complex machine instruction that has the same effect
RISC Ideas (1)
•Reduce the instruction set to simplify the decoding
–Smaller Instruction Set -> Simpler Logic -> Smaller Logic -> Faster Execution
•Eliminate microcode – hardwire all instruction execution
•Pipeline instruction decoding and executing – do more operations in parallel
Superscalar Architecture:
•The processor may have more than one pipeline (Pentium…)
•Where possible each pipeline works independently
–Not always possible
•May achieve average completed execution of more than one instruction per clock cycle
Getting the Benefits of Pipelining
•Simplified Instruction decoding
–Simpler, faster logic
•On-chip cache memories
–Local memory on-chip to avoid memory access bottlenecks
•Floating Point pipeline for FP coprocessor
•Speculative Execution to get around pipeline flushes
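The benefit of pipelining can be estimated with the standard idealised cycle counts, sketched below in Python. The sketch assumes every stage takes one clock cycle and that there are no stalls or flushes, which real pipelines do not guarantee; the stage and instruction counts are example values.

```python
def cycles_unpipelined(instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages

def cycles_pipelined(instructions: int, stages: int) -> int:
    """The first instruction fills the pipeline; after that one completes per cycle."""
    return stages + instructions - 1

n, k = 100, 5   # example values: 100 instructions, 5-stage pipeline
print(cycles_unpipelined(n, k), cycles_pipelined(n, k))
```

For long instruction sequences the ideal speedup approaches the number of stages.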
4th Generation Processor
80486: IA-32 with RISC elements
•Introduced 04/89
•Greatly improved 80386 CPU
•Hard-wired implementation of frequently used instructions (as in RISCs). On average 2 clock cycles/instruction.
•5 stage instruction pipeline
•Internal L1 Cache Memory (8kB) + cache controller
•On-chip Floating Point coprocessor (FPU)
•Longer Prefetch Queue (32-bytes as opposed to 16 on the 80386)
•Higher frequency operation: up to 120MHz
•>1.2M transistors, 0.8 µm CMOS. 168-pin PGA.
Intel 486 (1989 - 1994)
The 80486DX was a 32-bit processor containing 1.2 million transistors. It had the same memory capacity as the 386 (both were 32-bit) but offered twice the speed at 26.9 million instructions per second (MIPS) at 33 MHz. The 486 was the first to have an integrated floating point unit (FPU) to replace the normally separate math coprocessor (not all flavors of the 486 had this, though). It also contained an integrated 8 KB on-die cache. This increases speed by using instruction pipelining to predict the next instructions and then storing them in the cache. Also, the 486 came in 5 volt and 3 volt versions, allowing flexibility for desktops and laptops. The members of the 486 family were the i486DX, 486SX, 486DX/50, i486DX2/50, and i486DX2/66. In 1992, Intel also put out the 486SL, which contained 1.4 million transistors.
The Pentium Pro (1995-1999)
The Pentium Pro (also called “P6″ or “PPro”) is a RISC chip with a 486 hardware emulator on it, running at 200 MHz or below. Several techniques are used by this chip to produce more performance than its predecessors. Increased speed is achieved by dividing processing into more stages, and more work is done within each clock cycle. Three instructions can be decoded in each clock cycle, as opposed to only two for the Pentium. It has two separate 8K L1 caches (one for data and one for instructions), and up to 1 MB of onboard L2 cache in the same package. The onboard L2 cache increased performance in and of itself because the chip did not have to make use of an L2 cache on the motherboard. The PPro is optimized for 32-bit code, so it will run 16-bit code no faster than a Pentium, which is a big drawback.
5th Gen. Processor: Pentium
•Pentium = P5 (586) = 5th Generation Processor
(trademarking a number designation not possible)
•Introduced: 03/1993
(Pentium-PCs followed a few months later)
•Superscalar technology
(2 instruction pipelines for execution of up to 2 instructions per clock cycle)
•Branch prediction
(to avoid flushing the instruction queue and pipeline at branch-taken event)
•Internal 8kB caches for code and data
(but external L2 cache)
•Addressbus: 32b. External Databus: 64b
But not a 64-bit processor! Internal data paths up to 256b wide
5th Gen. Processor: Pentium
•Pipelined FPU
(2..10 times faster than 486 FPU. FDIV bug! Free replacement…)
962,306,957,033 / 11,010,046 = 87,402.6282027341 (correct answer)
962,306,957,033 / 11,010,046 = 87,399.5805831329 (flawed Pentium)
•Burst-mode bus cycles
(fast data transfer from memory to cache)
•>3M transistors. BiCMOS. 0.8 µm..0.35 µm.
•Supply voltages: 5V..2.9V
•Packages: PGA273 and SPGA296
(up to 16W power dissipation! Forced-convection cooling: fan)
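The size of the FDIV error quoted above can be checked with a few lines of Python. The flawed value is copied from the figures above; the correct value is recomputed with ordinary floating-point division.

```python
numerator = 962_306_957_033
denominator = 11_010_046

correct = numerator / denominator
flawed = 87_399.5805831329          # result quoted above for the flawed Pentium

absolute_error = correct - flawed
relative_error = absolute_error / correct
print(correct, absolute_error, relative_error)
```

The relative error comes out at a few parts in 100,000, far larger than normal floating-point rounding error.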
5th Gen. Processor: Pentium
•Clock speeds: 60-266MHz
•Clock multiplier circuitry
Processor runs faster than the system bus. Motherboard bus speeds 50, 60, 66MHz.
•System management mode (SMM)
(full control over power management features)
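Clock multiplication is simple arithmetic: the core frequency is the bus frequency times the multiplier. The Python sketch below uses the bus speeds listed above; the multiplier values are illustrative assumptions, not a list of shipped configurations.

```python
def core_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock = system bus clock x clock multiplier."""
    return bus_mhz * multiplier

for bus, mult in [(50, 1.5), (60, 1.5), (66, 2.5)]:   # illustrative pairs
    print(bus, "MHz bus x", mult, "=", core_mhz(bus, mult), "MHz core")
```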
P5 Evolution: Pentium MMX
•Pentium P5 with MMX Extensions
•Introduced: 01/1997
•D-bus: 64b. A-bus: 32b
•Vcc: 1.8V-2.8V
•66-266MHz
•L1-caches: 16kB code and data (write-back). 4-way set associative. More write buffers.
•4.5M transistors. 0.25/0.35 µm BiCMOS.
•321-pin socket 7
P5 Evolution: Pentium MMX
•MMX
–MMX = Multi-media Extensions
To meet growing importance and increasing demands of multi-media and communication applications
–57 new instructions
New instructions designed specifically to handle video and audio data
–SIMD = Single Instruction Multiple Data
One instruction performs the same function on many pieces of data
–MMX is pipelined
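The SIMD idea can be sketched in plain Python: one operation adds eight 8-bit values packed into a single 64-bit word, with each lane wrapping independently. This only emulates the concept in software; the function name is made up, and it is not a model of any specific MMX instruction.

```python
def packed_add_bytes(a: int, b: int) -> int:
    """Add eight unsigned 8-bit lanes packed in two 64-bit words, wrapping each lane."""
    result = 0
    for lane in range(8):
        shift = 8 * lane
        x = (a >> shift) & 0xFF
        y = (b >> shift) & 0xFF
        # Mask the sum so a carry never spills into the neighbouring lane.
        result |= ((x + y) & 0xFF) << shift
    return result

total = packed_add_bytes(0x0102030405060708, 0x1010101010101010)
print(hex(total))
```

A SIMD unit performs all eight lane additions at once in hardware, whereas this loop handles them one after another.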
6th Gen. Processor: P6
•P6 Processor Variations:
–Pentium Pro
Original P6 processor. L2 cache: 256kB, 512kB or 1MB (full-core speed)
–Pentium II
P6 with L2-cache: 512kB (half-core speed)
–Pentium II Xeon
P6 with L2-cache: 512kB/1MB/2MB (full-core speed)
–Celeron
P6 without L2 cache
–Celeron-A
P6 with L2-cache: 128kB on-die (full-core speed)
–Pentium III
P6 with SSE (MMX2), L2-cache: 256kB on-die (half-core speed)
–Pentium III Xeon
P6 with SSE (MMX2), L2-cache: 512kB/1MB/2MB on-die (full-core speed)
P6: Main New Features...
•Other new features:
–A few new instructions
–Enhanced multi-processor support
•Only recent Windows Versions (NT/2000/XP) do take full advantage of the P6’s capabilities
•Use optimising compilers
–to make code as “predictable” as possible
P6: Pentium Pro
•Introduced: 11/1995
(before P5 MMX)
•Outstanding feature:
Integrated L2 cache
•Multi-chip module (MCM)
Dual-cavity PGA
•2 silicon dies:
Processor & L2 cache (256kB, 512kB, 1MB)
5.5M + 63M = 68M transistors
•Packaging = extremely expensive !
P6: Pentium II
•Introduced: 05/1997
•Abandoned: chip-in-a-socket
•Introduced: 242-pin SEC cartridge
•Much less expensive to manufacture
(at the time!)
P6: Pentium II
•Processor core speeds: 233-450MHz
•Bus speeds: 66-100MHz
•7.5M transistors. 0.25/0.35 µm BiCMOS.
•MMX
•Power dissipation up to >40W!
Heatsinks and fans required!
•A-bus: 36b
Addressable: 64GB
•L2 cache
Half core-speed.
Supports up to 512MB
P6: Celeron
•Cheaper packaging (SEP)
No fancy plastic cartridge
•Specifically designed for lower-cost PCs
•L2 cache support up to 4GB of RAM
•MMX
•L1 cache: 2* 16kB
•Integral thermal diode for temperature monitoring
•0.25/0.18 µm technology
P6: Pentium III
•Introduced: 02/1999
•28M transistors
•0.18 µm Coppermine technology
Interconnect: Copper rather than Aluminium/Tungsten to reduce signal diffusion…
•Major improvements:
–SSE (Streaming SIMD Extensions)
–Integrated on-die L2 cache
•Available up to 1GHz
7th Gen. Processor: Pentium 4
•Introduced: 11/2000. Also called “NetBurst”
•Main technical details:
–Core speed range 1.3GHz..3GHz…?
–42M transistors. 0.18 µm.
–System (front-side) bus: up to 400MHz
–ALU runs at twice the processor core frequency
–Hyper-pipelined 20-stage technology
–Very deep out-of-order instruction execution
–20kB L1 cache. 256kB full-speed L2-cache. 8-way set associative. L2 supports up to 4GB RAM and ECC.
–SSE2 – 144 new SSE2 instructions
–Socket 432. Up to 64W of power dissipation.
Pentium II (1997)
Intel made some major changes to the processor scene with the release of the Pentium II. They had the Pentium MMX and Pentium Pro out in the market in a strong way, and they wanted to bring the best of both into one chip. The Pentium II is optimized for 32-bit applications. It also contains the MMX instruction set, which was almost a standard by this time. The chip uses the dynamic execution technology of the Pentium Pro, allowing the processor to predict coming instructions, accelerating work flow. The Pentium II has 32KB of L1 cache (16KB each for data and instructions) and has 512KB of L2 cache on package. The L2 cache runs at ½ the speed of the processor, not at full speed. Nonetheless, the fact that the L2 cache is not on the motherboard, but instead in the processor package itself, boosts performance. Pentium Pros use Socket 8. The Pentium II, however, makes use of “Slot 1″. The package type of the P2 is called Single-Edge Contact (SEC). The chip and L2 cache actually reside on a card which attaches to the motherboard via a slot, much like an expansion card. The entire P2 package is surrounded by a plastic cartridge.
Celeron (1998)
The Celeron is very cheap. To get there, Intel removed the L2 cache from the Pentium II. They also removed the support for dual processors, an ability that the Pentium II had. Additionally, they ditched the plastic cover which the P2 had, leaving simply the processor on the Slot 1 style card. This, no doubt, reduced the cost of the processor quite a bit, but performance suffered noticeably. Removing the L2 cache from a chip seriously hampers its performance. On top of that, the chip was still limited to the 66MHz system bus. Intel corrected this mistake with the next edition of the Celeron, the Celeron 300A. The 300A came with 128KB of L2 cache on board. The Celeron is available in two formats. The original Celerons used the patented Slot 1 interface, but Intel later switched over to a PPGA format, or Plastic Pin Grid Array, also known as Socket 370. Slot 1 Celerons ranged from the original 233MHz up to 433 MHz, while Celerons 300MHz and up were available in Socket 370.
Pentium III (1999)
The original Pentium III core, known as Katmai, ran at 450 MHz on a 100MHz bus. It introduced the SSE instruction set, which was basically an extension of MMX that again improved the performance of 3D apps designed to use the new ability. SSE contained 70 new instructions, with up to four able to be performed simultaneously. In April of 2000, Intel released their Pentium III Coppermine. While Katmai had 512 KB of L2 cache, Coppermine had half that at only 256 KB. But the cache was located directly on the CPU core rather than on the daughtercard as typified in previous Slot 1 processors. Coppermine also supported the 133 MHz front side bus. Coppermine proved to be a performance chip and it was and still is used by many PCs. Coppermine eventually saw 1+ GHz.
Celeron II (2000)
The Celeron II is simply a Celeron with SSE, SSE2, and a few added features. The chip is available from 533 MHz to 1.1 GHz. This chip was basically an enhancement of the original Celeron, and it was released in response to AMD’s coming competition in the low-cost market with the Duron. Due to some inefficiencies in the L2 cache and its continued use of the 66MHz bus, this chip would not hold up too well against the Duron despite being based on the trusted Coppermine core. The Celeron II would not be released with true 100 MHz bus support until the 800MHz edition, which was put out at the beginning of 2001.
Pentium IV (2000)
While we have been talking about AMD’s high-speed Athlon Thunderbirds and Palominos, Intel actually beat AMD to the gun by releasing Pentium IV Willamette in November of 2000. Pentium IV was exactly what Intel needed to again take the torch from AMD. Pentium IV is a truly new CPU architecture and serves as the beginning to new technologies we will see for the next several years. The new NetBurst architecture is designed with future speed increase in mind, meaning P4 is not going to fade away quickly like Pentium III near the 1 GHz mark.
According to Intel, NetBurst is made up of four new technologies: Hyper Pipelined Technology, Rapid Execution Engine, Execution Trace Cache and a 400MHz system bus
The difference between Core 2 Duo and Core Duo (Dual Core)
Dual core is simply a generic term referring to any processor package with two physical CPUs in one.
The Pentium D is simply two Pentium 4 Prescott CPUs inefficiently paired together and run as a dual core.
The Core Duo is Intel's first generation dual core processor based upon the Pentium M (a Pentium III-4 hybrid) made mostly for laptops (though a few motherboard manufacturers have released desktop boards supporting the Core Duo CPU), and is much more efficient than the Pentium D.
The Core 2 Duo is Intel's second generation (hence, Core 2) processor made for desktops and laptops designed from the ground up to be fast while not consuming nearly as much power as previous CPUs.
The Pentium D, Core Duo, Core 2 Duo and Athlon X2 are all current CPUs that have dual cores in one package.
Note - Intel has dropped the Pentium name in favor of the Core architecture.
Intel Core 2
The Core 2 brand was introduced on July 27, 2006[3], comprising the Solo (single-core), Duo (dual-core), and Quad (quad-core) branches, with the Extreme branch (dual- or quad-core CPUs for enthusiasts) following during 2007.[4] Intel Core 2 processors with vPro technology (designed for businesses) include the dual-core and quad-core branches.[5]
Duo, Quad, and Extreme
The Core 2-branded CPUs include: "Conroe" and "Allendale" (dual-core for higher- and lower-end desktops), "Merom" (dual-core for laptops), "Kentsfield" (quad-core for desktops), and their variants named "Penryn" (dual-core for laptops), "Wolfdale" (dual-core for desktops) and "Yorkfield" (quad-core for desktops). (Note: For the server and workstation "Woodcrest", "Clovertown", and "Tigerton" CPUs see the Xeon brand[6].) The Core 2 branded processors featured the Virtualization Technology (except T52x0, T5300, T54x0, T55x0 with stepping "B2", E2xx0, E4x00 and E8190 models), Execute Disable Bit, and SSE3. Their Core microarchitecture introduced also SSSE3, Trusted Execution Technology, Enhanced SpeedStep, and Active Management Technology (iAMT2). With a Thermal Design Power (TDP) of up to only 65 W, the Core 2 dual-core Conroe consumed only half the power of less capable, but also dual-core Pentium D-branded desktop chips[7] with a TDP of up to 130 W[8] (a high TDP requires additional cooling that can be noisy or expensive).
CISC, RISC, EPIC & VLIW
•CISC : Complex Instruction Set Computer.
• CISC chips have a large amount of different and complex instructions.
• In general, CISC chips are relatively slow per instruction compared to RISC chips, but need fewer instructions than RISC chips to do the same work.
•RISC : Reduced Instruction Set Computer.
• RISC chips evolved around the mid-1980s as a reaction to CISC chips.
• The advantage of RISC is that, because the instructions are simple, RISC chips require fewer transistors, which makes them easier to design and cheaper to produce.
•EPIC : Explicitly Parallel Instruction Computing.
• EPIC can execute many instructions in parallel.
• EPIC was created by Intel and is in a way a combination of both CISC and RISC.
• This will in theory allow the processing of Windows-based as well as UNIX-based applications by the same CPU.
• Microsoft developed their Win64 standard for it. EPIC is a 64-bit chip.
•VLIW : Very Long Instruction Word.
• The VLIW processor uses instructions that are long.
• The idea is to put many instructions together in one.
• Then the processor can fetch several instructions in one operation and be more efficient.