Cpu | Sanketh's Blog

x86 Assembly Part 1: Registers

When learning assembly, it’s easy to get lost in the “why” of CPU design, but this blog will stay focused on the x86 instruction set itself. The goal here isn’t to study computer architecture or dive into microarchitectural details — instead, we’ll build a working reference for how to write and understand x86 assembly code. Everything that follows is about the x86 family of processors, starting from the registers that form the foundation of all instructions. ...

Hello World in Real Mode

When your x86 computer first starts up, it’s in a surprisingly primitive state:

No operating system - Obviously, since we haven’t loaded one yet
No memory management - No virtual memory, no protection between processes
No file system - Can’t open files, no directories, no abstraction layer
No network stack - No TCP/IP, no internet connectivity
No device drivers - No USB drivers, no graphics drivers, nothing

What Services Are Available at Boot Time?

Despite the barren landscape, the BIOS (Basic Input/Output System) gives us a few essential tools: ...

How Does the CPU Communicate With Peripheral Devices

Introduction: The Communication Challenges

At its core, a CPU is designed for one primary task: processing data and executing instructions at incredible speed. But this processing power becomes meaningful only when it can interact with the rich ecosystem of peripheral devices that extend its capabilities.

Why CPUs Need to Talk to Many Different Devices?

Your CPU must read input from your mouse or keyboard, process that input to understand your intent, communicate with memory to load the browser application, send rendering commands to your graphics card, request data from your network interface to load the webpage, and potentially write temporary files to your storage device. Each of these interactions involves a different type of peripheral device, each with its own communication requirements, data formats, and timing constraints. ...

Processor Modes in x86

The 8086 Processor

A Brief History

The Intel 8086, released in 1978, marked a pivotal moment in computing history as Intel’s first 16-bit microprocessor. Designed by a team led by Stephen Morse, the 8086 was Intel’s answer to the growing demand for more powerful processors that could handle larger programs and address more memory than the existing 8-bit chips of the era. The processor introduced the x86 architecture that would become the foundation for decades of computing evolution. With its 16-bit registers and 20-bit address bus, the 8086 could access up to 1 megabyte of memory, a massive improvement over the 64KB limitation of 8-bit processors. However, it retained backward compatibility concepts that would prove both beneficial and constraining for future generations. ...

The Fetch Decode Execute Cycle

The fetch-decode-execute cycle, or instruction cycle, is how a CPU executes programs. During this cycle, the CPU retrieves an instruction from memory (fetch), interprets what action is required (decode), and then carries out the necessary operations to complete the instruction (execute). This cycle is crucial for the CPU to perform any computational task, and it repeats continuously while the computer is powered on.

What is Machine Code?

Machine code is the lowest-level programming language; it consists of binary instructions executed directly by a CPU. Any program that is compiled to a binary executable is transformed into machine code. Machine code consists of a set of instructions defined by the Instruction Set Architecture (ISA) of each CPU. The ISA, determined by the CPU manufacturer, varies across different architectures such as ARM, MIPS, and x86. This architecture-specific design means that machine code written for one type of CPU cannot be directly executed on another without translation or emulation. ...
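
To make the fetch, decode, and execute steps concrete, here is a minimal sketch of the loop for a made-up machine with one accumulator register. The opcodes (`LOAD`, `ADD`, `HALT`), the two-cell instruction encoding, and the `run` function are all hypothetical and chosen only for illustration; they are not x86 machine code.

```python
# Toy fetch-decode-execute loop for a hypothetical accumulator machine.
# The opcodes and the encoding below are invented for this sketch.

LOAD, ADD, HALT = 0x01, 0x02, 0xFF  # made-up opcode values


def run(memory):
    """Execute the program stored in `memory` and return the accumulator."""
    pc = 0   # program counter: address of the next instruction
    acc = 0  # the machine's only register
    while True:
        opcode = memory[pc]          # fetch the instruction at pc
        if opcode == HALT:           # decode: HALT takes no operand
            return acc
        operand = memory[pc + 1]     # LOAD and ADD carry one operand
        if opcode == LOAD:           # execute: replace the accumulator
            acc = operand
        elif opcode == ADD:          # execute: add to the accumulator
            acc += operand
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at address {pc}")
        pc += 2                      # move on to the next instruction


program = [LOAD, 5, ADD, 7, HALT]
print(run(program))  # prints 12
```

The same list of numbers would mean something else entirely on a machine with a different opcode table, which is the sense in which machine code is tied to one ISA.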

Measuring CPU Performance

CPU manufacturers publish several metrics for their CPUs, such as clock speed, number of cores, cache sizes, ISA, performance per watt, number of transistors, and more. Measuring CPU performance is complex, and it cannot be summarized by a single metric. In this post, I’ll explore each of these metrics and discuss some standard benchmarking software and their limitations.

What is Clock Speed, and How Does it Affect CPU Performance?

All synchronous digital electronic circuits require an externally generated time reference. This is usually a square wave signal provided to the circuit, called the clock. A clock cycle is the fundamental unit of time measurement for a CPU. It is a single electrical pulse, during which the CPU can execute a fundamental operation such as accessing memory, writing data, or fetching a new set of instructions. A clock cycle is measured as the amount of time between two consecutive pulses of an oscillator. The clock speed of a CPU is measured in Hertz (Hz), which signifies the number of clock cycles it can complete in one second. Common units are Megahertz (MHz) and Gigahertz (GHz). ...
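
Since clock speed is the number of cycles per second, the duration of one cycle is simply its reciprocal. The sketch below works through that arithmetic; the function names and the example frequencies are illustrative assumptions, not figures from any particular CPU.

```python
# Relating clock speed (cycles per second) to the length of one cycle.
# Function names and sample frequencies are illustrative only.

def cycle_time_ns(clock_hz):
    """Duration of a single clock cycle, in nanoseconds."""
    return 1e9 / clock_hz


def cycles_in(seconds, clock_hz):
    """Number of clock cycles that elapse in the given time."""
    return seconds * clock_hz


print(cycle_time_ns(1_000_000))      # 1 MHz -> 1000.0 ns per cycle
print(cycle_time_ns(4_000_000_000))  # 4 GHz -> 0.25 ns per cycle
print(cycles_in(1, 4_000_000_000))   # 4000000000 cycles in one second
```

A shorter cycle only bounds how often the CPU can start work; how much gets done per cycle depends on the other metrics listed above.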

Key Differences between 32-bit and 64-bit CPU architectures

The terms 32-bit and 64-bit refer to the size of the data and address registers within the CPU, which determines the maximum amount of memory that can be directly accessed and the range of values that can be processed.

1. Registers and Data Width: Since all calculations take place in registers, when performing operations such as addition or subtraction, variables are loaded from memory into registers if they are not already there. A 32-bit CPU has 32-bit wide registers, meaning it can process 32 bits of data in a single instruction.

2. Memory Addressing: A 32-bit CPU can address up to 2^32 unique memory locations, which translates to a maximum of 4 GB of addressable memory (RAM) when each location holds one byte. This limitation comes from the fact that a 32-bit CPU can only load integers that are 32 bits long, which caps the size of an address. A 64-bit CPU can address up to 2^64 unique memory locations, allowing for a theoretical maximum of 16 exabytes of addressable memory.

3. Data Transfer Speeds: The memory bus width in a 64-bit CPU is often 64 bits or more, meaning the physical path between the CPU and RAM can handle 64 bits of data in parallel. This helps in efficiently loading data into the cache, but the CPU is not restricted to always reading 64 bits at a time. It can access smaller data sizes (e.g., 8-bit, 16-bit, 32-bit) as needed, depending on the specific instruction and data type.

4. Performance: 64-bit CPUs generally perform better than 32-bit CPUs. This difference comes from several factors, such as the size of registers, the addressable memory space, and the larger bus width. Some RISC architectures also support SIMD (Single Instruction, Multiple Data) instructions that allow for parallel processing of multiple smaller data types within larger registers. For example, ARM’s NEON technology can operate on multiple 32-bit integers within 64-bit registers.

5. Application Compatibility: 64-bit operating systems typically include backward compatibility to run 32-bit software seamlessly. These compatibility layers allow 32-bit applications to execute on 64-bit systems without any major issues. However, 32-bit applications may not fully utilize the advantages of 64-bit systems, such as increased memory addressing capabilities.
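
The memory-addressing limits above follow directly from the number of address bits. As a quick check of the arithmetic, the sketch below assumes byte-addressable memory (one byte per address) and uses binary units; `addressable_bytes` is a helper name made up for this example.

```python
# Address-space size as a function of address width.
# Assumes byte-addressable memory: each address names exactly one byte.

GIB = 2 ** 30  # bytes in one gibibyte
EIB = 2 ** 60  # bytes in one exbibyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses for the given address width."""
    return 2 ** address_bits


print(addressable_bytes(32) // GIB)  # prints 4  (4 GiB)
print(addressable_bytes(64) // EIB)  # prints 16 (16 EiB)
```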

Components of CPU

Before learning Assembly, I think it would be useful to learn a bit about the different components of a CPU in general. If we think of the CPU as a black box, its main function is to fetch instructions from RAM, which are in the form of machine code, and execute them.

Components of CPU

Arithmetic Logic Unit (ALU)
Memory Management Unit (MMU)
Control Unit (CU)
Registers
Clock
Cache
Buses

1. Arithmetic Logic Unit (ALU)

The ALU is an electronic circuit built from logic gates (NAND gates alone are sufficient to build one) that performs arithmetic and logical operations on integer binary numbers. It takes two operands as inputs and an opcode to indicate the type of operation to be performed. Operations supported by the ALU include Add, Subtract, Negation, Two’s complement, AND, OR, XOR, bit shift, etc. ...
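
The ALU's interface (two operands plus an opcode selecting the operation) can be modelled in a few lines. This is only a behavioural sketch of a hypothetical 8-bit ALU: the opcode names, the 8-bit width, and the `alu` function are assumptions for illustration, and real hardware computes these results with gates rather than a lookup table.

```python
# Behavioural model of a hypothetical 8-bit ALU.
# Opcode names and the 8-bit width are assumptions made for this sketch.

MASK = 0xFF  # keep every result within 8 bits, like a fixed-width register

OPERATIONS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "NEG": lambda a, b: -a,      # two's complement negation; b is ignored
    "AND": lambda a, b: a & b,
    "OR":  lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "SHL": lambda a, b: a << b,  # shift a left by b bits
    "SHR": lambda a, b: a >> b,  # shift a right by b bits
}


def alu(opcode, a, b=0):
    """Apply the operation named by `opcode` and wrap the result to 8 bits."""
    if opcode not in OPERATIONS:
        raise ValueError(f"unsupported opcode: {opcode}")
    return OPERATIONS[opcode](a & MASK, b & MASK) & MASK


print(alu("ADD", 200, 100))  # prints 44: 300 wraps around at 256
print(alu("SUB", 5, 7))      # prints 254: the 8-bit pattern for -2
print(alu("NEG", 1))         # prints 255: the 8-bit pattern for -1
```

Masking with `0xFF` is what makes the results wrap the way a fixed-width register would, which is why subtraction below zero yields the two's complement bit pattern.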