Representing negative numbers in binary poses unique challenges. Unlike decimal notation on paper, where a minus sign can simply be placed in front of a value, a binary number stored in hardware must encode the sign within a fixed number of bits. This requirement leads to various methods of representation, each with its own set of advantages and limitations. The main challenge lies in developing a scheme that can accurately represent both positive and negative values while keeping arithmetic operations efficient and straightforward. In the following sections, we will explore several common approaches to representing negative numbers in binary, along with their respective challenges and trade-offs. ...
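One of the approaches covered in discussions like this, two's complement, is the one used by virtually all modern hardware. The C snippet below is a minimal sketch of the idea (the 8-bit width and the sample values are chosen purely for illustration): negating a number means inverting its bits and adding one, and the same adder circuit then works for signed and unsigned values.

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* In two's complement, -x is obtained by inverting all bits of x
       and adding 1. Illustrated here with an 8-bit width. */
    uint8_t x = 5;                       /* 0000 0101 */
    uint8_t neg_x = (uint8_t)(~x + 1);   /* 1111 1011 == -5 in 8-bit two's complement */

    printf("x      = 0x%02X (%d)\n", (unsigned)x, (int8_t)x);
    printf("-x     = 0x%02X (%d)\n", (unsigned)neg_x, (int8_t)neg_x);

    /* The same binary addition handles signed and unsigned operands:
       5 + (-5) wraps around to 0. */
    printf("x + -x = 0x%02X\n", (unsigned)(uint8_t)(x + neg_x));
    return 0;
}
```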
Overview of MIPS Assembly
MIPS (Microprocessor without Interlocked Pipeline Stages) assembly is the assembly language of one of the classic RISC (Reduced Instruction Set Computer) ISAs. It was developed in the early 1980s at Stanford University by a team led by Professor John L. Hennessy. MIPS is widely used in academia and industry: in computer architecture courses for its straightforward design, and in various embedded systems for its efficiency and performance.
History
The first commercial MIPS processor, the R2000, implemented the MIPS I architecture and was one of the earliest commercial RISC processors. There are multiple versions of MIPS, including MIPS I, II, III, IV, and V, as well as five releases of MIPS32/64. MIPS I was a 32-bit architecture with a basic instruction set and addressing modes. MIPS III introduced a 64-bit architecture in 1991, increasing the address space and register width. ...
The Fetch Decode Execute Cycle
The fetch-decode-execute cycle, or instruction cycle, is the process by which a CPU executes programs. During this cycle, the CPU retrieves an instruction from memory (fetch), interprets what action is required (decode), and then carries out the necessary operations to complete the instruction (execute). This cycle is what allows the CPU to perform any computational task, and it repeats continuously while the computer is powered on.
What is Machine Code?
Machine code is the lowest-level programming language: binary instructions executed directly by the CPU. Any program compiled into a binary executable is ultimately transformed into machine code. Machine code consists of a set of instructions defined by the Instruction Set Architecture (ISA) of each CPU. The ISA, determined by the CPU manufacturer, varies across architectures such as ARM, MIPS, and x86. This architecture-specific design means that machine code written for one type of CPU cannot be directly executed on another without translation or emulation. ...
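To make the three phases concrete, here is a toy fetch-decode-execute loop in C. The 2-byte instruction format, the opcodes, and the register count are invented purely for this sketch; they do not correspond to MIPS or any real ISA.

```c
#include <stdio.h>
#include <stdint.h>

/* Toy opcodes for illustration only. */
enum { OP_LOAD = 0x1, OP_ADD = 0x2, OP_PRINT = 0x3, OP_HALT = 0xF };

int main(void) {
    /* "Memory" holding machine code: each instruction is
       opcode (high nibble), register (low nibble), immediate (second byte). */
    uint8_t memory[] = {
        0x10, 0x07,   /* LOAD  r0, 7   */
        0x11, 0x05,   /* LOAD  r1, 5   */
        0x20, 0x01,   /* ADD   r0, r1  */
        0x30, 0x00,   /* PRINT r0      */
        0xF0, 0x00    /* HALT          */
    };
    uint8_t regs[2] = {0};
    size_t pc = 0;

    for (;;) {
        /* Fetch: read the instruction at the program counter. */
        uint8_t byte0 = memory[pc];
        uint8_t byte1 = memory[pc + 1];
        pc += 2;

        /* Decode: split the raw bytes into opcode and operand fields. */
        uint8_t opcode = byte0 >> 4;
        uint8_t reg    = byte0 & 0x0F;
        uint8_t imm    = byte1;

        /* Execute: carry out the decoded operation. */
        switch (opcode) {
        case OP_LOAD:  regs[reg] = imm;                      break;
        case OP_ADD:   regs[reg] += regs[imm];               break;
        case OP_PRINT: printf("r%d = %d\n", reg, regs[reg]); break;
        case OP_HALT:  return 0;
        default:
            fprintf(stderr, "unknown opcode %d\n", opcode);
            return 1;
        }
    }
}
```

Real CPUs implement these phases in hardware and overlap them through pipelining, but the control flow is the same: the program counter selects the next instruction, decoding splits it into fields, and execution updates registers or memory.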
Measuring CPU Performance
CPU manufacturers publish several metrics for their processors, such as clock speed, number of cores, cache sizes, ISA, performance per watt, transistor count, and more. Measuring CPU performance is complex, and it cannot be summarized by a single metric. In this post, I'll explore each of these metrics and discuss some standard benchmarking software and their limitations.
What is Clock Speed, and How Does it Affect CPU Performance?
All synchronous digital electronic circuits require an externally generated time reference, usually a square-wave signal provided to the circuit called the clock. A clock cycle is the fundamental unit of time for a CPU: a single pulse of this clock signal, during which the CPU can perform a fundamental operation such as accessing memory, writing data, or fetching a new instruction. The length of a clock cycle is the amount of time between two pulses of the oscillator. The clock speed of a CPU is measured in Hertz (Hz), which signifies the number of clock cycles it can complete in one second; common units are Megahertz (MHz) and Gigahertz (GHz). ...
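As a rough illustration of why clock speed alone does not determine performance, the sketch below applies the standard textbook relation CPU time = instruction count x CPI / clock rate. This formula is not stated in the excerpt above, and the numbers are made up for the example; a higher clock rate only shortens each cycle, it says nothing about how many cycles an instruction needs.

```c
#include <stdio.h>

/* Classic performance equation:
   CPU time = instruction count * CPI / clock rate.
   All values below are illustrative assumptions. */
int main(void) {
    double instructions = 1e9;   /* 1 billion instructions           */
    double cpi          = 1.5;   /* average cycles per instruction   */
    double clock_hz     = 3.0e9; /* 3 GHz -> 3 billion cycles/second */

    double cycle_time_ns = 1e9 / clock_hz;               /* ~0.333 ns per cycle */
    double cpu_time_s    = instructions * cpi / clock_hz;

    printf("cycle time : %.3f ns\n", cycle_time_ns);
    printf("CPU time   : %.3f s\n", cpu_time_s);
    return 0;
}
```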
Key Differences between 32-bit and 64-bit CPU architectures
The terms 32-bit and 64-bit refer specifically to the width of the data and address registers within the CPU, which determines the maximum amount of memory that can be directly addressed and the range of values that can be processed in a single operation.
1. Registers and Data Width: All calculations take place in registers, so when performing operations such as addition or subtraction, variables are loaded from memory into registers if they are not already there. A 32-bit CPU has 32-bit wide registers, meaning it can process 32 bits of data in a single instruction.
2. Memory Addressing: A 32-bit CPU can address up to 2^32 unique memory locations, which translates to a maximum of 4 GB of addressable memory (RAM). A 64-bit CPU can address up to 2^64 unique memory locations, allowing for a theoretical maximum of 16 exabytes of addressable memory. The 4 GB limit comes from the fact that a 32-bit CPU can only hold addresses that are 32 bits long, which caps the addressable memory space (see the sketch after this list).
3. Data Transfer Speeds: The memory bus width in a 64-bit CPU is often 64 bits or more, meaning the physical path between the CPU and RAM can carry 64 bits of data in parallel. This helps load data into the cache efficiently, but it does not restrict the CPU to always reading 64 bits at a time; it can access smaller data sizes (e.g., 8-bit, 16-bit, 32-bit) as needed, depending on the specific instruction and data type.
4. Performance: 64-bit CPUs generally perform better than 32-bit CPUs. This difference comes from factors such as larger registers, a larger addressable memory space, and a wider bus. Some RISC architectures also support SIMD (Single Instruction, Multiple Data) instructions that allow parallel processing of multiple smaller data types within larger registers; for example, ARM's NEON technology can operate on multiple 32-bit integers packed into a single 64-bit (or 128-bit) register.
5. Application Compatibility: 64-bit operating systems typically include backward-compatibility layers to run 32-bit software seamlessly, allowing 32-bit applications to execute on 64-bit systems without major issues. However, 32-bit applications may not fully utilize the advantages of a 64-bit system, such as the larger addressable memory.
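A quick way to see the addressing difference from point 2 is to check the pointer width a program was compiled for. The C sketch below simply reports sizeof(void *) and the corresponding theoretical address-space limit; the suggestion to build it with gcc -m32 / -m64 assumes a GCC-style toolchain that supports both targets.

```c
#include <stdio.h>
#include <stdint.h>

/* Prints the pointer width this binary was compiled for and the
   corresponding theoretical address-space limit (2^N bytes).
   Building the same file as a 32-bit and a 64-bit binary (e.g. with
   gcc -m32 / -m64, if the toolchain supports it) shows the
   4 GiB vs. 16 EiB difference described above. */
int main(void) {
    unsigned bits = (unsigned)(sizeof(void *) * 8);
    printf("pointer width     : %u bits\n", bits);

    if (bits >= 64) {
        /* 2^64 bytes is too large for a 64-bit integer to hold,
           so just report the well-known figure. */
        printf("max address space : 2^64 bytes (~16 EiB)\n");
    } else {
        unsigned long long max_bytes = 1ULL << bits;   /* 2^32 = 4 GiB */
        printf("max address space : %llu bytes (%llu GiB)\n",
               max_bytes, max_bytes >> 30);
    }
    return 0;
}
```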
Components of CPU
Before learning assembly, I think it would be useful to learn a bit about the different components of a CPU in general. If we think of the CPU as a black box, its main function is to fetch instructions from RAM, which are in the form of machine code, and execute them.
Components of CPU
Arithmetic Logic Unit (ALU)
Memory Management Unit (MMU)
Control Unit (CU)
Registers
Clock
Cache
Buses
1. Arithmetic Logic Unit (ALU)
The ALU is an electronic circuit, built from logic gates such as NAND gates, responsible for performing arithmetic and logical operations on integer binary numbers. It takes two operands as inputs along with an opcode indicating the type of operation to be performed, as the sketch below illustrates. Typical operations supported by an ALU include add, subtract, negation, two's complement, AND, OR, XOR, and bit shifts. ...
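To make the "two operands plus an opcode" interface concrete, here is a toy ALU modeled as a pure C function. The opcode names, the operation set, and the 32-bit width are assumptions made for this sketch rather than a description of any real ALU.

```c
#include <stdio.h>
#include <stdint.h>

/* Toy ALU: two operands in, an opcode selecting the operation, one result out. */
typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_XOR, ALU_NOT, ALU_SHL, ALU_SHR } alu_op;

uint32_t alu(uint32_t a, uint32_t b, alu_op op) {
    switch (op) {
    case ALU_ADD: return a + b;
    case ALU_SUB: return a - b;        /* subtraction via two's complement wrap-around */
    case ALU_AND: return a & b;
    case ALU_OR:  return a | b;
    case ALU_XOR: return a ^ b;
    case ALU_NOT: return ~a;           /* second operand ignored */
    case ALU_SHL: return a << (b & 31);
    case ALU_SHR: return a >> (b & 31);
    }
    return 0;
}

int main(void) {
    printf("6 + 7  = %u\n", (unsigned)alu(6, 7, ALU_ADD));
    printf("6 & 7  = %u\n", (unsigned)alu(6, 7, ALU_AND));
    printf("1 << 4 = %u\n", (unsigned)alu(1, 4, ALU_SHL));
    return 0;
}
```

In real hardware the opcode lines come from the control unit and the operands from registers, but the shape of the interface is the same: inputs, an operation select, and a result.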