PlayStation 3 Architecture

Model

The original PlayStation 3 (PS3). Released on November 11, 2006 in Japan, on November 17, 2006 in the United States, and on March 23, 2007 in Europe.

PS3 2000/3000 series (commonly known as "Slim"). Released on September 1, 2009 in Europe and the United States, and on September 3, 2009 in Japan.

PS3 4000 series (commonly known as "Super Slim"). Released overseas in September 2012.

Motherboard

Motherboard, showing the COK-001 revision (the first one). The remaining 128 MB of NAND flash, along with the connectors for the Blu-ray PATA drive, the Wi-Fi/BT daughterboard, the front panel, and the multi-card reader, are fitted on the back.

Motherboard with important parts labeled

Diagram

Main architecture diagram

Simple introduction

In 2006, Sony unveiled its long-awaited "next-generation" video game console: a shiny machine whose hardware architecture inherits the teachings of the Emotion Engine at its root. Meanwhile, its new "super processor", the Cell Broadband Engine, was conceived during a crisis of technological innovation and could not afford to fall behind as trends in multimedia services evolved.

In this article, we will take an in-depth look at the joint project between Sony, IBM, Toshiba, and Nvidia, along with its execution and its impact on the industry.

About the length of the article

This article is not the typical "lunchtime" read offered for the other consoles in this series. If you are interested in every area of the PlayStation 3, you will want to read it from start to finish! That said, this writing condenses the work of countless engineers and six years of research and development, so I don't expect it to be digested in one sitting. Take it slowly (with breaks if necessary) and, if you still want to know more at the end, see the "Sources" section!

CPU

Welcome to the most popular and innovative part of this console.

Introduction

The PS3's CPU is very complex, but it is also a fascinating piece of engineering that stands out in an era of change and experimentation, combining complex needs with unusual solutions. So, before stepping inside the PS3's CPU, I have written the following paragraphs to give the article some historical background. That way, you will understand not only how the chip works but also the reasoning behind its major design decisions.

  • Progress
  • New design philosophy
  • New multi-core era

Progress

PS1 CPU (1994). Using MIPS technology, it was designed by LSI and Sony.

The PS2's Emotion Engine (2001). Designed by Toshiba, again using MIPS technology.

Nearly ten years after the first MIPS-equipped PlayStation appeared, SGI/MIPS is no longer doing well. Nintendo has recently abandoned SGI/MIPS in favor of a low-end PowerPC core from a new supplier, IBM, while Microsoft, a newcomer to the market, chose Intel and its x86 empire.

Sony's own chips, by contrast, came out of a process involving other companies, such as LSI (PS1 CPU) and Toshiba (PS2 Emotion Engine). That approach continued until the PlayStation Portable was released in 2004. So, what kind of new MIPS amalgamation did Sony intend to build for the PlayStation 3?

As it turned out, development of the PlayStation 3 preceded that of the PlayStation Portable [1]. A few months after the release of the PS2, Sony formed a partnership with IBM and Toshiba, called "STI", whose sole goal was to deliver the next chip to power the next generation of supercomputers [2]. As if that were not ambitious enough, the same chip would also be used in the successor to the PS2. Finally, in 2004, IBM announced the Cell Broadband Engine (also called "Cell BE" or simply "Cell") [3].

New design philosophy

The Cell Broadband Engine chip. Unfortunately, it is too shiny for my camera.

To understand Cell's radical proposal, it is necessary to take into account the issues that shaped the era (the late 1990s to the early 2000s).

Every year, consumers demand more speed; that never changes. However, the approaches used so far to deliver it (pipelining the data path and raising the clock frequency) have now run out of steam. Intel's NetBurst cannot scale any further, and its promised successor is nowhere to be found. Similarly, IBM's PowerPC 970/G5 can deliver neither the promised "3 GHz" nor a low-power CPU (so Apple cannot ship laptops equipped with it) [4]. Overall, engineers seem to be facing a new scalability crisis.

So, the focus moved to distributed computing [5]. In other words, why insist on improving the performance of a single machine instead of distributing the workload across multiple smaller machines? Admittedly, this approach is not new, given that every console analyzed on this website has multiple processors. However, the development of a "single processor with multiple cores" opens up new possibilities in CPU design (not necessarily limited to the console market).

As a result, Cell became part of this new wave of R&D. The new CPU combines a multi-core design with a special focus on vector processing. Remember, vector computing is ideal for simulations (physics, lighting, etc.) and was previously embodied in the Geometry Transformation Engine and the Vector Units, but Cell's design moves significantly beyond both. You will see why shortly.

New multi-core era

Example of a heterogeneous design. To date, it has been the de facto architecture of many powerful game consoles.

Example of a homogeneous design. Each core can carry out the same tasks as the one before it, although this does not have to be strictly the case.

If you think about it, both the PS1's CPU and the Emotion Engine were already multi-core processors. So why did Cell attract so much attention? Those two chips are a mix of architectures: one general-purpose core plus several application-specific cores (audio processor, image processor, etc.), with the general-purpose core in charge of commanding the others.

This type of CPU design is called heterogeneous computing, and it is the de facto choice for building a machine specialized in a specific application (in this case, games). Its counterpart, homogeneous computing, is more common in the PC market, where the CPU must perform a wider range of tasks (all with the same priority). Consequently, the latter type tends to have multiple cores of the same kind.

Returning to the subject, Cell incorporates both models. The CPU has two types of cores: one general-purpose "leader" and eight vector "assistants". The vector cores can take on a variety of roles, and in doing so they perform the tasks previously handled by heterogeneous designs. However, since Cell's vector cores are not restricted to a single type of task, they also provide the flexibility of a homogeneous design. All in all, this design is not perfect and brings some compromises, but throughout this article I hope you will see the various problems Cell tried to solve and how it managed to solve them.

Cell overview

Having explained all the history and theory, I think we are ready to introduce the protagonist of this section. This is Cell:

The Cell Broadband Engine (PS3 variant). Designed by IBM for supercomputing and scientific simulation. A crossed-out SPE means that it is disabled (not available). The SPE on the left is reserved for the operating system.

... and by the end of this section, you'll see what each component does.

Overall structure

Cell runs at a powerful 3.2 GHz and is made up of numerous components. For the purposes of this analysis, the CPU can be divided into three main areas [6]:

  • Leader: the part that directs the rest of Cell's circuitry. It is made up of a component called the PowerPC Processing Element (PPE).
  • Assistants: these are just as important as the PPE, but their abilities confine them to the role of assistant or accelerator. This group is made up of eight Synergistic Processing Elements (SPEs).
  • Interfaces: since the need for bandwidth is growing rapidly, new interfaces are implemented to move data without bottlenecks. This group comprises the Element Interconnect Bus (EIB), the Broadband Engine Interface Unit (BEI), the Memory Interface Controller (MIC), and the Flex I/O buses.

This information will be explored in more depth throughout the article, so there is no need to memorize these names. The main purpose of this section is to give you an idea of the nature of Cell and to familiarize you with the components we will discuss.

Structure of this Study

Given the structure shown so far, I had to organize this study so that it would not overwhelm you with information. Therefore, we will analyze Cell by studying each component in this order:

  1. The bus that connects all the components, the Element Interconnect Bus (EIB).
  2. The PowerPC Processing Element (PPE) and its core element, the PowerPC Processing Unit (PPU).
  3. The general-purpose memory available on this console.
  4. The Synergistic Processing Element (SPE) and its core element, the Synergistic Processing Unit (SPU).
  5. The programming model devised to program the Cell efficiently.

Now, let's start the full-scale analysis.

Inside the Cell: The Heart

Since its launch, Cell has been described as a network-on-chip (NoC) [7] rather than a traditional system-on-chip (SoC), owing to its unusual data bus, the Element Interconnect Bus (EIB). We have seen how demanding CPU components can be and how susceptible a system is to bottlenecks. To tackle this challenge for the umpteenth time, IBM came up with a new design.

Simplified diagram of the Element Interconnect Bus (EIB). Each arrow between the "ramps" (nodes) represents two unidirectional buses, so each node is connected to the next node using four channels.

The EIB consists of 12 nodes, called "ramps", each connecting one component of Cell. The ramps are interconnected by four buses, two of which travel clockwise and the other two counterclockwise. Each bus (or channel) is 128 bits wide. However, rather than repeating a single-bus topology (as in the Emotion Engine and its predecessors), the ramps are interconnected according to a token-ring topology. Given that the EIB provides four channels, there are four possible routes (rings).

You may wonder what the point of a token ring is if data has to travel a long path (compared to a single direct bus). A single bus is highly susceptible to congestion. That's why EIB engineers chose this topology to handle a lot of simultaneous traffic (read on if you want to know how token ring helps).

Data is transferred in 128-bit packets [8]. Each ring can carry up to three simultaneous transfers, as long as the packets do not overlap. The EIB also works with command credits: whenever a component needs to start a transfer, it sends a request to the EIB's data arbiter, which manages the traffic on the rings. If the request is granted, the packet enters a ring and receives a "token". In addition, some components take priority over others, such as the Memory Interface Controller (MIC), which is where main RAM is attached. Finally, the data arbiter never places a packet on a ring if its route would span more than half of that ring.
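
The arbiter's last rule can be pictured with a small sketch. The function name `pick_route` and the numbering of ramps from 0 to 11 are assumptions made for illustration; the real arbiter also weighs priorities and ring occupancy, which are ignored here. The sketch simply picks the shorter way around a 12-node ring, so a route never covers more than half of it.

```python
RAMPS = 12  # number of nodes ("ramps") on the EIB, as stated in the text


def pick_route(src, dst, ramps=RAMPS):
    """Return (direction, hops) for a transfer between two ramps.

    Hypothetical helper: it only models the "no more than half the
    ring" rule by choosing the shorter of the two directions.
    """
    if not (0 <= src < ramps and 0 <= dst < ramps):
        raise ValueError("ramp index out of range")
    clockwise = (dst - src) % ramps          # hops when travelling clockwise
    counterclockwise = (src - dst) % ramps   # hops the other way round
    if clockwise <= counterclockwise:
        return "clockwise", clockwise
    return "counterclockwise", counterclockwise


if __name__ == "__main__":
    for destination in (3, 6, 9):
        print(0, "->", destination, pick_route(0, destination))
```

With 12 ramps, no route chosen this way is longer than six hops, which is exactly half of the ring.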

Each ramp plays a part in the transfer: it reads the destination address of the packet and decides whether to hand the data to its own component or to forward it to the next ramp. During each clock cycle, a ramp can simultaneously send and receive one 128-bit (16-byte) packet. Therefore, considering that there are four channels and that the EIB runs at 1.6 GHz (half the speed of Cell), the theoretical maximum transfer rate is 16 bytes x 2 transfers/clock x 4 rings x 1.6 GHz = 204.8 GB/s. Of course, this value is very optimistic, and many other factors affect real performance (origin and destination paths, bus conditions, etc.). In any case, many research papers by IBM and other authors report more realistic speeds obtained from practical experiments [9].
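
The arithmetic behind that peak figure can be reproduced in a few lines. This is only a restatement of the numbers quoted above; the constant and function names are mine.

```python
BYTES_PER_PACKET = 16      # one 128-bit packet
TRANSFERS_PER_CLOCK = 2    # a ramp can send and receive in the same cycle
RINGS = 4                  # four channels, hence four rings
EIB_CLOCK_HZ = 1.6e9       # half of Cell's 3.2 GHz


def eib_peak_gb_per_s():
    """Theoretical peak EIB bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_second = (
        BYTES_PER_PACKET * TRANSFERS_PER_CLOCK * RINGS * EIB_CLOCK_HZ
    )
    return bytes_per_second / 1e9


if __name__ == "__main__":
    print(f"{eib_peak_gb_per_s():.1f} GB/s")
```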

Now that we have seen how the Cell components are interconnected, it is time to check out the first component of the chip...

Inside the Cell: The Leader

Now we are looking at the "main part" of Cell, the piece of silicon responsible for directing the others. This component is called the PowerPC Processing Element (PPE), and you can think of it as the counterpart of the MIPS R5900 in the Emotion Engine.

Composition of the PPE

Remember how we divided Cell into areas earlier? The same can be done with the PPE. IBM uses the term "element" to denote an independent machine [10], and the term "unit" to separate the core circuitry from the interfaces that communicate with the rest of Cell.

A simplified diagram of the PowerPC Processing Element (PPE).

That said, the PowerPC Processing Element is, perhaps surprisingly, composed of just two parts:

  • PowerPC Processing Unit (PPU): the logic part of the PPE (the "core"). Don't confuse it with Nintendo's PPU!
  • PowerPC Processor Storage Subsystem (PPSS): the large interface connecting the PPU with the outside world. It also provides 512 KB of L2 cache.

As you can see, the design of the PPE (and the rest of Cell) is considerably modular and follows the teachings of RISC design. You will soon see that the same modularity is applied inside the PPU.

PowerPC processing unit

From now on, we will look inside the PPU. We have gone from Cell to the PPE and now to the PPU, so let's analyze it as we would any other CPU core.

  • Familiar architecture
  • Characteristic functions

Familiar architecture

To begin with, the PPU was not built from scratch; it reuses existing PowerPC technology. However, unlike previous iterations, where IBM took an existing processor and modified it to meet new requirements, the PPE does not inherit the design of any past CPU. Instead, IBM built a new CPU that conforms to version 2.02 of the PowerPC specification (the last PowerPC specification before it was rebranded as "Power ISA"). In conclusion, no chip that existed up to that point shares the PPU's design, yet the PPU is programmed using the same machine code as other PowerPC chips.

But why did IBM choose PowerPC technology to develop a high-performance chip? Simply put, PowerPC is a mature platform [11] that had enjoyed some ten years of testing and revision on the Macintosh, and it ticked all the boxes on Sony's list. Last but not least, adopting a well-known architecture is good news for existing compilers and codebases, which is a major advantage for a new game console.

It is worth noting that IBM was one of the original authors of the PowerPC chips, along with Motorola and Apple (remember the AIM alliance?). In any case, by the early 2000s the alliance's members were already working separately, and Motorola/Freescale had developed a PowerPC line different from IBM's.

Characteristic functions

The PPU shares its history with the PowerPC 970 (called "G5" by Apple); both are descendants of the POWER4, a predecessor of PowerPC used mainly in workstations and supercomputers. This will become more evident when we look at the PPU's modular execution units. It is also a sharp change from the GameCube's 750-line CPU, which was heavily influenced by Motorola and only slightly modified by IBM.

Returning to the topic, the PPU is a complete 64-bit processor. This means that:

  • The word size is 64 bits.
  • The general-purpose registers are 64 bits wide (there are exactly 32 of them).
  • The data bus is at least 64 bits wide. Later sections of this article show that it is in fact much wider, but for now it is enough to know that transferring 64-bit words incurs no performance penalty.
  • In theory, the CPU can address up to 16 exabytes of memory. In practice, modern CPUs delegate address management to a Memory Management Unit (MMU), which changes how the address bus is used.
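
As a quick check of the 16-exabyte figure, here is the arithmetic, assuming "exabyte" is meant in the binary sense (2^60 bytes). The names are mine and serve only this illustration.

```python
ADDRESS_BITS = 64
EXABYTE = 2 ** 60  # binary exabyte (EiB), assumed for this check


def addressable_exabytes(bits=ADDRESS_BITS):
    """Number of binary exabytes reachable with byte addresses of `bits` bits."""
    return (2 ** bits) // EXABYTE


if __name__ == "__main__":
    print(addressable_exabytes(), "exabytes")
```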

Finally, the PPU implements version 2.02 of the PowerPC ISA, including the optional opcodes for floating-point square root [12]. It has also been extended with a group of SIMD instructions called Vector/SIMD Multimedia Extension (VMX). On the other hand, some elements of the original specification are missing, such as little-endian mode (in fact, Cell operates only in big-endian mode) and a handful of opcodes.

PPU components

Looking at the PPU under a "microscope", we can see that this unit is made up of various blocks and sub-units that carry out independent operations (loading values from memory, performing arithmetic, and so on). The PPU's capabilities are defined by what each block can do:

  • Instructions
  • Memory management
  • Calculation

Instructions

A simple diagram of the instruction unit (IU).

The first block is called the Instruction Unit (IU), and as the name suggests, it fetches instructions from the L2 cache and signals the other units to perform the requested operations. Like its i686 contemporaries, part of the instruction set is interpreted using microcode (the IU has a small ROM built in for this purpose). Finally, the IU also has a 32KB L1 cache for instructions.

Instruction issue is pipelined over 12 stages, but in practice the number of stages varies widely with the type of instruction. For example, the branch prediction block may bypass large parts of the pipeline. Once the IU is combined with the neighboring units, the final number of stages is often closer to 24 (a large number, to be sure, but remember that Cell runs at 3.2 GHz).

In some cases, the IU dispatches up to two instructions simultaneously, greatly increasing throughput. In practice, however, there are many conditions for this to work, so it is the programmer's and compiler's responsibility to optimize routines so that the sequence of instructions can take advantage of this feature. By the way, dual issue has also been implemented in earlier CPUs, and the terminology varies depending on the vendor, so I used the IBM definition here.

In addition, the IU is multithreaded, meaning it can execute two different instruction sequences (called "threads") simultaneously. Behind the scenes, the IU simply alternates between the two threads on each cycle, giving the appearance of multithreading. This technique has historically been called simultaneous multithreading (SMT) or hyper-threading (a name later coined by Intel). Even so, IBM's multithreading mitigates undesirable effects such as pipeline stalls, since the CPU is no longer blocked when one instruction disrupts the flow. To achieve multithreading, IBM's engineers duplicated the IU's internal resources, including the general-purpose registers (we said earlier that 32 registers are available, but that is the number per thread). However, resources that do not belong to the PowerPC specification (such as the L1 and L2 caches and the interfaces) are still shared, so that latter group remains single-threaded.
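
The alternation described above can be pictured with a toy model. It is a deliberate simplification: the function `interleave` and the instruction mnemonics are placeholders of my own, and the sketch ignores stalls, priorities, and dual issue. It only shows two instruction streams taking turns.

```python
from itertools import zip_longest


def interleave(thread_a, thread_b):
    """Alternate between two instruction streams, one instruction per turn.

    Returns a list of (thread_name, instruction) pairs in issue order.
    When one stream runs out, the other keeps issuing on its own.
    """
    issued = []
    for a, b in zip_longest(thread_a, thread_b):
        if a is not None:
            issued.append(("A", a))
        if b is not None:
            issued.append(("B", b))
    return issued


if __name__ == "__main__":
    for slot in interleave(["load", "add", "store"], ["mul", "branch"]):
        print(slot)
```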

Combining dual threading and dual issue allows the PPU to execute up to 4 instructions per cycle. Even if this is a "best case scenario", it offers optimization opportunities that users will eventually notice in the game's frame rate!

Memory management

A simplified diagram of the Load Store Unit (LSU) and its neighbors.

The blocks below give the PPU the ability to execute load store instructions and perform memory management.

First, the Load Store Unit (LSU) executes "load" and "store" opcodes backed up by a 32KB L1 data cache. As a result, this unit has direct access to memory and registers.

Furthermore, the LSU is coupled with a Memory Management Unit (MMU), which is common in modern hardware. In simple terms, the MMU handles memory addressing using a virtual address map combined with memory protection. To improve the latter, this MMU in particular features a segment unit (grouping memory addresses using ranges called segments). To prevent performance degradation during this process, it also includes a Translation Lookaside Buffer (TLB) (which caches translated addresses) and a Segment Lookaside Buffer (SLB) (which caches segments).
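
To make the role of the TLB more concrete, here is a heavily simplified sketch of address translation with a translation cache. Everything in it is an assumption for illustration: the class `TinyMMU`, the 4 KB page size, and the flat page table are mine, and the segment stage (and its SLB) is left out entirely. It only shows why caching translated addresses avoids repeating the lookup.

```python
PAGE_SIZE = 4096  # assumed page size, chosen only for this example


class TinyMMU:
    """Toy model: virtual page -> physical page, with a translation cache."""

    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> physical page number
        self.tlb = {}                 # cache of translations already performed
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.tlb:
            self.tlb_hits += 1
        else:
            self.tlb_misses += 1
            if page not in self.page_table:
                # Memory protection: unmapped pages cannot be accessed.
                raise MemoryError(f"page fault at {virtual_address:#x}")
            self.tlb[page] = self.page_table[page]
        return self.tlb[page] * PAGE_SIZE + offset


if __name__ == "__main__":
    mmu = TinyMMU({0: 5, 1: 9})
    print(hex(mmu.translate(0x0010)), hex(mmu.translate(0x0020)))
    print("hits:", mmu.tlb_hits, "misses:", mmu.tlb_misses)
```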

Calculation

A simplified diagram of the units that perform arithmetic.

There are only two more units of the PPU left to discuss.

The first unit is the traditional fixed-point integer arithmetic unit (FXU). It performs integer arithmetic such as division, multiplication, bit rotation (similar to bit shifting, but the discarded bits are returned to the other side), and count leading zero (useful for normalizing vertex coordinates, etc.). The pipeline has 11 stages.
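
Two of the operations just mentioned are easy to mistake for simpler ones, so here is a small sketch of what they compute on 64-bit values. The function names are mine; the code mimics the arithmetic, not the PPU's actual opcodes.

```python
MASK64 = (1 << 64) - 1  # keep results within 64 bits


def rotate_left64(value, amount):
    """Rotate left: bits pushed out on the left re-enter on the right."""
    amount %= 64
    value &= MASK64
    return ((value << amount) | (value >> (64 - amount))) & MASK64


def shift_left64(value, amount):
    """Plain shift, for comparison: bits pushed out are lost."""
    return (value << amount) & MASK64


def count_leading_zeros64(value):
    """Number of zero bits before the most significant set bit."""
    return 64 - (value & MASK64).bit_length()


if __name__ == "__main__":
    sample = 0x8000000000000001
    print(hex(rotate_left64(sample, 1)), hex(shift_left64(sample, 1)))
    print(count_leading_zeros64(1))
```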

As you can see from the diagram, the FXU, LSU, and MMU are combined into a single unit called the Execution Unit (XU), because they share the same register file.

The second unit is more interesting: the Vector/Scalar Unit (VSU), which performs floating-point and vector operations. It consists of a 64-bit FPU (following the IEEE 754 standard) and a Vector/SIMD Multimedia Extension unit (VXU), which executes a set of SIMD instructions called VMX. The latter operates on 128-bit vectors made up of anything from sixteen 8-bit values to four 32-bit values [13]. You may have heard of this extension before, as "VMX" is IBM's name for what Motorola calls "AltiVec" and Apple calls "Velocity Engine" (hooray for trademarks). That said, Cell's most competitive SIMD capabilities are found in its other processors, so don't relax just yet!
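
The idea of one 128-bit vector holding several smaller values can be sketched as follows. The helper names are mine, and plain Python only imitates what the hardware does in a single instruction; big-endian byte order is used because Cell is big-endian.

```python
import struct


def as_lanes(vector):
    """View the same 16 bytes as 8-bit, 16-bit, or 32-bit unsigned lanes."""
    if len(vector) != 16:
        raise ValueError("a 128-bit vector is exactly 16 bytes")
    return {
        "16 x 8-bit": struct.unpack(">16B", vector),
        "8 x 16-bit": struct.unpack(">8H", vector),
        "4 x 32-bit": struct.unpack(">4I", vector),
    }


def add_bytes_lanewise(a, b):
    """Sixteen independent 8-bit additions, each wrapping around at 256."""
    if len(a) != 16 or len(b) != 16:
        raise ValueError("both operands must be 16 bytes long")
    return bytes((x + y) & 0xFF for x, y in zip(a, b))


if __name__ == "__main__":
    v = bytes(range(16))
    for layout, lanes in as_lanes(v).items():
        print(layout, lanes)
    print(add_bytes_lanewise(v, bytes([250] * 16)).hex())
```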

PPE Summary

Now you know how the PPE works and what it's made of, but what's in it for developers?

After all, the PowerPC Processing Element is just a general-purpose processor. Remember the wide main bus (EIB)? IBM designed the PPE so that engineers could use it in conjunction with other processors to speed up specific applications (HPC, 3D graphics, security, scientific simulation, networking, video processing, etc.), and since this article is about the PlayStation 3, you'll see that the rest of the Cell is addressed with computer graphics and physics in mind.

Outside the Cell: Main Memory

No matter how good the PPE is, it's useless without a proper working space (memory).

So Sony put 256 MB of XDR DRAM on the motherboard. But is that any good? To answer that, we need to take a look at how the memory blocks work and how they are connected to Cell.

First, the type of memory fitted is called Extreme Data Rate (XDR). You might recognize XDR DRAM as the successor to the RDRAM found in the Nintendo 64 and the PlayStation 2. But don't jump to conclusions just yet!

Like any company, Rambus kept improving its inventions. Its third revision (XDR) operates at an octal data rate (four times that of its rival, DDR DRAM) [14]. One manufacturer's datasheet reports XDR latencies of 28 ns to 36 ns [15], nearly ten times faster than first-generation RDRAM chips.

The first revision of the PlayStation 3 motherboard carries four 64 MB chips, accessed in pairs. The XDR chips are connected to Cell through two 32-bit buses, one for each pair. So, every time the PPU writes a word (64 bits of data), it is split between two XDR chips. The XDR chips are clocked at 400 MHz [16].

To connect to the XDR chips, Cell relies on another of its components (a sibling of the PPE), the Memory Interface Controller (MIC). The MIC also buffers memory transfers to improve bandwidth, but this comes with significant limitations. In short, the MIC has a minimum transfer size of 128 bytes, which suits large sequential reads or writes. However, if the data is smaller than 128 bytes, or if reads and writes need to alternate, there is a performance penalty.
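
The cost of small transfers can be estimated with a rough model. It assumes that the MIC always moves whole 128-byte blocks and that this is the only source of waste; the penalty for alternating reads and writes is not modelled. The function names are mine.

```python
import math

MIC_BLOCK_BYTES = 128  # minimum transfer size quoted in the text


def bytes_moved(payload_bytes):
    """Bytes actually transferred if data moves in whole 128-byte blocks."""
    if payload_bytes <= 0:
        raise ValueError("payload must be positive")
    return math.ceil(payload_bytes / MIC_BLOCK_BYTES) * MIC_BLOCK_BYTES


def efficiency(payload_bytes):
    """Fraction of the transferred bytes that the program asked for."""
    return payload_bytes / bytes_moved(payload_bytes)


if __name__ == "__main__":
    for size in (16, 128, 129, 4096):
        print(size, bytes_moved(size), f"{efficiency(size):.0%}")
```

Under this model, a 16-byte access uses only an eighth of the data moved, whereas any multiple of 128 bytes wastes nothing.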

But is the MIC a bottleneck or an accelerator? Bandwidth optimization matters in data-hungry systems. We have previously seen solutions like the Write Gather Pipe and the Write Back Buffer, so the MIC is simply a new answer to a recurring problem. For reference, Sony claims a transfer rate of 25.6 GB/s, but the rate actually achieved depends on a variety of factors (we all know how complicated it is to move data from one place to another inside Cell).
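
One way to arrive at Sony's figure is to combine the numbers quoted earlier: a 400 MHz clock, eight transfers per clock (octal data rate), and two 32-bit buses. The sketch below assumes that this is how the figure is derived; the constant and function names are mine.

```python
XDR_CLOCK_HZ = 400e6       # XDR clock quoted in the text
TRANSFERS_PER_CLOCK = 8    # octal data rate
BUS_WIDTH_BITS = 2 * 32    # two 32-bit buses, one per pair of chips


def xdr_peak_gb_per_s():
    """Theoretical peak main-memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bits_per_second = XDR_CLOCK_HZ * TRANSFERS_PER_CLOCK * BUS_WIDTH_BITS
    return bits_per_second / 8 / 1e9


if __name__ == "__main__":
    print(f"{xdr_peak_gb_per_s():.1f} GB/s")
```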

So much for main RAM, but there is memory in another place too: the hard drive. The PS3 also lets games use 2 GB of the internal hard disk as a working area (just like the original Xbox) [17].

Inside the Cell: The Assistant

We've seen before how Sony has equipped its general-purpose processors (in this case the PPE) with accelerators (the VPU and IPU in the case of the PS2, the GTE and MDEC in the case of the PS1) to achieve acceptable game performance. This is typical of video console hardware, where the general purpose processor can perform a wide range of tasks, but is not specialized in anything: the machine only needs a few skills, such as physics, graphics, and audio, and the coprocessor performs those tasks.

[The PPE is] a watered-down version to reduce power consumption, so it doesn't have the horsepower that you see in the Pentium 4 (...). If you take the code that runs today on Intel, AMD, or Power, whatever it may be, and recompile it for Cell, it will run today. You might have to change a library or two, but it will run today. It might be 50% or 60% slower, though, and people might say, "Oh my God! This Cell processor is terrible!" But that's because you're only using one piece of it. [18] - Dr. Michael Perrone, Manager of Cell Solutions, IBM TJ Watson Research Center

The accelerator included in the PS3's Cell is the Synergistic Processor Element (SPE). Cell has eight SPEs, but one is disabled during normal operation. The reason is that chips require extraordinary precision to manufacture (Cell was initially fabricated on a 90 nm process) and the machinery is not perfect. So, rather than discarding every chip that comes out slightly defective, Cell includes one spare SPE: if one of them turns out to be defective, the whole chip does not have to be thrown away. That spare SPE is always deactivated, regardless of whether it works or not (Sony can't have two different PS3s on the market).

SPE Structure

The Synergistic Processor Elements (SPEs) are small, independent computers inside Cell that are commanded by the PPE. Remember what we said earlier about borrowing elements from homogeneous computing? These coprocessors are fairly general-purpose and not limited to a single application, so they can help with a wide range of tasks, provided developers program them properly.

A simplified diagram of a Synergistic Processor Element (SPE). There are eight in the Cell (one is disabled).

As with the PPE, let's take a look inside the SPE. It will be a brief look, so if you want to know more about SPEs, check out the "Sources" section at the end of the article. Let's get started.

The SPE is a processor with a structure similar to the PPE's, consisting of two parts:

  • Synergistic Processor Unit
  • Memory Flow Controller

Memory Flow Controller

The Memory Flow Controller (MFC) is the block that interconnects the core with the rest of Cell, and it is the equivalent of the PowerPC Processor Storage Subsystem (PPSS) in the PPE. The MFC's main job is to move data between the SPU's local memory and Cell's main memory, and to keep the SPU synchronized with its neighbors.

The MFC has a built-in DMA controller that handles communication between the EIB and the SPU's local memory. In addition, the MFC houses another component called the Synergistic Bus Interface (SBI), which sits between the EIB and the DMA controller. It is a very complicated circuit to summarize, but it basically interprets commands and data received from the outside and signals the internal units of the SPE. As Cell's front door, the SBI works in two modes: bus master (where the SPE is set up to request data from the outside) and bus slave (where the SPE is set up to receive orders from the outside).

Curiously, because of the restrictions on EIB packets (up to 128 bits long), the MFC's direct memory access block can only move up to 16 KB per transfer.
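To picture what the 16 KB limit means in practice, here is a minimal sketch of how a larger buffer could be split into several transfers. The function name and the `(offset, size)` representation are my own assumptions for illustration; they are not part of any Sony or IBM toolkit.

```python
MAX_DMA_BYTES = 16 * 1024  # per-transfer limit quoted in the text above


def split_transfer(total_bytes, max_chunk=MAX_DMA_BYTES):
    """Return (offset, size) pairs covering total_bytes in chunks of at most max_chunk."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(max_chunk, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


if __name__ == "__main__":
    # A 40 KB buffer needs three transfers: 16 KB + 16 KB + 8 KB.
    for offset, size in split_transfer(40 * 1024):
        print(f"offset={offset} size={size}")
```

A real program would also have to respect alignment rules and the 256 KB of local memory, which this sketch ignores.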

Synergistic Processor Unit

The Synergistic Processor Unit (SPU) is the part of the SPE where the core processor resides, and is the equivalent of the "PPU" in the PPE.

In contrast to the PPU, the SPU is isolated from the rest of the Cell. Therefore, no memory is shared with the PPU or with other SPUs. Instead, the SPU has its own local memory that it uses as a working area. However, the contents of local memory can be moved back and forth using the MFC.

In terms of functionality, the SPU is considerably more limited than the PPU. For example, the SPU does not include memory management functions (address translation or memory protection), nor state-of-the-art features such as dynamic branch prediction. Nevertheless, it performs very well at vector processing.

To program this unit, developers use the PPU to call routines provided by the PlayStation 3's operating system. These upload an executable written specifically for the SPU to the SPU of choice and signal it to start execution. Afterwards, the PPU keeps a reference to the SPU's thread for synchronization purposes.

SPU architecture

As with any other CPU, the Synergistic Processor Unit (SPU) is programmed using an instruction set architecture (ISA). Both the SPU and the PPU follow the RISC methodology but, unlike the PPU (which implements the PowerPC ISA), the SPU's ISA is proprietary and mainly composed of SIMD instructions. As a result, the SPU has 128 128-bit general-purpose registers that store vectors of 32/16-bit fixed-point or floating-point values. On the other hand, SPU instructions are very small, just 32 bits long, to save memory. The first part contains the opcode, and the rest can reference up to three operands to be computed in parallel.
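To make the idea of a 128-bit register holding a vector more tangible, here is a small sketch that packs four 32-bit floats into 16 bytes and adds two such "registers" lane by lane. It only imitates the data layout in Python; the function names are made up for this example and it does not reproduce any real SPU instruction.

```python
import struct

REGISTER_BYTES = 16  # one 128-bit register
LAYOUT = ">4f"       # four big-endian 32-bit floats (Cell is big-endian)


def pack_floats(values):
    """Pack exactly four floats into a 16-byte 'register'."""
    return struct.pack(LAYOUT, *values)


def simd_add(reg_a, reg_b):
    """Add two registers lane by lane, the way one SIMD instruction would."""
    a = struct.unpack(LAYOUT, reg_a)
    b = struct.unpack(LAYOUT, reg_b)
    return struct.pack(LAYOUT, *(x + y for x, y in zip(a, b)))


if __name__ == "__main__":
    r1 = pack_floats((1.0, 2.0, 3.0, 4.0))
    r2 = pack_floats((0.5, 0.5, 0.5, 0.5))
    print(struct.unpack(LAYOUT, simd_add(r1, r2)))  # (1.5, 2.5, 3.5, 4.5)
```

The same 16 bytes could just as well be read as eight 16-bit values; the register does not care, only the instruction does.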

This is very similar to the vector floating-point unit that debuted on the PS2, although a lot has changed since then. IBM and Sony provided toolkits for programming the SPUs in C++, C or assembly.

By design, this processor does not execute all instructions in the same unit. Instead, execution is split across two blocks, or execution pipelines. The two pipelines execute different types of instructions, so the SPU can issue two instructions per cycle whenever possible. On the other hand, the SPU never dual-issues two instructions that depend on each other, which reduces the chance of data hazards.
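The two conditions just described (different pipelines, no dependency) can be expressed as a tiny check. This is only a sketch under my own simplified model: the `Instr` record, the register names and the pipeline labels attached to each example instruction are illustrative, and real hardware applies further rules (such as instruction ordering and alignment) that are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    pipeline: str   # "even" or "odd"
    dest: str       # register written by the instruction
    sources: tuple  # registers read by the instruction


def can_dual_issue(first, second):
    """True if the pair satisfies the two conditions described in the text."""
    different_pipelines = first.pipeline != second.pipeline
    independent = first.dest not in second.sources and first.dest != second.dest
    return different_pipelines and independent


if __name__ == "__main__":
    add = Instr("add", "even", "r3", ("r1", "r2"))
    load = Instr("load", "odd", "r4", ("r5",))
    print(can_dual_issue(add, load))  # True
```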

Let's look at the two pipelines [21]:

Odd Pipeline

Simplified diagram of the odd pipeline.

The odd pipeline executes most instructions other than the arithmetic ones. First of all, there is the SPU Load/Store Unit (SLS), which plays three important roles:

  • It contains 256 KB of local memory that stores instructions and data. The memory is single-ported (considering that this is a critical area, it is a bit disappointing that a dual-ported chip was not used...) and its address bus is 32 bits wide.
  • It executes load and store instructions.
  • It forwards instructions to other blocks to be issued.

Note that there are only 256 KB available to store the program. Since SPU programs can be compiled from C/C++, it is not easy to predict the size of the resulting binary. For this reason, developers are advised to assume that only half of the available memory (128 KB) is usable. [22]

Finally, there is the SPU Channel and DMA Transport (SSC) unit, which the Memory Flow Controller uses to fill and fetch from local memory.

Even Pipeline

Simplified diagram of the even pipeline.

What is noteworthy about the even pipeline is its computational power.

There is a proper fixed-point unit (FXU) that performs basic arithmetic, logical operations (AND, OR, etc.) and word shifts.

Finally, there is a floating-point unit (FPU) that performs single-precision (32-bit float), double-precision (64-bit double) and integer (32-bit int) operations. It follows the IEEE standard, with some deviations (floats behave much like they did on the PS2).

Inside Cell: Programming Styles

Now that we are nearing the end of Cell, how should developers program this monster? Similar to the programming models previously devised for the Emotion Engine, IBM proposed the following methodologies [23]:

PPE-centric approach

Representation of the multi-stage pattern. The PPE assigns a task, which travels through each SPE and is finally returned with the processed data.

Representation of the parallel pattern, in which the PPE assigns a subtask to each SPE, each SPE returns its processed data, and the PPE merges it.

Representation of the service pattern, in which the PPE assigns a different task to each SPE, and each SPE executes its task independently.

The PPE-centric approach is a set of programming patterns that places the primary responsibility on the PPE and uses the SPEs to offload work. There are three possible patterns:

  • Multi-stage pipeline model: The PPE sends work to one SPE, which performs the necessary calculations and passes the result to the next SPE. This continues until the last SPE in the chain sends the processed data back to the PPE. IBM does not recommend this design for primary tasks, since it requires a lot of bandwidth and tends to be difficult to maintain.
  • Parallel stages model: The PPE splits a task into subtasks and assigns one to each SPE. Each SPE returns its processed data, and the PPE merges the results.
  • Services model: The PPE assigns a different task to each SPE, and each SPE executes it independently. This means that each SPE has a single job, but that job assignment does not last forever. The PPE must reassign jobs on the fly as the program's needs change.
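The PPE-centric patterns can be mimicked with ordinary threads to see how the work flows. This is a rough sketch, not Cell code: Python threads stand in for SPEs, the calling code stands in for the PPE, and every function name here is invented for the example. The default of seven workers simply mirrors the seven enabled SPEs mentioned earlier.

```python
from concurrent.futures import ThreadPoolExecutor


def multistage_pipeline(data, stages):
    """Multi-stage: each 'SPE' processes the data and hands it to the next one."""
    for stage in stages:
        data = stage(data)
    return data  # the last stage hands the result back to the 'PPE'


def parallel_stages(items, worker, num_spes=7):
    """Parallel: the 'PPE' splits the work, each 'SPE' runs the same worker on
    its slice, and the 'PPE' merges the partial results in the original order."""
    chunks = [items[i::num_spes] for i in range(num_spes)]
    with ThreadPoolExecutor(max_workers=num_spes) as pool:
        partials = list(pool.map(lambda chunk: [worker(x) for x in chunk], chunks))
    merged = [None] * len(items)
    for spe, partial in enumerate(partials):
        merged[spe::num_spes] = partial
    return merged


def services(tasks):
    """Services: each 'SPE' is given a different job and runs it independently."""
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {name: pool.submit(job) for name, job in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(multistage_pipeline(3, [lambda x: x + 1, lambda x: x * 2]))  # 8
    print(parallel_stages(list(range(10)), lambda x: x * x))
    print(services({"audio": lambda: "mixed", "physics": lambda: "stepped"}))
```

Notice that in all three cases the caller keeps control of who does what, which is exactly what makes these patterns "PPE-centric".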

SPE-Centric Approach

Representation of the SPE-centric pattern, where each SPE is responsible for its own work and interacts with the PPE only to obtain resources.

Instead of using the SPEs to provide services to the PPE, the opposite is true. Using its built-in DMA unit, each SPE fetches and executes tasks stored in main memory, while the PPE is limited to resource management.

This model is much more radical than the others, in the sense that the previous patterns are closer to the traditional paradigm of a "general-purpose processor with co-processors", as seen on PCs. Therefore, codebases that implement SPE-centric algorithms may be harder to port to other platforms.

Conclusion

As you can imagine, Cell's multi-core design can accelerate emerging techniques such as procedural generation, but none of these designs is particularly easy to implement, especially for studios that want a codebase that can be shared across different platforms.

In one example, the developers of Unreal Engine 3 (Epic Games) demonstrated the limits of the SPU when implementing their collision detection system. [24] Their design relies on Binary Space Partitioning (BSP), an algorithm that depends heavily on comparisons (branches). Since the SPU does not provide dynamic branch prediction as the PPU does, their implementation underperformed on the PlayStation 3 compared to other platforms (those that provide consistent prediction techniques across all cores, such as the Xbox 360 and i686 PCs). Therefore, Epic Games had to rely on further optimizations that only apply to Cell.

Software engineers need time, patience and a lot of learning to fully exploit Cell's potential. However, history has proven that this is not feasible for every studio, and I suspect this is why current console hardware (as of 2021) has become so homogenized.

Graphics

Considering all of Cell's quirks, you might think it can take on any task in this console. Well, let me tell you something amusing: Sony fitted a separate chip for 3D graphics.

  • Uncharted 3: Drake's Deception (2011).
  • The Elder Scrolls V: Skyrim (2011).
  • Killzone 3 (2011).
  • One Piece: Pirate Warriors (2012).

Examples of PS3 games. All are rendered at their maximum resolution (1280x720 pixels).

It seems that, even with a supercomputer chip, Sony had to procure a GPU to complete the PlayStation 3. This makes you wonder whether IBM, Sony and Toshiba hit a wall when trying to scale Cell further, leaving Sony with no choice but to seek help from a graphics company.

The ICE (Initiative for a Common Engine) team was started with the intent of developing a core technology that could be shared across all first parties (...) For a while, [the PS3] had no GPU; it was going to run everything on the SPUs. The ICE team proved to Japan that this was just impossible. It would be ridiculous. Performance-wise, it would be a disaster. That's why they finally added the GPU.

- Anonymous source at Naughty Dog

In any case, the PS3 includes an NVIDIA GPU chip to offload part of the graphics pipeline. The chip is called the Reality Synthesizer, or "RSX", and runs at 500 MHz. That clock speed looks worrying next to Cell's (3.2 GHz), but you will soon see that a GPU is better suited to parallel processing. In other words, it is important to find a balance between Cell and the RSX when building the graphics pipeline (honestly, this sounds simpler than it actually is).

As before, we will focus on the RSX and its graphics capabilities, and analyze them with the same level of depth as in previous articles.

Overview

Five years have passed since NVIDIA debuted the GeForce3/NV20 lineup in 2001. At that time, strong players such as 3dfx, S3 and ArtX/ATI were competing in this field. Since then, the number of companies has gradually shrunk and, by 2006, only ATI and NVIDIA remained as major video card suppliers in the PC market.

RSX chip next to Cell.

The RSX inherits existing NVIDIA technology and is reported to be based on the 7800 GTX model sold for PCs, which implements the GeForce7 (or NV47) architecture [27], also named "Curie".

In my previous Xbox analysis, I covered the pixel shaders that debuted with the GeForce3. Since then, most of the changes have been incremental, and there is not much that is groundbreaking compared to the GeForce3's pixel shaders.

On the other hand, whereas the 7800 GTX relies on the PCI Express protocol to communicate with the CPU, the RSX has been adapted to work with a proprietary protocol called Flex I/O [28]. Flex I/O works in two modes:

  • BIC mode, which connects other Cell processors (used in multi-processor environments).
  • IOIF mode, a lower-level mode that connects two peripherals using a "high speed" and a "low speed" slot.

Since the RSX is not a Cell, it uses the IOIF protocol and takes the fastest slot.

For comparison, IOIF operates as a 32-bit parallel bus with a theoretical bandwidth of up to 20 GB/s, while the PCI Express link used by the 7800 GTX (x16 1.0) is a 16-lane serial bus with a theoretical bandwidth of up to 4 GB/s.
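If you are wondering where the 4 GB/s figure for PCI Express comes from, the arithmetic is short. The sketch below uses the publicly known parameters of PCI Express 1.0 (2.5 GT/s per lane and 8b/10b encoding), which are not taken from this article, and counts one direction only.

```python
def pcie_gen1_bandwidth_bytes(lanes):
    """Peak one-direction bandwidth of a PCI Express 1.0 link, in bytes per second."""
    transfers_per_second = 2_500_000_000           # 2.5 GT/s per lane
    payload_bits = transfers_per_second * 8 // 10  # 8b/10b: 8 data bits per 10 line bits
    return lanes * payload_bits // 8


if __name__ == "__main__":
    print(pcie_gen1_bandwidth_bytes(16))  # 4000000000, i.e. 4 GB/s
```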

Content arrangement

The RSX is equipped with 256 MB of dedicated GDDR3 SDRAM. Surprisingly, this is the same type of memory fitted in the Wii. The memory bus operates at 650 MHz, with a theoretical bandwidth of up to 20.8 GB/s.
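The 20.8 GB/s figure can be reproduced with a quick calculation. Be aware that the sketch below assumes a 128-bit wide memory bus, a number that does not appear in this article; the double data rate (two transfers per clock) is inherent to GDDR3.

```python
def ddr_bandwidth_bytes(clock_hz, bus_width_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth of a double-data-rate memory bus, in bytes per second."""
    return clock_hz * transfers_per_clock * bus_width_bits // 8


if __name__ == "__main__":
    # 650 MHz clock, assumed 128-bit bus
    print(ddr_bandwidth_bytes(650_000_000, 128))  # 20800000000, i.e. 20.8 GB/s
```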

An example of how data is organized across the available memory. Note how the RSX can access content from different memory chips.

Within these 256 MB, Cell can place everything the RSX needs to render a frame. This includes vertex data, shaders, textures and commands. Thanks to Cell's Flex I/O bus, the RSX can also use the aforementioned 256 MB of XDR memory (the CPU's main RAM) as a working area, albeit with some performance penalty. This is useful for post-processing rendered frames with the SPUs.

As you can see, this console does not implement a UMA architecture, yet graphics data can still be distributed across different memory chips if the programmer decides to do so. I wanted to point out this feature because many "technical explanations" claim that "the PS3 was limited because it was not equipped with UMA". This may be true in specific cases but, unless those cases are mentioned, the general claim is misguided in my opinion.

Finally, the RSX supports a number of data optimizations to save bandwidth, such as 4:1 color compression, Z compression and a "tiled" mode (more about this later).

Building the frame

Next, let's take a look at how the RSX processes and renders a 3D scene.

Overview of the RSX pipeline.

The RSX pipeline model is very similar to the GeForce3's, but supercharged with five years of technological progress. I also recommend reading about the PlayStation Portable's GPU, since many of its developments and needs overlap with this chip. That said, let's look at what we have here [29].

Commands

Diagram of the command stage.

Like any other GPU, the RSX needs a circuit block to receive orders from the outside. In the case of the RSX, this is handled by two blocks: the Host and the Graphics Front End.

The Host is responsible for reading commands from memory (local or main) and translating them into internal signals that the other components of the RSX understand. This is done using four sub-blocks:

  • Pusher: Fetches graphics commands from memory and interprets branch instructions. It also includes a 1 KB prefetch buffer. The processed commands are sent to the FIFO Cache.
  • FIFO Cache: Stores up to 512 commands decoded by the Pusher in a FIFO, providing quick access.
  • Puller: As the name suggests, it takes commands from the FIFO Cache when the RSX is ready to render and sends them to the next unit.
  • Graphics FIFO: Stores up to eight commands to be read by the Graphics Front End.

The Graphics Front End reads commands from the Graphics FIFO and signals the necessary units inside the RSX to carry out the operation. If you remember, this is the equivalent of the GeForce3's 'pfifo'.

As you can see, commands and data pass through many buffers and caches before reaching their final destination. This is intentional: it prevents pipeline stalls caused by units and buses running at different speeds. In other words, cache memory is used as much as possible.
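To see how a chain of bounded buffers keeps commands flowing in order, here is a toy model of the Host stage. The two buffer sizes come from the text; everything else (the function name, the loop structure and the pace of one command consumed per iteration) is an assumption made for illustration and says nothing about the real timing of the hardware.

```python
from collections import deque

FIFO_CACHE_SIZE = 512    # commands, as quoted in the text
GRAPHICS_FIFO_SIZE = 8   # commands, as quoted in the text


def run_host(command_stream):
    """Move commands Pusher -> FIFO Cache -> Puller -> Graphics FIFO -> Front End.

    Returns the commands in the order the front end consumed them.
    """
    pending = deque(command_stream)  # commands still sitting in memory
    fifo_cache = deque()
    graphics_fifo = deque()
    executed = []

    while pending or fifo_cache or graphics_fifo:
        # Pusher: fetch from memory while the FIFO Cache has room.
        while pending and len(fifo_cache) < FIFO_CACHE_SIZE:
            fifo_cache.append(pending.popleft())
        # Puller: forward commands while the Graphics FIFO has room.
        while fifo_cache and len(graphics_fifo) < GRAPHICS_FIFO_SIZE:
            graphics_fifo.append(fifo_cache.popleft())
        # Front end: consume one command per iteration (an arbitrary pace).
        executed.append(graphics_fifo.popleft())
    return executed


if __name__ == "__main__":
    result = run_host(range(2000))
    print(result[:5], len(result))
```

However full or empty each buffer gets, the front end always receives the commands in the order they were written to memory.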

Vertex Shader

Note that if a vertex does not need to be processed further by the vertex shader, the Vertex Processing Engine (VPE) is bypassed.

The next unit is the Geometry Processing block, where vertex transformations are performed by an evolved version of the GeForce3's "vertex block". It is still programmable through vertex shaders, which by then were widely used in the graphics industry. In addition, the instruction limit has been raised to a minimum of 512 instructions (originally 136!).

The block that runs a shader is called a Vertex Processing Engine (VPE), and it can process one vertex per clock cycle. As if that were not enough, eight VPEs run in parallel. Since the GeForce6 series, NVIDIA has aligned its shader programming interface with a model called "Vertex Shader Model 3.0" or "vs_3_0". The VPE also supports the non-proprietary OpenGL 2.1 model [31] and NVIDIA's own variant (Cg) [32].

Compared to the GeForce3, there are new instructions for branching and subroutine calls. In addition, the VPE contains four texture samplers that fetch texture colors at this stage, in case the programmer wants to use this unit to perform operations on a texture.

The Geometry Processing block works as follows:

  • The Index Vertex Processor (IDX) fetches vertex data and textures from VRAM and caches them. Then, it sends the data to the VAB.
  • The Vertex Attribute Buffer (VAB) pulls the data from the IDX's cache and dispatches it to each VPE.
  • Each VPE processes the data according to the loaded shader. A VPE computes one shader instruction per clock.
  • The result of each VPE is sent to the Post Transform Cache, which caches results so that the same calculation is not repeated for identical vertices. This only applies when vertex indices are used instead of raw vertex data.
  • The final result is sent to the Viewport Cull Unit (VPC), which applies clipping to discard vertices outside the viewport.
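The benefit of the Post Transform Cache is easiest to see with indexed geometry, where neighbouring triangles share vertices. The following is a minimal sketch of the idea, not of the hardware: the cache here is an unbounded dictionary (the real one is small, and its size is not given in this article), and `transform_indexed` is a name invented for the example.

```python
def transform_indexed(vertices, indices, shader):
    """Run `shader` once per unique index, reusing cached results for repeats.

    Returns (transformed vertices in index order, number of shader runs).
    """
    cache = {}       # index -> transformed vertex (unbounded, unlike real hardware)
    shader_runs = 0
    output = []
    for index in indices:
        if index not in cache:
            cache[index] = shader(vertices[index])
            shader_runs += 1
        output.append(cache[index])
    return output, shader_runs


if __name__ == "__main__":
    # A quad drawn as two triangles that share vertices 1 and 2.
    quad = [(0, 0), (1, 0), (0, 1), (1, 1)]
    indices = [0, 1, 2, 2, 1, 3]
    double = lambda v: (v[0] * 2, v[1] * 2)
    transformed, runs = transform_indexed(quad, indices, double)
    print(runs)  # 4 shader runs instead of 6
```

Without indices, the pipeline has no cheap way of knowing that two vertices are the same, which is why the cache only helps in the indexed case.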

Rasterization

A simple diagram of the rasterization stage. The RSX embeds a variety of units to calculate the values used for pixel and color interpolation.

Next, the vertices are converted into pixels (rasterized). The RSX's rasterizer is very fast: it can rasterize up to 8x8 pixels (64 pixels) per cycle and works with frame buffers of up to 4096x4096 pixels (although developers will probably need less).

The rasterizer accepts points, lines (including strips and closed loops), triangles (including strips and fans), quads and polygons. Naturally, as with the rest of this console generation, the rasterizer works with sub-pixel coordinates, where the sampling points are half a pixel (0.5). This allows anti-aliasing methods such as multisampling to be applied later. Multisampling rasterizes the same geometry multiple times, shifting it by a few sub-pixels on each pass (the RSX supports four different shift modes), and then computes the average. The result is a smoother image.
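Here is a small sketch of the averaging idea behind multisampling. The four sample offsets and the test shape (everything to the left of a vertical edge at x = 1.5) are assumptions chosen to keep the example short; the article does not list the RSX's actual shift modes.

```python
# Four sub-pixel offsets around the pixel center. This pattern is an assumption
# for illustration only.
OFFSETS = [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]


def inside(x, y):
    """Hypothetical shape: the half-plane to the left of the vertical edge x = 1.5."""
    return x < 1.5


def pixel_coverage(px, py, shape=inside, offsets=OFFSETS):
    """Average the inside/outside test over shifted samples of pixel (px, py).

    The pixel center sits at (px + 0.5, py + 0.5), the half-pixel sampling point.
    """
    cx, cy = px + 0.5, py + 0.5
    hits = sum(1 for dx, dy in offsets if shape(cx + dx, cy + dy))
    return hits / len(offsets)


if __name__ == "__main__":
    for px in range(3):
        print(px, pixel_coverage(px, 0))  # 1.0, then 0.5, then 0.0
```

With a single sample at the center, the middle pixel would be either fully on or fully off. With four shifted samples it ends up half covered, which is what softens the edge.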

In addition, this unit performs Z-culling using dedicated RAM inside the RSX (with capacity for about 3 million pixels). This saves processing pixels that are hidden by geometry already rendered, since the Z test and stencil test can be performed early on incoming geometry.

A separate unit is used for rasterizing 2D objects (sprites), which is isolated from the 3D pipeline. As a result, the RSX operates in two modes, 2D and 3D, and switching between them is expensive in terms of performance.

Pixel Shader

Pixel/Fragment Stage Diagram.

Next, we have the Fragment Shader & Texture block. This is a programmable unit (using "fragment programs" or "shaders") that applies texture mapping and other effects.

An advanced successor to the GeForce3 texture unit, the new block contains six fragment units (also called "pipes"), each of which processes 2x2 texels (named "quads"). To organize multiple units working simultaneously, another sub-block called the Shader Quad Distributor (SQD) is placed to dispatch quads to each fragment unit. Each fragment unit then loads a fragment program.

To perform operations, each pipe contains a whopping 1536 128-bit registers. Moreover, each pipe can process multiple quads in parallel (multi-threaded), but the number of quads processed in parallel depends on the number of registers allocated to the fragment program (number of threads = 1536 / number of registers reserved by the shader). Globally, up to 460 quads can be processed in parallel. Furthermore, up to three fragment pipes can process two instructions at the same time (dual issue, like the PPU), as long as the instructions do not depend on each other.

The fragment unit provides the same kinds of instructions as the vertex unit, with the addition of texture-related opcodes, such as multiple types of texture fetches and unpacking (since textures are encoded using many structures and may also be compressed). Like the vertex block, the fragment shader conforms to DirectX's Pixel Shader 3.0 model [33], OpenGL's NV_fragment_program2 profile [34] and Cg's 'fp40' profile [35]. All of this makes programming easier and avoids having to learn low-level APIs from scratch.

Finally, because the unit constantly fetches pieces of textures from either video RAM or main RAM, this block contains three texture caches: a 4 KB L1 cache for each pipe, a 48 KB L2 cache for video RAM fetches, and a 96 KB L2 cache for main RAM fetches. Note that the main RAM cache is rather large; this was a sensible decision made to compensate for the higher latency.
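The trade-off between registers and threads in the fragment pipes boils down to a single division. The sketch below applies the formula quoted earlier (1536 divided by the registers reserved by the shader), rounding down, which is my assumption. It deliberately ignores the global ceiling of 460 quads, since the text does not say how that limit interacts with the per-pipe figure.

```python
REGISTERS_PER_PIPE = 1536  # 128-bit registers per fragment pipe, per the text


def quads_in_flight(registers_per_shader):
    """Threads (quads) one pipe can keep in flight for a given register budget."""
    if registers_per_shader < 1:
        raise ValueError("a fragment program reserves at least one register")
    return REGISTERS_PER_PIPE // registers_per_shader


if __name__ == "__main__":
    for registers in (4, 16, 32):
        print(registers, quads_in_flight(registers))  # 384, 96 and 48 respectively
```

In other words, a shader that is greedy with registers leaves the pipe with fewer threads to hide texture latency behind.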

Pixel operations

Before writing the results to the frame buffer (stored in VRAM or main RAM), a final block called the Raster Operations block (ROP) runs the final tests on the resulting pixels.

There are two ROP groups with four units each (eight in total). Each group handles Z testing, alpha blending and the final write to memory. Altogether, this circuit can process up to 16 Z values and 8 pixel colors per clock. Curiously, the PC version of the NVIDIA 7800 GTX has 16 ROPs instead of eight. Could this be to prioritize memory bandwidth for the SPUs?

To save further bandwidth, the ROP also provides color compression and Z compression. In addition, a tiled mode is available that optimizes memory access by the video encoder. In tiled mode, the frame buffer is stored in contiguous 128 B blocks, in the same order in which it will be scanned for broadcast. Because of this, the GPU no longer needs to perform page swaps (used for addressing memory) while transferring the frame buffer for display, which improves bandwidth. These "tiles" are stored in a marked location in memory reserved exclusively for this type of addressing.

Unified video output

Gone are the days when dozens of analog signals were squeezed into one socket so that a console could work in every region of the planet. The PlayStation 3 finally adopts HDMI (High-Definition Multimedia Interface), a unified video connection that would soon be adopted around the world and that carries audio and video at the same time.

The back of the PS3, with the HDMI output on the left and the old Multi A/V connector for analog video output on the other side.

The HDMI connector is composed of 19 pins [36], all in one socket. It carries a digital signal, which means that picture and audio are broadcast using discrete 0s and 1s (as opposed to the continuous range of an analog signal). As a result, it does not suffer from the interference or image degradation of previous systems, such as the screen artifacts produced by cheap SCART cables.

To this day, the HDMI protocol has been continuously revised [37]. New versions of the specification offer more features (larger image resolutions, higher refresh rates, alternative color spaces, etc.) while keeping the same physical medium for backward compatibility.

Throughout the PS3's lifecycle, Sony added certain features from newer HDMI revisions to the PS3 through software updates. [38] The last protocol version compatible with the PS3 was 1.4, which notably brought support for "3D TV". Other features such as higher video resolutions were not adopted, so the maximum remained at 1920x1080 pixels (and most games still render their frame buffers at 1280x720 pixels).

"True" 3D Vision/Projection

So what was the "3D TV" mentioned earlier? Well, coincidentally, the life of this console overlapped with a short-lived craze for 3D televisions (so-called 3DTVs). [39] To support these, Sony updated their SDK, helped the RSX render stereoscopic frames, and implemented the "3D spec" in their HDMI encoder. What happens behind the scenes is that the encoder broadcasts two frames at a time, and the TV alternates between them, just like Master System 3D glasses did 30 years ago.

Audio

While the need for better graphics tends to grow exponentially (consumers want more scenery, better details and colors), the demand for sound never reaches the same level. My assumption is that this is because its capabilities have already reached the limits of our perception (44.1 kHz sampling rate, 16-bit resolution). All that is left is to implement more channels and effects but, at least in consumer devices, that does not need the kind of processing power that would justify installing dedicated chips.

Audio Pipeline Summary

In the end, audio is implemented entirely in software and processed by the SPUs (Synergistic Processor Units, not Sound Processing Units! It's a bit ironic that the initials are the same...). To help with this, Sony's SDK provides a number of libraries that make the SPUs perform audio sequencing, mixing and streaming. If that's not enough, you can also apply plenty of effects.
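To give a flavor of what "mixing in software" involves, here is a minimal sketch that combines two streams of signed 16-bit samples and clamps the result so loud passages do not wrap around. It is a generic illustration of the technique and does not reflect the API of Sony's libraries; the function and its parameters are made up for this example.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(stream_a, stream_b, gain_a=1.0, gain_b=1.0):
    """Mix two sequences of signed 16-bit samples, clamping to avoid wrap-around."""
    mixed = []
    for a, b in zip(stream_a, stream_b):
        sample = int(round(a * gain_a + b * gain_b))
        mixed.append(max(INT16_MIN, min(INT16_MAX, sample)))
    return mixed


if __name__ == "__main__":
    print(mix([1000, 30000, -30000], [2000, 10000, -10000]))
    # [3000, 32767, -32768]
```

This kind of per-sample loop maps naturally onto a vector unit, since many samples can be added and clamped at once.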

But where does the audio signal for broadcast go? To the RSX. This chip also includes a port for broadcasting the raw audio signal to a TV. Before sending the signal, it is encoded in different formats depending on the selected output (analog, HDMI or S/PDIF, the latter also known as "digital audio").

I/O and backward compatibility

All I/O operations are delegated to another huge chip called the Southbridge [40]. This is very similar to the architecture the original Xbox used back in the day. It seems that the architectural gap between consoles is narrowing, or perhaps this approach has proven to be very reliable and architecture-agnostic.

The big Southbridge chip overseeing the small I/O chips and the interfaces.

The same photo with the important parts labeled.

Like the PS2's IOP, the Southbridge is a completely custom design, but this time made by Toshiba (they called it the "Super Companion Chip" [41]). So, while the Southbridge is still an undocumented piece of silicon, it does an excellent job of integrating many interfaces and protocols, both external (USB, Ethernet, etc.) and internal (SATA, etc.). For reference, the slow clock speed of the PS2's IOP bottlenecked high-speed interfaces like ATA and Ethernet, greatly reducing their overall bandwidth.

In addition, the Southbridge implements encryption algorithms to seamlessly protect data carried over standard protocols, such as hard drive data.

Southbridge connection diagram

Overall, the Southbridge is loaded with a huge amount of interfaces, which has to do with the fact that this console was designed within the "multimedia hub" trend. A console can't just play games, it needs to be a DVD and Blu-ray player, a set-top box (partially) and a photo viewer (using a multi-card reader to pull in camera photos).

External interfaces

Like many PC towers of the time (including mine), it includes a multi-card reader. Next to it, there are four USB 2.0 ports. This looked pretty "premium" for a console that cost £425! (£628 in 2021 money).

The same console with the lid closed. To cut costs further, later models sealed off the bay and removed the card reader and two USB ports.

For user-accessible ports, the Southbridge is connected to:

  • USB 2.0 hub: Provides four front USB A ports. These can be used to connect and charge accessories and controllers.
  • Serial ATA (SATA) interface: Connects the Blu-ray drive and the 2.5" hard drive. Until 2008, Blu-ray readers interfaced with Parallel ATA [42], and therefore an intermediate chip performed the SATA to PATA conversion.

"Less wires" equipment

Thanks to the widespread adoption of Bluetooth technology, wired controllers are a thing of the past. The new iteration of the PS2's DualShock 2 controller is called Sixaxis and, although not as radical a change as other manufacturers opted for, it includes a gyroscope that enables a new type of human input. However, this came at the expense of haptic feedback (rumble). A year later, Sony surprised players with the DualShock 3, which brought back the haptic motors.

Another novelty is that the console can now be powered on from the wireless controller.

Internal interfaces

As for the internal components, the Southbridge connects:

  • Starship 2: Adapter for two 128 MB NAND flash chips. Behind the scenes, Starship bridges the Southbridge's local bus with the standardized "Common Flash Interface Protocol" (widely adopted for interfacing with flash memory) [43]. The PS3 stores its operating system on these, among other things.

            The EE+GS chips send the video signal directly to the RSX.

            • These chips are not accessible to developers and are only used for backward compatibility!
              • Backwards Compatibility
              • Now that we've talked about the PS2 chips, it's time to talk about the backward compatibility of the PlayStation 3.
              • First, let's introduce how backward compatibility works in general: a console is able to play the games of its predecessors with the help of software (which tells the existing hardware to behave as expected by the old games) and hardware (either the existing hardware provides full or partial backward compatibility, or the company adds additional chips to recreate the old system within the new motherboard). With the amount of processing power the PS3 shows, it is expected that Sony will ship a PS2 emulator that runs within the Cell and is accelerated by the RSX. But for some reason, Sony decided not to do that and instead put the PS2 chipset in one corner of the motherboard.

              There's a big EE+GS chip, two 16MB RDRAM modules, and a "PS2 bridge".

              The same photo, with the important parts labelled:

              Meanwhile, the missing but less important chips (IOP, SPU, etc.) are recreated by software running on the Cell. For game saves, users initially had to get a memory card adapter, but after a later software update, the memory card was emulated as a disk image stored on the hard disk, and MagicGate (the encryption system) was seamlessly handled by one SPU.

              Since Cell and RSX are "on" during PS2 gameplay, two different scaling methods are provided to expand the screen area during gameplay: "nearest neighbor" or "smoothing (anti-aliasing)".
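To make the difference between the two scaling options more concrete, here is a minimal sketch in Python. It is not Sony's implementation: the article does not say which filter the "smoothing" option uses, so bilinear interpolation is assumed here as a stand-in, and the function names `upscale_nearest` and `upscale_smooth` are made up for illustration. Images are plain lists of rows of grayscale values.

```python
def upscale_nearest(image, factor):
    """Nearest neighbor: repeat every source pixel `factor` times per axis."""
    height, width = len(image), len(image[0])
    return [
        [image[y // factor][x // factor] for x in range(width * factor)]
        for y in range(height * factor)
    ]


def upscale_smooth(image, factor):
    """Assumed 'smoothing': bilinear blend of the four closest source pixels."""
    height, width = len(image), len(image[0])
    output = []
    for out_y in range(height * factor):
        # Map the output row back to a (fractional) source row, clamped to the edge.
        src_y = min(out_y / factor, height - 1)
        y0 = int(src_y)
        y1 = min(y0 + 1, height - 1)
        fy = src_y - y0
        row = []
        for out_x in range(width * factor):
            src_x = min(out_x / factor, width - 1)
            x0 = int(src_x)
            x1 = min(x0 + 1, width - 1)
            fx = src_x - x0
            # Blend horizontally on both rows, then vertically between them.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        output.append(row)
    return output


if __name__ == "__main__":
    tiny = [[0, 100]]
    print(upscale_nearest(tiny, 2))
    print(upscale_smooth(tiny, 2))
```

Nearest neighbor keeps hard pixel edges, while the blended variant introduces intermediate values between neighboring pixels, which is what softens the picture.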

              • The PS3 user interface showing the game entry after inserting a PS2 disc. (Don't worry about the other icons, they're not official).
              • As mentioned above, the PS3 can run PS2 games with great compatibility. Moreover, it can take advantage of the new features of the new console (wireless control, HDMI interface, virtual memory card).
              • In addition, PS1 games can also be run without the need for an old SoC or GPU (relying on pure software emulation).

              A strange ending

              Throughout the PS3's lifecycle, Sony gradually removed the PS2-specific chips from the PS3 motherboard, making backward compatibility software-only (with bigger limitations, such as only running PS2 games purchased from the online store). Since Sony did not replace the PS2 chipset (as they had previously replaced the PS1 hardware inside the PS2), it makes me wonder about the technical and business rationale behind this. Now, as a case study, here is my brief opinion on why this is the case:

              Timing: Sony probably intended for PS2 owners to buy the new product as a trade-in for their current PS2. However, for some reason Sony was unable to have a software emulator ready by the launch date, so initially they resorted to adding in the chips. Then, as software emulation got going, they gradually removed the additional chips in further motherboard revisions.

              To complement this, developer "M4j0r" commented: "What may be interesting is that Sony developed two hardware emulation revisions at the same time (EE/GS and GS only), since some games may work better depending on which one you use."[44]

              I understand that the reduction in power consumption is also related to other factors, such as the new revisions of Cell and RSX. However, I think the PS2 chipset plays an important role too.

              Personally, I believe that pure software emulation is the most viable option in the long term thanks to its extensibility, customization, and independence from proprietary hardware. But, of course, as the volunteer community that continues to develop PCSX2 shows, it takes more effort to implement accurately (note that the above-mentioned emulator only works on x86 PCs).

              Horizontal compatibility

              • The compatibility story is not over yet! Surprisingly, the PS3 is also able to play some PlayStation Portable games. The emulation is done entirely in software, just like PS2 compatibility in the later models.
                • Since the PS3 has no UMD disc reader, you need to access the game catalog of Sony's online store to download and install PSP games.

                Operating system

                • Now that home consoles have become powerful multimedia hubs, a more complex operating system needs additional abstraction layers to provide more services to users and games, all while maintaining security and performance.
                • As a result, terms such as shell and BIOS are no longer used to describe this area. The general term is now "operating system", which covers many areas (boot loaders, kernels, user interfaces). As usual, I recommend checking out the PSP's OS first, since its modular design is reproduced on the PS3.
                • Cell privilege security
                • Before going into details, I need to mention the various operating modes of Cell. Initially, I was going to explain them in the "CPU" section, but since that section had become very dense, I introduce them here, where it is easier to see how they are used. Furthermore, these modes affect not only the operating system Sony developed for this console, but also the design of every operating system that runs on Cell.
                • That said, Cell implements a set of privilege levels inherited from the PowerPC specification to prevent unauthorized access to confidential data and resources. In other words, Cell executes programs in one of two modes, an unprivileged mode for ordinary programs and a privileged one:

                Privileged mode: Cell allows access to every corner of the hardware (registers, memory addresses, etc.). For security reasons, this mode is used only by the core of the operating system (that is, the kernel).

                In addition, Cell is prepared to run multiple operating systems at the same time and, to achieve this at the hardware level, "privileged mode" is further divided into privilege 1 and privilege 2. "Privilege 2" is used by kernels and "privilege 1" is used by a hypervisor, which mediates resources between the different kernels running at the same time.

                Hypervisor functionality has also been an area of research at IBM. [51] [52]
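As a rough mental model of this split, the sketch below shows a hypervisor arbitrating resources between two kernels. It is only an illustration: the class and method names (`Privilege`, `Hypervisor`, `request`) are invented, the value 3 for unprivileged code is a labeling convenience rather than a Cell term, and real hardware enforces these rules through processor state, not Python objects.

```python
from enum import IntEnum


class Privilege(IntEnum):
    HYPERVISOR = 1  # "privilege 1" in the text
    KERNEL = 2      # "privilege 2" in the text
    USER = 3        # unprivileged programs (numbering chosen for illustration)


class Hypervisor:
    """Toy arbiter: each resource may be owned by at most one kernel."""

    def __init__(self, resources):
        self._owner = {name: None for name in resources}

    def request(self, caller_level, kernel_name, resource):
        # Only kernels (or the hypervisor itself) may talk to the hypervisor.
        if caller_level > Privilege.KERNEL:
            raise PermissionError("unprivileged code cannot call the hypervisor")
        if resource not in self._owner:
            raise KeyError(resource)
        owner = self._owner[resource]
        if owner is not None and owner != kernel_name:
            return False  # already granted to another kernel
        self._owner[resource] = kernel_name
        return True


if __name__ == "__main__":
    hv = Hypervisor(["disk", "gpu"])
    print(hv.request(Privilege.KERNEL, "kernel A", "gpu"))
    print(hv.request(Privilege.KERNEL, "kernel B", "gpu"))
```

The point of the design is that neither kernel can take a resource on its own; both must go through the one component running at the highest privilege.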

                • Furthermore, the SPEs have an operating mode called isolated mode, which shields the code executing inside the SPU so that external units (the PPE or other SPEs) cannot access it until the SPU finishes. It can be enabled after uploading a program to any SPE, so that sensitive code (such as encryption routines) running on it cannot be tampered with.
                • Sony's operating system, described next, builds its security on all of the modes explained above.
                • Overview

                As mentioned earlier, the OS is very complicated. Therefore, to get through this section without much difficulty, the files that make up this console's operating system can be divided into different layers:

                Loaders: Long story short, the programs/binaries of this console are systematically encrypted. Thus, a "loader" is a program that runs the "real" program. That is, the loader fetches a binary, decrypts it, checks its integrity, and finally sends it to the corresponding processor (either the PPE or an SPE). This may not sound complicated, but loaders are further chained to protect the software. Finally, loaders reside on several storage media.

                Some loaders can be updated by Sony (through software updates), while others cannot. This has nothing to do with whether the loader is installed in rewritable storage: some loaders are encrypted with console-specific keys, so they cannot be changed after the console leaves the factory (at least, not by conventional means).

                Some binaries borrow code from the FreeBSD and NetBSD projects [54].

                Unlike the other layers, the destruction of this data does not lead to catastrophe.
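The job of a loader described in this overview (fetch a binary, decrypt it, check its integrity, and hand it to a processor) can be sketched as follows. Everything here is a toy under stated assumptions: the cipher is a made-up XOR keystream and the integrity check is a bare SHA-256 digest, neither of which is claimed to match what the console really uses, and the names `Package`, `pack`, `load`, and `toy_crypt` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return stream[:length]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


@dataclass
class Package:
    target: str        # which processor should run the payload: "PPE" or "SPE"
    ciphertext: bytes  # encrypted binary
    digest: bytes      # SHA-256 of the plaintext, used as the integrity check


def pack(key: bytes, target: str, plaintext: bytes) -> Package:
    return Package(target, toy_crypt(key, plaintext), hashlib.sha256(plaintext).digest())


def load(key: bytes, package: Package):
    """Decrypt, verify, and return (target, plaintext) ready for dispatch."""
    plaintext = toy_crypt(key, package.ciphertext)
    if hashlib.sha256(plaintext).digest() != package.digest:
        raise ValueError("integrity check failed, refusing to run the binary")
    return package.target, plaintext


if __name__ == "__main__":
    pkg = pack(b"example key", "SPE", b"payload bytes")
    print(load(b"example key", pkg))
```

Chaining then simply means that the plaintext returned by one `load` call is itself another loader holding the key for the next package.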

                OS security hierarchy

                Generally speaking, the OS of the PS3 is designed in the same modular way as the PSP. If you recall from the last article, the OS is composed of multiple modules. These modules may provide services to the user (like games and apps) or may exist in memory indefinitely to provide services to other modules (like in the form of system calls and drivers). Some modules have more privileged access than others (kernel modules vs. user modules).

                Diagram showing how the components of the PlayStation OS fit into the Cell privilege levels. "OtherOS" will be explained further in the next section.

                The operating system calls many modules with different privileges throughout its lifecycle. Sony built the OS so that modules operate at three privilege levels of the Cell:

                1. Level 1: There is a hypervisor programmed by Sony. This program, also known as Lv1, is the door to every bit of this console and is chained to exceptions triggered by the MMU. That said, the hypervisor only accepts requests by programs authorized by Sony (which exist at the next privilege level). While the hypervisor resides in memory, it also provides support for low-level system calls and the FAT16 file system.
                2. Level 2: Naturally reserved for the kernel, it is a privileged program also known as lv2 or "Supervisor". The kernel abstracts the hypervisor, so level 3 programs do not directly interact with it. The kernel provides multi-threading capabilities for both the PPU and SPU. Finally, the kernel bootstraps the userland modules.
                3. Level 3: The remaining programs (called userland/userspace), including games and visual shells, run at this level. These plebs are under the will of the kernel to communicate with the console's hardware and cannot unilaterally spawn new processes/programs.
                4. Storage media
                5. So where are these data stored? From the perspective of the average user, there are only two visible media: Blu-ray discs for games and hard disks for saves. However, there are a few others, so let's take a look at them one by one!
                  1. Cell BootROM
                  2. It turns out that the Cell has a small ROM hidden somewhere where manufacturers can store a "protected" boot loader. By providing this space, IBM is saving any company (not just Sony) the trouble of manually implementing custom obfuscation techniques to protect the boot code.
                  3. This area is already physically protected by obfuscation, so its contents do not need to be encrypted. That makes it ideal for the first-stage boot loader (which is not encrypted), and the PlayStation 3 stores its initial boot stage here.
                  4. NAND/NOR flash memory

                  Do you remember the 256 MB of NAND Flash briefly mentioned before? This is where most of the operating system is stored. That was until Sony introduced the CECHH model at the end of 2007, which replaced the 256 MB of NAND with 16 MB of NOR. As a result, some files had to be moved elsewhere. To make things easier, let's first see what these chips store [55]:

                  Console-specific loaders: Specifically, two loaders called bootldr and metldr. These files are encrypted with keys burned in at manufacturing time, so they cannot be replaced!

                  However, Sony's hypervisor has a hidden function that can update these.

                  Thanks to their larger capacity, models with NAND Flash also store the remaining part of the OS (called GameOS or DevFlash). This includes the following:

                  Visual Shell (VSH): It inherits the characteristic interface of the PSP and bundles a large number of modules (plug-ins) and assets.

                  Emulators: The above-mentioned programs that allow PS1, PS2, or PSP games to run on the PS3. The PS2 emulator loaded depends on the revision of the console (whether the PS2 hardware is fully present, partially present, or absent).

                  Runtime libraries: Programs developed with Sony's SDK are dynamically linked to a series of libraries stored here.

                  Blu-ray player: A program that handles interaction with the Blu-ray drive and movie decoding.

                  System assets: Fonts and certificates that the binaries depend on to operate.

                  NAND consoles also store other data, such as the xRegistry (network settings, PlayStation Network account, list of paired Bluetooth devices), revocation records, and the OtherOS loader (explained in detail in a later section; it is really interesting).

                  Hard drive

                  The newly introduced 2.5-inch hard disk drive, ranging from 20 GB to 500 GB (as more revisions shipped), provides permanent storage for the following:

                  User content: Game saves, trophies (see the "Games" section for details), and other user-related data.

                  Game assets: Games can copy files from the disc to the hard disk to reduce loading times. The operating system treats these as "game data".

                  Cache: A separate 2 GB partition is provided for games, where they can temporarily store data (in case main RAM is not enough).

                  However, on NOR systems, the GameOS is also stored on the HDD. As a result, every time the user replaces the hard disk, the console requests an update file to reinstall the GameOS on the disk. Either way, neither NOR nor NAND systems will boot without a hard disk.

                  Some user data can be backed up on a USB stick and moved to another console if necessary, but this process reformats the new console before copying the old data.

                  eMMC

                  In 2012, Sony announced a redesigned revision of the console called "Super Slim" (model number: CECH-4xxx). It came in three varieties: one with a 250 GB hard drive, one with a 500 GB hard drive, and a third with 12 GB of built-in eMMC Flash. The first two options follow the filesystem layout implemented in the NOR models, while the third option follows the NAND layout, storing everything (system files and user data included) on the eMMC.

                  However, the eMMC model has a problem. According to the PS3 Dev Wiki [57], Sony seems to have included Panasonic's "MN66840" chip instead of the NOR chip, redirecting the NOR bus to the eMMC. This seems to be a cost-saving trick, since it reuses the same south bridge as the other variants.

                  Strangely, when the user installs a hard disk drive in the eMMC model, the console moves all user data from the eMMC to the new hard disk drive. As a result, the user can make full use of the hard disk, although the free space on the eMMC is wasted.
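The storage rules scattered across this section can be condensed into a small lookup. This is only a summary sketch of what the text states: the function name `storage_layout` and the dictionary keys are invented, and the real layout involves many more files and partitions than the three categories shown.

```python
def storage_layout(flash, *, hdd_installed):
    """Return where each category of data lives for a given Flash variant.

    `flash` is "NAND", "NOR" or "eMMC" (labels taken from the text).
    """
    if flash in ("NAND", "NOR"):
        if not hdd_installed:
            # Per the text, neither NAND nor NOR systems boot without a hard disk.
            raise RuntimeError("this model will not boot without a hard disk")
        gameos = "NAND" if flash == "NAND" else "HDD"
        return {"loaders": flash, "GameOS": gameos, "user data": "HDD"}
    if flash == "eMMC":
        # The eMMC model follows the NAND layout, but user data moves to a
        # hard disk as soon as one is installed.
        user_data = "HDD" if hdd_installed else "eMMC"
        return {"loaders": "eMMC", "GameOS": "eMMC", "user data": user_data}
    raise ValueError(f"unknown flash type: {flash!r}")


if __name__ == "__main__":
    for variant in ("NAND", "NOR", "eMMC"):
        print(variant, storage_layout(variant, hdd_installed=True))
```

Seen this way, the NOR models are the only ones where replacing the hard disk also removes GameOS, which explains why they ask for an update file afterwards.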

                  Boot Process

                  With everything covered so far, we can now look at how the system boots. The process is heavily layered, and the reason is simple: Sony doesn't want you to tamper with their hardware or software, so they use multiple layers of obfuscation and encryption to prevent you from breaking in and sideloading your own code.

                  In the next paragraphs, we'll explain what the console does when you press the power button. Note that this process has only been significantly changed once (after a hacker cracked it). So, for simplicity's sake, we'll start with the "original" boot process (the one implemented before system version 3.60) [58] [59] [60]:

                  Another chip in the motherboard (called Syscon) powers it up and executes instructions from its internal ROM. It then sends a "configuration ring" to the Cell over SPI (serial connection), initializing the Cell and deactivating the 8th SPU. Finally, it latches the power lines and brings the Cell to life.

                  The Cell's PPU reset vector points to a hidden ROM that contains a routine to find and decrypt the bootldr from Flash. The decrypted fragment is loaded by the first SPU in isolation mode.

                  After loading the bootldr, the isolated SPU initializes some of its hardware (XDR memory and I/O interfaces) and decrypts a binary called lv0 for the PPU to execute.

                  The PPU executes lv0, decrypts metldr (console-only loader) and sends it to the 3rd SPU.

                  SPU2 runs metldr, which then runs five more loaders in sequence:

                  lv1ldr decrypts and loads lv1, which contains the hypervisor, which takes over the first privilege level. In addition, lv1 sets up the hard drive, Blu-ray drive, and RSX.

                  lv2ldr decrypts and loads lv2, which contains the kernel, and runs it on top of the hypervisor. It also completes the initialization of RSX, PS2 emulation, Bluetooth, USB controller, and multi-card reader.

                  appldr decrypts and loads vsh (Visual Shell) and other dependencies. vsh allows the user to load games later.

                  isoldr decrypts and loads modules that run on the third SPU in isolation mode. These modules are important for security and perform many cryptographic functions throughout the console's lifecycle. As a result, the third SPU is reserved for security functions and cannot be used for games (leaving only six SPEs for games).

                  After loading vsh, the PPU hands control to the user through a graphical user interface: the iconic orchestral splash sound plays, followed by the XMB menu.
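The chain described above can be written down as a "who decrypts whom" table, which makes it easy to see how many stages sit between the hidden ROM and any given binary. This is a sketch built only from the steps listed in this section; the table name `LOADED_BY` and the function `trust_path` are made up for illustration, and the fifth loader that metldr is said to run is not named in the text, so it is left out.

```python
# Each entry maps a binary to the stage that decrypts and loads it.
LOADED_BY = {
    "bootldr": "Boot ROM",
    "lv0": "bootldr",
    "metldr": "lv0",
    "lv1ldr": "metldr",
    "lv2ldr": "metldr",
    "appldr": "metldr",
    "isoldr": "metldr",
    "lv1": "lv1ldr",             # hypervisor
    "lv2": "lv2ldr",             # kernel
    "vsh": "appldr",             # Visual Shell
    "isolated modules": "isoldr",
}


def trust_path(binary):
    """List every stage from the Boot ROM down to `binary`."""
    if binary != "Boot ROM" and binary not in LOADED_BY:
        raise KeyError(f"unknown binary: {binary!r}")
    path = [binary]
    while path[-1] in LOADED_BY:
        path.append(LOADED_BY[path[-1]])
    return list(reversed(path))


if __name__ == "__main__":
    print(" -> ".join(trust_path("vsh")))
```

Because every binary's path runs through metldr, compromising that single loader is enough to undermine everything loaded after it.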

                  Modifications to the boot process

                  In March 2011, a hacker known as "GeoHot" broke the security of metldr, undermining the reliability of subsequent loaders. Sony therefore retaliated by issuing hardware and software security updates. These fixes are detailed in the "Anti-Piracy" section of this article.

                  Visual Shell

                  Getting bored with this theory? Let's switch to something everyone can actually see: the visual shell.

                  The XrossMediaBar (XMB), a new user interface that had gained international recognition two years earlier, was slightly improved for couch operation (the so-called "10-foot user interface") and extended to take advantage of "full HD" resolution (1920x1080 pixels).

                  The PSP XMB (2004), rendered at 480x272 pixels.

                  The PS3 XMB (2006), rendered at 1920x1080 pixels.

                  While many of the features will be familiar to PSP users, Sony added new apps that take advantage of the potential of the Cell, RSX and Blu-ray drives. Many of these are related to multimedia (such as video players and picture slideshows), television (such as on-demand TV apps like the BBC iPlayer), social profiles (online avatars) and online purchases (such as PlayStation Now and PlayStation Store).

                  Also, with the assumption that the home game console will be shared by multiple members, the XMB supports multiple users, each of whom can use a different PlayStation Network account and save separate user data (purchased games and save data).

                  Just like the PSP, when you highlight a game, the background is displayed stylishly.

                  The XMB has a huge number of settings, which is especially useful when you need to set up a 1080p TV with 3.1 surround audio.

                  A variety of multimedia options.

                  The XMB can install games, updates, and expansions (DLC) using native package installers.

                  Finally, the inclusion of a hard disk drive is a relief for veterans who in the past were forced to purchase expensive dedicated storage (Memory Stick Pro Duo) when they ran out of space.

                  Lend me your PS3

                  Impressively, not all of the apps bundled with this game console had selfish purposes. With the advent of distributed computing and Cell's capabilities for data science projects, Stanford University teamed up with Sony to allow PlayStation 3 owners to contribute to medical research. The result was Folding@home (pronounced "fold at home").

                  Folding@home was an application installed on every PlayStation 3 that, when opened by a user, connected to a central server and ran protein simulations. In addition, the app could run in the background during off-peak hours.

                  Folding@Home would display the work accomplished since the user launched the app.[61]

                  Throughout Folding@home's lifetime, the computing power of 15 million PS3 users around the world supported Folding@home's research towards treating Alzheimer's disease.[62] Eventually, Folding@home and Sony shut down the app in 2012, with the former living on other platforms.

                  This is my personal opinion but, in contrast to the endless sensational articles about cryptocurrency mining, it is enjoyable to read about a project that made a global contribution by harnessing distributed computing. Let's not forget that new and powerful technology also has selfless applications developed for it.

                  Multi OS proposal

                  When IBM described Cell at the software level, it stated that Cell could run multiple operating systems at the same time thanks to its many execution cores. Sony took this idea further and added an XMB option to install a secondary OS [64]. This feature is called OtherOS and, in a nutshell, it is a partition manager (the XMB merely guides the user through shrinking the GameOS partition and allocating the new space to the second OS) plus a button to boot the second OS from the first one (since the OtherOS boot files are already installed in Flash). In other words, users just need to place the OS in the new partition. As a result, many Linux distributions (Ubuntu, Fedora, etc.) added the PS3 as another installation target. This can be considered the spiritual successor to Linux for PS2.

                  Red Ribbon GNU/Linux is a distribution dedicated to the PS3/Cell and is compiled for the PPC64 target.

                  Thanks to OtherOS, experienced users had the opportunity to develop their own applications running on Cell without license restrictions, which was particularly interesting for research and scientific purposes [66] [67]. For multimedia applications, the Blu-ray drive and the multi-card reader were accessible from OtherOS.

                  On the other hand, OtherOS may run with higher privileges than GameOS userland (at the kernel level), but it never rises above the hypervisor residing in memory. Therefore, hardware access from OtherOS still depends on the will of Sony's hypervisor, and the latter blocks access to the RSX's command buffer (which, among other components, prevents the shader units from being used to accelerate graphics operations). As a result, Linux distributions rely on software rendering (all graphics are drawn by Cell) and stream the frame buffer to the RSX for display. It is a pity that OtherOS cannot make full use of the console, but this was probably done to reduce the attack surface. Ironically, the way OtherOS uses Cell is similar to how IBM/Toshiba/Sony initially envisioned the PS3!

                  Sharing the fate of Folding@home, OtherOS was eventually removed in a subsequent update, although for different reasons (mainly security). Shortly afterwards, OtherOS was unofficially revived through software exploits and reverse engineering efforts. Nowadays, OtherOS is available if the user installs custom firmware. This is described in detail in the "Anti-Piracy and Homebrew" section.

                  At the time of writing, developer René Rebe is implementing a proper XF86 driver that makes use of the acceleration provided by the RSX and its 256 MB of memory. His work builds on other developments that lift the hypervisor's restrictions (initially through the discovery of software exploits, later through custom firmware). His progress has been published and the work continues thanks to voluntary donations. [69]
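To illustrate what "software rendering" means in this context, the sketch below has the CPU write every pixel into a plain block of memory, which would then be handed over for display. The 32-bit BGRX pixel layout and the helper names (`make_framebuffer`, `fill_rect`) are assumptions made for the example and are not taken from any PS3 Linux driver.

```python
BYTES_PER_PIXEL = 4  # assumed 32-bit pixels stored as blue, green, red, unused


def make_framebuffer(width, height):
    """Allocate a zeroed (black) frame buffer."""
    return bytearray(width * height * BYTES_PER_PIXEL)


def fill_rect(framebuffer, fb_width, x, y, rect_width, rect_height, rgb):
    """Fill a rectangle by writing pixels one scanline at a time on the CPU."""
    fb_height = len(framebuffer) // (fb_width * BYTES_PER_PIXEL)
    if x < 0 or y < 0 or x + rect_width > fb_width or y + rect_height > fb_height:
        raise ValueError("rectangle does not fit inside the frame buffer")
    red, green, blue = rgb
    pixel = bytes((blue, green, red, 0))
    for row in range(y, y + rect_height):
        start = (row * fb_width + x) * BYTES_PER_PIXEL
        framebuffer[start:start + rect_width * BYTES_PER_PIXEL] = pixel * rect_width


if __name__ == "__main__":
    fb = make_framebuffer(4, 2)
    fill_rect(fb, 4, 1, 0, 2, 1, (255, 0, 0))
    print(fb.hex())
```

Every operation that a GPU would normally accelerate (fills, blits, shading) has to be expressed as loops like this one, which is why the lack of RSX access was felt so strongly.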

                  Updatability

                  To close this long section, let's talk about how GameOS is updated.

                  In short, as with the PSP, Sony distributes a PS3UPDAT.PUP file that packages all the new OS binaries. Because of the console's security system, only files that are stored in rewritable storage (Flash, hard drive, eMMC) and are not encrypted with console-specific keys can be updated; everything else must remain the same.

                  PUP files are distributed through Sony's official website and the XMB's update assistant, and they are also included on game discs (every game embeds a PUP file matching the SDK version it was developed with). Since models equipped with NAND Flash only have 256 MB of capacity and store the entire OS there, Sony did not release update files larger than 256 MB.

                  • Games
                    • This section covers topics on games development, distribution, and services.
                    • Development ecosystem

                    Since this console integrates technology from various companies, including products already commercialized in other markets (such as NVIDIA's "GeForce 7" line of PC GPUs), developers were flooded with tools to assist software development. Note that this does not mean development was easy, but it should be judged in comparison with the assembly era.

                    • To program Cell, IBM and Sony shipped separate development suites. IBM targeted unrestricted environments such as Linux (and OtherOS), while Sony's tools explicitly targeted the PS3's GameOS as the only execution environment.
                      • That said, IBM distributed the IBM Cell SDK package free of charge. It contains a GCC toolchain modified to generate PPU and SPU binaries, enabling development in C, C++, Fortran, and assembly. In addition, since it is cross-platform, code can be compiled from other machines (x86 PCs, etc.). The SDK also includes low-level libraries that facilitate SIMD mathematics and SPU-PPU management. Finally, a fork of the Eclipse IDE was bundled.
                      • To alleviate the complexity of Cell development, IBM also developed another short-lived compiler called XLCL, which compiles OpenCL code (a C/C++ variant for parallel computing) for the PPU and SPU. However, it was only distributed through IBM's alphaWorks channel and remained experimental.

                      Well, how about Sony? As with the PSP SDK, they shipped hardware development kits (in many variations of size and enhanced features), along with a software suite of compilers, libraries, and debuggers that used Visual Studio 2008 (later 2010) as its IDE [71]. Sony only supported the PS3 as a target; its SDK contained the same GCC toolchain, but complemented with a large number of libraries supporting graphics, audio, and I/O. For graphics/RSX, Sony provided GCM to build raw commands, and PSGL, built on top of GCM, to provide an OpenGL ES API. To write shaders, NVIDIA provided Cg, a shading language similar to HLSL (the shading language defined by Microsoft), together with its compiler.

                      License-free development

                      The emergence of native homebrew (running on GameOS instead of OtherOS) gave rise to new open-source SDKs that avoid depending on libraries protected by Sony's copyright and thus prevent copyright litigation. One example is PSL1GHT, an SDK that, combined with ps3toolchain [72], provides a complete development suite for building legal homebrew (although running it still requires a modified/hacked console with signature checks disabled).

                      • Back in 2018, I built my own suite based on ps3toolchain and distributed it as a Docker container [73], so developers did not need to compile ps3toolchain themselves and could instead download a pre-compiled setup (saving hours of compilation time). The container also bundled many tools, such as NVIDIA's Cg shader compiler, and reduced the dependency issues I had found while experimenting with PSL1GHT-based projects. In the end, it was a fun experiment that taught me more about the development environment.
                      • Development outsourcing
                      • At this point, it is worth pointing out the growing popularity of a peculiar business model: the game engine. Instead of spending time and money developing a game from scratch, how about buying another company's codebase and building the game on top of it? That is what game studios like Epic Games envisioned [74]. Epic Games not only sold popular titles such as Unreal Tournament 3, but also licensed a stripped-down version (without assets) to other developers. This was packaged and named "Unreal Engine 3". In short, the game engine takes care of the basic parts (physics, lighting, etc.), so developers only need to add custom content (scripts, textures, models, sounds, etc.).
                      • Licensing game engines was not a new business model, but the harsh environment of the PS3 eventually made it an attractive option for development.

                      With game development covered, the next topic is distribution. This section explains the official ways in which PS3 games are distributed.

                      Blu-ray disc

                      An example of a retail game.

                      New generation, new medium. As the limits of existing media became apparent in both the game industry (capacity constraints) and the film industry (480i format) [75], it was only a matter of time before Sony announced another standard for the new wave of home appliances. The Blu-ray disc was the medium selected for this new console.

                      The Blu-ray data format responds to many needs across different industries, including high-resolution films, digital rights management (DRM), region locking, a new file system and an execution environment for Java programs. In the video game industry, retail games for the PlayStation 3 were distributed on copy-protected 25 GB or 50 GB Blu-ray discs. The PS3's drive can also read DVDs (at 8x speed) and CDs (at 24x speed), so it can play older games and movies.

                      Early titles run directly from the disc, but later games copy some assets to the hard drive to improve loading times. Even so, the game disc is always needed to start a game.

                      Online store

                      1. The PS Store app on the XMB.
                      2. The game catalog displayed after opening the store.
                      3. The search interface (which, by the way, bypasses the XMB's native keyboard).

                      An example of a game entry (with suggestions that can only be used on other game consoles ...).

                      The PS Store is essentially a website that can only be accessed through the XMB's PS Store app. Throughout its life cycle, its user interface was revamped several times, reflecting the global demand for flashier user interfaces.

                      On the digital store, Sony also took the opportunity to sell digital versions of PS1, PS2 and PSP games, branded "PlayStation Classics". These are downloaded and installed in the same way, but rely on a bundled emulator to run. In fact, PS2 Classics launch the same non-accelerated software emulator regardless of whether the PS3 model includes the PS2 chipset [82]! This brought the curtain down on hardware-based emulation on the PS3.

                      Network service

                      In addition to the online store, many online services were added to the platform, including the debut of the free online service PlayStation Network, which competed directly with Microsoft's paid service, Xbox Live.

                      PlayStation Network allows users to create a personal account, assign an avatar and use their new digital persona for multiplayer games, messaging and other social interactions. Completing specific events in a game awards trophies, which are displayed on the user's online profile (like medals of honor) to intimidate rivals and earn respect from friends.

                      A friends list (names are hidden).

                      Play a few online games and strangers will start sending you messages.

                      Last but not least, games can be updated just like the OS. When a game is launched, the XMB may prompt the user to download an update (in the form of a "package") that patches the game or adds new content. The update is installed on the hard drive and works in a similar way to a layered file system.
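
The layering idea can be sketched in a few lines. This is only an illustration: the file names, their contents and the "update first, disc second" lookup order are assumptions made for the example, not a description of Sony's implementation.

```python
from typing import Dict, List, Optional

Layer = Dict[str, bytes]


def resolve(path: str, layers: List[Layer]) -> Optional[bytes]:
    """Return the file from the first layer that contains it, or None."""
    for layer in layers:
        if path in layer:
            return layer[path]
    return None


# Hypothetical contents: the disc holds the original game,
# the hard drive holds the files installed by an update package.
disc = {"EBOOT.BIN": b"v1.00 code", "level1.dat": b"level one"}
update = {"EBOOT.BIN": b"v1.01 code", "dlc.dat": b"new content"}

# The update layer is searched first, so it shadows files on the disc.
layers = [update, disc]
patched_executable = resolve("EBOOT.BIN", layers)
untouched_asset = resolve("level1.dat", layers)
```

With this arrangement, patched files come from the hard drive while everything else is still read from the disc, which is consistent with the disc remaining necessary after an update.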

                      Anti-piracy and Homebrew

                      Everything you have just read must be protected in some way from "unauthorized" access. Read on for an overview of how Sony did it.

                      Security infrastructure overview

                      Many parts of the console already provide security features that don't need to be implemented manually in software:

                      SysCon is an unnamed proprietary chip (mentioned briefly in the boot process) that controls the power lines for the Cell, RSX, and South Bridge. Its EEPROM contains records that are read by modules in the operating system to determine which features are enabled and which are disabled.[83]

                      • I used the word "unnamed" because SysCon is just a microcontroller, either an off-the-shelf ARM7TDMI-S beefed up with MagicGate support (yes, the PS3 shares some of its DNA with the Game Boy Advance and later revisions of the PS2), or a custom 78K0R variant from NEC [84]. SysCon's internal firmware is what intrigues the most.
                      • SysCon and Cell communicate with each other using a serial interface (SPI) connected to the Cell's TEST component [85]. TEST provides many debug features on the Cell, but SysCon only connects to the "Pervasive logic" port, allowing SysCon to manage areas such as power and heat [86].

                      In addition to this, Sony implemented the following protections in software:

                      • A complex chain of trust that starts with the Cell's unencrypted boot ROM and ends with a graphical user interface (XMB) that only loads binaries encrypted by Sony, under the control of the kernel and hypervisor.
                      • The chain of trust implements multiple encryption algorithms, including asymmetric systems like RSA and ECDSA and symmetric systems like AES, combined with HMAC and SHA-1 (to verify data integrity).
                      • These special keys are used by bootldr and metldr (the early boot stages).
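
As a rough illustration of the integrity-checking role that HMAC and SHA-1 play, here is a minimal sketch using Python's standard library. The key and the "binary" are made up for the example, and nothing here reflects how Sony actually arranges its keys or file formats.

```python
import hashlib
import hmac


def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA1 tag over the data (20 bytes)."""
    return hmac.new(key, data, hashlib.sha1).digest()


def is_intact(key: bytes, data: bytes, expected_tag: bytes) -> bool:
    """Check the data against its tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, data), expected_tag)


# Hypothetical key and payload, for demonstration only.
demo_key = b"not-a-real-console-key"
binary = b"\x7fSELF...pretend this is a signed module"
tag = make_tag(demo_key, binary)

untouched_ok = is_intact(demo_key, binary, tag)
tampered_ok = is_intact(demo_key, binary + b"\x00", tag)
```

Any change to the data (or use of the wrong key) yields a different tag, which is how a loader can refuse a modified file before running it.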

                      Defeat

                      We've seen how much this console can do, but did we really think hackers would settle for the limited capabilities of OtherOS? Neither did Sony. As hackers would later attest, the company worked hard to protect some areas, but left others half-open.

                      The PS3 hacking community was very active, with many tools and documentation produced every year. I'll focus on some milestones that paved the way for an influx of content and homebrew development, but more information can be found in PS3History [88].

                      Hypervisor bypass

                        In 2010, after three quiet years in the hacking scene, things took a turn. George Hotz, already known for unlocking the first iPhone model (commonly known as the "2G", initially exclusive to Cingular/AT&T), managed to read and write protected areas of memory without being stopped by the hypervisor. He later published the exploit with a brief summary on his blog [89].

                        The exploit requires a Linux installation under OtherOS (for arbitrary, albeit restricted, code execution) and an external glitcher (to interfere with main RAM). Put simply, the hypervisor uses a hash table stored in main RAM to catalog memory blocks and their privilege levels, so that user programs cannot access protected memory. The attack breaks the consistency of that table to gain write access to it, and then uses that privilege to change its entries so the current program can reach every corner of memory.

                        In more detail, Hotz found that from Linux/OtherOS he could ask the hypervisor for many memory blocks that refer to the same physical address. If the XDR bus suffers external interference (from a glitcher sending an electrical pulse) while the program is deallocating those blocks, the deallocation ends halfway [90]. As a result, the hypervisor's hash table (located in RAM) still holds an entry for the allocated address, even though the hypervisor assumes the space has been freed. Hotz's exploit then requests more blocks, so the hypervisor extends the table with more entries, and this continues until a hash table entry overlaps the memory location of the stale block. Because the hash table still has the old entry granting the user access to that address, the hypervisor ends up giving the user the right to modify a hash table entry! The exploit then modifies the entry to extend its access to the whole memory space.
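
The core of the flaw, a mapping that outlives the memory it describes, can be modeled with a toy allocator. Everything below is a deliberately simplified assumption for illustration: the class, its fields and the explicit `grow_table` step are hypothetical and do not mirror the real hypervisor's data structures.

```python
class ToyHypervisor:
    """Toy model of a stale mapping; NOT the real hypervisor's design."""

    def __init__(self, num_pages: int = 4) -> None:
        self.free_pages = list(range(num_pages))  # physical pages nobody owns
        self.user_map = {}        # handle -> physical page the user may access
        self.table_pages = set()  # physical pages that hold the table itself
        self._next_handle = 0

    def allocate(self) -> int:
        """Give the user a page and record the mapping."""
        page = self.free_pages.pop(0)
        handle = self._next_handle
        self._next_handle += 1
        self.user_map[handle] = page
        return handle

    def deallocate(self, handle: int, glitched: bool = False) -> None:
        """Free a page; a glitch interrupts the call halfway through."""
        page = self.user_map[handle]
        self.free_pages.insert(0, page)  # the page is marked as free...
        if glitched:
            return                       # ...but the mapping is never removed
        del self.user_map[handle]

    def grow_table(self) -> int:
        """The table needs more room, so it takes a page from the free pool."""
        page = self.free_pages.pop(0)
        self.table_pages.add(page)
        return page

    def user_can_write_table(self, handle: int) -> bool:
        """True if the user's mapping points at a page storing the table."""
        return self.user_map.get(handle) in self.table_pages


hv = ToyHypervisor()
stale = hv.allocate()                # the user obtains a page
hv.deallocate(stale, glitched=True)  # the glitch leaves a stale mapping behind
hv.grow_table()                      # the table expands into the "free" page
print(hv.user_can_write_table(stale))  # prints True
```

In the real attack the table grows as a side effect of requesting more blocks; the sketch calls `grow_table` directly to keep the model short.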

                        Although this exploit requires an OtherOS environment running Linux, it allowed hackers to examine critical parts of the system that had been out of reach, so it was a major step towards further reverse engineering and research projects. Around the same time, Sony released software update 3.21, which removed OtherOS. Far from deterring hackers, this only gave them more reasons to speed up their work.

                        PS Jailbreak

                        In the second half of 2010, a group named "PS Jailbreak" announced a unique solution that ran Homebrew directly from the native shell (XMB under GameOS) without modifying the console's hardware. All of this was much to Sony's displeasure, and the company soon took legal action to block sales of the product.

                        "PS Jailbreak" consisted of a USB dongle that is plugged into a front USB port before turning on the console. The user then presses the power button, immediately followed by the eject button. If the steps are performed correctly, the user sees the normal XMB interface, but with an added "Install PKG" option and homebrew apps for dumping Blu-ray games to the hard drive.

                        Behind the scenes, the dongle performs a huge amount of work, which can be divided into two groups [91]:

                        • USB exploit: When the console is turned on, the dongle tricks the system into thinking it is connected to a 6-port USB hub. It then runs an intricate sequence of USB commands until it triggers a heap overflow, escalating access to the PS3 kernel (Level 2).
                        • Payload: This is also a complex package. It patches the original shell to enable hidden functions that are only available on debug units (such as the "Install PKG" entry), disables signature verification (to load arbitrary modules and packages) and redirects Blu-ray commands to the hard drive instead of the disc drive (to load games from the hard drive). The fact that a program could change this much from kernel level makes one wonder what the hypervisor was actually good for.

                        To complement this, I later asked M4j0r for details: "Interestingly, this is not an exploit in Sony's code. This part of Lv2 was written by Logitech, and the developers of the exploit may have had access to the source code (through a hack in 2008)."

                        The product was later reverse-engineered by other communities, and open-source clones soon appeared (such as PSGroove). Some forks even ran on Texas Instruments calculators. In any case, Sony responded quickly with software update 3.42 and closed off the gold mine.

                        Honorable mentions

                        Before we get to the grand prize of the PS3 homebrew scene, here are some other methods developed around the same time:

                        USB Jig: Also a USB stick, this time programmed to trick the console into entering factory service mode. The program built into the jig replicates what Sony provides to its engineers. The main benefit of service mode is that it allows the console to be downgraded to a PSJailbreak-compatible version. The payload was also available in the form of a Homebrew app for the PSP.[95] Sony responded by patching service mode to make it difficult to restore it to "normal" mode or to change the firmware from service mode, discouraging users from relying on it.

                        Optical Disc Emulator (ODE): A series of hardware products shipped by different companies (Cobra, E3, etc.). Instead of tampering with the firmware of the game console, it tampers with the SATA/PATA interface of the Blu-ray. ODE is a board that sits between the motherboard and the Blu-ray drive, acting as an intermediary that tricks the console into thinking it has a valid disc game inside, but instead loads a disc image from an external USB drive. There is a long gap in the history of PS3 hacking, a period of "unhackability" where there were no software exploits available for the new console. So, at a hefty price, ODE came along to fill that gap.

                        Downgraders: As Sony continued to mitigate the exploits with more software updates, users were left with no viable option other than downgrading to an exploitable firmware. So, companies such as E3 appeared, shipping specialized equipment that could overwrite the console's system the "hard way", by directly flashing the NAND and NOR chips. For obvious reasons, this method required more skill and patience than the USB-based alternative.

                        Isolated leak: This one is research-oriented rather than a "feature" (but indispensable for further development). The revocation data (used to blacklist compromised certificates) is parsed by lv2ldr, and it turns out that this process has many vulnerabilities. First, for unexplained reasons, the revocation data is writable from userland. Second, the parser does not check the bounds of the data it fetches (again). As a result, hackers managed to craft custom revocation data that triggers a buffer overflow and ultimately runs arbitrary code in the SPU's isolation mode. This gave them access to confidential data (such as keys) that is supposedly shielded from the rest of the system.
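
To show why a missing bounds check matters, here is a toy simulation. Python itself cannot overflow a buffer, so the sketch models memory as one flat `bytearray`; the buffer size, the one-byte length field and the layout are all invented for the example and say nothing about the real revocation data format.

```python
BUF_SIZE = 8  # hypothetical size of the parser's buffer


def new_memory() -> bytearray:
    """Flat toy memory: an 8-byte buffer followed by 8 bytes of adjacent state."""
    return bytearray(b"\x00" * BUF_SIZE + b"RETADDR!")


def parse_unchecked(memory: bytearray, blob: bytes) -> None:
    """Copy the payload while trusting the attacker-controlled length field."""
    length = blob[0]
    for i in range(length):          # never compared against BUF_SIZE
        memory[i] = blob[1 + i]


def parse_checked(memory: bytearray, blob: bytes) -> None:
    """Same copy, but reject entries that do not fit in the buffer."""
    length = blob[0]
    if length > BUF_SIZE or length > len(blob) - 1:
        raise ValueError("entry does not fit in the buffer")
    for i in range(length):
        memory[i] = blob[1 + i]


# A crafted entry that declares 16 bytes for an 8-byte buffer.
evil = bytes([16]) + b"A" * 8 + b"HIJACKED"

mem = new_memory()
parse_unchecked(mem, evil)           # the adjacent state is overwritten
overwritten = bytes(mem[BUF_SIZE:])
```

On real hardware the overwritten region could hold a return address or other control data, which is how an overflow turns into code execution; the checked variant simply refuses the oversized entry.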

                        The fall of encryption

                        As in the PSP saga, early exploits took a lot of effort to develop and were easily patched by Sony, which left hackers at a disadvantage. However, as also happened with the PSP, it was only a matter of time before someone found a flaw in the system's fundamental security, that is, the chain of trust.

                        In 2011, George Hotz (along with the fail0verflow team) published the private encryption key Sony uses to sign the binaries executed by metldr. Binaries loaded at this boot stage are signed with an ECDSA key. Since it is an asymmetric system, anyone holding the private key (Sony, and now everyone else) can encrypt and sign binaries that metldr will regard as "authentic". metldr is the third boot stage and runs before Lv1 (the hypervisor) is loaded, so hackers can customize and develop the hypervisor, the kernel and everything below them. Moreover, every PlayStation 3 on the market will treat the custom binaries as authentic. In short, it is a Pandora-style exploit carried out entirely in software.

                        The discovery of this key, which should have been infeasible to compute, was possible because Sony's implementation of the ECDSA algorithm had a flaw. Long story short, the formula used in ECDSA relies on a random value, but that value never changed across the update files distributed by Sony [97], which turned it into a constant. From there, it was easy to solve for the other variables, and that is what eventually happened.

                        The effects of this discovery are described in the next section.
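
The algebra behind this failure can be worked through in a few lines. In ECDSA, a signature is the pair (r, s) with s = k⁻¹(z + r·d) mod n, where d is the private key, k the per-signature random value and z the message hash. If two signatures share the same k (and therefore the same r), subtracting them cancels d and reveals k, after which d follows directly. The sketch below is a toy under stated assumptions: it uses a small prime `N` in place of a real curve order, and `r` is produced by a stand-in function instead of an elliptic-curve point multiplication, since the recovery only depends on the signing equation.

```python
import hashlib

N = 2**127 - 1  # toy prime standing in for the curve's group order


def digest(msg: bytes) -> int:
    """Message hash reduced modulo N."""
    return int.from_bytes(hashlib.sha1(msg).digest(), "big") % N


def toy_sign(d: int, k: int, msg: bytes):
    """Sign with private key d and nonce k using the ECDSA signing equation."""
    r = pow(5, k, N)  # stand-in: real ECDSA takes r from the point k*G
    s = pow(k, -1, N) * (digest(msg) + r * d) % N
    return r, s


def recover_private_key(msg1: bytes, sig1, msg2: bytes, sig2) -> int:
    """Recover d from two signatures that reused the same nonce."""
    r1, s1 = sig1
    r2, s2 = sig2
    if r1 != r2:
        raise ValueError("signatures do not share a nonce")
    z1, z2 = digest(msg1), digest(msg2)
    k = (z1 - z2) * pow((s1 - s2) % N, -1, N) % N  # s1 - s2 = (z1 - z2) / k
    return (s1 * k - z1) * pow(r1, -1, N) % N      # d = (s1*k - z1) / r


# Hypothetical key, nonce and messages for the demonstration.
secret_key = 0x1234567890ABCDEF
reused_nonce = 0xCAFEBABE
sig_a = toy_sign(secret_key, reused_nonce, b"update file A")
sig_b = toy_sign(secret_key, reused_nonce, b"update file B")
recovered = recover_private_key(b"update file A", sig_a, b"update file B", sig_b)
```

Here `recovered` equals `secret_key`, even though the recovery function only saw public data: two messages and their signatures.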

                        The era of custom firmware (CFW)

                        Cracking metldr meant that anyone could now build an "official" system for the PS3. As a result, various communities created their own "flavors" of GameOS with assorted customizations. These systems were Sony's official firmware files (distributed by Sony as updates), modified and re-packaged using Sony's leaked keys so that they could be installed on any console. They became known as custom firmware (CFW), the de facto method for hacking this console until Sony took strict countermeasures.

                        A CFW with the VSH menu open. This variant (called "Rebug") also made it possible to debug the console (note the IP address at the bottom right, used to attach a debugger to running processes).

                        Over time, many CFWs appeared on the Internet under various names ("Rebug", "Ferrox", etc.), with customizations such as [98]:

                        • Disabling signature verification for modules, whether already installed or about to be installed.
                        • Using hypervisor (Level 1) or kernel (Level 2) privileges to read and write any memory address (the classic "peek and poke").
                        • Enabling hidden debug functions and installing packaged modules as "pkg" files. These did not need to be signed with Sony's key to work in a CFW environment.

                        Elim Rim - Journalist, creative writer

                        Last modified 03.09.2025
