Thursday, February 7, 2013

SATA & PCIe

SATA
Serial ATA (SATA) was designed to replace the older AT Attachment standard (ATA; later referred to as Parallel ATA or PATA). The operation of SATA host controllers is defined by Intel's Advanced Host Controller Interface (AHCI) specification. AHCI describes a system memory structure that hardware vendors implement to exchange data between host system memory and attached storage devices, and it exposes advanced SATA features such as hot-plugging and Native Command Queuing (NCQ). AHCI is supported out of the box on Windows Vista and newer versions of Windows.

Many SATA controllers offer selectable modes of operation: legacy Parallel ATA emulation, standard AHCI mode, or vendor-specific RAID (which generally enables AHCI). Windows Vista and Windows 7 do not configure themselves to load the AHCI driver at boot if the SATA controller was not in AHCI mode at the time of installation. For this reason, Intel recommends choosing RAID mode on its motherboards (which also enables AHCI) rather than AHCI/SATA mode for maximum flexibility.

SATA uses a point-to-point architecture: the physical connection between a controller and a storage device is not shared with other controllers or storage devices. SATA also defines port multipliers, which allow a single SATA controller port to drive multiple storage devices. The multiplier performs the function of a hub; the controller and each storage device are connected to the hub. Because the devices behind a multiplier share one link to the controller, SCSI drives provide greater sustained throughput than multiple SATA drives connected via a simple (i.e. command-based) port multiplier, thanks to disconnect-reconnect and aggregating performance.

SATA revision 1.0 - 1.5 Gbit/s - 150 MB/s
  • Taking 8b/10b encoding overhead into account, first-generation interfaces have an actual uncoded transfer rate of 1.2 Gbit/s (150 MB/s).
  • First-generation devices do not support Native Command Queuing (NCQ).
SATA revision 2.0 - 3 Gbit/s - 300 MB/s
  • Backward compatible with SATA 1.5 Gbit/s.
  • Taking 8b/10b encoding into account, the maximum uncoded transfer rate is 2.4 Gbit/s (300 MB/s). 
SATA revision 3.0 - 6 Gbit/s - 600 MB/s (May 27, 2009)
  • Taking 8b/10b encoding into account, the maximum uncoded transfer rate is 4.8 Gbit/s (600 MB/s). 
  • An isochronous Native Command Queuing (NCQ) streaming command that enables isochronous quality-of-service data transfers for streaming.
  • An NCQ Management feature that helps optimize performance by enabling host processing and management of outstanding NCQ commands.
  • The enhancements are aimed at improving quality of service for video streaming and high-priority interrupts. 
SATA revision 3.1
  • mSATA: SATA for solid-state drives in mobile computing devices, using a PCI Express Mini Card-like connector (electrically compatible).
  • On an mSATA connector, the data signals (TX±/RX± for SATA; PETn0, PETp0, PERn0, PERp0 for PCI Express) must be routed to the SATA host controller instead of the PCI Express host controller.
  • Queued TRIM command, which improves solid-state drive performance.
  • Required Link Power Management, which reduces the overall system power demand of several SATA devices.
  • A small low insertion force (LIF) connector for more compact 1.8-inch storage devices.
SATA revision 3.2 - SATA Express
  • SATA Express uses SATA software protocols over the PCIe hardware interface to increase SATA transfer speeds up to 8 Gbit/s or 16 Gbit/s.
  • µSSD (micro SSD) introduces a ball grid array electrical interface for miniaturized, embedded SATA storage.
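
The 8b/10b figures quoted for SATA revisions 1.0 to 3.0 above all follow from the same arithmetic: every 8 data bits are sent as 10 line bits. The following is a minimal sketch of that calculation; the function and dictionary names are made up for illustration.

```python
# Line rates (Gbit/s) as quoted for each SATA revision in this post.
SATA_LINE_RATE_GBITS = {"1.0": 1.5, "2.0": 3.0, "3.0": 6.0}


def sata_payload_rate(line_rate_gbits):
    """Return (uncoded Gbit/s, MB/s) for a given SATA line rate."""
    # 8b/10b coding sends 10 line bits for every 8 data bits.
    uncoded_gbits = line_rate_gbits * 8 / 10
    # 1 Gbit/s = 1000 Mbit/s, and 8 bits make one byte.
    mbytes_per_s = uncoded_gbits * 1000 / 8
    return uncoded_gbits, mbytes_per_s


for revision, rate in SATA_LINE_RATE_GBITS.items():
    gbits, mbytes = sata_payload_rate(rate)
    print(f"SATA {revision}: {rate} Gbit/s line rate -> "
          f"{gbits:.1f} Gbit/s uncoded ({mbytes:.0f} MB/s)")
```

The MB/s values here use decimal megabytes (1 MB = 1 000 000 bytes), matching the figures in the list.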

PCI Express
PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe, is a high-speed serial computer expansion bus standard designed to replace the older PCI, PCI-X, and AGP bus standards. PCI uses a shared parallel bus architecture, where the PCI host and all devices share a common set of address/data/control lines. In contrast, PCIe is an interconnect bus based on point-to-point topology, with separate serial links connecting every device to the root complex (host).

The PCIe link between two devices can consist of anywhere from 1 to 32 lanes. The PCIe standard defines slots and connectors for multiple widths: ×1, ×4, ×8, ×16, ×32. A lane is composed of two differential signaling pairs: one pair for receiving data, the other for transmitting it. Conceptually, each lane is used as a full-duplex byte stream, transporting data packets in eight-bit 'byte' format, between endpoints of a link, in both directions simultaneously.

A PCI-X (133 MHz 64-bit) device and a PCIe device at 4 lanes (×4), Gen1 speed, have roughly the same peak transfer rate in a single direction: about 1064 MB/s for PCI-X and 1000 MB/s for PCIe ×4. The PCIe bus has the potential to perform better than the PCI-X bus when multiple devices are transferring data simultaneously, or when communication with the PCIe peripheral is bidirectional.
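
The comparison can be checked with a few lines of arithmetic. This is a rough sketch, assuming Gen1 signaling of 2.5 GT/s per lane with 8b/10b coding and ignoring packet and protocol overhead on both buses; the function names are hypothetical.

```python
def pci_x_peak_mb_s(clock_mhz=133, bus_width_bits=64):
    """Peak rate of a parallel PCI-X bus: one bus-wide transfer per clock."""
    return clock_mhz * bus_width_bits / 8


def pcie_peak_mb_s(lanes, gt_per_s=2.5, payload_bits=8, line_bits=10):
    """Peak one-direction rate of a PCIe link.

    The defaults model Gen1: 2.5 GT/s per lane with 8b/10b coding.
    """
    per_lane_mbit = gt_per_s * 1000 * payload_bits / line_bits
    return lanes * per_lane_mbit / 8


pci_x = pci_x_peak_mb_s()
pcie_x4 = pcie_peak_mb_s(lanes=4)
print(f"PCI-X 133 MHz / 64-bit: {pci_x:.0f} MB/s, shared by all devices")
print(f"PCIe x4 Gen1:           {pcie_x4:.0f} MB/s in each direction")
# Each PCIe lane has separate transmit and receive pairs, so both
# directions can run at full rate at the same time.
print(f"PCIe x4 Gen1, both directions combined: {2 * pcie_x4:.0f} MB/s")
```

The PCI-X figure is a total for the shared bus, while the PCIe figure applies to one link in one direction, which is why PCIe pulls ahead under bidirectional or multi-device traffic.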

PCI Express 1.0a (PCI Express 1.1, 2005)
  • PCIe 1.x uses an 8b/10b encoding scheme that results in a 20 percent ((10-8)/10) overhead on the raw bit rate. It signals at 2.5 GT/s per lane, therefore delivering an effective maximum data rate of 250 000 000 bytes per second (250 MB/s) per lane in each direction.
PCI Express 2.0 (2007)
  • Doubles the transfer rate of PCIe 1.0 to 5 GT/s, raising the per-lane throughput from 250 MB/s to 500 MB/s. A 32-lane PCIe connector (×32) can therefore support an aggregate throughput of up to 16 GB/s.
  • PCIe 2.0 still uses the 8b/10b encoding scheme; accounting for its 20% overhead, the effective maximum transfer rate is 4 Gbit/s per lane.
  • Intel's first PCIe 2.0 capable chipset was the X38. AMD started supporting PCIe 2.0 with its AMD 700 chipset series and nVidia started with the MCP72.
PCI Express 3.0 (Nov 2010)
  • Bit rate of 8 gigatransfers per second (GT/s). Other new features for the PCIe 3.0 specification include a number of optimizations for enhanced signaling and data integrity, including transmitter and receiver equalization, PLL improvements, clock data recovery, and channel enhancements for currently supported topologies.
  • PCIe 3.0 upgrades the encoding scheme to 128b/130b from the previous 8b/10b, reducing the overhead to approximately 1.5% ((130-128)/130). This is achieved by a technique called "scrambling" that applies a known binary polynomial to a data stream in a feedback topology. Because the scrambling polynomial is known, the data can be recovered by running it through a feedback topology using the inverse polynomial.
  • AMD's Radeon HD 7970, launched on January 9, 2012, was the world's first PCIe 3.0 graphics card. The new interface would prove advantageous for general-purpose computing with technologies such as OpenCL, CUDA and C++ AMP.
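
The scrambling idea mentioned for PCIe 3.0 can be illustrated with a toy example. The sketch below is an additive scrambler: a linear feedback shift register (LFSR) generates a pseudo-random bit sequence that is XORed onto the data, and the receiver recovers the data by running the same LFSR from the same seed. The 16-bit polynomial (x^16 + x^14 + x^13 + x^11 + 1), the seed and the function names are all chosen for illustration only; they are not the values the PCIe specification defines.

```python
def lfsr_keystream(seed, nbits, taps=(16, 14, 13, 11), width=16):
    """Yield nbits pseudo-random bits from a Fibonacci LFSR.

    The default taps are an illustrative 16-bit polynomial,
    not the one used by PCIe.
    """
    state = seed & ((1 << width) - 1)
    if state == 0:
        raise ValueError("seed must be non-zero")
    for _ in range(nbits):
        # XOR the tapped register bits to form the feedback bit.
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (width - tap)) & 1
        out = state & 1
        # Shift right and feed the new bit in at the top.
        state = (state >> 1) | (feedback << (width - 1))
        yield out


def scramble(bits, seed=0xACE1):
    """XOR a list of 0/1 bits with the LFSR sequence.

    Applying the function twice with the same seed restores the input,
    so it serves as both scrambler and descrambler.
    """
    return [b ^ k for b, k in zip(bits, lfsr_keystream(seed, len(bits)))]


data = [0] * 32                    # a long run of zeros has no transitions
on_the_wire = scramble(data)       # the scrambled stream does
recovered = scramble(on_the_wire)  # same polynomial and seed undo it
print(on_the_wire)
print(recovered == data)
```

Because the scrambled stream already has enough transitions for clock recovery, only 2 extra bits per 128 are needed for framing, instead of the 2 per 8 that 8b/10b spends.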
NVIDIA uses the high bandwidth data transfer of PCIe for its Scalable Link Interface (SLI) technology, which allows multiple graphics cards of the same chipset and model number to run in tandem, allowing increased performance. AMD has also developed a multi-GPU system based on PCIe called CrossFire. AMD and NVIDIA have released motherboard chipsets that support as many as four PCIe ×16 slots, allowing tri-GPU and quad-GPU card configurations.

External Uses
  • The PCI Express protocol can be used as a data interface to flash memory devices, such as memory cards and solid-state drives. Examples include the XQD card format developed by the CompactFlash Association, SATA Express, and SCSI Express.
  • Many high-performance, enterprise-class solid-state drives are designed as PCI Express RAID controller cards with flash memory chips placed directly on the circuit board; this allows much higher transfer rates (over 1 Gbyte/s) and IOPS (I/O operations per second; over 1 million) compared to Serial ATA or SAS drives.