Teradata RDBMS Components
Architecture of Teradata RDBMS
Teradata is designed using a shared-nothing architecture. Each processing unit processes its own unit of data in parallel. Teradata systems can be either SMP (Symmetric Multi-Processing) or MPP (Massively Parallel Processing). In simple terms, an SMP system is a single-node system, whereas an MPP system has two or more nodes working in parallel.
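The idea that each processing unit works only on its own unit of data can be sketched as follows. This is a minimal illustration, not Teradata code: the names `split_rows`, `local_sum`, and `parallel_sum` are hypothetical, threads stand in for processing units, and the round-robin split is chosen only for simplicity and is not a description of how Teradata assigns rows.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, unit_count):
    """Deal rows out round-robin so each unit owns a disjoint slice."""
    return [rows[i::unit_count] for i in range(unit_count)]


def local_sum(own_rows):
    """Work done by one unit; it touches only the slice it owns."""
    return sum(own_rows)


def parallel_sum(rows, unit_count):
    """Run every unit on its own slice, then combine the partial results."""
    slices = split_rows(rows, unit_count)
    with ThreadPoolExecutor(max_workers=unit_count) as pool:
        partials = list(pool.map(local_sum, slices))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101)), unit_count=4))  # 5050
```

No unit reads another unit's slice; the only shared step is combining the partial results at the end.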
The Teradata architecture contains the following components:
The basic building block of a Teradata system, the node is where the processing occurs for the database. A node is simply a collection of hardware and software components.
The PDE (Parallel Database Extensions) software layer runs on top of the operating system on each node. It was created by NCR to support the parallel environment.
System disks are contained on the node and are used for the following:
- Operating system software
- Teradata software
- Application software
- System dump space
Teradata database tables are stored on disk arrays, not on the system disks.
Vprocs share a free memory pool within a node. A segment of memory is allocated to a vproc for its use, then returned to the memory pool for use by another vproc. The free memory pool is a collection of memory available to the node.
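The allocate-and-return cycle described above can be sketched as a small bookkeeping class. This is only an illustration under stated assumptions: the class `FreeMemoryPool`, the vproc names, and the segment counts are all hypothetical and do not reflect the actual PDE memory manager.

```python
class FreeMemoryPool:
    """Toy model of a node-wide pool of memory segments shared by vprocs."""

    def __init__(self, total_segments):
        self.free = total_segments
        self.in_use = {}  # vproc name -> number of segments currently held

    def allocate(self, vproc, segments):
        """Hand segments to a vproc, failing if the pool cannot cover them."""
        if segments > self.free:
            raise MemoryError(f"only {self.free} segments free")
        self.free -= segments
        self.in_use[vproc] = self.in_use.get(vproc, 0) + segments

    def release(self, vproc):
        """Return everything the vproc holds so another vproc can use it."""
        self.free += self.in_use.pop(vproc, 0)


if __name__ == "__main__":
    pool = FreeMemoryPool(total_segments=8)
    pool.allocate("AMP-0", 5)
    pool.allocate("PE-0", 3)
    pool.release("AMP-0")      # AMP-0's segments go back to the pool
    pool.allocate("AMP-1", 4)  # and can now be reused by another vproc
    print(pool.free)           # 1
```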
A virtual processor, or vproc, is a group of one or more software processes running under the operating system’s multi-tasking environment:
- On the UNIX operating system, a vproc is a collection of software processes.
- On the Windows operating systems, a vproc is a single software process.
The two types of Teradata vprocs are:
- AMP (Access Module Processor)
- PE (Parsing Engine)
When vprocs communicate, they use BYNET hardware (on MPP systems), BYNET software, and PDE. The BYNET hardware and software carry vproc messages to and from a particular node. Within a node, the BYNET and PDE software deliver messages to and from the participating vprocs.
PEs (Parsing Engines) are vprocs that receive SQL requests from the client and break the requests into steps. The PEs send the steps to the AMPs and subsequently return the answer to the client.
AMPs (Access Module Processors) are virtual processors (vprocs) that receive steps from PEs (Parsing Engines) and perform database functions to retrieve or update data. Each AMP is associated with one virtual disk (vdisk), where the data is stored. An AMP manages only its own vdisk, not the vdisk of any other AMP.
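The division of labor between PEs and AMPs can be sketched as below. This is a simplified model rather than Teradata's implementation: the classes `AMP` and `ParsingEngine`, the dictionary-shaped request, and the single "retrieve" step type are all assumptions made for illustration, and no real SQL parsing takes place.

```python
class AMP:
    """Toy AMP: owns one vdisk and works only on the rows stored there."""

    def __init__(self, amp_id, vdisk_rows):
        self.amp_id = amp_id
        self.vdisk = list(vdisk_rows)

    def run_step(self, step):
        kind, predicate = step
        if kind == "retrieve":
            return [row for row in self.vdisk if predicate(row)]
        raise ValueError(f"unsupported step: {kind}")


class ParsingEngine:
    """Toy PE: turns a request into steps, dispatches them, returns the answer."""

    def __init__(self, amps):
        self.amps = amps

    def parse(self, request):
        # Hypothetical request format: {"where": <row predicate>}.
        return [("retrieve", request["where"])]

    def execute(self, request):
        answer = []
        for step in self.parse(request):
            for amp in self.amps:  # each AMP sees only its own vdisk
                answer.extend(amp.run_step(step))
        return sorted(answer)


if __name__ == "__main__":
    amps = [AMP(0, [1, 4, 7]), AMP(1, [2, 5, 8]), AMP(2, [3, 6, 9])]
    pe = ParsingEngine(amps)
    print(pe.execute({"where": lambda row: row > 5}))  # [6, 7, 8, 9]
```

The loop over AMPs is sequential here for readability; in the system described above, the AMPs work in parallel.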
Vdisk (Virtual Disk)
A vdisk is the logical disk space that is managed by an AMP. Depending on the configuration, a vdisk may not be contained on the node; however, it is managed by an AMP, which is always a part of the node.
The vdisk is made up of 1 to 64 pdisks (user slices in UNIX or partitions in Windows NT, whose size and configuration vary based on RAID level). The pdisks logically combine to form the AMP’s vdisk. Although an AMP can manage up to 64 pdisks, it controls only one vdisk.
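The relationship between pdisks and a vdisk can be modeled in a few lines. This is a sketch under assumptions: the class `Vdisk`, the field `pdisk_sizes_gb`, and the sizes used are hypothetical; only the limit of 1 to 64 pdisks per vdisk comes from the text above.

```python
from dataclasses import dataclass
from typing import List

MAX_PDISKS = 64  # upper bound on pdisks per vdisk, as stated in the text


@dataclass
class Vdisk:
    """One AMP's logical disk, built from 1 to 64 pdisks."""

    pdisk_sizes_gb: List[int]  # hypothetical sizes, one entry per pdisk

    def __post_init__(self):
        count = len(self.pdisk_sizes_gb)
        if not 1 <= count <= MAX_PDISKS:
            raise ValueError(f"a vdisk needs 1 to {MAX_PDISKS} pdisks, got {count}")

    @property
    def capacity_gb(self):
        """The pdisks combine logically, so capacity is their total size."""
        return sum(self.pdisk_sizes_gb)


if __name__ == "__main__":
    vdisk = Vdisk([100, 100, 50])
    print(vdisk.capacity_gb)  # 250
```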
The BYNET (banyan network) is a combination of hardware and software that provides high-performance networking between the nodes of a Teradata system. A dual-redundant, bi-directional, multi-staged network, the BYNET enables the nodes to communicate in a high-speed, loosely coupled fashion. It is based on banyan topology, a mathematically defined structure that has branches reminiscent of a banyan tree.
The BYNET is a high-speed interconnect (network) that enables multiple nodes in the system to communicate.
The BYNET hardware and software handle the communication between the vprocs.
The nodes of an MPP system are connected with the BYNET hardware, consisting of BYNET boards and cables.
The BYNET software is installed on every node. This BYNET driver is an interface between the PDE software and the BYNET hardware.
SMP systems do not contain BYNET hardware. The PDE and BYNET software emulate BYNET activity in a single-node environment. The SMP implementation is sometimes called “boardless BYNET.”
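Which layers take part in delivering a vproc message can be summarized in a short sketch. The function names `layers_used` and `uses_bynet_hardware` are hypothetical, and the layer ordering is one reading of the description above (the BYNET driver sits between PDE and the BYNET hardware), not a statement of the actual message protocol.

```python
def layers_used(src_node, dst_node, node_count):
    """Return the layers a vproc message passes through, in order."""
    if not (0 <= src_node < node_count and 0 <= dst_node < node_count):
        raise ValueError("unknown node")
    if src_node == dst_node:
        # Same node: delivered in software only ("boardless" on SMP).
        return ["PDE", "BYNET software"]
    # Different nodes (MPP only): the BYNET boards and cables carry the message.
    return ["PDE", "BYNET software", "BYNET hardware", "BYNET software", "PDE"]


def uses_bynet_hardware(src_node, dst_node, node_count):
    return "BYNET hardware" in layers_used(src_node, dst_node, node_count)


if __name__ == "__main__":
    print(uses_bynet_hardware(0, 0, node_count=1))  # False (SMP, single node)
    print(uses_bynet_hardware(0, 2, node_count=4))  # True  (MPP, cross-node)
    print(uses_bynet_hardware(2, 2, node_count=4))  # False (MPP, same node)
```

With `node_count=1` the cross-node branch can never be reached, which mirrors the point that an SMP system needs no BYNET hardware.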