Partitioned primary index or PPI

Partitioned primary index or PPI is used for physically splitting the table into a series of subtables. With the proper use of Partition primary Index we can save queries from time consuming full table scan. Instead of scanning full table, only one particular partition is accessed.

Follow the example below to get the insight of PPI –

We have an order table (ORDER_TABLE) having two columns – Order_Date and Order_Number, in which PI is defined on Order_Date. The primary Index (Order_Date) was hashed and rows were distributed to the proper AMP based on Row Hash Value then sorted by the Row ID. The distribution of rows will take place as explained in the image below –PPI_1

Now when we execute Query –

Select * from Order_Table where Order_Date between 1-1-2003 and 1-31-2003;

This query will result in a full table scan despite of Order_Date being PI.


Now we have defined PPI on the column Order_date. The primary Index (Order_Date) was hashed and rows were distributed to the proper AMP based on Row Hash Value then sorted by the Order_Date and not by Row ID. The distribution of rows will take place as explained in the image below –


Now when we execute Query –

Select * from Order_Table where Order_Date between 1-1-2003 and 1-31-2003;

This query will not result in a full table scan because all the January orders are kept together in their partition.


Partitions are usually defined based on Range or Case as follows.

Partition by CASE









ORDER_AMOUNT < 10000 ,

ORDER_AMOUNT < 20000 ,



Partition by RANGE









PARTITION BY range_n (

ORDER_DATE BETWEEN date ‘2012-01-01’ and ‘2012-12-31’ Each interval ‘1’ Month,

NO range OR UNKNOWN ) ;

If we use NO RANGE or NO CASE – then all values not in this range will be in a single partition.

If we specify UNKNOWN, then all null values will be placed in this partition

Date, time, user, session – Built-in functions

What are Built-in functions?

When you need the system to return information about the system – user, date, time, session etc, built-in functions are used. Built-in functions are sometimes referred to as special registers. Mostly used with the SELECT statement.


If you are wondering about your Teradata Account information just use this function to see your current account string.


Returns the current system date and it is same as the function DATE.


This function returns the current system time and current session ‘Time Zone’ displacement. This function is similar to the TIME function but with extra feature of Time Zone displacement.


Returns the current system timestamp (including year, month and day) and current session Time Zone displacement.


If you are wondering how to find your current database name, use this function which returns the name of the default database for the current user.

• DATE: –

As mentioned earlier, it does return the current date.


Returns a number for the session the current user is in. Two sessions does not have the same number. If you want to identify your session, this is the function to use.

• TIME: –

As mentioned, this function provides the current time based on a 24-hour day; mean to say for 4:00 pm, you would see 16:00:00.

• USER: –

If you have forgotten your username after you have logged in, this command would come to your rescue. This one gives the user name of the current user.

Teradata Architecture detailed

For a simpler introduction check out Teradata Architecture explained
The Teradata database currently runs normally on NCR Corporation’s WorldMark Systems in the UNIX MP-RAS environment. Some of these systems consist of a single processing node (computer) while others are several hundred nodes working together in a single system. The NCR nodes are based entirely on industry standard CPU processor chips, standard internal and external bus architectures like PCI and SCSI, and standard memory modules with 4-way interleaving for speed.
At the same time, Teradata can run on any hardware server in the single node environment when the system runs Microsoft NT and Windows 2000. This single node may be any computer from a large server to a laptop.
Whether the system consists of a single node or is a massively parallel system with hundreds of nodes, the Teradata RDBMS uses the exact same components executing on all the nodes in parallel. The only difference between small and large systems is the number of processing components.
When these components exist on different nodes, it is essential that the components communicate with each other at high speed. To facilitate the communications, the multi-node systems use the BYNET interconnect. It is a high speed, multi-path, dual redundant communications channel. Another amazing capability of the BYNET is that the bandwidth increases with each consecutive node added into the system. There is more detail on the BYNET later in this chapter.

Teradata Components

As previously mentioned, Teradata is the superior product today because of its parallel operations based on its architectural design. It is the parallel processing by the major components that provide the power to move mountains of data. Teradata works more like the early Egyptians who built the pyramids without heavy equipment using parallel, coordinated human efforts. It uses smaller nodes running several processing components all working together on the same user request. Therefore, a monumental task is completed in record time.
Teradata operates with three major components to achieve the parallel operations. These components are called: Parsing Engine Processors, Access Module Processors and the Message Passing Layer. The role of each component is discussed in the next sections to provide a better understanding of Teradata. Once we understand how Teradata works, we will pursue the SQL that allows storage and access of the data.

Parsing Engine Processor (PEP or PE)

The Parsing Engine Processor (PEP) or Parsing Engine (PE), for short, is one of the two primary types of processing tasks used by Teradata. It provides the entry point into the database for users on mainframe and networked computer systems. It is the primary director task within Teradata.
As users “logon” to the database they establish a Teradata session. Each PE can manage 120 concurrent user sessions. Within each of these sessions users submit SQL as a request for the database server to take an action on their behalf. The PE will then parse the SQL statement to establish which database objects are involved. For now, let’s assume that the database object is a table. A table is a two-dimensional array that consists of rows and columns. A row represents an entity stored in a table and it is defined using columns. An example of a row might be the sale of an item and its columns include the UPC, a description and the quantity sold.
Any action a user requests must also go through a security check to validate their privileges as defined by the database administrator. Once their authorization at the object level is verified, the PE will verify that the columns requested actually exist within the objects referenced.
Next, the PE optimizes the SQL to create an execution plan that is as efficient as possible based on the amount of data in each table, the indices defined, the type of indices, the selectivity level of the indices, and the number of processing steps needed to retrieve the data. The PE is responsible for passing the optimized execution plan to other components as the best way to gather the data.
An execution plan might use the primary index column assigned to the table, a secondary index or a full table scan. The use of an index is preferable and will be discussed later in this chapter. For now, it is sufficient to say that a full table scan means that all rows in the table must be read and compared to locate the requested data.
Although a full table scan sounds really bad, within the architecture of Teradata, it is not necessarily a bad thing because the data is divided up and distributed to multiple, parallel components throughout the database. We will look next at the AMPs that perform the parallel disk access using their file system logic. The AMPs manage all data storage on disks. The PE has no disks.
Activities of a PE:
·       Convert incoming requests from EBCDIC to ASCII (if from an IBM mainframe)
·       Parse the SQL to determine type and validity
·       Validate user privileges
·       Optimize the access path(s) to retrieve the rows
·       Build an execution plan with necessary steps for row access
·       Send the plan steps to Access Module Processors (AMP) involved

Access Module Processor (AMP)

The next major component of Teradata’s parallel architecture is called an Access Module Processor (AMP). It stores and retrieves the distributed data in parallel. Ideally, the data rows of each table are distributed evenly across all the AMPs. The AMPs read and write data and are the workhorses of the database. Their job is to receive the optimized plan steps, built by the PE after it completes the optimization, and execute them. The AMPs are designed to work in parallel to complete the request in the shortest possible time.
Optimally, every AMP should contain a subset of all the rows loaded into every table. By dividing up the data, it automatically divides up the work of retrieving the data. Remember, all work comes as a result of a users’ SQL request. If the SQL asks for a specific row, that row exists in its entirety (all columns) on a single AMP and other rows exist on the other AMPs.
If the user request asks for all of the rows in a table, every AMP should participate along with all the other AMPs to complete the retrieval of all rows. This type of processing is called an all AMP operation and an all rows scan. However, each AMP is only responsible for its rows, not the rows that belong to a different AMP. As far as the AMPs are concerned, it owns all of the rows. Within Teradata, the AMP environment is ashared nothing” configuration. The AMPs cannot access each other’s data rows, and there is no need for them to do so.
Once the rows have been selected, the last step is to return them to the client program that initiated the SQL request. Since the rows are scattered across multiple AMPs, they must be consolidated before reaching the client. This consolidation process is accomplished as a part of the transmission to the client so that a final comprehensive sort of all the rows is never performed. Instead, all AMPs sort only their rows (at the same time – in parallel) and the Message Passing Layer is used to merge the rows as they are transmitted from all the AMPs.
Therefore, when a client wishes to sequence the rows of an answer set, this technique causes the sort of all the rows to be done in parallel. Each AMP sorts only its subset of the rows at the same time all the other AMPs sort their rows. Once all of the individual sorts are complete, the BYNET merges the sorted rows. Pretty brilliant!
Activities of the AMP:
·       Store and retrieve data rows using the file system
·       Aggregate data
·       Join processing between multiple tables
·       Convert ASCII returned data to EBCDIC (IBM mainframes only)
·       Sort and format output data

Message Passing Layer (BYNET)

The Message Passing Layer varies depending on the specific hardware on which the Teradata database is executing. In the latter part of the 20th century, most Teradata database systems executed under the UNIX operating system. However, in 1998, Teradata was released on Microsoft’s NT operating system. Today it also executes under Windows 2000. The initial release of Teradata, on the Microsoft systems, is for a single node.
When using the UNIX operating system, Teradata supports up to 512 nodes. This massively parallel system establishes the basis for storing and retrieving data from the largest commercial databases in the world, Teradata. Today, the largest system in the world consists of 176 nodes. There is much room for growth as the databases begin to exceed 40 or 50 terabytes.
For the NCR UNIX systems, the Message Passing Layer is called the BYNET. The amazing thing about the BYNET is its capacity. Instead of a fixed bandwidth that is shared among multiple nodes, the bandwidth of the BYNET increases as the number of nodes increase. This feat is accomplished as a result of using virtual circuits instead of using a single fixed cable or a twisted pair configuration.
To understand the workings of the BYNET, think of a telephone switch used by local and long distance carriers. As more and more people place phone calls, no one needs to speak slower. As one switch becomes saturated, another switch is automatically used. When your phone call is routed through a different switch, you do not need to speak slower. If a natural or other type of disaster occurs and a switch is destroyed, all subsequent calls are routed through other switches. The BYNET is designed to work like a telephone switching network.
An additional aspect of the BYNET is that it is really two connection paths, like having two phone lines for a business. The redundancy allows for two different aspects of its performance. The first aspect is speed. Each path of the BYNET provides bandwidth of 10 Megabytes (MB) per second with Version 1 and 60 MB per second with Version 2. Therefore the aggregate speed of the two connections is 20MB/second or 120MB/second. However, as mentioned earlier, the bandwidth grows linearly as more nodes are added. Using Version 1 any two nodes communicate at 40MB/second (10MB/second * 2 BYNETs * 2 nodes). Therefore, 10 nodes can utilize 200MB/second and 100 nodes have 2000MB/second available between them. When using the version 2 BYNET, the same 100 nodes communicate at 12,000MB/second (60MB/second * 2 BYNETs * 100 nodes).
The second and equally important aspect of the BYNET uses the two connections for availability. Regardless of the speed associated with each BYNET connection, if one of the connections should fail, the second is completely independent and can continue to function at its individual speed without the other connection. Therefore, communications continue to pass between all nodes.
Although the BYNET is performing at half the capacity during an outage, it is still operational and SQL is able to complete without failing. In reality, when the BYNET is performing at only 10MB/second per node, it is still a lot faster than many normal networks that typically transfer messages at 10MB per second.
All messages going across the BYNET offer guaranteed delivery. So, any messages not successfully delivered because of a failure on one connection automatically route across the other connection. Since half of the BYNET is not working, the bandwidth reduces by half. However, when the failed connection is returned to service, its topology is automatically configured back into service and it begins transferring messages along with the other connection. Once this occurs, the capacity returns to normal.

Teradata RDBMS Components and Architecture

Teradata RDBMS Components

Architecture of Teradata RDBMS

Teradata is designed using

Shared-Nothing architecture

. Each processing unit processes its own unit of data in parallel. Teradata systems can be either SMP (Symmetric Multi Processing) or MPP (Massively Parallel Processing). In simple words a SMP system is a single node system where as a MPP system has two or more nodes working in parallel.

Teradata architecture contains following components :
1) Node
3) PE
4) AMP

Architecture Components



The basic building block for a Teradata system, the node is where the processing occurs for the database. A node is simply collection of many hardware and software components.


The PDE (Parallel Database Extensions) software layer runs the operating system on each node. It was created by NCR to support the parallel environment.

System Disks

System disks are contained on the node used for the following:


  • Operating system software
  • Teradata software
  • Application software
  • System dump space

Teradata database tables are stored on disk arrays, not on the system disks.


Vprocs share a free memory pool within a node. A segment of memory is allocated to a vproc for its use, then returned to the memory pool for use by another vproc. The free memory pool is a collection of memory available to the node.


A virtual processor or a vproc is a group of one or more software processes running under the operating system’s multi-tasking environment:


  • On the UNIX operating system, a vproc is a collection of software processes.
  • On the Windows operating systems, a vproc is a single software process.

The two types of Teradata vprocs are:

  • AMP (Access Module Processor)
  • PE (Parsing Engine)

When vprocs communicate, they use BYNET hardware (on MPP systems), BYNET software, and PDE. The BYNET hardware and software carry vproc messages to and from a particular node. Within a node, the BYNET and PDE software deliver messages to and from the participating vprocs.


PEs (Parsing Engines) are vprocs that receive SQL requests from the client and break the requests into steps. The PEs send the steps to the AMPs and subsequently return the answer to the client.


AMPs (Access Module Processors) are virtual processors (vprocs) that receive steps from PEs (Parsing Engines) and perform database functions to retrieve or update data. Each AMP is associated with one virtual disk (vdisk), where the data is stored. An AMP manages only its own vdisk, not the vdisk of any other AMP.

Vdisk (Virtual Disk)

A vdisk is the logical disk space that is managed by an AMP. Depending on the configuration, a vdisk may not be contained on the node; however, it is managed by an AMP, which is always a part of the node.

The vdisk is made up of 1 to 64 pdisks (user slices in UNIX or partitions in Windows NT, whose size and configuration vary based on RAID level). The pdisks logically combine to comprise the AMP’s vdisk. Although an AMP can manage up to 64 pdisks, it controls only one vdisk. An AMP manages only its own vdisk, not the vdisk of any other AMP.


The BYNET (banyan network) is a combination of hardware and software that provides high performance networking between the nodes of a Teradata system. A dual-redundant, bi-directional, multi-staged network, the BYNET enables the nodes to communicate in a high speed, loosely-coupled fashion. It is based on banyan topology, a mathematically defined structure that has branches reminiscent of a banyan tree.


The BYNET is a high-speed interconnect (network) that enables multiple nodes in the system to communicate.
The BYNET hardware and software handle the communication between the vprocs.

  • Hardware:

    The nodes of an MPP system are connected with the BYNET hardware, consisting of BYNET boards and cables.

  • Software:

    The BYNET software is installed on every node. This BYNET driver is an interface between the PDE software and the BYNET hardware.


SMP systems do not contain BYNET hardware. The PDE and BYNET software emulates BYNET activity in a single-node environment. The SMP implementation is sometimes called “boardless BYNET.”

Teradata Primary/Secondary Index

Primary and Secondary Index  for performance

Don’t get confused with Primary Keys and Foreign Keys. Those are to relate tables with each other


Primary Index (PI) Secondary Index (SI)
The Primary Index can consist of up to sixteen different columns. These columns, when considered together, provide a comprehensive technique to derive a Unique Primary Index (UPI, pronounced as “you-pea”). Although uniqueness is good in most cases, Teradata does not require that a UPIbe used. It also allows for a Non-Unique Primary Index (NUPI, pronounced as new-pea). A Secondary Index (SI) is used in Teradata as a way to directly access rows in the data, sometimes called the base table, without requiring the use of PI values. Unlike the PI, an SI does not effect the distribution of the data rows. Instead, it is an alternate read path and allows for a method to locate the PI value using the SI. Once the PI is obtained, the row can be directly accessed using the PI. Like the PI, an SI can consist of up to 16 columns.
Every table must have a PI and it is established when the table is created. If the DDL statement does not specify a PI, but it specifies a PRIMARY KEY (PK), the named column(s) are used as the UPI. Although Teradata does not use primary keys, the DDL may be ported from another vendor’s database system. An SI can also be unique (USI, pronounced as you-sea) or non-unique (NUSI, pronounced as new-sea). If the SQL does not use a NUSI, you should consider dropping it, due to the fact that the subtable takes up PERM space with no benefit to the users. The Teradata EXPLAIN is the easiest way to determine if your SQL is using a NUSI. Furthermore, the optimizer will never use a NUSI without STATISTICS.


To store the data, the value(s) in the PI are hashed via a calculation to determine which AMP will own the data. The same data values always hash the same row hash and therefore are always associated with the same AMP.

Use these links to learn about partitioned primary Index, advantages of partitioned primary index and secondary index

Processing Types

Set processing VS Row processing in Teradata


Row-by-Row Processing Set Processing

Row-by-row processing is where there are many rows to process, one row is fetched at a time and all calculations are done on it, then it is updated or inserted. Then the next row is fetched and processed. This is row-by-row processing and it makes for a slow program.

A benefit of row processing is that there is less lock contention.

A lot of data processing is set processing, which is what relational databases do best. Instead of processing row-by-row sequentially, you can process relational data set-by-set, without a cursor. For example, to sum all payment rows with 100 or less balances, a single SQL statement completely processes all rows that meet the condition as a set. With sufficient rows to process, this can be 10 to 30 or more times faster than row-at-a-time processing.

An update with all AMPs involved

Single session processing which takes advantage of parallel processing

Efficient updates of large amounts of data

Teradata Administration is easier than other databases

Due to reduced need of Teradata administration, Teradata provide huge cost advantages, especially when it comes to staffing DB admins. If data grows rapidly, the system can expand easily to accommodate it

The cost saves are due to these tasks not needed to be done:

Reorganize data or index space.

Pre-allocate table/index space and format partitioning. While it is possible to have partitioned indexes in Teradata, they are not required.

Pre-prepare data for loading (convert, sort etc.,)

Unload/Reload data spaces due to expansion. With Teradata, the data can be redistributed on the larger configuration with no offloading and reloading required.

Write or run programs to split input source files into partitions for loading.




Teradata began to associate itself with the term, “Big Data” in 2010. CTO, Stephen Brobst, attributes the rise of big data to “new media sources, such as social media.”

The increase in semi-structured and unstructured data gathered from online interactions prompted Teradata to form the “Petabyte club” in 2011 for its heaviest big data users.

The rise of big data resulted in many traditional data warehousing companies updating their products and technology.

For Teradata, big data prompted the acquisition of Aster Data Systems in 2011 for the company’s MapReduce capabilities and ability to store and analyze semi-structured data.

Public interest in big data resulted in a 13% increase in Teradata’s global sales.



Uses of Teradata

Teradata empowers organizations find business improvement opportunities (BIO) and grow through data mining

Teradata integrates customer’s discrete data from multiple departments and helps in analytics and new val

Teradata data mining

Teradata data mining

ue generation

Teradata keeps accurate data in a single structure in a common format that enables organizations have a 360 degree analysis of a customer to add new value to life.

Teradata uses parallelism to process terabytes of data and allows users get business-critical reports & statistics in short time. Thus it reduces cost and increases productivity & efficiency