Social Icons

NoPI Table

TeradataWiki-Teradata No Primary Index
A NoPI Table is simply a table without a primary index. It is a Teradata 13.00 feature. As rows are inserted into a NoPI table, rows are always appended at the end of the table and never inserted in the middle of a hash sequence. Organizing/sorting rows based on row hash is therefore avoided, which speeds up the load operation. NoPI tables are typically used as staging tables for initial load by FastLoad or TPump Array Inserts, then an Insert/Select is used to apply the updates to a populated table.

Prior to Teradata Database 13.00, Teradata tables required a primary index. The primary index was used to hash and distribute rows to the AMPs according to hash ownership. The objective was to divide data as evenly as possible among the AMPs to make use of Teradata’s parallel processing. Each row stored in a table has a RowID which includes the row hash that is generated by hashing the primary index value. For example, the optimizer can choose an efficient single-AMP execution plan for SQL requests that specify values for the columns of the primary index.

Starting with Teradata Database 13.00, a table can be defined without a primary index. This feature is referred to as the NoPI Table feature. NoPI stands for No Primary Index. NO PRIMARY INDEX is specified in the CREATE TABLE syntax.

Without a PI, the hash value as well as AMP ownership of a row is arbitrary. Within the AMP, there are no row-ordering constraints and therefore rows can be appended to the end of the table as if it were a spool table. Each row in a NoPI table has a hash bucket value that is internally generated. A NoPI table is internally treated as a hashed table; it is just that typically all the rows on one AMP will have the same hash bucket value.

Benefits:
•A NoPI table will reduce skew in intermediate ETL tables which have no natural PI.
•Loads (FastLoad and TPump array insert) into a NoPI staging table are faster.

2 comments:

  1. Should we collect stats on a No PI table? Any Idea?

    ReplyDelete
  2. Yes you need to collect stats on heavy hitting columns.

    ReplyDelete