Social Icons

A to Z Performance Tuning

Teradata Performance Tuning
A) Explain the EXPLAIN: Check for EXPLAIN plan to see how exactly Teradata will be executing the query. Try to understand basic keywords in Explain Plan like confidence level, join strategy used, re-distribution happening or not.

B) Collect STATS: The stats of the columns used join conditions should updated. Secondary Indexes without proper STATS can be of little or no help. Check for STATS status of the table.

C) Use Proper PI: If the Primary index is not properly defined in any one or all of the tables in the query. Check if the PI of target table loaded is unique.

D) Use PPI: If there is Partition Primary Index created on a table, try to use it. If you are not using it in filter condition, it will degrade the performance.

E) No FUNCTIONS in Conditions: Try to avoid using function in join conditions. Ex Applying COALESCE or TRIM etc causes high CPU consumption.

F) Use PPI: If Partition Primary Index is defined in tables try to use it. If you are not using it in filter condition, it will degrade the performance.

G) Same column DATA TYPES:
Define same data type for the joining columns.

H) Avoid IN clause in filter conditions: When there can be huge number of values in where conditions, better option can be to insert such values in a volatile table and use volatile table with INNER JOIN in the main query.

I) Use Same PI in Source & Target: PI columns also can help in saving the data into disk .If the Source and Target have the same PI, data dump can happen very efficiently form source to target.

J) Collect STATS on VOLATILE table: Collect stats on volatile tables where required can save AMPCPU. Remove stats if already present where it is not getting used.
If the volatile table contains UNIQUE PI, then go for sample stats rather than full stats.

K) DROPPING volatile tables explicitly: Once volatile tables is no more required you can drop those. Donít wait for complete procedure to be over. This will free some spool space immediately and could prove to be very helpful in avoiding No More Spool Space error.

L) NO LOG for volatile tables: Create volatile tables with NO LOG option.

M) Check DBQL Stats: Keep your performance stats accessible. Target the most AMPCPU consuming query first.

N) UPDATE clause: Do not write UPDATE clause with just SET condition and no WHERE condition. Even if the Target/Source has just one row, add WHERE clause for PI column.

O) DELETE & INSERT: Sometimes replacing UPDATE with DELETE & INSERT can save good number of AMPCPU. Check if this holds good for your query.

P) Query SPLITS: Split queries into several smaller queries logically and use volatile tables with proper PI.

Q) Try MSR: If same target table is loaded multiple times, try MSR for several sections. This will speed the final MERGE step into target table and you may see good CPU gain.

R) Try OLAP Functions:
Check if replacing co-related sub query with OLAP function may result in AMPCPU saving.

S) Avoid DUPLICATE data: If the join columns in the tables involved in the query have duplicates. Use Distinct or Group by, load into a volatile table, collect stats and use the volatile table.

T) Use Proper JOINS: If joins used, donít use right outer, left or full joins where inner joins is sufficient.

U) User proper ALIAS: Check the aliases in the joins. Small mistake could lead to a product join.

V) Avoid CAST: Avoid unnecessary casting for DATE columns. Once defined as DATE, you can compare date columns against each other even when they are in different format. Internally, DATE is stored as INTEGER. CAST is required mainly when you have to compare VARCHAR value as DATE.

W) Avoid UDF:
Most of the functions are available in Teradata for data manipulations. So avoid User Defined Functions

X) Avoid FULL TABLE SCAN: Try to avoid FTS scenarios like SI should be defined on the columns which are used as part of joins or Alternate access path. Collect stats on SI columns else there are chances where optimizer might go for FTS even when SI is defined on that particular column

Y) Avoid using IN/NOT IN: For large list of values, avoid using IN /NOT IN in SQLs. Write large list values to a temporary table and use this table in the query

Z) Use CONSTANTS:
Use constants to specify index column contents whenever possible, instead of specifying the constant once, and joining the tables. This may provide a small savings on performance.

24 comments:

  1. Its always pleasure going through this website.

    Can you please elaborate more on MSR.

    Also can you confirm if OLAP functions are recommended for better performance? Because, I have come through scenarios where ROW NUMBER / RANK turned out more costlier in terms of time and resource consumption.

    ReplyDelete
    Replies
    1. Thanks for that. If data has many duplicate values ROW NUMBER / RANK will help. For more Queries, Please post via Contact US tab for better discussion and will provide you solutions.

      Delete
  2. thank's a lot.this info helps me to brush my performance techniques.

    ReplyDelete
  3. thanks a lot admin

    ReplyDelete
  4. Thanks a lot Admin:) It helped me a lot:)

    ReplyDelete
  5. Many thanks for your efforts

    ReplyDelete
  6. Thanks.very well explained..

    ReplyDelete
  7. Thanks a Lot Admin!! This article is good and did help me.

    ReplyDelete
  8. Thanks for a nice article.

    ReplyDelete
  9. Thanks a lot :) It helped me.

    ReplyDelete
  10. thanks good explaination

    ReplyDelete
  11. Thanks good article to quickly revise

    ReplyDelete
  12. Good one..thank you :)

    ReplyDelete
  13. thanks admin good information...

    ReplyDelete
  14. Thank you very much, very helpful web site!

    ReplyDelete
  15. Very well documented.. but can anyone explain how small mistake in alias lead to product join??

    ReplyDelete
  16. good one...thank for your effort!

    ReplyDelete
  17. Thanks for good tips. Keep sharing!

    ReplyDelete