The following are generally accepted “Best Practices” for Informatica PowerCenter ETL development and if implemented, can significantly improve the overall performance.
Category | Technique | Benefits | |
Source Extracts | Loading data from Fixed-width files take less time than delimited, since delimited files require extra parsing. Incase of Fixed width files, Integration service know the Start and End position of each columns upfront and thus reduces the processing time. | Performance Improvement | |
Using flat files located on the server machine loads faster than a database located on the server machine. | Performance Improvement | ||
Mapping Designer | There should be a place holder transformation (Expression) immediately after the Source and one before the target. Data type and Data width changes are bound to happen during development phase and these place holder transformations are used to preserve the port link between transformations. | Best Practices | |
Connect only the ports that are required in targets to subsequent transformations. Also, active transformations that reduce the number of records should be used as early in the mapping. | Code Optimization | ||
If a join must be used in the Mapping, select appropriate driving/master table while using joins. The table with the lesser number of rows should be the driving/master table. | Performance Improvement | ||
Transformations | If there are multiple Lookup condition, make the condition with the “=” sign first in order to optimize the lookup performance. Also, indexes on the database table should include every column used in the lookup condition. | Code Optimization | |
Persistent caches should be used if the lookup data is not expected to change often. This cache files are saved and can be reused for subsequent runs, eliminating querying the database. | Performance Improvement | ||
Integration Service processes numeric operations faster than string operations. For example, if a lookup is done on a large amount of data on two columns, EMPLOYEE_NAME and EMPLOYEE_ID, configuring the lookup around EMPLOYEE_ID improves performance. | Code Optimization | ||
Replace Complex filter expression with a flag (Y/N). Complex logic should be moved to the expression transformation and the result should be stored in a port. Filter expression should take less time to evaluate this port rather than executing the entire logic in Filter expression. | Best Practices | ||
Power Center Server automatically makes conversions between compatible data types which slowdown the performance considerably. For example, if a mapping moves data from an Integer port to a Decimal port, then back to an Integer port, the conversion may be unnecessary. | Performance Improvement | ||
Assigning default values to a port; Transformation errors written to session log will always slow down the session performance. Try removing default values and eliminate transformation errors. | Performance Improvement | ||
Complex joins in Source Qualifiers should be replaced with Database views. There won’t be any performance gains, but it improves the readability a lot. Also, any new conditions can be evaluated easily by just changing the Database view “WHERE” clause. | Best Practices |
0 comments:
Post a Comment