Monday, May 11, 2009

Informatica Upgrade Challenge –Default SQL Join for a Source Qualifier in 7x vs. 8x


Default SQL Query Generation for a Source Qualifier:

When relational sources are joined in one Source Qualifier transformation, the PowerCenter Server joins the tables based on the related keys in each table. This default join will be an equijoin like below
Source1.column_name = Source2.column_name
For Default joins to work, the columns in the default join must have:
  • A primary key-foreign key relationship
  • Matching data types
In current scenario, Most of the Datawarehouse are designed such a way that the primary key – foreign key relationship are designed in the logic instead of physical tables. In scenarios, where the fact tables are joined with dimension tables, the developer writes the join condition specifically in user defined join property present in source qualifier. This can be also done by default joins by creating relationships between the tables in Informatica instead of creating physically on the tables.
Creating relationships between the tables in Informatica are simple, just by dragging and dropping the column from one source definition to the other in Source Analyzer.

PowerCenter Server and SQL Query Generation

When a session is executed, Powercenter Server has two options
  1. Use the SQL Query typed by the developer if the ‘SQL Query’ property text window has ‘some text’ which is not blank
  2. If the ‘SQL Query’ property is blank then the PowerCenter Server generates a query for each Source Qualifier transformation when it runs the session.
  3. The SQL Query generation process for option 2 is bit different in PowerCenter 7x and 8x.
The Default query from Powercenter 7x is built in the below order
  1. SELECT keyword
  2. Field/Port Names which are linked to the next transformation from Source Qualifier
  3. FROM Keyword
  4. List of table names from the source definitions connected to the Source Qualifier separated by Comma
  5. WHERE Keyword
  6. [Value Present in the “User Defined Join” property ]
  7. [AND Keyword]  combined with Default Join Condition formed by Powercenter based on the relationship (If the User Defined Join is not present)
  8. [AND Keyword]  combined with Value present in the “Source Filter” property
  9. [ORDER BY keyword By Default, It selects the first field which is being selected after the SELECT clause.]
Where as in the Powercenter 8x, the default query is built in the below order
  1. SELECT keyword
  2. Field/Port Names which are linked to the next transformation from Source Qualifier
  3. FROM Keyword
  4. List of table names from the source definitions connected to the Source Qualifier separated by Comma
  5. WHERE Keyword
  6. [Value Present in the “User Defined Join” property ]
  7. [AND Keyword]  combined with Value present in the “Source Filter” property
  8. [AND Keyword]  combined with Default Join Condition formed by Powercenter based on the relationship (If the User Defined Join is not present)
  9. [ORDER BY keyword By Default, It selects the first field which is being selected after the SELECT clause.]
The Default join condition in 8x is appended next to the Source Filter where as in 7x the default join is appended before the source filter.

I came across an issue in a recent upgrade project because of this difference in behavior. The mapping that ran properly in 7x which extracted the required data from the source, actually ran into problem 8x. The upgraded mapping in 8x created a Cartesian SQL join. When analyzed found that the source filter had the last line commented with ‘—‘. This made the default join condition to also get commented in 8x which resulted in Cartesian product of the source tables.

So the key is to determine how many of the Informatica mappings/sessions have Source Filter property set with a comment ‘—‘, this could help identify this issue much earlier in the upgrade.

Thanks for reading, share any other upgrade challenge that you have faced.

To know more about Informatica Upgrade Challenge

0 comments:

Post a Comment