Thursday, March 19, 2009

Informatica PowerCenter 8x Key Concepts – 6


6.  Integration Service (IS)

The key functions of the IS are:
  • Interprets the workflow and mapping metadata from the repository
  • Executes the instructions in the metadata
  • Manages the movement of data from source to target systems, using memory and disk
The three main components of the Integration Service that enable data movement are:
  • Integration Service Process
  • Load Balancer
  • Data Transformation Manager

6.1 Integration Service Process (ISP)

The Integration Service starts one or more Integration Service processes to run and monitor workflows. When we run a workflow, the ISP starts and locks the workflow, runs the workflow tasks, and starts the process that runs sessions. The functions of the Integration Service Process are listed below (a simple illustrative sketch follows the list):
  • Locks and reads the workflow
  • Manages workflow scheduling, i.e., maintains session dependencies
  • Reads the workflow parameter file
  • Creates the workflow log
  • Runs workflow tasks and evaluates the conditional links
  • Starts the DTM process to run the session
  • Writes historical run information to the repository
  • Sends post-session emails
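To make this sequence concrete, here is a minimal, hypothetical Python sketch of the ISP responsibilities listed above (lock and read the workflow, evaluate conditional links, run tasks, hand sessions to a DTM, record history). The Workflow/Task structures and all names are invented for illustration; this is not how PowerCenter is implemented internally.

    # Minimal, hypothetical sketch of the ISP responsibilities listed above.
    # Not PowerCenter code; the workflow/task structures are invented for illustration.
    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class Task:
        name: str
        is_session: bool = False                                   # sessions are handed to a DTM process
        condition: Callable[[dict], bool] = lambda params: True    # stands in for a conditional link

    @dataclass
    class Workflow:
        name: str
        tasks: List[Task] = field(default_factory=list)

    def start_dtm(task: Task, params: dict) -> None:
        print(f"DTM: running session '{task.name}' with params {params}")

    def run_workflow(workflow: Workflow, params: dict) -> None:
        print(f"ISP: locked and read workflow '{workflow.name}'")  # lock + read workflow metadata
        log = [f"workflow {workflow.name} started"]                # stands in for the workflow log
        for task in workflow.tasks:
            if not task.condition(params):                         # evaluate the conditional link
                log.append(f"skipped {task.name}")
                continue
            if task.is_session:
                start_dtm(task, params)                            # ISP starts the DTM for sessions
            else:
                print(f"ISP: running task '{task.name}'")
            log.append(f"completed {task.name}")
        log.append("run history written to repository; post-session email sent")
        print("\n".join(log))

    if __name__ == "__main__":
        wf = Workflow("wf_daily_load", [Task("cmd_cleanup"), Task("s_m_load_orders", is_session=True)])
        run_workflow(wf, {"$$RunDate": "2009-03-19"})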

6.2 Load Balancer

The Load Balancer dispatches tasks to achieve optimal performance. It dispatches tasks to a single node or across the nodes in a grid after performing a sequence of steps. Before looking at these steps, we need to understand Resources, Resource Provision Thresholds, Dispatch Modes and Service Levels (a simplified sketch of the threshold checks and dispatch modes follows the dispatch steps below).
  • Resources – we can configure the Integration Service to check the resources available on each node and match them against the resources required to run the task. For example, if a session uses an SAP source, the Load Balancer dispatches the session only to nodes where the SAP client is installed
  • Resource Provision Thresholds – there are three: Maximum CPU Run Queue Length, the maximum number of runnable threads waiting for CPU resources on the node; Maximum Memory %, the maximum percentage of virtual memory allocated on the node relative to the total physical memory size; and Maximum Processes, the maximum number of running Session and Command tasks allowed for each Integration Service process running on the node
  • Dispatch Modes – there are three: Round-Robin, in which the Load Balancer dispatches tasks to available nodes in round-robin fashion after checking the Maximum Processes threshold; Metric-based, which checks all three resource provision thresholds and dispatches tasks in round-robin fashion; and Adaptive, which checks all three resource provision thresholds and also ranks nodes according to current CPU availability
  • Service Levels – these establish priority among tasks that are waiting to be dispatched. The three components of a service level are Name, Dispatch Priority and Maximum Dispatch Wait Time; the Maximum Dispatch Wait Time is how long a task can wait in the queue before its priority is raised, which ensures that no task waits forever
A. Dispatching tasks on a node
  1. The Load Balancer checks different resource provision thresholds on the node depending on the Dispatch mode set. If dispatching the task causes any threshold to be exceeded, the Load Balancer places the task in the dispatch queue, and it dispatches the task later
  2. The Load Balancer dispatches all tasks to the node that runs the master Integration Service process
B. Dispatching tasks on a grid
  1. The Load Balancer verifies which nodes are currently running and enabled
  2. The Load Balancer identifies nodes that have the PowerCenter resources required by the tasks in the workflow
  3. The Load Balancer verifies that the resource provision thresholds on each candidate node are not exceeded. If dispatching the task causes a threshold to be exceeded, the Load Balancer places the task in the dispatch queue, and it dispatches the task later
  4. The Load Balancer selects a node based on the dispatch mode
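The threshold checks and dispatch modes described above can be pictured with a small, hypothetical Python sketch. The threshold values, node attributes and selection rules below are simplified assumptions for illustration only; they are not the Load Balancer's actual algorithm (in particular, round-robin state is not tracked here).

    # Hypothetical illustration of the resource provision thresholds and dispatch modes.
    # Threshold values and node attributes are invented; this is not Informatica's algorithm.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Node:
        name: str
        run_queue_length: int        # runnable threads waiting for CPU on the node
        memory_pct: float            # virtual memory allocated vs. physical memory
        running_processes: int       # running Session/Command tasks on the node
        cpu_available_pct: float     # used only by the adaptive mode

    MAX_CPU_RUN_QUEUE = 10           # Maximum CPU Run Queue Length (illustrative value)
    MAX_MEMORY_PCT = 150.0           # Maximum Memory % (illustrative value)
    MAX_PROCESSES = 10               # Maximum Processes (illustrative value)

    def within_thresholds(node: Node, mode: str) -> bool:
        if node.running_processes >= MAX_PROCESSES:        # every mode checks Maximum Processes
            return False
        if mode in ("metric-based", "adaptive"):           # these modes check all three thresholds
            if node.run_queue_length >= MAX_CPU_RUN_QUEUE:
                return False
            if node.memory_pct >= MAX_MEMORY_PCT:
                return False
        return True

    def dispatch(nodes: List[Node], mode: str = "round-robin") -> Optional[Node]:
        candidates = [n for n in nodes if within_thresholds(n, mode)]
        if not candidates:
            return None                                    # task waits in the dispatch queue
        if mode == "adaptive":
            return max(candidates, key=lambda n: n.cpu_available_pct)  # rank by CPU availability
        return candidates[0]                               # next eligible node (round-robin simplified)

    grid = [Node("node1", 4, 80.0, 3, 40.0), Node("node2", 2, 60.0, 9, 70.0)]
    chosen = dispatch(grid, mode="adaptive")
    print(chosen.name if chosen else "queued")             # node2 has more CPU headroom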

6.3 Data Transformation Manager (DTM) Process

When the workflow reaches a session, the Integration Service Process starts the DTM process. The DTM is the process associated with the session task, and it performs the following tasks (a thread-pipeline sketch follows the list):
  • Retrieves and validates session information from the repository.
  • Validates source and target code pages.
  • Verifies connection object permissions.
  • Performs pushdown optimization when the session is configured for pushdown optimization.
  • Adds partitions to the session when the session is configured for dynamic partitioning.
  • Expands the service process variables, session parameters, and mapping variables and parameters.
  • Creates the session log.
  • Runs pre-session shell commands, stored procedures, and SQL.
  • Sends a request to start worker DTM processes on other nodes when the session is configured to run on a grid.
  • Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
  • Runs post-session stored procedures, SQL, and shell commands and sends post-session email.
  • Reports the execution result to the ISP after the session completes.
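The reader, transformation and writer threads mentioned above form a pipeline. The sketch below is a toy Python illustration of that pipeline, where queues stand in for the DTM's internal buffers and the "transformation" is a trivial upper-casing step; it is an analogy, not the DTM's actual implementation.

    # Toy sketch of the reader -> transformation -> writer thread pipeline of a session.
    # Queues stand in for the DTM's buffers; the source rows and the transform are invented.
    import threading
    import queue

    SOURCE_ROWS = ["alice", "bob", "carol"]    # pretend source data
    STOP = object()                            # end-of-data marker

    def reader(out_q: queue.Queue) -> None:
        for row in SOURCE_ROWS:                # extract
            out_q.put(row)
        out_q.put(STOP)

    def transformer(in_q: queue.Queue, out_q: queue.Queue) -> None:
        while (row := in_q.get()) is not STOP: # transform
            out_q.put(row.upper())
        out_q.put(STOP)

    def writer(in_q: queue.Queue, target: list) -> None:
        while (row := in_q.get()) is not STOP: # load
            target.append(row)

    read_q, write_q, target_table = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=reader, args=(read_q,)),
        threading.Thread(target=transformer, args=(read_q, write_q)),
        threading.Thread(target=writer, args=(write_q, target_table)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(target_table)                        # ['ALICE', 'BOB', 'CAROL']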
Workflow execution sequence:
  1. A PowerCenter client requests the IS to start a workflow
  2. The IS starts an ISP
  3. The ISP consults the LB to select a node
  4. The ISP starts the DTM on the node selected by the LB

Friday, March 13, 2009

Multi-User Environment for Siebel Analytics/OBIEE


By default, only one user can edit the repository at a time, but a more efficient environment would allow developers to modify the repository simultaneously and then check in changes. Oracle BI allows multiple developers to work on objects from the same repository during group development of Oracle BI applications.

Steps for configuring Oracle BI multi-user development environment:

1) Create a project
In the Admin tool, open the Project Manager.
Path: Select Manage > Projects, then Action > New Project to create a new project.
The Project Manager dialog has two panes: the left pane lists the objects available for the project, and the right pane lists the objects that have been added to the project. Select the objects you want from the left pane and click the Add button. If you select a presentation catalog, all fact and dependent objects are added to the project.
Besides the catalog, other objects such as users, groups, variables and initialization blocks can also be added to a project. You can also remove unwanted objects from the project by clicking the Remove button.

2) Set up a shared network directory
The administrator needs to identify or create a shared network directory, accessible to all developers, in which the repository file is kept. This is the master repository from which multiple developers check out projects and into which they check in their changes. Each developer points the Admin tool on their machine to this directory path.

Configuring the Admin tool on the local machine for the multi-user development environment:
1) Point to the multi-user development directory:
Set up the Admin tool to point to the multi-user development directory.
Path: Select Tools > Options, then select the Multiuser tab.
The Multiuser tab has two fields. In the multi-user development directory field, browse to the path of the shared directory where the master repository is kept for development. The Full Name field is optional, but filling it in helps track the changes made by each user; the value is stored in the HKEY_CURRENT_USER part of the registry and is therefore unique for each login on the computer.
2) Check out projects:
After pointing to the multi-user development directory, the developer can check out the desired projects.
To check out projects, go to File > Multiuser > Checkout, which is only available when the multi-user environment is set up. The developer is then presented with a dialog box to select the master repository, if there is more than one. Select the repository and enter a user name and password; the tool then prompts for the project or projects to be imported. After selecting the projects, the user must enter a name for the new repository, which is stored in the user's local directory.

3) Admin tool tasks during checkout:
During checkout, the Admin tool performs the following tasks (mimicked in the sketch after this list):
  • Makes a temporary copy of the master repository on the local machine.
  • Saves a local copy of the projects in a new repository on the local machine.
  • Saves a second local copy of the projects in a new repository on the local machine, prefixed with “original”.
  • Deletes the temporary copy of the master repository from the local machine.
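Conceptually, those checkout steps boil down to a handful of file copies. The Python sketch below mimics the sequence with plain file operations; the paths and file names are hypothetical, it collapses project extraction into a whole-file copy, and it is not what the Admin tool actually executes internally.

    # Illustrative mimic of the checkout steps above -- hypothetical paths, not OBIEE internals.
    import os
    import shutil
    import tempfile

    MASTER_RPD = r"\\shared\obiee\master.rpd"       # master repository in the shared network directory
    LOCAL_DIR = r"C:\OracleBI\repository"           # developer's local repository directory

    def checkout(project_name: str) -> None:
        tmp = os.path.join(tempfile.gettempdir(), "master_copy.rpd")
        shutil.copyfile(MASTER_RPD, tmp)            # 1. temporary copy of the master repository
        local = os.path.join(LOCAL_DIR, f"{project_name}.rpd")
        shutil.copyfile(tmp, local)                 # 2. local copy of the checked-out projects
        original = os.path.join(LOCAL_DIR, f"original{project_name}.rpd")
        shutil.copyfile(tmp, original)              # 3. second copy, prefixed with "original"
        os.remove(tmp)                              # 4. delete the temporary master copy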
4) Making changes in the metadata:
Changes can be made to logical tables, table definitions and logical table sources. Developers can work on the same project, but if one developer deletes objects, the deletion is merged without any warning, so developers should keep in mind that their modifications can affect others.
5) Tasks performed during check-in:
The Admin tool locks the master repository to prevent other developers from attempting to merge at the same time, and copies the master repository to a local directory so that the developer merges with the most recent version.
6) Check in changes:
After modifying the repository, the developer needs to check in the changes and merge them with the master repository in the shared path. Only one developer can merge at a time. After selecting File > Multiuser > Merge Local Changes, the developer is shown a dialog box with the full name and an option to add comments; after clicking OK, the Admin tool copies the master repository from the shared directory to the local machine.
Once the master repository is locked, the merge takes place. After the merge, developers have to publish to the network: go to File > Multiuser > Publish to Network to publish the changes made to the repository. This merges the local repository changes into the master repository, and the local copy of the repository is removed from the local machine.
Advantages:
  • History menu option: gives details of all the changes performed during the merge process, providing a version history of the repository.
  • It helps to track project history.
Disadvantage:
  • The multi-user development environment is purely for repository development; dashboards and reports can be developed and managed by any number of users in the browser once the presentation server is up.
To know more about OBIEE

Tuesday, February 24, 2009

Avoid Several Restarts of Siebel (SWSE) Web Server After Each Build


When changes are made to browser scripts, customized images, style sheets or help files, these files are copied to the respective folders under Siebel root/Siebel Server/webmaster. To synchronize the changes to the web server, the Siebel web server is usually restarted.


Restarting the Siebel web servers after each build is a tedious task when you have several builds moving from development to testing or other environments.

Alternatively, you can use UpdateWebImages!

UpdateWebImages? Interesting. Let me explain, step by step.

Type the URL below in the browser and hit Enter. That's it. (A scripted version follows the parameter descriptions below.)
http://host:port/application/start.swe?SWECmd=UpdateWebImages&SWEPassword=WebUpdateProtectionKey
Host = The name of the Web server machine.
Port = The Web server listen port (not required if using the default port, 80).
Application = Any Siebel application hosted by this Web server (such as callcenter_enu, sales_enu, and so on).
WebUpdateProtectionKey = The unencrypted version of the Web Update Protection Key, which is defined in the eapps.cfg file by the WebUpdatePassword parameter.
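Because this is just an HTTP GET, the same call can be scripted as a post-build step instead of typing the URL into a browser. Below is a minimal sketch using Python's standard library; the host, port, application name and protection key are placeholders to be replaced with your own values, and you should confirm the response in a browser the first time.

    # Minimal sketch: call UpdateWebImages from a post-build script.
    # HOST, PORT, APP and WEB_UPDATE_KEY are placeholders -- substitute your own values.
    from urllib.parse import urlencode
    from urllib.request import urlopen

    HOST, PORT, APP = "siebelweb01", 80, "callcenter_enu"
    WEB_UPDATE_KEY = "changeme"        # unencrypted WebUpdatePassword value from eapps.cfg

    params = urlencode({"SWECmd": "UpdateWebImages", "SWEPassword": WEB_UPDATE_KEY})
    url = f"http://{HOST}:{PORT}/{APP}/start.swe?{params}"

    with urlopen(url, timeout=60) as response:
        print(response.status)         # inspect the status and body to confirm the update ran
        print(response.read(500))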

To know more about Siebel Web Server

Monday, February 16, 2009

Industry Specific BI – What's the common denominator?

My previous post on business process fundamentals concluded with a friendly exhortation to BI practitioners inciting them to view their craft from the point of optimizing business process.
So the next time you are involved in any BI endeavor, please ask this question to yourself and the people involved in the project – “So which business process is this BI project supposed to optimize, why and how?” I define ‘Optimization’ loosely as anything that leads to bottom-line or top-line benefits.
Business processes, by their very definition, belong to the industry domain. Companies have their own business processes – some are standard across firms in a particular domain and many are unique to specific companies. The efficiency of business processes is a source of competitive advantage, and the fact that ERP vendors like SAP have special configurations for every industry illustrates this point. By corollary, for BI to be effective in optimizing business processes, it has to be tied to specific industry needs, creating what can be called “Verticalized Business Intelligence” (V-BI for short).
At Hexaware’s Business Intelligence & Analytics practice (the company and team that I belong to), we have taken the concept of V-BI pretty seriously and have built solutions aimed at industry verticals. You can view our vertical-specific BI offerings at this link, and we definitely welcome your comments on them.
Though Verticalized BI is a powerful idea, companies typically need an “analytics anchor point” to establish a BI infrastructure before embarking on their domain specific BI initiatives. The analytics anchor point, mentioned above, should have the following characteristics:
  • All organizations across domains should have the necessity to implement it
  • The business processes associated with these analytics need to be fairly standardized and should be handled by experts
  • Should involve some of the most critical stakeholders within the organization as the success of this first initiative will lay the foundation for future work
Based on my experience providing consulting services to organizations laying down an Enterprise BI roadmap, I feel that “Financial Analytics” has all the right characteristics to become the analytics anchor point for companies. Financial Analytics, the common denominator, typically comprises:
  • General Ledger Analysis – (also known as Financial Statements Analysis)
  • Profitability Analysis (Customer / Product Profitability etc.)
  • Budgeting, Planning & Forecasting
  • Monitoring & Controlling – The Dashboards & Scorecards
  • General Ledger Consolidation
The above mentioned areas are also classified as Enterprise Performance Management. The convergence of Performance Management and BI is another interesting topic (recent announcements of Microsoft have made this subject doubly interesting!) and I will write about it in my future posts.
In my humble opinion, the prescription for Enterprise BI is:
  • Select one or more areas of Financial Analytics (as mentioned above) as your first target for Enterprise BI.
  • During the process of completing step 1, establish the technology and process infrastructure for BI in the organization
  • Add your industry specific BI initiatives (Verticalized Business Intelligence) as you move up the curve
I, for one, truly believe in the power of Verticalized BI to develop solutions that provide the best fit between business and technology. That business and IT people can sit across the table and look at each other with mutual respect is another important non-trivial benefit.
Thanks for reading. Do you have any other analytics anchor points for organizations to jumpstart their BI initiatives? Please do share your thoughts.
Read More About Industry Specific BI

Tuesday, February 10, 2009

New Milestone in Workforce Modelling and Intelligence – OrgPlus


It was in the last quarter of 2007 that we discussed workforce modelling and intelligence (refer to the “workforce modelling and intelligence” post) in our pitstop. Now the news is that OrgPlus Enterprise has been named the top HR product of 2008.


Yes. In a recent news release on PressReleasePoint, the author states that “OrgPlus allows managers to easily visualize and understand their talent data, and if an organization is growing, reorganizing or merging, OrgPlus rapidly models these changes using accurate financial, workforce, and budget data so HR, finance and executive teams can make better decisions to impact overall corporate objectives.”


The integration between PeopleSoft HCM 9.0 and OrgPlus had long awaited validation by Oracle, and OrgPlus has now received that integration validation. In short, the integration revolution within HR IT products seems to be winning the race. The importance of such integrations was realised a few years back by companies that concentrated on their ROI, and I would recommend them even in the present financial crisis.

Enterprise Content Management – Livelink Vs SharePoint


Do you feel that the growth of an organization is directly proportional to the content management within the organization, or vice versa? Let’s look at one of the driving forces behind a company’s growth: Enterprise Content Management. ‘Collaboration’, ‘Goals’, ‘Processes’ and ‘Business Data Analysis’ are words frequently uttered when a company revisits its own growth.

ECM suites provide a better way to organize, reuse and share data across the organization. To name a few, the Livelink ECM Suite and Microsoft SharePoint provide an interactive space for document management, version management, auditing, business workflows, business intelligence and many other interesting search features in a user-friendly, website-style application. Microsoft’s BDC and Web Part technologies also play a vital role in third-party integrations (the details of which can be seen in our future posts).

In simple terms, an Enterprise Content Management suite can be defined as a place where documents are managed. In reality, the ROI is quite high for organizations that have implemented SharePoint or Livelink. Livelink goes one step further by handling the retention management process for documents and incorporating a seamless workflow approval chain along the organization tree; this enables an organization to set the time period for which a document is accessible before it is purged. In comparison, Livelink appears to be DoD certified, while SharePoint may be in the process of getting there…

Thursday, February 5, 2009

Analytics, choosing it

We observe many BI project sponsors explicitly asking for an analytics package implementation to meet business needs; the benefit is that it saves time. By deciding on an analytics package, we can get the application up quickly, with all the typical benefits of a ‘buy’ solution over a ‘build’ solution.
So what are the key parameters to look for when choosing an analytics package? The following are the points to consider, in order of importance.
1. The effort to arrive at the right data model for a BI system is huge and quite tedious, so a comprehensive set of data models, metrics and calculations from the package is very important.
2. Flexibility and openness in managing the data model are also critical; some of the capabilities to look for in managing data model elements are:
  • Ability to browse the data elements and their definitions
  • Support for customization of the data model without getting back to the database syntax
  • Auto Source System profiling and field mapping from the source systems to the data model
  • Enabling validation of data type, data length of the data model against the source system field definitions
  • Means to ensure that customization of the data model in terms of field addition doesn’t happen when a similar element exists
  • Availability of standard code data as applicable to the functional area
  • Supporting country specific needs in terms of data representation
3. The ETL process for a BI system is also a major effort. Though the work of pulling the data and making it available to the package in the required format cannot be avoided, the availability of plug-ins that understand the data structures of typical source systems such as ERPs can save a good amount of effort.
4. Built-in data validation as part of the ETL process is also a must; integration with a data quality product would be valuable
5. Ability to support audit and compliance requirements for data usage and reporting
6. Integration of the package with industry specific research data from vendors like D&B, IMS etc to enable benchmarking the performance metrics against industry peers/competitors
7. Customizable Security Framework
8. Semantic layer definition with formulas, hierarchies etc
9. Ready to use Score Cards and dashboard layouts
10. Pre built reports and portal
Often all the pre-delivered reports undergo changes and are almost completely customized when implemented, so the availability of a larger list of reports does not by itself mean a lot, since most of the reports would be minor variations of one another. Certain compliance reports are useful when they come with the package; these are published, industry-standard report formats.
An evaluation phase to test the analytics product’s capabilities on a sample of the data before choosing it is definitely a must; the ten points above would be the evaluation criteria during this exercise.
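As a companion to that evaluation exercise, the scores against the ten criteria can be rolled up into a simple weighted total. The Python sketch below is purely illustrative; the criteria labels, weights and sample scores are assumptions to be replaced by your own evaluation data.

    # Hypothetical weighted-score roll-up for the ten evaluation criteria above.
    # Weights and sample scores are illustrative only.
    CRITERIA_WEIGHTS = {                     # roughly in the order of importance given above
        "Data model, metrics & calculations": 10,
        "Flexibility/openness of the data model": 9,
        "ETL plug-ins for typical sources": 8,
        "Data validation / data quality integration": 7,
        "Audit and compliance support": 6,
        "Industry benchmark data integration": 5,
        "Customizable security framework": 4,
        "Semantic layer definition": 3,
        "Scorecards and dashboard layouts": 2,
        "Pre-built reports and portal": 1,
    }

    def weighted_score(scores: dict) -> float:
        """scores maps each criterion to a 1-5 rating for one candidate package."""
        total_weight = sum(CRITERIA_WEIGHTS.values())
        return sum(CRITERIA_WEIGHTS[c] * scores.get(c, 0) for c in CRITERIA_WEIGHTS) / total_weight

    package_a = {c: 3 for c in CRITERIA_WEIGHTS}                   # flat 3s as a toy example
    package_a["Data model, metrics & calculations"] = 5            # strong data model coverage
    print(round(weighted_score(package_a), 2))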