Tuesday, March 24, 2009

Agile – Take off to cost reduction


The word ‘Agile’ has been around for a while and has created a buzz in software development. The history of agile programming dates back to the 1970s.
‘Agile Programming and Practices’ introduces the concept of producing small units of output that are thoroughly developed and tested within fixed periods of time (usually referred to as ‘sprints’). It emphasizes delivering the smallest workable unit that best realizes the business idea, while constantly improving the product by adding functionality over time. The underlying principle of agile programming is that it is ‘iterative and incremental’.
Agile practices introduce small, self-organized teams that are highly interactive. Better collaboration is one of the primary outcomes of agile programming, while little emphasis is given to formal planning and documentation. The agile concepts emphasize eliminating waste through ‘Lean Programming’ techniques. They provide a means to rectify flaws at an early stage, thereby greatly reducing rework costs. Above all, the greatest advantage is that the agile programming model helps ensure that the project proceeds in the right direction.
Agile practices provide ‘adaptability to change’. They are best suited to object-oriented software projects, where individual objects (or modules) are developed and tested for integrity and defects before being integrated into the final product and tested.
Different agile methodologies have been framed over time. Some of the noteworthy ones are ‘Agile Modeling’, ‘Agile Unified Process’, ‘Extreme Programming’, ‘Agile Data Method’, ‘Scrum’ and ‘Open Unified Process’. Noteworthy agile practices followed in the industry include Test Driven Development, Behavior Driven Development, Pair Programming and Continuous Integration.
Let us examine the feasibility of applying a test-driven development approach to the check-in activity of an airline departure control solution. The prime aim of the check-in activity is to ensure that the passenger has been successfully checked in, a seat has been allocated and a boarding pass has been printed; each of these is independent by nature. A test-driven approach asserts, as the functionality materializes, that each of these end points is fulfilled. This greatly reduces the rework overhead that would otherwise build up if the whole system were designed to be tested only at the end.
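As a rough illustration of the check-in example, here is a minimal test-driven sketch in Python. The names `Passenger`, `CheckInService` and the three test functions are hypothetical and exist only for this illustration; they are not taken from any real departure control system. The idea is that each test states one end point (checked in, seat allocated, boarding pass printed), and the service code is only as large as needed to make the tests pass.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Passenger:
    # Hypothetical passenger record used only for this illustration.
    name: str
    checked_in: bool = False
    seat: Optional[str] = None
    boarding_pass: Optional[str] = None


class CheckInService:
    """Hypothetical check-in service; each step can be tested on its own."""

    def __init__(self, free_seats: List[str]):
        self.free_seats = list(free_seats)

    def check_in(self, passenger: Passenger) -> None:
        passenger.checked_in = True

    def allocate_seat(self, passenger: Passenger) -> str:
        if not self.free_seats:
            raise ValueError("no seats available")
        passenger.seat = self.free_seats.pop(0)
        return passenger.seat

    def print_boarding_pass(self, passenger: Passenger) -> str:
        if passenger.seat is None:
            raise ValueError("a seat must be allocated before printing")
        passenger.boarding_pass = f"{passenger.name} / seat {passenger.seat}"
        return passenger.boarding_pass


# In a test-driven flow these tests are written first, one per end point,
# and the service above grows only until each of them passes.
def test_passenger_is_checked_in():
    passenger = Passenger("A. Traveller")
    CheckInService(["12A"]).check_in(passenger)
    assert passenger.checked_in


def test_seat_is_allocated():
    passenger = Passenger("A. Traveller")
    service = CheckInService(["12A", "12B"])
    assert service.allocate_seat(passenger) == "12A"
    assert service.free_seats == ["12B"]


def test_boarding_pass_is_printed():
    passenger = Passenger("A. Traveller", checked_in=True, seat="12A")
    printed = CheckInService([]).print_boarding_pass(passenger)
    assert printed == "A. Traveller / seat 12A"


if __name__ == "__main__":
    test_passenger_is_checked_in()
    test_seat_is_allocated()
    test_boarding_pass_is_printed()
    print("all check-in tests passed")
```

Because each end point has its own test, a defect in seat allocation shows up as soon as that piece is written, rather than during a final system test.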

Thursday, March 19, 2009

Informatica PowerCenter 8x Key Concepts – 6


6.  Integration Service (IS)

The key functions of the IS are:
  • Interpreting the workflow and mapping metadata from the repository
  • Executing the instructions in the metadata
  • Managing the data from the source system to the target system within memory and on disk
The three main components of the Integration Service that enable data movement are:
  • Integration Service Process
  • Load Balancer
  • Data Transformation Manager

6.1 Integration Service Process (ISP)

The Integration Service starts one or more Integration Service processes to run and monitor workflows. When we run a workflow, the ISP starts and locks the workflow, runs the workflow tasks, and starts the process to run sessions. The functions of the Integration Service Process are:
  • Locks and reads the workflow
  • Manages workflow scheduling, i.e., maintains session dependency
  • Reads the workflow parameter file
  • Creates the workflow log
  • Runs workflow tasks and evaluates the conditional links
  • Starts the DTM process to run the session
  • Writes historical run information to the repository
  • Sends post-session emails

6.2 Load Balancer

The Load Balancer dispatches tasks to achieve optimal performance. It dispatches tasks to a single node or across the nodes in a grid after performing a sequence of steps. Before looking at these steps, we need to know about Resources, Resource Provision Thresholds, Dispatch Modes and Service Levels.
  • Resources – We can configure the Integration Service to check the resources available on each node and match them with the resources required to run the task. For example, if a session uses an SAP source, the Load Balancer dispatches the session only to nodes where the SAP client is installed.
  • Resource Provision Thresholds – There are three. Maximum CPU Run Queue Length: the maximum number of runnable threads waiting for CPU resources on the node. Maximum Memory %: the maximum percentage of virtual memory allocated on the node relative to the total physical memory size. Maximum Processes: the maximum number of running Session and Command tasks allowed for each Integration Service process running on the node.
  • Dispatch Modes – There are three. Round-Robin: the Load Balancer dispatches tasks to available nodes in a round-robin fashion after checking the “Maximum Processes” threshold. Metric-based: checks all three resource provision thresholds and dispatches tasks in a round-robin fashion. Adaptive: checks all three resource provision thresholds and also ranks nodes according to current CPU availability.
  • Service Levels – These establish priority among tasks that are waiting to be dispatched. The three components of a service level are Name, Dispatch Priority and Maximum Dispatch Wait Time. “Maximum Dispatch Wait Time” is the amount of time a task can wait in the queue, which ensures that no task waits forever.
A. Dispatching Tasks on a node
  1. The Load Balancer checks different resource provision thresholds on the node depending on the Dispatch mode set. If dispatching the task causes any threshold to be exceeded, the Load Balancer places the task in the dispatch queue, and it dispatches the task later
  2. The Load Balancer dispatches all tasks to the node that runs the master Integration Service process
B. Dispatching Tasks on a grid
  1. The Load Balancer verifies which nodes are currently running and enabled
  2. The Load Balancer identifies nodes that have the PowerCenter resources required by the tasks in the workflow
  3. The Load Balancer verifies that the resource provision thresholds on each candidate node are not exceeded. If dispatching the task causes a threshold to be exceeded, the Load Balancer places the task in the dispatch queue, and it dispatches the task later
  4. The Load Balancer selects a node based on the dispatch mode
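The four grid-dispatch steps above can be sketched as a small Python simulation. This is only an illustrative model, not PowerCenter code: the `Node`, `Task` and `LoadBalancer` classes, their field names and default threshold values are all assumptions made for the example. In particular, "current CPU availability" in adaptive mode is approximated here by the shortest CPU run queue, and the CPU and memory thresholds are checked against the node's current readings.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    # Hypothetical node model; threshold defaults are arbitrary.
    name: str
    running: bool = True
    resources: Set[str] = field(default_factory=set)
    running_tasks: int = 0
    max_processes: int = 10          # "Maximum Processes"
    cpu_run_queue: int = 0
    max_cpu_run_queue: int = 10      # "Maximum CPU Run Queue Length"
    memory_pct: float = 0.0
    max_memory_pct: float = 150.0    # "Maximum Memory %"


@dataclass
class Task:
    name: str
    required_resources: Set[str] = field(default_factory=set)


def within_thresholds(node: Node, mode: str) -> bool:
    """Round-robin checks only Maximum Processes; other modes check all three."""
    if node.running_tasks + 1 > node.max_processes:
        return False
    if mode == "round-robin":
        return True
    return (node.cpu_run_queue <= node.max_cpu_run_queue
            and node.memory_pct <= node.max_memory_pct)


class LoadBalancer:
    def __init__(self, nodes: List[Node], mode: str = "round-robin"):
        self.nodes = nodes
        self.mode = mode
        self.queue: List[Task] = []   # dispatch queue for tasks that must wait
        self._cursor = 0              # round-robin position

    def dispatch(self, task: Task) -> Optional[str]:
        # Step 1: nodes that are currently running and enabled.
        candidates = [n for n in self.nodes if n.running]
        # Step 2: nodes that have the resources the task requires.
        candidates = [n for n in candidates
                      if task.required_resources <= n.resources]
        # Step 3: nodes whose thresholds would not be exceeded.
        candidates = [n for n in candidates if within_thresholds(n, self.mode)]
        if not candidates:
            self.queue.append(task)   # dispatched later
            return None
        # Step 4: select a node based on the dispatch mode.
        if self.mode == "adaptive":
            chosen = min(candidates, key=lambda n: n.cpu_run_queue)
        else:
            chosen = candidates[self._cursor % len(candidates)]
            self._cursor += 1
        chosen.running_tasks += 1
        return chosen.name


if __name__ == "__main__":
    grid = [Node("node1", resources={"SAP client"}, max_processes=1),
            Node("node2")]
    lb = LoadBalancer(grid)
    print(lb.dispatch(Task("s_sap_load", {"SAP client"})))   # only node1 qualifies
    print(lb.dispatch(Task("s_sap_load2", {"SAP client"})))  # node1 is full, task is queued
```

Service levels are left out of the sketch; a fuller model would order the dispatch queue by dispatch priority and promote tasks that exceed their maximum dispatch wait time.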

6.3 Data Transformation Manager (DTM) Process

When the workflow reaches a session, the Integration Service Process starts the DTM process. The DTM is the process associated with the session task. The DTM process performs the following tasks:
  • Retrieves and validates session information from the repository.
  • Validates source and target code pages.
  • Verifies connection object permissions.
  • Performs pushdown optimization when the session is configured for pushdown optimization.
  • Adds partitions to the session when the session is configured for dynamic partitioning.
  • Expands the service process variables, session parameters, and mapping variables and parameters.
  • Creates the session log.
  • Runs pre-session shell commands, stored procedures, and SQL.
  • Sends a request to start worker DTM processes on other nodes when the session is configured to run on a grid.
  • Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data
  • Runs post-session stored procedures, SQL, and shell commands and sends post-session email
  • Reports the execution result to the ISP after the session is complete
Pictorial Representation of Workflow execution:
  1. A PowerCenter Client requests the IS to start a workflow
  2. The IS starts the ISP
  3. The ISP consults the LB to select a node
  4. The ISP starts the DTM on the node selected by the LB

Friday, March 13, 2009

Multi-User Environment for Siebel Analytics/OBIEE by Alok Chowdhary on March 13, 2009 in Siebel Street


By default, only one user can edit the repository at a time, but a more efficient environment would allow developers to modify the repository simultaneously and then check in changes. Oracle BI allows multiple developers to work on objects from the same repository during group development of Oracle BI applications.

Steps for configuring Oracle BI multi-user development environment:

1) Create Project
In the Admin tool, open the Project Manager.
Path: Select Manage > Projects and then Action > New project to create a new project.
The dialog has two parts: the left part lists the objects that are available for the project, and the right part holds the objects that are added to the project. Select the objects in the left part that you want to add to the project and click the Add button. If you select a presentation catalog, all of its fact and dependent objects are selected in the project.
Besides the catalog, other objects such as users, groups, variables and initialization blocks can also be added to a project. You can also remove unwanted objects from the project by clicking the Remove button.

2) Set up a shared network directory
The administrator needs to identify or create a shared network directory that is accessible to developers and keep the repository file at that location. This repository is the master repository, which multiple developers can access to check changes in or out. Developers have to point to this directory path when they use the Admin tool on their machines.

Making changes in the Admin tool on the local machine to use the multi-user development environment:
1. Point to multi-user directory:
Set up the Admin tool to point to the multi-user development directory.
Path: Select Tools > Options and then select the Multiuser tab.
The tab has two fields. In the first, the multi-user development directory, browse to the path of the shared directory where the original repository is kept for development purposes. The other field, full name, is optional, but if the user enters a name it helps in tracking the changes made by each user. The name is stored in the HKEY_CURRENT_USER part of the registry and is therefore unique for each login on the computer.
2. Check out project:
After pointing to the multi-user development directory, the developer can check out the desired projects.
To check out projects, go to File > Multiuser > Checkout, which is available only when the multi-user environment is set up. The developer is then presented with a dialog box to select the master repository, if there is more than one repository. Select the repository and enter the user name and password; the tool then asks you to select the project or projects to be imported. After selecting the projects, the user must enter the name of the new repository, which will be stored in the user’s local directory.

3. Admin tool tasks during checkout:
During checkout, the Admin tool performs the following tasks:
  • Makes a temporary copy of the master repository on the local machine.
  • Saves a local copy of the projects in the new repository on the local machine.
  • Saves a second local copy of the projects in the new repository on the local machine, with the prefix “original”.
  • Deletes the temporary copy of the master repository from the local machine.
4. Changes done in metadata:
Changes can be made to logical tables, table definitions and logical table sources. Developers can work on the same project, but if one developer deletes objects, the deletion is migrated without any warning. Developers should therefore keep in mind that their modifications can affect others too.
5. Tasks done during check-in:
The Admin tool locks the master repository to prevent other developers from attempting to merge at the same time, and copies the master repository to a local directory so that the developer merges with the latest repository.
6. Check in changes:
After modifying the repository, the developer needs to check in the changes and merge them with the master repository in the shared path. Only one developer at a time can merge. After selecting File > Multi-user > Merge local changes, the developer is shown a dialog box with the full name and an option to write comments, if any. After clicking OK, the Admin tool copies the master repository from the shared directory to the local machine.
After the developer locks the master repository, the merge process takes place. After the merge, the developer has to publish to the network: go to File > Multi-user > Publish to Network to publish the changes made in the repository. This finally merges the local repository changes into the master repository, and at the same time the local copy of the repository is removed from the local machine.
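The lock, merge and publish sequence can be pictured with a small in-memory Python model. This is a conceptual sketch only and does not use any Oracle BI API: the `MasterRepository` class, the `merge_local_changes` function and the dictionary representation of repository objects are all hypothetical. It also assumes that the “original” copy saved at checkout serves as the baseline for working out what the developer changed.

```python
class MasterRepository:
    """Hypothetical stand-in for the master repository in the shared directory."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.locked_by = None
        self.history = []   # (developer, comment) entries, one per publish

    def lock(self, developer):
        # Only one developer at a time can merge.
        if self.locked_by is not None:
            raise RuntimeError(f"master repository is locked by {self.locked_by}")
        self.locked_by = developer

    def publish(self, developer, merged, comment=""):
        if self.locked_by != developer:
            raise RuntimeError("only the developer holding the lock can publish")
        self.objects = dict(merged)
        self.history.append((developer, comment))
        self.locked_by = None   # publishing releases the lock


def merge_local_changes(master, developer, original, local):
    """Lock the master, copy its latest state, and apply the local changes.

    `original` is the untouched copy taken at checkout and `local` is the
    developer's modified copy; the difference between them is what gets merged.
    """
    master.lock(developer)
    merged = dict(master.objects)            # latest master, copied locally
    for key in set(original) | set(local):
        if key not in local:
            merged.pop(key, None)            # local deletion migrates without warning
        elif original.get(key) != local[key]:
            merged[key] = local[key]         # local addition or modification
    return merged


if __name__ == "__main__":
    master = MasterRepository({"Sales": "v1", "Time": "v1"})
    original = dict(master.objects)                 # checkout baseline
    local = {"Sales": "v2"}                         # edited Sales, deleted Time
    merged = merge_local_changes(master, "dev1", original, local)
    master.publish("dev1", merged, comment="updated Sales")
    print(master.objects, master.history)
```

The model makes the earlier warning concrete: the deletion of `Time` by one developer reaches the master repository silently, so other developers who still depend on it are affected.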
Advantages:
  • History menu option: This option gives the details of all the changes performed during the merge process, providing a version history of the repository.
  • It helps to track the Project history.
Disadvantage:
  • The multi-user development environment is purely for repository development. Dashboards and reports can be developed and managed by any number of users in the browser once the presentation server is up.