Thursday, March 29, 2012

Strategies For Testing Data Warehouse Applications


Introduction:

There is an exponentially increasing cost associated with finding software defects later in the development lifecycle. In data warehousing, this is compounded because of the additional business costs of using incorrect data to make critical business decisions. Given the importance of early detection of software defects, let’s first review some general goals of testing an ETL application:

The following sections describe the common strategies used to test a data warehouse system:
Data completeness: 

Ensures that all expected data is loaded into the target table.

1. Compare record counts between source and target, and check for any rejected records (a sample query is sketched after this list).
2. Verify that data is not truncated in the columns of the target table.
3. Verify that only unique values are loaded into the target; no duplicate records should exist.
4. Perform boundary value analysis (for example, only data for year >= 2008 should be loaded into the target).
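As a hedged illustration (the table and column names source_orders, target_orders, and order_id are hypothetical placeholders, not from any specific project), a completeness check of this kind can be written as plain SQL:

-- Compare record counts between source and target
SELECT COUNT(*) AS source_count FROM source_orders;
SELECT COUNT(*) AS target_count FROM target_orders;

-- List source records that never reached the target (rejected or dropped rows)
SELECT s.order_id
FROM source_orders s
LEFT JOIN target_orders t ON s.order_id = t.order_id
WHERE t.order_id IS NULL;

In practice these checks are usually scripted so they can be rerun after every load.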

Data Quality:

1. Number check: if the source stores numbers with a prefix, for example xx_30, but the target expects only the numeric part (30), validate that the prefix (xx_) is stripped before the value is loaded.

2. Date check: dates must follow the agreed date format, and the format should be consistent across all records (for example, the standard yyyy-mm-dd).

3. Precision Check: Precision value should display as expected in the target table.

Example: the source value is 19.123456, but the target should display it as 19.123, or rounded to 20, as specified.

4. Data check: based on business logic, records that do not meet certain criteria should be filtered out.
Example: only records where date_sid >= 2008 and GLAccount != 'CM001' should be loaded into the target table.

5. Null check: certain columns should contain null values based on the business requirements.
Example: the Termination Date column should be null unless the Active Status column is 'T' or 'Deceased'. (A sample validation query covering rules like this is sketched below.)
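As a sketch only (date_sid, GLAccount, Termination Date, and Active Status come from the examples above, while the table names target_table and employee_dim are hypothetical), such rules can be verified with simple exception queries, where any row returned is a defect:

-- Data check: only date_sid >= 2008 and GLAccount <> 'CM001' should have been loaded
SELECT *
FROM target_table
WHERE date_sid < 2008
   OR GLAccount = 'CM001';

-- Null check: termination_date must be null unless active_status is 'T' or 'Deceased'
SELECT *
FROM employee_dim
WHERE termination_date IS NOT NULL
  AND active_status NOT IN ('T', 'Deceased');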

Note: Data cleanness rules are decided during the design phase.

Data cleanness:

Unnecessary columns should be deleted before loading into the staging area.

1. Example: if a name column carries extra spaces, the spaces must be trimmed before the data is loaded into the staging area; in the mapping this is done with an expression transformation.

2. Example: if the telephone number and the STD code arrive in separate columns but the requirement is a single column, an expression transformation can concatenate the two values into one column.
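A minimal SQL sketch of both cleansing rules (staging_contacts, name, std_code, and phone_number are invented names; in Informatica the same logic would live inside an expression transformation):

SELECT
    TRIM(name)               AS name,        -- remove leading and trailing spaces
    std_code || phone_number AS full_phone   -- concatenate STD code and telephone number
FROM staging_contacts;

Note that || is the standard concatenation operator; some databases use CONCAT() or + instead.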

Data Transformation: verify that all the business logic implemented through the ETL transformations is correctly reflected in the target data.

Integration testing:

Ensures that the ETL process functions well with other upstream and downstream processes.

Example:
1. Downstream: suppose you change the precision of a column in one transformation. For example, if the EMPNO column has a data type of size 16, that precision should be the same in every transformation where the EMPNO column is used.

2. Upstream: if the source is SAP BW, extraction is driven by generated ABAP code that acts as the interface between SAP BW and the mapping. Whenever an existing mapping is modified, the ABAP code must be regenerated in the ETL tool (Informatica); if it is not, incorrect data will be extracted because the ABAP code is out of date.

User-acceptance testing:

Ensures the solution meets users’ current expectations and anticipates their future expectations.
Example: make sure that no values are hard-coded.

Regression testing:

Ensures existing functionality remains intact each time a new release of code is completed.

Conclusion:

Taking these considerations into account during the design and testing portions of building a data warehouse will ensure that a quality product is produced and prevent costly mistakes from being discovered in production.

BI Testing-SQL Performance tuning


Introduction:
Generally, ETL performance testing is a confirmation test to ensure that an ETL system can handle the load of multiple users and transactions. For any project this primarily means ensuring that the system can comfortably manage the throughput of millions of transactions.
You can improve your application performance by optimizing the queries you use. The following sections outline techniques you can use to optimize query performance.
Improve Indexes:
  • Creating useful indexes is one of the most important ways to achieve better query performance. Useful indexes help you find data with fewer disk I/O operations and less system resource usage.
  • To create useful indexes, you must understand how the data is used, the types of queries and the frequencies they run, and how the query processor can use indexes to find your data quickly.
  • When you choose what indexes to create, examine your critical queries, the performance of which will affect the user’s experience most. Create indexes to specifically aid these queries. After adding an index, rerun the query to see if performance is improved. If it is not, remove the index.
  • As with most performance optimization techniques, there are tradeoffs. For example, with more indexes, SELECT queries will potentially run faster. However, DML (INSERT, UPDATE, and DELETE) operations will slow down significantly because more indexes must be maintained with each operation. Therefore, if your queries are mostly SELECT statements, more indexes can be helpful. If your application performs many DML operations, you should be conservative with the number of indexes you create.
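For illustration only (the table sales, the column order_date, and the index name are hypothetical), adding an index for a critical query and removing it again if it does not help looks like this:

-- Index a column that appears in the WHERE clause of a critical query
CREATE INDEX idx_sales_order_date ON sales (order_date);

-- Rerun the critical query and compare timings; if there is no improvement, drop the index
-- (the exact DROP INDEX syntax varies slightly between database engines)
DROP INDEX idx_sales_order_date;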
Choose what to Index:
  • We recommend that you always create indexes on primary keys. It is frequently useful to also create indexes on foreign keys. This is because primary keys and foreign keys are frequently used to join tables. Indexes on these keys let the optimizer consider more efficient index join algorithms. If your query joins tables by using other columns, it is frequently helpful to create indexes on those columns for the same reason.
  • When primary key and foreign key constraints are created, SQL Server Compact 3.5 automatically creates indexes for them and takes advantage of them when optimizing queries. Remember to keep primary keys and foreign keys small. Joins run faster this way.
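A hedged DDL sketch of this advice (table and column names are invented for illustration): declaring the primary keys gives the engine indexes on them automatically, and an explicit index on the foreign key column helps the join.

CREATE TABLE Customers (
    CustomerID   INT PRIMARY KEY,   -- primary key is indexed automatically
    CustomerName VARCHAR(100)
);

CREATE TABLE CustomerOrders (
    OrderID    INT PRIMARY KEY,
    CustomerID INT,
    CONSTRAINT FK_CustomerOrders_Customers
        FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
);

-- An explicit index on the foreign key speeds up joins between the two tables
CREATE INDEX IX_CustomerOrders_CustomerID ON CustomerOrders (CustomerID);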
Use Indexes with Filter Clauses
  • Indexes can be used to speed up the evaluation of certain types of filter clauses. Although all filter clauses reduce the final result set of a query, some can also help reduce the amount of data that must be scanned.
  • A search argument (SARG) limits a search because it specifies an exact match, a range of values, or a conjunction of two or more such items joined by AND. In practice, a SARG compares a column directly with a constant or variable, as in the sketch below.
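As an illustration only (Orders and "Order Date" are assumed sample names), the first filter below is a SARG that an index on "Order Date" can use to seek directly to the matching range, while the second wraps the column in a function and therefore forces a scan:

-- Sargable: the column is compared directly with a constant
SELECT "Order ID" FROM Orders WHERE "Order Date" >= '2008-01-01';

-- Not sargable: applying a function to the column prevents an index seek
SELECT "Order ID" FROM Orders WHERE DATEPART(year, "Order Date") = 2008;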
Understand Response Time Vs Total Time:
  • Response time is the time it takes for a query to return the first record. Total time is the time it takes for the query to return all records. For an interactive application, response time is important because it is the perceived time for the user to receive visual affirmation that a query is being processed. For a batch application, total time reflects the overall throughput. You have to determine what the performance criteria are for your application and queries, and then design accordingly.
Example:
  • Suppose the query returns 100 records and is used to populate a list with the first five records. In this case, you are not concerned with how long it takes to return all 100 records. Instead, you want the query to return the first few records quickly, so that you can populate the list.
  • Many query operations can be performed without having to store intermediate results. These operations are said to be pipelined. Examples of pipelined operations are projections, selections, and joins. Queries implemented with these operations can return results immediately. Other operations, such as SORT and GROUP-BY, require using all their input before returning results to their parent operations. These operations are said to require materialization. Queries implemented with these operations typically have an initial delay because of materialization. After this initial delay, they typically return records very quickly.
  • Queries with response time requirements should avoid materialization. For example, using an index to implement ORDER-BY yields better response time than sorting. The following section describes this in more detail.
Index the ORDER-BY / GROUP-BY / DISTINCT Columns for Better Response Time
  • The ORDER-BY, GROUP-BY, and DISTINCT operations are all types of sorting. The SQL Server Compact 3.5 query processor implements sorting in two ways. If records are already sorted by an index, the processor needs to use only the index. Otherwise, the processor has to use a temporary work table to sort the records first. Such preliminary sorting can cause significant initial delays on devices with lower power CPUs and limited memory, and should be avoided if response time is important.
  • In the context of multiple-column indexes, for ORDER-BY or GROUP-BY to consider a particular index, the ORDER-BY or GROUP-BY columns must match the prefix set of index columns with the exact order. For example, the index CREATE INDEX Emp_Name ON Employees (“Last Name” ASC, “First Name” ASC) can help optimize the following queries:
    • … ORDER BY / GROUP BY “Last Name” …
    • … ORDER BY / GROUP BY “Last Name”, “First Name” …
It will not help optimize:
  • … ORDER BY / GROUP BY “First Name” …
  • … ORDER BY / GROUP BY “First Name”, “Last Name” …
For a DISTINCT operation to consider a multiple-column index, the projection list must match all index columns, although they do not have to be in the exact order. The previous index can help optimize the following queries:
  • … DISTINCT “Last Name”, “First Name” …
  • … DISTINCT “First Name”, “Last Name” …
It will not help optimize:
  • … DISTINCT “First Name” …
  • … DISTINCT “Last Name” …
Rewrite Subqueries to Use JOIN
Sometimes you can rewrite a subquery to use JOIN and achieve better performance. The advantage of creating a JOIN is that you can evaluate tables in a different order from that defined by the query. The advantage of using a subquery is that it is frequently not necessary to scan all rows from the subquery to evaluate the subquery expression. For example, an EXISTS subquery can return TRUE upon seeing the first qualifying row.
Example:
To determine all the orders that have at least one item with a 25 percent discount or more, you can use the following EXISTS subquery:
SELECT "Order ID" FROM Orders O
WHERE EXISTS (SELECT "Order ID"
              FROM "Order Details" OD
              WHERE O."Order ID" = OD."Order ID"
              AND Discount >= 0.25)
You can rewrite this by using JOIN:
SELECT DISTINCT O."Order ID" FROM Orders O
INNER JOIN "Order Details" OD ON O."Order ID" = OD."Order ID"
WHERE Discount >= 0.25
Limit Using Outer JOINs
OUTER JOINs are treated differently from INNER JOINs in the optimizer. It does not try to rearrange the join order of OUTER JOIN tables as it does to INNER JOIN tables. The outer table (the left table in LEFT OUTER JOIN and the right table in RIGHT OUTER JOIN) is accessed first, followed by the inner table. This fixed join order could lead to execution plans that are less than optimal.
Use Parameterized Queries:
  • If your application runs a series of queries that are only different in some constants, you can improve performance by using a parameterized query. For example, to return orders by different customers, you can run the following query:
  • SELECT "Order ID" FROM Orders WHERE "Customer ID" = ?
  • Parameterized queries yield better performance by compiling the query only once and executing the compiled plan multiple times. Programmatically, you must hold on to the command object that contains the cached query plan; destroying the previous command object and creating a new one destroys the cached plan and forces the query to be recompiled. If you must run several parameterized queries in an interleaved manner, you can create several command objects, each caching the execution plan for one parameterized query. This way, you effectively avoid recompilation for all of them.
17 Tips for Avoiding Problematic Queries
1. Avoid Cartesian products
2. Avoid full table scans on large tables
3. Use SQL standards and conventions to reduce parsing
4. Lack of indexes on columns contained in the WHERE clause
5. Avoid joining too many tables
6. Monitor V$SESSION_LONGOPS to detect long running operations
7. Use hints as appropriate
8. Use the SHARED_CURSOR parameter
9. Use the Rule-based optimizer if it is better than the Cost-based optimizer
10. Avoid unnecessary sorting
11. Monitor index browning (due to deletions; rebuild as necessary)
12. Use compound indexes with care (Do not repeat columns)
13. Monitor query statistics
14. Use different tablespaces for tables and indexes (as a general rule; this is old-school somewhat, but the main point is reduce I/O contention)
15. Use table partitioning (and local indexes) when appropriate (partitioning is an extra cost feature)
16. Avoid literals in the WHERE clause (use bind variables instead)
17. Keep statistics up to date
Conclusion
ETL projects today are designed for correct functionality and adequate performance, i.e., to complete within a time window. However, the task of optimizing ETL designs is left to the experience and intuition of the ETL designers. In addition, ETL designs face additional objectives beyond performance.

ETL testing Fundamentals


Introduction:
Comprehensive testing of a data warehouse at every point throughout the ETL (extract, transform, and load) process is becoming increasingly important as more data is being collected and used for strategic decision-making. Data warehouse or ETL testing is often initiated as a result of mergers and acquisitions, compliance and regulations, data consolidation, and the increased reliance on data-driven decision making (use of Business Intelligence tools, etc.). ETL testing is commonly implemented either manually or with the help of a tool (functional testing tool, ETL tool, proprietary utilities). Let us understand some of the basic ETL concepts.
BI / data warehousing testing projects can broadly be divided into ETL (Extract – Transform – Load) testing and the subsequent report testing.
Extract – Transform – Load is the process that enables businesses to consolidate their data while moving it from place to place, i.e., moving data from the source systems into the data warehouse. The data can arrive from any source:
Extract - It can be defined as extracting the data from numerous heterogeneous systems.
Transform - Applying the business logic, as specified by the business, to the data derived from the sources.
Load - Pumping the data into the final warehouse after completing the above two processes. The ETL part of testing mainly deals with how, when, from where, and what data we carry into our data warehouse, from which the final reports are generated. Thus, ETL testing spans each and every stage of data flow in the warehouse, from the source databases to the final target warehouse.
Star Schema
The star schema is perhaps the simplest data warehouse schema. It is called a star schema because the entity-relationship diagram of this schema resembles a star, with points radiating from a central table. The center of the star consists of a large fact table and the points of the star are the dimension tables.
A star schema is characterized by one or more very large fact tables that contain the primary information in the data warehouse, and a number of much smaller dimension tables (or lookup tables), each of which contains information about the entries for a particular attribute in the fact table.
A star query is a join between a fact table and a number of dimension tables. Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other. The cost-based optimizer recognizes star queries and generates efficient execution plans for them. A typical fact table contains keys and measures. For example, in the sample schema, the fact table sales contains the measures quantity sold, amount, and average, and the keys time key, item key, branch key, and location key. The dimension tables are time, branch, item, and location.
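As a sketch built from the table names mentioned above (the measure and attribute columns such as amount, year, and region are assumptions), a typical star query joins the sales fact table to each dimension, but never joins the dimensions to each other:

SELECT t.year, l.region, SUM(s.amount) AS total_amount
FROM sales s
JOIN time     t ON s.time_key     = t.time_key
JOIN item     i ON s.item_key     = i.item_key
JOIN branch   b ON s.branch_key   = b.branch_key
JOIN location l ON s.location_key = l.location_key
WHERE t.year = 2008
GROUP BY t.year, l.region;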
Snow-Flake Schema
The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake. Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data has been grouped into multiple tables instead of one large table.
For example, a location dimension table in a star schema might be normalized into a location table and a city table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance.
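A minimal DDL sketch of that normalization (all names are illustrative): the star schema's single location dimension is split into a location table and a city table, introducing the extra foreign key join that snowflaking implies.

-- Star schema: one denormalized location dimension
CREATE TABLE location_dim (
    location_key INT PRIMARY KEY,
    street       VARCHAR(100),
    city_name    VARCHAR(50),
    state_name   VARCHAR(50),
    country      VARCHAR(50)
);

-- Snowflake schema: city attributes moved into their own table
CREATE TABLE city (
    city_key   INT PRIMARY KEY,
    city_name  VARCHAR(50),
    state_name VARCHAR(50),
    country    VARCHAR(50)
);

CREATE TABLE location (
    location_key INT PRIMARY KEY,
    street       VARCHAR(100),
    city_key     INT REFERENCES city (city_key)  -- extra join introduced by snowflaking
);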
When to use star schema and snowflake schema?
When we refer to Star and Snowflake Schemas, we are talking about a dimensional model for a Data Warehouse or a Datamart. The Star schema model gets its name from the design appearance: there is one central fact table surrounded by many dimension tables. The relationship between the fact and dimension tables is created by a PK -> FK relationship, and the keys are generally surrogates to the natural or business key of the dimension tables. All data for any given dimension is stored in the one dimension table. Thus, the design of the model could potentially look like a star.
On the other hand, the Snowflake schema model breaks the dimension data into multiple tables for the purpose of making the data more easily understood or for reducing the width of the dimension table. An example of this type of schema might be a dimension with Product data of multiple levels. Each level in the Product hierarchy might have multiple attributes that are meaningful only to that level. Thus, one would break the single dimension table into multiple tables in a hierarchical fashion, with the highest level tied to the fact table. Each table in the dimension hierarchy would be tied to the level above by a natural or business key, while the highest level would be tied to the fact table by a surrogate key. As you can imagine, the appearance of this schema design could resemble a snowflake.
Types of Dimensions Tables
Type 1: This is a straightforward refresh. The fields are simply overwritten and no history is kept for the column. For example, should a description change for a product number, the old value is overwritten by the new value.
Type 2: This is known as a slowly changing dimension, as history can be kept. The column(s) in which the history is captured have to be defined. In our example of the product description changing for a product number, if the slowly changing attribute captured is the product description, a new row of data will be created showing the new product description; the old description will still be contained in the old row.
Type 3: This is also a slowly changing dimension. However, instead of a new row, the old product description is moved to an "old value" column in the dimension, while the new description overwrites the existing column. In addition, a date stamp column records when the value was updated. Although there is no full history here, the previous value prior to the update is captured. No new rows are created for history, because only the previous value of the slowly changing attribute is retained.
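Using the product-description example above, a hedged Type 2 sketch (dim_product and its columns are hypothetical, and real implementations usually add surrogate keys and drive this logic from the ETL tool) expires the current row and inserts a new one:

-- Expire the current row for the product whose description changed
UPDATE dim_product
SET    current_flag = 'N',
       effective_end_date = CURRENT_DATE
WHERE  product_number = 'P100'
  AND  current_flag = 'Y';

-- Insert a new row carrying the new description; history is preserved in the expired row
INSERT INTO dim_product
    (product_number, product_description, effective_start_date, effective_end_date, current_flag)
VALUES
    ('P100', 'New product description', CURRENT_DATE, NULL, 'Y');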
Types of fact tables:
Transactional: Most facts will fall into this category. The transactional fact will capture transactional data such as sales lines or stock movement lines. The measures for these facts can be summed together.
Snapshot: A snapshot fact captures the current data at a point in time, for example, all the current stock positions and where items are, in which branch, at the end of a working day.
Snapshot fact measures can be summed for that day, but cannot be summed across multiple snapshot days, as the result would be incorrect.
Accumulative: An accumulative snapshot sums data up for an attribute, and is not based on time. For example, to get the accumulative sales quantity for a sale of a particular product, the value for the row is recalculated each night, giving an "accumulative" value.
Key hit-points in ETL testing: there are several levels of testing that can be performed during data warehouse testing, and they should be defined as part of the testing strategy in the different phases (component, assembly, product) of testing. Some examples include:
1. Constraint Testing: During constraint testing, the objective is to validate unique constraints, primary keys, foreign keys, indexes, and relationships. The test script should include these validation points. Some ETL processes can be developed to validate constraints during the loading of the warehouse. If the decision is made to add constraint validation to the ETL process, the ETL code must validate all business rules and relational data requirements. In Automation, it should be ensured that the setup is done correctly and maintained throughout the ever-changing requirements process for effective testing. An alternative to automation is to use manual queries. Queries are written to cover all test scenarios and executed manually.
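Constraint checks are often written as manual exception queries. A hedged sketch (sales_fact and customer_dim are placeholder names) that returns fact rows whose foreign key has no matching dimension row:

-- Referential integrity: every fact row must point to an existing dimension row
SELECT f.customer_key, COUNT(*) AS orphan_rows
FROM sales_fact f
LEFT JOIN customer_dim d ON f.customer_key = d.customer_key
WHERE d.customer_key IS NULL
GROUP BY f.customer_key;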
2. Source to Target Counts: The objective of the count test scripts is to determine if the record counts in the source match the record counts in the target. Some ETL processes are capable of capturing record count information such as records read, records written, records in error, etc. If the ETL process used can capture that level of detail and create a list of the counts, allow it to do so. This will save time during the validation process. It is always a good practice to use queries to double check the source to target counts.
3. Source to Target Data Validation: No ETL process is smart enough to perform source to target field-to-field validation. This piece of the testing cycle is the most labor intensive and requires the most thorough analysis of the data. There are a variety of tests that can be performed during source to target validation. Below is a list of tests that are best practices:
4. Transformation and Business Rules: Tests to verify all possible outcomes of the transformation rules, default values, straight moves and as specified in the Business Specification document. As a special mention, Boundary conditions must be tested on the business rules.
5. Batch Sequence & Dependency Testing: ETLs in a DW are essentially a sequence of processes that execute in a particular order. Dependencies exist among the various processes, and respecting them is critical to maintaining the integrity of the data. Executing the sequence in the wrong order can result in inaccurate data in the warehouse. The testing process must include at least two iterations of the end-to-end execution of the whole batch sequence, and data must be checked for integrity during this testing. The most common types of errors caused by an incorrect sequence are referential integrity failures, incorrect end-dating (if applicable), rejected records, etc.
6. Job Restart Testing: In a real production environment, ETL jobs/processes fail for a number of reasons (for example, database-related failures, connectivity failures, etc.), and a job can fail when it is only partly executed. A good design always allows the jobs to be restarted from the point of failure. Although this is more of a design suggestion/approach, it is recommended that every ETL job is built and tested for restart capability.
7. Error Handling: A script that fails during data validation may still confirm, through process validation, that the ETL process is working as designed. During process validation the testing team works to identify additional data cleansing needs, as well as consistent error patterns that could possibly be diverted by modifying the ETL code. It is the responsibility of the validation team to identify any and all records that seem suspect. Once a record has been both data and process validated and the script has passed, the ETL process is functioning correctly. Conversely, if suspect records identified and documented during data validation are not supported through process validation, the ETL process is not functioning correctly.
8. Views: Views created on the tables should be tested to ensure the attributes mentioned in the views are correct and the data loaded in the target table matches what is being reflected in the views.
9. Sampling: Sampling will involve creating predictions out of a representative portion of the data that is to be loaded into the target table; these predictions will be matched with the actual results obtained from the data loaded for business Analyst Testing. Comparison will be verified to ensure that the predictions match the data loaded into the target table.
10. Process Testing: The testing of intermediate files and processes to ensure the final outcome is valid and that performance meets the system/business need.
11. Duplicate Testing: Duplicate testing must be performed at each stage of the ETL process and in the final target table. This testing involves checks for duplicate rows and checks for multiple rows with the same primary key, neither of which can be allowed.
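A minimal sketch of such a check (target_table and order_id are placeholders; in practice the GROUP BY lists the full business or primary key): any group returned indicates a duplicate that must be investigated.

-- Find multiple rows sharing the same key
SELECT order_id, COUNT(*) AS row_count
FROM target_table
GROUP BY order_id
HAVING COUNT(*) > 1;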
12. Performance: It is the most important aspect after data validation. Performance testing should check if the ETL process is completing within the load window.
13. Volume: Verify that the system can process the maximum expected quantity of data for a given cycle in the time expected.
14. Connectivity Tests: As the name suggests, this involves testing the upstream and downstream interfaces and intra-DW connectivity. It is suggested that the testing represents the exact transactions between these interfaces. For example, if the design approach is to extract files from a source system, we should actually test extracting a file out of that system and not just the connectivity.
15. Negative Testing: Negative testing checks whether the application fails where it should fail, using invalid inputs and out-of-boundary scenarios, and verifies how the application behaves in those cases.
16. Operational Readiness Testing (ORT): This is the final phase of testing which focuses on verifying the deployment of software and the operational readiness of the application. The main areas of testing in this phase include:
Deployment Test
1. Tests the deployment of the solution
2. Tests overall technical deployment “checklist” and timeframes
3. Tests the security aspects of the system, including user authentication, authorization, and user-access levels.
Conclusion
Evolving needs of the business and changes in the source systems will drive continuous change in the data warehouse schema and the data being loaded. Hence, it is necessary that development and testing processes are clearly defined, followed by impact-analysis and strong alignment between development, operations and the business.

Tuesday, March 20, 2012

BusinessObjects Administration – Custom Access Levels

Hi BOOglers,
Another interesting feature in Business Objects, custom access levels, is the topic of discussion for this blog. Please note that custom access levels were introduced only from Business Objects 3.0 onwards.
As you all know, access levels are groups of rights that users frequently need. They allow administrators to set common security levels quickly and uniformly, rather than setting individual rights one by one. Business Objects comes with several predefined access levels. Beginning with View and ending with Full Control, each access level builds upon the rights granted by the previous level. You can also create and customize your own access levels, which greatly reduces the administrative and maintenance costs associated with security.
Predefined Access Levels
Below are the four predefined access levels and the rights associated with each.
View – If set on the folder level, a principal can view the folder, objects within the folder, and each object’s generated instances.
Rights involved:
  • View objects
  • View document instances
Schedule – A principal can generate instances by scheduling an object to run against a specified data source once or on a recurring basis. The principal can view, delete, and pause the scheduling of instances that they own.
Rights involved: View access level rights, plus:
  • Schedule the document to run
  • Print the report’s data
  • Edit objects that the user owns
View On Demand – A principal can refresh data on demand against a data source.
Rights involved: Schedule access level rights, plus:
  • Refresh the report’s data
Full Control – A principal has full administrative control of the object.
Rights involved: All available rights





* Principal refers to a user group or a user
Access levels in CMC
Advanced rights
Rights option – Description
  • Granted – The right is granted to a principal.
  • Denied – The right is denied to a principal.
  • Not Specified – The right is unspecified for a principal. By default, rights set to Not Specified are denied.
  • Apply to Object – The right applies to the object. This option becomes available when you click Granted or Denied.
  • Apply to Sub Objects – The right applies to sub-objects. This option becomes available when you click Granted or Denied.
Custom Access Levels
Consider a situation in which an administrator must manage two groups, Marketing managers and Marketing employees. Both groups need to access ten reports in the Business Objects Enterprise system, but Marketing managers require more rights than marketing employees. The predefined access levels do not meet the needs of either group. Instead of adding groups to each report as principals and modifying their rights in ten different places, the administrator can create two new access levels, Marketing Managers and Marketing Employees. The administrator then adds both groups as principals to the reports and assigns the groups their respective access levels. When rights need to be modified, the administrator can modify the access levels. Because the access levels apply to both groups across all ten reports, the rights those groups have to the reports are automatically updated.
We can create a new custom access level either from scratch or by copying an existing access level. We can also add or remove sets of rights from an existing custom access level.
Right-click on the custom access level and select Included Rights.
In the screen that appears, select the appropriate rights as per the requirement, then click OK to complete.
Finally you can assign the Custom Access level against each User group/User on a particular folder.
Administrators get the most benefit out of this because they can move away from the traditional rights assignment using the Advanced Rights option. It is also easier to manage rights when they are grouped together.
Feel free to leave your comments. Thanks for reading! 

Thursday, March 15, 2012

Automation Tool Selection Recommendation


  • Overview
  • Information Gathering
  • Tools and Vendors
  • Evaluation Criteria
  • Tools Evaluation
  • Matrix
  • Conclusion
  • Overview
“Automated Testing” means automating the manual testing process currently in use. This requires that a formalized “manual testing process” currently exists in the company or organization. Minimally, such a process includes:

–        Detailed test cases, including predictable “expected results”, which have been developed from Business Functional Specifications and Design documentation.

–        A standalone Test Environment, including a Test Database that is restorable to a known constant, such that the test cases are able to be repeated each time there are modifications made to the application.

Information Gathering

The following are sample questions asked of testers who have been using some of the testing tools:

How long have you been using this tool and are you basically happy with it?

How many copies/licenses do you have and what hardware and software platforms are you using?

How did you evaluate and decide on this tool and which other tools did you consider before purchasing this tool?

How does the tool perform and are there any bottlenecks?

What is your impression of the vendor (commercial professionalism, on-going level of support, documentation and training)?

Tools and Vendors
  • Robot – Rational Software
  • WinRunner 7 – Mercury
  • QA Run 4.7 – Compuware
  • Visual Test – Rational Software
  • Silk Test – Segue
  • QA Wizard – Seapine Software
Tools Overview

Robot – Rational Software

–        IBM Rational Robot v2003 automates regression, functional and configuration testing for e-commerce, client/server and ERP Applications. It’s used to test applications constructed in a wide variety of IDEs and languages, and ships with IBM Rational TestManager. Rational TestManager provides desktop management of all testing activities for all types of testing.

WinRunner 7 – Mercury

–        Mercury WinRunner is a powerful tool for enterprise wide functional and regression testing.

–        WinRunner captures, verifies, and replays user interactions automatically to identify defects and ensure that business processes work flawlessly upon deployment and remain reliable.

–        WinRunner allows you to reduce testing time by automating repetitive tasks and optimize testing efforts by covering diverse environments with a single testing tool.

QA Run 4.7 – Compuware

–        With QA Run, programmers get the automation capabilities they need to quickly and productively create and execute test scripts, verify tests and analyze test results.

–        Uses an object-oriented approach to automate test script generation, which can significantly increase the accuracy of testing in the time you have available.

Visual Test 6.5 – Rational Software

–        Based on the BASIC language and used to simulate user actions on a User Interface.

–        Is a powerful language providing support for pointers, remote procedure calls, working with advanced data types such as linked lists, open-ended hash tables, callback functions, and much more.

–        Is a host of utilities for querying an application to determine how to access it with Visual Test, screen capture/comparison, script executor, and scenario recorder.

Silk Test – Segue

–        Is an automated tool for testing the functionality of enterprise applications in any environment.

–        Designed for ease of use, Silk Test includes a host of productivity-boosting features that let both novice and expert users create functional tests quickly, execute them automatically and analyze results accurately.

–        In addition to validating the full functionality of an application prior to its initial release, users can easily evaluate the impact of new enhancements on existing functionality by simply reusing existing test cases.

QA Wizard – Seapine Software

–        Completely automates the functional regression testing of your applications and Web sites.

–        It’s an intelligent object-based solution that provides data-driven testing support for multiple data sources.

–        Uses scripting language that includes all of the features of a modern structured language, including flow control, subroutines, constants, conditionals, variables, assignment statements, functions, and more.

Evaluation Criteria

  • Record and Playback
  • Object Mapping
  • Web Testing
  • Object Identity Tool
  • Environment Support
  • Extensible Language
  • Cost
  • Integration
  • Ease of Use
  • Image Testing
  • Database Tests
  • Test/Error Recovery
  • Data Functions
  • Object Tests
  • Support

Rating scale: 3 = Basic, 2 = Good, 1 = Excellent

Tool Selection Recommendation

Tool evaluation and selection is a project in its own right.

It can take between 2 and 6 weeks. It will need team members, a budget, goals and timescales.
There will also be people issues i.e. “politics”.

Start by looking at your current situation
– Identify your problems
– Explore alternative solutions
– Realistic expectations from tool solutions
– Are you ready for tools?

Make a business case for the tool

–What are your current and future manual testing costs?
–What are initial and future automated testing costs?
–What return will you get on investment and when?

Identify candidate tools

– Identify constraints (economic, environmental, commercial, quality, political)
– Classify tool features into mandatory & desirable
– Evaluate features by asking questions to tool vendors
– Investigate tool experience by asking questions to other tool users
– Plan and schedule in-house demonstrations by vendors
– Make the decision

Choose a test tool that best fits the testing requirements of your organization or company.

An “Automated Testing Handbook” is available from the Software Testing Institute (www.ondaweb.com/sti), which covers all of the major considerations involved in choosing the right test tool for your purposes.


Wednesday, March 7, 2012

Performance Counters And Their Values For Performance Analysis


Performance Counters:
Performance counters are used to monitor system components such as processors, memory, network, and I/O devices. Performance counters are organized and grouped into performance counter categories. For instance, the processor category contains all counters related to the operation of the processor, such as processor time, idle time, interrupt time, and so forth. If performance counters are used in the application, they can publish performance-related data that can be compared against acceptable criteria.
The number of counter parameters to be considered by the load tester/designers greatly varies based on the type and size of the application to be tested. Some of the Performance Counters and their Threshold values for Hexaware Performance Analysis are as follows:
Memory Counters:
Memory: Available Mbytes –This describes the amount of physical RAM available to processes running on the system.
Threshold to watch for:
An Available MBytes value consistently below 20 to 25 percent of installed RAM is an indication of insufficient memory. Values below 100 MB may indicate memory pressure.
Note: This counter displays the last observed value only. It is not an average.
Memory: Pages/sec - Indicates the rate at which pages are read from or written to disk to resolve hard page faults.
Threshold to watch for:
Pages/sec consistently higher than 5 indicates a possible memory bottleneck.
Process: Private Bytes: _Total - Indicates the current allocation of memory that cannot be shared with other processes. This counter can be used to identify memory leaks in .NET applications.
Process: Working Set: _Total - This is the amount of physical memory being used by all processes combined. If the value for this counter is significantly below the value for Process: Private Bytes: _Total, it indicates that processes are paging too heavily. A difference of more than 10% is probably significant.
Processor Counters:
% Processor Time _Total instance - Percentage of elapsed time the CPU is busy executing a non-idle thread (an indicator of processor activity).
Threshold to watch for:
% Processor Time sustained at or over 85 percent may indicate that processor performance (for that load) is the limiting factor.
% Privileged Time - Percentage of time threads run in privileged mode (file or network I/O, or memory allocation).
Threshold to watch for:
% Privileged Time consistently over 75 percent indicates a bottleneck.
Processor Queue Length - The number of tasks ready to run but waiting for a processor.
Threshold to watch for:
Processor Queue Length greater than 2 indicates a bottleneck.
Note: High values may not necessarily be bad for % Processor Time. However, if other processor-related counters such as % Privileged Time or Processor Queue Length are increasing linearly, the high CPU utilization may be worth investigating.
  • Less than 60% consumed = Healthy
  • 61% – 90% consumed = Monitor or Caution
  • 91% – 100% consumed = Critical or Out of Spec
System\Context Switches/sec - A context switch occurs when a higher-priority thread preempts a lower-priority thread that is currently running; a high rate can indicate that too many threads are competing for processor time. If processor utilization is not high and very low levels of context switching are seen, it could indicate that threads are blocked.
Threshold to watch for:
As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, then there is a constraint.
Disk Counters:
Physical Disk (instance)\Disk Transfers/sec
To monitor disk activity, we can use this counter. When the measurement goes above 25 disk I/Os per second, response time for the disk is poor (which may well translate into a potential bottleneck). To further uncover the root cause, we use the counter mentioned next.
Physical Disk (instance)\% Idle Time
This counter measures the percentage of time that the hard disk is idle during the measurement interval. If this counter falls below 20%, read/write requests are likely queuing up for a disk that is unable to service them in a timely fashion. In that case it is time to upgrade the hardware to use faster disks, or to scale out the application to better handle the load.
Avg. Disk sec/Transfer - The number of seconds it takes to complete one disk I/O.
Avg. Disk sec/Read - The average time, in seconds, of a read of data from the disk.
Avg. Disk sec/Write - The average time, in seconds, of a write of data to the disk.
  • Less than 10 ms – very good
  • Between 10 and 20 ms – okay
  • Between 20 and 50 ms – slow, needs attention
  • Greater than 50 ms – serious I/O bottleneck
Note: These three counters should consistently have values of approximately .020 (20 ms) or lower and should never exceed .050 (50 ms).
Source: Microsoft
Network Counters:
Network Interface: Output Queue Length - This is the number of packets in queue waiting to be sent. A bottleneck needs to be resolved if there is a sustained average of more than two packets in a queue.
Threshold to watch for:
If it is greater than 3 for 15 minutes or more, the NIC (Network Interface Card) is a bottleneck.
Network Segment: %Network Utilization - % of network bandwidth in use on this segment.
Threshold to watch for:
For Ethernet networks, if the value is consistently above 50%–70%, this segment is becoming a bottleneck.
Conclusion: These values may not be exact threshold limits, but they provide a useful baseline to consider during performance analysis.