Why Performance Engineering

    Most organizations report that they are not satisfied with the performance of their business-critical applications:
  • Application performance impacts corporate revenues by 9%
  • High resource utilization
  • Inability to scale to support more users accessing the application
  • System availability challenges
  • Poor response times
About Performance
Performance engineering is a structured approach to validate whether an application or product meets its non-functional requirements (NFRs): response time under the expected number of concurrent users, hits per second, transactions per second (TPS), resource utilization within acceptable limits, scalability, availability, etc.

If any transaction's response time does not meet the defined NFRs, the performance engineering approach is adopted to identify the bottleneck, find its root cause, and provide a solution without impacting transaction or application functionality (the performance team holds multiple rounds of discussion with the development team to ensure functionality does not break as a result of the performance fix).

Drivers for Performance Engineering
Services offered

    Performance Testing / Benchmarking of Application / Product

  • Identify the WLM and execute the performance tests as per the defined load patterns
  • In this service, bottleneck identification and root cause analysis are out of scope

    Performance Assessment

  • In addition to performance benchmarking, identify bottlenecks and their root causes if problems surface during performance testing (e.g. slow response times, high resource utilization, out-of-memory exceptions, disk I/O errors)
  • Provide recommendations to improve response times and resource utilization
  • In turn, the performance engineering team helps the application run in production in compliance with its NFRs
    Scalability Assessment
  • Before starting a scalability assessment, the performance team must ensure that performance tuning has been completed for all identified business-critical transactions
  • In addition to the performance assessment, identify the hardware scalability limits by executing various sets of tests
  • Once the number of concurrent users reaches the hardware limits, check whether the concurrent-user count can be increased further by adjusting configuration parameters, then publish the final scalability limits
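As an illustration, the concurrent-user limit found in such tests can be sanity-checked with Little's law. A minimal sketch; the throughput, response time and think time below are hypothetical placeholders, not measured values.

```python
# Little's law for a closed system: N = X * (R + Z), where
# X = throughput (tps), R = response time (s), Z = user think time (s).
# All figures here are invented for illustration.

def supported_users(throughput_tps: float, response_time_s: float,
                    think_time_s: float) -> float:
    """Concurrent users the system sustains at the given throughput."""
    return throughput_tps * (response_time_s + think_time_s)

# Example: 50 tps at a 2 s response time with a 10 s think time
users = supported_users(50.0, 2.0, 10.0)
print(round(users))  # 600 concurrent users under these assumptions
```

If measured throughput stops growing while response time climbs, the hardware limit has been reached and configuration tuning is the next lever.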
    Capacity Assessment based on future growth projections
  • Identify the business-critical transactions and their WLM
  • Once the WLM is identified, first tune all business-critical transactions
  • Perform a scalability assessment on the existing hardware
  • Capture the application's future growth projections (at least 3 years) for the attributes below:
    • # of transactions and their complexity
    • # of concurrent users and expected response times
    • DB volumes
    • Major releases planned
    • Approximate growth in business-critical transactions (to be added to the existing WLM, with complexity assumed from existing scenarios)
  • Using performance modelling techniques, apply the service demand and utilization laws to the following and derive the future hardware requirements:
    • CPU, memory, disk, etc.
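The service demand and utilization laws mentioned above can be sketched in a few lines. The utilization, throughput and growth figures below are invented for illustration, not from a real assessment.

```python
# Utilization law: U = X * D, so the per-transaction service demand is
# D = U / X. Projecting future utilization is then D * future throughput.
# All numbers here are illustrative placeholders.

def service_demand(utilization_pct: float, throughput_tps: float) -> float:
    """Seconds of a resource consumed per transaction (D = U / X)."""
    return (utilization_pct / 100.0) / throughput_tps

def projected_utilization(demand_s: float, future_tps: float) -> float:
    """Expected utilization (%) at a projected future throughput."""
    return demand_s * future_tps * 100.0

d_cpu = service_demand(40.0, 50.0)              # 40% CPU at 50 tps
u_future = projected_utilization(d_cpu, 120.0)  # 3-year projection: 120 tps
print(f"CPU demand {d_cpu:.4f} s/txn, projected {u_future:.0f}% utilization")
# A common rule of thumb: above ~70-80% utilization, plan extra capacity.
```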
Generic Approach

Activities for Performance Engineering Phases

  • The performance engineering assessment is divided into 6 phases and can be performed in 2 ways

    • 1. Pro-active performance engineering approach
      • i. The performance engineering team is involved from the beginning of the development cycle, from requirements gathering to production roll-out

    • 2. Re-active performance engineering approach
      • i. If the application is already in production and facing performance, scalability or availability issues, these can be addressed by monitoring directly in production or by reproducing the issue in a performance test environment
        • a. In the re-active approach, the performance engineering team may need to capture thread dumps and heap dumps (on a need basis) to identify bottlenecks and perform root cause analysis
    However, the detailed activities below are customized for each performance engineering assessment depending on the type of application and the chosen approach.


Detailed Activities for Performance Assessment

NFR Gathering and Analysis
Design
Development
Execution
Analysis
Reporting
  • a. Capture and document the non-functional requirements
  • b. Capture the workload patterns or workload model
    • i. The workload model (WLM) can be captured in 2 ways:
      • 1. Directly from business users, application stakeholders, etc. if the application is new and development is in progress
      • 2. If the application is already in production, the performance engineering team can analyze the application access logs to identify the most frequently used business-critical transactions and the # of concurrent users for each transaction
    • ii. The WLM is used for all further performance testing and engineering activities
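A minimal sketch of deriving a workload mix from production access logs, assuming common-log-format lines; the sample entries and URLs below are made up for illustration.

```python
# Count requests per URL in web access logs and compute each transaction's
# share of the workload mix. Log lines follow the common log format; the
# sample entries are invented, not real traffic.
from collections import Counter

sample_log = [
    '10.0.0.1 - - [01/Jan/2024:10:00:01] "GET /login HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:10:00:02] "POST /order HTTP/1.1" 200 1024',
    '10.0.0.1 - - [01/Jan/2024:10:00:03] "POST /order HTTP/1.1" 200 1024',
    '10.0.0.3 - - [01/Jan/2024:10:00:04] "GET /search HTTP/1.1" 200 2048',
]

def workload_mix(lines):
    # The request URL is the second token inside the quoted request string.
    hits = Counter(line.split('"')[1].split()[1] for line in lines)
    total = sum(hits.values())
    return {url: round(100.0 * n / total, 1) for url, n in hits.most_common()}

print(workload_mix(sample_log))
# {'/order': 50.0, '/login': 25.0, '/search': 25.0}
```

The resulting percentages, combined with observed concurrency, feed directly into the WLM used for scripting and scenario design.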
  • A. Prepare a performance test strategy document containing the details below:
    • 1. The workload patterns or workload model
    • 2. A PoC to finalize the testing tool, within the customer's budget
    • 3. Validation of the application environment's hardware and resources; it should be equivalent to production or a known scaled-down version of production
    • 4. The tools for monitoring the application technology stack
      • Ex: native utilities such as JConsole for Java, Perfmon for Windows or .NET applications, vmstat / iostat for Linux or UNIX operating systems, etc.
    • 5. The tool for profiling slow-responding transactions
      • a. Ex: JProfiler, JProbe, NetBeans for Java code profiling
      • b. dotTrace, ANTS Profiler for .NET application code profiling
      • c. XHProf / Xdebug for PHP-based applications, etc.
    • 6. The schedule, based on the development and UAT plans
      • a. The schedule varies with the performance engineering approach (pro-active vs. re-active)
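The OS-level monitoring called out above (vmstat-style CPU sampling) can be approximated in a few lines when no tooling is installed. This sketch assumes a Linux host with /proc/stat available; it is a stand-in, not a replacement for the named utilities.

```python
# Sample overall CPU utilization by reading /proc/stat twice, the same
# counters vmstat reports. Linux-only; assumes the standard /proc layout.
import time

def cpu_utilization(interval: float = 0.5) -> float:
    """Percent CPU busy over the sampling interval."""
    def snap():
        with open("/proc/stat") as f:
            fields = [int(x) for x in f.readline().split()[1:]]
        idle = fields[3] + fields[4]  # idle + iowait ticks
        return idle, sum(fields)

    idle1, total1 = snap()
    time.sleep(interval)
    idle2, total2 = snap()
    dt = total2 - total1
    return 100.0 * (1 - (idle2 - idle1) / dt) if dt else 0.0

print(f"CPU utilization ~{cpu_utilization():.1f}%")
```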
    • a. Prepare the checklist for performance test execution
    • b. Validate the transaction navigation flows as per the WLM
    • c. Create performance test scripts for the identified navigation flows
    • d. Create performance test data with the help of the QA / development teams
    • e. Perform multiple dry-run iterations in the identified performance test environment to validate the test scripts and test data, ensuring dynamic data is handled properly
    • f. Create the performance test scenarios
    • g. Deploy monitoring tools / utilities as per the technology stack
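Before full tool-based scripting, a dry run like step e can be mimicked with a throwaway script. This sketch spins up a local stand-in server (not the real application) and drives a few concurrent virtual users, recording response times; it is a simplified analogue of a LoadRunner/JMeter dry run, not a substitute for it.

```python
# Drive 5 concurrent virtual users against a local test server and record
# per-request response times. The local HTTP server is a stand-in for the
# application under test.
import threading, time, urllib.request
from http.server import HTTPServer, SimpleHTTPRequestHandler
from statistics import mean

class QuietHandler(SimpleHTTPRequestHandler):
    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), QuietHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

timings, lock = [], threading.Lock()

def virtual_user(iterations: int = 3):
    for _ in range(iterations):
        start = time.perf_counter()
        urllib.request.urlopen(url).read()
        with lock:
            timings.append(time.perf_counter() - start)

users = [threading.Thread(target=virtual_user) for _ in range(5)]
for u in users:
    u.start()
for u in users:
    u.join()
server.shutdown()

print(f"{len(timings)} requests, avg {mean(timings) * 1000:.1f} ms")
```

A dry run like this quickly exposes scripting problems (correlation of dynamic data, exhausted test data) before formal execution begins.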
  • (Execution and analysis are conducted for 2 to 3 iterations, depending on bottleneck identification and resolution)
    • Verify that the application has been deployed correctly (the QA team certifies the functionality)
    • Validate that the performance test environment has sufficient data
    • Configure the workload pattern designed during the design phase
    • Turn on the server monitoring tools
    • Execute performance tests as per the scope
      • i. Baseline test
      • ii. Scalability test to identify hardware scalability limits based on concurrent users
      • iii. Endurance test of at least 8 hours to assess system stability
    • Monitor and collect server-side metrics during test execution (if the performance engineering team cannot get access, the client's administrators can enable monitoring and share the data)
      • The following high-level monitoring metrics are captured and reported for the performance tests
      • i. Client-side metrics
        • 1. # of concurrent users
        • 2. Transaction response times
        • 3. Hits / second for the test
      • ii. Server-side metrics
        • 1. OS resources - CPU, memory, disk I/O, etc.
        • 2. Application server metrics - session count, connection pool usage, heap usage, # of GC collections, etc.
        • 3. DB server metrics - slow queries, buffer hit ratio, table locks, indexes, wait events, etc.
      • iii. Batch job metrics
        • 1. Total number of records processed per batch job
        • 2. Total time taken for batch job processing
        • 3. Server resource usage during batch job execution
        • 4. Total number of failed / error records during batch execution and the reasons for the failures
        • 5. Whether batch job execution time is within acceptable limits
          • If not, profile the job and tune any bottlenecks
        • 6. Prepare and present initial test reports and results as each test execution completes
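The batch-job metrics above reduce to a few ratios. A minimal sketch; the record counts, elapsed time and SLA below are invented run figures.

```python
# Derive batch-job report figures (throughput, failure rate, SLA check)
# from raw run data. All input numbers are illustrative placeholders.

def batch_report(total_records: int, failed_records: int,
                 elapsed_s: float, sla_s: float) -> dict:
    processed = total_records - failed_records
    return {
        "throughput_rps": round(processed / elapsed_s, 2),
        "failure_pct": round(100.0 * failed_records / total_records, 2),
        "within_sla": elapsed_s <= sla_s,
    }

# Example: 100k records, 250 failures, 30-minute run against a 1-hour SLA
print(batch_report(100_000, 250, 1800, 3600))
# {'throughput_rps': 55.42, 'failure_pct': 0.25, 'within_sla': True}
```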
  • A. Correlate performance test results with resource utilization metrics and identify bottleneck areas
  • B. Identify slow-responding transactions
    • a. Set up profiling tools to profile the slow transactions and identify the root cause of the slowness
    • b. Identify the offending code blocks and conduct a code walkthrough with the development team to fix the performance issue without impacting functionality
  • Provide recommendations to the development team for implementation
  • Execute another round of performance tests to validate the response time improvement
  • a. Analyze and report test results after each execution
  • b. Prepare and present the overall performance analysis report to stakeholders
  • c. Certify the application for production implementation and provide the scalability limits of the existing hardware
  • d. On a need basis, provide capacity requirements based on future growth projections
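Response-time figures quoted in these reports are typically averages plus high percentiles, since averages alone hide slow outliers. A minimal nearest-rank percentile sketch; the sample timings are invented.

```python
# Summarize transaction response times the way test reports usually quote
# them: average plus 90th/95th percentiles. Uses the simple nearest-rank
# percentile convention; sample timings (ms) are invented.
import math

def percentile(sorted_samples, p):
    """Nearest-rank percentile over an ascending-sorted sample list."""
    idx = max(0, math.ceil(p / 100 * len(sorted_samples)) - 1)
    return sorted_samples[idx]

samples = sorted([120, 135, 150, 160, 180, 200, 240, 300, 450, 900])
report = {
    "avg_ms": round(sum(samples) / len(samples), 1),
    "p90_ms": percentile(samples, 90),
    "p95_ms": percentile(samples, 95),
}
print(report)  # {'avg_ms': 283.5, 'p90_ms': 450, 'p95_ms': 900}
```

Note how the single 900 ms outlier barely moves the average but dominates the p95, which is why NFRs are usually stated as percentiles.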
Challenges in Performance Engineering
    • Budget for performance testing and engineering of the application or software product
    • Limited knowledge of, and importance given to, non-functional requirements
    • A "fix it when the problem comes" attitude; lack of early problem identification
    • Invalid or low-volume test data and improper scripting
    • Neglecting batch jobs and other maintenance jobs in the workload characteristics
    • Neglecting network bandwidth when developing rich Internet applications: large page sizes lead to high network bandwidth usage and a high number of HTTP round trips
    • Improper workload characteristics and concurrent-user definitions
    • Lack of knowledge of performance best practices during the application development phase
    Deliverables
    S No | Deliverable (Document / PPT) | Content
    1 | NFR Document | Performance test requirements, including environment, architecture, hardware and business processes
    2 | Performance Test Strategy | Overall test strategy, process, scenarios, assumptions, support required from the customer, etc.
    3 | Test Scripts (using LoadRunner / JMeter / NeoLoad / Silk Performer, etc.) | As per the identified business-critical scenarios
    4 | Performance Test Results | Performance test results report after each test execution
    5 | Final Performance Assessment Report | Overall performance test results; performance benchmark report / performance before and after tuning
    6 | Status Reports | Weekly status report containing weekly progress, issues, dependencies and risks (if any)
    Tools Expertise
    Monitors