LoadRunner CI

LoadRunner CI in earnest - bringing it all together. Latest updates

I've spent the last year with HP (and Infuse) trying to get the all-new LoadRunner CI working. This page finally brings it all together and I'm now up and running. For context, you might want to read LR Slave, LR CI Graphs and maybe look at LR Rules OK before delving into this page.

In practice there are lots of details to get right before a CI performance test process will work for you. Here I have developed a simple solution for a specific project we have in a full commercial environment. We want CI to run regularly to pick up any performance regressions between builds and we want a simple pass/fail answer. This is not actually straightforward.

I'll be building on my Jenkins solution, which has been in use for a couple of years now, but in some ways LoadRunner is a bit more clunky. Of course it does bring a lot more functionality with it than JMeter, so this is perhaps to be expected. And with a few bash scripts, we get nice click-through reports and trend graphs.

For this project my main approach is to replay production log files. The big advantage of this is we get the correct cache hit ratios AND, particularly for my current employer, the end points are kept up to date. Here, we have constantly rolling online content: URLs go out of scope and new ones come online throughout the day, on top of the weekly, monthly etc. changes.

This means I can't just run the files. I need to make sure each end point works, otherwise any error conditions can get swamped by expected error rates. My overall process is as follows:

    1. Build the performance test environment (we have Jenkins scripts to do this)

    2. Build any test tool images needed in AWS (currently I leave these up as they are in constant use, but I do have spark up and tear down scripts). See here.

    3. Run a "check_url.lrs" LoadRunner scenario. This checks the production log file endpoints and puts any known good ones in VTS (an in-memory DB)

    4. Run the performance test using the data from VTS

    5. Tear down the test tools if required

    6. If the test passes, tear down the performance test environment. If we get a failure we leave the environment up so it can be investigated.

On top of this, we want to control the builds from our main company Jenkins server and have the load test running on our own AWS LoadRunner controllers. The VTS in-memory DB needs to be accessible from all the LoadRunner AWS injectors and of course all the results need to come back to the company Jenkins server as a set of nice trend graphs and click through test reports.

One other thing that will come up shortly will be fitting all our CI performance tests into our calendar. Although each test can have its own test environment, there are a few services that are shared between them, and of course there are some projects that need to run together, all depending on how the shared resources work in production and how the application peak activities cross over.

Let's have some pictures now!

We have a whole series of jobs in Jenkins, step 1 build env, step 2 run test, step 3 tear down:

image00212

We start the process off with a build environment job:

image00309

For our daily CI, these are set off periodically, e.g.:

image00410

If successful, the build job will call the check_url LoadRunner scenario job. However, you first have to set up your LoadRunner controller slave and connect it to the main Jenkins master (see here).

image00806
image00905
image029

Once set up, we can call the LoadRunner jobs on this slave. First the check url job:

image00612

NOTE: for this script there is an important plugin setting: Errors to ignore. The check url script will produce a lot of errors, and the plugin will set the build to fail even on a single error unless you add a few entries in the "Errors to Ignore" list under "LoadRunner specific settings". AND, note: the help is wrong. You need to add the text you want to look for WITHOUT quotes. In this case I want to ignore all errors. I do still set SLAs on error rates and throughput to decide whether this step has passed. See below.

image01502

We do still need to check that this step passes. For example, if the site was down or we didn't get enough data to run the test, we wouldn't want to continue. With this in mind, I set a throughput and an acceptable error rate. I am going to run the script at 50tps so I want to be sure I reach at least 35tps AND I only want to accept error rates up to 25tps. Beyond this I probably won't have enough data to work with and will need to investigate, probably renewing my production log files. This is all set within the LoadRunner SLA dialog:

image01602
image017
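The pass rule for the check URL step can be written out as a tiny predicate. This is only a sketch of the arithmetic: in the real setup LoadRunner evaluates the SLAs itself, and the function name and constants below are hypothetical, taken from the 35tps and 25tps figures quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Thresholds quoted in the text for a 50tps check_url run. */
#define MIN_THROUGHPUT_TPS 35.0
#define MAX_ERROR_RATE_TPS 25.0

/* Hypothetical helper: returns true when the check URL step gathered
 * enough good data to carry on to the main test run. */
bool check_url_step_passed(double avg_tps, double error_rate_tps)
{
    assert(avg_tps >= 0.0 && error_rate_tps >= 0.0);
    return avg_tps >= MIN_THROUGHPUT_TPS
        && error_rate_tps <= MAX_ERROR_RATE_TPS;
}
```

Both limits are inclusive here, which is an assumption; the SLA dialog may treat the boundary values differently.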

(Details about checking the SLA results will be given below for the "run-test" job).
If the check URL script works, you now have good quality end points in your in-memory DB:

image00116

NOTE: the production end points are put straight in here as they come up and then the test run pulls them out at random, so we get the correct mix of calls in the final test, as actually seen in Prod. There are some details to be considered here. For example, I run my tests for a certain length of time but at absolute peak rate, so I want to get through all the URLs at that peak rate but hit the correct cache ratio. This does vary with the number of URLs you have in the DB. We also need to take into account the length of our cache and the number of repeated calls in this DB (remember a lot of calls are stripped from the original log because they have gone out of date).

In practice, you really need to monitor the cache hit ratio during a test run and determine where all these factors line up for your particular application. We have done this here and have tuned the model and data gathering scenarios accordingly.

Now we can trigger the actual test run that makes use of this DB data to give our working test results:

image00511

The Jenkins test run is set up according to this page. Scroll right down the page to get the latest updates. The bash scripts are now working with all the SLAs you can set up in LoadRunner. By the way, your scenario does need SLAs configured AND the user groups in the scenario MUST have a runtime set, i.e. you cannot use the "run until complete" option.

 

Scripting notes:

For this project the script creates transactions on the fly, depending on terms found in the URLs:

    /* Pick the transaction name from terms found in the URL (first match wins) */
    if(strstr(newEndPoint,"catchup"))
       sprintf(nextPage, "catchup");
    else if(strstr(newEndPoint,"episode-guide"))
       sprintf(nextPage, "episode-guide");
    else if(strstr(newEndPoint,"articles"))
       sprintf(nextPage, "articles");
    /* "categories" URLs without "all" keep the "categories" name */
    else if(strstr(newEndPoint,"categories") && !strstr(newEndPoint,"all"))
       sprintf(nextPage, "categories");
    /* "categories" URLs containing "all" are reported as "atoz" */
    else if(strstr(newEndPoint,"categories") && strstr(newEndPoint,"all"))
       sprintf(nextPage, "atoz");

Followed by:

    lr_start_transaction(nextPage);

    web_url(nextPage,
    strtemp1,
    .
    .
    "Mode=HTTP",
    LAST);

    lr_end_transaction(nextPage,LR_AUTO);

(The above code is just a snippet; we do some work with the URLs and we have various tests for pass/fail.)
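For trying the naming rules outside VuGen, the same substring checks can be pulled into a small standalone function. This is a sketch, not the production script: the function name `classify_endpoint` and the "other" fallback for URLs that match no rule are assumptions added for illustration.

```c
#include <assert.h>
#include <string.h>

/* Map an endpoint URL to a transaction name using the same substring
 * rules, in the same order, as the script snippet.
 * "other" is a hypothetical fallback for URLs that match no rule. */
const char *classify_endpoint(const char *endpoint)
{
    assert(endpoint != NULL);

    if (strstr(endpoint, "catchup"))
        return "catchup";
    if (strstr(endpoint, "episode-guide"))
        return "episode-guide";
    if (strstr(endpoint, "articles"))
        return "articles";
    if (strstr(endpoint, "categories"))
        /* "categories" URLs containing "all" are reported as "atoz" */
        return strstr(endpoint, "all") ? "atoz" : "categories";

    return "other";
}
```

Because these are plain substring matches, a categories URL that merely contains the letters "all" (for example inside "football") would also be named "atoz". Whether that matters depends on the URLs in your own logs.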

BUT, there is a problem with this approach. When you come to add your SLAs to the scenario, LoadRunner only recognizes hard-coded transaction names, so you can't add all these variable-coded transactions:

image018

The workaround for this is to hard code all your transactions somewhere. I put them in vuser_init() like this:

image020

(I think we might be able to comment these out as well once the scenario is set up; I still need to test this. I don't really want these dummy transactions in if we can help it, as they could skew our results.)
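The screenshot of the vuser_init() code has not survived here, so the following is only a guess at what such dummy transactions might look like, using the five names from the snippet earlier. The calls are written out literally rather than looped over an array, since the point is for the names to appear hard coded. The stubbed `lr_start_transaction`, `lr_end_transaction` and `LR_AUTO` are stand-ins so the sketch compiles outside VuGen; in a real script LoadRunner supplies them and the stubs should be left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so this sketch compiles outside VuGen. In a real script
 * LoadRunner supplies these; the LR_AUTO value here is a placeholder. */
enum { LR_AUTO = 2 };
static int stub_started = 0;
static int stub_ended = 0;

static int lr_start_transaction(const char *name)
{
    assert(name != NULL);
    stub_started++;
    return 0;
}

static int lr_end_transaction(const char *name, int status)
{
    assert(name != NULL);
    (void)status;
    stub_ended++;
    return 0;
}

/* Open and close each transaction once with a literal name, so the
 * names can be found when the scenario SLAs are configured. */
int vuser_init(void)
{
    lr_start_transaction("catchup");
    lr_end_transaction("catchup", LR_AUTO);

    lr_start_transaction("episode-guide");
    lr_end_transaction("episode-guide", LR_AUTO);

    lr_start_transaction("articles");
    lr_end_transaction("articles", LR_AUTO);

    lr_start_transaction("categories");
    lr_end_transaction("categories", LR_AUTO);

    lr_start_transaction("atoz");
    lr_end_transaction("atoz", LR_AUTO);

    return 0;
}
```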

Now, we see this in the SLA setup screen (you may need to re-open the scenario):

image019

ASIDE: I have just come across an issue when trying to get all this working from our corporate (Linux) master Jenkins to my LoadRunner (Windows) slave. There are some issues working from a Linux master to a Windows slave with Cygwin. I finally worked out how we can get our shell scripts working. I'll update the other LR CI pages (1, 2) also. You need to add the following to your "execute shell" scripts:

    #!c:/cygwin/bin/bash

right at the top. Then the Linux master Jenkins can find bash on the Windows slave... It was tricky to find that fix...

So, everything is working now and after the run we end up with a pass/fail condition in the XML results file, and our "sla_to_jtl.sh" will graph these in Jenkins:

    SLA report:
    calculating SLA
    SLA calculation done
    WholeRunRules : 26
    ================================
    Full Name : Average Hits Per Second
    Measurement : AverageHitsPerSecond
    Goal Value : 35
    Actual value : 74.5433364398882
    status : Passed
    ================================
    ================================
    Transaction Name : articles
    Percentile : 95
    Full Name : Transaction Response Time (Percentile)
    Measurement : PercentileTRT
    Goal Value : 0.3
    Actual value : 0.217
    status : Passed
    ================================
    ================================
    Transaction Name : atoz
    Percentile : 95
    Full Name : Transaction Response Time (Percentile)
    Measurement : PercentileTRT
    Goal Value : 0.3
    Actual value : 0.14199999
    status : Passed
    ================================
     
     
    ================================
    error count
    metric = error_count
    jtl_time (for graph) = 7
    ================================
    ================================
    total number of errors
    metric = total_num_errors
    jtl_time (for graph) = 14
    ================================
image021
image023
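The pass/fail decision from a report like the one above comes down to scanning the "status :" lines. The real work here is done by the bash script; the following C function is just a sketch of the same idea, with a hypothetical name, that counts every status value other than "Passed". It deliberately does not assume what a failing status is called.

```c
#include <assert.h>
#include <string.h>

/* Count SLA entries in the report text whose "status :" value is
 * anything other than "Passed". */
int count_failed_slas(const char *report)
{
    static const char key[] = "status :";
    const char *p;
    int failed = 0;

    assert(report != NULL);

    p = report;
    while ((p = strstr(p, key)) != NULL) {
        p += sizeof key - 1;              /* step past "status :" */
        while (*p == ' ' || *p == '\t')   /* skip padding before the value */
            p++;
        if (strncmp(p, "Passed", 6) != 0)
            failed++;
    }
    return failed;
}
```

A wrapper could then fail the build whenever the count is non-zero.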

And any of these SLAs can trigger a build failure:

image024

In practice I have SLAs for average TPS, acceptable error rates and all the 95th percentile transactions. Typically I ignore individual errors by excluding "error" and "Error" in the plugin setting, but at a later date I may change this to only ignore 300s, 400s etc. so 500s are picked up. In that case though, one single 500 error would fail the build, which is a bit drastic for CI, so it would need experimenting with.

More problems with a Linux master and a Windows slave

Another issue has come up with using our main Jenkins master, which is Linux, with this LoadRunner (Windows) setup:

The workspace folder is getting muddled up. With the Linux master, the workspace folder cannot be set up as:

z/jenkins/workspace/all4_check_urls_15mins (as done with the Windows master).

Instead it needs to be /cygdrive/z/jenkins/workspace/all4_check_urls_15mins, and the job then tries to save to:

C:\cygdrive\z\jenkins\workspace\all4_check_urls_15mins\cd3eb4

To get around this, I have done the following:

    1. Go to (or create) c:\cygdrive\z

    2. mklink /D jenkins z:\jenkins

This will redirect the Linux master to the correct location and the Windows master will be unaffected. Build this into the LoadRunner AWS image.

This solution also gives you click through access to the standard LoadRunner HTML report (follow the links from the build artifacts):

image025
image026
image027
image028

Questions

What about POST requests using this approach (I hear you ask)?

The above approach of replaying log files only really works with GET requests, since all the data is in the access log file. I do have a project where I am catering for POSTs. In this case I have a separate data file with the POST data, obtained from the dev team. I still check this with the check_url step but put the POST data in its own set of columns in VTS, then use this in the final test run. The script simply checks the METHOD to decide what to do with a particular log file line, so the method is pulled out with each line.
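The method check can be as simple as a lookup on the method text pulled out of each line. This is a minimal sketch with hypothetical names, assuming the method has already been extracted as an upper-case string; anything other than GET or POST is skipped here, which is a choice made for the example rather than something the real script is known to do.

```c
#include <assert.h>
#include <string.h>

/* What to do with one log file line, based on its HTTP method. */
typedef enum {
    REQ_GET,   /* replay the URL as it stands */
    REQ_POST,  /* replay with the POST data held alongside it */
    REQ_SKIP   /* any other method: not replayed in this sketch */
} request_kind;

request_kind kind_for_method(const char *method)
{
    assert(method != NULL);

    if (strcmp(method, "GET") == 0)
        return REQ_GET;
    if (strcmp(method, "POST") == 0)
        return REQ_POST;
    return REQ_SKIP;
}
```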

What about new functionality, not yet in the log files?

My scripts also cater for new functionality not yet in the log files. In this case I have a separate random generator and, for a certain percentage of script iterations, I call my bespoke code for each individual new end point. This way I can cater for various blends of new and old calls. This is working well in practice.
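One way to get such a blend is to give each new end point a percentage share and compare a random roll against the running total. The sketch below shows that selection step only. The struct, the function name and the example end point labels in the usage note are all hypothetical, and the roll is passed in so the logic can be checked without a random number generator.

```c
#include <assert.h>
#include <stddef.h>

/* One new end point and the share of iterations it should receive. */
typedef struct {
    const char *name;  /* hypothetical label for the new end point */
    int percent;       /* share of iterations, 0..100 */
} new_call;

/* Given a roll in 0..99, return the index of the new call to make on
 * this iteration, or -1 to replay an end point from the log data. */
int pick_new_call(const new_call *calls, size_t count, int roll)
{
    int upper = 0;
    size_t i;

    assert(calls != NULL || count == 0);
    assert(roll >= 0 && roll < 100);

    for (i = 0; i < count; i++) {
        upper += calls[i].percent;  /* running upper bound of this band */
        if (roll < upper)
            return (int)i;
    }
    return -1;
}
```

With shares of 10 and 5, rolls 0 to 9 pick the first new call, 10 to 14 the second, and the remaining 85 percent of iterations fall through to the log replay. In a script the roll could come from something like `rand() % 100`.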

What about more complex calls, needing correlation?

I do have a couple of new functionality calls in one project that need more work. I need to run a request, get some embedded endpoints out of it and run them as well. I simply cater for this in the script as usual, wrapping everything in if else blocks etc. Note, this sort of thing doesn't really arise for calls already in the log files, simply because the log files have all the calls in them (!) so the embedded requests are already represented. Of course you may need to watch out for this and not double up any call frequencies by inadvertently adding extra request hierarchies.

NEXT UP

I'll be looking at some way of getting variables set in the Jenkins master front end through to the LoadRunner test scripts. I especially want easy access to the environment any particular test points to. I'm working on a solution similar to my Jenkins one, with the script's init section reading from a file which will be updated by a bash script from Jenkins. See here for latest update on this topic.
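As a rough illustration of the file-reading half of that idea, here is a sketch in standard C that looks up one setting in a text file. The `key=value` line format, the function name and the example key are all assumptions; the linked page describes the actual solution, and code like this may need adapting before it runs inside VuGen.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up `key` in a file of key=value lines (format assumed).
 * Returns 1 and copies the value into `out` when found, 0 otherwise. */
int read_setting(const char *path, const char *key,
                 char *out, size_t out_size)
{
    char line[512];
    size_t key_len;
    FILE *fp;
    int found = 0;

    assert(path != NULL && key != NULL && out != NULL && out_size > 0);

    key_len = strlen(key);
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;

    while (!found && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* drop Unix or Windows line ending */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == '=') {
            strncpy(out, line + key_len + 1, out_size - 1);
            out[out_size - 1] = '\0';
            found = 1;
        }
    }

    fclose(fp);
    return found;
}
```

A Jenkins shell step would write a line such as `target_env=...` into the file before the scenario starts, and the init section would read it back with a call like `read_setting(path, "target_env", buf, sizeof buf)`.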

Adjusting SLAs from Jenkins

Another issue that has come up is adjusting the SLA values in a scenario without having to load up in LoadRunner, go through all the dialog boxes, click click click, save the scenario, get it on to all our controllers etc.

Instead, we just want to change the values directly from Jenkins so we can fine tune the builds to allow for SLA adjustments over time to stop unnecessary build failures. See here for that solution too.


In the Cartesian Elements Ltd group of companies