JMeter RAW

JMeter bespoke and general request calling module

[See bottom of page for version history] 8th January 2015: Download project

Put these extensions under jmeter/lib/ext: jmeter extensions.zip

Before downloading, please read our terms and conditions

I want to run tests straight from raw log files from our app servers. First steps can be seen here.

The main reason for this approach is to keep test data up-to-date and keep request ratios relevant to recent site activity.

This is mainly for regression testing in our stage environments. I can pull access logs from production and do very little work on them before using them to drive our latest round of testing.

And rather than just replaying logs, this approach provides timings for various groups of calls, allowing us to home in on any potential issues with the code at an early stage.

So this is a general performance model BUT bespoke to our project needs.

This page will go through the model’s approach and you can download the source code above.

We start with a (cleaned up) application access log file (behind the cache) - typically something like 900,000 rows per server per day:

210.155.zz.uu - - [26/Nov/2014:06:26:44 +0000] "GET /sta/130x73/title4/79866028-e11e-4014-9702-9a2cca39d93a_625x352.jpg  HTTP/1.1" 200 4593  "http://www.<server>.com/programmes/title2/videos/all/s4-ep4-great-fulford" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.1.25 (KHTML, like Gecko)  Version/8.0 Safari/600.1.25" - 0.086

2.20.xx.yy - - [26/Nov/2014:06:26:44 +0000] "GET /proj/160x90/videos/b1f36bd5-5acd-4d76-be9f-ce921bf34b11.jpg HTTP/1.1"  200 3859 "http://www.<server>.com/programmes/title1" "Mozilla/5.0 (iPad; CPU OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53" - 0.039

209.148.vv.ww - - [26/Nov/2014:06:26:44 +0000] "GET /programmes/service/brand/title3.xml HTTP/1.1" 200 1984 "http://www.<server>.com/programmes/title3/on-demand/57726-001"  "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36" - 0.000

We do one thing to this file to drastically cut down our data transfer requirements: we pull out the one field we really need for our JMeter script, field 7 (the request URL):

    awk '{print $7}' access.log.12 | grep '^/programmes' > project_urls

This gives us something like this (900,000 rows):

    /programmes/channel/module/channelGuide
    /programmes/luxury-comedy/videos/all/s2-ep4-cool-test
    /programmes/channel/module/channelGuide
    /programmes/channel/T1/module/nowNext
    /programmes/four-rooms-us/episode-guide/series-1/episode-1
    /programmes/hush/episode-guide?intcmp=header_result_search
    /programmes/the-lady-and-the-revamp
    /programmes/faking-it/on-demand/31809-003
    /programmes/the-hunter/episode-guide
    /programmes/service/asset/3790294

Then we come to the JMeter script:

There are several elements to this script but on the whole it is fairly simple.

The initialisation section reads in properties and defines a random timer based on injector number. This is to offset the startup of injector load when used with the cloud solution here.

We then read in the CSV file line by line, adjust the URL if needed (in our case some date offsets are needed), define our assertions and then play the line.

We can also decide whether to use the bespoke caller which plays the full url including parameters, or just use a standard JMeter HTTP Request module.

NOTE: this script is only aimed at simple GET requests that can easily be dug out of an access log.

There is also a timer in there to allow some control over transaction rates, and that’s it. It is a very simple outline, but it gives us a full production model of behaviour.

The only thing left that we do need to do is set the transaction rate for the performance test.

image00306

Calculating transaction rates for the model

Basically we are going to replay a log file, but for performance testing you typically want a steady state. This allows metrics to be gathered under repeatable conditions and requirements to be specified more easily. And of course, to make life easy, you don’t want to calculate the rates from the log file timestamps on a per-line basis.

To do this I load the log file into Excel. In my case I am not actually replaying all the lines in the log file, and you may have to watch out for this: about half our logged lines are redirected by nginx to static servers and never reach the app, so they are filtered out. So I load into Excel the same lines I am going to replay.

I load it in such a way that the time is split into hours, minutes, seconds, then do a very simple estimate of TPS distribution as follows:

In column G we’ve got the seconds. In column H we count the number of requests in a given second, and in column I we pull out the top numbers:

image003a

I then graph column I:

image004a

From this I’m going to take a steady-state estimate of 35 TPS per server (this is a log file from one server). It is interesting to note the fairly steady load on the app server. This is expected, as there is a cache in front of it.
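
The per-second count can also be approximated on the command line instead of in Excel. The following is a minimal sketch, not part of the downloadable project: the helper name per_second_counts is made up for illustration, and it assumes the log layout shown earlier (timestamp in field 4, URL in field 7) and that only /programmes URLs are replayed.

```shell
# Count replayable requests per second from an access log read on stdin,
# busiest second first. Assumes the log layout shown earlier: field 4 is
# "[dd/Mon/yyyy:hh:mm:ss" and field 7 is the request URL.
per_second_counts() {
    awk '$7 ~ /^\/programmes/ {
             ts = $4
             sub(/^\[/, "", ts)      # drop the leading "[" from the timestamp
             count[ts]++
         }
         END { for (ts in count) print count[ts], ts }' |
        sort -k1,1nr -k2,2
}

# Tiny inline sample (made-up lines); for a real file use:
#   per_second_counts < access.log.12 | head -20
per_second_counts <<'EOF'
1.1.1.1 - - [26/Nov/2014:06:26:44 +0000] "GET /programmes/a HTTP/1.1" 200 10 "-" "-" - 0.010
1.1.1.2 - - [26/Nov/2014:06:26:44 +0000] "GET /programmes/b HTTP/1.1" 200 10 "-" "-" - 0.020
1.1.1.3 - - [26/Nov/2014:06:26:44 +0000] "GET /sta/c.jpg HTTP/1.1" 200 10 "-" "-" - 0.030
1.1.1.4 - - [26/Nov/2014:06:26:45 +0000] "GET /programmes/d HTTP/1.1" 200 10 "-" "-" - 0.040
EOF
```

Each output line is a request count followed by the second it belongs to, so the first few lines play the same role as the "top numbers" column in the spreadsheet.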

A few Model details

A few things worth highlighting: in the ‘adjust url’ section, we do some work on dates. This allows our particular date-related URLs to work without having to update our data file every day:

image00112

And in the ‘define assertions’ pre-processor you can see how we decide which group to put our urls in:

image00210

And that assertion name is used to name the samplers and the assertions:

image00307
image00409
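
To get a feel for the grouping idea outside JMeter, here is a rough sketch that sorts URLs into groups by pattern and counts each group. The patterns and group names are hypothetical, picked from the sample URLs earlier on this page; they are not the ones used in the real ‘define assertions’ pre-processor.

```shell
# Map a URL to a group label. Patterns and label names are hypothetical
# examples, not the ones in the real JMeter pre-processor.
url_group() {
    case "$1" in
        /programmes/service/*) echo "service" ;;
        */module/*)            echo "module" ;;
        */on-demand/*)         echo "on-demand" ;;
        */episode-guide*)      echo "episode-guide" ;;
        */videos/*)            echo "videos" ;;
        *)                     echo "other" ;;
    esac
}

# Read URLs on stdin and print how many fall into each group, largest first.
group_counts() {
    while IFS= read -r url; do
        url_group "$url"
    done | sort | uniq -c | sort -k1,1nr -k2,2
}

# Usage with the extracted URL file:
#   group_counts < project_urls
url_group /programmes/channel/module/channelGuide
url_group /programmes/faking-it/on-demand/31809-003
```

Running group_counts over the URL file gives a quick view of the request mix that the model will replay.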

Also worth mentioning: in the ‘adjust url’ section, we need to cater for GET parameters if we are using a standard request handler. In my case I have just ignored them, but you may want to add some code that sets up variables to pass to the handler:

image00508
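
If you do want the parameters, the split itself is simple. This is a sketch of the idea in shell rather than the JMeter code in the script; url_path and url_query are hypothetical helper names.

```shell
# Split a logged URL into its path and its query string using POSIX
# parameter expansion. Helper names are illustrative only.
url_path() {
    printf '%s\n' "${1%%\?*}"          # everything before the first "?"
}

url_query() {
    case "$1" in
        *\?*) printf '%s\n' "${1#*\?}" ;;   # everything after the first "?"
        *)    printf '\n' ;;                 # no query string present
    esac
}

url='/programmes/hush/episode-guide?intcmp=header_result_search'
url_path "$url"
url_query "$url"

# One name=value pair per line, ready to map onto request parameters:
url_query "$url" | tr '&' '\n'
```

The path would go into the standard request's path field and each name=value pair would become a parameter on the request.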

So our standard handler just calls the basic url:

image00609

GUI Mode

As an aside, in GUI mode we get graphs and reports of everything we need, broken down by the assertion labels:

image001b_

More on Input Data

I’m going to run several threads per JMeter machine and across several machines (injectors). To avoid repeating too many calls, I’m going to give each thread its own data file. Each injector will still use the same script and the same set of per-thread files, but this is a good enough compromise for my particular case. So in fact we’ll have 5 threads per machine, each with its own data file, and then maybe 8 injectors replaying those files. It doesn’t take much work to have a dedicated file per thread per injector (40 files), as I have the injector number available as well as the thread number. That refinement is left to the reader.
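
As a starting point for that refinement, the file number can be derived from the injector and thread numbers. This is only a sketch of the arithmetic: file_index is a hypothetical helper, and it assumes both numbers start at 1.

```shell
# file_index INJECTOR THREAD THREADS_PER_INJECTOR
# Returns a 1-based index that is unique per (injector, thread) pair.
# Assumes both injector and thread numbers start at 1.
file_index() {
    echo $(( ($1 - 1) * $3 + $2 ))
}

file_index 1 1 5   # first thread on the first injector: prints 1
file_index 8 5 5   # fifth thread on the eighth injector: prints 40
```

With 5 threads on each of 8 injectors this yields indexes 1 to 40, one per data file. Note that 40 distinct files would need the log to be split more finely than one file per hour.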

To create data files per thread I do the following:

Take my raw log file and split it into one file per hour of data:

    START=0
    END=23
    for ((i=START; i<=END; i++)); do
        # Zero-pad the hour so it matches the log timestamp (e.g. ":06:26:44")
        hour=$(printf '%02d' "$i")
        echo "$hour"

        # Output files keep the unpadded hour number (bips_urls_0 .. bips_urls_23)
        grep ":$hour:[0-9][0-9]:[0-9][0-9]" access.log.12 | awk '{print $7}' | grep '^/programmes' > "bips_urls_$i"
    done

    ls -l bips_urls_*

This gives:

    -rw-r--r--+ 1 ngodfrey Domain Users  150778 Dec 17 15:04 bips_urls_0
    -rw-r--r--+ 1 ngodfrey Domain Users  101619 Dec 17 15:04 bips_urls_1
    -rw-r--r--+ 1 ngodfrey Domain Users 1745879 Dec 17 15:05 bips_urls_10
    -rw-r--r--+ 1 ngodfrey Domain Users 1633009 Dec 17 15:05 bips_urls_11
    -rw-r--r--+ 1 ngodfrey Domain Users 1903780 Dec 17 15:05 bips_urls_12
    -rw-r--r--+ 1 ngodfrey Domain Users 2013279 Dec 17 15:05 bips_urls_13
    -rw-r--r--+ 1 ngodfrey Domain Users 1726307 Dec 17 15:05 bips_urls_14
    -rw-r--r--+ 1 ngodfrey Domain Users 1702388 Dec 17 15:05 bips_urls_15
    -rw-r--r--+ 1 ngodfrey Domain Users 2193487 Dec 17 15:05 bips_urls_16
    -rw-r--r--+ 1 ngodfrey Domain Users 2333565 Dec 17 15:05 bips_urls_17
    -rw-r--r--+ 1 ngodfrey Domain Users 1475807 Dec 17 15:05 bips_urls_18
    -rw-r--r--+ 1 ngodfrey Domain Users  902826 Dec 17 15:05 bips_urls_19
    -rw-r--r--+ 1 ngodfrey Domain Users  71300 Dec 17 15:04 bips_urls_2
    -rw-r--r--+ 1 ngodfrey Domain Users  358192 Dec 17 15:05 bips_urls_20
    -rw-r--r--+ 1 ngodfrey Domain Users 1183043 Dec 17 15:05 bips_urls_21
    -rw-r--r--+ 1 ngodfrey Domain Users 1744395 Dec 17 15:05 bips_urls_22
    -rw-r--r--+ 1 ngodfrey Domain Users  694638 Dec 17 15:05 bips_urls_23
    -rw-r--r--+ 1 ngodfrey Domain Users  69246 Dec 17 15:05 bips_urls_3
    -rw-r--r--+ 1 ngodfrey Domain Users  60539 Dec 17 15:05 bips_urls_4
    -rw-r--r--+ 1 ngodfrey Domain Users  39750 Dec 17 15:05 bips_urls_5
    -rw-r--r--+ 1 ngodfrey Domain Users  63445 Dec 17 15:05 bips_urls_6
    -rw-r--r--+ 1 ngodfrey Domain Users 1030071 Dec 17 15:05 bips_urls_7
    -rw-r--r--+ 1 ngodfrey Domain Users 1283950 Dec 17 15:05 bips_urls_8
    -rw-r--r--+ 1 ngodfrey Domain Users 1716518 Dec 17 15:05 bips_urls_9

And from this I decided to use the larger files, bips_urls_7 to bips_urls_18 inclusive (allowing for up to 12 threads).

So in my CSV Data Set Config I use the following for the file name:

    ../data/bips_urls_${__intSum(${__threadNum},6)}

(NOTE: you need the latest updates to the cloud project to use variables in the path name in that solution.)

image00113
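
To check which file each thread ends up with, the same sum can be reproduced in shell. This sketch assumes JMeter thread numbers start at 1; thread_file is a made-up helper name.

```shell
# Reproduce the file name that ${__intSum(${__threadNum},6)} yields for a
# given thread number (thread numbers are assumed to start at 1).
thread_file() {
    echo "../data/bips_urls_$(( $1 + 6 ))"
}

# List the mapping for the 5 threads used per injector.
for t in 1 2 3 4 5; do
    echo "thread $t -> $(thread_file "$t")"
done
```

Thread 1 reads bips_urls_7 and thread 12 would read bips_urls_18, which matches the range of larger files chosen above.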

Assertions, results and trend graphs

So we run the model and get one assertions file with all our metrics in it (results-asertions.csv). We then run some bespoke scripts from within Jenkins to get a quick report file and all the jtl files we want (see the full cloud solution for details). Because of the slight change in requirements here, we have some new shell scripts. They use labels rather than assertion file names to determine the data for analysis.

I’ll add the files to the general cloud solution project (as they make more sense there) but the ones you’re after are:

  • rate-check-bespoke.sh
  • assertion-check-bespoke.sh
  • jmeter-percentiles-v2-bespoke.sh
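
To illustrate what analysing by label means, here is a rough sketch that summarises a results file per label. It is not one of the scripts listed above. It assumes a CSV results file with a header row that includes label and elapsed columns, and no commas inside field values; label_summary is a hypothetical name.

```shell
# Read a JMeter CSV results file on stdin and print, per label:
#   label  sample_count  mean_elapsed_ms
# Assumes a header row naming the columns and no commas inside fields.
label_summary() {
    awk -F',' '
        NR == 1 {                      # locate the columns by header name
            for (i = 1; i <= NF; i++) {
                if ($i == "label")   lcol = i
                if ($i == "elapsed") ecol = i
            }
            next
        }
        lcol && ecol { n[$lcol]++; total[$lcol] += $ecol }
        END { for (k in n) printf "%s %d %.0f\n", k, n[k], total[k] / n[k] }
    ' | sort
}

# Made-up sample data; for a real run use: label_summary < results.jtl
label_summary <<'EOF'
timeStamp,elapsed,label,responseCode,success
1,100,module,200,true
2,200,module,200,true
3,50,service,200,true
EOF
```

Because the columns are found by name, the sketch does not depend on the column order in the results file.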

After running the test you get files in Jenkins as usual:

image00903

And the quick report gives you metrics based on the labels you defined as assertions:

image01003

And of course from all this you get the Jenkins graphs and the pass/fail criteria that make or break the build:

image001d

More on Data Files and Threads

If you want to use data files per thread, check your setting for the sharing mode. Either use one large file and set the sharing mode to ‘All threads’ or use files per thread and set the sharing mode to ‘Current thread’. See below:

image001c
image002c

Version History

8th January 2015
Added a couple of major changes and some minor ones:
1. Changed variable ‘bespoke’ to local. This means we can easily set it to ‘ignore’ for any urls in the production log file that we don’t want to replay.
2. Added to the model to allow calling new urls that are not yet in the production log file. Percentages can be set per new call.
3. Added dynaTrace headers. Also set in the bespoke caller.
4. Added a better overall request counter, with assertion names or new calls named in the output file.


In the Cartesian Elements Ltd group of companies