PCRF experience: from functional testing to performance monsters, or how we climbed the testing hill

4 Nov

Our team develops a PCRF (Policy and Charging Rules Function) product, a significant component of an operator network. We have a successful installation in the Yota production network, so two important characteristics that we care about thoroughly from version to version are performance and stability under high load. To reach this goal we travelled a long and interesting road from unit testing through functional regression testing to, finally, various kinds of performance testing. Here we would like to share some of our PCRF testing “secrets”: the three main steps in PCRF testing (excluding unit testing, which is the obvious step zero) that we follow during each version’s stabilisation.

1. Regression testing
We hold on to the Continuous Integration methodology: all builds are done in Jenkins, and unit tests and regression tasks are started there automatically after each build. At present we have:

Component   Smoke tests   Regression tests
PCRF        270           972
DDF         149           600

Smoke tests are a selection from the regression suite. This small set is intended for an initial evaluation of a new version, so it includes the main functionality tests but excludes negative tests and all non-default, non-trivial configurations. Note also that we have independent suites for PCRF and for DDF (the Data Distribution Function node, a centralized storage used by all the local PCRFs). The reports in Jenkins are presented in the following way:

Our QA department prepares functional tests using Robot Framework with their own Python library as a back-end. To make the testers’ lives easier when working with Diameter (a binary protocol) from Python, we implemented a C shared library that supports all Diameter applications known to PCRF and is built together with it, so any change in the Diameter protocol library reaches the QA team immediately. It is called libdiameter_converted.so, and it translates an XML representation into a binary Diameter message and back. A test can therefore work with a Python object that is converted to Diameter binary data and back through the intermediate XML step.
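
For illustration, here is a minimal sketch of how such a shared library could be driven from Python through ctypes. The export names and C signatures below are assumptions made for the example; the post does not show the real API of libdiameter_converted.so.

import ctypes

lib = ctypes.CDLL("libdiameter_converted.so")

# Assumed C signatures (hypothetical):
#   int xml_to_diameter(const char *xml, unsigned char *buf, size_t buflen);
#   int diameter_to_xml(const unsigned char *buf, size_t len, char *xml, size_t xmllen);
lib.xml_to_diameter.restype = ctypes.c_int
lib.xml_to_diameter.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_size_t]
lib.diameter_to_xml.restype = ctypes.c_int
lib.diameter_to_xml.argtypes = [ctypes.c_char_p, ctypes.c_size_t, ctypes.c_char_p, ctypes.c_size_t]

def encode(xml_msg):
    """XML description of a Diameter message -> wire bytes."""
    buf = ctypes.create_string_buffer(64 * 1024)
    n = lib.xml_to_diameter(xml_msg.encode(), buf, len(buf))
    if n < 0:
        raise RuntimeError("Diameter encoding failed")
    return buf.raw[:n]

def decode(wire):
    """Diameter wire bytes -> XML representation."""
    out = ctypes.create_string_buffer(64 * 1024)
    n = lib.diameter_to_xml(wire, len(wire), out, len(out))
    if n < 0:
        raise RuntimeError("Diameter decoding failed")
    return out.value.decode()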
Regression tests recreate the full PCRF environment, emulating PCEF (Policy and Charging Enforcement Function) equipment, various DPIs (Deep Packet Inspection), an AF (Application Function), provisioning from the BSS (Business Support System) and O&M console operations. An example of one of our test cases, together with its Robot Framework implementation, can be found here.
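
To give the flavour of such an emulation, here is a hedged sketch of the kind of Python keyword library a Robot Framework test could import to play the PCEF role. The keyword names and the diameter_codec module (the ctypes wrapper sketched above) are illustrative assumptions, not our actual library API.

import socket

from diameter_codec import encode, decode  # hypothetical module wrapping libdiameter_converted.so

class PcefEmulator:
    """Robot Framework keyword library emulating a PCEF Gx client."""

    def connect_to_pcrf(self, host, port):
        self._sock = socket.create_connection((host, int(port)))

    def send_gx_message(self, xml_message):
        self._sock.sendall(encode(xml_message))

    def receive_gx_answer(self):
        # A Diameter header is 20 bytes; bytes 1-3 hold the total message
        # length, so read the header first and then the remainder.
        hdr = self._recv_exact(20)
        length = int.from_bytes(hdr[1:4], "big")
        return decode(hdr + self._recv_exact(length - 20))

    def _recv_exact(self, n):
        data = b""
        while len(data) < n:
            chunk = self._sock.recv(n - len(data))
            if not chunk:
                raise ConnectionError("PCRF closed the connection")
            data += chunk
        return data

In a Robot test such a class would be pulled in with a Library declaration, and its methods would appear as the keywords Connect To PCRF, Send Gx Message and Receive Gx Answer.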

The QA team also measures test coverage, which is 64% at the moment. This figure, however, includes many Diameter messages that are present in the library but not used by the PCRF application, as well as some debugging utilities that are compiled together with PCRF; none of this is covered with tests for now. The unit-test coverage is much better, as shown on the picture:

Note: on the Free PCRF image, meanwhile, one can start the Seagull functional tests with the following set of commands:
cd /opt/pcrf_test/simple
./ccr-i_rar_ccr-t.sh
./ccr-i_reject.sh
./ccr-i-t.sh
./ccr-i-u-u-u-t.sh

2. Seagull performance tests
To measure performance on the first-priority scenarios we started with Seagull, a free GPL multi-protocol traffic generator test tool that is popular for many cases, including Diameter testing (it is compatible with RFC 3588). A significant benefit is that Seagull is optimized for performance scenarios, and if a new Diameter application is needed, supporting it is just a matter of editing an XML file.
The Seagull scenarios that we have passed through are the following (a sketch of the third flow appears after the list):
– one PCEF: GxCCR-I/GxCCA-I -> GxCCR-T/GxCCA-T
– one PCEF: GxCCR-I/GxCCA-I -> GxCCR-U/GxCCA-U -> GxCCR-T/GxCCA-T
– one PCEF: GxCCR-I/GxCCA-I -> (GxRAR/GxRAA) * n times -> GxCCR-T/GxCCA-T
– one PCEF: GxCCR-I/GxCCA-I -> GxCCR-U/GxCCA-U with usage monitoring enabled -> GxCCR-T/GxCCA-T
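
The third flow, for example, has the PCRF push n Re-Auth-Requests into an established session before teardown. Here is a small self-contained sketch of that exchange in Python, with a stub peer instead of the real PCRF and purely illustrative message labels (the real runs, of course, go through Seagull):

class StubPcrf:
    """Fake peer: answers CCRs and queues n PCRF-initiated re-auth requests."""
    def __init__(self, n):
        self.queue = []
        self.n = n

    def send(self, msg):
        if msg == "GxCCR-I":
            self.queue.append("GxCCA-I")
            self.queue.extend(["GxRAR"] * self.n)  # server-initiated re-auths
        elif msg == "GxCCR-T":
            self.queue.append("GxCCA-T")

    def recv(self):
        return self.queue.pop(0)

def run_rar_scenario(peer, n):
    peer.send("GxCCR-I")
    assert peer.recv() == "GxCCA-I"
    for _ in range(n):                  # n GxRAR/GxRAA rounds
        assert peer.recv() == "GxRAR"
        peer.send("GxRAA")
    peer.send("GxCCR-T")
    assert peer.recv() == "GxCCA-T"

run_rar_scenario(StubPcrf(3), 3)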

An interesting report made with the help of Seagull for the previous PCRF version, 3.5.2, can be found here.

Next, let us show some numbers for the recently released version 3.6.0:
– 1 000 000 subscribers (IMSIs)
– 50 unique services in the PCRF dictionary
– 50 unique policies in the PCRF dictionary
– 2 services for each subscriber
– 2 attributes for each service
– servers used: HP ProLiant DL360 G5, 2 x Intel Xeon E5420 2500 MHz, 8 GB RAM
– a Lua script runs for policy selection for each subscriber and analyzes the subscriber profile, subscriber services and service attributes (a sketch of this kind of hook appears after the list)
– performance run: 11000 TPS
– errors during the performance run: 0, no errors
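
The policy-selection hook in production is a Lua script; the following is a hedged Python transliteration of the kind of decision logic described above. All field, attribute and policy names here are invented for the illustration.

def select_policy(subscriber, services, policies):
    """Return a policy for one subscriber from the PCRF policy dictionary."""
    for service in services:                                # 2 services per subscriber
        for name, value in service["attributes"].items():   # 2 attributes per service
            # e.g. a premium QoS attribute upgrades the subscriber's policy
            if name == "qos-class" and value == "premium":
                return policies["premium"]
    # otherwise fall back to a policy derived from the subscriber profile
    return policies.get(subscriber.get("profile"), policies["default"])

subscriber = {"imsi": "250001234567890", "profile": "base"}
services = [
    {"attributes": {"qos-class": "premium", "rate-limit": "10mbit"}},
    {"attributes": {"qos-class": "standard", "rate-limit": "1mbit"}},
]
policies = {"premium": "POLICY_PREMIUM", "base": "POLICY_BASE", "default": "POLICY_DEFAULT"}
print(select_policy(subscriber, services, policies))  # -> POLICY_PREMIUM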

3. Self-implemented bench tool
Seagull is perfect for simple scenarios (especially when only one outer node communicates with PCRF), but several more complicated cases also exist and their performance needs to be measured too. They are:
– PCEF and DPI both establish sessions to PCRF for one subscriber.
– PCEF, DPI and AF (Application Function) establish sessions to PCRF for one subscriber.
– Usage monitoring performance cases with PCEF and DPI sessions established.
– Session validation scenario based on a GxRAR sent from PCRF, with PCEF and DPI sessions established.
– Session re-validation scenario based on a GxCCR-U with the revalidation event trigger, sent from PCEF and DPI to PCRF.
In all of the above the order of the messages is not strictly determined, and yet the messages, although travelling over different connections, depend strictly on each other in terms of ordering and data.
Having failed to express all of these complicated scenarios, with their branching logic and mutable message order, in Seagull, we decided to implement our own benching solution. This was quite easy, since we had implemented the Diameter library, the transport library and all the statistics machinery as independent components with a high re-use factor. The scenarios were implemented in C/C++ and are based on a finite automaton with some non-deterministic parts included to handle the mutable message order, as the sketch below illustrates.
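
The bench tool itself is C/C++; this hedged Python sketch only illustrates the automaton idea: a state accepts its expected messages in any order until the expected set is exhausted, and anything else is a scenario failure. State and message names are invented for the example.

class ScenarioFsm:
    def __init__(self):
        self.state = "wait_session_setup"
        # The PCEF and DPI CCR-I may arrive in either order.
        self.pending = {"pcef_ccr_i", "dpi_ccr_i"}

    def on_message(self, kind):
        if self.state == "wait_session_setup":
            if kind not in self.pending:
                raise AssertionError("unexpected %s in %s" % (kind, self.state))
            self.pending.remove(kind)      # accept in any order
            if not self.pending:           # both sessions are established
                self.state = "established"
        elif self.state == "established":
            if kind != "ccr_t":
                raise AssertionError("unexpected %s in %s" % (kind, self.state))
            self.state = "done"

# Both interleavings of the session setup are accepted:
fsm = ScenarioFsm()
for m in ("dpi_ccr_i", "pcef_ccr_i", "ccr_t"):
    fsm.on_message(m)
assert fsm.state == "done"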
All the tests below were done on the following configuration:
– 1 000 000 subscribers (IMSIs)
– 100 unique services in the PCRF dictionary: 50 for one PCEF dialect and 50 for the other
– 100 unique policies in the PCRF dictionary: 50 for one PCEF dialect and 50 for the other
– 2 services for each subscriber at a time
– 2 attributes for each service
– 250 000 subscribers have 1-3 usage accumulators assigned
– servers used: ordinary hosts with a 4-core Intel Core i7 2.93 GHz processor, 8 GB RAM
– a Lua script runs for policy selection for each subscriber and analyzes the PCEF dialect, subscriber profile, subscriber services, service attributes and subscriber accumulators
Scenarios and results summary:

double_procera – PCEF and DPI (Procera) both establish sessions to PCRF for one subscriber, with usage monitoring switched on: 9000 TPS (messages handled per second), 500 000 simultaneous sessions, 80-85% maximum CPU across all PCRF processes.

double_procera with revalidation – the same as double_procera, but with revalidation GxCCR messages included: 8500 TPS, 500 000 simultaneous sessions, 80% maximum CPU.

rx_simple – PCEF, DPI and AF (Application Function) establish sessions to PCRF for one subscriber: 11000 TPS, 500 000 simultaneous sessions, 90-95% maximum CPU.

It should be mentioned that no errors of any kind occur while these scenarios are running, and any of these load scenarios can run for an unlimited time on a PCRF cluster without any degradation.

Source: http://freepcrf.com/2013/11/01/pcrf-experience-from-functional-testing-to-performance-monsters-or-how-did-we-climb-the-testing-hill/#!
