Thursday, August 6, 2015

Load testing the SharePoint Add-in (formerly App) model

Validate healthy application performance behaviour

Every enterprise application must handle the varying usage load of its users in a performant and scalable way. Nothing is as embarrassing as a new application that breaks soon after Go-Live under the enthusiastic usage of its users. To prevent this, you must build trust in the scalability of the application, and establish before Go-Live that the application (application software + system infra) can handle the expected load. Enter load testing.
This also holds for a modern SharePoint application that is composed with the Add-in model, the former SharePoint App model. But load testing the Add-in model brings some extra peculiarities. I have enumerated below the ones I encountered.
And note: our load testing proved both valuable and successful. Initially the load test revealed some performance and scalability problems. We then made some essential changes in the application code (in particular in the applied custom Add-ins / Apps), until we achieved our usage load target. And at the crucial moment of Go-Live, the application did not flinch, and perfectly handled the usage load of more than 14,000 users.

Application Performance health factors

Two health factors were monitored:
  1. Responsiveness of the application for the user, measured as Page Download Time
  2. Scalability of the SharePoint infra, measured as CPU, Memory and I/O utilization on the servers

Application Performance validation approach

  1. Identify target goals for application utilization
  2. "Green zone"
  3. Prove the health factors at the target utilization goals via load testing, which simulates the real usage
  4. Identify the ‘breaking’ point via increased load/stress testing
  5. "Red zone": performance issues monitored
  6. Determine the root cause of the issue; this can be non-optimal code or insufficient infra parts (CPU, memory, network throughput, database IOPS)
  7. Fix the issue(s)
  8. Repeat the validation, from step 2
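
The search for the 'breaking' point in the steps above can be sketched as a loop over increasing load levels. This is only a minimal illustration under my own assumptions: `find_breaking_point`, `simulated_page_time` and the 3-second threshold are hypothetical names and values, and in a real run the measurement would come from the load test tool rather than from a formula.

```python
def find_breaking_point(load_levels, measure_page_time, max_page_time_s):
    """Step through increasing load levels.

    Returns (last_green_load, first_red_load); first_red_load is None
    when no level exceeded the allowed page download time.
    """
    last_green = None
    for load in sorted(load_levels):
        if measure_page_time(load) > max_page_time_s:
            return last_green, load
        last_green = load
    return last_green, None


def simulated_page_time(load):
    # Hypothetical stand-in for a real measurement: page time degrades
    # sharply once the load passes 100 page visits per second.
    if load <= 100:
        return 0.5 + 0.01 * load
    return 0.5 + 0.05 * load


green, red = find_breaking_point([25, 50, 100, 200, 400],
                                 simulated_page_time,
                                 max_page_time_s=3.0)
print("last green load:", green, "first red load:", red)
```

The highest load that stays green is the value to compare against the target utilization goal; the first red load is where root cause analysis starts.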

Load test execution

Load test preparation

  1. Identify the usage/application scenarios you will use to build trust. Select scenarios that you expect to occur during typical usage. A heavy transaction that is only rarely executed in normal operation has a negligible effect on the application load.
  2. Establish the target load. This is the application load for average usage. For web applications, this is typically stated in ‘Page Visits per Second’. Note that this is different from Requests per Second (RPS). In modern apps, a single page visit encompasses multiple HTTP requests: for the page itself, for dependent resources such as JavaScript and CSS, and for JavaScript calls that execute service calls for data retrieval and application functions.
    Determining the concrete target value is a challenge in itself. It is tempting to overstate the target value: we have 'X' users, so the parallel application usage will be 'X * Y'... In reality, however, those 'X' users do not all hit the application continuously: they log on at different times, stay on pages, use other applications, go to the coffee machine, ... In our setup we determined the target value in two ways:
    1. Fact: as we were introducing a renewed intranet, we could reuse the application usage statistics of the current intranet;
    2. Prediction: determine the target value via the Microsoft (Bill Baer) Capacity Management Formula, an unofficial best-practice recommendation
    In our situation, the two values determined via these different paths were about the same, which confirmed that we had determined a realistic value.
  3. Establish the heavy load: this is abnormal but still foreseeable application usage, in special circumstances.
  4. Determine how to build trust: manual load testing, custom test software, or a load test tool such as HP LoadRunner or Visual Studio LoadTest.
  5. Get sufficient test accounts to simulate different users. This is also required to prevent cache effects during load test execution, e.g. from continuously retrieving the user profile values of the same user.
  6. Prepare the test context for the test accounts. E.g. if the application makes use of the SharePoint user profile, then the user profile must be provisioned for the test accounts to ensure realistic load behavior.
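
To make the target load reasoning from step 2 concrete, here is a back-of-envelope sketch. It is not necessarily the exact formula referenced above; the function name and all input values except the user count are my own illustrative assumptions, and should be replaced with figures from your own usage statistics.

```python
def estimate_page_visits_per_second(total_users, active_fraction,
                                    visits_per_user_per_day,
                                    working_hours_per_day, peak_factor):
    """Rough peak 'Page Visits per Second' estimate.

    Spreads the daily page visits of the active users evenly over the
    working day, then multiplies by a peak factor for busy moments.
    """
    daily_visits = total_users * active_fraction * visits_per_user_per_day
    average_per_second = daily_visits / (working_hours_per_day * 3600)
    return average_per_second * peak_factor


# Illustrative assumptions: half of 14,000 users active per day,
# 20 page visits each, an 8-hour working day, peaks of 3x the average.
rate = estimate_page_visits_per_second(
    total_users=14000,
    active_fraction=0.5,
    visits_per_user_per_day=20,
    working_hours_per_day=8,
    peak_factor=3,
)
print(f"estimated peak page visits per second: {rate:.2f}")
```

Note how far this lands below the naive 'all users at once' figure: most of the reduction comes from spreading the visits over the working day.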

Specifics of setting up test scripts for Add-ins / Apps

  1. The load test scenario must take part in the App authentication flow. In essence, this means that the SPAppToken value must be set as a FORM POST parameter in the submit request to appredirect.aspx. The value is determined at runtime in the App launcher, and returned in the initial AppRedirect.aspx response.
    In the Visual Studio webtest recording, a reference is made to this hidden field in the response.
    We encountered cases in which the SPAppToken value was not successfully retrieved at runtime. In some circumstances this can be corrected by monitoring the traffic via Fiddler, and setting SPAppToken to a fixed value taken from the Fiddler trace.
  2. The FormDigest value is returned in the JSON response of the contextinfo call, instead of as a hidden FORM parameter in the response body.
    The resolution is to augment the Visual Studio load test: add a Text Extraction Rule to extract the value from the /_api/contextinfo JSON response.
  3. By default, Visual Studio LoadTest execution does not mimic the browser cache, with the result that each dependent resource is requested over and over. You can change this by configuring ‘parseDependentRequests = false’ in the load test script.
  4. Visual Studio LoadTest does not execute JavaScript as the browser does. If required, the activity of the JavaScript code must be simulated in the test scripts.
  5. With multiple provider-hosted Apps in the load test scenario, Visual Studio can make an error in the runtime construction of the load test recording and assign a wrong {app_?} value. In that case, you must manually add a '<ContextParameter Name="AppId_1" Value="<APP domain value>" />', and correct the relevant requests in the script so that they are sent to the correct app domain.
  6. The Visual Studio LoadTest recording fails to set the ‘Origin’ header, which hinders CORS protocol handling.
  7. You can easily overwhelm the application via the ‘concurrent user’ configuration value. This configuration parameter is misleading: it does not really simulate actual users. It merely sets the number of threads in the load test execution that continuously execute the webtest(s) in the load test scenario. Per thread, after finishing the webtest, the execution halts for the think time; and then repeats. If you set the think time to zero (which is what Microsoft advises on TechNet: "Don't use think times…"), the effect is that requests are fired continuously against your application. The load on the application is then much higher than the value configured in ‘concurrent users’ suggests.
  8. The Visual Studio load agent itself can become the limiting factor. Simulating a larger concurrent usage results in an equally large set of threads in the Visual Studio execution, all of which are busy executing and monitoring a webtest instance. The CPU on the load agent grows to 100%, and the load no longer increases linearly with the number of ‘concurrent users’, i.e. threads.
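
The two token extractions from items 1 and 2 can be illustrated outside Visual Studio as well. The sketch below is not the Visual Studio extraction rule itself; it only shows what such a rule has to do. The helper names and the sample responses are hypothetical, and I assume the contextinfo response either uses the verbose OData shape (`d` / `GetContextWebInformation`) or carries `FormDigestValue` at the top level.

```python
import json
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collects the name/value pairs of hidden <input> fields."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_field(html, name):
    """Return the value of the hidden field `name`, or None if absent."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields.get(name)


def extract_form_digest(contextinfo_json):
    """Return FormDigestValue from a contextinfo JSON response body."""
    payload = json.loads(contextinfo_json)
    # Assumption: verbose OData wraps the result in d/GetContextWebInformation;
    # otherwise fall back to the top-level object.
    info = payload.get("d", {}).get("GetContextWebInformation", payload)
    return info.get("FormDigestValue")


# Hypothetical sample responses, for illustration only.
sample_html = ('<form><input type="hidden" name="SPAppToken" '
               'value="sample-token" /></form>')
sample_json = ('{"d": {"GetContextWebInformation": '
               '{"FormDigestValue": "0xABC,01 Jan 2015"}}}')

print(extract_hidden_field(sample_html, "SPAppToken"))
print(extract_form_digest(sample_json))
```

In the load test, the extracted values would then be bound to context parameters and sent along with the follow-up requests.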

Load test monitoring

  • CPU, memory and disk IO per server: WFE, SharePoint backend, AppHost
  • State of the IIS queue on WFE and AppHost
  • Page download times
  • Slowest pages

Interpretation of load test output

  1. The (average) Page Response Time is the sum of the download time of the main request and the download times of all dependent requests beneath that main request.
  2. The RPS (Requests per Second) output is not fit to determine whether application + infrastructure can handle the foreseen application usage. Application usage translates into Page Visits per Second, in which each page visit typically encompasses multiple HTTP requests: the .aspx request, and requests for JavaScript and CSS resources. In the App execution model, each App launch on the page is in effect a page visit of its own. As a result, the RPS factor is of little use. You must measure the ‘Page Visits per Second’ factor. A pragmatic way to monitor this is to set the think time for the webtest to 1 minute, so that each thread executes the webtest about once per minute. The ‘Page Visits per Second’ factor then equals the 'Tests per Second' value reported by Visual Studio.
