Tuesday, December 29, 2015

The long quest to reinstate SharePoint Explorer View performance

In our company, business end-users like SharePoint Explorer View because it feels as familiar as working with shared folders. These users reported that Explorer View had become unacceptably slow. It used to perform satisfactorily, but for some time its responsiveness had been structurally degraded: opening Explorer View initially took 30 seconds or even minutes. On an immediate retry it typically opened directly. After waiting a while, the problem repeated itself.
It proved difficult to establish the root cause of this performance degradation. SharePoint Explorer View functionality depends on a stack of IT components: local client, network, and SharePoint server side.
Based on a literature study (internet search) and our own common sense, we identified the following list of potential causes:
  1. Slow SharePoint processing (IIS, SharePoint code)
  2. Slow SharePoint content retrieval (SQL, storage)
  3. Slow or blocking network
  4. Slow authentication handling (NT Explorer as web client to SharePoint server; see e.g. "Prompt for Credentials When Accessing FQDN Sites From a Windows system")
  5. Slow Web Proxy Auto Detection (see "Explorer View very poor performance" and "Slow response working with WebDAV resources on Windows systems")
  6. SMB protocol blocked (Source: Microsoft Whitepaper "Understanding and Troubleshooting SharePoint Explorer View")
  7. Interference with the IIS WebDAV Publishing service on the SharePoint server (see "SharePoint – Open with Windows Explorer – problems", item 5, WebDAV Publishing, and "Explorer view does not work in some scenarios when the SharePoint farm is on Windows Server 2008 R2")
  8. Competition with network requests from other locally running programs
  9. Network requests blocked or delayed by anti-virus processing
  10. Slowness in local WebDAV client processing (Windows Explorer as WebDAV client)
  11. Outdated local Windows binaries (Explorer binaries (Shell32.dll), WebDAV binaries (Webclnt.dll, Davclnt.dll, Mrxdav.sys) & SMB binaries (Mrxsmb.sys, Mrxsmb20.sys, Rdbss.sys))
  12. Interference with Internet Explorer add-ons
It was hard to determine whether any of the above really had a negative impact. This is mainly because it is difficult to get an overall view of the processing across all the IT components involved: local client, virus scanner, network, firewall, load balancer, server side, SQL, storage, … In practice it boils down to inspecting the behaviour of each single component, and then trying to correlate it with the behaviour of the other components to gain insight into the overall picture. We inspected IIS logs to detect whether the long response times originated in SharePoint's processing of the WebDAV protocol, but this did not lead to identification of the root cause. Then I used Fiddler to see what was going on over the wire. As Fiddler is limited to the HTTP protocol, I also used Wireshark to dig deeper at the wire level and to include other protocols. None of these exercises identified the cause. If the investigation yielded anything, it was my cautious conclusion that the delay was neither on the wire nor in SharePoint processing, but rather originated on the local client. To investigate that, I used Process Monitor, benchmarking scenarios with and without opening Explorer View. I did see some noticeable differences, and thus suspects for the delay: antivirus processing and extensive additional activity of svchost.exe. However, I could neither confirm nor refute whether these were symptom or cause.
As progress in the problem investigation had stalled, we involved Microsoft Premier Support. The engineers started by investigating our captures of problem occurrences (Wireshark, Fiddler, Process Monitor, Netsh). Their initial analysis confirmed my own finding that the problem was not due to network delay. The next suspect on the Microsoft list was local interference from antivirus filters. To confirm or exclude this as the cause, we uninstalled the antivirus software on a test client. Afterwards, the problem still manifested itself on that test client. Through a renewed scan of the network captures, Microsoft identified a third suspect: a delay in the local WebClient caused by provider priority handling, with the SMB protocol ranked above WebDAV. I had actually thought of this myself a few weeks earlier, inspired by an old (2006!) whitepaper from Microsoft Services Support; the symptoms of our issue matched what is described in that whitepaper. However, for unclear reasons (lack of understanding of the complexity of the WebClient handling, or no trust in the whitepaper given its age?) my suggestion of this as a problem area was not investigated in depth by our operations service provider. Now that Microsoft suggested basically the same, I convinced our operations team to conduct a pragmatic validation to confirm or exclude it as the cause of the problem. The simple first test was to block ICMP traffic on the SharePoint farm and then re-test the performance of SharePoint Explorer View. But here too the results were negative: no performance improvement in Explorer View was observed after blocking ICMP traffic.
We finally had a breakthrough when we noticed that the Explorer View slowness did not occur in the infrastructure scenario in which the laptop connects to our SharePoint farm via VPN. It was then a matter of identifying the differences between the stack of IT components for on-premises access and that for VPN access. I also made Wireshark captures of the two infrastructure scenarios. In these captures we noticed that in the on-premises scenario, retrieval of the browser Proxy Auto-Configuration (PAC) file repeatedly timed out. In the VPN scenario, this effect did not occur. The explanation is the different network path used to retrieve the PAC file: the path in the on-premises situation included a blocking IT node (which is actually a cloud-hosted solution).
As it turned out, this was indeed the root cause. We resolved the blocking of the PAC file by caching it on the network perimeter, and we immediately regained the performance of SharePoint Explorer View.
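For readers who want to check quickly whether PAC retrieval is a bottleneck on a given network path, without a full Wireshark capture, the following is a minimal Python sketch. It is not what we used in our investigation. It simply times a direct download of the PAC file and reports whether the attempt failed or timed out. The function name, the 5-second timeout and the command-line usage are my own assumptions for illustration; the PAC URL must be taken from your own proxy configuration.

```python
import sys
import time
import urllib.request


def time_pac_fetch(pac_url, timeout_s=5.0, open_url=None):
    """Return (succeeded, elapsed_seconds) for one attempt to download a PAC file."""
    if open_url is None:
        # Assumption: bypass any configured proxy so the PAC host itself is measured.
        open_url = urllib.request.build_opener(urllib.request.ProxyHandler({})).open
    start = time.perf_counter()
    try:
        with open_url(pac_url, timeout=timeout_s) as response:
            response.read()
        succeeded = True
    except (OSError, ValueError):
        # OSError covers URLError and socket timeouts; ValueError covers malformed URLs.
        succeeded = False
    return succeeded, time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # Repeat a few times: in our case the problem showed up as repeated time-outs.
    for attempt in range(1, 4):
        ok, elapsed = time_pac_fetch(sys.argv[1])
        print(f"attempt {attempt}: {'ok' if ok else 'failed'} after {elapsed:.2f} s")
```

Running this once from an on-premises client and once over VPN, with the PAC URL as the only argument, should make a difference like the one we saw in the captures visible as failed attempts close to the timeout value.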
Bonus: the last mile for accelerating Explorer View performance ...
With this infrastructure correction, the performance of SharePoint Explorer View is greatly improved: it opens almost immediately, in less than 1 second. That is, except for the very first time, which takes 4-6 seconds.
The cause of this lies at the Windows OS level: Windows Explorer uses the WebClient component to connect via the WebDAV protocol. The very first time, this WebClient component must be instantiated and brought to life, which takes the additional 3-4 seconds.
If you also want to get rid of those seconds, you can set the startup type of the WebClient service to ‘Automatic’.
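The startup type can be changed in the Services console, but it can also be scripted. Below is a minimal Python sketch that wraps the standard Windows `sc.exe config` command. The function names and the `--apply` switch are assumptions for illustration, and the script must be run from an elevated prompt on the client for the change to succeed.

```python
import subprocess
import sys

SERVICE_NAME = "WebClient"


def build_sc_command(service=SERVICE_NAME):
    """Build the sc.exe command line that sets a service to automatic start."""
    # sc.exe expects "start=" and its value as two separate arguments.
    return ["sc.exe", "config", service, "start=", "auto"]


def set_automatic_start(service=SERVICE_NAME):
    """Run the command on Windows and return the exit code of sc.exe."""
    if sys.platform != "win32":
        raise RuntimeError("The WebClient service only exists on Windows")
    result = subprocess.run(build_sc_command(service), capture_output=True, text=True)
    print(result.stdout or result.stderr)
    return result.returncode


if __name__ == "__main__":
    if "--apply" in sys.argv:
        sys.exit(set_automatic_start())
    # Without --apply, only show what would be executed.
    print(" ".join(build_sc_command()))
```

Without `--apply` the script only prints the command, so you can review it first or hand it to your operations team to roll out through their own tooling.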
