Client Analytics at BSH Home Appliances
With the implementation of Nexthink, BSH continuously monitors the operation of its approximately 50,000 clients for malfunctions and threats.
BSH Hausgeräte GmbH had a gap in its visibility into IT systems in the office and manufacturing environments: the client. Despite existing monitoring, the client remained a black box in many situations, and BSH had no way to quantify the user experience. With the introduction of Nexthink, the company now continuously analyzes the operation of its approximately 50,000 clients for disruptions and threats. The fact that this also helped resolve issues in manufacturing IT is a welcome bonus.
About 80 percent of the workforce uses a client device, including not only office staff but also many production workers. The task of protecting and managing these clients falls to Stephan Schmid, Head of Workplace Platform at BSH in Munich. Before the project began, IT knew exactly what quality its services had when they left the data center, but not what quality reached the user. “Like most companies,” says Schmid, “we were pretty much blind to the customer’s perspective. That’s because we had no way to analyze the quality of the services reaching the user. We needed to close this gap, determine the user experience, and improve it based on the results.”
Eliminate blind spots
For BSH, the situation is clear: When a system error occurs, it should be obvious which components, in what combination, caused it. If the issue lies with a server or network problems, this can be determined relatively easily through traditional monitoring. However, when it came to issues whose cause was directly attributable to the client, the IT department still lacked the necessary analytical capabilities—especially when it came to real-time analysis. “We were looking for a solution that would shed light on the blind spot on the client side,” says Schmid.
This is made possible by software from the Swiss company Nexthink. It continuously monitors—at 30-second intervals—all events occurring on the approximately 50,000 clients. This ranges from blue screens to boot times and performance metrics to failed connection attempts to services.
This is intended to reveal malfunctions and performance issues that can have a significant impact on user satisfaction. With the help of this data, the causes of poor performance and instability can be identified quickly. In the best case, problems or even outages can be predicted before they occur. BSH chose Nexthink in part because the software goes a step further: it not only scans the client environment for recurring issues, but its “Act” module also allows these issues to be resolved immediately, either fully or partially automatically. A frequently occurring malfunction, such as a frozen Microsoft SCCM client, can be identified and, if SCCM’s own self-healing fails, automatically restarted or repaired using custom-developed scripts. This happens before the client’s performance is affected.
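The escalation described above (try the built-in self-healing first, then fall back to a custom script) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not use the Nexthink or SCCM APIs, and the names `RemediationStep`, `remediate`, `builtin_self_healing`, and `custom_restart_script` are hypothetical stand-ins for whatever checks and scripts an organization actually deploys.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RemediationStep:
    """One repair attempt; `run` returns True if the attempt reported success."""
    name: str
    run: Callable[[], bool]


def remediate(is_healthy: Callable[[], bool], steps: List[RemediationStep]) -> str:
    """Try each step in order and stop at the first one that restores health.

    Returns "healthy" if nothing had to be done, the name of the step that
    fixed the problem, or "escalate" if every step failed.
    """
    if is_healthy():
        return "healthy"
    for step in steps:
        # A step only counts if the health check passes afterwards.
        if step.run() and is_healthy():
            return step.name
    return "escalate"


# Simulated client state (hypothetical): the management agent is frozen.
state = {"agent_running": False}


def builtin_self_healing() -> bool:
    return False  # simulated: the built-in repair does not help


def custom_restart_script() -> bool:
    state["agent_running"] = True  # simulated: the custom script restarts the agent
    return True


result = remediate(
    is_healthy=lambda: state["agent_running"],
    steps=[
        RemediationStep("builtin_self_healing", builtin_self_healing),
        RemediationStep("custom_restart_script", custom_restart_script),
    ],
)
print(result)  # custom_restart_script
```

Ordering the steps from least to most invasive keeps the custom script as a fallback, and the "escalate" result marks the point at which a ticket for a human specialist would be appropriate.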
To keep production running
The importance of rapid analysis and troubleshooting becomes particularly clear when production is affected. A production line that comes to a standstill costs a lot of money every minute. This is exactly what happened, for example, at a dishwasher production facility where a line kept coming to a halt. Thanks to Nexthink, BSH was able to determine that this was caused by a control computer that kept rebooting for reasons that were initially unclear. “In this case, the return on investment can be objectively measured—through the avoidance of production downtime in the form of unproduced dishwashers,” Schmid concludes.
The cause of the reboots was ultimately the rollout of a new version of the antivirus software, which later turned out to be faulty. On devices running the new version, the Windows system process “Services.exe” crashed, and the crash was recorded in the log files. However, the connection to the new antivirus version was not apparent from the log files alone. Nexthink quickly revealed the antivirus version as the common factor among the affected devices, and the faulty component was replaced.
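The kind of correlation that log files alone do not reveal can be illustrated with a small Python sketch: group devices by installed antivirus version and compare how often each group crashes. The record layout, device names, and version labels below are invented for illustration and do not reflect BSH data or the Nexthink query interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical inventory record: (device, antivirus_version, crashed_recently)
Record = Tuple[str, str, bool]


def crash_rate_by_version(records: Iterable[Record]) -> Dict[str, float]:
    """Return the share of devices with a crash for each antivirus version."""
    totals: Dict[str, int] = defaultdict(int)
    crashed: Dict[str, int] = defaultdict(int)
    for _device, version, has_crash in records:
        totals[version] += 1
        if has_crash:
            crashed[version] += 1
    return {version: crashed[version] / totals[version] for version in totals}


# Invented sample data: most crashes occur on devices with the new version.
sample = [
    ("pc-01", "old", False),
    ("pc-02", "old", False),
    ("pc-03", "old", False),
    ("pc-04", "old", True),
    ("pc-05", "new", True),
    ("pc-06", "new", True),
    ("pc-07", "new", False),
    ("pc-08", "new", True),
]

rates = crash_rate_by_version(sample)
for version, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{version}: {rate:.0%}")
# new: 75%
# old: 25%
```

A clear gap between the groups does not prove causation, but it narrows the search to a single recently changed component, which is the step that a per-device log file cannot provide.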
Another case from BSH’s experience arose during the migration to Windows 10. Users were suddenly unable to launch Outlook. It turned out that certain Microsoft patches had not been installed quickly enough. “Nexthink is always great when I’m dealing with the unknown,” Schmid summarizes, “for example, when a problem occurs for some users but not for others.” Simply determining whether a malfunction is an isolated incident or will develop into a widespread issue can, under certain circumstances, save many hours of work—both at the service desk and for users, who can get back to work more quickly.
A partner by your side in an emergency
When BSH cannot clarify the unknowns in a situation on its own, it turns to the service provider Consulting4IT for assistance. BSH has entered into a managed services agreement with this Nexthink partner. Under the agreement, the partner not only operates Nexthink on BSH’s servers but also, through standardized tasks, performs case-specific analysis and aggregation of Nexthink data.
Consulting4IT provides BSH’s problem and change management teams with information that helps ensure stable operations. These queries range from investigating the causes of recurring issues to project-specific dashboards. The latter provide transparency regarding ongoing project success during major changes and serve as quality documentation, as well as a kind of early warning system—for example, for issues related to software rollouts.
A traffic light system for ticket forwarding
For user support, BSH operates a global service desk at various locations with approximately 80 employees who handle first-level support and are fluent in a total of eight languages. With the help of a wiki, they are also able to resolve a significant number of tickets—more than half, according to BSH—on the first attempt. However, they do not replace specialists. In many cases, their main task is therefore to forward user inquiries to the appropriate department.
Many tickets are automatically routed to the Workplace experts, even though, for example, the IT infrastructure team might be the better choice. To determine this, the service desk consults the “FASD” (First Aid Service Desk) add-on. This add-on module for Nexthink was developed by Consulting4IT and has been in productive use since 2019. The module automatically consolidates the Nexthink data for a specific computer, tailored to the needs and role of first-level support. Among other features, the module includes a traffic light system that prioritizes the cause of the malfunction or performance issue and, above all, categorizes it by topic. The traffic light system simplifies the forwarding of support requests for first-level support, thereby increasing the likelihood of routing them to the correct contact person.
For example, service desk staff can use the traffic light system to see that what a caller thinks is an SAP issue is actually a problem with an unstable browser running the SAP software. The ticket is immediately forwarded to the Workplace team responsible for the browser—without taking a time-consuming detour through the SAP system for everyone involved.
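How a traffic light categorized by topic can steer ticket routing is sketched below in Python. This is not the FASD implementation: the topics, metric names, thresholds, and team assignments in `RULES` are assumptions chosen purely for illustration.

```python
from typing import Dict, Tuple

# Hypothetical rules per topic: (metric name, yellow limit, red limit, responsible team)
RULES = {
    "browser": ("browser_crashes_24h", 1, 3, "Workplace"),
    "network": ("failed_connections_24h", 5, 20, "IT infrastructure"),
    "memory": ("memory_usage_percent", 80, 95, "Workplace"),
}

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def light(value: float, yellow: float, red: float) -> str:
    """Map a metric value to a traffic light color."""
    if value >= red:
        return "red"
    if value >= yellow:
        return "yellow"
    return "green"


def triage(metrics: Dict[str, float]) -> Tuple[Dict[str, str], str]:
    """Return one light per topic and the team for the most severe topic."""
    lights = {
        topic: light(metrics.get(metric, 0), yellow, red)
        for topic, (metric, yellow, red, _team) in RULES.items()
    }
    worst = max(lights, key=lambda topic: SEVERITY[lights[topic]])
    if lights[worst] == "green":
        return lights, "no client-side finding"
    return lights, RULES[worst][3]


# A caller reports an "SAP problem", but the client data points to the browser.
lights, team = triage(
    {"browser_crashes_24h": 4, "failed_connections_24h": 2, "memory_usage_percent": 60}
)
print(lights)  # {'browser': 'red', 'network': 'green', 'memory': 'green'}
print(team)    # Workplace
```

Keeping the thresholds in a single table mirrors the configurable status indicator and means that first-level staff only read colors and a suggested team rather than interpreting raw metrics.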
In addition, FASD saves time because the caller’s client data is immediately available the moment the call is received. This includes information such as the time elapsed since the last system startup, as well as current memory and CPU usage. The parameters included in the status indicator can be configured. Last but not least, support can initiate the launch of the Nexthink Act module directly from FASD with a single mouse click—which, of course, is another important factor in improving the first-time resolution rate.
This is because standardized troubleshooting scripts can be applied directly by first-level support staff using FASD in combination with Act, ensuring rapid assistance. The direct mapping of individual scripts to specific status indicators leaves no room for interpretation regarding when to run which script, even for those without extensive technical background. As a result, new service desk employees can be trained quickly and easily.
Attackers leave behind behavioral patterns
BSH also saves valuable time when a virus attack occurs, as it can quickly determine which clients are affected and which potentially unsafe IP addresses they have come into contact with. Here, too, Nexthink supports administrators with real-time information, particularly when dealing with attackers unknown to the virus scanner. The attacker is identified based on their behavioral patterns. “In this context, the counter-check is actually more exciting,” adds Schmid, “that is, the documentation that an attacker is not causing any damage.” The BSH manager recalls a specific case: “The certainty that a ransomware strain that was recently in the news did not cause any damage at our company has significantly improved our sleep.”
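Both the check and the counter-check can be pictured as a simple lookup over recorded connections, sketched here in Python. The record layout and client names are assumptions, and the IP addresses come from ranges reserved for documentation; none of this reflects the actual Nexthink data model.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical connection record: (client name, destination IP)
Connection = Tuple[str, str]


def affected_clients(
    connections: Iterable[Connection], suspicious_ips: Set[str]
) -> Dict[str, Set[str]]:
    """Map each client to the suspicious IPs it contacted (empty dict if none)."""
    hits: Dict[str, Set[str]] = defaultdict(set)
    for client, destination in connections:
        if destination in suspicious_ips:
            hits[client].add(destination)
    return dict(hits)


# Invented sample data using documentation-only address ranges.
connections = [
    ("pc-01", "192.0.2.10"),
    ("pc-02", "198.51.100.7"),
    ("pc-02", "203.0.113.5"),
    ("pc-03", "192.0.2.10"),
]

# Check: which clients contacted the known-bad address?
print(affected_clients(connections, {"203.0.113.5"}))    # {'pc-02': {'203.0.113.5'}}

# Counter-check: an empty result documents that no client made contact.
print(affected_clients(connections, {"198.51.100.99"}))  # {}
```

The empty result in the second call corresponds to the counter-check Schmid describes: evidence that no monitored client talked to the addresses associated with an attack.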
Last but not least, Nexthink provides an overview of the health status of all computers—for example, grouped by location. This information helps identify measures to sustainably improve employee satisfaction: performance issues, vulnerabilities, or sources of error can thus be identified and pinpointed early on as trends.
Mastering economies of scale
Since the former joint venture became a wholly owned subsidiary of the Bosch Group in 2015, the previously separate IT infrastructures have been converging. This, combined with BSH’s continued growth, undoubtedly has an impact on the numerous end devices. “We operate a highly automated environment of managed clients, the number of which has increased by more than 40 percent in recent years,” Schmid notes, “and it takes a lot of effort to keep it running.”
Scaling infrastructure in this way often magnifies minor errors. As a result, such major changes can reveal errors that perhaps no one had noticed before. For example, the BSH site in Regensburg lacked a local software distribution server. Instead of installing PCs and software entirely over the site’s own network, the site pulled the software from the data center in Stuttgart, so the network connection between the two became overloaded during a software update. Only an analysis of the problem helped identify and resolve the root cause. Nexthink Analysis is now set to be deployed across all BSH locations.
Conclusion
BSH has achieved its goal of shedding light on the black box that is the client by introducing end-user experience analysis with Nexthink. Specifically, this involves early client analysis as part of proactive problem management to prevent disruptions. Continuous data analysis provided as a managed service also plays a key role in this. In addition, the implementation of FASD significantly improved the first-time resolution rate by making client data available to first-level support, and reduced ticket processing times through automation at the first level.
About BSH
BSH Hausgeräte GmbH was formerly known as Bosch und Siemens Hausgeräte GmbH and was a joint venture between Robert Bosch GmbH and Siemens AG. At the end of 2014, Siemens sold its stake to the Bosch Group. Today, BSH employs nearly 62,000 people, who generated revenue of 13.8 billion euros in 2017.