Automated Monitoring for Windows Services

Have you ever had a Windows service that occasionally stops working, and wanted to be notified when it is down or have it restarted automatically?  You can restart Windows services in code.  Your code can even restart the server if the service manager is non-responsive.

Service Controller

The .NET Framework comes with a class called ServiceController that manages Windows services on the same machine or elsewhere on the same domain.  You can also use this class to get all the services available on a specific machine in your domain.
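
As a minimal sketch of listing the services on a machine, the static ServiceController.GetServices method can be used. The ServiceLister class and PrintServices method names are hypothetical and only for illustration; the project needs a reference to the System.ServiceProcess assembly, and the account running the code needs rights to query the target machine.

```csharp
using System;
using System.ServiceProcess;

// Hypothetical helper, not part of candor-common.
public static class ServiceLister
{
    // Prints the name, display name, and status of every service on the machine.
    // Pass null or an empty string to query the local machine.
    public static void PrintServices(string machineName)
    {
        ServiceController[] services = string.IsNullOrEmpty(machineName)
            ? ServiceController.GetServices()
            : ServiceController.GetServices(machineName);

        foreach (ServiceController service in services)
        {
            // ServiceController is disposable, so release each handle when done.
            using (service)
            {
                Console.WriteLine("{0} ({1}): {2}",
                    service.ServiceName, service.DisplayName, service.Status);
            }
        }
    }
}
```

Calling ServiceLister.PrintServices(null) lists the local services; passing a server name lists the services on that server.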

Check if a Windows Service is Running

A monitoring application would want a simple check to verify if the service is up or down at any given time and handle any given error scenarios that may come up.  The code I’ve written to handle this scenario is as follows.

        protected bool ValidateWindowsService()
        {
            try
            {
                if (string.IsNullOrEmpty(ServiceName)) { return true; }

                ServiceController serviceController;
                if (!string.IsNullOrEmpty(ServiceMachineName))
                    serviceController = new ServiceController(
                    ServiceName, ServiceMachineName);
                else
                    serviceController = new ServiceController(ServiceName);

                // The constructor never returns null and does not contact the service.
                // If the service is missing or unreachable, reading Status below throws
                // an InvalidOperationException, which the catch block handles.
                if (serviceController.Status != ServiceControllerStatus.Running)
                {
                    LogProvider.WarnFormat("Service '{0}' is not running.", Name);
                    return false;
                }
                else
                {
                    LogProvider.DebugFormat("Service '{0}' is running.", Name);
                    return true;
                }
            }
            catch (Exception ex)
            {
                LogProvider.FatalFormat("Failed to control service '{0}'.", ex, Name);
                return false;
            }
        }

Simply instantiating a new ServiceController with the service name and an optional machine name gives you an instance that can answer your question.  If the machine name is not specified, the local machine is assumed.  The constructor itself does not connect to the service; if the service does not exist or a connection cannot be made, reading the Status property throws an InvalidOperationException.  Otherwise, a simple check of the Status property tells you whether the service is running.

Restarting a Windows Service

Performing a restart is just as easy as checking the status.  However, take a little care to ensure that a partially stopped Windows service is fully stopped before trying to start it.

You can wait for a specific status using the WaitForStatus method of ServiceController.  To start a service just call Start(), and to stop the service just call Stop().

        protected bool RestartService()
        {
            try
            {
                if (string.IsNullOrEmpty(ServiceName)) { return false; }

                LogProvider.WarnFormat("Attempting to restart service '{0}'.", Name);

                ServiceController serviceController;
                if (!string.IsNullOrEmpty(ServiceMachineName))
                    serviceController = new ServiceController(
                    ServiceName, ServiceMachineName);
                else
                    serviceController = new ServiceController(ServiceName);

                // The constructor never returns null; a missing or unreachable service
                // surfaces as an exception on the first Status read below.
                if (serviceController.Status != ServiceControllerStatus.Stopped)
                {
                    // A service that is already stopping only needs to be waited on;
                    // sending it another stop request can fail.
                    if (serviceController.Status != ServiceControllerStatus.StopPending)
                        serviceController.Stop();
                    serviceController.WaitForStatus(ServiceControllerStatus.Stopped);
                }

                serviceController.Start();
                // WaitForStatus refreshes the status itself and blocks until it matches.
                serviceController.WaitForStatus(ServiceControllerStatus.Running);
                LogProvider.WarnFormat("Restarted service '{0}'.", Name);
                return true;
            }
            catch (Exception ex)
            {
                LogProvider.FatalFormat("Failed to restart service '{0}'.", ex, Name);
                return false;
            }
        }
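
The restart method above waits without a time limit, so a service that hangs while stopping or starting would block the monitor. A minimal sketch of the same sequence using the WaitForStatus overload that takes a TimeSpan is shown here. The ServiceRestarter class and TryRestart method are hypothetical names, and the sketch only handles the local machine to stay short.

```csharp
using System;
using System.ServiceProcess;

// Hypothetical helper, not part of candor-common.
public static class ServiceRestarter
{
    // Stops and starts a local service, giving each phase at most 'timeout'.
    // Returns false instead of blocking forever when a phase does not complete.
    public static bool TryRestart(string serviceName, TimeSpan timeout)
    {
        using (var controller = new ServiceController(serviceName))
        {
            try
            {
                if (controller.Status != ServiceControllerStatus.Stopped)
                {
                    // A service already in StopPending only needs to be waited on.
                    if (controller.Status != ServiceControllerStatus.StopPending)
                    {
                        controller.Stop();
                    }
                    controller.WaitForStatus(ServiceControllerStatus.Stopped, timeout);
                }

                controller.Start();
                controller.WaitForStatus(ServiceControllerStatus.Running, timeout);
                return true;
            }
            catch (System.ServiceProcess.TimeoutException)
            {
                // The service did not reach the expected status in time.
                return false;
            }
            catch (InvalidOperationException)
            {
                // The service does not exist or refused the stop/start request.
                return false;
            }
        }
    }
}
```

The exception type is written out in full because System.ServiceProcess.TimeoutException is a different class from System.TimeoutException, and both namespaces are imported. A false result is a natural point to escalate to a server restart.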

Server Restarts

Sometimes a service may not respond to either a status check or a stop/start command.  In those cases, you may want the server to restart.

To manage a server and reboot it, you need to start with an instance of ManagementScope.  From that you build a WMI query to find the server's operating system.  The search result is a ManagementObject on which you invoke the WMI Win32Shutdown method to ask the machine to reboot.

This can be used to force any machine on your domain to shut down, even one that a user is interactively logged into, so use it with care.  Coordinate with any other teams in your company that use the same server to make sure this is OK.  I have used this on a machine dedicated to my service, so it was a non-issue for me.

In the monitoring service we just need a simple method that returns true if the restart request succeeded.

        protected bool RestartServer()
        {
            try
            {
                if (string.IsNullOrEmpty(ServiceMachineName)) { return false; }

                LogProvider.WarnFormat("Attempting to restart server monitored by '{0}'.", Name);

                ManagementScope server = new ManagementScope();
                server.Path = new ManagementPath(FormatServerName(ServiceMachineName));
                server.Options.Impersonation = ImpersonationLevel.Impersonate;
                server.Options.EnablePrivileges = true;

                ObjectQuery oQuery = new ObjectQuery("select name from Win32_OperatingSystem");
                using (ManagementObjectSearcher search = new ManagementObjectSearcher(server, oQuery))
                {
                    using (ManagementObjectCollection items = search.Get())
                    {
                        ManagementBaseObject rebootParams = null;

                        foreach (ManagementObject item in items)
                        {
                            using (rebootParams = item.GetMethodParameters("Win32Shutdown"))
                            {
                                rebootParams["Flags"] = REBOOT + FORCE;
                                rebootParams["Reserved"] = 0;
                                item.InvokeMethod("Win32Shutdown", rebootParams, null);
                            }
                        }
                    }
                }
                LogProvider.WarnFormat("Restarted server monitored by '{0}'.", Name);
                return true;
            }
            catch (Exception ex)
            {
                LogProvider.FatalFormat("Failed to restart server monitored by '{0}'.", ex, Name);
                return false;
            }
        }
        protected string FormatServerName(string name)
        {
            //->  \\computerName\root\cimv2
            if (!name.Contains("\\"))
            {
                name = "\\\\" + name + "\\root\\cimv2";
            }
            else if (!name.StartsWith("\\"))
            {
                name = "\\\\" + name;
            }
            return name;
        }
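        // Flag values for the WMI Win32Shutdown method; they can be added together.
        private const int REBOOT = 2;
        private const int FORCE = 4;
        private const int SHUTDOWN = 8; // note: in Win32Shutdown, 8 means "power off"; a plain shutdown is 1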
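
The Flags parameter of Win32Shutdown is a bit mask, which is why REBOOT + FORCE above yields a forced reboot. As an illustrative sketch only, the documented values could also be expressed as a flags enum. The Win32ShutdownFlags and ShutdownFlagsDemo names are hypothetical and do not appear in candor-common.

```csharp
using System;

// Hypothetical enum mirroring the documented Win32Shutdown flag values.
[Flags]
public enum Win32ShutdownFlags
{
    LogOff = 0,
    Shutdown = 1,
    Reboot = 2,
    Force = 4,     // combine with another value to force the action
    PowerOff = 8
}

public static class ShutdownFlagsDemo
{
    // Combines Reboot and Force into the integer passed as the "Flags" parameter.
    public static int ForcedReboot()
    {
        return (int)(Win32ShutdownFlags.Reboot | Win32ShutdownFlags.Force);
    }
}
```

ShutdownFlagsDemo.ForcedReboot() produces the same value as REBOOT + FORCE in the method above.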

Putting it Together

You can get a Windows service project that runs your verification step automatically on an interval by deriving from the WorkerRoleTask in candor-common on GitHub.  The implementation class shown below is already available there.  Just download the candor-common solution, compile in release mode, and deploy.

At the desired interval, use the methods described above to check whether the Windows service is running.  If it is not, restart the service, and if that fails, restart the server.

        public override void OnWaitingPeriodElapsed()
        {
            AssertConfigurationValid();
            if (!IsMonitorEnabled())
            {
                LogProvider.DebugFormat("This service monitor is disabled at this time: {0}", Name);
                return;
            }
            bool ok = ValidateServiceFile() && ValidateWindowsService();

            if (!ok)
                ok = RestartService();
            if (!ok)
                ok = RestartServer();
            if (!ok)
            {
                LogProvider.WarnFormat("Non-functional service could not be restarted, '{0}'.", Name);
            }
            else
            {
                LogProvider.DebugFormat("Functional service '{0}'.", Name);
            }
        }

Notice this class also has a hook if you want an outage window during which the service is not monitored.  Just override the IsMonitorEnabled() method in a derived class with your outage window rules.  Then configure your derived class in the Windows service instead of the one supplied in candor-common.
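
As a rough sketch of what outage window rules might look like, the helper below decides whether a given time falls inside a daily window, including windows that cross midnight. The OutageWindow class is hypothetical; the exact signature of IsMonitorEnabled() is not shown in this article, so the sketch keeps the rule in a standalone method that an override could call.

```csharp
using System;

// Hypothetical helper, not part of candor-common.
public static class OutageWindow
{
    // Returns true when 'now' falls inside the daily window [start, end).
    // Handles windows that cross midnight, such as 23:00 to 01:00.
    public static bool Contains(DateTime now, TimeSpan start, TimeSpan end)
    {
        TimeSpan time = now.TimeOfDay;
        return start <= end
            ? time >= start && time < end
            : time >= start || time < end;
    }
}
```

An override of IsMonitorEnabled() could then return the negation of OutageWindow.Contains(DateTime.Now, start, end), with start and end taken from your own configuration.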

Configuration of WorkerRole

If you use the candor-common WorkerRole and the Windows service that hosts it, the following configuration enables the supplied ServiceMonitorWorkerRoleTask.

  <configSections>
    <sectionGroup name="Candor.Tasks">
      <section name="WorkerRole" type="Candor.Configuration.Provider.ProviderConfigurationSection, Candor.Core"/>
    </sectionGroup>
  </configSections>
  <Candor.Tasks>
    <WorkerRole>
      <providers>
        <add name="serviceMonitor" type="Candor.Tasks.ServiceProcess.ServiceMonitorWorkerRoleTask, Candor.Tasks.ServiceProcess"
             serviceName="YourServiceName"
             serviceMachineName="RemoteServerNameOrBlankForLocal"
             WaitingPeriodSeconds="300" />
      </providers>
    </WorkerRole>
  </Candor.Tasks>

Watching Files

If your Windows service generates file output at a regular interval, you may want to monitor the generation of that file instead of just checking whether the service is running.  Check out candor-common to see the code that watches a file; the configuration below enables it.

  <configSections>
    <sectionGroup name="Candor.Tasks">
      <section name="WorkerRole" type="Candor.Configuration.Provider.ProviderConfigurationSection, Candor.Core"/>
    </sectionGroup>
  </configSections>
  <Candor.Tasks>
    <WorkerRole>
      <providers>
        <add name="serviceMonitor2" type="Candor.Tasks.ServiceProcess.ServiceMonitorWorkerRoleTask, Candor.Tasks.ServiceProcess"
             serviceName="YourServiceName"
             outputFileNameToWatch="\\ServerName\Folder\MonitoredFileName.txt"
             outputFileMaxAgeMinutes="40"
             outputFileExpectedAgeMinutes="20"
             WaitingPeriodSeconds="300" />
      </providers>
    </WorkerRole>
  </Candor.Tasks>
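
The idea behind a maximum file age can be sketched as follows. This is not the candor-common implementation, only a minimal illustration that assumes outputFileMaxAgeMinutes means "treat the service as non-functional when the file is older than this". The OutputFileCheck class and IsFresh method are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper, not the candor-common implementation.
public static class OutputFileCheck
{
    // Returns true when the file exists and was last written no longer
    // than 'maxAge' before 'utcNow'. A missing file counts as not fresh.
    public static bool IsFresh(string path, TimeSpan maxAge, DateTime utcNow)
    {
        if (!File.Exists(path)) { return false; }

        TimeSpan age = utcNow - File.GetLastWriteTimeUtc(path);
        return age <= maxAge;
    }
}
```

With the configuration above, a check such as OutputFileCheck.IsFresh(path, TimeSpan.FromMinutes(40), DateTime.UtcNow) would fail once the monitored file has gone 40 minutes without being rewritten.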

References

System.ServiceProcess.ServiceController class (for service monitoring)
http://msdn.microsoft.com/en-us/library/system.serviceprocess.servicecontroller.aspx

System.Management.ManagementScope class (for server restarts)
http://msdn.microsoft.com/en-us/library/system.management.managementscope.aspx

candor-common on GitHub: all the code from this article.  Portions linked inline above.
https://github.com/michael-lang/candor-common

Simple Installer for Windows Service Using Visual Studio 2012
http://candordeveloper.com/2012/12/28/simple-installer-for-windows-service-using-visual-studio-2012/

4 comments

  2. This is really great stuff. You might want to take it one step further. PowerShell is a scripting language that admins are starting to use more and more. PowerShell can create C# classes, so converting the code would be pretty simple. You could even have PowerShell fire off an email if a server had to be restarted.

    1. I’m not sure how powershell can be used to take this code further. Powershell is a great tool for server admins, no doubt. But I don’t see how you can enhance this particular code with powershell.

      If you want an email from this service monitor, just configure log4net to send any Warn, Error, or Fatal log messages to an email appender. You can find examples of that all over the internet.

      I have also created other appenders that log directly to an administrator console using an API. Those messages appeared directly on a live dashboard for a 24 hour company wide network team.
