...
...
...
...
...
...
...
...
Overview
...
Updated for agent 1.56.1 and above
SphereShield Agent is a Windows service that monitors another AGAT service and restarts it if needed.
The monitored service can be one of the following:
Sip Filter (AgatSipFilter)
Bastion (for LAC, Teams Protector, Webex Protector filters)
Authentication Extender
Casb Adapter (AgatSphereShieldCasbAdapter)
Content Manager (AgatContentManagerService)
Maintenance Service (AgatSphereShieldMaintenanceService)
ADSync Adapter (AgatSphereShieldADSyncAdapter)
Monitoring consists of the following main parts:
...
checking whether the monitored service is running and starting it if not
checking in the DB whether the monitored service is alive, using the Service Management mechanism (relevant for all services except Authentication Extender)
only for Bastion: sending a health check request to the Bastion and its filters. If the Bastion and filters are not healthy, the agent tries to restart the Bastion service.
...
from 1.6.2, only for ADSync: if enabled, checking whether the service log contains the alive message
Service name: AgatSphereShieldServiceAgent[CustomerName]
Service display name: AGAT SphereShield Service Agent [Customer Name]
...
Code Block |
---|
> AgatSphereShieldServiceAgent.exe remove |
Configuration
...
Configuration added in Version 1.6.2
...
New config settings
CheckServiceAliveInLog - true/false. Enables log file monitoring for the alive message: the agent checks whether the service log contains the text “[IS ALIVE]” within the period set by the next setting. If the text is not found, the agent attempts to restart the service.
CheckServiceAliveInLogMinutes - how often to check for the alive message in the service log
For now, this feature is supported only with AdSync version 1.2.0.2
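The log-based alive check described above can be pictured with a minimal Python sketch. It is not the agent's implementation: the timestamp layout at the start of each log line, the function name `is_alive_in_log` and the sample lines are assumptions made for illustration, and the real ADSync log format may differ.

```python
from datetime import datetime, timedelta

ALIVE_MARKER = "[IS ALIVE]"
# Assumed layout of the timestamp at the start of each log line (hypothetical).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def is_alive_in_log(lines, now, window_minutes):
    """Return True if a line with the alive marker is timestamped within the window."""
    cutoff = now - timedelta(minutes=window_minutes)
    for line in lines:
        if ALIVE_MARKER not in line:
            continue
        try:
            stamp = datetime.strptime(line[:19], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # the line does not start with a parsable timestamp
        if cutoff <= stamp <= now:
            return True
    return False


# Hypothetical log content: the alive message is 30 minutes old.
sample = [
    "2024-01-01 10:00:00 INFO [IS ALIVE]",
    "2024-01-01 10:20:00 INFO sync finished",
]
now = datetime(2024, 1, 1, 10, 30, 0)

print(is_alive_in_log(sample, now, 15))  # window shorter than the message age
print(is_alive_in_log(sample, now, 45))  # window longer than the message age
```

In this sketch the window plays the role of CheckServiceAliveInLogMinutes; a result of False corresponds to the case in which the agent would attempt a restart.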
The agent is configured through the AgatSphereShieldServiceAgent.config file. The agent writes to a log file (by default under C:\Agat\Logs\ServiceAgent\[CustomerName]) and to the Event Log with source "AGAT SphereShield Service Agent".
Example of the configuration file:
...
Code Block | ||
---|---|---|
| ||
<?xml version="1.0" encoding="utf-8" ?>
<appSettings>
<!--===================================================================================================-->
<!-- Logging - The customer needs to enter their company name instead of AGAT in the CustomerName and LogFileFullName keys -->
<add key="CustomerName" value="AGAT"/>
<add key="LogFileFullName" value="C:\Agat\Logs\ServiceAgent\AGAT\ServiceAgent.log" />
<add key="LogFileMaxSize" value="100" />
<!-- Log level values: off, fatal, error, warn, debug, info, all, alert, critical -->
<add key="LogFileLevel" value="all" />
<add key="EventLogLevel" value="warn" />
<!--===================================================================================================-->
<!-- DB Connection -->
<!-- Agent can work without DB. This mode does not support portal UI operation - service management operation for remote restart
and is designed mainly for Authentication extender monitoring. To work without DB set DBRequired to false. -->
<add key="DBRequired" value="true" />
<add key="ConnectionString" value="Data Source=[SQLSERVER];Initial Catalog=[DataBaseName];Persist Security Info=True;User ID=[username];Password=[password]" />
<add key="Key" value="" />
<add key="IV" value="" />
<!--===================================================================================================-->
<!-- Monitored service -->
<!-- The name of the service to monitor / restart: AgatSipFilter, Bastion, AgatSphereShieldCasbAdapter, AgatContentManagerService -->
<add key="ServiceName" value="AgatSfbSipFilter" />
<!-- How long the agent should wait for a restart to complete. If the service does not manage to start, the agent will create an event in the event log so that manual action can be taken. -->
<add key="ServiceRestartTimeoutSeconds" value="30" />
<!-- Define how often will the monitoring happen
Note: Restart will occur only after ServiceMonitorNumberOfAttemptsBeforeRestart consecutive failures. Therefore cycle time should be configured accordingly.
If ConnectionString is set, the following setting will be ignored as the relevant value will be read from DB. -->
<add key="ServiceMonitorFrequencySeconds" value="60" />
<!-- Number of checks before service restart
If ConnectionString is set, the following setting will be ignored as the relevant value will be read from DB. -->
<add key="ServiceMonitorNumberOfAttemptsBeforeRestart" value="3" />
<!--===================================================================================================-->
<!-- Ethical Wall load monitoring - relevant for SIP Filter only -->
<add key="MonitorEthicalWallLoad" value="false" />
<!-- Define how often will the Ethical Wall load monitoring happen -->
<add key="MonitorEthicalWallLoadFrequencyMinutes" value="30" />
<!--===================================================================================================-->
<!-- Bastion healthcheck configuration - relevant for Bastion only -->
<!-- Set to true if Bastion is running as Forward proxy, false if Bastion is running as Reverse proxy. -->
<add key="BastionForwardProxy" value="false" />
<!-- Bastion IP for the healthcheck request -->
<!-- Note: Default port is 443 for Reverse Proxy and 80 for Forward Proxy.
If port other than default is used, please add :<portnumber> to the end of the IP. -->
<add key="BastionIp" value="127.0.0.1" />
<!-- Bastion host for the healthcheck request -->
<add key="BastionHealthcheckHost" value="test.skypeshield.com" />
<!-- Maximum latency for getting healthcheck results. Set value to 0 to disable latency check. -->
<add key="BastionMaxHealthcheckLatencyMilliseconds" value="0" />
<!-- Folder for output of troubleshooting procedure, will include archive of log files. -->
<add key="TroubleshootingOutputFolder" value="C:\Agat\Logs" />
<!-- Set to true to split troubleshooting archive into volumes, useful for email attachments -->
<add key="TroubleshootingSplitIntoVolumes" value="true" />
<!-- Size of troubleshooting archive split volume in MB. -->
<add key="TroubleshootingSplitVolumeSize" value="10" />
<!-- Number of last days to include in troubleshooting archive. -->
<add key="TroubleshootingDaysRange" value="1" />
<!--===================================================================================================-->
<!-- What issues will cause sending email: all, dbConnectionFailure, bastionDbConnectionFailure, restartFailure, restartSuccess
Multiple values may be configured by comma, may be left empty to disable emailing at all.
Note that for any value except empty - SMTP should be configured below. -->
<add key="EmailIssues" value="" />
<!-- SMTP configuration - settings for admin notification when agent detects an issue
If ConnectionString is set, no need to set the following SMTP configuration settings as they are read from DB. -->
<add key="SMTP_HostName" value="" />
<add key="SMTP_Port" value="" />
<add key="SMTP_AccountUsername" value="" />
<add key="SMTP_AccountPassword" value="" />
<add key="SMTP_RequiresSsl" value="false" />
<add key="SMTP_RequiresAuthentication" value="false" />
<add key="SMTP_MailRecipient" value="" />
<!-- The frequency of sending mail notification.
This value depends on the "Service Monitoring Frequency (seconds)" value in Admin Portal (ServiceMonitorFrequencySeconds setting).
For example, if ServiceMonitorFrequencySeconds is set to 60 seconds and SMTP_Sending_Frequency is set to 10,
the agent will send mail when an issue is detected and then an additional mail every 10 min (60 x 10 = 600 sec = 10 min)
-->
<add key="SMTP_Sending_Frequency" value="10" />
<!--===================================================================================================-->
<!-- Support emails -->
<!-- What issues will cause sending email to support team: all, dbConnectionFailure, bastionDbConnectionFailure, restartFailure, restartSuccess
Multiple values may be configured by comma, may be left empty to disable emailing at all.
Note that for any value except empty - SMTP should be configured below. -->
<add key="SupportEmailIssues" value="" />
<!-- Support SMTP configuration - settings for support notification when agent detects an issue -->
<add key="SupportSMTP_HostName" value="" />
<add key="SupportSMTP_Port" value="" />
<add key="SupportSMTP_AccountUsername" value="" />
<add key="SupportSMTP_AccountPassword" value="" />
<add key="SupportSMTP_RequiresSsl" value="false" />
<add key="SupportSMTP_RequiresAuthentication" value="false" />
<add key="SupportSMTP_MailRecipient" value="" />
<!-- The frequency of sending mail notification to support team.
This value depends on the "Service Monitoring Frequency (seconds)" value in Admin Portal (ServiceMonitorFrequencySeconds setting).
For example, if ServiceMonitorFrequencySeconds is set to 60 seconds and SupportSMTP_Sending_Frequency is set to 10,
the agent will send mail when an issue is detected and then an additional mail every 10 min (60 x 10 = 600 sec = 10 min)
-->
<add key="SupportSMTP_Sending_Frequency" value="10" />
</appSettings> |
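As a rough illustration of how the key/value pairs in this file can be read, the following Python sketch parses a trimmed copy of the example above. The helper names `load_app_settings` and `as_bool` are hypothetical, and the agent itself may read its configuration differently.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example configuration, kept short for illustration.
SAMPLE = """<appSettings>
  <!-- Monitored service -->
  <add key="CustomerName" value="AGAT"/>
  <add key="DBRequired" value="true" />
  <add key="ServiceMonitorFrequencySeconds" value="60" />
  <add key="ServiceMonitorNumberOfAttemptsBeforeRestart" value="3" />
  <add key="EmailIssues" value="" />
</appSettings>"""


def load_app_settings(xml_text):
    """Return a dict of key -> value built from the <add> elements."""
    root = ET.fromstring(xml_text)
    return {node.get("key"): node.get("value", "") for node in root.iter("add")}


def as_bool(value):
    """Interpret the true/false strings used in the config file."""
    return value.strip().lower() == "true"


settings = load_app_settings(SAMPLE)
db_required = as_bool(settings["DBRequired"])
frequency_seconds = int(settings["ServiceMonitorFrequencySeconds"])
attempts = int(settings["ServiceMonitorNumberOfAttemptsBeforeRestart"])

print(db_required, frequency_seconds, attempts)
```

All values in the file are strings, so numeric and boolean settings need an explicit conversion, as shown for DBRequired and the monitoring settings.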
Logging Configuration
CustomerName - Multiple agents can be installed for different customers on the same machine; each agent must use a different customer name.
LogFileFullName - The path to the agent log file. Replace AGAT with the customer name. The installer does this automatically.
LogFileMaxSize - The maximum size of the log file before the agent clears it and creates a new log.
LogFileLevel - The severity level of the logs generated by the agent. Possible values: off, fatal, error, warn, debug, info, all, alert, critical
EventLogLevel - The severity level of the logs sent to the Event Viewer.
DB Connection Configuration
DBRequired - The agent can work without a DB. This mode does not support portal UI operations (service management operations for remote restart) and is designed mainly for Authentication Extender monitoring. To work without a DB, set DBRequired to false.
ConnectionString - Needed when DBRequired is set to true. Replace the values of SQLSERVER, DataBaseName, username and password.
Key/IV - AES encryption keys, needed when DBRequired is set to true.
Monitored service Configuration
ServiceName - The name of the service that the agent will monitor. Possible values: AgatSipFilter, Bastion, AgatSphereShieldCasbAdapter[CustomerName], AgatContentManagerService
CheckServiceAliveInLog - true/false. Enables log file monitoring for the alive message: the agent checks whether the service log contains the text “[IS ALIVE]” within the period set by the next setting. If the text is not found, the agent attempts to restart the service.
For now, this feature is supported only with AdSync version 1.2.0.2.
CheckServiceAliveInLogMinutes - How often to check for the alive message in the service log.
For now, this feature is supported only with AdSync version 1.2.0.2.
ServiceRestartTimeoutSeconds - How long the agent should wait for a restart to complete. If the service does not manage to start, the agent creates an event in the event log so that manual action can be taken.
ServiceMonitorFrequencySeconds - How often the monitoring runs (in seconds).
Note: A restart occurs only after ServiceMonitorNumberOfAttemptsBeforeRestart consecutive failures, so the cycle time should be configured accordingly.
If ConnectionString is set, this setting is ignored and the relevant value is read from the DB.
ServiceMonitorNumberOfAttemptsBeforeRestart - Number of checks before the service is restarted.
If ConnectionString is set, this setting is ignored and the relevant value is read from the DB.
MinutesToWaitBetweenRestarts - How many minutes to wait after a restart before the next restart, in order to avoid continuous restarts in a situation in which a restart does not help.
If ConnectionString is set, this setting is ignored and the relevant value is read from the DB.
This does not affect how often the checks run (ServiceMonitorFrequencySeconds).
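The interaction between ServiceMonitorNumberOfAttemptsBeforeRestart and MinutesToWaitBetweenRestarts can be sketched as follows. This is only one plausible reading of the settings above, not the agent's actual code: the class `RestartPolicy`, its field names and the choice to reset the failure counter after a restart are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestartPolicy:
    attempts_before_restart: int   # ServiceMonitorNumberOfAttemptsBeforeRestart
    minutes_between_restarts: int  # MinutesToWaitBetweenRestarts
    consecutive_failures: int = 0
    last_restart_minute: Optional[float] = None

    def record_check(self, healthy, now_minute):
        """Record one monitoring cycle and return True if a restart is due now."""
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures < self.attempts_before_restart:
            return False  # not enough consecutive failures yet
        if (self.last_restart_minute is not None
                and now_minute - self.last_restart_minute < self.minutes_between_restarts):
            return False  # still inside the wait period after the last restart
        self.last_restart_minute = now_minute
        self.consecutive_failures = 0  # assumption: the count starts over after a restart
        return True


# A service that fails every check, checked once a minute for 15 minutes.
policy = RestartPolicy(attempts_before_restart=3, minutes_between_restarts=10)
restart_minutes = [m for m in range(15) if policy.record_check(False, m)]
print(restart_minutes)
```

With these values the first restart is issued on the third consecutive failure, and the wait period then holds back further restarts even though the checks keep running every cycle.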
Ethical Wall load Configuration - SIP Filter only
MonitorEthicalWallLoad - Enables Ethical Wall load monitoring (relevant for SIP Filter only).
MonitorEthicalWallLoadFrequencyMinutes - How often the Ethical Wall load monitoring runs (in minutes).
Bastion healthcheck Configuration - Bastion only
BastionForwardProxy - Set to true if Bastion is running as a forward proxy, false if it is running as a reverse proxy.
BastionIp - Bastion IP for the healthcheck request. If the agent is installed on the Bastion machine, use the localhost address. Make sure to use a port that the Bastion listens on (and that is used by the required channel).
Note: The default port is 443 for reverse proxy and 80 for forward proxy.
If a port other than the default is used, add :<portnumber> to the end of the IP.
BastionHealthcheckHost - The host to which the health check request is sent.
BastionMaxHealthcheckLatencyMilliseconds - Maximum latency for the health check response. Set to 0 to disable the latency check.
...
TroubleshootingOutputFolder - Folder for output of troubleshooting procedure, will include archive of log files.
TroubleshootingSplitIntoVolumes - Set to true to split troubleshooting archive into volumes, useful for email attachments
TroubleshootingSplitVolumeSize - Size of troubleshooting archive split volume in MB.
TroubleshootingDaysRange - Number of last days to include in troubleshooting archive.
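The default-port rule for BastionIp described above can be expressed as a short Python sketch. The function name `resolve_bastion_endpoint` is hypothetical, and the sketch assumes an IPv4 address with an optional :<portnumber> suffix; it does not handle IPv6 literals.

```python
def resolve_bastion_endpoint(bastion_ip, forward_proxy):
    """Split a BastionIp value into (host, port), applying the default port.

    Default port: 80 when BastionForwardProxy is true, 443 otherwise.
    """
    default_port = 80 if forward_proxy else 443
    host, separator, port_text = bastion_ip.partition(":")
    port = int(port_text) if separator else default_port
    return host, port


print(resolve_bastion_endpoint("127.0.0.1", forward_proxy=False))
print(resolve_bastion_endpoint("127.0.0.1", forward_proxy=True))
print(resolve_bastion_endpoint("10.0.0.5:8443", forward_proxy=False))
```

An explicit port in BastionIp always wins over the default, which matches the note that the port only needs to be added when it differs from 443 or 80.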
Email notifications to admin Configuration
Settings for admin notification when the agent detects an issue.
...
SMTP Hostname: SMTP server Address.
SMTP Port: the port the SMTP server is listening on.
SMTP Account Name: Sender Address for the Agent.
SMTP Account Password: If SMTP requires authentication, this is the password for the sender account.
SMTP Requires SSL: Change to True if the SMTP server requires TLS/SSL.
SMTP Requires Authentication: Change to True if the SMTP server requires authentication
SMTP Mail Recipient: Administrator e-mail to receive notifications from the agent, can be multiple emails separated by , or ;
SMTP_Sending_Frequency - How often a mail notification is sent, expressed in monitoring cycles.
This value depends on the "Service Monitoring Frequency (seconds)" value in the Admin Portal (ServiceMonitorFrequencySeconds setting).
For example, if ServiceMonitorFrequencySeconds is set to 60 seconds and SMTP_Sending_Frequency is set to 10, the agent sends a mail when the issue is detected and then an additional mail every 10 minutes (60 x 10 = 600 sec = 10 min).
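The arithmetic in the example above can be checked with a small Python sketch. The function names and the idea of counting monitoring cycles since the issue was first detected are illustrative assumptions, not a description of the agent's internals.

```python
def notification_interval_seconds(monitor_frequency_seconds, sending_frequency):
    """Time between repeated mails: one mail every sending_frequency monitoring cycles."""
    return monitor_frequency_seconds * sending_frequency


def should_send_mail(cycles_since_issue_detected, sending_frequency):
    """Send on the cycle the issue is detected (cycle 0) and every sending_frequency cycles after."""
    return cycles_since_issue_detected % sending_frequency == 0


interval = notification_interval_seconds(60, 10)
print(interval, "seconds =", interval // 60, "minutes")

# Cycles on which a mail would go out during the first 25 cycles of an ongoing issue.
mail_cycles = [cycle for cycle in range(25) if should_send_mail(cycle, 10)]
print(mail_cycles)
```

Because the interval is a product of the two settings, changing the monitoring frequency in the Admin Portal also changes the time between mails unless SMTP_Sending_Frequency is adjusted to compensate.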
Email notifications to support Configuration
Settings for support notification when the agent detects an issue.
...
The SMTP settings for support team notification are the same as the SMTP settings for admin, with the Support prefix. Note that these settings are set only in the config file and not in the DB.
Monitoring Processing in detail
The agent runs the monitoring cycle every configured number of seconds (default 60) and does the following:
[DB mode] writes the agent alive time to the monitored service row in the service management table
checks whether the monitored service is running and starts it if not
[DB mode] checks whether the monitored service is alive in the service management table
if log alive checking is enabled, checks whether the alive message ([IS ALIVE]) has appeared in the service log since the last check
[Bastion] checks whether the Bastion and filters are OK
Bastion healthcheck procedure:
for forward proxy:
request https://[BastionHealthcheckHost]/healthcheck with proxy BastionIp
for example https://test.skypeshield.com/teams_protection/healthcheck with proxy 127.0.0.1
for reverse proxy:
request https://[BastionIp]/skypeshieldhealth with host header BastionHealthcheckHost
for example https://127.0.0.1/skypeshieldhealth with host header test.skypeshield.com
if an HTTP 200 status code is received (within BastionMaxHealthcheckLatencyMilliseconds, if it is not set to 0) - Bastion and filters are OK (no restart is done)
if another HTTP status or an error/exception is received - except statuses 404 (Not Found), 403 (Forbidden) and 401 (Unauthorized) - the agent will try to restart the Bastion service after 3 consecutive failures, 10 seconds apart, and only if it is already in production mode
if the healthcheck result is not OK and the agent is in production mode (it has received 5 consecutive OK results) - the healthcheck is considered not passed
otherwise, if the healthcheck result is OK or the agent is not in production mode (it has not yet received 5 consecutive OK results) - the healthcheck is considered passed
the agent goes into production mode (restart on error) only after receiving a good result 5 times, which indicates correct operation; this avoids restarts caused by misconfiguration during installation
If the alive check or the Bastion healthcheck did not pass - restart the monitored service
If the service failed to start X times (X = ServiceMonitorNumberOfAttemptsBeforeRestart) - kill the monitored service
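The Bastion healthcheck decision described above can be sketched in Python as a small state machine. This is an interpretation under stated assumptions, not the agent's code: the names `classify` and `HealthcheckGate` are hypothetical, statuses 401, 403 and 404 are assumed to be neutral (neither OK nor a failure), a 200 response slower than the configured latency is assumed to count as a failure, and a failure is assumed to reset the run of OK results. The retry of 3 failures 10 seconds apart is not modelled.

```python
IGNORED_STATUSES = {401, 403, 404}   # statuses that do not trigger a restart
OK_RESULTS_FOR_PRODUCTION = 5        # good results needed to enter production mode


def classify(status, latency_ms, max_latency_ms):
    """Map one healthcheck response to 'ok', 'ignored' or 'failed'.

    status is None when the request ended in an error or exception.
    max_latency_ms of 0 disables the latency check.
    """
    if status is None:
        return "failed"
    if status in IGNORED_STATUSES:
        return "ignored"
    if status == 200:
        if max_latency_ms > 0 and latency_ms > max_latency_ms:
            return "failed"
        return "ok"
    return "failed"


class HealthcheckGate:
    """Tracks production mode and decides whether a healthcheck counts as passed."""

    def __init__(self):
        self.ok_streak = 0
        self.production = False

    def passed(self, result):
        if result == "ok":
            self.ok_streak += 1
            if self.ok_streak >= OK_RESULTS_FOR_PRODUCTION:
                self.production = True
            return True
        if result == "ignored":
            return True
        self.ok_streak = 0
        # Outside production mode a failure is tolerated, which guards
        # against restarts caused by a misconfigured installation.
        return not self.production


gate = HealthcheckGate()
print(gate.passed(classify(500, 20, 0)))    # failure before production mode
for _ in range(5):
    gate.passed(classify(200, 20, 0))        # five good results
print(gate.production)
print(gate.passed(classify(None, 0, 0)))     # failure in production mode
```

A result of False from `passed` corresponds to the step in which the agent would restart the monitored service.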
...
RESTART - restart the monitored service
START - start the monitored service
STOP - stop the monitored service
RESTART_AGENT - restart the agent itself
START_TRBL - start the troubleshooting process
FINISH_TRBL - finish the troubleshooting process
CRITICAL - service entered critical state
When SipFilter writes "Critical State" to the DB, the agent does the following:
Shut down SipFilter
Send mail about it to the admin
Write to the Event Viewer
Troubleshooting Processing
...