Integrating with existing monitoring infrastructure

Customers who deploy SphereShield for MS Teams on-premises may wish to monitor it with their existing monitoring tools.

Below is a list of the relevant SphereShield components and how they can be monitored.

| Server/Functionality | Check | Details | Alarm Metric |
| --- | --- | --- | --- |
| Bastion/Admin Portal Servers/DB | Free Disk Space | We recommend allocating a minimum of 30 GB for logs. | < 10 GB free disk space on the Bastion application and log disks |
| | CPU utilisation | | > 95% CPU utilisation for more than 10 minutes |
| | Memory | | < 10% free memory |
| Teams Protector Filter (Bastion) Health Check | HTTP health check request: http://<server-name>/teams_protection/healthcheck/proxytest | Checks Bastion, EW, DLP, DB and internet connectivity. See here for details: Teams Protector Health Check | Non-200 response code |
| Admin Portal Health Check | http://<AP-Server-Name>/rest/v1/Heartbeat | Checks that the AP is up and has a DB connection | Timeout or non-200 response code |
| Internal Services AP | http://<host>:1234/api/DBHeartbeat (before version: … http://<host>:1234/api/Heartbeat) | Checks that the API is up and has a DB connection | Timeout or non-200 response code |
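The HTTP health checks above can be polled from any scheduler or monitoring agent that can run a script. The following is a minimal sketch using only the Python standard library; it is not part of SphereShield. The host names in `ENDPOINTS` are hypothetical placeholders for `<server-name>`, `<AP-Server-Name>` and `<host>`, and the function names and the 10-second timeout are assumptions to adapt to your environment.

```python
import urllib.error
import urllib.request

# Hypothetical host names: replace with the real servers in your deployment.
ENDPOINTS = {
    "teams-protector": "http://bastion.example.local/teams_protection/healthcheck/proxytest",
    "admin-portal": "http://ap.example.local/rest/v1/Heartbeat",
    "internal-services": "http://ap.example.local:1234/api/DBHeartbeat",
}


def check_endpoint(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return (healthy, detail); healthy only if the endpoint answers HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, TimeoutError, OSError) as exc:
        # Connection refused, DNS failure or timeout.
        return False, f"unreachable: {exc}"
    return status == 200, f"HTTP {status}"


def run_checks(endpoints=ENDPOINTS, **kwargs):
    """Check every endpoint and return {name: (healthy, detail)}."""
    return {name: check_endpoint(url, **kwargs) for name, url in endpoints.items()}
```

Calling `run_checks()` returns one `(healthy, detail)` pair per endpoint. A wrapper can raise an alarm for any entry whose first element is `False`, which covers both the timeout and the non-200 conditions in the Alarm Metric column.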


DB Content Monitoring

| Name | Response Type | SQL Query | Alarm Threshold | Info |
| --- | --- | --- | --- | --- |
| EW-log-table-count | Number | SELECT COUNT(*) FROM [FEDERATION_POLICY_LOG] WITH (NOLOCK) | 500000 | The number of records in the EW policy log (calculations) is larger than 500k. The Maintenance service should be processing and clearing this table. Consider restarting the Maintenance service or truncating the table. |
| Message-Outbox-Count | Number | SELECT COUNT(*) FROM [MESSAGES_OUTBOX] WITH (NOLOCK) | 1000 | The number of bot messages waiting to be sent is larger than 1000. Consider restarting the Maintenance service. |
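If your monitoring tool cannot run SQL checks natively, the two count queries above can be evaluated by a small script. The sketch below assumes a Python DB-API cursor connected to the SphereShield database, for example one obtained through a SQL Server driver such as pyodbc. The `CountCheck` and `evaluate` names are illustrative and not part of the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountCheck:
    name: str
    query: str
    threshold: int


# Queries and thresholds are taken from the DB Content Monitoring list above.
CHECKS = [
    CountCheck(
        "EW-log-table-count",
        "SELECT COUNT(*) FROM [FEDERATION_POLICY_LOG] WITH (NOLOCK)",
        500_000,
    ),
    CountCheck(
        "Message-Outbox-Count",
        "SELECT COUNT(*) FROM [MESSAGES_OUTBOX] WITH (NOLOCK)",
        1_000,
    ),
]


def evaluate(cursor, checks=CHECKS):
    """Run each count query and return (name, count, threshold) for every breach."""
    alarms = []
    for check in checks:
        cursor.execute(check.query)
        count = cursor.fetchone()[0]
        # Alarm only when the count is strictly larger than the threshold.
        if count > check.threshold:
            alarms.append((check.name, count, check.threshold))
    return alarms
```

An empty list means both tables are within their thresholds. Any returned entry names the check that breached, so the operator can follow the guidance in the Info column.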

Logs Monitoring

We do not recommend monitoring the Bastion or filter logs for errors, because this is inefficient and may produce a large number of false positives.

If log monitoring is nevertheless required, scan the logs for the keyword “[Error-Flag]”, which may be prepended to certain significant errors.
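If you do scan the logs, a keyword match is enough. The following sketch reads a log file line by line and reports the lines that contain the flag. The function name, the UTF-8 encoding and the file path in the usage note are assumptions; check the actual log locations and encoding in your installation.

```python
from pathlib import Path

ERROR_FLAG = "[Error-Flag]"


def find_flagged_lines(log_path, flag=ERROR_FLAG, encoding="utf-8"):
    """Yield (line_number, line) for every line that contains the flag."""
    # errors="replace" keeps the scan going if the log holds undecodable bytes.
    with Path(log_path).open(encoding=encoding, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if flag in line:
                yield number, line.rstrip("\n")
```

For example, `list(find_flagged_lines("bastion.log"))` returns the flagged lines of a hypothetical `bastion.log` together with their line numbers, which can then be forwarded to your alerting tool.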