EV Observe - Configure Hosts

Last modified on 2023/11/28 14:15

Hosts are information system components monitored, through services, by a Box or an agent.

Each host:

  • Is associated with a company/site.
  • Is associated with a category, e.g. Wi-Fi terminals, cameras, printers, UPS, etc.
  • Is associated with one or more host templates defined for the category.
  • Inherits the services defined for the host templates. These services are run during host monitoring.
  • Can trigger the sending of notifications when there is a change in status. If the status is not acknowledged by the operations team, this will trigger an escalation to successively higher levels.

Examples

  • The COPCGRE61 server host is associated with the Windows Server template from which it inherits a set of services run during monitoring, such as CPU and RAM monitoring.
  • Notification policy in the event of an incident on the COPCGRE61 server host:
    • Notification timeslots defined:
      • From 5 am to 12 noon for team A
      • From 12 noon to 8 pm for team B
    • An incident occurs on the server:
      • Warning at 10 am: Notification sent to team A.
      • Warning at 1 pm: Notification sent to team B.
      • Warning between 8 pm and 5 am: No team is informed. If the incident is still present at 5 am, a notification will be sent to team A.
  • Escalation policy in the event of an incident on the COPCGRE61 server host:
    • 24/7 notifications sent to the Level 1 On-call contact group.
    • If an incident occurs on the host, notifications will be sent to the On-call contact group every three minutes.
    • Once the warning is acknowledged by a member of the On-call team, notifications will stop and the escalation process will be interrupted.
    • If the warning is still not acknowledged after 15 minutes, this will trigger an escalation to the Level 2 On-call managers contact group who will receive notifications.
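
The notification timeslot example above can be sketched in a few lines of Python. This is only an illustration of the routing logic, not product code: the `TIMESLOTS` table and both function names are hypothetical, and assigning 12 noon to team B (rather than team A) is an assumption, since the example does not say which team owns the boundary.

```python
from typing import Optional

# Hypothetical timeslot table taken from the example above:
# (start hour inclusive, end hour exclusive, team).
TIMESLOTS = [
    (5, 12, "team A"),
    (12, 20, "team B"),
]


def team_to_notify(hour: int) -> Optional[str]:
    """Return the team whose notification timeslot covers `hour` (0-23), or None."""
    for start, end, team in TIMESLOTS:
        if start <= hour < end:
            return team
    return None


def next_notification_hour(hour: int) -> int:
    """Return the first hour, at or after `hour`, at which a team can be notified."""
    for offset in range(24):
        candidate = (hour + offset) % 24
        if team_to_notify(candidate) is not None:
            return candidate
    raise ValueError("no notification timeslot is defined")


print(team_to_notify(10))          # warning at 10 am
print(team_to_notify(13))          # warning at 1 pm
print(team_to_notify(22))          # warning at 10 pm: nobody is informed
print(next_notification_hour(22))  # hour at which a persisting incident is notified
```

A warning raised outside every timeslot returns `None`; `next_notification_hour` then gives the hour at which a still-present incident would first be notified.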

Notes

  • Each host is monitored by a single Box.
  • One host can be associated with several host templates. The templates available will depend on the category associated with the host.
  • You must first configure the notification policy for the host before you can configure the escalation process.
  • Thresholds defined for detecting instability must comply with Nagios syntax.

Best Practice

  • Use the modification wizards or run an import to apply changes to an entire group of hosts. See the procedure.

Example: Define a single notification policy for all servers whose business impact is High, using the Modify the notification policy wizard.

Menu access

Configuration > Hosts > List

Note: To access the host Detail forms, select Monitoring > Monitoring.

Screen description

General information

Host name: Name of the monitored host.

IP/DNS: IP address or DNS name of the host.

Associated with: The company/site associated with the host.

Monitored by: Name of the Box monitoring the host.

Host category: Category associated with the host, used to select the host templates available.

Example: Wi-Fi terminals, cameras, printers, UPS

Template: List of host templates associated with the selected category.

  • You can tick the relevant boxes to select several host templates.

Best Practice: Use the search field below the Template field to filter the list of templates. To select all templates quickly, tick the Select all option.

  • The host will automatically inherit the services defined for each host template selected.

   Certain services may require monitoring account information. If this is the case, you must select the Accounts tab to check that it is correctly configured. Note: When you create a host or modify the list of host templates, the Accounts tab will automatically be refreshed once you save the form.

Business impact: Impact of the host within the corporate information system in the event of failure.

Instruction: User-defined text or automatically clickable link displayed when the status is not OK. This enables the operations team to process the incident faster and more efficiently. See Instruction URLs.

Description: Role of the host within the corporate information system.

Additional information: User-defined text.

Document: Similarly to the instruction, this is used to enter additional information to help speed up processing.

Tag: Used to categorize a host based on a specific list in order to filter the IT infrastructure more effectively when searching for and configuring widgets.

Availability and checks

Information on availability rate:

  • Availability rate: Target availability rate for the host.
  • Availability period: Timeslot during which the availability rate is calculated. This usually corresponds to the SLA availability target.
     

Check properties:

  • Check Timeslot: Timeslot during which the host is monitored and checks are run.
    • This timeslot must cover the entire availability period used for calculating the availability rate.
  • Normal check interval: Interval between two consecutive checks (in minutes).
  • Additional checks: Number of times the check is repeated, when its initial status is not OK, before the first notification is sent.
    • If additional checks are defined, the status to be confirmed (SOFT) corresponds to the initial check status and the confirmed status (HARD) corresponds to the status returned after the last additional check is run.
    • The sending of notifications and the calculation of the availability rate are based on the confirmed status.
    • Interval: Interval between two consecutive additional checks (in minutes).
    • Time before first notification: Time automatically calculated from the number of additional checks and the interval between them.

Example: Additional checks = 4; Interval = 5

  • If an incident is detected during the initial check, the check will be rerun every five minutes, up to a maximum of four times, as long as the status to be confirmed is not OK.
  • The time before the first notification is sent, or before the first confirmed status, will be 20 minutes (4 × 5).
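
The arithmetic and the SOFT/HARD confirmation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: both function names are hypothetical, and stopping the rechecks as soon as one of them returns OK is inferred from the phrase "as long as the status to be confirmed is not OK".

```python
from typing import Sequence


def time_before_first_notification(additional_checks: int, interval_minutes: int) -> int:
    """Delay (in minutes) between the initial non-OK check and the first notification."""
    if additional_checks < 0 or interval_minutes < 0:
        raise ValueError("values must not be negative")
    return additional_checks * interval_minutes


def confirmed_status(initial_status: str,
                     recheck_statuses: Sequence[str],
                     additional_checks: int) -> str:
    """Return the confirmed (HARD) status after the additional checks.

    `initial_status` is the status to be confirmed (SOFT).
    `recheck_statuses` holds the results of the additional checks, in order.
    """
    if initial_status == "OK":
        return "OK"
    status = initial_status
    for status in recheck_statuses[:additional_checks]:
        if status == "OK":
            # Assumption: the host recovered, so no further recheck is needed.
            return "OK"
    # Status returned by the last additional check that was run.
    return status


print(time_before_first_notification(4, 5))
print(confirmed_status("CRITICAL", ["CRITICAL"] * 4, 4))
print(confirmed_status("CRITICAL", ["CRITICAL", "OK"], 4))
```

With the values from the example (4 additional checks, 5-minute interval), the first call yields the 20-minute delay quoted above.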

Check template: Template assigned to the host for running the monitoring check.

Example: Ping template used by the check to verify that the host can be reached on the network

Example: For a virtual host, e.g. Azure PaaS, the Not pingable template is used

Actions

Action template associated with the host, used to perform an action when there is a change in the host status.

Example: Restart a service or run a PowerShell script on the COPCGRE61 server when the status of the server changes from OK (normal operation) to Critical (non-operational).

  • The parameters to be specified depend on the selected action template.
  • Locked monitoring account:

       This field will appear only if monitoring account information is required for running the action template.

    • By default, the action template will use the monitoring account information inherited from the parent site of the host, or alternatively, from a higher-level site or from the company.
    • If a configuration specific to the host is required, the account must be locked. To do this, select Yes to lock the monitoring account and enter the configuration information specific to the type of account.  The values defined will apply to all current and future services associated with the host.

      Example: SNMP authentication credentials for the host different from those inherited from the company

    • To restore the inherited values for the monitoring account, select No to unlock the monitoring account. 

Accounts

List of monitoring accounts required for running services associated with the host.

  • The accounts displayed are automatically updated when a service is added or deleted for the host. You cannot add new ones or delete existing ones manually.
  • By default, the host will use the monitoring account information inherited from the parent site of the host, or alternatively, from a higher-level site or from the company.
  • If a configuration specific to the host is required, the account must be locked.

Example: SNMP authentication credentials for the host different from those inherited from the company

  • The lock icon next to each monitoring account indicates whether or not the account is locked.
    • Closed padlock: Monitoring account locked for the host.
    • Open padlock: Monitoring account defined and locked for the parent site of the host, or alternatively, for a higher-level site or for the company.
    • Red closed padlock: This can mean that the monitoring account must be configured and locked for the host, site or company, or that a monitoring account of the same type must be added for the site or company.
  • Click one of the accounts to display its details.
    • Select Yes to lock the monitoring account for the host. Next, enter the configuration information specific to the type of account.  The values defined will apply to all current and future services associated with the host.
    • To restore the inherited values for the monitoring account, select No to unlock the monitoring account.  The level from which monitoring account information is inherited will appear next to the field.
       

      Example: My Company inheritance account
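
The inheritance rule described in this tab (host first if locked, then parent site, then higher-level sites, then company) can be sketched as a walk up the tree. This is an illustrative model only: the `Node` class, the `resolve_account` function and the `snmp_community` values are hypothetical and do not reflect the product's data model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """A host, site or company in the company tree (hypothetical model)."""
    name: str
    parent: Optional["Node"] = None
    # Set only when the monitoring account is locked at this level.
    locked_account: Optional[dict] = None


def resolve_account(host: Node) -> Tuple[str, dict]:
    """Return (level name, account) for the first level with a locked account."""
    node: Optional[Node] = host
    while node is not None:
        if node.locked_account is not None:
            return node.name, node.locked_account
        node = node.parent
    # Corresponds to the case where the account still has to be configured.
    raise LookupError("no monitoring account is configured at any level")


company = Node("My Company", locked_account={"snmp_community": "company-value"})
site = Node("Grenoble site", parent=company)
host = Node("COPCGRE61", parent=site)

print(resolve_account(host)[0])  # inherited from the company

host.locked_account = {"snmp_community": "host-value"}
print(resolve_account(host)[0])  # locked for the host itself
```

Unlocking the account for the host (setting `locked_account` back to `None`) restores the inherited values, which matches the behavior described for the No option.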

Notifications

Notification policy defined for the host, indicating trigger events and timeslots as well as notification recipients.

Enable notifications: Used to define a notification policy for the host. If you select Yes, you must specify the contextual fields that will appear. You can disable notifications by selecting No.

Fields for defining a notification policy for the host

Notification period: Timeslot during which events occurring on the monitored host will trigger notifications.

  • Events outside this period will not trigger any notification. If the incident is still present when the next notification period is applicable, then a notification will be triggered.

For these events: Type of event that will trigger a notification.

  • Down: Notification sent when the host is down.
  • Up: Notification sent when the host is operating normally again.
  • Unknown: Notification sent when the host status is unknown to monitoring.
  • Unstable: Notification sent when the host is considered to be unstable based on the high and low flapping thresholds defined for detecting instability.
    • The host instability rate is calculated from the last 21 check results stored. It is recalculated each time a monitoring check is run. Older values are weighted less heavily than more recent ones.
    • The host is considered to be unstable when the instability rate exceeds the high flapping threshold.
    • It will once again be considered stable when the instability rate drops below the low flapping threshold.

   Thresholds defined for detecting instability must comply with Nagios syntax.

   When the state of the host is unstable, notifications will be disabled to restrict the number of warnings triggered. They will remain disabled until the state of the host is once again stable.
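
The instability calculation described above can be sketched as a weighted count of state changes over the last 21 results, combined with a high/low threshold pair. This is a rough model under stated assumptions: the linear weights from 0.8 (oldest) to 1.2 (newest) are illustrative values chosen for this sketch, since the actual weighting is not given here, and both function names are hypothetical.

```python
from typing import Sequence

HISTORY_SIZE = 21  # number of stored check results, as stated above


def instability_rate(states: Sequence[str],
                     oldest_weight: float = 0.8,
                     newest_weight: float = 1.2) -> float:
    """Return the weighted percentage of state changes, oldest result first."""
    if len(states) != HISTORY_SIZE:
        raise ValueError(f"expected {HISTORY_SIZE} states")
    transitions = HISTORY_SIZE - 1  # 21 results give 20 possible changes
    weighted_changes = 0.0
    for i in range(transitions):
        if states[i] != states[i + 1]:
            # Linear weighting: older changes count less than recent ones.
            step = (newest_weight - oldest_weight) * i / (transitions - 1)
            weighted_changes += oldest_weight + step
    return weighted_changes / transitions * 100.0


def is_unstable(rate: float, currently_unstable: bool,
                low_threshold: float, high_threshold: float) -> bool:
    """Apply the high/low flapping thresholds with hysteresis."""
    if rate > high_threshold:
        return True
    if rate < low_threshold:
        return False
    return currently_unstable  # between the thresholds: keep the current state


history = ["UP"] * 15 + ["DOWN", "UP", "DOWN", "UP", "DOWN", "UP"]
rate = instability_rate(history)
print(round(rate, 1), is_unstable(rate, False, low_threshold=20, high_threshold=25))
```

Using two thresholds rather than one prevents the host from switching between stable and unstable on every check when the rate hovers around a single limit.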

Best Practice: You can view the instability rate in real time in the General information tab of the host Detail form (Monitoring > Monitoring menu).

Level 1 contact(s) and contact group(s): List of Level 1 contacts and groups to whom notifications should be sent during the notification timeslots specified.

  • Only active contacts and contact groups will appear.

Escalations

   You must first configure the notification policy for the host before you can configure the escalation process.

      See the escalation policy example in the Examples section.

Level 1 escalation: Used to indicate that the notification must be repeated, after the defined number of checks is reached, when the status is not acknowledged by the Level 1 operations team.

  • Level 1 contacts are defined in the Notifications tab.

Level 2 escalation / Level 3 escalation: Used to send notifications to the specified contacts or contact groups when the status is not acknowledged by the lower-level operations team after the defined number of notifications is reached. The notification is then repeated, after the defined number of checks is reached, for as long as the status is not acknowledged.
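
The escalation example given for the COPCGRE61 server (a notification every three minutes, escalation to Level 2 after 15 minutes without acknowledgment) can be sketched as a simple timeline. This is a simplified model, not the product's scheduler: the function name and parameters are hypothetical, time is expressed in minutes rather than in numbers of checks or notifications, and whether Level 1 keeps receiving notifications after the escalation is not stated in this page, so the sketch lists only the group targeted at each step.

```python
from typing import List, Optional, Tuple

LEVEL_1 = "Level 1 On-call"
LEVEL_2 = "Level 2 On-call managers"


def notification_schedule(repeat_minutes: int,
                          escalate_after_minutes: int,
                          horizon_minutes: int,
                          acknowledged_at: Optional[int] = None) -> List[Tuple[int, str]]:
    """Return (minute, contact group) pairs for an incident starting at minute 0."""
    if repeat_minutes <= 0:
        raise ValueError("repeat_minutes must be positive")
    schedule = []
    for minute in range(0, horizon_minutes + 1, repeat_minutes):
        if acknowledged_at is not None and minute >= acknowledged_at:
            break  # acknowledgment stops notifications and interrupts escalation
        group = LEVEL_1 if minute < escalate_after_minutes else LEVEL_2
        schedule.append((minute, group))
    return schedule


# Nobody acknowledges: Level 2 is targeted from minute 15 onwards.
for minute, group in notification_schedule(3, 15, horizon_minutes=18):
    print(minute, group)

# Acknowledged after 7 minutes: only three Level 1 notifications are sent.
print(notification_schedule(3, 15, horizon_minutes=18, acknowledged_at=7))
```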

Relations

Parent and child relationships for the host.

Examples:

  • Relationship between a virtual machine (child host) and an ESX (parent host)
  • Relationship between hosts and a router for monitoring a remote site without a Box. If there is no ping response from the site's router, then the central Box will not collect data from the site's hosts.

Notes:

  • Child hosts must always be monitored by the Box of the parent host.
  • If the status of the parent host is Critical, the status of the child hosts will automatically change to Unknown and notifications will be disabled for the child hosts.
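
The effect of a Critical parent on its child hosts can be sketched as follows. This is an illustrative model only: the `Host` class, the function name and the host names `esx01` and `vm01` are hypothetical, and the sketch deliberately handles direct children only, since behavior for deeper levels is not described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """Minimal host model for illustrating parent/child relations."""
    name: str
    status: str = "OK"
    notifications_enabled: bool = True
    children: List["Host"] = field(default_factory=list)


def apply_parent_status(parent: Host) -> None:
    """If the parent is Critical, mark direct children Unknown and mute them."""
    if parent.status != "Critical":
        return
    for child in parent.children:
        child.status = "Unknown"
        child.notifications_enabled = False


vm = Host("vm01")
esx = Host("esx01", status="Critical", children=[vm])
apply_parent_status(esx)
print(vm.status, vm.notifications_enabled)
```

Muting the children avoids a burst of warnings for hosts that are simply unreachable because their parent is down.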

Procedures

How to create a host

Best Practice: You can run a discovery in the Configuration > Hosts > Discovery menu to detect the hosts present in a range of network IP addresses and simplify the implementation of monitoring.

Step 1: Select the company where you want to implement the new host

1. Go to the Web app.

2. Select the company from the company tree structure.

Notes:

  • The selected company must be associated with a Box.
  • You can create a new company. See the procedure.

Step 2: Create the new host

1. Select Configuration > Hosts > List in the menu.

2. Select the Mode: Box tab or the Mode: Agent tab, depending on whether monitoring is performed via a Box or an agent.

3. Click Add.

4. Select each tab and specify the information on the new host.

5. Click Apply.

The host will be created. It will be visible to the company and its lower-level sites.

    Certain services inherited from host templates may require monitoring account information. If this is the case, you must select the Accounts tab to check that it is correctly configured.

Step 3: Set up monitoring for the new host

1. Generate the Box configuration to ensure that the new host is taken into account.

  • Select Configuration > General > Loading in the menu.
    All of the Boxes you are authorized to access as administrator and whose configuration is not up-to-date will appear.
  • Click Apply.
  • The Box configuration will be updated.
  • The monitoring of the new host will start on the associated Box.
  • Notifications will be sent as defined in the notification policy.

2. Check that monitoring data for the host is correctly reported in the Box in the Monitoring > Monitoring menu.
 

Step 4: Define the configuration specific to host services

When the initial check is run on the host, the status of some services associated with the host may be Unknown. These services require a specific configuration that can only be identified after an initial check is run.

Example: Network traffic must be configured with an interface number. This number is known only after the initial check is run because interface numbers vary depending on the type of host, model and operating system.

1. Select Monitoring > Monitoring in the menu.

2. Select the Services item type and the Unknown status and click Search.

3. Select each service associated with the host whose status is Unknown and complete its configuration.

4. Generate the Box configuration to ensure that the modifications to each service are taken into account. You can do this in the Configuration > General > Loading menu.

How to apply changes to multiple hosts at the same time

Best Practice: You can also run an import. See the procedure.

1. In the company tree, select the parent company of the hosts you want to modify.

2. Select Configuration > Hosts > List and select the Mode: Box tab or the Mode: Agent tab, depending on whether monitoring is performed via a Box or an agent.

3. Select the hosts to be modified.

4. Click More in the toolbar and select the wizard you want.

5. Specify the information specific to the wizard.

6. Click Apply.

The modifications will be applied to all of the selected hosts.

7. Generate the Box configuration to ensure that the modifications are taken into account for each host. You can do this in the Configuration > General > Loading menu.
