Zenoss Monitoring II

Day 1

  • Video
    • Link: https://zenoss.zoom.us/rec/share/8It3IwK8EQDqhaP7zOgudyKEOE0pbaSFapwIkZFLOrmjp88YGJZ6TzQT1y1kew6S.4dRbOfbs9UmwHG_R
    • p&xdK1G1

Zenbatchload

0:40:00

  • Used to add or modify devices
  • Device Classes are created automatically

Configuration Syntax

# Comment
/Devices/Device/Class
host1.name.tld | IP.ADD.RE.SS [ option1=NUMBER, option2='String', … ]
host2.name.tld | IP.ADD.RE.SS [ option1=NUMBER, option2='String', … ]

Setting Options

  • Any zProperty
    • zSnmpCommunity …
  • setManageIp – Specify a device’s IP
  • setPerformanceMonitor – Define the device’s collector
  • Non-numeric values must be enclosed in matching quotes
  • Lists must be enclosed in square brackets – ['x1', 'x2', 5]
  • Devices are modeled by default unless --nomodel is given
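
To illustrate the syntax and quoting rules above, here is a minimal Python sketch that renders a zenbatchload file from plain data. The helper names (format_options, render_batchload) and the sample hosts are hypothetical and not part of Zenoss; repr() is used only because it happens to quote strings and bracket lists the way the file expects.

```python
def format_options(options):
    """Render a dict of options as 'name=value, name=value'."""
    # repr() quotes strings and brackets lists; numbers are left bare.
    return ", ".join("%s=%r" % (name, value) for name, value in options.items())


def render_batchload(device_class, devices, class_options=None):
    """Return zenbatchload text for one device class.

    devices is a list of (hostname_or_ip, options_dict) pairs.
    """
    header = device_class
    if class_options:
        header += " " + format_options(class_options)
    lines = [header]
    for host, options in devices:
        line = host
        if options:
            line += " " + format_options(options)
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_batchload(
        "/Devices/Server/Linux",
        [
            ("host1.name.tld", {"zSnmpCommunity": "public", "zSnmpPort": 161}),
            ("10.10.10.77", {}),
        ],
    ))
```

The output can be saved to a text file and loaded with zenbatchload as described under "Run the file".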

Example File

pdf Pg. 43

# Start of zenbatchload file

# Create the parent device class and set the zSnmpCommunity zProp to 'underwriting'
/Devices/Server/Linux zSnmpCommunity='underwriting'

/Devices/Server/Linux
dns.hypothetical.loc

/Devices/Server/Linux/WordPress
dns.hypothetical.loc

/Devices/Server/Linux/MySQL
mysql.hypothetical.loc

Run the file

  • Place the file FILENAME.TXT in a world-writable folder on the CC Master
  • cd to that folder
  • serviced service shell zope
  • cd /mnt/pwd
  • zenbatchload FILENAME.TXT

Results

# Last 2 lines of output
INFO zen.BatchDeviceLoader: Processed 3 of 3 devices, with 0 errors
INFO zen.BatchDeviceLoader:  Unable to process 0 entries

zenbatchload --sample_configs

#
# Example zenbatchloader file (Groups, Systems Locations, Devices, etc.)
#
# This file is formatted with one entry per line, like this:
#
# /Devices/device_class_name Python-expression
# hostname Python-expression
#
# For organizers (for example, the /Devices path), the Python-expression
# is used to define defaults to be used for devices listed
# after the organizer. The defaults that can be specified are:
#
# * loader arguments (use the --show_options flag to show these)
#
# * zProperties (from a device, use the 'Configuration Properties'
# menu item to see the available ones.)
#
# NOTE: new zProperties *cannot* be created through this file
#
# * cProperties (from a device, use the 'Custom Properties'
# menu item to see the available ones.)
#
# NOTE: new cProperties *cannot* be created through this file
#
# The Python-expression is used to create a dictionary of settings.
# device_settings = eval( 'dict(' + python-expression + ')' )
#

# Defining groups
/Groups/Admin
/Groups/Support

#Defining systems
/Systems/Production
/Systems/Staging

# Defining locations
/Locations/Canada address="Canada"
/Locations/Canada/Alberta address="Alberta, Canada"
/Locations/Canada/Alberta/Calgary address="Calgary, Alberta, Canada"

# If no organizer is specified at the beginning of the file,
# defaults to the /Devices/Discovered device class.
device0 comments="A simple device"
# All settings must be seperated by a comma.
device1 comments="A simple device", zSnmpCommunity='blue', zSnmpVer='v1'

# Notes for this file:
# * Oraganizer names *must* start with '/'
#
/Devices/Server/Linux zSnmpPort=1543
# Python strings can use either ' or " -- there's no difference.
# As a special case, it is also possible to specify the IP address
linux_device1 setManageIp='10.10.10.77', zSnmpCommunity='blue', zSnmpVer="v2c"
# A '\' at the end of the line allows you to place more
# expressions on a new line. Don't forget the comma...
linux_device2 zLinks="<a href='http://example.org'>Support site</a>", \
zTelnetEnable=True, \
zTelnetPromptTimeout=15.3

# A new organizer drops all previous settings, and allows
# for new ones to be used.
/Devices/Server/Windows zWinUser="administrator", zWinPassword='fred'
# Bind templates
windows_device1 zDeviceTemplates=[ 'Device', 'myTemplate' ], rackSlot=1
# Override the default from the organizer setting.
windows_device2 zWinUser="administrator", zWinPassword='thomas', setProdState=500, rackSlot=2
settingsDevice setManageIp='10.10.10.77', setLocation="123 Elm Street", setSystems=['/mySystems'], setPerformanceMonitor='remoteCollector1', setHWSerialNumber="abc123456789", setGroups=['/myGroup']
# Apply custom schema properties (c-properties) to a device
windows_device7 cDateTest='2010/02/28'

# If the device or device class contains a space, then it must be quoted (either ' or ")
"/Server/Windows/WMI/Active Directory/2008"
windows_device_3 setTitle="Windows AD Server 1", setHWTag="service-tag-ABCDEF", setPriority=2

# Now, what if we have a device that isn't really a device, and requires
# a special loader?
# The 'loader' setting requires a registered utility, and 'loader_arg_keys' is
# a list from which any other settings will be passed into the loader callable.
#
# Here is a commmented-out example of how a VMware endpoint might be added:
#
#/Devices/VMware loader='vmware', loader_arg_keys=['host', 'username', 'password', 'useSsl', 'id']
#esxwin2 id='esxwin2', host='esxwin2.zenoss.loc', username='testuser', password='password', useSsl=True

# The following are wrapper methods that specifically set properties on a device:
# setManageIp
# setPerformanceMonitor
# setTitle
# setHWTag
# setHWSerialNumber
# setProdState
# setPriority
# setGroups
# setSystems

Event Handling

Deduplication

  • Events are deduplicated on the fingerprint Device|Component|Event Class|Event Key|Severity|Summary
    • Event Class can be empty
    • Summary is omitted if Event Key is present.
  • Aged events are eligible for Deduplication.  Clear, Closed and Archived events are not.
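
A minimal sketch of how the deduplication key above could be built, assuming events are plain dicts. The key names (device, component, eventClass, eventKey, severity, summary) are chosen for illustration, and this is not the actual Zenoss code.

```python
def dedup_fingerprint(event):
    """Build the deduplication key from an event dict."""
    fields = [
        event.get("device", ""),
        event.get("component", ""),
        event.get("eventClass", ""),  # may be empty
        event.get("eventKey", ""),
        str(event.get("severity", "")),
    ]
    # The summary only takes part when there is no event key.
    if not event.get("eventKey"):
        fields.append(event.get("summary", ""))
    return "|".join(fields)


if __name__ == "__main__":
    a = {"device": "web1", "eventClass": "/App", "eventKey": "disk", "severity": 4, "summary": "90% full"}
    b = dict(a, summary="91% full")
    # Same event key, so the differing summaries do not matter.
    print(dedup_fingerprint(a) == dedup_fingerprint(b))
```

Two events with the same fingerprint would be counted as repeats of one event rather than stored separately.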

Auto-Clear Correlation

  • If the Component UUID field is present
    • ComponentUUID
    • eventClass
    • eventKey
  • If the Component UUID field is NOT present
    • Device
    • Component (can be blank)
    • eventKey (can be blank)
    • eventClass
  • zEventClearClasses is set on the clear event’s event class and lists the other classes those clear events are allowed to clear.
    • Start events will clear Stop events.
      • zEventClearClasses = /App/Stop
    • Do not specify the Start class in the Stop event’s zEventClearClasses
    • This property can accept a list of classes these events can clear.
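
The matching rules above can be sketched as follows. This is an illustration only: events are plain dicts with hypothetical key names, and it assumes a clear event always covers its own event class in addition to whatever zEventClearClasses lists.

```python
def clear_key(event, event_class):
    """Correlation key for an event, evaluated for the given event class."""
    if event.get("componentUUID"):
        return (event["componentUUID"], event_class, event.get("eventKey", ""))
    return (
        event.get("device", ""),
        event.get("component", ""),  # can be blank
        event.get("eventKey", ""),   # can be blank
        event_class,
    )


def would_clear(clear_event, open_event, clear_classes=()):
    """True if clear_event would auto-clear open_event."""
    # Assumption: the clear event's own class is always included;
    # clear_classes stands in for zEventClearClasses.
    target = clear_key(open_event, open_event["eventClass"])
    classes = [clear_event["eventClass"]] + list(clear_classes)
    return any(clear_key(clear_event, cls) == target for cls in classes)


if __name__ == "__main__":
    start = {"device": "web1", "component": "app", "eventClass": "/App/Start"}
    stop = {"device": "web1", "component": "app", "eventClass": "/App/Stop"}
    print(would_clear(start, stop, clear_classes=["/App/Stop"]))
    print(would_clear(start, stop))
```

Without /App/Stop in the list, the Start event has nothing in common with the Stop event's class and leaves it open.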

Event Suppression

  • Requirements
    • Must first be enabled via the zL2SuppressIfPathsDown zProp on the device or device class.
    • Requires that the zenoss.snmp.ClientMACs modeler plugin is enabled and functioning on all network devices in the path.
  • When a device is modeled, an NMAP is also performed and a network map is discovered.
  • If a device is found to be behind a switch, router or some other device, and that device is found to be down, then events from the devices behind it are ‘Suppressed’.

Event Mappings

02:57:00

  • Unknown events are compared to Event Mappings.  If a matching Event Mapping is found, the event is sent to that class.
  • Only applied to /Unknown events
    • This generally means “unsolicited” events because we did not actively go out and get them
    • Exception might be events provided via APIs.
  • Normal monitoring does not make /Unknown events.  These event classes are pre-defined in the monitoring templates.
  • Unknown events come from
    • Trap
      • If the MIB is installed, the name of the trap becomes the Event Key
    • Syslog messages
    • Email
  • Uses the Event Class Key to determine how to match.

Event Thresholds

3:42:00

  • Minimum Only: Event created if the value < Minimum.
  • Maximum Only: Event created if the value > Maximum.
  • Minimum and Maximum same: Event created if value != Min and Max
  • Minimum < Maximum: Event created if value < Minimum or value > Maximum
  • Minimum > Maximum: Event created if value > Maximum AND value < Minimum. (event on anything in between)
    • This is good for “Warning” type thresholds where there is another threshold for Critical
    • Disk Usage Warning Min=95 Max=85
    • Disk Usage Critical Max=95
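
The five cases above reduce to a few comparisons. The function below is a sketch of that logic, not the Zenoss threshold code; the name violates and its parameters are made up for illustration.

```python
def violates(value, minimum=None, maximum=None):
    """Return True if the value should raise a threshold event."""
    if minimum is not None and maximum is not None and minimum > maximum:
        # Inverted range: alert only on values strictly between max and min.
        return maximum < value < minimum
    if minimum is not None and value < minimum:
        return True
    if maximum is not None and value > maximum:
        return True
    return False


if __name__ == "__main__":
    # Warning band from the notes: Min=95, Max=85
    for usage in (80, 90, 97):
        print(usage, violates(usage, minimum=95, maximum=85))
```

With Min=95 and Max=85, only the 90 reading falls inside the warning band; 80 is fine and 97 is left for a separate, higher threshold to catch.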

Event Transforms

03:27:00

  • Applied by zeneventd.
  • If zeneventd is bottlenecking, the RabbitMQ queue will start growing.

Heartbeat Events

02:48:00

  • Requires a heartbeat to be received before it starts working
  • If no additional heartbeats are received within the timeout value, an event is generated.
  • Advanced > Events > Clear heartbeat events
    • Dangerous.  If a daemon is down and you clear these events, no further heartbeat events will be received.
    • Good if you started a daemon you did not want and then stop it. Use this to clear those events.
serviced service attach mariadb-events
mysql -uroot
use zenoss_zep;
show tables;
select * from daemon_heartbeat;
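
The timeout behaviour described above can be sketched in a few lines. The record keys used here (monitor, daemon, last_time, timeout_seconds) are assumptions for illustration and may not match the daemon_heartbeat columns exactly.

```python
import time


def overdue_heartbeats(heartbeats, now=None):
    """Return (monitor, daemon) pairs whose heartbeat has timed out.

    heartbeats is a list of dicts with hypothetical keys:
    monitor, daemon, last_time (epoch seconds), timeout_seconds.
    """
    now = time.time() if now is None else now
    overdue = []
    for hb in heartbeats:
        # Only daemons that have sent at least one heartbeat have a record,
        # so a daemon that never reported can never become overdue.
        if now - hb["last_time"] > hb["timeout_seconds"]:
            overdue.append((hb["monitor"], hb["daemon"]))
    return overdue


if __name__ == "__main__":
    records = [
        {"monitor": "localhost", "daemon": "zenping", "last_time": 1000, "timeout_seconds": 900},
        {"monitor": "localhost", "daemon": "zenprocess", "last_time": 1900, "timeout_seconds": 900},
    ]
    print(overdue_heartbeats(records, now=2000))
```

Clearing heartbeat events corresponds to emptying the list: nothing is left to time out until the daemon reports again, which is why doing so for a daemon that is down hides the problem.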

 

Network Mapping

  • https://help.zenoss.com/in/zenpack-catalog/open-source/layer2
  •  Requirements:
    • Network devices must be monitored by Zenoss
    • Layer 3 – Ping based. IP Addresses
    • Layer 2 – MAC Addresses
      • Must have CDP (Cisco Discovery Protocol) or LLDP (Link Layer Discovery Protocol, the vendor-neutral IEEE standard) enabled to provide information about its neighbors.
      • These use the CISCO-CDP-MIB and LLDP-MIB
      • Support for these protocols is provided by the Layer2 ZenPack

 

Day 2

Video:

  • https://zenoss.zoom.us/rec/share/I4MbuDWD0jd-SpFD9EDUXINFLUIGmH0_PJ9i9Fkd67bqQa_u0NjTbdSjfP6O-s21.dO3WRCjbMAVe1frx
  • eW7rP%m

Overview of Day 1

Modeling vs. Monitoring

IP Services

1:10:00

  • Network services that respond on the port defined by an entry in the Global IP Services list
    • Opens a connection to the port and looks for a response.
  • Most are NOT monitored by default
  • UDP services are detected but cannot be monitored
    • Connectionless.  Point and shoot but don’t check if you hit.
  • TCP services are detected and can be monitored if they bind to
    • an external interface address (not localhost / 127.x.x.x)
    • or all available network interfaces (0.0.0.0)
  • A send string / expect regular expression can be specified for TCP services
  • Cautions
    • This is a very low-level check.  If the port can be monitored by other means (application monitoring), IP Services is not really required.
    • Some applications treat this test as a failed login.  After several perceived “failed logins” the application may block all login attempts, including legitimate ones (MySQL?), so use it only if required and no other means is available.
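
For a feel of what a TCP check with a send string and an expect regular expression involves, here is a small standalone sketch. It is not how Zenoss implements IP service monitoring; check_tcp_service and its parameters are hypothetical.

```python
import re
import socket


def check_tcp_service(host, port, send=None, expect=None, timeout=5.0):
    """Return True if the port accepts a connection and, when an expect
    pattern is given, the first response matches it."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            if send is not None:
                conn.sendall(send.encode("ascii"))
            if expect is None:
                return True  # being able to connect was enough
            data = conn.recv(4096).decode("ascii", errors="replace")
            return re.search(expect, data) is not None
    except OSError:
        # Connection refused, host unreachable, or timed out.
        return False
```

A call such as check_tcp_service("mail.example.com", 25, expect=r"^220") would connect and look for an SMTP greeting; the host name is a placeholder. The caution about perceived failed logins applies equally to a script like this.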

Regular Expressions

1:23:00

https://regex101.com

  • Alphanumeric characters match themselves
    • zebra = zebra, the zebra and zebras
  • . (dot) matches any single character
    • m.st = mist and most but NOT mst
  • * (asterisk) matches zero or more repetitions of the preceding character (so .* matches any string)
    • <.*> = <p>, </p> and <>
    • hap*y = hapy, happy, happppy but not hapxy
  • […] (character set) matches any single character within the brackets
    • [dfc]og = dog, fog and cog but not jog or og
    • Can include a range
    • [a-z]og will also match jog, but not og or Dog
    • dev[1-5] = dev1 and dev2 but not dev0, dev6 nor deva
  • \s (any single white space)
    • hey\sthere = hey there, hey\tthere (tab) and hey\nthere (new line)
  • \w (word character)
    • \w = a, B, 1 and _ but not !
  • \b (word boundary) The position between a word character and a non-word character.
    • Start of string counts!
    • \bgrep\b = grep, grep! /grep/ but not fgrep or greps
  • ^ (caret) = Beginning of string
    • ^grep = grep this and greppers! but not You can't grep this!
  • [^…] (Character set starting with a caret)
    • dev[^1-5] = dev0 and dev6 but not dev1, dev3 nor dev5
  • $ (Dollar sign) End of string
    • zebra$ = my zebra and zebra but not zebras
  • ? (Question mark) after a quantifier makes the match lazy (as short as possible)
    • .*?mysqld stops at /usr/lib/mysqld in /usr/lib/mysqld --myfake=/var/log/mysqld instead of running on to the last mysqld
  • | (vertical bar / pipe symbol) Logical OR
    • answer yes|no = answer yes and answer no but not answer ok
  • (…) (parentheses) Used for grouping or capturing
    • (red|green|blue) = red, green or blue but not yellow; the matching string is stored as \1 (the number increments with each additional group)
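
Several of the examples above can be checked directly with Python's re module, which is close enough to what regex101 shows in its Python flavor. The table and helper below are just a convenient way to run them; unexpected_results is a made-up name.

```python
import re

# (pattern, strings that should match, strings that should not)
EXAMPLES = [
    (r"m.st", ["mist", "most"], ["mst"]),
    (r"hap*y", ["hay", "hapy", "happy"], ["hapxy"]),
    (r"[a-z]og", ["dog", "jog"], ["og", "Dog"]),
    (r"dev[^1-5]", ["dev0", "dev6"], ["dev1", "dev5"]),
    (r"hey\sthere", ["hey there", "hey\tthere", "hey\nthere"], ["heythere"]),
    (r"\bgrep\b", ["grep", "grep!", "/grep/"], ["fgrep", "greps"]),
    (r"^grep", ["grep this", "greppers!"], ["You can't grep this!"]),
    (r"zebra$", ["my zebra", "zebra"], ["zebras"]),
]


def unexpected_results(examples=EXAMPLES):
    """Return (pattern, text) pairs that did not behave as listed."""
    failures = []
    for pattern, matching, non_matching in examples:
        for text in matching:
            if re.search(pattern, text) is None:
                failures.append((pattern, text))
        for text in non_matching:
            if re.search(pattern, text) is not None:
                failures.append((pattern, text))
    return failures


COMMAND = "/usr/lib/mysqld --myfake=/var/log/mysqld"
lazy = re.match(r".*?mysqld", COMMAND).group(0)    # stops at the first mysqld
greedy = re.match(r".*mysqld", COMMAND).group(0)   # runs on to the last mysqld
color = re.search(r"(red|green|blue)", "a green light").group(1)  # group 1

if __name__ == "__main__":
    print(unexpected_results())
    print(lazy)
    print(greedy)
    print(color)
```

One thing worth knowing about the pipe: answer yes|no reads as "answer yes" OR "no", so it also matches a bare "no". Write answer (yes|no) to keep the prefix on both branches.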

Process Monitoring

1:54:00

  • Infrastructure > Processes
  • Organized by Service Name (Apache, MySQL…)
  • Sub-organized by Process Name (apache2, mysqld…)
  • Processes are pulled by zenprocess
    • snmp?
  • Primary reason for Process Sets was for Oracle Databases.

Matching Rules

Compares the process list pulled from the device to the list of defined processes, looking for matches using regex.

  • Include processes like: Expression you are looking for
    • /mysqld\s, /mysqld[ ]
  • Exclude processes like: Additional expressions you are NOT looking for
    • “If you find the above expression, ignore it if it also contains this…”
    • \b(vim|tail|grep|tar|cat|bash)\b
  • Replace command line text: Used with ‘With’ field below.  Allows you to use a Regex to replace the full command line with something easier to read.
    • /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=...
    • Replace command line text: `(^/.*?/mysqld) .*`
  • With:
    • \1
    • Result: /usr/libexec/mysqld
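
The include / exclude / replace sequence can be tried out in Python before it goes into Zenoss. The function below is a sketch of the order of operations described above, not the zenprocess implementation; match_process and its parameter names are hypothetical, and the sample command lines are made up.

```python
import re


def match_process(cmdline, include, exclude=None, replace=None, with_text=None):
    """Return the display name for a matching command line, or None."""
    if not re.search(include, cmdline):
        return None  # not the process we are looking for
    if exclude and re.search(exclude, cmdline):
        return None  # matched, but it is an editor, grep, etc.
    if replace:
        # 'with_text' plays the role of the 'With' field (e.g. r"\1").
        return re.sub(replace, with_text or "", cmdline)
    return cmdline


if __name__ == "__main__":
    rules = dict(
        include=r"/mysqld\s",
        exclude=r"\b(vim|tail|grep|tar|cat|bash)\b",
        replace=r"(^/.*?/mysqld) .*",
        with_text=r"\1",
    )
    for cmd in (
        "/usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql",
        "grep /mysqld /tmp/ps.txt",
        "/bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql",
    ):
        print(repr(match_process(cmd, **rules)))
```

The trailing \s in the include pattern is what keeps mysqld_safe out: the character after "mysqld" there is an underscore, not whitespace.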

Process Count Threshold

Set the number of processes you would expect to see

  • Minimum
    • Often set to 1.
    • If you are running multiple instances (possibly 2 or more instances of MySQL on different ports) then set to the minimum you expect.
    • Good example of multiple processes running is httpd.
  • Maximum
    • Often left blank (infinity) to allow for worker processes
    • If a process goes crazy and launches 100 workers, you might set this to create an event.

Configuration Properties

Inherited from root “Processes” organizer.  These can be overridden by a device’s Configuration Properties

  • Set Local Value: Do you want to monitor this process?
  • Send Event on Restart (zAlertOnRestart)
    • Determined by looking at the process ID.  If it changes between monitoring cycles, the process must have restarted.
    • Default Yes
  • Failure Event Severity (zFailSeverity)
    • Set the severity of failed events
    • Default Error
  • Lock Process Components (zModelerLock)
    • Prevent this process from being deleted if it is not found during the next modeling cycle
    • Default “Unlocked” (OK to delete)
  • Send event when action is blocked (zSendEventWhenBlockedFlag)
    • Send an event if the modeler attempts to delete a locked process that was not found?
    • Default No

2:14:30

Notes:

  • mysqld_safe
    • Shell script that starts mysqld and if mysqld stops, attempts to restart it.
    • Does NOT process database calls, so should not be counted as a mysqld process.

Application Process Monitoring Examples

2:49:00 / 3:04:00

  • Requires ZenPacks: http://zenpacks.zenoss.com
  • Must be configured both on the application side and on Zenoss side.
    • Both are outlined in the link above.
  • You will need to bind the template to the device class.

Best Practices

  • Whenever possible restrict access to the application monitoring to specific hosts or networks.
  • Recommended to use a device class for devices monitoring this application

Additional Application Process Monitoring notes

  • Most applications are bound at the device level
  • MySQL may have multiple instances, so it will be added as a component.
    • These applications must have the application modeler plugin added
    • You may also need to add the user credentials
      • Recommend creating a special user/password for these

Virtual Web Host Monitoring

3:32:00

  • Cannot have multiple devices with same IP address
  • Solution: Create placeholder devices (aka pseudo-devices) with hostnames that do not resolve in DNS
    • auto.hypothetical.loc becomes site-auto.hypothetical.loc

Setup

  • Create a subclass under HTTP that inherits the HttpMonitor Template
  • Remove the Device template
  • Bind the DnsMonitor Template
  • Create the pseudo-devices in the new class
  • Create the following Custom Properties
    • cDnsExpectedIP
    • cHostName
    • cHttpExpectRegExp
  • Create local copies of the DNS and Http templates
    • These will be modified to use custom properties vs configuration properties
    • Technique here is out of date.  Can create local bindings directly on the Class Details page now.
    • Advanced > Monitoring Templates > Copy / Override Template > Select Http/AgentBlogs
  • Edit the templates to use custom configuration properties
    • DNSMonitor
      • Host Name: ${dev/cHostName}
      • Expected IP Address: ${dev/cDnsExpectedIP}
    • HttpMonitor
      • Host Name: ${dev/cHostName}
      • IP Address or Proxy Address: <Delete to force to resolve the cHostName>
      • Regular Expression: ${dev/cHttpExpectRegExp}
  • Update the Custom Properties per devices as required

Twill

4:33:00

Overview

  • Twill is a scripting language designed for testing web sites
  • Allows you to simulate a browser session: follow links, fill out and submit forms, check for specific content, etc.
  • The approach is often referred to as monitoring with synthetic web transactions.
  • Monitoring with Twill often starts with a trial-and-error process using the interactive Twill shell.
  • Note: Twill does NOT interpret Javascript.

Run the shell

serviced service shell zope
su - zenoss
# cd /opt/zenoss/ZenPacks/ZenPacks.zenoss.ZenWebTx-3.0.2-py2.7.egg/
cd /opt/zenoss/ZenPacks/*ZenWebTX*
cd ZenPacks/zenoss/ZenWebTx
# All of the required libraries are stored here

#Set the Python Path
export PYTHONPATH=$PYTHONPATH:$PWD
echo $PYTHONPATH

#Run the CLI
python bin/twill-sh

-= Welcome to twill! =-
>>

Commands

  • <Ctrl><D> Exit the shell and save your commands
    • cat .twill-history
    • Shows all commands run during the last session.
  • find TEXT – Searches the page for the existence of TEXT
  • follow 'link text' – follows a link
  • formvalue FormName_Or_Number FormField FormFieldValue
    • FormName recommended in case a new form is inserted above the desired form
    • Example:
      • formvalue loginform user_name admin
      • formvalue loginform user_pass Zenny123
      • submit
  • go SUB.DOMAIN.TLD
  • help <command>
  • showforms: Show all forms on the current page

Example Script

Note: The script is evaluated as TALES before being sent to Twill, so you can substitute TALES expressions in the code.

go http://${dev/cHostName}
follow 'Log in'
formvalue loginform user_login admin
formvalue loginform user_pass Zenny123
submit
find Dashboard
follow 'Log Out'
find 'You are now logged out.'
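
To preview what Twill will actually receive, the substitution of expressions such as ${dev/cHostName} can be imitated with a regular expression. This is only a stand-in for real TALES evaluation; expand_tales and the device dict are hypothetical, and only the simple ${dev/NAME} form is handled.

```python
import re


def expand_tales(script, device):
    """Replace ${dev/NAME} with device[NAME] in a Twill script."""
    def lookup(match):
        return str(device[match.group(1)])
    return re.sub(r"\$\{dev/(\w+)\}", lookup, script)


if __name__ == "__main__":
    script = "go http://${dev/cHostName}\nfollow 'Log in'"
    print(expand_tales(script, {"cHostName": "auto.hypothetical.loc"}))
```

The expanded text can be pasted into the interactive Twill shell to test a script against one specific site.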

Create a Twill monitoring template

  • Advanced > Monitoring Templates > [ + ] > Add Template
    • Name
    • Path = Device Class
  • Use left menu pane to locate the new template.
  • Data Sources > [ + ] > Add Data Source
    • Type: WebTX (Web Transactions)
  • Twill Commands: Paste in Twill commands
  • Save (Required for testing)

Metrics Collected

  • <name>.available
  • <name>.totalTime

Final Step – Bind the template to the Device Class

Hints

  • Errors created using the Twill CLI will result in events in Zenoss

Appendix A: Additional Information

ZenPacks

  • Commercial: Available to anyone with a commercial license (any support level)
  • Open Source: Available to commercial and community users.
    • Written and supported by Zenoss
  • Community:
    • Written by the Zenoss community or perhaps a vendor
    • Not supported by Zenoss
    • Good chance they are out of date.
  • Subscription: Available with a subscription only.  Requires commercial license.
    • Written by Zenoss Professional Services.
