ZSE – Family Wizard Demo

  Zenoss SE

Dashboard

  • Commerce Executive Dashboard

What does the product do?

Compare and Contrast old vs. new

  • Old Dashboard
    • Maps
    • Other widgets
    • Finite set of options
  • New Dashboard
    • Underlying infrastructure still the same
      • Monitoring via SSH, WinRM, SNMP, API, etc.
    • Backend systems replaced with newer technology
    • UI modernized

How do we get the infrastructure – Modeling

Reach out to loaded devices and ask them

  • What is it like to be a
    • Cisco UCS
    • Linux OS
    • Virtualized environment
  • Within each device build a model or an internal relationship tree
    • This VM depends on this partition which depends on this logical volume which depends on these hard drives.
  • Also internally monitoring
    • Virtualized infrastructure
    • Storage
    • Compute
  • Apply internal models externally
    • This SQL Database running on this Linux OS depends on this part of your virtualized infrastructure which depends on this storage over here and compute over there…
  • Automatically mining that data and letting you know what the relationship tree really is
    • This is the theory of operations for Zenoss
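The relationship tree described above can be sketched as a small dependency graph. This is only an illustration of the idea, not how Zenoss stores its model: the `Model` class and the component names (`vm-1`, `disk-a`, and so on) are hypothetical.

```python
from collections import defaultdict


class Model:
    """A toy relationship model: each component lists what it depends on."""

    def __init__(self):
        self.depends_on = defaultdict(list)

    def add_dependency(self, component, dependency):
        self.depends_on[component].append(dependency)

    def all_dependencies(self, component):
        """Return every component reachable from `component`, depth-first."""
        seen = []
        stack = list(reversed(self.depends_on.get(component, [])))
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.append(node)
            stack.extend(reversed(self.depends_on.get(node, [])))
        return seen


model = Model()
# Internal model of one device: VM -> partition -> logical volume -> disks.
model.add_dependency("vm-1", "partition-1")
model.add_dependency("partition-1", "logical-volume-1")
model.add_dependency("logical-volume-1", "disk-a")
model.add_dependency("logical-volume-1", "disk-b")
# Internal models applied externally: database -> OS -> virtualized infrastructure.
model.add_dependency("sql-db", "linux-os")
model.add_dependency("linux-os", "vm-1")

print(model.all_dependencies("sql-db"))
```

Walking the graph from `sql-db` lists everything the database ultimately relies on, which is the kind of question the relationship tree is meant to answer.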

New Features for Zenoss Cloud

On-Prem and ZaaS

  • Limited to proactive monitoring just discussed
  • To gather data from a new container, Zenoss would first need to be told that the container exists

Zenoss Cloud

  • Ability to stream data in
    • Instantiate small agents that make an API call
    • Send whatever metrics and model data you wish directly from these ephemeral containers.
    • This could be built into a Kubernetes cluster so that as soon as a new container is spun up, it starts feeding this information to Zenoss
    • Example: Pulling in P1 Incidents from ticketing system
      • External server or container making an API call to ServiceNow
      • Gets the count
      • Streams that data back to Zenoss
    • This is not a typical Data Point that Zenoss collects; it is information that is simply sent to us and displayed.
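A minimal sketch of the P1 incident example, assuming a small agent that fetches a count and streams it as a metric. The notes do not give the actual API, so everything here is hypothetical: the metric name `incidents.p1.open`, the JSON field names, the `x-api-key` header, and the URL passed to `stream_metrics` are placeholders to be replaced with values from the real API documentation.

```python
import json
import time
import urllib.request


def build_metric(name, value, dimensions, timestamp_ms=None):
    """Build one metric record; the field names are assumed, not confirmed."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        "metric": name,
        "value": float(value),
        "timestamp": timestamp_ms,
        "dimensions": dict(dimensions),
    }


def stream_metrics(url, api_key, metrics, header_name="x-api-key"):
    """POST a batch of metrics as JSON; returns the HTTP status code."""
    body = json.dumps({"metrics": metrics}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", header_name: api_key},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def report_p1_count(fetch_p1_count, send):
    """Fetch the P1 incident count and hand it to `send` as a metric."""
    count = fetch_p1_count()
    metric = build_metric("incidents.p1.open", count, {"source": "ticketing"})
    send([metric])
    return metric
```

In a real agent, `fetch_p1_count` would wrap the API call to the ticketing system and `send` would wrap `stream_metrics` with the tenant's URL and key. Passing both in as functions keeps the agent easy to test without network access.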

Questions

Are we collecting any telemetry data to improve our product, and if so, is it anonymized at all?

  • Definitions
    • Telemetry data: Data captured by the system then sent back using a “phone home” system that reports statistical data such as usage, latency…
    • Anonymizing: Removing all identifying particulars or details such that the data received is strictly statistical.
  • Yes, we do.
    • Each client’s data can only be accessed within their tenant (URL) or organization
    • We have a separate monitoring template that monitors information such as
      • How many data points are going through the system
      • Whether anything is backing up in the queues
        • These help us find and relieve bottlenecks
        • They also guide which features and functionality we add
      • We can also see which tiles are used most often
        • This helps us create dashboard templates so clients do not have to reinvent the wheel
          • Example: Meraki Template

What are these Agents built on? (Referring to streaming data)

  • You can basically use any technology you wish
    • Shell scripts
    • Python
    • We have a Kubernetes agent built on StatsD
      • StatsD: A very lightweight data streaming agent that separates metric streaming from the application. The application sends data to StatsD over UDP (fire and forget), so an issue with StatsD will not affect your application.
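The fire-and-forget behavior can be sketched with a plain UDP socket and the StatsD line format (`<name>:<value>|g` for a gauge). This is an illustration only, not the Kubernetes agent mentioned above; the metric name is hypothetical, and port 8125 is the conventional StatsD default, which may differ in a given deployment.

```python
import socket


def format_gauge(name, value):
    """Format a gauge in the StatsD line protocol: <name>:<value>|g"""
    return f"{name}:{value}|g".encode("ascii")


def send_gauge(name, value, host="127.0.0.1", port=8125):
    """Send one gauge over UDP without waiting for any reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_gauge(name, value), (host, port))
    except OSError:
        # Fire and forget: a metrics failure must never break the application.
        pass
    finally:
        sock.close()
```

Because UDP has no handshake and no acknowledgement, `send_gauge` returns immediately whether or not a StatsD daemon is listening, which is why a StatsD outage does not slow down or break the application.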

The Resource Manager User Interface

Organized by Device Class

Linux SSH

  • Logical Volumes
  • MySQL Databases
  • Could monitor Docker containers

Upgrades

  • In Zenoss Cloud, upgrades are incremental and largely seamless. They are generally performed every (or every other) Tuesday.
  • Never have to worry about “Am I on the latest version?”
  • Upgrades performed by our SRE (Site Reliability Engineering) Team.
    • If an outage is required, the upgrade maintenance window will be scheduled with the client.
      • OS Updates
      • Security Updates
  • Upgrades are usually performed as a rolling restart, so the service as a whole is never really down.
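As a generic illustration of the rolling restart pattern (not the SRE team's actual procedure, which the notes do not describe), instances are restarted one at a time and the rollout halts if one fails to come back. The `restart` and `is_healthy` callables are hypothetical hooks supplied by the caller.

```python
def rolling_restart(instances, restart, is_healthy):
    """Restart instances one at a time, halting if one fails to come back."""
    restarted = []
    for instance in instances:
        restart(instance)
        if not is_healthy(instance):
            raise RuntimeError(
                f"{instance} did not come back healthy; halting rollout"
            )
        restarted.append(instance)
    return restarted
```

Since only one instance is down at any moment, the remaining instances keep serving requests throughout the upgrade.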

 

Service Impact

17:10

 
