REM Monitoring Demo

September 29, 2010

The first REM Monitor demo is now up:

http://ge01f.com/test/system_monitor/monitor

This demo shows monitoring of remote hosts via ping and HTTP.  Host monitoring is off because graphing has not yet been optimized enough: with hundreds of graphs available to render, it would take up the majority of the machine’s resources.

Hosts can be added or removed, and new HTTP alerts can be added.

More to come…  SNMP host monitoring, and SLA-based alerts to role accounts with shifts and prioritized contact lists, with a delay between escalations pending alert acknowledgment.  Among other things.

You can see the old Local System Monitoring Demo, showing a set of host monitors, here:

http://ge01f.com/sys/system_monitor/


GUI Editor Teaser Image

September 22, 2010

I am about a week away from releasing the “REM Monitoring” package, which will be the first of the product releases from the REM suite.

Currently all the local and remote-local-collection monitoring is done, and I have been wrapping up a GUI editor for easy creation of monitoring, dashboard and other general purpose web development for REM tools.  The GUI relies on jQuery and a lot of awesome UI plugins developed by the community, which I have integrated and added to a common widget rendering Python library called jquery_drop_widget.

The next post should have the Alpha release of the REM Monitoring package and initial documentation on how to use it, including the GUI page and widget layout system.

Here is a screen shot of a full page layout getting a new widget created:


Another Demo: No Relay Chat

August 25, 2010

As another demonstration of how the unidist library can be put to use, primarily using shared state and message queues, I wrote an IRC-like web based chat server named No Relay Chat.

You can download the No Relay Chat Demo here, at procblock package’s Google Code page.  You’ll also need the latest procblock, which is on that page.

Below is a screen shot, but here is an online demo. Log in and make some channels.

There is a bug that appears intermittently with the New Channel pop up dialog; close it and try again, or reload the page, if you’re still interested at that point.  Closing a channel also intermittently seems not to unsubscribe; I may be missing a JS error in some case, or there could be a race condition setting channels again after unsubscribe, or something similar.  The communications have always been solid and reliable, so I’m guessing it’s something like this.  I’ll clean those up over the next few days.

At a later date I’ll wrap this up into a full page with IM, ops, moderation, invite/secret channels, and other IRC goodness so it is more useful, and then I’ll package it up for easy installation via RPM/MSI/etc.


Local System Monitoring Demo

August 22, 2010

I have the first draft of the local system monitoring demo (single node) ready: It can be viewed here.

I’ll be fleshing this out more after I finish the monitoring for Linux, fix the Disk I/O to update properly in FreeBSD and OS X, and fix the View Internals for the RRDs on RRDs that have multiple targets per type.  Then I’ll add some formatting for the sections, make the list of items dynamic so you can turn uninteresting ones off, and ship that demo.  After a few more demos to finish testing out all the different packages that make up Red Eye Monitor (REM), I will turn this into a real monitoring software install that does good things out of the box and works on single nodes or multiple nodes.


dropSTAR released as Python library

August 19, 2010

dropSTAR has now been released as a stand-alone Python library for creating an HTTP server.

I’ll be putting together RPM/Deb/make packages to provide a more functional install for those interested in the functionality rather than the packages themselves.  These will come with installers for modules on the dropSTAR and procblock platform, which will allow services to be packaged and downloaded separately, and will stay focused on providing functionality, not a lower level development framework.

More to come!


dropSTAR (webserver) procblock demo

August 18, 2010

Now for a demo with a bit more teeth.  This will soon be released as a stand-alone open source web server package, named dropSTAR (Scripts, Templates and RPC).  It is designed to make it easy to drop in new dynamic pages, and is focused on system automation instead of the standard end-user applications that normal dynamic web servers are intended to serve.  It can do that kind of work too, but it’s not optimized for millions of page loads; it’s optimized to take as little time as possible to make and modify web pages for system administration use, and to provide RPC for dynamic pages or system automation between nodes or processes.

The demo can be downloaded at the procblock downloads page.  You will need to install procblock as well, which is also available at that link.

A live version of the demo can be played with here.

The demo is a series of tabs, each doing something different:

CPU Monitoring

This tab shows a very simple CPU statistics collection program (it runs a shell command, parses the output and returns a dict of fields), which runs in an Interval RunThread every 5 seconds, and then graphs the results.  The page automatically reloads the graph every 5 seconds so it can be watched interactively.
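
As a rough illustration of that collector pattern (run a shell command, parse its output, return a dict of fields), here is a minimal sketch in Python 3.  It is not the demo’s actual code: the names ParseCpuLine, CollectCpu and CPU_FIELDS are made up for this example, and it assumes a Linux host where /proc/stat is available.

```python
import subprocess

# Order of the first seven counters on the aggregate "cpu" line of /proc/stat.
CPU_FIELDS = ['user', 'nice', 'system', 'idle', 'iowait', 'irq', 'softirq']


def ParseCpuLine(line):
    """Turn a /proc/stat style 'cpu ...' line into a dict of integer fields."""
    parts = line.split()
    if not parts or parts[0] != 'cpu':
        raise ValueError('Not an aggregate cpu line: %r' % line)
    values = [int(value) for value in parts[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))


def CollectCpu(command=None):
    """Run a shell command and parse the first line of its output (Linux only)."""
    command = command or ['cat', '/proc/stat']
    output = subprocess.check_output(command).decode('utf-8')
    return ParseCpuLine(output.splitlines()[0])


# Parse a canned line so the example runs on any platform.
sample = 'cpu  4705 150 1120 16250 520 0 20 0 0 0'
fields = ParseCpuLine(sample)
print(fields)
```

A function shaped like CollectCpu() is the kind of thing an interval thread could call every 5 seconds, handing the returned dict to whatever stores and graphs it.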

System Information

This is another very simple shell command, that cats /proc/cpu and puts the columns into an HTML table.

Logs

This tab reads out of the tail of a log file, reverses the lines, and splits the contents to format for HTML in color.  It updates every 5 seconds.
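
The tail, reverse and format steps could look roughly like the sketch below.  TailReversed and FormatLogLine are hypothetical names, and the level-to-color mapping is an assumption made for illustration; the real tab’s formatting is not shown in this post.

```python
import html
import os
import tempfile
from collections import deque


def TailReversed(path, count=20):
    """Return the last `count` lines of a file, newest first."""
    with open(path) as log_file:
        # A bounded deque keeps only the final `count` lines while reading.
        lines = deque(log_file, maxlen=count)
    return [line.rstrip('\n') for line in reversed(lines)]


def FormatLogLine(line):
    """Split 'LEVEL:rest' and wrap it in a colored HTML span."""
    colors = {'DEBUG': 'gray', 'ERROR': 'red'}  # assumed color scheme
    level, _, rest = line.partition(':')
    color = colors.get(level, 'black')
    return '<span style="color:%s">%s: %s</span>' % (
        color, html.escape(level), html.escape(rest))


# Small self-contained demonstration using a temporary log file.
with tempfile.NamedTemporaryFile('w', suffix='.log', delete=False) as handle:
    handle.write('DEBUG:first\nERROR:second\nDEBUG:third\n')
newest_first = TailReversed(handle.name, count=2)
os.remove(handle.name)

for line in newest_first:
    print(FormatLogLine(line))
```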

Requests

This is the most complex tab on the page.  It has a monitor for “requests”, which is a counter in the shared resources (unidist.sharedcounter module); a thread runs with a delay and increments the requests.  The total number of requests since starting is shown in text, and the graph displays the change in this request variable over time.

A slider allows adjustment of the delay for requests, which will be saved in shared state (unidist.sharestate module).  Reloading the page will keep the slider in any changed position, and the graph/counter should correlate to the position of the slider in terms of more or fewer requests per second.

There is also a button called “Stop Request Poller” which acquires a lock (unidist.sharedlock module), which stops the poller from incrementing the request counter.  If toggled again, requests will resume.
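
The counter-plus-lock arrangement can be sketched with the standard library alone.  This is only a stand-in: it uses threading.Lock instead of the unidist shared counter and shared lock modules, whose APIs are not shown in this post, and RequestPoller, PollOnce and ToggleStop are hypothetical names.

```python
import threading


class RequestPoller(object):
    """Toy stand-in: a counter that only advances while the stop lock is free."""

    def __init__(self):
        self.requests = 0
        self.counter_lock = threading.Lock()  # protects self.requests
        self.stop_lock = threading.Lock()     # held while the poller is paused

    def PollOnce(self):
        """Increment the counter unless someone holds the stop lock."""
        if self.stop_lock.locked():
            return False
        with self.counter_lock:
            self.requests += 1
        return True

    def ToggleStop(self):
        """Acquire the stop lock if it is free, otherwise release it."""
        if self.stop_lock.locked():
            self.stop_lock.release()
        else:
            self.stop_lock.acquire()


poller = RequestPoller()
poller.PollOnce()    # counted
poller.ToggleStop()  # "Stop Request Poller" pressed
poller.PollOnce()    # skipped while the lock is held
poller.ToggleStop()  # pressed again, requests resume
poller.PollOnce()    # counted
print(poller.requests)
```

In a running server, a background thread would call PollOnce() in a loop, sleeping for the slider’s delay between calls.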

The bottom right side of the page has not been completed yet, so it is just there to look pretty and take up page space.  Later this will turn into an adjustable SLA monitor which will notify or alert (via the HTML page) when the SLA is near or out of tolerance with regard to requests per second.

Wall

This page shows the use of the message queues (unidist.messagequeue module), which allow messages to be inserted into a queue for later processing.  Any message typed into the input field with an enter key or Write button click will be inserted into the message queue “wall” in the shared message queues.  Any messages older than the last 25 are discarded so that the queue only stores useful data.  Messages are not removed from the queue on reading, so that they can be continually re-processed for display.

Within 5 seconds, an RPC call will update the page with all the messages.
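
The two properties of the wall queue (only the last 25 messages are kept, and reading does not consume them) can be sketched with collections.deque.  This is an illustration of the behavior only, not the unidist.messagequeue API; WallQueue, Write and ReadAll are made-up names.

```python
from collections import deque


class WallQueue(object):
    """Bounded message list: old messages fall off, reads are non-destructive."""

    def __init__(self, max_messages=25):
        # deque drops the oldest entry automatically once maxlen is reached.
        self.messages = deque(maxlen=max_messages)

    def Write(self, text):
        self.messages.append(text)

    def ReadAll(self):
        # Return a copy so the caller cannot remove messages from the queue.
        return list(self.messages)


wall = WallQueue()
for number in range(30):
    wall.Write('message %d' % number)

first_read = wall.ReadAll()
second_read = wall.ReadAll()
print(len(first_read), first_read[0], first_read[-1])
```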


Simplest procblock Demo

August 18, 2010

This is the simplest demo I could think of.  It features one script that is run by procblock, specified by a YAML file, simplest.yaml:

run:
 - script: simplest.py

Which runs simplest.py:

import random

def ProcessBlock(pipe_data, block, request_state, input_data, tag=None, cwd=None, env=None, block_parent=None):
    """Simplest demo possible."""
    # Return a dict of fields; procblock reports them under the "run" tag
    data = {'random': random.randint(0, 100)}
    return data

To invoke procblock, run:

cd demo_simplest
./procblock simplest.yaml

This will invoke the script “simplest.py”, which has a standard module function ProcessBlock().  This is the standardized entry point for all code process blocks; it allows them to chain together, passing relevant information and shared data between them.  In this case there is only one script, so it simply returns its result.

Example:

monkey:demo_simplest ghowland$ ./procblock simplest.yaml 2> /dev/null
{'run': {'__duration': 0.010447025299072266,
 '__start_time': 1282114491.079386,
 'args': [],
 'random': 99}}
monkey:demo_simplest ghowland$

I redirected STDERR to /dev/null because I am leaving the STDERR logging on until procblock is ready to leave Beta Testing.
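
To make the chaining idea behind ProcessBlock() more concrete, here is a toy sketch.  It assumes that pipe_data carries the previous block’s result and that request_state is a dict shared by all blocks in a run; both are guesses from the parameter names, not documented procblock behavior.  RunChain and DoubleBlock are hypothetical and not part of procblock.

```python
import random


def RandomBlock(pipe_data, block, request_state, input_data,
                tag=None, cwd=None, env=None, block_parent=None):
    """Same shape as simplest.py: ignore the inputs, return one field."""
    return {'random': random.randint(0, 100)}


def DoubleBlock(pipe_data, block, request_state, input_data,
                tag=None, cwd=None, env=None, block_parent=None):
    """Hypothetical second block: reads the previous block's output."""
    request_state['blocks_run'] = request_state.get('blocks_run', 0) + 1
    return {'random': pipe_data['random'], 'doubled': pipe_data['random'] * 2}


def RunChain(blocks, input_data=None):
    """Hypothetical runner: feed each block's result to the next as pipe_data."""
    pipe_data = {}
    request_state = {}  # assumed to be shared across every block in the chain
    for process_block in blocks:
        pipe_data = process_block(pipe_data, {}, request_state, input_data)
    return pipe_data


result = RunChain([RandomBlock, DoubleBlock])
print(result)
```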

For a second example there is a slightly more complicated procblock called: simplest_monitor.yaml

As the name implies, this is the simplest monitor I could think of.  It monitors the random numbers simplest.py generates every 5 seconds, stores them in an RRD file, and then graphs them.

Because it runs the script every 5 seconds, it is labeled a “long running” process, so it will continue to run until CTRL-C is pressed.  That sends a notification for all threads to exit gracefully, which they will do if they are properly written.

Here are the contents of simplest_monitor.yaml:

run:
 - script: simplest.py
   cache: 5
   thread_id: simplest
   timeseries collect:
     path: simplest.rrd
     interval: 5

     fields:
       random:
         type: GAUGE

     graph:
       - path: simplest.png
         title: Simplest Monitoring Demo
         fields: [random]
         method: STACK
         interval: 10
         vertical label: "Random #s"

__usage:
  name: simplest
  author: Geoff Howland

  # Let this run, so we can monitor it
  longrunning: true

The big additions here are the “timeseries collect” statement, which defines what fields to collect from the run script and how to graph them, and the addition of a __usage section to the YAML file, which defines the name of the block (simplest), the author, and sets longrunning=True, so that the script won’t quit as soon as the thread is created to monitor simplest.py’s results.

Example 2:

Run:

monkey:demo_simplest ghowland$ ./procblock simplest_monitor.yaml 2> /dev/null
Running Thread: Starting: simplest
Waiting for interval thread output: simplest
{'run': {'__duration': 0.41039705276489258,
 '__start_time': 1282114743.028677,
 'random': 6,
 'run_thread.simplest': simplest: Is Running: True  Scripts: ['simplest.py']}}
^CRunning Thread: Quitting: simplest
monkey:demo_simplest ghowland$

The result, after a bit of waiting:


What happens this time is a bit different.  Right away we get a returned object that shows a quite short duration, a single random number, and a field called “run_thread.simplest”.  “simplest” is the thread_id name I gave to this monitor thread.  The value of this field is the __repr__() string representation of a RunThread object, and you can see it is running (Is Running: True) and which scripts it runs (['simplest.py']).  It is also formatted in HTML, because that is where I am using it right now, in testing web server internals.  Another Beta artifact.

Nothing else visibly happens in this script until I press CTRL-C and it quits, because I have the logging turned off.  With the logging on, it shows what is going on:

monkey:demo_simplest ghowland$ ./procblock simplest_monitor.yaml
DEBUG:20100818000408:procyaml.py:138:ImportYaml: Importing YAML: simplest_monitor.yaml
DEBUG:20100818000408:mainfunctions.py:434:ProcessAndLoop: Long Running Process: Starting...  (CWD: /Users/ghowland/blocks/demo_simplest)
DEBUG:20100818000408:procyaml.py:138:ImportYaml: Importing YAML: /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/procblock-201008.1-py2.6.egg/procblock/data/default_tag_functions.yaml
DEBUG:20100818000408:procyaml.py:138:ImportYaml: Importing YAML: /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/procblock-201008.1-py2.6.egg/procblock/data/default_condition_functions.yaml
DEBUG:20100818000408:rrd.py:48:StoreInRrd: Storing RRD Occurrance: simplest.rrd: 1282115048.28: {'random': 67}
DEBUG:20100818000408:rrd.py:183:GraphRrd: Graphing RRD: simplest.rrd
{'run': {'__duration': 0.4735870361328125,
 '__start_time': 1282115048.26404,
 'random': 67,
 'run_thread.simplest': simplest: Is Running: True  Scripts: ['simplest.py']}}
DEBUG:20100818000413:rrd.py:48:StoreInRrd: Storing RRD Occurrance: simplest.rrd: 1282115053.58: {'random': 93}
DEBUG:20100818000413:rrd.py:183:GraphRrd: Graphing RRD: simplest.rrd
DEBUG:20100818000418:rrd.py:48:StoreInRrd: Storing RRD Occurrance: simplest.rrd: 1282115058.84: {'random': 85}
DEBUG:20100818000418:rrd.py:183:GraphRrd: Graphing RRD: simplest.rrd
DEBUG:20100818000424:rrd.py:48:StoreInRrd: Storing RRD Occurrance: simplest.rrd: 1282115064.14: {'random': 57}
DEBUG:20100818000424:rrd.py:183:GraphRrd: Graphing RRD: simplest.rrd
DEBUG:20100818000429:rrd.py:48:StoreInRrd: Storing RRD Occurrance: simplest.rrd: 1282115069.48: {'random': 79}
DEBUG:20100818000429:rrd.py:183:GraphRrd: Graphing RRD: simplest.rrd
DEBUG:20100818000434:rrd.py:48:StoreInRrd: Storing RRD Occurrance: simplest.rrd: 1282115074.81: {'random': 32}
DEBUG:20100818000434:rrd.py:183:GraphRrd: Graphing RRD: simplest.rrd
^CDEBUG:20100818000435:mainfunctions.py:451:ProcessAndLoop: ProcessAndLoop: Keyboard Interrupt: Releasing lock: __running
DEBUG:20100818000435:mainfunctions.py:470:ProcessAndLoop: Quitting...
Running Thread: Quitting: simplest
monkey:demo_simplest ghowland$

The first 4 lines show procblock starting up; all YAML files that are loaded are logged, so you can trace what it is doing.  Then the first run of simplest.py is made, and returned in the “run” tag result, along with the RunThread object which is still running in a thread.

Then every 5 seconds, simplest.py:ProcessBlock() is invoked, the result (for example {'random': 99}) is stored in simplest.rrd, and simplest.png is graphed again.

Then I hit CTRL-C, which caused a shared lock called __running to be released.  The thread that was running simplest.py every 5 seconds quit the next time it was invoked, releasing the process, and procblock finished.
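
The graceful shutdown described above can be sketched with the standard library.  Note the substitution: procblock uses a shared lock named __running, while this example uses a threading.Event, so it illustrates the pattern rather than procblock’s implementation.  RunEvery, Collect and stop_requested are hypothetical names, and the interval is shortened so the example finishes quickly.

```python
import threading
import time

results = []


def Collect():
    """Stand-in for one run of simplest.py: record that a run happened."""
    results.append(len(results))


def RunEvery(interval, function, stop_requested):
    """Call function every `interval` seconds until a stop is requested."""
    while True:
        function()
        # wait() returns True as soon as the event is set, so the thread
        # wakes up and exits promptly instead of sleeping the full interval.
        if stop_requested.wait(interval):
            break


stop_requested = threading.Event()
worker = threading.Thread(target=RunEvery, args=(0.01, Collect, stop_requested))
worker.start()
try:
    time.sleep(0.05)  # main thread; CTRL-C would raise KeyboardInterrupt here
except KeyboardInterrupt:
    pass
finally:
    stop_requested.set()  # ask the worker to finish
    worker.join()         # wait for it to exit cleanly

print('runs completed:', len(results))
```

One difference from the behavior in the log: here the worker wakes as soon as the stop is requested, instead of noticing it at its next scheduled run.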