Automation Package Editor Screenshot

February 15, 2011

I’m making pretty good progress on the GUI to edit the internals of the system.  I’m sticking to a pretty basic approach, with just a few goals:

  • All data resides in YAML, for optional sysadmin-friendly hand editing, but everything can be edited in the GUI
  • Packages in REM are a hierarchy of tagged data by dictionary/hash key, with data indexed underneath.  The bread crumbs in the above picture show this: test.yaml >>> jobs >>> tester >>> tester2
  • Leaf nodes can be grouped in a deep hierarchy, to make it easy to organize the nodes.  Nodes can be copied and pasted to other similar hierarchy types (Schema Sections)
  • A package is specified by a Schema Instance, and then is instantiated by a Schema Instance Item.  There can be any number of items per Schema Instance, so that packages can be defined as a specification of specifications, but then instantiated with custom data and custom usage of the specification so that a general outline can work for many different projects.
  • Packages are essentially equivalent to “distributed programs”: they can specify that jobs be run on many different machines, using many workers per job if desired.  Jobs can return output or save results to a message queue, which can be graphed and analyzed with custom SLAs, which in turn can have alerting or meta-analysis data stored.  This is considered the normal desired case for any job, not a theoretical “it can be done”: it is assumed that all jobs will potentially want graphing on results, and alerting or automated responses kicked off when data is out of a specified tolerance over a period of time, or meets a test script’s criteria.
  • Packages can mount web pages and RPC functions specified in them (default Sections, ‘http pages’ and ‘http rpc’), and use other packages as fall backs for misses, so that they can extend custom functionality of base packages, and reuse where standard results are desired.
  • Packages are meant to be used as a data-based Domain Specific Language.  Organize the data into actions and groups, and then the Section specifier will process the data as if it were a language.  In this way plans can be built, and the package substitutes for a normal program’s core architecture of starting from Main(), initializing state and running code.  All of this is specified in the Package, and the data’s hierarchy serves as the architecture of when scripts should be called, and what data should be passed to them.
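
The fall back behavior for mounted pages described in the list above could be sketched roughly like this.  This is not REM's code: the ‘fallback’ key, the package names and the page paths are all made up for illustration, and only the ‘http pages’ Section name comes from the description.

```python
# Hypothetical sketch: resolve a page through a chain of packages.
# The 'fallback' key and all package/page names are assumptions.

def find_page(package_name, path, packages):
    """Return (owning_package, page) for path, following fall backs on a miss."""
    seen = set()
    current = package_name
    while current is not None and current not in seen:
        seen.add(current)  # guard against fall back loops
        package = packages.get(current)
        if package is None:
            break
        pages = package.get("http pages", {})
        if path in pages:
            return current, pages[path]
        # Miss: try the package this one falls back to, if any.
        current = package.get("fallback")
    return None, None


packages = {
    "base": {
        "http pages": {"/status": "base status page", "/jobs": "base jobs page"},
    },
    "custom": {
        "fallback": "base",
        "http pages": {"/jobs": "custom jobs page"},
    },
}

print(find_page("custom", "/jobs", packages))    # served by 'custom'
print(find_page("custom", "/status", packages))  # miss, served by 'base'
```

The custom package only overrides the pages it cares about, and everything else is answered by the base package.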

Here’s the data that is being edited:

jobs:
  tester:
    tester1:
      script: /tmp/tester.py
      name: Tester Script 1
      title: Tester Script 1
      command: null
      workers: 1

    tester2:
      script: /tmp/tester.py
      name: Tester Script 2
      title: Tester Script 2
      command: null
      workers: 1

The Schema Section, which is what will be processed, has 2 grouped index layers, the 1st being an actual group, “tester”, and the 2nd layer being the labels for the ‘jobs’ item data.  Grouped indexes can be any depth, and field names are assigned to the indexes, so that the final item data collects new fields along the way, picking up its hierarchy position as field information.  In this case, it is specified as:

# Indexes we keep to reference this data, grouped in layers
grouped indexes:
 - index: group
   type: text

 - index: null
   key field: name

This means that both of these items will also have a field ‘group’ with the value ‘tester’.
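
As a rough illustration of how items might pick up their hierarchy position as fields, here is a minimal Python sketch.  The function name flatten_items is hypothetical and this is not REM's implementation.  It assumes a layer with a non-null ‘index’ adds a field of that name holding the dictionary key at that depth, and it leaves the ‘key field’ setting uninterpreted, since its exact behavior isn't spelled out here.

```python
# Hypothetical sketch: walk one dict layer per grouped index and
# annotate each leaf item with the fields named by the index layers.

def flatten_items(section_data, grouped_indexes):
    items = []

    def walk(node, depth, inherited):
        if depth == len(grouped_indexes):
            # Past the last index layer: node is the item data itself.
            item = dict(node)
            item.update(inherited)
            items.append(item)
            return
        # Assumption: only layers with a non-null 'index' add a field.
        field = grouped_indexes[depth].get("index")
        for key, child in node.items():
            fields = dict(inherited)
            if field is not None:
                fields[field] = key
            walk(child, depth + 1, fields)

    walk(section_data, 0, {})
    return items


# The 'jobs' data and grouped indexes from above, as Python literals.
jobs = {
    "tester": {
        "tester1": {"script": "/tmp/tester.py", "name": "Tester Script 1",
                    "title": "Tester Script 1", "command": None, "workers": 1},
        "tester2": {"script": "/tmp/tester.py", "name": "Tester Script 2",
                    "title": "Tester Script 2", "command": None, "workers": 1},
    }
}
grouped_indexes = [
    {"index": "group", "type": "text"},
    {"index": None, "key field": "name"},
]

items = flatten_items(jobs, grouped_indexes)
for item in items:
    print(item["name"], "->", item["group"])
```

Both items come out with their original fields plus ‘group’ set to ‘tester’.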

Sections can contain other Sections, so they can be layered as deeply as required.  They can also be linked out to other files in several different ways to create various types of relationships.

Sections specify all the scripts associated with the section, the most basic being the ‘process’ script.  In the case of the ‘jobs’ section, it starts up jobs (Python scripts, in this case) through the Job Manager, which can run jobs on the current host, or schedule them to be run on remote hosts and receive the results when they complete, or periodically through message queue replication if the job is long-running.  Replication is built into REM as a core component, as are shared state, locks, message queues, counters and time series data storage (for graphing and time series analysis).

Sections also specify their fields, including a type and validation.  Types are high level and have their own schema definition.  Like the Section specification, they specify scripts to validate, format, save, serialize and perform other operations on the type of data.  Types are meant to be added whenever new basic functionality on data is desired.

Sections additionally specify their rendering information, for instance the edit dialog above was rendered with the following specification:

edit:
 field sets:
   - Job:
     - name
     - title
   - Execution:
     - script
     - command
     - workers

This specifies the order in which the fields are displayed in the Field Set editing dialog box that comes up when you edit a Schema Section Item.  Note there are two field set groups specified to visually separate the fields: ‘Job’ and ‘Execution’.  This can also be used to create a wizard style interface with multiple pages of field set groups.
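
To make the ordering concrete, here is a small sketch of how a renderer might turn that specification into an ordered layout for one item.  The function layout_fields is a hypothetical name and not part of REM; the specification and item values are copied from the examples above.

```python
# Hypothetical sketch: turn an 'edit' specification into display order.

def layout_fields(edit_spec, item):
    """Return [(group_title, [(field, value), ...]), ...] in display order."""
    layout = []
    for field_set in edit_spec["field sets"]:
        # Each entry is a one-key mapping: group title -> list of field names.
        for title, field_names in field_set.items():
            rows = [(name, item.get(name)) for name in field_names]
            layout.append((title, rows))
    return layout


edit_spec = {
    "field sets": [
        {"Job": ["name", "title"]},
        {"Execution": ["script", "command", "workers"]},
    ]
}
item = {
    "script": "/tmp/tester.py",
    "name": "Tester Script 2",
    "title": "Tester Script 2",
    "command": None,
    "workers": 1,
}

layout = layout_fields(edit_spec, item)
for title, rows in layout:
    print(title)
    for name, value in rows:
        print("   ", name, "=", value)
```

For a wizard style interface, each (title, rows) pair could become one page instead of one visual group.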

I’m still working out all the functionality for creating new packages, adding sections to them, and then creating and moving around items under indexes in the sections.  Once that gets worked out, I’ll go back through all the other features that were done the non-dynamically-edited way, and migrate them to working with data in this new way.

Hopefully I’ll have some screenshots of dynamically creating web pages and widgets as part of the tool building process by next week.  After that I’ll put up a demo on EC2 to show it controlling another EC2 instance as it goes through various stages of configuration and forced failures.


The Delay in Release

February 10, 2011

It’s taking a while to get the release together, and it’s going to be a while longer until it is done.  My current guess is a delay of maybe 2 months.  Primarily, the majority of my time is now directed at other projects, but in addition I ran into the documentation issue.

For software to really be released, it needs reasonable documentation, and the Red Eye Monitor (REM) project is a large and complex project meant to do large and complex things, so it needs documentation that makes at least its basic operations clear before it can really launch.

Since I have developed REM to use very loose hierarchical data structures and a loose pluggable architecture, and both can recurse, documenting how this can be worked with would first require explaining all the methods and motives I used to put the system together, which would take a good deal of explanation.

Instead, I’ve decided I’m going to put time into the front-end GUI, so that I can document using the system through snapshots of GUI pages and explanations of the work flow and the schema at each stage.  This should allow interested parties to quickly install and start playing around with configuring it to do new things, and I can defer writing about the internal structure until after it starts getting some install base.