The last beta release (000) ran, but was not especially useful, as some of the required features for monitoring and alerting were missing. The upcoming release (001) will be fully functional and usable as a monitoring and alerting solution (though it is still early in its application life cycle).
Things have been delayed a bit, as I have taken the time to complete the automation platform, not just the monitoring application.
- Packaging system: Full life cycle management for adding, changing, and updating components, and for tying together everything needed for operational automation. This includes: HTTP/RPC registration, a state machine for executing long-running code, a job system for executing scheduled code (distributed worker model included), requiring and importing other packages, a module plug-in system, defining the data used by the package, and replication of state between nodes.
- Job Scheduling: One-time, recurring, cron-style, worker threads, and distributed/remote worker threads. Job control and result handling (replication/storage/processing) are included in the job scheduling model.
- Replication: A simple push/pull model for state and queue data, for now. Later this will be expanded to push any state change and slurp back the updates, but for now simple gets the job done: it creates an automated flow of information that keeps nodes up to date and delivers results generated locally on nodes to the management systems.
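To make the job scheduling model above concrete, here is a minimal sketch of a recurring-job scheduler with worker threads. This is purely illustrative: the `Job` and `Scheduler` names and their interfaces are my own invention for this example, not REM's actual API.

```python
# Hypothetical sketch of the recurring-job-plus-worker-threads model.
# Not REM's real API; names and structure are assumptions for illustration.
import queue
import threading
import time

class Job:
    def __init__(self, name, func, interval_sec):
        self.name = name
        self.func = func
        self.interval_sec = interval_sec
        self.next_run = time.time()  # run immediately the first time

class Scheduler:
    def __init__(self, workers=2):
        self.jobs = []
        self.results = {}               # job name -> last result
        self.work_queue = queue.Queue()
        self.lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def add(self, job):
        self.jobs.append(job)

    def _worker(self):
        # Worker threads pull queued jobs and store results for later
        # handling (in REM's terms: replication/storage/processing).
        while True:
            job = self.work_queue.get()
            result = job.func()
            with self.lock:
                self.results[job.name] = result
            self.work_queue.task_done()

    def run_pending(self):
        """Queue any jobs whose next run time has arrived."""
        now = time.time()
        for job in self.jobs:
            if now >= job.next_run:
                job.next_run = now + job.interval_sec
                self.work_queue.put(job)

sched = Scheduler()
sched.add(Job("heartbeat", lambda: "ok", interval_sec=5))
sched.run_pending()
sched.work_queue.join()       # wait for the worker to finish the job
print(sched.results["heartbeat"])  # -> ok
```

A distributed model would replace the in-process `queue.Queue` with a queue replicated between nodes, which is where the replication layer below comes in.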
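The push/pull replication model can also be sketched in a few lines. Again, this is a hypothetical illustration, not REM's implementation: two in-memory `Node` objects stand in for networked nodes, and simple per-key version counters decide which side has newer state.

```python
# Hypothetical sketch of simple push/pull state replication between nodes.
# In a real system the nodes would talk over the network; here they are
# in-memory objects, and all names are assumptions for illustration.
class Node:
    def __init__(self, name):
        self.name = name
        self.state = {}  # key -> (version, value)

    def set(self, key, value):
        version, _ = self.state.get(key, (0, None))
        self.state[key] = (version + 1, value)

    def push(self, other):
        """Push any entries that are newer locally to the other node."""
        for key, (version, value) in self.state.items():
            other_version, _ = other.state.get(key, (0, None))
            if version > other_version:
                other.state[key] = (version, value)

    def pull(self, other):
        """Pull any newer entries from the other node (slurp back updates)."""
        other.push(self)

manager = Node("manager")
agent = Node("agent-01")

agent.set("check.cpu", "ok")        # result generated locally on the node
agent.push(manager)                 # deliver it to the management system

manager.set("config.interval", 30)  # change made centrally
agent.pull(manager)                 # keep the node up to date
```

The push direction delivers locally generated results up to management systems, while the pull direction keeps node configuration current, which is exactly the automated flow of information described above.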
These latest modules take the system from a local Rapid Operations Automation Development System (ROAD) to a distributed/cluster ROAD.
The Package and Job Schedule systems do a much better job of encapsulating code and data to be run on a single system, and they make adding more nodes very simple while adding a minimum of complexity. They also provide all the functionality necessary for local agent monitoring and automation, whose absence has been a major delay in finishing the monitoring system.
I’m not sure I’ve mentioned this here, but I have a policy of working toward Logarithmic Effort. I find that many projects fall into requiring Exponential Effort as they progress: each successive change takes an exponentially growing amount of coding/testing/deploying effort to accomplish.
Creating libraries that enable Logarithmic Effort means using network effects: the structure and flow of the process itself creates functionality that would otherwise have to be written directly. This is pretty subtle stuff, and probably sounds like BS, but it isn’t. I’ll try to figure out how to clearly demonstrate it in some of the documentation examples. Using my system, you get the benefits automatically, as they are wrapped up in the system’s functionality, but I think it would be useful to keep using the same libraries in your custom scripts as well.
My goal with infrastructure development is always to work less: not today, but in the future. So each iteration of the Red Eye Monitor (REM) system has been developed with that goal in mind: reduce the effort required to do any piece of work, steering toward logarithmic effort and away from exponential effort.
Where logarithmic effort cannot be achieved, go for linear effort. The changes can’t be shared, but things can be copy-pasted and changed (using descriptive data, templates, and small pieces of isolated-yet-networked code) without side effects or creating more work in the future.
I believe the system I have now is well on its way to providing this Logarithmic Effort for creating operational automation, and I hope to start demonstrating that soon in some articles showing how to build things with the REM Package System.
I’m aiming to have the 001 release and an online demo running this Sunday, with documentation beginning to flow in after that. This was also my intention last weekend, so slippage may occur.