Actually, I think "active" is probably the wrong word, so I've changed it in the post. To answer your question:
It's a system based on multiple web services and run across multiple hosts, and I've included all system checks in that number. So we have checks for:
* ping, ssh, disk, load, network io, etc.
* error rates from request logs
* request rates (warning on high values)
* smoke tests for key functionality (e.g. does the search engine return results, can you complete certain forms)
* connection tests from relevant hosts to relevant services or databases
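To give a feel for what two of these look like, here is a rough Python sketch of a connection test and an error-rate check. This is not our actual tooling: the function names, the 5% threshold, and the Nagios-style "OK"/"CRITICAL" strings are all assumptions made up for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Connection test: True if a TCP connection to host:port opens in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def error_rate(status_codes: list[int]) -> float:
    """Fraction of requests that returned a 5xx status (0.0 if no requests)."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def check_error_rate(status_codes: list[int], threshold: float = 0.05) -> str:
    """Hypothetical check: CRITICAL when the 5xx rate exceeds the threshold."""
    return "CRITICAL" if error_rate(status_codes) > threshold else "OK"
```

In practice the status codes would be parsed out of the request logs for a recent time window, and the connection test would be run from each relevant host against the services or databases it depends on.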