Hi!
I'd like to know whether there is a way to find out when a playbook fails. We have some playbooks that run automatically to close incidents, but sometimes a playbook fails for one reason or another, and unless you open the incident and check, you'll never know whether the playbook worked.
What I would like is some kind of notification (an email) when a playbook fails. I was thinking of a bash or Python script that checks the pods for each app every 5 minutes, looks for errors, and, if it finds any, writes them to a file and sends it by email. That is quite difficult, though, because the email should look something like this:
Subject: Playbook X failed on the X incident
Message:
The playbook X failed on the X incident with the following error: <error message>.
Link to the incident: <link>
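To make the idea concrete, here is a rough Python sketch of just the email part, using only the standard library. The function names, the parameters, and the SMTP host and port are placeholders I made up for illustration; it assumes the script has already found the playbook name, the incident, the error text, and the link by some other means, which is the part I don't know how to do.

```python
import smtplib
from email.message import EmailMessage


def build_failure_email(playbook, incident, error, link, sender, recipient):
    """Build the notification email for one failed playbook run.

    All arguments are plain strings; how they are obtained is not covered here.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Playbook {playbook} failed on the {incident} incident"
    msg.set_content(
        f"The playbook {playbook} failed on the {incident} incident "
        f"with the following error: {error}.\n"
        f"Link to the incident: {link}\n"
    )
    return msg


def send_email(msg, host="localhost", port=25):
    """Send the message through an SMTP relay (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```

The script would call build_failure_email once per failure it detects and pass the result to send_email, and the whole thing could be scheduled every 5 minutes with cron.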
Is it possible to do this? Any other ideas?
------------------------------
Gabriel Covello
------------------------------