{"__v":45,"_id":"55d3b648f77e6d0d00b1b299","category":{"__v":3,"_id":"55d3b645f77e6d0d00b1b275","pages":["55d3b648f77e6d0d00b1b297","55d3b648f77e6d0d00b1b298","55d3b648f77e6d0d00b1b299","55d3b648f77e6d0d00b1b29a","55d3b648f77e6d0d00b1b29b","55d4dc5e9c4e4a0d00ff67c5","55d51f70e60a2f0d00b88add"],"project":"55c505b41469ad2500fa2ab7","version":"55d3b644f77e6d0d00b1b273","sync":{"url":"","isSync":false},"reference":false,"createdAt":"2015-08-07T19:23:34.369Z","from_sync":false,"order":1,"slug":"getting-started","title":"Getting Started"},"parentDoc":null,"project":"55c505b41469ad2500fa2ab7","user":"55c50f4a7c199a2f00665cbf","version":{"__v":6,"_id":"55d3b644f77e6d0d00b1b273","project":"55c505b41469ad2500fa2ab7","createdAt":"2015-08-18T22:48:36.632Z","releaseDate":"2015-08-18T22:48:36.632Z","categories":["55d3b645f77e6d0d00b1b274","55d3b645f77e6d0d00b1b275","55d3b645f77e6d0d00b1b276","55d3b645f77e6d0d00b1b277","55d3b645f77e6d0d00b1b278","55d3b645f77e6d0d00b1b279","55d3b645f77e6d0d00b1b27a","55d3b645f77e6d0d00b1b27b","55d3b645f77e6d0d00b1b27c","55d3b645f77e6d0d00b1b27d","55d7c2939510f00d007ec6fe","56fac9925df15a20002972a2","56fb2f7668e1d30e00a0b672","583498d411e8af2500f6b334","58e52a180ab7b03b00f4a97a"],"is_deprecated":false,"is_hidden":false,"is_beta":true,"is_stable":true,"codename":"","version_clean":"1.1.0","version":"1.1"},"updates":[],"next":{"pages":[],"description":""},"createdAt":"2015-08-10T23:07:00.258Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":3,"body":"[block:callout]\n{\n  \"type\": \"info\",\n  \"title\": \"Terminology\",\n  \"body\": \"**An alarm :** is a class-like definition of a metric violation in your monitoring tool\\n**An alert or an incident :** is an instance of an alarm\"\n}\n[/block]\nOnce you integrate your monitoring tools, alerts will start flowing into Neptune and you will be able to see and 
manage all your incidents in one place.\n\n1. **Open incidents :** You can see the open incidents that need your immediate attention and fix them right from Neptune without logging into the trigger hosts.\n2. **Resolved incidents :** You can see a history of resolved incidents and analytics on which incidents occurred the most and which caused the most damage (MTTR). You can then automate those top alarms using rules.\n\n## Step 1 : Open incidents\n\nThese incidents are currently open and need your immediate attention. Click on an incident to get more context and details about :\n1. Trigger host\n2. Original JSON from the monitoring tool\n3. Graph snapshots & metrics\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/zfKqYxlQRqNGYpXMcGLg_OpenIncidents.png\",\n        \"OpenIncidents.png\",\n        \"1614\",\n        \"890\",\n        \"#750946\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n## Step 2 : Fix-it flow for the open incident\n\nThe fix-it flow is primarily intended to run a quick action (either a diagnostic or a remediation action) on the trigger host or on a completely different set of host(s). The action output is shown right within Neptune.\n\nSample fix-it actions are :\n\n1. Quick commands to see what processes are running on the trigger host\n2. Restarting a process or a server quickly\n3. Executing your custom runbooks on a single host or a cluster of hosts\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/mcAp7E84Siq432cbTf5w_FixItFlow.png\",\n        \"FixItFlow.png\",\n        \"1616\",\n        \"887\",\n        \"#60b99a\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\nOnce you run a fix-it action, you will be shown the action output. 
If you like the action output and want to run the action every time the alarm triggers, you can automate it by creating a rule using the *Run every time* button in the action output window.\n\nFor more details about the fix-it flow, please refer to the [Fix Incidents](doc:incident-dashboard) section.\n\n\n## Step 3 : All Incidents' analytics\n\nThe analytics are primarily intended to highlight which alarms are causing the most incidents and which alarms are causing the most damage to you in terms of MTTR. (Recall that an alarm is like a class and an incident is an instance of an alarm.)\n\n* We recommend that you look at your top alarms (sorted by either count or MTTR) to create rules and reduce your on-call burden\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/7hym3XOaSq69CJQyesZw_IncidentAnalytics.png\",\n        \"IncidentAnalytics.png\",\n        \"1604\",\n        \"883\",\n        \"#780642\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"success\",\n  \"title\": \"Congrats! Now that you have learned about quick fix-it actions and rules, let's finally manage your notifications\",\n  \"body\": \"[Next step : Configure notifications](doc:configure-notifications)\"\n}\n[/block]","excerpt":"","slug":"manage-incidents","type":"basic","title":"Manage Incidents"}
[block:callout]
{
  "type": "info",
  "title": "Terminology",
  "body": "**An alarm :** is a class-like definition of a metric violation in your monitoring tool\n**An alert or an incident :** is an instance of an alarm"
}
[/block]
Once you integrate your monitoring tools, alerts will start flowing into Neptune and you will be able to see and manage all your incidents in one place.

1. **Open incidents :** You can see the open incidents that need your immediate attention and fix them right from Neptune without logging into the trigger hosts.
2. **Resolved incidents :** You can see a history of resolved incidents and analytics on which incidents occurred the most and which caused the most damage (MTTR). You can then automate those top alarms using rules.

## Step 1 : Open incidents

These incidents are currently open and need your immediate attention. Click on an incident to get more context and details about :
1. Trigger host
2. Original JSON from the monitoring tool
3. Graph snapshots & metrics
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/zfKqYxlQRqNGYpXMcGLg_OpenIncidents.png",
        "OpenIncidents.png",
        "1614",
        "890",
        "#750946",
        ""
      ]
    }
  ]
}
[/block]
## Step 2 : Fix-it flow for the open incident

The fix-it flow is primarily intended to run a quick action (either a diagnostic or a remediation action) on the trigger host or on a completely different set of host(s). The action output is shown right within Neptune.

Sample fix-it actions are :

1. Quick commands to see what processes are running on the trigger host
2. Restarting a process or a server quickly
3. Executing your custom runbooks on a single host or a cluster of hosts
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/mcAp7E84Siq432cbTf5w_FixItFlow.png",
        "FixItFlow.png",
        "1616",
        "887",
        "#60b99a",
        ""
      ]
    }
  ]
}
[/block]
Once you run a fix-it action, you will be shown the action output. If you like the action output and want to run the action every time the alarm triggers, you can automate it by creating a rule using the *Run every time* button in the action output window.

For more details about the fix-it flow, please refer to the [Fix Incidents](doc:incident-dashboard) section.

## Step 3 : All Incidents' analytics

The analytics are primarily intended to highlight which alarms are causing the most incidents and which alarms are causing the most damage to you in terms of MTTR. (Recall that an alarm is like a class and an incident is an instance of an alarm.)

* We recommend that you look at your top alarms (sorted by either count or MTTR) to create rules and reduce your on-call burden
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/7hym3XOaSq69CJQyesZw_IncidentAnalytics.png",
        "IncidentAnalytics.png",
        "1604",
        "883",
        "#780642",
        ""
      ]
    }
  ]
}
[/block]

[block:callout]
{
  "type": "success",
  "title": "Congrats! Now that you have learned about quick fix-it actions and rules, let's finally manage your notifications",
  "body": "[Next step : Configure notifications](doc:configure-notifications)"
}
[/block]
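
To make the Step 3 analytics more concrete, here is a minimal Python sketch of how incidents (instances) can be grouped by their alarm (the class-like definition) and ranked by count or by MTTR. This is an illustration only, not Neptune's implementation or API: the `Incident` type, its field names, the `summarize` and `top_alarms` helpers, and the sample alarm names are all hypothetical, and MTTR is assumed here to be the average time between an incident opening and being resolved.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """One occurrence (instance) of an alarm. All field names are hypothetical."""
    alarm: str              # name of the alarm definition that fired
    opened_at: datetime
    resolved_at: datetime

    @property
    def time_to_resolve(self) -> timedelta:
        return self.resolved_at - self.opened_at


def summarize(incidents):
    """Group resolved incidents by alarm and return {alarm: (count, mttr)}."""
    durations = defaultdict(list)
    for incident in incidents:
        durations[incident.alarm].append(incident.time_to_resolve)
    return {
        alarm: (len(times), sum(times, timedelta()) / len(times))
        for alarm, times in durations.items()
    }


def top_alarms(summary, by="count"):
    """Return alarm names sorted from highest to lowest 'count' or 'mttr'."""
    index = {"count": 0, "mttr": 1}[by]
    return sorted(summary, key=lambda alarm: summary[alarm][index], reverse=True)


# Made-up history: three short high_cpu incidents and one long disk_full incident.
t0 = datetime(2015, 8, 10, 12, 0)
history = [
    Incident("high_cpu", t0, t0 + timedelta(minutes=5)),
    Incident("high_cpu", t0, t0 + timedelta(minutes=15)),
    Incident("high_cpu", t0, t0 + timedelta(minutes=10)),
    Incident("disk_full", t0, t0 + timedelta(hours=2)),
]
summary = summarize(history)
print(top_alarms(summary, by="count"))  # ['high_cpu', 'disk_full']
print(top_alarms(summary, by="mttr"))   # ['disk_full', 'high_cpu']
```

In this made-up data the two orderings disagree: `high_cpu` fires most often, while `disk_full` takes longest to resolve. That is why it is worth checking both sort orders before deciding which alarms to automate with rules.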