Incident management covers the full lifecycle of an incident, from creation to recovery. Phare Uptime automates this lifecycle for you.
An incident is created when a monitor fails, according to the monitor's confirmation configuration and checks (keywords, SSL, etc.). The default alert policy emails you a notification when an incident is created.
An incident is resolved when a monitor succeeds again, according to the monitor's recovery configuration and checks (keywords, SSL, etc.). The default alert policy emails you a notification when an incident recovers.
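As a rough illustration of how confirmation and recovery settings can gate incident creation and resolution, here is a minimal sketch. It assumes both settings are counts of consecutive check results, which is a simplification for illustration and not necessarily how Phare implements them; `IncidentTracker`, `confirmations`, and `recoveries` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IncidentTracker:
    """Hypothetical model, not Phare's implementation."""

    confirmations: int = 2  # assumed: consecutive failed checks before an incident is created
    recoveries: int = 2     # assumed: consecutive successful checks before it is resolved
    incident_open: bool = False
    # Length of the current run of checks that contradict the current state.
    streak: int = field(default=0, init=False)

    def record_check(self, success: bool) -> Optional[str]:
        """Feed one check result; return "created", "resolved", or None."""
        # A check contradicts the current state when it fails while no incident
        # is open, or succeeds while an incident is open.
        contradicts = success if self.incident_open else not success
        if not contradicts:
            self.streak = 0
            return None

        self.streak += 1
        threshold = self.recoveries if self.incident_open else self.confirmations
        if self.streak < threshold:
            return None

        # Enough consecutive contradicting checks: flip the incident state.
        self.streak = 0
        self.incident_open = not self.incident_open
        return "created" if self.incident_open else "resolved"


tracker = IncidentTracker(confirmations=2, recoveries=2)
results = [True, False, True, False, False, True, True]
events = [tracker.record_check(ok) for ok in results]
```

In this sketch a single failed check between two successful ones does not open an incident, which is the kind of noise a confirmation setting is meant to filter out.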
Incidents can be classified by their impact on your users. Project members can manually set the impact of an incident to one of the following values:
- Unknown: The impact of the incident is not known yet. (default)
- Operational: The incident has no impact on your users.
- Maintenance: The incident is caused by planned maintenance.
- Degraded performance: The incident degrades performance for your users.
- Partial outage: The incident impacts only some users or some parts of your service.
- Major outage: The incident impacts all your users or your entire service.
The impact of an incident can be used internally to prioritize its resolution, and it is reported on status pages if one is linked to the monitor that triggered the incident. If the incident impact is set to Unknown, Phare sets your monitor status to down on your status page, without additional details.
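The impact values above can be modeled as a small enumeration, for example when triaging incidents in your own tooling. The sketch below is illustrative only: the `Impact` enum, the `TRIAGE_ORDER` ranking, and both helper functions are hypothetical and not part of any Phare API. Only the Unknown case (shown as down, with no details) comes from the behavior described above; the labels for the other values and the ranking are assumptions.

```python
from enum import Enum
from typing import List, Tuple


class Impact(Enum):
    UNKNOWN = "Unknown"  # default
    OPERATIONAL = "Operational"
    MAINTENANCE = "Maintenance"
    DEGRADED_PERFORMANCE = "Degraded performance"
    PARTIAL_OUTAGE = "Partial outage"
    MAJOR_OUTAGE = "Major outage"


# Assumed triage order, most urgent first. Adjust to your team's own policy.
TRIAGE_ORDER = [
    Impact.MAJOR_OUTAGE,
    Impact.PARTIAL_OUTAGE,
    Impact.DEGRADED_PERFORMANCE,
    Impact.UNKNOWN,
    Impact.MAINTENANCE,
    Impact.OPERATIONAL,
]


def status_page_label(impact: Impact) -> str:
    """Return the text a status page might show for an incident."""
    # Unknown impact: the monitor is reported as down, without further details.
    if impact is Impact.UNKNOWN:
        return "Down"
    # Assumption: every other impact is displayed under its own name.
    return impact.value


def prioritize(incidents: List[Tuple[str, Impact]]) -> List[Tuple[str, Impact]]:
    """Sort (name, impact) pairs so the most urgent incidents come first."""
    return sorted(incidents, key=lambda item: TRIAGE_ORDER.index(item[1]))


queue = prioritize([
    ("cdn", Impact.OPERATIONAL),
    ("api", Impact.MAJOR_OUTAGE),
    ("website", Impact.UNKNOWN),
])
```

Ranking Unknown above Maintenance and Operational is a deliberate choice in this sketch: an incident nobody has classified yet may still turn out to be an outage.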
The event timeline helps you understand what happened during an incident. It shows the status of the monitor at each check, along with the alert policy rules that were triggered.
Project members can comment on the incident timeline to share findings and resolve the incident faster. Comments are written in a rich text editor that also supports Markdown syntax.