FixFast Architecture & System Overview
FixFast provides alert and incident intelligence with a deterministic, explainable pipeline. This overview describes the system components, data flow, and controls for multi-tenancy and access.
Architecture Overview
FixFast consists of these major components:
- Alert sources
- Ingestion layer
- Alert processing and grouping
- Incident correlation
- Incident storage
- Explainable summaries
- Incident Pattern Intelligence
- API layer
- User interfaces and integrations
- Security and access control
flowchart LR
    A[Grafana Alertmanager] -->|Webhooks| B[FixFast Ingestion Layer]
    B --> C[Alert Processing & Grouping]
    C --> D[Incident Correlation Engine]
    D --> E[Incident Store]
    D --> F[Explainable Summaries]
    E --> G[Incident Pattern Intelligence]
    G --> H[Aggregated Metrics Store]
    F --> I[FixFast API]
    E --> I
    H --> I
    I --> J[FixFast Web UI]
    I --> K[Slack Integration]
    I --> L[Generic Webhooks]
    subgraph "Security & Access"
        M["Org Isolation (org_id)"]
        N["RBAC: Admin / Editor / Viewer"]
    end
    I --> M
    I --> N
Component Descriptions
1. Alert Sources
- Primary supported source: Grafana Alertmanager
- Alerts are sent via webhooks with severity, service, environment, and labels
- Purpose: Provide reliable alert signals into FixFast
2. Ingestion Layer
- Receives alerts from external systems
- Validates payload structure and applies retry handling
- Associates alerts with the correct organization (org_id)
- Purpose: Ensure secure, reliable alert intake
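The intake steps above can be illustrated with a minimal sketch. It assumes the webhook payload is already parsed into a dictionary and that the caller has resolved the tenant's org_id (for example from an API key). The names `IngestedAlert`, `REQUIRED_FIELDS`, and `ingest` are hypothetical and not part of FixFast's published interface.

```python
from dataclasses import dataclass

# Fields the overview says each alert carries; the exact schema is an assumption.
REQUIRED_FIELDS = ("severity", "service", "environment", "labels")


@dataclass
class IngestedAlert:
    org_id: str
    severity: str
    service: str
    environment: str
    labels: dict


def ingest(payload: dict, org_id: str) -> IngestedAlert:
    """Validate payload structure, then tag the alert with its tenant."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if not isinstance(payload["labels"], dict):
        raise ValueError("labels must be a mapping")
    return IngestedAlert(
        org_id=org_id,
        severity=payload["severity"],
        service=payload["service"],
        environment=payload["environment"],
        labels=dict(payload["labels"]),  # copy so later stages cannot mutate the input
    )
```

Rejecting malformed payloads at this boundary means later stages can rely on every alert having an org_id and the four core fields.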
3. Alert Processing & Grouping
- Normalizes incoming alerts
- Applies deterministic grouping rules using fingerprints, time windows, and shared context
- Purpose: Reduce alert noise and prepare alerts for correlation
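One possible way to combine fingerprints and time windows deterministically is sketched below. The fields that make up the fingerprint, the 300-second window, and the function names are illustrative assumptions, not FixFast's actual grouping rules.

```python
import hashlib
from collections import defaultdict

WINDOW_SECONDS = 300  # assumed window size, for illustration only


def fingerprint(alert: dict) -> str:
    """Stable hash over the fields assumed to identify 'the same' alert."""
    key = "|".join(
        [alert["org_id"], alert["service"], alert["environment"], alert["name"]]
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def group_alerts(alerts: list) -> dict:
    """Bucket alerts by (fingerprint, fixed time window).

    Sorting first makes the result independent of arrival order.
    """
    groups = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: (a["timestamp"], a["name"])):
        bucket = int(alert["timestamp"]) // WINDOW_SECONDS
        groups[(fingerprint(alert), bucket)].append(alert)
    return dict(groups)
```

Fixed buckets keep the rule easy to explain, at the cost of splitting alerts that straddle a bucket boundary. A sliding window would avoid that but is harder to reason about.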
4. Incident Correlation Engine
- Groups related alerts into incidents
- Identifies primary and supporting signals
- Records grouping rationale for auditability
- Purpose: Create explainable, trustworthy incidents
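A rough sketch of how an incident could carry its own grouping rationale follows. The severity ranking, the rule for choosing the primary signal (highest severity, then earliest timestamp), and the `Incident` shape are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed ordering; unknown severities rank lowest.
SEVERITY_RANK = {"critical": 3, "warning": 2, "info": 1}


@dataclass
class Incident:
    org_id: str
    primary: dict
    supporting: list
    rationale: list


def correlate(group: list) -> Incident:
    """Turn one group of related alerts into an incident with a recorded rationale."""
    if not group:
        raise ValueError("cannot build an incident from an empty group")
    ordered = sorted(
        group,
        key=lambda a: (-SEVERITY_RANK.get(a["severity"], 0), a["timestamp"]),
    )
    primary, supporting = ordered[0], ordered[1:]
    # Record which context fields every alert in the group shares.
    rationale = [
        f"{len(group)} alerts share {field}={primary[field]}"
        for field in ("service", "environment")
        if field in primary and all(a.get(field) == primary[field] for a in group)
    ]
    rationale.append(
        f"primary signal {primary['name']}: highest severity, then earliest timestamp"
    )
    return Incident(primary["org_id"], primary, supporting, rationale)
```

Storing the rationale as plain strings alongside the incident lets an auditor see why alerts were grouped without re-running the correlation.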
5. Incident Store
- Stores active and resolved incidents
- Maintains alert-to-incident relationships
- Enforces retention and deletion policies
- Purpose: Provide a reliable source of incident data
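As an illustration of a predictable retention rule, the sketch below keeps every active incident and drops resolved incidents older than a fixed period. The 90-day value and the `opened_at` / `resolved_at` field names are assumptions; the real policy and schema are not specified in this overview.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=90)  # assumed retention period


def apply_retention(incidents: list, now: datetime) -> list:
    """Return the incidents to keep: active ones, plus resolved ones within retention."""
    cutoff = now - RETENTION
    return [
        incident
        for incident in incidents
        if incident["resolved_at"] is None or incident["resolved_at"] >= cutoff
    ]
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the rule deterministic and easy to test.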
6. Explainable Summaries
- Generates structured summaries per incident
- Includes what happened, why alerts were grouped, probable causes, and recommended actions
- Purpose: Help teams understand incidents quickly and clearly
7. Incident Pattern Intelligence
- Operates on aggregated incident data only
- Produces incident volume trends, exposure analysis, alert noise trends, and recovery speed (MTTR)
- Does not retain raw alerts beyond the retention period
- Purpose: Enable long-term learning and prevention
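Recovery speed (MTTR) can be computed from incident records alone, without raw alerts. The sketch below assumes each incident record has `opened_at` and `resolved_at` timestamps; those field names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


def mttr(incidents: list) -> Optional[timedelta]:
    """Mean time to recovery over resolved incidents; None if none are resolved."""
    durations = [
        (i["resolved_at"] - i["opened_at"]).total_seconds()
        for i in incidents
        if i.get("resolved_at") is not None
    ]
    if not durations:
        return None
    return timedelta(seconds=mean(durations))


# Example data: one 30-minute incident, one 90-minute incident, one still open.
example_incidents = [
    {"opened_at": datetime(2024, 1, 1, 10, 0), "resolved_at": datetime(2024, 1, 1, 10, 30)},
    {"opened_at": datetime(2024, 1, 2, 9, 0), "resolved_at": datetime(2024, 1, 2, 10, 30)},
    {"opened_at": datetime(2024, 1, 3, 8, 0), "resolved_at": None},
]
example_mttr = mttr(example_incidents)
```

Open incidents are excluded from the mean, so a long-running incident does not affect the metric until it is resolved.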
8. API Layer
- Provides programmatic access to FixFast data
- Secured via authentication; all requests scoped by org_id
- Enforced by RBAC
- Purpose: Enable UI, integrations, and automation
9. User Interfaces & Integrations
- Web UI for engineers and operators
- Slack integration for notifications
- Generic webhooks for external systems
- Purpose: Deliver insights where teams work
10. Security & Access Control
- Multi-tenancy: Each organization is fully isolated; data is scoped by org_id; no cross-tenant access.
- Role-Based Access Control (RBAC): Admin, Editor, Viewer; each request is validated against role permissions.
- Purpose: Ensure secure, controlled access.
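The two checks above, tenant scoping and role validation, can be sketched as a single authorization function. The three roles come from this overview, but the action names and the permission mapping are assumptions made for illustration.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    EDITOR = "editor"
    VIEWER = "viewer"


# Hypothetical mapping; the actual permissions per role are not documented here.
PERMISSIONS = {
    Role.ADMIN: {"read", "write", "manage"},
    Role.EDITOR: {"read", "write"},
    Role.VIEWER: {"read"},
}


def authorize(user_org_id: str, role: Role, action: str, resource_org_id: str) -> bool:
    """Allow a request only within the caller's tenant and role permissions."""
    if user_org_id != resource_org_id:
        return False  # tenant isolation is checked first, regardless of role
    return action in PERMISSIONS[role]
```

Checking org_id before the role means even an Admin in one organization cannot reach another organization's data.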
Architecture Principles
- Deterministic behavior
- Explainability over automation
- Separation of real-time and historical analysis
- Strong tenant isolation
- Predictable retention and deletion