### Context
For calendar and message sync job health monitoring, we used to
increment a counter in the Redis cache, which could lead to concurrency
issues.
### Solution
- Replace the counter with a set structure and use the Redis `sAdd`
method, which is atomic
- Previously, a separate counter was incremented on a new cache key each
minute -> the window is now 15s
- Remove the ONGOING status, which is not needed. We only need the status
at job end (or failure).
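
As a rough illustration of the set-per-window idea above, here is a minimal sketch. It uses an in-memory stand-in instead of a real Redis client so it can run on its own. `InMemorySetStore`, `windowKey`, `recordJobResult`, the key layout, and the use of a job or channel id as the set member are all assumptions made for this example, not the actual implementation.

```typescript
// Hypothetical in-memory stand-in for the two Redis set commands used here.
// A real Redis server executes SADD atomically; this class only mimics the API.
class InMemorySetStore {
  private readonly sets = new Map<string, Set<string>>();

  // Mirrors SADD: returns how many members were actually added (0 or 1).
  sAdd(key: string, member: string): number {
    const set = this.sets.get(key) ?? new Set<string>();
    const sizeBefore = set.size;
    set.add(member);
    this.sets.set(key, set);
    return set.size - sizeBefore;
  }

  // Mirrors SCARD: number of members in the set, 0 if the key is missing.
  sCard(key: string): number {
    return this.sets.get(key)?.size ?? 0;
  }
}

// 15s window, as described in the list above.
const WINDOW_MS = 15000;

// Assumed key layout: <metric>:<status>:<window start in ms>.
function windowKey(metric: string, status: string, nowMs: number): string {
  const windowStart = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
  return `${metric}:${status}:${windowStart}`;
}

// Record the final status of a job. Adding the same id twice within one
// window is a no-op, so retries do not inflate the count.
function recordJobResult(
  store: InMemorySetStore,
  metric: string,
  status: string,
  jobId: string,
  nowMs: number,
): void {
  store.sAdd(windowKey(metric, status, nowMs), jobId);
}
```

With a set, the per-window count is simply the set's cardinality, so no read-modify-write step happens on the application side.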
### Potential improvements
- Check whether the cache key exists before fetching data, to avoid
unnecessary calls to Redis?
closes https://github.com/twentyhq/twenty/issues/10070
# Health Monitoring for Self-Hosted Instances
This PR implements basic health monitoring for self-hosted instances in
the admin panel.
## Service Status Checks
We're adding real-time health checks for:
- Redis Connection
- Database Connection
- Worker Status
- Message Sync Status
## Existing Functionality
We already have message sync and captcha counters that store aggregated
metrics in cache within a configurable time window (default: 5 minutes).
## New Endpoints
1. `/healthz` - Basic server health check for Kubernetes pod monitoring
2. `/healthz/{serviceName}` - Individual service health checks (returns
200 if healthy)
3. `/metricsz/{metricName}` - Time-windowed metrics (message sync,
captcha)
4. GraphQL resolver in admin panel for UI consumption
All endpoints use the same underlying service, with different
presentation layers for infrastructure and UI needs.
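
To make the "one service, several presentation layers" idea concrete, here is a minimal sketch. `HealthService`, `HealthProbe`, `httpStatusFor`, and `adminPanelLabel` are hypothetical names for this example. The 503 response for an unhealthy service and the 404 for an unknown service name are also assumptions, since the description only states that a healthy service returns 200.

```typescript
type HealthState = 'up' | 'down';
type HealthProbe = () => Promise<HealthState>;

// Hypothetical shared service: one probe per service name.
class HealthService {
  constructor(private readonly probes: Map<string, HealthProbe>) {}

  // Returns undefined for an unknown service; a throwing probe counts as down.
  async check(serviceName: string): Promise<HealthState | undefined> {
    const probe = this.probes.get(serviceName);
    if (probe === undefined) {
      return undefined;
    }
    try {
      return await probe();
    } catch (_error) {
      return 'down';
    }
  }
}

// Presentation for infrastructure: map the state to an HTTP status code.
// 503 and 404 are assumptions; only the 200 case comes from the description.
function httpStatusFor(state: HealthState | undefined): number {
  if (state === undefined) {
    return 404;
  }
  return state === 'up' ? 200 : 503;
}

// Presentation for the admin panel UI: a human-readable label.
function adminPanelLabel(state: HealthState | undefined): string {
  if (state === undefined) {
    return 'Unknown';
  }
  return state === 'up' ? 'Operational' : 'Outage';
}
```

Both presentation functions consume the same `check` result, so a REST controller and a GraphQL resolver can stay thin wrappers around the shared service.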
---------
Co-authored-by: Félix Malfait <felix@twenty.com>
Context:
We want to implement some counters to monitor server health. The first
counters will track messageChannel sync status during job execution
and invalid captchas.
How:
Counters are stored in cache and grouped by one-minute windows.
Controllers are created for each metric, aggregating the counters over a
five-minute window.
Endpoints are public and will be queried by Prometheus.
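
The windowing and aggregation described above can be sketched roughly as follows. A plain `Map` stands in for the cache, and `bucketStart`, `bucketKeysInRange`, `aggregateCounter`, and the `<metric>:<bucket start>` key layout are hypothetical choices for this example rather than the actual code.

```typescript
const BUCKET_MS = 60 * 1000; // counters are grouped by one-minute windows
const RANGE_MS = 5 * 60 * 1000; // controllers aggregate over five minutes

// Start (in ms) of the one-minute bucket containing the given timestamp.
function bucketStart(timestampMs: number): number {
  return Math.floor(timestampMs / BUCKET_MS) * BUCKET_MS;
}

// Keys of the five most recent buckets, oldest first, including the
// bucket that contains nowMs.
function bucketKeysInRange(metric: string, nowMs: number): string[] {
  const keys: string[] = [];
  const newest = bucketStart(nowMs);
  for (
    let start = newest - RANGE_MS + BUCKET_MS;
    start <= newest;
    start += BUCKET_MS
  ) {
    keys.push(`${metric}:${start}`);
  }
  return keys;
}

// Sum the counters of all buckets in range; missing buckets count as 0.
function aggregateCounter(
  cache: Map<string, number>,
  metric: string,
  nowMs: number,
): number {
  return bucketKeysInRange(metric, nowMs).reduce(
    (total, key) => total + (cache.get(key) ?? 0),
    0,
  );
}
```

Because each bucket has its own key, old buckets can simply expire from the cache, and the controller only needs to read the handful of keys that fall inside the aggregation range.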
closes https://github.com/twentyhq/core-team-issues/issues/55