Building a Framework-Agnostic Health Check Library for Python Microservices

Tasos Nikolaou
#architecture #microservices #monitoring #python

From duplicated /health endpoints to a published PyPI package - an engineering deep dive.


The Problem: Death by Copy-Paste Health Checks

In a typical microservice architecture, health endpoints start simple:

GET /health
GET /ready

But over time, reality sets in.

Some services use:

  • Django + PostgreSQL + Redis + Celery

  • FastAPI + SQLAlchemy + Redis

  • BFFs that depend on upstream HTTP services

  • RabbitMQ + background workers

  • Async stacks mixed with sync stacks

Each service needs:

  • Liveness checks

  • Readiness checks

  • Dependency verification

  • Timeouts

  • Structured JSON output

  • Correct HTTP status codes

And before long, every service has its own slightly different HealthService.

Different thresholds.
Different response formats.
Different timeout logic.
Different readiness semantics.

That's when I realized:

Health checks are infrastructure. They should not be rewritten per service.

So I built PulseCheck - a framework-agnostic health and readiness library for Python.


Design Goals

Before writing a single line of code, I defined constraints:

  1. Framework-agnostic core
  2. Pluggable dependency checks
  3. Async-first design (to support FastAPI)
  4. Sync compatibility (for Django)
  5. No forced dependency pollution
  6. Kubernetes-friendly readiness semantics
  7. Optional dependency extras
  8. Clean, structured JSON output
  9. Production-safe timeouts

This wasn't just about code reuse.

It was about architectural consistency.


Architecture: Core + Adapters

The key design decision was separation of concerns.

pulsecheck/
│
├── core/        ← Framework-agnostic health engine
├── fastapi/     ← FastAPI adapter
└── django/      ← Django adapter

1. Core Engine

The core layer contains:

  • Health registry
  • Health aggregation logic
  • Status combination rules
  • Dependency check base class
  • Timeout handling
  • Response schema

It has zero framework dependencies.

The core doesn't know what FastAPI or Django is.
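To make the "status combination rules" concrete, here is a minimal sketch of how a framework-free core could aggregate check results. The names Status, CheckResult, combine, and to_payload are illustrative assumptions, not PulseCheck's actual API, and the "worst status wins" rule is one reasonable choice rather than a confirmed one.

```python
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    HEALTHY = "HEALTHY"
    DEGRADED = "DEGRADED"
    UNHEALTHY = "UNHEALTHY"


# Severity order used when combining results (higher is worse).
_SEVERITY = {Status.HEALTHY: 0, Status.DEGRADED: 1, Status.UNHEALTHY: 2}


@dataclass(frozen=True)
class CheckResult:
    name: str
    status: Status
    response_time_ms: float


def combine(results: list[CheckResult]) -> Status:
    """Return the worst status among the results; no checks means HEALTHY."""
    if not results:
        return Status.HEALTHY
    return max((r.status for r in results), key=_SEVERITY.__getitem__)


def to_payload(results: list[CheckResult]) -> dict:
    """Build a JSON-serializable body using only the standard library."""
    return {
        "status": combine(results).value,
        "checks": {
            r.name: {
                "status": r.status.value,
                "response_time_ms": r.response_time_ms,
            }
            for r in results
        },
    }
```

Nothing here imports FastAPI or Django, so an adapter only has to turn the returned dict into its framework's response object.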


2. Pluggable Checks

Each dependency is implemented as a check:

  • SQLAlchemyAsyncCheck

  • DjangoDBCheck

  • RedisAsyncCheck

  • RedisSyncCheck

  • RabbitMQKombuCheck

  • CeleryInspectCheck

  • HttpDependencyCheck

Each check:

  • Has a name

  • Has a timeout

  • Has a degraded threshold

  • Returns structured results

Example:

registry = HealthRegistry(environment="prod")

registry.register(SQLAlchemyAsyncCheck(engine))
registry.register(RedisAsyncCheck(redis_url))
registry.register(CeleryInspectCheck(celery_app))

No monolithic service class.
Just composition.
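For readers wondering what a single check might look like inside, the sketch below shows a hypothetical base class carrying the four properties listed above: a name, a timeout, a degraded threshold, and a structured result. BaseCheck, probe, run, and SleepCheck are assumed names for illustration only; the library's real classes may be shaped differently.

```python
import asyncio
import time
from abc import ABC, abstractmethod


class BaseCheck(ABC):
    """Hypothetical base class, not PulseCheck's real implementation."""

    def __init__(self, name, timeout_s=2.0, degraded_after_ms=500.0):
        self.name = name
        self.timeout_s = timeout_s
        self.degraded_after_ms = degraded_after_ms

    @abstractmethod
    async def probe(self) -> None:
        """Touch the dependency; raise if it is unavailable."""

    async def run(self) -> dict:
        started = time.perf_counter()
        error = None
        try:
            # Bound the probe so one stuck dependency cannot hang readiness.
            await asyncio.wait_for(self.probe(), timeout=self.timeout_s)
        except asyncio.TimeoutError:
            error = "timeout"
        except Exception as exc:
            error = str(exc) or type(exc).__name__
        elapsed_ms = (time.perf_counter() - started) * 1000

        if error is not None:
            status = "UNHEALTHY"
        elif elapsed_ms > self.degraded_after_ms:
            status = "DEGRADED"  # responding, but slower than the threshold
        else:
            status = "HEALTHY"

        result = {
            "name": self.name,
            "status": status,
            "response_time_ms": round(elapsed_ms, 1),
        }
        if error is not None:
            result["error"] = error
        return result


class SleepCheck(BaseCheck):
    """Stand-in dependency that just waits, used to exercise the base class."""

    def __init__(self, name, delay_s, **kwargs):
        super().__init__(name, **kwargs)
        self.delay_s = delay_s

    async def probe(self) -> None:
        await asyncio.sleep(self.delay_s)
```

A concrete check then only implements probe(); timing, timeout handling, and result formatting stay in one place.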


Async-First, Sync-Compatible

FastAPI is async.
Django is traditionally sync.

Instead of creating two engines, the core is async-first.

Sync checks are wrapped using:

asyncio.to_thread(...)

This gives:

  • Async compatibility
  • Non-blocking readiness
  • Unified aggregation logic

This avoids duplicating the health engine.
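The wrapping step can be sketched as follows. asyncio.to_thread requires Python 3.9 or newer. The helper name run_sync_check and the fake ping_database_sync function are assumptions made for this example, and the heartbeat task exists only to show that the event loop stays responsive while the blocking call runs.

```python
import asyncio
import time


def ping_database_sync() -> str:
    """Stand-in for a blocking call such as a Django ORM query."""
    time.sleep(0.05)
    return "ok"


async def run_sync_check(func, timeout_s: float):
    """Run a blocking callable in a worker thread, bounded by a timeout.

    On timeout we stop waiting, but the worker thread keeps running until
    the blocking call returns; Python threads cannot be cancelled.
    """
    return await asyncio.wait_for(asyncio.to_thread(func), timeout=timeout_s)


async def main():
    ticks = 0

    async def heartbeat():
        # Counts loop iterations while the sync check is in flight.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    result = await run_sync_check(ping_database_sync, timeout_s=1.0)
    beat.cancel()
    return result, ticks
```

Because a timed-out thread is abandoned rather than stopped, the underlying client should also carry its own socket or query timeout.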


Readiness vs Liveness

This is often misunderstood.

Liveness:

"Is the process alive?"

Readiness:

"Can this service safely receive traffic?"

PulseCheck separates them cleanly:

registry.liveness()
await registry.readiness()

Readiness runs dependency checks.
Liveness does not.

This mirrors Kubernetes probe behavior.
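The split matters because Kubernetes restarts a container whose liveness probe fails, but only removes it from service endpoints when the readiness probe fails. If liveness touched Redis, a Redis outage would restart perfectly healthy pods. Here is a minimal sketch of that separation. MiniRegistry is a hypothetical stand-in, not the real HealthRegistry, and its checks are simplified to async callables returning True.

```python
import asyncio


class MiniRegistry:
    """Hypothetical stand-in for HealthRegistry, for illustration only."""

    def __init__(self):
        self._checks = {}  # name -> async callable returning True on success

    def register(self, name, check):
        self._checks[name] = check

    def liveness(self) -> dict:
        # No I/O at all: if this code runs, the process is alive.
        return {"status": "HEALTHY"}

    async def readiness(self) -> dict:
        # Run every dependency check concurrently; exceptions are captured
        # as values so one failing check cannot hide the others.
        names = list(self._checks)
        outcomes = await asyncio.gather(
            *(self._checks[name]() for name in names),
            return_exceptions=True,
        )
        checks = {
            name: "HEALTHY" if outcome is True else "UNHEALTHY"
            for name, outcome in zip(names, outcomes)
        }
        ready = all(status == "HEALTHY" for status in checks.values())
        return {"status": "HEALTHY" if ready else "UNHEALTHY", "checks": checks}
```

Note that liveness() is a plain function while readiness() is a coroutine, matching the call style shown above.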


Handling Degraded States

Health isn't binary.

Instead of just UP or DOWN, PulseCheck supports:

  • HEALTHY
  • DEGRADED
  • UNHEALTHY

If a dependency is slow but responding:

{
  "status": "DEGRADED",
  "response_time_ms": 750
}

This gives operational insight without triggering restarts.
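How a three-state result maps onto HTTP is a design choice. The sketch below assumes that DEGRADED still answers 200 so the probe passes, and that only UNHEALTHY answers 503. That mapping is my assumption for illustration, not something confirmed from the library, and http_status_for is a hypothetical helper name.

```python
from http import HTTPStatus


def http_status_for(status: str) -> int:
    """Map a health status to an HTTP status code (assumed mapping)."""
    if status in ("HEALTHY", "DEGRADED"):
        # Kubernetes HTTP probes treat any code from 200 to 399 as success,
        # so a degraded service keeps receiving traffic.
        return int(HTTPStatus.OK)
    # Anything else, including unknown values, fails closed.
    return int(HTTPStatus.SERVICE_UNAVAILABLE)
```

The DEGRADED detail then lives in the JSON body, where dashboards and alerts can pick it up without the orchestrator reacting to it.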


Optional Dependencies Done Right

One of the most important design decisions was dependency management.

FastAPI projects already have FastAPI.
Django projects already have Django.

The library must not force unnecessary installations.

In pyproject.toml (excerpt):

[project.optional-dependencies]
fastapi = ["fastapi>=0.100"]
django = ["Django>=4.2"]
redis_async = ["redis>=5.0"]
rabbitmq = ["kombu>=5.3"]
celery = ["celery>=5.3"]

Now:

FastAPI service:

pip install "pulsecheck-py[fastapi,redis_async]"

Django service:

pip install "pulsecheck-py[django,redis_sync]"

Clean. Explicit. Controlled.
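Extras only help if the package itself imports cleanly when they are absent. One common pattern is to import the optional dependency lazily, inside the check that needs it, and fail with an actionable message. The helper below is a sketch of that idea. The function name require is hypothetical, and whether PulseCheck does exactly this internally is an assumption.

```python
import importlib


def require(module_name: str, extra: str):
    """Import an optional dependency or raise with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this check. "
            f'Install it with: pip install "pulsecheck-py[{extra}]"'
        ) from exc
```

A Redis check could call require("redis.asyncio", extra="redis_async") in its constructor, so services that never register that check never need redis installed.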


Hiding Health Endpoints From Swagger

Health endpoints are infrastructure endpoints.

In FastAPI:

@router.get("/health", include_in_schema=False)

They exist.
They work.
They don't pollute public API docs.

Small detail. Big professionalism signal.


Testing Before Publishing

Before uploading to PyPI, I tested:

  • Editable installs (pip install -e .)
  • Wheel builds (python -m build)
  • Installation from built wheel
  • Installation from TestPyPI
  • Optional extras resolution
  • Fresh virtual environments

I also learned something important:

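TestPyPI is a separate index. It lacks most real packages and hosts plenty of junk uploads, so if pip resolves everything from it with --index-url, your dependencies either fail to install or resolve to the wrong package. Keep PyPI as the primary index and add TestPyPI as an extra one:

pip install --extra-index-url https://test.pypi.org/simple/ pulsecheck-py

Not --index-url. Even then, pip does not prioritize one index over another and picks the best matching version across all of them, so run this only in a throwaway virtual environment.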

Small ecosystem lesson.


Publishing to PyPI

Publishing was straightforward:

python -m build
python -m twine upload dist/*

Important rule:

You cannot overwrite a version on PyPI. Even if you delete a release, its filenames can never be uploaded again.

Every change requires a version bump.

This enforces discipline.


Engineering Lessons Learned

  1. Design the API before writing implementation.
  2. Keep core logic framework-agnostic.
  3. Async-first design avoids duplication.
  4. Optional dependencies prevent ecosystem pollution.
  5. Health endpoints are infrastructure, not business logic.
  6. Packaging and versioning require discipline.
  7. Publishing is easier than maintaining.

Why This Matters

Microservices suffer from invisible duplication.

Health checks are often treated as boilerplate.

But consistency in infrastructure code improves:

  • Operational clarity

  • Monitoring integration

  • Kubernetes reliability

  • Onboarding speed

  • Codebase maintainability

PulseCheck turned copy-paste health logic into a reusable, composable abstraction.


Future Roadmap

  • OpenTelemetry hooks
  • Prometheus integration
  • Circuit-breaker awareness
  • Startup probe support
  • Health history tracking
  • Async worker health strategies

Final Thoughts

Publishing a library is about more than writing code.

It's about:

  • API design
  • Dependency discipline
  • Versioning strategy
  • Documentation clarity
  • Ecosystem compatibility

PulseCheck started as internal cleanup.
It became a reusable infrastructure layer.

If you're duplicating health logic across services, consider abstracting it.

Your future self will thank you.


Links

PyPI: https://pypi.org/project/pulsecheck-py/
GitHub: https://github.com/tasosCDR/pulsecheck-py


If you'd like feedback on the architecture or want to contribute, feel free to reach out.


"PulseCheck is intentionally minimal today, but its architecture leaves room for deeper observability and resilience integrations."