Vulnerability Scanning

Ephor Scanner is a Kubernetes agent that discovers running workloads, scans their container images for vulnerabilities using Trivy, and reports findings to the Ephor API. It runs as a CronJob and requires no persistent state of its own.

How It Works

The scanner executes a stateless pipeline on each run:

  1. Discover workloads -- queries the Kubernetes API for Deployments, StatefulSets, DaemonSets, and CronJobs across the configured namespaces.
  2. Deduplicate images -- extracts unique container images so each image is scanned only once, regardless of how many workloads reference it.
  3. Scan images -- runs Trivy against each unique image in parallel (configurable concurrency).
  4. Deliver results -- groups findings by namespace and sends them to the Ephor API via POST /api/v1/scans/ingest.

Each run generates a scan group ID (UUID) that links all per-namespace payloads together. Failed image scans are logged but do not block delivery of successful results.
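
Steps 1 and 2 can be approximated from a workstation with kubectl. The sketch below is not the scanner's implementation: it reads images from running pods rather than from workload specs, ignores init containers, and the helper names `list_pod_images` and `unique_images` are hypothetical.

```shell
# Approximation of the discover and deduplicate steps using kubectl.
# Helper names are illustrative and not part of Ephor Scanner.

# Read image references (separated by any whitespace) on stdin and
# print each distinct image once, sorted.
unique_images() {
  tr -s '[:space:]' '\n' | sed '/^$/d' | sort -u
}

# Print the container images of all pods in one namespace.
# Init containers are not included in this sketch.
list_pod_images() {
  kubectl get pods -n "$1" -o jsonpath='{.items[*].spec.containers[*].image}'
}

# Example (requires cluster access):
#   for ns in default production; do list_pod_images "$ns"; echo; done | unique_images
```

Deduplicating before scanning matters because the same image is often referenced by many workloads, and each Trivy run is comparatively expensive.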

Prerequisites

  • A running Ephor instance (API reachable from the cluster)
  • Helm 3.10 or later
  • Kubernetes 1.25 or later

Installation

The scanner is deployed using its own Helm chart from the ephor-scanner repository:

bash
git clone https://github.com/holbein-io/ephor-scanner.git
cd ephor-scanner

# The comma is escaped and the value quoted so Helm reads both namespaces as one string.
helm install ephor-scanner deploy/helm/ephor-scanner \
  --namespace ephor \
  --set ephor.apiUrl=http://ephor-api:8080 \
  --set "scan.namespaces=default\,production"

The scanner then runs on its default schedule (every 6 hours). To confirm that the CronJob was created:

bash
kubectl get cronjobs -n ephor

Triggering a Manual Scan

To run an immediate scan outside the schedule:

bash
kubectl create job --from=cronjob/ephor-scanner ephor-scanner-manual -n ephor

Watch the scan progress:

bash
kubectl logs -f job/ephor-scanner-manual -n ephor
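
Job names must be unique within a namespace, so creating a second job named ephor-scanner-manual fails until the first one is deleted. The sketch below works around this with a timestamp suffix and then waits for completion. The helper names and the 30-minute timeout are assumptions for illustration.

```shell
# Build a job name with a timestamp suffix so repeated manual runs do not collide.
manual_job_name() {
  printf 'ephor-scanner-manual-%s\n' "$(date +%s)"
}

# Create a one-off job from the CronJob, then block until it completes.
# If the job fails, kubectl wait keeps waiting until the timeout expires.
run_manual_scan() {
  local job
  job="$(manual_job_name)"
  kubectl create job --from=cronjob/ephor-scanner "$job" -n ephor &&
    kubectl wait --for=condition=complete "job/$job" -n ephor --timeout=30m
}

# Example (requires cluster access):
#   run_manual_scan
```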

Verifying Results

After a scan completes, findings appear in the Ephor dashboard. You can also check via the API:

bash
curl http://ephor-api:8080/api/v1/vulnerabilities
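
The hostname ephor-api normally resolves only inside the cluster. A minimal sketch for checking the endpoint from a workstation follows. It assumes the API is exposed by a Service named ephor-api in the ephor namespace, which this guide does not confirm, and `http_status` is a hypothetical helper.

```shell
# Print the HTTP status code returned by a GET request to the given URL.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# From a workstation, forward the API port first. The Service name and
# namespace are assumptions inferred from the URL used in this guide:
#   kubectl port-forward -n ephor svc/ephor-api 8080:8080 &
#   http_status http://localhost:8080/api/v1/vulnerabilities
```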

Configuration

All configuration is supplied through environment variables, which the Helm chart sets from its values. The two required settings are:

Value              Description
ephor.apiUrl       Base URL of the Ephor API (e.g., http://ephor-api:8080)
scan.namespaces    Comma-separated list of Kubernetes namespaces to scan

See the Scanner Configuration Reference for the full list of environment variables and the Scanner Helm Values Reference for all chart values.

Authentication

If the Ephor API requires authentication, configure the scanner with a custom header:

bash
# The comma is escaped and the value quoted so Helm reads both namespaces as one string.
helm install ephor-scanner deploy/helm/ephor-scanner \
  --namespace ephor \
  --set ephor.apiUrl=http://ephor-api:8080 \
  --set ephor.authHeader=X-API-Key \
  --set ephor.authValue=your-api-key \
  --set "scan.namespaces=default\,production"

The header name and value are stored in a Kubernetes Secret.
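
Passing the key with --set leaves it in shell history and in the process list. One alternative, sketched here, is to write the same settings to a values file and pass it with -f. The keys mirror the --set paths used in this guide. The function name, the file name, and the EPHOR_API_KEY variable are assumptions for illustration.

```shell
# Write chart values to a file readable only by the current user.
# The API key is taken from the EPHOR_API_KEY environment variable (hypothetical name).
write_scanner_values() {
  : "${EPHOR_API_KEY:?set EPHOR_API_KEY first}"
  cat > "$1" <<EOF
ephor:
  apiUrl: http://ephor-api:8080
  authHeader: X-API-Key
  authValue: "${EPHOR_API_KEY}"
scan:
  namespaces: "default,production"
EOF
  chmod 600 "$1"
}

# Example:
#   EPHOR_API_KEY=your-api-key write_scanner_values values-auth.yaml
#   helm install ephor-scanner deploy/helm/ephor-scanner --namespace ephor -f values-auth.yaml
```

Inside a values file the comma in the namespace list needs no escaping. A key that contains double quotes or backslashes would need YAML escaping, which this sketch does not handle.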

Air-Gapped Environments

By default, Trivy downloads its vulnerability database from the public OCI registry on each run. In air-gapped or restricted environments:

  1. Mirror the Trivy database to an internal OCI registry.
  2. Configure the scanner to use it:
bash
helm install ephor-scanner deploy/helm/ephor-scanner \
  --namespace ephor \
  --set ephor.apiUrl=http://ephor-api:8080 \
  --set scan.namespaces=default \
  --set trivy.dbRepo=registry.internal/trivy-db
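
Step 1 can be done with the oras CLI from a host that can reach both registries. This is a sketch under assumptions: it copies Trivy's public database image ghcr.io/aquasecurity/trivy-db:2 and keeps the tag 2, which Trivy uses when a repository is given without a tag. Whether the chart expects a tag in trivy.dbRepo is not stated in this guide, and `mirror_trivy_db` is a hypothetical helper.

```shell
# Copy the Trivy vulnerability database image to an internal registry.
# Argument: destination repository without a tag, e.g. registry.internal/trivy-db
mirror_trivy_db() {
  oras cp ghcr.io/aquasecurity/trivy-db:2 "$1:2"
}

# Example (requires oras and access to both registries):
#   mirror_trivy_db registry.internal/trivy-db
```

The database is rebuilt upstream frequently, so the copy would need to be repeated on a schedule to keep findings current.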

Alternatively, pre-populate the Trivy cache and skip the update entirely:

bash
--set trivy.skipDbUpdate=true

Persistent Cache

The Helm chart creates a PersistentVolumeClaim by default to cache the Trivy vulnerability database across runs. This avoids re-downloading the database (typically 40-80 MB) on every scan.

To disable persistent caching and use an ephemeral emptyDir volume:

bash
--set cache.enabled=false

RBAC

The Helm chart creates a ClusterRole and ClusterRoleBinding that grant the scanner read access to workloads (Deployments, StatefulSets, DaemonSets, CronJobs) and pods across the configured namespaces. No write access to cluster resources is required.
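
The granted permissions can be checked with kubectl auth can-i by impersonating the scanner's service account. In the sketch below the service account name ephor-scanner is an assumption (the chart may use a different name), and `check_scanner_rbac` is a hypothetical helper. Impersonation itself requires sufficient privileges on the caller's side.

```shell
# Print whether a service account may list each workload type in a namespace.
# Arguments: <service-account-namespace> <service-account-name> <target-namespace>
check_scanner_rbac() {
  local sa="system:serviceaccount:$1:$2" resource
  for resource in deployments statefulsets daemonsets cronjobs pods; do
    printf '%s: ' "$resource"
    # Prints "yes" or "no" for the impersonated account.
    kubectl auth can-i list "$resource" --as="$sa" -n "$3"
  done
}

# Example (requires cluster access):
#   check_scanner_rbac ephor ephor-scanner default
```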

Licensed under AGPL v3