Jaeger (distributed tracing)

Name	Jaeger
Developer	The Cloud Native Computing Foundation
Initial release	2016
Programming language	Go
Repository	github.com/jaegertracing/jaeger
License	Apache License 2.0

Contents

Overview
Architecture and Components
Data Model and Instrumentation
Deployment and Scalability
Integrations and Ecosystem
Security and Privacy
History and Development

Jaeger is an open-source, end-to-end distributed tracing system used to monitor and troubleshoot microservices-based architectures. It supports trace collection and visualization, root cause analysis, latency optimization, and service dependency analysis in large-scale environments. Jaeger is widely adopted in cloud-native ecosystems and is often deployed alongside container orchestrators, service meshes, and other observability platforms.

Overview

Jaeger was created to address tracing needs in service-oriented systems and is commonly deployed with Kubernetes, Docker, Prometheus, Grafana, and Envoy. It supports standards and projects such as OpenTracing and OpenTelemetry and interoperates with observability initiatives led by organizations like the Cloud Native Computing Foundation and the Linux Foundation. Companies including Uber Technologies and Red Hat have adopted Jaeger for root cause analysis, performance tuning, and distributed context propagation across services written in languages such as Go, Java, Python, and JavaScript (Node.js).

Architecture and Components

Jaeger’s architecture consists of agents, collectors, a query service, and storage backends, and is designed to run on infrastructure such as Amazon Web Services, Google Cloud Platform, Microsoft Azure, and on-premises data centers running OpenStack. The core components are an agent that runs on each host, a collector that ingests spans, a query service that serves UI requests, and a UI that visualizes traces and service graphs. Spans are persisted in storage backends such as Elasticsearch and Cassandra, and Apache Kafka can be placed between the collectors and storage as an intermediate buffer. Jaeger also supports sampling, baggage propagation, and the context propagation patterns used by projects such as Zipkin and standards like W3C Trace Context. Operators often run Jaeger alongside observability tools from vendors like Datadog, New Relic, and Splunk.
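The context propagation mentioned above can be illustrated with the `traceparent` header from W3C Trace Context. The following is a minimal sketch, assuming version `00` of the header only; the names `SpanContext`, `parse_traceparent`, `format_traceparent`, and `child_context` are hypothetical helpers for illustration and are not part of any Jaeger or OpenTelemetry client library.

```python
import re
import secrets
from typing import NamedTuple, Optional

# traceparent layout (W3C Trace Context, version 00):
#   "00" - trace-id (32 lowercase hex) - parent-id (16 lowercase hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class SpanContext(NamedTuple):
    """Hypothetical container for the fields carried between services."""
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[SpanContext]:
    """Return the context encoded in a traceparent header, or None if invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero identifiers are defined as invalid by the specification.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def format_traceparent(ctx: SpanContext) -> str:
    """Encode a context as a version-00 traceparent header value."""
    return f"00-{ctx.trace_id}-{ctx.span_id}-{'01' if ctx.sampled else '00'}"


def child_context(parent: SpanContext) -> SpanContext:
    """Keep the trace ID and sampling decision, but mint a new span ID."""
    return SpanContext(parent.trace_id, secrets.token_hex(8), parent.sampled)
```

A service receiving a request would parse the incoming header, create a child context for its own span, and send the formatted child context on any outgoing calls, so that every span in the request shares one trace ID.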

Data Model and Instrumentation

The Jaeger data model captures spans, traces, operations, tags, logs, and baggage. Spans are created by instrumentation in libraries and frameworks such as Spring Framework, Express.js, Django, gRPC, and Akka. Instrumentation relies on client libraries that implement the OpenTracing or OpenTelemetry APIs, and auto-instrumentation agents for runtimes like the JVM and Node.js allow integration with minimal code changes. Trace context is commonly propagated across HTTP and RPC boundaries using the headers defined by the W3C Trace Context specification, while sampling strategies (probabilistic, rate-limiting, adaptive) are configurable to balance storage cost against trace fidelity, including in environments managed by service meshes such as Istio and Linkerd.

Deployment and Scalability

Jaeger can be deployed as a standalone binary, as containers on Kubernetes using Helm charts or Operators, or on managed Kubernetes services such as Amazon EKS, Google Kubernetes Engine, and Azure Kubernetes Service. For high-throughput architectures used by companies like Uber Technologies and Airbnb, Jaeger scales horizontally by running multiple collectors and sharding storage, using streaming systems such as Apache Kafka to buffer spans, distributed databases like Cassandra for persistence, and indexing backends such as Elasticsearch for query performance. Deployment patterns often mirror practices from related projects, such as Prometheus federation, Fluentd log aggregation, and Thanos-style long-term storage.
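One common way to shard span traffic across a streaming system or a pool of workers is to partition by trace ID, so that all spans of one trace are handled together. The sketch below illustrates that general technique only; it is an assumption for illustration and does not reproduce how Jaeger's collectors or its Kafka integration assign partitions. The function names `partition_for` and `route_spans` are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def partition_for(trace_id: str, num_partitions: int) -> int:
    """Map a trace ID to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # A cryptographic digest is used only because it is stable across
    # processes, unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route_spans(spans: Iterable[Tuple[str, str]],
                num_partitions: int) -> Dict[int, List[Tuple[str, str]]]:
    """Group (trace_id, operation_name) pairs by their target partition."""
    routed: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for trace_id, operation in spans:
        routed[partition_for(trace_id, num_partitions)].append((trace_id, operation))
    return dict(routed)
```

Because the partition depends only on the trace ID, adding more consumers increases throughput without splitting a single trace across workers. Changing `num_partitions` does remap existing trace IDs, which is acceptable for short-lived buffering but matters for long-lived storage shards.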

Integrations and Ecosystem

Jaeger integrates with a wide ecosystem of observability projects and platforms, including OpenTelemetry, Prometheus, Grafana, Fluentd, Loki, and Zipkin, as well as commercial offerings from Splunk, Datadog, New Relic, and Dynatrace. It connects to service meshes and proxies like Envoy, Istio, and Linkerd to capture sidecar-generated traces, and can be used alongside CI/CD systems such as Jenkins, GitLab, and GitHub Actions for trace-driven deployment analysis. Language-specific client libraries are maintained alongside frameworks and packages from organizations like Google, Microsoft, Red Hat, and Amazon Web Services.

Security and Privacy

Security considerations for Jaeger deployments include authentication, authorization, encryption in transit (TLS), and secure storage practices. These are typically enforced with platform features such as Kubernetes RBAC, HashiCorp Vault, and cloud IAM services like AWS Identity and Access Management, Google Cloud IAM, and Azure Active Directory. Trace data can contain sensitive information, so teams apply redaction, sampling, and access controls aligned with compliance regimes such as the General Data Protection Regulation (GDPR) and with guidance from standards bodies like ISO and NIST. Integrations with logging and SIEM tools from vendors like Splunk and Elastic support auditing and threat detection.
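Redaction of span tags before export, as described above, might look like the following. This is a minimal sketch under stated assumptions: the key list in `SENSITIVE_KEYS`, the email pattern, and the function `redact_tags` are all hypothetical choices for illustration, and a real deployment would derive them from its own data classification policy.

```python
import re
from typing import Dict, FrozenSet

# Hypothetical denylist of tag keys whose values should never leave the service.
SENSITIVE_KEYS: FrozenSet[str] = frozenset(
    {"password", "authorization", "cookie", "api_key"}
)

# Deliberately simple email pattern; it is not a full RFC 5322 matcher.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_tags(tags: Dict[str, str],
                sensitive_keys: FrozenSet[str] = SENSITIVE_KEYS) -> Dict[str, str]:
    """Return a copy of the tags with sensitive values masked."""
    clean: Dict[str, str] = {}
    for key, value in tags.items():
        if key.lower() in sensitive_keys:
            # Drop the whole value for keys known to carry secrets.
            clean[key] = "[REDACTED]"
        else:
            # Otherwise mask anything inside the value that looks like an email.
            clean[key] = _EMAIL.sub("[EMAIL]", value)
    return clean
```

Applying the function inside the instrumented process, before spans are handed to an exporter, keeps the sensitive values out of every downstream component, including buffers and storage.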

History and Development

Jaeger originated at Uber Technologies in 2015–2016 to address the challenges of distributed tracing at ride-sharing scale, and was later contributed to the Cloud Native Computing Foundation, following a path similar to other CNCF projects such as Prometheus and Envoy. The project evolved through contributions from engineers and organizations including Red Hat, Google, and Lightstep, as well as community maintainers, first adopting OpenTracing and later converging toward OpenTelemetry. Major milestones include the release of production-grade collectors, integration with backends like Elasticsearch and Cassandra, and becoming a graduated project within the Cloud Native Computing Foundation.
