Leveraging OpenTelemetry and Prometheus for Observability
In modern cloud-native applications, observability is crucial for maintaining system reliability and performance. OpenTelemetry and Prometheus are two widely used open-source tools that help achieve this goal. This post explores their advantages and demonstrates how they fit together using a sample configuration.
What is OpenTelemetry?
OpenTelemetry is an open-source observability framework for cloud-native software. It provides a set of APIs, libraries, agents, and instrumentation to collect metrics, traces, and logs from applications and send them to observability backends like Jaeger, Prometheus, and others.
What is Prometheus?
Prometheus is an open-source monitoring and alerting toolkit. It is designed for reliability and scalability, making it an excellent choice for monitoring containerized applications. Prometheus collects and stores metrics as time-series data, providing a powerful query language and visualisation capabilities.
Sample Configuration
The following configuration sets up an OpenTelemetry collector that receives data from applications, processes it, and exports it to Prometheus, Jaeger, and the collector's own log output.
OpenTelemetry Collector Configuration
    receivers:
      otlp:
        protocols:
          grpc:

    exporters:
      prometheus:
        endpoint: "0.0.0.0:8889"
        const_labels:
          label1: value1
      logging:
      jaeger:
        endpoint: jaeger-all-in-one:14250
        tls:
          insecure: true

    processors:
      batch:

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, jaeger]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, prometheus]
Explanation
- receivers: Defines how the collector receives data. The otlp receiver is enabled with the grpc protocol, so applications send telemetry to it over OTLP/gRPC.
- exporters: Specifies where the collected data will be sent. In this configuration:
  - prometheus serves the collected metrics on 0.0.0.0:8889 for Prometheus to scrape, and attaches the constant label label1=value1 to every metric.
  - logging writes the data to the collector's log output for debugging.
  - jaeger exports traces to Jaeger at jaeger-all-in-one:14250, with TLS verification disabled (insecure: true) for simplicity.
- processors: Defines how the data is processed. The batch processor groups data into batches before sending it to exporters.
- service: Wires the components above into pipelines:
  - The traces pipeline receives traces over OTLP, batches them, and exports them to logging and Jaeger.
  - The metrics pipeline receives metrics over OTLP, batches them, and exports them to logging and Prometheus.
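A pipeline can only reference components that are defined in the top-level receivers, processors, and exporters sections. The sketch below illustrates that relationship with a small consistency check. It assumes the configuration has already been loaded into a Python dict (the CONFIG literal mirrors the YAML above by hand), and find_undefined_components is a hypothetical helper written for this post, not part of the collector.

```python
# Hand-written mirror of the collector configuration above (an assumption:
# in practice the dict would come from a YAML parser).
CONFIG = {
    "receivers": {"otlp": {"protocols": {"grpc": None}}},
    "processors": {"batch": None},
    "exporters": {
        "prometheus": {"endpoint": "0.0.0.0:8889"},
        "logging": None,
        "jaeger": {"endpoint": "jaeger-all-in-one:14250"},
    },
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["logging", "jaeger"],
            },
            "metrics": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["logging", "prometheus"],
            },
        }
    },
}


def find_undefined_components(config):
    """Return (pipeline, section, name) for every pipeline reference
    that has no matching top-level definition."""
    problems = []
    pipelines = config.get("service", {}).get("pipelines", {})
    for pipeline_name, pipeline in pipelines.items():
        for section in ("receivers", "processors", "exporters"):
            defined = config.get(section) or {}
            for component in pipeline.get(section, []):
                if component not in defined:
                    problems.append((pipeline_name, section, component))
    return problems


if __name__ == "__main__":
    # An empty list means every pipeline reference resolves.
    print(find_undefined_components(CONFIG))
```

Running a check like this before deploying can catch a renamed or deleted exporter that a pipeline still refers to.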
Prometheus Configuration
    scrape_configs:
      - job_name: 'otel-collector'
        scrape_interval: 10s
        static_configs:
          - targets: ['otel-collector:8889']
Explanation
- scrape_configs: Defines how Prometheus scrapes metrics from targets.
  - job_name: Name of the scrape job.
  - scrape_interval: Interval between scrapes (10 seconds).
  - static_configs: Static list of targets to scrape metrics from.
    - targets: List of target endpoints (in this case, the OpenTelemetry collector’s Prometheus exporter at otel-collector:8889).
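On each scrape, Prometheus issues an HTTP GET against the target and reads plain-text samples of the form name{labels} value. The following sketch parses a simplified subset of that text format to show what a scrape returns. The SAMPLE text, the metric name example_counter_total, and the URL http://localhost:8889/metrics are assumptions for illustration; the parser ignores timestamps and escaped characters in label values, so it is not a full implementation of the format.

```python
import re
import urllib.request

# Hypothetical scrape output; real output depends on what the applications emit.
SAMPLE = """# HELP example_counter_total Hypothetical counter
# TYPE example_counter_total counter
example_counter_total{key="value",label1="value1"} 3
"""

# name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Parse exposition text into (name, labels, value) tuples.

    Simplified: skips comment lines, ignores timestamps, and does not
    handle escaped quotes inside label values.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def scrape(url="http://localhost:8889/metrics", timeout_s=5.0):
    """Fetch and parse one scrape (the URL is an assumption based on the
    port mapping in this post; it needs the collector to be running)."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return parse_samples(response.read().decode("utf-8"))


if __name__ == "__main__":
    for sample in parse_samples(SAMPLE):
        print(sample)
```

Fetching the endpoint by hand this way is a quick check that the collector is exposing metrics before looking at the Prometheus side.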
Docker Compose Configuration
    version: "2"
    services:
      jaeger-all-in-one:
        image: jaegertracing/all-in-one:latest
        ports:
          - "16686:16686"
          - "14268"
          - "14250"

      otel-collector:
        image: otel/opentelemetry-collector-contrib:latest
        command: ["--config=/etc/otel-collector-config.yml"]
        volumes:
          - ./otel-collector-config.yml:/etc/otel-collector-config.yml
        ports:
          - "8888:8888"   # Prometheus metrics exposed by the collector
          - "8889:8889"   # Prometheus exporter metrics
          - "4317"        # OTLP gRPC receiver
          - "55670:55679" # zpages extension
        depends_on:
          - jaeger-all-in-one

      demo-client:
        build:
          context: ./client
        environment:
          - OTEL_EXPORTER_OTLP_ENDPOINT=otel-collector:4317
          - DEMO_SERVER_ENDPOINT=http://demo-server:7080/hello
        depends_on:
          - demo-server

      demo-server:
        build:
          context: ./server
        environment:
          - OTEL_EXPORTER_OTLP_ENDPOINT=otel-collector:4317
        ports:
          - "7080"
        depends_on:
          - otel-collector

      prometheus:
        image: prom/prometheus:latest
        volumes:
          - ./prometheus.yaml:/etc/prometheus/prometheus.yml
        ports:
          - "9090:9090"
Explanation
- version: Specifies the version of the Docker Compose file format.
- services: Defines the services to be run:
  - jaeger-all-in-one: Runs the Jaeger all-in-one image to collect and visualize traces.
    - Port 16686 serves the Jaeger UI; ports 14268 and 14250 are the Jaeger collector's HTTP and gRPC endpoints. Only 16686 is mapped to a fixed host port.
  - otel-collector: Runs the OpenTelemetry collector configured in otel-collector-config.yml.
    - Port 8888 exposes the collector's own metrics, 8889 the Prometheus exporter, and 4317 the OTLP gRPC receiver. Host port 55670 maps to the zpages extension on container port 55679.
    - depends_on starts Jaeger before the collector. It controls start order only and does not wait for Jaeger to be ready.
  - demo-client: Example client application sending telemetry data to the collector.
    - Environment variables specify the OpenTelemetry endpoint and the demo server endpoint.
    - depends_on starts the demo server before the client.
  - demo-server: Example server application sending telemetry data to the collector.
    - Port 7080 is exposed for the server endpoint.
    - depends_on starts the collector before the server.
  - prometheus: Runs the Prometheus image to scrape metrics from the collector.
    - Port 9090 is exposed for the Prometheus UI.
    - The configuration file is mounted into the container to supply the scrape configuration.
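In Compose, depends_on orders container startup but does not by itself wait until a service accepts connections. If a client needs the collector's OTLP port to be reachable before it starts sending, one option is a small retry loop like the sketch below. wait_for_port is a hypothetical helper, and the host name otel-collector with port 4317 is taken from the Compose file above; the connect parameter exists only so the function can be exercised without a network.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.5,
                  connect=socket.create_connection):
    """Retry a TCP connection until it succeeds or timeout_s elapses.

    Returns True once a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close the connection; we only care
            # that something is listening.
            with connect((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


if __name__ == "__main__":
    # Assumed to run inside the Compose network, where the service name resolves.
    ready = wait_for_port("otel-collector", 4317, timeout_s=10.0)
    print("collector reachable:", ready)
```

A successful TCP connection only shows that the port is open, not that the pipeline behind it is healthy, so this is a coarse readiness signal.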
Benefits of Using OpenTelemetry and Prometheus
OpenTelemetry Benefits
- Unified Observability: OpenTelemetry standardizes the collection of telemetry data across metrics, traces, and logs, providing a single framework for observability.
- Vendor-Neutral: It supports multiple backends, allowing easy integration with various observability tools like Prometheus, Jaeger, and others.
- Extensibility: OpenTelemetry is highly extensible, enabling custom instrumentation and integration with third-party libraries.
Prometheus Benefits
- Scalability: Prometheus is designed to handle large volumes of time-series data, making it suitable for monitoring complex, distributed systems.
- Powerful Query Language: PromQL allows for sophisticated queries to analyze and visualize metrics data effectively.
- Integration: Prometheus integrates seamlessly with alerting systems like Alertmanager, providing robust monitoring and alerting capabilities.
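PromQL queries can also be run programmatically through Prometheus's HTTP API, which serves instant queries at /api/v1/query. The sketch below builds such a request with the standard library. The base URL http://localhost:9090 follows the port mapping in this post, and the metric name example_counter_total is an assumption about how a counter would appear after export; build_query_url and instant_query are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def instant_query(base_url, promql, timeout_s=5.0):
    """Run an instant query and return the result list.

    Needs a reachable Prometheus server; raises RuntimeError if the
    API reports a non-success status.
    """
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


if __name__ == "__main__":
    # Per-second increase of a (hypothetical) counter over the last minute.
    print(build_query_url("http://localhost:9090", "rate(example_counter_total[1m])"))
```

The same expression can be pasted into the Prometheus UI on port 9090; the API route is mainly useful for scripts and dashboards.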
Implementing Tracing and Metrics with OpenTelemetry and Prometheus
To collect traces and metrics using OpenTelemetry and Prometheus, you need to instrument your application code. Below is an example of how to do this in a Python application using the OpenTelemetry Python SDK.
Python Application Example
Install Dependencies
    pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-prometheus opentelemetry-exporter-jaeger opentelemetry-instrumentation-requests
Configure OpenTelemetry
    from opentelemetry import metrics, trace
    from opentelemetry.exporter.jaeger.thrift import JaegerExporter
    from opentelemetry.exporter.prometheus import PrometheusMetricReader
    from opentelemetry.instrumentation.requests import RequestsInstrumentor
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Tracing configuration: batch finished spans and send them to a Jaeger agent.
    trace.set_tracer_provider(TracerProvider())
    tracer = trace.get_tracer(__name__)
    jaeger_exporter = JaegerExporter(agent_host_name="localhost", agent_port=6831)
    span_processor = BatchSpanProcessor(jaeger_exporter)
    trace.get_tracer_provider().add_span_processor(span_processor)

    # Metrics configuration: the Prometheus reader registers itself with the
    # default prometheus_client registry. Readers are passed to the
    # MeterProvider constructor.
    metric_reader = PrometheusMetricReader()
    metrics.set_meter_provider(MeterProvider(metric_readers=[metric_reader]))
    meter = metrics.get_meter(__name__)

    # Instrumentation: trace outgoing HTTP calls made with the requests library.
    RequestsInstrumentor().instrument()

    # Example span
    with tracer.start_as_current_span("example-span"):
        print("This is an example span")

    # Example metric
    counter = meter.create_counter("example_counter")
    counter.add(1, {"key": "value"})

Expose Prometheus Metrics

    import time

    from prometheus_client import start_http_server

    # Start the HTTP server that Prometheus scrapes.
    start_http_server(8888)

    # Keep the application running so its metrics can be scraped.
    while True:
        time.sleep(1)
Conclusion
Integrating OpenTelemetry and Prometheus provides a robust observability solution for modern applications. OpenTelemetry offers a unified framework for collecting and exporting telemetry data, while Prometheus excels in monitoring and alerting. Together, they enable comprehensive visibility into application performance, aiding in efficient troubleshooting and ensuring system reliability. By following the provided configuration and instrumentation examples, you can effectively leverage these tools in your own projects.