Leveraging Azure Application Insights with OpenTelemetry: Distributed Tracing Done Right

2025/03/13


Summary:
Azure Application Insights has long supported distributed tracing, but its recent migration to OpenTelemetry takes observability to a new level. For applications using older Application Insights SDKs, migrating to the modern Azure.Monitor.OpenTelemetry.AspNetCore library provides seamless integration with minimal code changes, unlocking vendor-neutral observability and richer telemetry capabilities. This article covers the architectural patterns, migration strategies, and practical examples for developers and architects, highlighting the simplicity of ILogger integration, cross-platform support, and unified data analysis via Log Analytics. Whether running .NET apps, Azure Databricks jobs, or other services, leveraging Azure’s OpenTelemetry distribution ensures consistent, comprehensive, and scalable observability across your Azure solutions.

We’ll cover how Azure Application Insights supports distributed tracing, its migration to OpenTelemetry, and how it integrates with .NET applications and, when needed, even Azure Databricks.

Overview of Distributed Tracing in Azure Application Insights

Azure Application Insights (AI) provides distributed tracing to follow a transaction across different services. Originally, Application Insights SDKs handled tracing by generating a unique operation ID for each request and propagating it via HTTP headers like Request-Id and Request-Context. This allowed linking telemetry from separate components – for example, a front-end web app and a back-end API – even if they reported to different AI instances (Things I wish I knew earlier about Distributed Tracing in Azure Application Insights - Brett McKenzie). The front-end would include its AI App ID in the outbound call headers, and the back-end would respond with its own App ID. Application Insights used these IDs to stitch together a single end-to-end trace in the portal, showing the relationship between requests and dependencies across components (Things I wish I knew earlier about Distributed Tracing in Azure Application Insights - Brett McKenzie). This proprietary correlation protocol worked, but it was tied to AI-specific headers (Request-Id, Request-Context) and required all components to use Application Insights SDKs.

Over time, the industry moved toward standardizing distributed tracing. Azure Application Insights began transitioning to the W3C Trace Context standard, which defines traceparent and tracestate headers for vendor-neutral trace propagation. Modern AI SDKs support W3C Trace Context (with backward compatibility to the old Request-Id system). In parallel, the industry converged on OpenTelemetry (OTel) – an open-source CNCF project formed by merging OpenTracing and OpenCensus in 2019. OpenTelemetry provides standard APIs for traces, metrics, and logs, solving the lack of cross-platform standards in observability.
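
To make the traceparent format concrete, here is a minimal, illustrative .NET sketch (not tied to any particular SDK; the operation name is made up). In modern .NET (5+), Activity defaults to the W3C Trace Context ID format, which is exactly what travels in the traceparent header:

using System;
using System.Diagnostics;

// Minimal illustration: modern .NET Activities use the W3C Trace Context format by default.
var activity = new Activity("checkout").Start();

// Activity.Id follows the traceparent layout: version-traceid-spanid-flags,
// e.g. "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00".
Console.WriteLine($"traceparent value: {activity.Id}");
Console.WriteLine($"Trace ID: {activity.TraceId}, Span ID: {activity.SpanId}");

activity.Stop();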

Transition to OpenTelemetry

Azure Monitor (which underpins Application Insights) is adopting OpenTelemetry as the instrumentation layer. Microsoft’s Azure Monitor OpenTelemetry Distro is essentially a supported wrapper around the standard OTel SDKs. The move brings several benefits: vendor-neutral instrumentation, consistent APIs across languages, access to the broad ecosystem of community instrumentation libraries, and a telemetry pipeline Microsoft describes as more performant and flexible than the classic SDKs.

By migrating to OpenTelemetry, Application Insights essentially becomes a backend for standard OTel data. The trace collection is now open-standard: any OTel-compatible tracer can send distributed traces to AI. This transition preserves AI’s powerful analysis capabilities while making instrumentation more flexible. In summary, Application Insights started with its own tracing mechanism and has evolved to adopt OpenTelemetry for a more open, interoperable, and future-proof distributed tracing solution (Introducing Azure Monitor OpenTelemetry Distro - InfoQ).

Industry References

OpenTelemetry’s origins as a merger of OpenTracing and OpenCensus in 2019 under CNCF highlight the industry’s push for a unified standard. Today, all major cloud vendors and APM providers have embraced OTel. Azure’s adoption of OpenTelemetry (including creating the Azure Monitor OpenTelemetry Distro) is a concrete example of this industry trend (Introducing Azure Monitor OpenTelemetry Distro - InfoQ). It demonstrates how cloud platforms are aligning with open-source standards to improve integration and reduce the burden of instrumenting distributed systems.

Application Architecture Using Application Insights

Integrating Application Insights into a distributed application can be done in a couple of patterns. The goal is to capture telemetry (logs, traces, metrics) from all components of the system in a way that you can easily diagnose issues across service boundaries. Key design choices include whether to use a separate Application Insights instance per component or a shared instance for multiple components, and how to leverage Log Analytics for centralized data analysis.

Separate vs. Shared Application Insights Instances

In a microservices or multi-tier architecture, one option is to give each application or service its own AI resource (separate instances). Another option is to have multiple services report into a single, shared AI resource. Each approach has pros and cons: separate instances give teams isolation, independent retention, alerting, and access control, while a shared instance offers built-in correlation and a single place to search all telemetry, at the cost of coordinating access and managing combined ingestion volume.

In practice, a multi-component application can be monitored either way. Microsoft now leans toward using a single workspace-based Application Insights for related components, because Log Analytics can segregate or filter by role name easily, while providing a unified query surface. If separate AI instances are used, you can still link them: historically, the Request-Context header with appId was used to tie together separate AI instances in a trace (Things I wish I knew earlier about Distributed Tracing in Azure Application Insights - Brett McKenzie). Today, with OpenTelemetry and W3C Trace Context, as long as trace IDs are propagated, the distributed trace will exist; but to view it end-to-end in the Azure portal, the telemetry from each service needs to be in the same workspace or linked via the Application Map (which still uses the concept of App ID or connected AI resources).

Role of Log Analytics

Log Analytics acts as the central storage and analysis layer for Azure Monitor telemetry. In newer Application Insights (workspace-based), each AI resource is backed by an Azure Monitor Log Analytics workspace. All telemetry – whether it’s incoming request traces, application logs, exceptions, or custom metrics – is stored as records in that workspace (in tables like AppTraces, AppRequests, etc.). This means you can use Kusto Query Language (KQL) in Log Analytics to query and join across all data types. It also means multiple Application Insights resources can write to the same Log Analytics workspace, consolidating data. For example, you might have separate AI instances for frontend and backend apps, but both send their data to a single Log Analytics workspace. You can then run a single query to see combined logs or correlated operation IDs across both. Azure Monitor’s design encourages using Log Analytics as the unified data layer: “Logs: Retrieve, consolidate, and analyze all data collected into Azure Monitor Logs” (Application Insights OpenTelemetry overview - Azure Monitor | Microsoft Learn). Whether using separate or shared AI resources, if they share a workspace or you utilize cross-resource queries, Log Analytics allows SREs and developers to get a holistic view.

Below are diagrams illustrating two architectural approaches:

Separate Application Insights per Service

In this approach, each microservice or application has its own Application Insights instance. The distributed tracing correlation is achieved via operation IDs and App ID exchange between services. Telemetry for each service is siloed in its respective AI resource, but you can enable the Application Insights Application Map to visualize the connections (it uses the cloud role name or App ID to link the nodes). All telemetry ultimately lands in Log Analytics (either each AI has its own workspace or they all send to one workspace).


flowchart LR
    subgraph Client
        User
    end
    subgraph Services
        Frontend["Frontend Service"] -->|HTTP call| Backend["Backend Service"]
    end
    User -->|request| Frontend
    Backend -->|response| Frontend -->|response| User
    Frontend == "Telemetry" ==> AI_Front[(App Insights - Frontend)]
    Backend == "Telemetry" ==> AI_Back[(App Insights - Backend)]
    subgraph Azure_Monitor["Azure Monitor / Log Analytics"]
        AI_Front --> LogW[(Log Analytics Workspace)]
        AI_Back --> LogW
    end
    style AI_Front fill:#FFD,stroke:#333,stroke-width:1px;
    style AI_Back fill:#FFD,stroke:#333,stroke-width:1px;
    style LogW fill:#E6F5E1,stroke:#333,stroke-width:1px;

Diagram: Two services (frontend and backend) use separate Application Insights. Correlation is done via trace context (the frontend passes a trace ID and its AI App ID to the backend). Telemetry from each goes to its own AI resource, which are connected in the Application Map. Both AI resources can send data to a single Log Analytics workspace for unified querying.

In the above scenario, the frontend and backend each have distinct instrumentation keys (or connection strings). When the frontend calls the backend, the AI SDK (or OTel instrumentation) adds a header like Request-Context: appId=<Frontend_AppId> and a traceparent with the operation ID (Things I wish I knew earlier about Distributed Tracing in Azure Application Insights - Brett McKenzie). The backend’s instrumentation picks that up, knows the call came from an external component, and includes the caller’s appId in its telemetry (in the dependency.target or request.source fields) (Things I wish I knew earlier about Distributed Tracing in Azure Application Insights - Brett McKenzie). In Application Insights, this appears as a linked request and dependency. We get a distributed trace that spans both AI instances. In Log Analytics, you could query by the operation ID (which is shared) to get combined data. Each AI resource can be managed independently (different alert rules, permissions, retention), which is useful if, say, the backend is a shared service used by multiple frontends.
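
As a hedged sketch of what this looks like in code (assuming the OpenTelemetry-based setup described later, with HTTP client instrumentation active; the backend URL is illustrative), the propagation happens automatically whenever an Activity is current at the time of the outbound call:

using System;
using System.Diagnostics;
using System.Net.Http;

// Illustrative only: with HTTP client instrumentation active (or modern .NET's
// built-in HttpClient diagnostics), the current Activity's IDs are injected into
// the outbound traceparent header, so the backend joins the same distributed trace.
var activity = new Activity("Frontend.CallBackend").Start();

using var client = new HttpClient();
HttpResponseMessage response = await client.GetAsync("https://backend.example.com/api/orders");

Console.WriteLine($"Trace ID propagated to the backend: {activity.TraceId}");
activity.Stop();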

Shared Application Insights for Multiple Applications

In this approach, multiple services report into one Application Insights instance. For example, an Azure Function and an App Service may use the same AI resource. Telemetry is distinguished by a role name or instance name. The advantage is simplicity: a single place to search all telemetry, and built-in correlation since everything shares one instrumentation key.


flowchart LR
    subgraph Services
        ServiceA["Service A"] -->|calls| ServiceB["Service B"]
    end
    ServiceA == "Telemetry" ==> AI_Shared[(Application Insights)]
    ServiceB == "Telemetry" ==> AI_Shared
    AI_Shared --> LogWS[(Log Analytics Workspace)]
    style AI_Shared fill:#FFD,stroke:#333,stroke-width:1px;
    style LogWS fill:#E6F5E1,stroke:#333,stroke-width:1px;

Diagram: Two services use a single, shared Application Insights resource. The AI resource tags data with each service’s cloud_RoleName to differentiate them. All telemetry is stored together in one Log Analytics workspace.

In this setup, you might configure each service’s telemetry initializer (or via OTel Resource configuration) to set a unique cloud_RoleName. For instance, Service A might appear as “frontend-web” and Service B as “orders-api” in the data. Application Insights automatically populates cloud_RoleInstance (e.g. machine or host name) and cloud_RoleName for many Azure services, or you can set it manually. The Application Map in the portal will group telemetry by role name, showing the two services as separate nodes even though they share one AI resource. Searching logs or using Analytics is straightforward since all records are in one place – you can simply filter by cloud_RoleName to separate concerns. One thing to watch is volume and access control: all teams pushing to a shared AI need coordinated access. Also, if one service generates a huge volume of telemetry, it could impact the ingestion volume of the shared AI (hitting caps or making it harder to manage retention for just one service).
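
With the OpenTelemetry-based setup described later in this article, one way to set the role name is via the OTel Resource. The following is a hedged sketch (the service name "orders-api" is illustrative) of how service.name maps to cloud_RoleName:

using Azure.Monitor.OpenTelemetry.AspNetCore;
using OpenTelemetry.Resources;

var builder = WebApplication.CreateBuilder(args);

// Sketch: service.name on the OpenTelemetry Resource is surfaced as cloud_RoleName
// in Application Insights, so services sharing one AI resource stay distinguishable.
builder.Services.AddOpenTelemetry()
    .UseAzureMonitor()
    .ConfigureResource(resource => resource.AddService(serviceName: "orders-api"));

var app = builder.Build();
app.Run();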

Log Analytics as the Analysis Layer: In both patterns, Log Analytics is crucial. It provides a query engine to slice and dice telemetry. If using workspace-based Application Insights, each AI instance is tied to a workspace – potentially the same one – making cross-service queries seamless. If using classic AI (non-workspace-based), each AI has its own store, but you can use Azure Monitor Cross-resource queries (using the app("AI Resource Name").requests syntax in KQL) to query across resources. Log Analytics also enables creating workbooks and dashboards that pull in data from multiple sources (for example, combining metrics from App Insights and Azure platform logs). Many Azure Monitor features, like alerts and workbooks, ultimately run queries against Log Analytics, reinforcing its role as the unified data layer.

In summary, architectural integration patterns for Application Insights can vary, but leveraging cloud role names and Log Analytics allows flexibility. A typical enterprise might use one AI instance per microservice during development (for isolation), but in production, consolidate telemetry into fewer instances or a single workspace for a holistic view. The choice often comes down to how tightly coupled the services are, who needs access to the data, and organizational preferences on tenancy.

Migration from Application Insights SDK to Azure.Monitor.OpenTelemetry.AspNetCore

Application Insights’ classic SDK for .NET (e.g. Microsoft.ApplicationInsights.AspNetCore and related packages) provided automatic instrumentation and a TelemetryClient API for custom telemetry. While functional, the classic SDK had some limitations: it was a proprietary implementation, required separate SDKs per language, and had a fixed set of features that evolved slowly. For instance, developers often had to explicitly initialize the SDK (calling AddApplicationInsightsTelemetry() in ASP.NET Core or setting up configuration files in .NET Framework), and use TelemetryClient to track custom events or dependencies. Extending it (beyond what was automatically collected) meant learning AI-specific concepts like Telemetry Initializers, Telemetry Processors, and the old correlation headers. There were also quirks – e.g., ILogger and AI’s TelemetryClient could both send logs, sometimes leading to duplicate or uncorrelated entries if not configured carefully.
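
For context, a classic-SDK setup might have looked like the following sketch (the endpoint and event names are illustrative; it uses the AddApplicationInsightsTelemetry and TelemetryClient APIs named above):

using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;

var builder = WebApplication.CreateBuilder(args);

// Classic Application Insights SDK: registers automatic collection and a TelemetryClient.
builder.Services.AddApplicationInsightsTelemetry();

var app = builder.Build();

app.MapGet("/order", (TelemetryClient telemetry) =>
{
    // Custom telemetry went through the proprietary TelemetryClient API.
    telemetry.TrackEvent("OrderViewed");
    telemetry.TrackTrace("Order page requested", SeverityLevel.Information);
    return Results.Ok();
});

app.Run();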

The migration to Azure.Monitor.OpenTelemetry.AspNetCore represents a shift to the OpenTelemetry-based pipeline for .NET applications. This Azure Monitor OpenTelemetry Distro (distribution) encapsulates the OTel SDK with presets for Azure. The new package addresses many of the old SDK’s limitations.

Migration Steps & Complexity: Migrating an ASP.NET Core application typically involves:

  1. Remove the old SDK – e.g. uninstall the Microsoft.ApplicationInsights.AspNetCore NuGet package and any related AI packages. Also remove any .UseApplicationInsights() or AddApplicationInsightsTelemetry() calls in your code, and delete AI config files if present.
  2. Install the Azure Monitor OpenTelemetry Distro – add the NuGet package Azure.Monitor.OpenTelemetry.AspNetCore (latest 1.x version).
  3. Enable OTel instrumentation – in your Program.cs or Startup, add the OpenTelemetry initialization (shown in the example just after this list).
  4. Configure as needed – you can customize the setup via OpenTelemetry APIs if necessary. For example, if you need additional instrumentation (such as the gRPC client, or a custom ActivitySource), use builder.Services.ConfigureOpenTelemetryTracerProvider(...) to add sources or instrumentations after calling UseAzureMonitor() (Azure Monitor Distro client library for .NET - Azure for .NET Developers | Microsoft Learn). The distro is designed to cover most common needs without extra code, but it’s extensible. You might also adjust logging filters or add processors, e.g. to redact sensitive data, which can be done via OTel processors or configuration as shown in Microsoft docs (a sketch appears after the code example below).
  5. Test and validate – after migration, run the application and confirm that requests, dependencies, exceptions, and custom logs appear in Application Insights as before. The telemetry types may have slight name changes (for instance, AI “Trace” telemetry corresponds to OTel logs). Verify that distributed trace correlation still works (it should, since OTel uses the same trace IDs), and update any custom dependency tracking to use OTel APIs if needed.

For step 3, the initialization looks like this:

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddOpenTelemetry().UseAzureMonitor(o => 
{
    o.ConnectionString = "<Your AI Connection String>";
    // o.SamplingRatio = 1.0f; // (optional) adjust sampling if needed
});
// ...other services like controllers
var app = builder.Build();
app.Run();

This single call sets up the TracerProvider, MeterProvider, and LoggerProvider under the hood. By default, it will instrument ASP.NET Core requests and dependencies. If you had custom telemetry (like TrackEvent calls), you’ll handle those differently (more on that below).
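
For the sensitive-data scenario mentioned in step 4, here is a hedged sketch of a custom processor (the url.query attribute and the redaction rule are illustrative, not part of the distro):

using System.Diagnostics;
using OpenTelemetry;

// Illustrative processor: runs on every finished span before it is exported.
public sealed class RedactQueryStringProcessor : BaseProcessor<Activity>
{
    public override void OnEnd(Activity activity)
    {
        // "url.query" is just an example attribute; adjust to whatever your spans carry.
        if (activity.GetTagItem("url.query") is not null)
        {
            activity.SetTag("url.query", "REDACTED");
        }
    }
}

// Registration, after UseAzureMonitor():
// builder.Services.ConfigureOpenTelemetryTracerProvider((sp, tracerBuilder) =>
//     tracerBuilder.AddProcessor(new RedactQueryStringProcessor()));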

The level of complexity in migration is moderate. For straightforward web apps that mainly relied on automatic monitoring, it’s largely a drop-in replacement with minimal code changes. The Azure docs indicate you can expect a similar experience after migration. However, if you had deep customization (custom telemetry initializers, extensive use of TelemetryClient for manual events, etc.), you’ll need to replace those with OpenTelemetry equivalents – for example, TelemetryClient.TrackEvent calls become custom spans or log records created via the OTel APIs, and telemetry initializers are replaced by OTel Resource attributes or processors.

Automatic instrumentation vs. manual: The new OTel SDK with Azure Monitor provides automatic instrumentation for a lot of common scenarios – HTTP, SQL, Azure SDK calls, etc., similar to the old SDK. If you need to instrument custom operations, you’d use OpenTelemetry’s API (e.g., ActivitySource.StartActivity in .NET to create custom spans). The Azure distro listens to ASP.NET Core’s built-in ActivitySource for incoming requests, among others. It also captures exceptions and logs automatically. Configuration can be done via code or via appsettings.json/environment variables as shown in docs (for example, you can set AzureMonitor.ConnectionString in config instead of code, and it will be picked up).
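
A hedged sketch of a custom span created with ActivitySource (the source name matches the AddSource example later in this article; the operation and tag are illustrative):

using System;
using System.Diagnostics;
using System.Threading.Tasks;

public class OrderProcessor
{
    // The source name must be registered via AddSource("MyCompany.MyProduct.MyLibrary")
    // on the tracer provider so these spans are collected and exported.
    private static readonly ActivitySource Source = new("MyCompany.MyProduct.MyLibrary");

    public async Task ProcessOrderAsync(string orderId)
    {
        // StartActivity returns null if no listener is registered, hence the '?.'.
        using Activity? activity = Source.StartActivity("ProcessOrder");
        activity?.SetTag("order.id", orderId);

        await Task.Delay(10); // placeholder for the real work
    }
}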

Summary of Benefits: By moving to Azure.Monitor.OpenTelemetry.AspNetCore, you benefit from being in line with the industry-standard approach. You reduce dependency on proprietary SDKs and gain access to the broader OTel ecosystem (which means easier instrumentation of libraries and the ability to direct telemetry elsewhere if needed). Microsoft will also likely add new features to the OTel pipeline first. Given that Microsoft’s own reasoning is that OTel SDKs are more performant and flexible (OpenTelemetry help and support for Azure Monitor Application Insights - Azure Monitor | Microsoft Learn), adopting this should improve the telemetry pipeline’s efficiency. Additionally, the maintenance burden shifts – you no longer worry about the intricacies of the AI SDK internals, and instead configure standard OTel components (for which a lot of community knowledge exists). The migration is designed to be as smooth as possible, with Microsoft providing documentation and a mapping of features from the old API to the new one. Most developers report only minor changes were needed in application code, mostly around how they log or track custom events.

Separation of Logging and Tracing in .NET Applications

In .NET, the ILogger abstraction (from Microsoft.Extensions.Logging) is commonly used for logging within applications. It provides methods like LogInformation, LogWarning, etc., and it’s indifferent to the logging backend – it could be console, file, Application Insights, etc. When it comes to Application Insights and OpenTelemetry, it’s important to understand how logging (via ILogger) and distributed tracing relate, and how a developer’s experience changes (or doesn’t) when migrating to the new system.

ILogger and Application Insights: In the classic Application Insights SDK, an ILogger could be configured to send logs to AI by adding the ApplicationInsightsLoggerProvider. This meant every ILogger.LogX call became a “Trace” telemetry in Application Insights (with severity mapping to AI’s Trace severity). Under the hood, ApplicationInsightsLoggerProvider would use the same TelemetryChannel as other AI telemetry, ensuring your logs went to the same place as your requests and dependencies. Many developers got used to just writing ILogger logs in their ASP.NET Core apps and having them show up in the AI portal under the Traces table or Search.

With OpenTelemetry, the idea is similar: logging is integrated via ILogger. The Azure Monitor OpenTelemetry Distro includes an OpenTelemetry Logging component that listens to ILogger providers. In fact, the distro automatically captures logs created with Microsoft.Extensions.Logging (which is what ILogger is). For the developer, this means you continue to use ILogger as before – you do not need to call a special AI API to log things. When you switch to the OTel-based SDK, you might remove the old builder.Logging.AddApplicationInsights() line (that registered the old provider), but the new UseAzureMonitor() will internally register an OTel logger provider that hooks into the ILogger pipeline. All your existing LogInformation, LogError, etc. calls will be captured by OpenTelemetry and exported to Azure Monitor as log records.
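
As an illustrative sketch (the class and message are made up), application code keeps the familiar ILogger pattern; UseAzureMonitor() registers the OTel logging provider that exports these calls, correlated with the current trace:

using Microsoft.Extensions.Logging;

public class OrderService
{
    private readonly ILogger<OrderService> _logger;

    public OrderService(ILogger<OrderService> logger) => _logger = logger;

    public void Complete(string orderId)
    {
        // No Application Insights-specific API needed: this log record is exported to
        // Azure Monitor and carries the current Activity's trace context for correlation.
        _logger.LogInformation("Completed order {OrderId}", orderId);
    }
}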

Logs vs. Traces (Distributed Traces): It’s important to clarify terminology: In Application Insights, a “Trace” usually referred to a log message (text log). But in distributed tracing context, a “Trace” is a sequence of spans (operations) across services. OpenTelemetry uses the term Log to mean log messages, and Trace to mean the distributed trace. To avoid confusion, consider “Application Insights trace telemetry” = log entries. In the new world, your ILogger logs become OTel LogRecords which are sent to Azure Monitor – they’ll still appear in the “traces” table in Log Analytics (since that’s historically where AI stores log messages). The distributed trace spans (requests, dependencies, etc.) are handled by the OTel Tracer and show up in AI as requests/dependencies with correlation.

For developers, tracing (the distributed kind) is often implicit. For example, when a web request comes in, the instrumentation automatically starts an Activity (span) and tracks the incoming request and outgoing calls. Developers typically didn’t manually create spans in the AI SDK (except for custom scenarios where they might use TelemetryClient.StartOperation). In .NET, the System.Diagnostics.Activity class is the core of tracing – and OTel builds on that. With the OpenTelemetry SDK, if a dev does nothing special, they still get distributed traces: the HTTP libraries create Activity objects for outgoing requests, ASP.NET Core instrumentation creates one for the inbound request, etc. These get recorded and sent. If a developer wants to add custom spans, they would now use the OpenTelemetry API or ActivitySource to do so, rather than AI’s TelemetryClient. This is a shift in approach but aligns with the .NET platform’s direction (System.Diagnostics is built-in, whereas TelemetryClient was external).

Impact on Software Developers (Switching to OTel): For most developers, switching to the OpenTelemetry-based implementation has minimal impact on how they write logging and tracing code. ILogger calls stay the same, and automatic request and dependency tracking continues to work. Where an application used Application Insights-specific APIs – TelemetryClient.TrackEvent or TrackDependency calls, custom TelemetryInitializers or TelemetryProcessors – those usages need to be refactored to OpenTelemetry equivalents (custom ActivitySource spans, OTel Resource attributes, processors).

Importantly, these changes push developers toward using framework-neutral APIs (ILogger, Activity, the OTel API) instead of Application Insights-specific ones. This decouples application code from the telemetry backend. For example, a developer can instrument an Activity for a custom operation; if the company later switches telemetry backends, that code is still standard OTel and will continue to work.

ILogger Abstraction: ILogger itself doesn’t know about traces vs. metrics – it’s purely for logs. But the presence of Activity.Current when logging ties the log to a trace. In practice, after migration, developers might notice that the category name of the ILogger (usually the fully qualified class name) appears differently or is missing in AI. There have been some reports that OTel logging initially did not include the logger category as a property in AI (unlike the old provider which included a SourceContext). This is a minor schema difference that could impact how you filter logs. The OpenTelemetry team has been addressing such gaps. For example, making the ILogger category available via a property (like LogCategory) in the exported data is on the roadmap. For now, developers might add the category name as part of the log message or use scopes if needed for filtering.

In summary, logging and tracing are now more separated conceptually but unified under OpenTelemetry’s umbrella. You log with ILogger (goes to logs pipeline), you let the automatic instrumentation handle tracing (or use ActivitySource for custom spans). The benefit is that as a developer, you mostly continue with the same patterns – using DI to get ILogger<T> for logging and relying on the framework to handle request and dependency tracking. The heavy lifting of correlating logs with traces is handled by the telemetry system, just as before. So the impact of switching to the OpenTelemetry-based AI is largely under the hood. You’ll spend less time dealing with Application Insights SDK specifics (like no more TelemetryInitializer classes to set role name – OTel resource does it, etc.), and more time with either standard .NET features or configuration. For an engineer, this feels more natural as part of the .NET ecosystem rather than a bolt-on.

Integration with Azure Databricks and Other Azure Services

Application Insights isn’t just for .NET or web apps – you can use it to gather telemetry from data and compute platforms like Azure Databricks, as well as functions, containers, etc. With the move to OpenTelemetry, Microsoft provides new options to integrate these services. Let’s focus on Azure Databricks, which often involves Spark jobs in Python or Scala, and discuss how to send logs/metrics/traces to Application Insights. We’ll also touch on deciding between separate or shared AI instances in such scenarios, and mention other services briefly.

Azure Databricks (Python) with Azure Monitor OpenTelemetry: Azure Databricks is a Spark-based analytics platform, and you can run custom code (PySpark, Scala, .NET Spark). To instrument Databricks workloads and send telemetry to Application Insights, you can leverage the OpenTelemetry SDK for Python along with the Azure Monitor exporter. Microsoft provides the azure-monitor-opentelemetry package (Azure Monitor OpenTelemetry Distro for Python) to streamline this integration. This package acts as a one-stop solution: it bundles the OpenTelemetry Python SDK, some common instrumentation libraries, and the Azure Monitor exporter. Using it in Databricks typically involves:

  1. Install the package in your Databricks environment (e.g., via %pip install azure-monitor-opentelemetry in a notebook or as a cluster library). It requires Python 3.8+, which Databricks supports.
  2. Initialize the OTel pipeline in your code by calling configure_azure_monitor(...) (shown in the example just after this list).
  3. Automatic instrumentation in Databricks: The Python distro will automatically instrument HTTP clients (like the requests library), database clients (PostgreSQL psycopg2), Azure SDK calls (azure-core tracing), etc., if your code uses them (azure-monitor-opentelemetry · PyPI). If you’re writing a typical data pipeline that calls external services or Azure data SDKs, those calls can be traced without manual code. It also hooks into the Python logging module so that if you log via the standard logging library, those logs will be captured and sent to AI (unless you disable log capture). Metrics can also be collected if you use OTel metrics APIs or if some instrumentations produce metrics.
  4. Flush and completion: In long-running jobs, the OTel SDK will periodically batch and send telemetry (by default every 5 seconds for traces and logs in Python, as configured by environment variables like OTEL_BSP_SCHEDULE_DELAY). When the job finishes or the cluster shuts down, you’d want to ensure all telemetry is flushed. The SDK should flush on program exit, but you can manually force a flush if needed by shutting down the TracerProvider or ensuring the program runs to completion.
Initialize the OTel pipeline (step 2 above) by calling configure_azure_monitor(...). For example:

from azure.monitor.opentelemetry import configure_azure_monitor

configure_azure_monitor(connection_string="<Your AI Connection String>",
                        enable_live_metrics=False)

This one call sets up automatic instrumentation for supported libraries and configures exporters to send data to Azure Monitor (Application Insights). The connection string is the one you get from your Application Insights resource (it contains the instrumentation key and endpoints). If you don’t explicitly pass it, the library will look for the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable.

Custom instrumentation: If your Databricks code does something that isn’t automatically covered, you can use OpenTelemetry APIs directly in Python. For example, you can create a custom tracer and span:

import logging

from opentelemetry import trace

logger = logging.getLogger("my_databricks_app")
tracer = trace.get_tracer("my_databricks_app")
with tracer.start_as_current_span("custom_step"):
    # ... your code ...
    logger.info("Processing batch X")  # this log will be inside the span's context

The azure-monitor-opentelemetry package ensures that any spans created with the OpenTelemetry API and any logs recorded via Python logging get exported to Application Insights.

Databricks-specific considerations: set an explicit service name (via the resource/service.name configuration or the OTEL_SERVICE_NAME environment variable) so the job’s telemetry is distinguishable in Application Insights, ensure telemetry is flushed before a job or cluster shuts down, and remember the library must be installed on every cluster (or via an init script) that runs instrumented code.

Other Azure Services: This article highlights Databricks, but a similar approach applies to other services such as Azure Functions, App Service, and containerized workloads – install the Azure Monitor OpenTelemetry distro or exporter for your language, point it at your Application Insights connection string, and the telemetry flows into the same backend.

Sending Logs, Metrics, and Traces: In summary, using OpenTelemetry on these services means you can send all three kinds of telemetry – distributed traces, metrics, and logs – to Application Insights.

All telemetry from Databricks or other services goes to your Application Insights resource, identified by the connection string/instrumentation key. On the AI side, you’ll see it intermingled with other app telemetry (if sharing) or isolated (if separate).

Separate vs. Shared AI Instances for Databricks: Deciding whether to use a separate Application Insights for Databricks or to share one with other apps depends on the use case.

From a technical perspective, using a separate AI or shared doesn’t change the Databricks instrumentation code – it’s just a different connection string. It’s more about organizational clarity and data management. A rule of thumb: if different systems have different lifecycles or owners, separate instances make sense; if they are two parts of one system and frequently need combined analysis, a shared instance can simplify that.

For example, if an Azure Databricks job writes results that are immediately used by an API, and you want to trace an end-to-end transaction including the batch processing, using one AI resource helps. On the other hand, if the Databricks is doing offline analysis unrelated to the live app monitoring, keeping its telemetry separate avoids cluttering the app’s AI with batch logs.

Integrating with Azure Databricks (Summary): Azure Monitor’s OpenTelemetry offerings make it feasible to instrument Databricks jobs with minimal effort – a single library and function call in Python. You get the benefit of centralizing observability: your cluster’s activities can be observed in Azure Monitor alongside other components. If needed, Microsoft’s samples even cover more advanced scenarios (like capturing Spark metrics via an OpenTelemetry Collector). The decision on telemetry instance should align with how you want to query and manage the data.

Technical Guide for Developers

This section provides some practical configuration examples and code snippets to help developers migrate to and set up OpenTelemetry-based Application Insights in .NET and integrate it in Azure Databricks (Python). The goal is to illustrate the changes in code and configuration.

Migrating a .NET app to Azure.Monitor.OpenTelemetry.AspNetCore:

Program.cs / Startup.cs Example (ASP.NET Core 6+ minimal hosting):

using Azure.Monitor.OpenTelemetry.AspNetCore;  // Namespace for UseAzureMonitor extension

var builder = WebApplication.CreateBuilder(args);

// Add OpenTelemetry instrumentation and Azure Monitor exporter:
builder.Services.AddOpenTelemetry().UseAzureMonitor(o =>
{
    o.ConnectionString = "InstrumentationKey=XXXXXXXX-....;IngestionEndpoint=https://..."; 
    // Alternatively, set env var APPLICATIONINSIGHTS_CONNECTION_STRING instead of hard-coding here.
    o.SamplingRatio = 1.0f; // 100% sampling (adjust if you want to reduce volume)
});

// (Optional) Add additional OpenTelemetry configuration:
builder.Services.ConfigureOpenTelemetryTracerProvider((sp, tracerBuilder) =>
{
    tracerBuilder.AddSource("MyCompany.MyProduct.MyLibrary");  // capture custom sources
    tracerBuilder.AddGrpcClientInstrumentation();              // example: add gRPC instrumentation
});
builder.Services.ConfigureOpenTelemetryMeterProvider((sp, meterBuilder) =>
{
    meterBuilder.AddAspNetCoreInstrumentation(); // example: if you want extra metrics
});

// Add other services (MVC, etc.)
builder.Services.AddControllers();

var app = builder.Build();
app.MapControllers();
app.Run();

In the snippet above, UseAzureMonitor() wires up tracing, metrics, and logging with the Azure Monitor exporter, while the optional ConfigureOpenTelemetryTracerProvider and ConfigureOpenTelemetryMeterProvider calls layer additional sources and instrumentations on top of the defaults. The connection string can come from code, configuration, or the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable.

Configuration via appsettings.json: Instead of configuring in code, you can also use configuration files: In appsettings.json:

{
  "AzureMonitor": {
    "ConnectionString": "InstrumentationKey=...;IngestionEndpoint=...;"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning"
    }
  }
}

And in code simply call builder.Services.AddOpenTelemetry().UseAzureMonitor(); without the lambda. It will read the connection string from config by convention (Azure Monitor Distro client library for .NET - Azure for .NET Developers | Microsoft Learn). Logging levels can be adjusted to control what gets sent (as usual with ILogger).

After this setup, run your application. Verify in Azure Portal (Application Insights -> Live Metrics or Logs) that you see data. You can query using KQL, e.g.:

AppRequests | order by TimeGenerated desc | take 5

to see recent requests, or

AppTraces | where Message contains "Starting"

to see logs.

Databricks (Python) Integration Example:

Suppose you have a Databricks notebook or job in Python that you want to instrument. Here’s a simple example:

First, install the Azure Monitor OTel package (do this in a notebook or cluster init):

# In a Databricks notebook, you can use %pip
%pip install azure-monitor-opentelemetry

Next, in your application code:

from azure.monitor.opentelemetry import configure_azure_monitor
import logging

# Configure Azure Monitor OpenTelemetry
configure_azure_monitor(connection_string="InstrumentationKey=...;IngestionEndpoint=...;LiveEndpoint=...")

# (Optional) Configure Python logging level and logger
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("MyDatabricksJob")

# Example usage
logger.info("Databricks job started")
# Simulate an external HTTP call which will be auto-tracked if using requests
import requests
response = requests.get("https://api.example.com/data")
logger.info(f"Fetched data, response code {response.status_code}")
# If we want to add a custom trace span for a computation block:
from opentelemetry import trace
tr = trace.get_tracer("MyDatabricksJob")
with tr.start_as_current_span("data_cleaning"):
    # ... data cleaning logic ...
    logger.info("Cleaned the data")

logger.warning("Databricks job finished with some warnings")

A few notes on this example: the requests call is captured automatically by the distro’s HTTP instrumentation, the standard logging calls are exported to Application Insights as trace (log) telemetry, and the custom data_cleaning span is exported as a span correlated with the log records emitted inside it.

Databricks Spark Metrics: If you need Spark-level metrics, one approach is to use the Java OpenTelemetry JAR. The Azure Samples repo shows attaching the Application Insights Java agent to the Databricks cluster (for the JVM). This can capture Spark telemetry (like stages, shuffle, etc.) via JMX. However, that’s an advanced scenario. Most developers can start with the Python SDK approach for logs and traces of their job’s custom code.

Environment and Configurations: instead of hard-coding values, you can rely on environment variables – APPLICATIONINSIGHTS_CONNECTION_STRING for the connection string, OTEL_SERVICE_NAME for the role name, and OTEL_BSP_SCHEDULE_DELAY (and related settings) to tune export batching.

Validation: After configuring, run a test job. Then in Application Insights, use Transaction Search or Logs to find data. If using a separate AI, you’ll only see Databricks data there. If shared, use filters (like operation name or cloud role) to isolate it.

Telemetry Separation or Union: If sending Databricks and app telemetry to the same AI, ensure you label the telemetry source. In .NET, as mentioned, cloud_RoleName is automatically like your app name. In Python, by default it might not set a service name unless you provide one. The configure_azure_monitor function takes a resource argument where you can specify service.name and other attributes. Setting service.name="DatabricksJob" will cause that value to appear as cloud_RoleName in Application Insights. This is very useful for distinguishing telemetry from different sources when they share the same AI. If you don’t set it, all custom telemetry might show up with a generic role name (which could be the hostname or “unknown_service:python”). So it’s recommended to set service.name or use the environment variable OTEL_SERVICE_NAME for clarity.

Example KQL queries: see the persona-oriented queries in the next section, which show how to slice this data once it lands in Log Analytics.

This technical setup shows that moving to OpenTelemetry doesn’t add a lot of burden – in many cases it simplifies configuration (one line to add instrumentation, one line in Python to configure). The key is understanding the new knobs (like environment variables and OTel configuration) which replace the old ApplicationInsights.config or .config settings.

Using Log Analytics for Observability

Log Analytics is the workhorse for storing and querying telemetry in Azure Monitor. Once your distributed application is sending logs, traces, and metrics to Application Insights (and thus to Log Analytics), the next step is to leverage that data for observability. Different personas in an organization use Log Analytics in various ways to glean insights, as the persona-oriented examples later in this section illustrate.

Consolidation of Data: Because Log Analytics aggregates all telemetry types, users can correlate across them. For example, an SRE investigating a spike in latency can query both logs and metrics in one go: maybe retrieve the server CPU metric and the request durations and see if there’s a correlation. Or join traces with metrics to see if a particular customer ID (found in a log property) correlates with high resource usage. The unified schema (where operation_Id can tie together traces, dependencies, logs) is extremely powerful. Even when using multiple AI resources, if they share a workspace, you can do cross-app joins easily. This is far better than having siloed logs in different tools.

Using Log Analytics Effectively: take advantage of the shared operation_Id to join requests, dependencies, exceptions, and log traces for a single transaction; use cross-resource queries (the app("...") syntax) when related data lives in multiple AI resources; and build workbooks, dashboards, and alerts on top of saved queries so insights are repeatable.

Log Analytics Query Examples by Persona:

Product Manager: “How many sign-ups per day and how many completed profiles per day?” (assuming those are custom events):

AppEvents
| where Name in ("UserSignedUp", "ProfileCompleted")
| summarize count() by Name, date=format_datetime(bin(TimeGenerated, 1d), 'yyyy-MM-dd')
| render columnchart

This uses Log Analytics to produce a quick chart of those business events.

Architect: “Show average duration of each operation name per cloud role (service) in last 1h”:

AppRequests
| where TimeGenerated > ago(1h)
| summarize avg(DurationMs) by AppRoleName, Name

This helps spot if a particular service or API method is slower than others.

SRE: “What are the top 5 exceptions in the last 24h across all services?”:

AppExceptions
| where TimeGenerated > ago(24h)
| summarize Count = count() by ExceptionType, ProblemId
| top 5 by Count

This might show, for example, that NullReferenceException happened 100 times in OrdersAPI, etc.

User Experiences: Different personas might access this data through different interfaces – the Azure portal’s Application Insights experiences (Application Map, Transaction Search, Live Metrics), ad-hoc KQL in Log Analytics, and workbooks or dashboards built on top of saved queries.

Closing notes on Observability: Log Analytics is the backbone that consolidates logs, traces, and metrics (Application Insights OpenTelemetry overview - Azure Monitor | Microsoft Learn). By having all this telemetry in one queryable store, Azure Monitor supports a comprehensive observability approach.

Each persona contributes to the feedback loop: SREs ensure reliability (which feeds into product quality), developers fix and optimize using telemetry evidence, and business owners steer the product with insights drawn from usage data. Log Analytics, combined with Application Insights and OpenTelemetry instrumentation, provides the data and tooling to make this possible. It shifts observability from being just an ops concern to a cross-disciplinary practice where everyone can get the information they need from the running system, in near real-time, with the flexibility of powerful query and analysis capabilities. (Application Insights OpenTelemetry overview - Azure Monitor | Microsoft Learn)

