Skip to content

[Proposal] Add opt‑in MSAL.NET metrics enrichment#5877

Draft
ssmelov wants to merge 2 commits intoAzureAD:mainfrom
ssmelov:feature/add_msal_metrics_enrichment_v2
Draft

[Proposal] Add opt‑in MSAL.NET metrics enrichment#5877
ssmelov wants to merge 2 commits intoAzureAD:mainfrom
ssmelov:feature/add_msal_metrics_enrichment_v2

Conversation

@ssmelov
Copy link
Copy Markdown

@ssmelov ssmelov commented Mar 23, 2026

Changes proposed in this request
This PR proposes an extensibility mechanism to enrich existing MSAL OpenTelemetry metrics with additional dimensions, without changing or replacing MSAL’s canonical metric set.

This PR is intentionally scoped as a proposal to illustrate the API shape and discuss trade‑offs before committing to a final design.

Copilot AI review requested due to automatic review settings March 23, 2026 16:16
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR proposes an opt-in extensibility mechanism for MSAL.NET OpenTelemetry metrics, allowing callers to enrich MSAL’s existing metric emissions with additional tags (dimensions) on a per-request basis.

Changes:

  • Introduces a new public TokenAcquisitionResult context object and a public builder extension WithOtelTagsEnricher(...) to register a per-request tag enricher callback.
  • Threads the enricher through request execution (including proactive refresh background flows) into the OpenTelemetry instrumentation layer.
  • Updates OpenTelemetry instrumentation interfaces/implementations to append enriched tags to emitted counters/histograms.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/client/Microsoft.Identity.Client/TokenAcquisitionResult.cs Adds a new public context type for success/failure information passed to enrichers.
src/client/Microsoft.Identity.Client/TelemetryCore/OpenTelemetry/NullOtelInstrumentation.cs Updates null instrumentation signatures to accept context/enricher and extra tags.
src/client/Microsoft.Identity.Client/TelemetryCore/OpenTelemetry/IOtelInstrumentation.cs Updates internal interface signatures to accept context/enricher and extra tags.
src/client/Microsoft.Identity.Client/Platforms/Features/OpenTelemetry/OtelInstrumentation.cs Implements tag enrichment by appending extra tags to emitted OTel metrics.
src/client/Microsoft.Identity.Client/Internal/Requests/SilentRequestHelper.cs Passes enricher into proactive refresh background telemetry emission.
src/client/Microsoft.Identity.Client/Internal/Requests/Silent/CacheSilentStrategy.cs Plumbs enricher into proactive refresh helper invocation.
src/client/Microsoft.Identity.Client/Internal/Requests/RequestBase.cs Uses TokenAcquisitionResult for success/failure telemetry and forwards enricher.
src/client/Microsoft.Identity.Client/Internal/Requests/OnBehalfOfRequest.cs Plumbs enricher into proactive refresh helper invocation.
src/client/Microsoft.Identity.Client/Internal/Requests/ManagedIdentityAuthRequest.cs Plumbs enricher into proactive refresh helper invocation.
src/client/Microsoft.Identity.Client/Internal/Requests/ClientCredentialRequest.cs Plumbs enricher into proactive refresh helper invocation.
src/client/Microsoft.Identity.Client/Internal/Requests/AuthenticationRequestParameters.cs Exposes OtelTagsEnricher from common parameters for request flows.
src/client/Microsoft.Identity.Client/Extensibility/AbstractConfidentialClientAcquireTokenParameterBuilderExtension.cs Adds public WithOtelTagsEnricher(...) extension method to register the enricher.
src/client/Microsoft.Identity.Client/ApiConfig/Parameters/AcquireTokenCommonParameters.cs Stores the per-request enricher callback in common request parameters.
src/client/Microsoft.Identity.Client/PublicApi/netstandard2.0/PublicAPI.Unshipped.txt Tracks new public APIs (TokenAcquisitionResult, WithOtelTagsEnricher).
src/client/Microsoft.Identity.Client/PublicApi/net8.0/PublicAPI.Unshipped.txt Tracks new public APIs (TokenAcquisitionResult, WithOtelTagsEnricher).
src/client/Microsoft.Identity.Client/PublicApi/net8.0-ios/PublicAPI.Unshipped.txt Tracks new public APIs (TokenAcquisitionResult, WithOtelTagsEnricher).
src/client/Microsoft.Identity.Client/PublicApi/net8.0-android/PublicAPI.Unshipped.txt Tracks new public APIs (TokenAcquisitionResult, WithOtelTagsEnricher).
src/client/Microsoft.Identity.Client/PublicApi/net472/PublicAPI.Unshipped.txt Tracks new public APIs (TokenAcquisitionResult, WithOtelTagsEnricher).
src/client/Microsoft.Identity.Client/PublicApi/net462/PublicAPI.Unshipped.txt Tracks new public APIs (TokenAcquisitionResult, WithOtelTagsEnricher).

Comment on lines +99 to +100
var extraTags = new List<KeyValuePair<string, object>>();
tagsEnricher?.Invoke(context, extraTags);
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allocates extraTags and invokes the optional tagsEnricher unconditionally for proactive refresh. If the enricher is null, this is an unnecessary allocation; if the enricher throws, it will fault the fire-and-forget task (risking UnobservedTaskException). Consider allocating/invoking only when non-null and wrapping the callback in try/catch.

Suggested change
var extraTags = new List<KeyValuePair<string, object>>();
tagsEnricher?.Invoke(context, extraTags);
IList<KeyValuePair<string, object>> extraTags = null;
if (tagsEnricher != null)
{
extraTags = new List<KeyValuePair<string, object>>();
try
{
tagsEnricher(context, extraTags);
}
catch (Exception ex)
{
logger.ErrorPiiWithPrefix(ex, "Proactive refresh tags enricher callback failed.");
}
}

Copilot uses AI. Check for mistakes.
/// <see cref="Exception"/> that caused the request to fail — never both at the same time.
/// </summary>
public sealed class TokenAcquisitionResult
{
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TokenAcquisitionResult has an implicit public parameterless constructor, but its properties are internal set, so external consumers can create instances that can never represent success/failure. Consider following the pattern used by Extensibility.ExecutionResult (internal constructor + internally-populated properties) to avoid a confusing public surface and better enforce the “success XOR failure” invariant.

Suggested change
{
{
internal TokenAcquisitionResult()
{
}

Copilot uses AI. Check for mistakes.
Comment on lines 206 to +216
/// <summary>
/// Specifies extra claims to be included in the client assertion.
/// Registers a callback that is invoked just before OpenTelemetry metrics are recorded for this
/// token acquisition call. Use this to append tags to every metric emitted for the request.
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="builder">The builder to chain options to.</param>
/// <param name="otelTagsEnricher">
/// A delegate that receives the <see cref="TokenAcquisitionResult"/> (carrying either the
/// successful <see cref="AuthenticationResult"/> or the failure <see cref="Exception"/>)
/// and a mutable tag list. Add <see cref="KeyValuePair{String, Object}"/> entries to the list
/// to include them as dimensions on all metrics emitted for this call.
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML docs for this enricher don’t call out important constraints: avoid adding secrets/PII (e.g., access tokens) to metric tags, keep tag cardinality low, and ensure the delegate is fast and non-throwing (or clarify that MSAL will ignore exceptions). Adding these warnings would help prevent accidental data exposure and high-cardinality metric explosions.

Copilot uses AI. Check for mistakes.
Comment on lines +219 to +226
public static AbstractAcquireTokenParameterBuilder<T> WithOtelTagsEnricher<T>(
this AbstractAcquireTokenParameterBuilder<T> builder,
Action<TokenAcquisitionResult, IList<KeyValuePair<string, object>>> otelTagsEnricher)
where T : AbstractAcquireTokenParameterBuilder<T>
{
builder.CommonParameters.OtelTagsEnricher = otelTagsEnricher;
return builder;
}
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior/API surface (WithOtelTagsEnricher + extra tags propagation) doesn’t appear to have corresponding test coverage. Consider adding/adjusting telemetry unit tests (e.g., in OTelInstrumentationTests) to verify extra tags are appended to all emitted metrics and that enricher exceptions are handled as best-effort.

Copilot uses AI. Check for mistakes.
if (tagsEnricher != null)
{
extraTags = new List<KeyValuePair<string, object>>();
tagsEnricher.Invoke(context, extraTags);
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tagsEnricher.Invoke(...) is executed without any exception handling. If a consumer-provided enricher throws, it will bubble up and can turn a successful token acquisition into a failure (or break telemetry logging). Consider wrapping the callback in a try/catch, swallowing exceptions, and logging at Verbose/Warning so telemetry enrichment cannot affect the auth flow.

Suggested change
tagsEnricher.Invoke(context, extraTags);
try
{
tagsEnricher.Invoke(context, extraTags);
}
catch (Exception ex)
{
// Telemetry enrichment should not interfere with the auth flow.
logger?.WarningPii(
"An exception occurred while executing OpenTelemetry tags enricher. Telemetry enrichment will be skipped for this call.",
ex);
}

Copilot uses AI. Check for mistakes.
Comment on lines +113 to +119
// Invoke the enricher once and collect extra tags; null when no enricher is set (common case).
List<KeyValuePair<string, object>> extraTags = null;
if (tagsEnricher != null)
{
extraTags = new List<KeyValuePair<string, object>>();
tagsEnricher.Invoke(context, extraTags);
}
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even when OpenTelemetry instruments are disabled, this code will still allocate extraTags and invoke tagsEnricher (which may be expensive or have side effects). Consider checking whether any relevant instruments are enabled before invoking the enricher / allocating the tag list.

Copilot uses AI. Check for mistakes.
if (tagsEnricher != null)
{
extraTags = new List<KeyValuePair<string, object>>();
tagsEnricher.Invoke(context, extraTags);
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tagsEnricher.Invoke(...) is not guarded. If it throws, it can crash the metrics path (and potentially impact the caller if invoked on the request thread). Wrap the callback in try/catch and ignore failures (optionally log) to keep telemetry enrichment best-effort.

Suggested change
tagsEnricher.Invoke(context, extraTags);
try
{
tagsEnricher.Invoke(context, extraTags);
}
catch (Exception)
{
// Swallow exceptions from telemetry enrichment to keep metrics best-effort.
}

Copilot uses AI. Check for mistakes.
new(TelemetryConstants.MsalVersion, MsalIdHelper.GetMsalVersion()),
new(TelemetryConstants.Platform, platform),
new(TelemetryConstants.ApiId, apiId),
new(TelemetryConstants.CallerSdkId, callerSdkId ?? string.Empty + "," + callerSdkVersion ?? string.Empty),
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CallerSdkId value expression has unintended precedence with ?? and + and appears to drop callerSdkVersion in most cases (and can produce values like ","). Consider emitting TelemetryConstants.CallerSdkId and TelemetryConstants.CallerSdkVersion as separate tags, or at least parenthesizing/formatting the concatenation logic so null-handling is correct.

Suggested change
new(TelemetryConstants.CallerSdkId, callerSdkId ?? string.Empty + "," + callerSdkVersion ?? string.Empty),
new(TelemetryConstants.CallerSdkId, (callerSdkId ?? string.Empty) + "," + (callerSdkVersion ?? string.Empty)),

Copilot uses AI. Check for mistakes.
new(TelemetryConstants.Platform, platform),
new(TelemetryConstants.ErrorCode, errorCode),
new(TelemetryConstants.ApiId, apiId),
new(TelemetryConstants.CallerSdkId, callerSdkId ?? string.Empty + "," + callerSdkVersion ?? string.Empty),
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same CallerSdkId null-coalescing/concatenation issue as in the success counter tags: this can generate incorrect tag values and loses callerSdkVersion. Consider emitting CallerSdkId and CallerSdkVersion as separate tags or fixing the null/concat precedence.

Suggested change
new(TelemetryConstants.CallerSdkId, callerSdkId ?? string.Empty + "," + callerSdkVersion ?? string.Empty),
new(TelemetryConstants.CallerSdkId, (callerSdkId ?? string.Empty) + "," + (callerSdkVersion ?? string.Empty)),

Copilot uses AI. Check for mistakes.
/// Carries either a successful <see cref="AuthenticationResult"/> or the
/// <see cref="Exception"/> that caused the request to fail — never both at the same time.
/// </summary>
public sealed class TokenAcquisitionResult
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There already exists ExecutionResult for similar purposes. Try to reuse that.

new(TelemetryConstants.CacheRefreshReason, authResultMetadata.CacheRefreshReason),
new(TelemetryConstants.TokenType, authResultMetadata.TelemetryTokenType)
};
AppendExtraTags(ref tags, extraTags);
Copy link
Copy Markdown
Member

@bgavrilMS bgavrilMS Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main concern is max numbers of dimensions that a Record can have. Based on prev experience, this is 8. So adding to existing ones is problematic.

Maybe a better design is to fire 1-2 new Records ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, having >8 tags causes performance issues. This is a known constraint of otel. I would suggest to create another counter. That can have the additional tags and any other already used tags that are useful

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants