Tracing SCOM Workflows with PowerShell



To run an agent trace from the SCOM Console, see this post: SCOM – Trace Workflow Agent Task


A number of documents have been written on tracing over the years, but none have made it this easy to trace SCOM agent workflows. Tracing is a way to spy on agent activity. This is very helpful, essential even, when troubleshooting, authoring, or testing workflows. This is not meant to be a deep dive into debugging. This article is geared towards those who have at least dabbled with management pack authoring, have a basic understanding of how workflows are designed and how they run, and are looking for tools to learn, grow, and troubleshoot… better.

Here’s a script that you can use to initiate tracing on any server where Microsoft Monitoring Agent (HealthService) is running. There are basically two ways to initiate a trace: general (common) and specific. If no Type is specified in the configuration area of the script, the user will be presented with an option to initiate a general trace or a specific workflow trace.

General Trace

The trace data (type of data collected) is controlled by the file names that are active (uncommented) in the script. These file names represent various common types of workflows. Think of these as categories.

This hash table ($hashGuidFileNames in the script below) references the .txt files that are located in the agent Tools folder. The referenced .txt files contain sets of GUIDs that tell the agent what type of data to collect while the trace is running. You don’t need to worry much about where the files are located or what they contain. The most common workflow types (the ones you will likely care about) are already uncommented/active: native, script, and managed.

Simply comment/uncomment each line to enable/disable the trace type.

Be sure to set the “TraceSeconds” value slightly longer than the interval (in seconds) of the workflow you are trying to observe, assuming it’s a timed workflow.
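As a rough illustration of that sizing rule, the sketch below adds a small buffer to a workflow interval. The function name Get-TraceSeconds and its parameters are made up for this example and are not part of the script; the 20-second default buffer simply follows the comment in the script's configuration area.

```powershell
# Hypothetical helper for illustration only; not part of Start-SCOMTrace.ps1.
function Get-TraceSeconds {
  param(
    # Interval (in seconds) of the timed workflow you want to observe.
    [Parameter(Mandatory)][int]$IntervalSeconds,
    # Extra time so the trace is already running when the workflow fires.
    [int]$BufferSeconds = 20
  )
  return $IntervalSeconds + $BufferSeconds
}

# A workflow that runs every 300 seconds gets a 320-second trace window.
$TraceSeconds = Get-TraceSeconds -IntervalSeconds 300
$TraceSeconds
```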

Let the trace run. The script will automatically format the data log files for you.

Specific Workflow Trace

To trace a specific workflow, there must exist a very special override for a specific instance. You cannot create this override in the Console with the standard override screen/wizard. The easiest way to create the override is described below:

  • Locate the workflow in the Console (Discovery, Rule, or Monitor).
  • Override the workflow for a specific instance (not a Class or Group):
    Enabled = True. Yes, True, regardless of whether it is already True. Just do it.
  • Save to a new management pack; name it “TRACE”.
  • Export the new management pack. It might take a minute to appear in inventory so don’t worry if it doesn’t appear in the MP view immediately.
    Get-SCOMManagementPack -Name 'TRACE' | Export-SCOMManagementPack -Path C:\Temp
  • Edit the exported file: TRACE.xml.
    Change “Enabled” to “TraceEnabled”.
  • Save the file and import it back into SCOM.
Example: override your workflow to Enabled=True for a specific instance.

There won’t be much in the TRACE.xml file. Locate “Enabled” and change it to “TraceEnabled”.

Save the .xml file. Do not change the name of this file. Then import into SCOM. Within a few minutes you should see the MP get delivered and digested on the server where the instance lives.
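If you would rather not edit the file by hand, the change can be scripted. The following is a minimal sketch, assuming the exported override carries the property name in a Property="Enabled" attribute. The sample XML is a trimmed-down, hypothetical stand-in for a real export (the IDs, class, and rule names are placeholders), and the function name Set-TraceEnabledProperty is made up for this example.

```powershell
# Hypothetical, trimmed-down stand-in for an exported TRACE.xml.
# A real export contains more elements; all IDs and names here are placeholders.
$sample = @'
<ManagementPack>
  <Monitoring>
    <Overrides>
      <RulePropertyOverride ID="Override.Example" Context="Example.Class" ContextInstance="00000000-0000-0000-0000-000000000000" Enforced="false" Rule="Example.Rule" Property="Enabled">
        <Value>true</Value>
      </RulePropertyOverride>
    </Overrides>
  </Monitoring>
</ManagementPack>
'@

# Illustrative helper: switches every Property="Enabled" attribute to
# "TraceEnabled" in place and returns how many attributes were changed.
function Set-TraceEnabledProperty {
  param([Parameter(Mandatory)][xml]$ManagementPackXml)
  $nodes = $ManagementPackXml.SelectNodes("//*[@Property='Enabled']")
  foreach ($node in $nodes) {
    $node.SetAttribute('Property', 'TraceEnabled')
  }
  return $nodes.Count
}

$doc = [xml]$sample
$changed = Set-TraceEnabledProperty -ManagementPackXml $doc
"Overrides changed: $changed"
$doc.OuterXml
```

For a real export you would load the file with [xml](Get-Content -Path C:\Temp\TRACE.xml -Raw), run the function, and then call Save() on the document with the same path. Review the result before importing it back into SCOM.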


Now you are ready to run the “specific” trace script. Run this script on the computer where the target object instance lives. Example: if you are tracing a SQL database monitoring workflow, run this script on the server where the database exists. If you are spying on a timed workflow, set the “TraceSeconds” value in the script slightly longer than the interval of the workflow you are trying to observe. This ensures that the trace is already running when the workflow executes at its next interval.

The script begins to collect data. A countdown timer is displayed for your convenience.

Regardless of the type of trace that you choose, when the timer expires the tracing session will stop and the data log files will be formatted and written to the path specified in the script ($TraceFolder, C:\Windows\Logs\OpsMgrTrace by default).

In this example, the specific workflow trace data appears in the “MyCustomWorkflowTrace.LOG” file in that folder.

Below is what typical trace output would look like for a specific workflow trace. You might see a DataItem which is the chunk of Name/Value pairs that were added to a property bag by a datasource script and burped out of the workflow for the agent to evaluate. You may also see the agent evaluating that data within an expression matching filter.

Notepad++ (with the XML Tools plugin) can format the DataItem so that it is easier to view.

Typically I look for any kind of failure, error, or unexpected behavior from expression matching. Most of the time my problems are found in expression matching: the values being compared are incorrect or missing. For example, I might use a condition detection to filter out instances by referencing a property bag value or a data item from a performance collection, but I made a typo or got the syntax/path wrong (this happens to me more than I’d like to admit). Sometimes that unexpected value comes from a mistake in the script that produced the property bag.
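A quick way to narrow a large formatted log down to suspicious lines is a simple text search. The sketch below is only an assumption about what is worth searching for: the function name Find-TraceIssue and the default pattern are made up for this example, and the demo file contains placeholder text, not real trace output.

```powershell
# Illustrative helper: returns the line number and text of every line in a
# formatted .LOG file that matches the pattern (case-insensitive regex).
function Find-TraceIssue {
  param(
    [Parameter(Mandatory)][string]$Path,
    [string]$Pattern = 'error|fail|exception'
  )
  Select-String -Path $Path -Pattern $Pattern |
    Select-Object -Property LineNumber, Line
}

# Demo against a throwaway file. The lines are placeholders, NOT real trace output.
$demo = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'demo-trace.LOG'
Set-Content -Path $demo -Value @(
  'placeholder line one',
  'placeholder: comparison FAILED',
  'placeholder line three'
)
$hits = @(Find-TraceIssue -Path $demo)
$hits | Format-Table -AutoSize
Remove-Item -Path $demo
```

Against a real trace you would point -Path at the formatted .LOG file in the trace folder and adjust -Pattern to whatever you are hunting for, such as a property name from your property bag.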

Here’s an example of a mistake I might make when writing a condition detection expression filter. Notice in the code example below how $Config/ProductCode is missing a trailing dollar sign ‘$’.

The correct parameter syntax is:
$Config/ProductCode$

This would cause a match to fail, the condition would not be detected in the workflow, and thus the workflow would not function correctly.

<ConditionDetection ID="FilterOnProductCode" TypeID="System!System.ExpressionFilter">
  <Expression>
    <SimpleExpression>
      <ValueExpression>
        <XPathQuery Type="String">Property[@Name='IdentifyingNumber']</XPathQuery>
      </ValueExpression>
      <Operator>Equal</Operator>
      <ValueExpression>
        <!-- MISSING TRAILING DOLLAR '$' IN VARIABLE. -->
        <!-- SHOULD BE: $Config/ProductCode$ -->
        <Value Type="String">$Config/ProductCode</Value>
      </ValueExpression>
    </SimpleExpression>
  </Expression>
</ConditionDetection>
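This kind of typo is easy to catch before the management pack is ever imported. The following is a minimal sketch of a text check for $Config/ references that lack the closing dollar sign. The function name Find-UnterminatedConfigReference is made up for this example, and the check only looks at $Config/ references, so treat it as a starting point and not a full validator.

```powershell
# Illustrative helper: returns every $Config/... reference in the text that
# is not closed by a trailing '$'.
function Find-UnterminatedConfigReference {
  param([Parameter(Mandatory)][string]$Text)
  # The atomic group (?>...) consumes the whole reference without backtracking,
  # then the negative lookahead requires that the next character is not '$'.
  $pattern = '\$Config/(?>[^$<\s]+)(?!\$)'
  [regex]::Matches($Text, $pattern) | ForEach-Object { $_.Value }
}

# Single-quoted strings keep PowerShell from expanding $Config as a variable.
$good = '<Value Type="String">$Config/ProductCode$</Value>'
$bad  = '<Value Type="String">$Config/ProductCode</Value>'

$goodHits = @(Find-UnterminatedConfigReference -Text $good)   # no hits
$badHits  = @(Find-UnterminatedConfigReference -Text $bad)    # one hit: $Config/ProductCode
$badHits
```

To check a whole management pack, pass the file content as one string, for example with Get-Content -Raw on the .xml file.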



Download

Download “Start-SCOMTrace”: Start-SCOMTrace.2021.01.28.1728.zip

# This snippet will configure the tracing commands for the SCOM agent to generate trace files, then format the trace files into .log for humans.
<# 
    Script: Start-SCOMTrace.ps1
    Author: Tyson Paul
    Description: This will initiate SCOM agent tracing, will output trace files, then format the trace files into .log for humans.
    Version History:
    2020.11.20.1727 - v1
#>

#==============================================================================
#       CONFIGURE THESE VARIABLES AS NEEDED
#============================================================================== 
# Set the duration for 20 seconds longer than your workflow IntervalSeconds
$TraceSeconds = 300

# Choice can be hardcoded to 'SPECIFIC' or 'GENERAL'. Set to $NULL to prompt user at runtime.
$Type = $NULL
# Name of the logfile. Used for a specific trace scenario only.
$SpecificTraceName = 'MyCustomWorkflowTrace'
#Max log size (circular)
$MaxMB = 1024 
# This is the standard SCOM trace logging folder
$TraceFolder = 'C:\Windows\Logs\OpsMgrTrace' 


$hashGuidFileNames = [ordered]@{
  'TracingGuidsNative' = 'TracingGuidsNative.txt'
  'TracingGuidsScript' = 'TracingGuidsScript.txt'
  'TracingGuidsManaged' = 'TracingGuidsManaged.txt'

  # Uncomment to include additional tracing GUIDs as needed.
  #  'TracingGuidsAdvisor' = 'TracingGuidsAdvisor.txt'
  #  'TracingGuidsAPM' = 'TracingGuidsAPM.txt'
  #  'TracingGuidsApmConnector' = 'TracingGuidsApmConnector.txt'
  #  'TracingGuidsBID' = 'TracingGuidsBID.txt'
  #  'TracingGuidsConfigService' = 'TracingGuidsConfigService.txt'
  #  'TracingGuidsDAS' = 'TracingGuidsDAS.txt'
  #  'TracingGuidsFailover' = 'TracingGuidsFailover.txt'
  #  'TracingGuidsNASM' = 'TracingGuidsNASM.txt'
  #  'TracingGuidsUI' = 'TracingGuidsUI.txt'
}
#============================================================================== 
#==============================================================================
 


# Clean out previous/old log files
# Get-ChildItem -Path $TraceFolder| Remove-Item -Force #-ErrorAction SilentlyContinue ;
 
# Locate tools directory
$setupKey = Get-Item -Path 'HKLM:\Software\Microsoft\Microsoft Operations Manager\3.0\Setup'
$InstallDirectory = $setupKey.GetValue('InstallDirectory') | Split-Path
$ToolsPath = Join-Path -Path $InstallDirectory -ChildPath 'Server\Tools'
If (-NOT(Test-Path -Path $ToolsPath -PathType Container) ) 
{
  $ToolsPath = Join-Path -Path $InstallDirectory -ChildPath 'Agent\Tools'
}
Set-Location $ToolsPath

Write-Host "Any existing traces will first be stopped." -F Gray
# Stop any previous tracing
& .\Stoptracing.cmd

Write-Host -Object "`n`n`n"
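# Prompt only when $Type was not preset to a recognized value (1, 2, GENERAL, or SPECIFIC).
While ($Type -notmatch '^(1|2|GENERAL|SPECIFIC)$') 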
{
  Write-Host -Object 'Start tracing for:' -ForegroundColor Gray
  Write-Host -Object '1) Common workflows (typical agent activities)' -ForegroundColor Cyan
  Write-Host -Object "2) A specific workflow (requires 'TraceEnabled' override for specific workflow)" -ForegroundColor Cyan
  $Type = Read-Host -Prompt '>'
}
"`n"
Switch ($Type) {
  {
    $_ -match '^(1|GENERAL)$'
  } 
  {
    #.\StartTracing.cmd VER
    @($hashGuidFileNames.Keys) | ForEach-Object -Process { 
      $outFile = (Join-Path -Path $TraceFolder -ChildPath "$($_).etl")
      .\TraceLogSM.exe -start $_ -flag 0x1F -level 6 -f $outFile -b 64 -ft 10 -cir $MaxMB -guid $($hashGuidFileNames[$_])
    }
    
    Continue
  }
    
  {
    $_ -match '^(2|SPECIFIC)$'
  } 
  {
    # This hash variable is used for the stop/format commands at the end.
    $hashGuidFileNames = [ordered]@{
      $SpecificTraceName = $SpecificTraceName
    }
    $MySpecificTraceFile = Join-Path -Path $TraceFolder -ChildPath "$($SpecificTraceName).etl"
    # Start fresh tracing session for specified duration/seconds
    & .\TraceLogSM.exe -start $SpecificTraceName -flag 0xFF -level 5 -ft 1 -rt -GUID '#c85ab4ed-7f0f-42c7-8421-995da9810fdd' -b 1024 -f $MySpecificTraceFile
  }
}

Write-Host -Object "`nSleeping for $TraceSeconds seconds..." -F Yellow
$TraceSeconds..1 | ForEach-Object -Process {
  If ($_%10 -eq 0) 
  {
    Write-Host -Object "`n$($_)" -NoNewline -ForegroundColor Yellow
  }
  Else 
  {
    Write-Host -Object '.' -ForegroundColor Cyan -NoNewline
  }
  Start-Sleep -Seconds 1
}
 
# If script has been stopped manually for some reason, no problem! You can run the below commands to stop the TRACES and format the logs.
# Stop tracing session and format logs for humans to read
@($hashGuidFileNames.Keys) | ForEach-Object -Process {
  Write-Host -Object "`nStopping trace of session: $_" -ForegroundColor Cyan
  & .\TraceLogSM.exe -stop $_ 
}
 
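# The formatter needs a populated all.tmf. A missing or 0-byte file produces
# "No Format Information found" entries in the formatted logs.
$tmf = Get-Item -Path .\all.tmf -ErrorAction SilentlyContinue
If (-NOT $tmf -or $tmf.Length -eq 0) 
{
  Write-Warning -Message 'all.tmf is missing or empty. Run FormatTracing.cmd from this Tools folder in an elevated prompt to create it.'
}

@($hashGuidFileNames.Keys) | ForEach-Object -Process {
  Write-Host -Object "`nFormatting: $(Join-Path -Path $TraceFolder -ChildPath "$($_).etl")" -ForegroundColor Green
  & .\TraceFmtSM.exe $(Join-Path -Path $TraceFolder -ChildPath "$($_).etl") -tmf .\all.tmf -o $(Join-Path -Path $TraceFolder -ChildPath "$($_).LOG")
}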


Workflow Analyzer

The Workflow Analyzer is a tracing tool that was introduced long ago with Operations Manager 2007 R2 but has not worked correctly since around version 2012 SP1. Good news! Apparently the System Center Product Group is refurbishing the Workflow Analyzer and will release an update to the public very soon. I suspect they will include a few other tools as well (down the road, possibly Q1 2021). This will be another valuable tool for those of us who enjoy poking at SCOM.

Here’s a sneak peek. The interface is likely to change slightly before official release. “Fetch HS” will likely be renamed to “Connect to MS”. We may see a connection status icon as well as a button to display Help content/tips.

With this tool you can initiate the trace from either the agent machine or the mgmt server. However, you can only observe the workflow trace output from the actual agent machine where the workflow is running.
Select the instance for the workflow, then Start.

In the example screenshot below we can see in the trace data how the expression filter has identified that a property bag variable is missing/incorrect. I expected this variable to have a value but it is empty. This tool makes debugging even easier!

There are a few bugs to be worked out yet but I’m very pleased to see that the PG is refurbishing this valuable tool and I’m excited to see the new official release. Stay tuned!

7 Replies to “Tracing SCOM Workflows with PowerShell”

  1. I’ve been trying to make this script work with no success. Tried the new script but it just exits without showing anything on screen no matter what parameters I use when it’s started. Used the old script which ran, gave me the countdown, and created the logs. But the logs only have entries like the following:

    Unknown(197): GUID=dbdc66e4-eb06-f736-d8de-556e5f264791 (No Format Information found).

    My environment is SCOM 2012 R2 (maybe that’s the issue?) and I’ve been running this on my management server

    1. @Scott,
      The updated script file contains a function only. If you run the script as it is, it will only define the function so that it will exist in the script session. That’s it. The function will exist in memory momentarily, then poof, it’s gone when the script exits. The previous version was interactive but it needed improvement. I modified the script and turned it into a proper function. Now it is also baked into the SCOMAgentHelper management pack. One major problem with the previous version is that it didn’t check for the existence of the ‘all.tmf’ file, which is required to format the logs. The new function will ensure that the tmf file gets created correctly if it doesn’t already exist.
      To use this function you’ve got some options:
      1) dot source. You can launch the script as is but without the script running in a child scope; it will run in the current scope (likely your console or ISE session). When the script exits, the function will still exist in your console session so that you can now use the function.
      Example 1:
      . C:\test\Start-SCOMTrace.ps1
      notice the dot/period preceding the path to the script/ps1 file? This causes the script to run in the current scope, after which, the function will exist in the session so that now you can use the function.
      Now you can simply use the function like this:
      Example 2:
      Start-SCOMTrace -General -TraceSeconds 600 -Verbose

      2) ISE. You can open the script with your editor and simply add the example command #2 (shown directly above) to the bottom of the script, then run the script. Running the script will define the function so it will exist in the session, then call the function which will initiate the trace.

      3) Profile. You could add this function to one of your PowerShell profiles so that the function would become defined/exists every time you open a PowerShell console or ISE. Configuring profiles is beyond the scope of this article but they are documented very well all over the interwebs.

      1. Thank you for the follow up – not sure why I didn’t notice that it was a function. It ran properly using your instructions. However, the logs don’t appear to contain anything useful. They are full of entries like this:

        Unknown(197): GUID=dbdc66e4-eb06-f736-d8de-556e5f264791 (No Format Information found).

        Any thoughts?

        1. @Scott,
          It sounds like the ‘all.tmf’ file does not exist. Can you verify that it exists at YOUR System Center path:
          C:\Program Files\Microsoft System Center 20XX\Operations Manager\Server\Tools\all.tmf

          If the tmf does not exist, try running the function from an elevated Posh console. The function should automagically create the tmf file if it does not exist.

          Otherwise run this .cmd file from an elevated command prompt at YOUR System Center path. This batch file should create the tmf as well.
          C:\Program Files\Microsoft System Center 20XX\Operations Manager\Server\Tools\FormatTracing.cmd

          1. I checked and the all.tmf file was there but it was a 0 byte file. I ran the FormatTracing.cmd which created a ~38MB file. From there the script ran perfectly. Thanks for your help.
