ServiceNow Predictive Intelligence: Using ML Models for ITOM Anomaly Detection
ServiceNow Predictive Intelligence sounds like something only data scientists would care about. But for ITOM and ITSM teams running complex infrastructure, it's one of the most practical additions to the platform in recent releases. It takes the operational data you already have — events, metrics, incident history — and turns it into forward-looking signals that let you act before problems escalate.
Here's how it actually works and what you can do with it on a live instance.
What Predictive Intelligence Does
At its core, Predictive Intelligence analyzes patterns in your historical operational data to predict future anomalies. It uses ML models to detect deviations from established baselines — things like unusual CPU spikes, memory exhaustion trends, or service degradation patterns that historically led to outages.
The output isn't a dashboard you'd open manually. It's structured alert data that flows into your ITOM and ITSM processes: incidents, tasks, notifications, and automated remediation workflows.
Setting Up Your First ML Model
Before you start, make sure the Predictive Intelligence plugin (com.snc.predictive_intelligence) is activated on your instance. You'll also need historical data — the more clean, structured records you have, the better.
1. Create a Training Dataset
Navigate to Predictive Intelligence > Training Data and create a new dataset. Pick the table you want to analyze — for ITOM anomaly detection, this is typically cmdb_ci joined with event or metric data.
Table: cmdb_ci
Filters: type = 'server' AND operational_status = '1'
Features: cpu_usage_pct, memory_usage_pct, disk_usage_pct, incident_count_30d
Label: critical_incident_raised (boolean)
The "label" column is what the model learns to predict. You can generate this from historical incident data — a CI that had a critical incident in the last 7 days gets true, otherwise false.
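The labeling rule above can be sketched outside the platform as follows. This is a minimal illustration, not ServiceNow's own mechanism: the record shape (`ci`, `priority`, `opened_at`), the function name `label_cis`, and the assumption that priority 1 means "critical" are all hypothetical.

```python
from datetime import datetime, timedelta

def label_cis(ci_ids, incidents, as_of, window_days=7):
    """Return {ci_id: bool}; True if the CI had a critical incident in the window."""
    cutoff = as_of - timedelta(days=window_days)
    flagged = {
        inc["ci"]
        for inc in incidents
        # Assumption: priority 1 is what "critical" means in this dataset.
        if inc["priority"] == 1 and cutoff <= inc["opened_at"] <= as_of
    }
    return {ci: ci in flagged for ci in ci_ids}

as_of = datetime(2024, 3, 15)
incidents = [
    {"ci": "srv-01", "priority": 1, "opened_at": datetime(2024, 3, 12)},  # critical, in window
    {"ci": "srv-02", "priority": 3, "opened_at": datetime(2024, 3, 14)},  # in window, not critical
    {"ci": "srv-03", "priority": 1, "opened_at": datetime(2024, 2, 20)},  # critical, too old
]
labels = label_cis(["srv-01", "srv-02", "srv-03"], incidents, as_of)
print(labels)  # {'srv-01': True, 'srv-02': False, 'srv-03': False}
```

Only srv-01 is labeled true: srv-02's incident is not critical, and srv-03's falls outside the 7-day window.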
2. Configure and Train the Model
Go to Predictive Intelligence > Models and create a new model. Point it to your training dataset. ServiceNow handles the algorithm selection and training automatically — you don't specify the ML algorithm.
Key settings to configure:
- Prediction type: Classification (for yes/no outcomes like "will this CI go critical?") or Regression (for continuous values like "what will CPU usage be in 3 days?")
- Prediction window: How far ahead to predict (e.g., 7 days)
- Refresh frequency: How often to retrain the model (weekly is typical for operational data)
Once configured, trigger the training job. It runs in the background, takes longer for larger datasets, and notifies you when it's done.
3. Evaluate Model Accuracy
After training, ServiceNow shows accuracy metrics. A useful number to check is the F1 score, which balances precision (the share of positive predictions that were actually correct) against recall (the share of actual events that were caught).
For ITOM anomaly detection, aim for F1 above 0.7 before trusting the model in production. If it's lower, you likely need more training data or better feature engineering on your dataset.
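To make the F1 threshold concrete, here is a small worked calculation using the standard definitions of precision, recall, and F1. The counts are made-up illustration values, not results from a real model.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Return (precision, recall, f1) from confusion-matrix counts."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts: 40 correct alerts, 10 false alarms, 20 missed events.
precision, recall, f1 = f1_score(true_pos=40, false_pos=10, false_neg=20)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.667 0.727
```

With these counts the model clears the 0.7 bar, though recall shows it still misses a third of the actual events.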
Routing Predictions into ITSM Workflows
Here's where Predictive Intelligence becomes genuinely useful: the predictions aren't just displayed — they're records in a table that you can trigger on.
The key table is ml_prediction. You can create a Business Rule or a Flow that triggers when a new prediction record is created:
Trigger: ml_prediction record inserted
Condition: prediction_score > 0.8 AND prediction_type = 'anomaly'
Action:
1. Create Incident with assignment group = 'L3 Infrastructure'
2. Set priority based on CI business criticality
3. Notify assigned group via Virtual Agent
In Flow Designer, use the Predictive Intelligence trigger. It gives you the predicted label, confidence score, and affected CI — all available as flow variables to route logic however you need.
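The trigger, condition, and actions described above can be sketched as plain logic. This is not ServiceNow code: the dictionary keys, the `route_prediction` helper, and the criticality-to-priority mapping are assumptions chosen for illustration, and on a real instance the same decision would live in a Business Rule or Flow.

```python
SCORE_THRESHOLD = 0.8

# Hypothetical mapping from CI business criticality to incident priority.
PRIORITY_BY_CRITICALITY = {"high": 1, "medium": 2, "low": 3}

def route_prediction(pred):
    """Return an incident payload for a qualifying prediction, else None."""
    if pred["prediction_score"] <= SCORE_THRESHOLD:
        return None
    if pred["prediction_type"] != "anomaly":
        return None
    return {
        "short_description": f"Predicted anomaly on {pred['ci']}",
        "assignment_group": "L3 Infrastructure",
        # Unknown criticality falls back to the lowest priority.
        "priority": PRIORITY_BY_CRITICALITY.get(pred["ci_criticality"], 3),
        "cmdb_ci": pred["ci"],
    }

incident = route_prediction({
    "ci": "srv-01",
    "prediction_score": 0.91,
    "prediction_type": "anomaly",
    "ci_criticality": "high",
})
print(incident["priority"], incident["assignment_group"])  # 1 L3 Infrastructure
```

Returning `None` for non-qualifying predictions keeps the "do nothing" path explicit, which makes the rule easy to test before wiring it to a live table.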
Real-World Pattern: Proactive Disk Space Remediation
Here's a concrete example of what this looks like in practice:
- Training: Use sysevent and metrics data to train a model that predicts "disk will exceed 90% in 14 days"
- Prediction: Every night, the model scores all CIs and writes prediction records for those likely to hit the threshold
- Flow: When a high-confidence prediction arrives, Flow Designer creates a planned task for the infrastructure team, assigns it 10 days out, and attaches the CI details
- Result: The team remediates before the disk fills — no incident, no outage
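To give a feel for what "disk will exceed 90% in 14 days" means numerically, here is a deliberately simple stand-in: a least-squares straight line fitted to daily usage samples and extrapolated forward. This is not how Predictive Intelligence builds its model; the function name `projected_usage` and the sample data are hypothetical.

```python
def projected_usage(samples, days_ahead):
    """Fit a least-squares line to daily usage samples and extrapolate it."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # The last sample sits at index n - 1; project days_ahead past it.
    return intercept + slope * (n - 1 + days_ahead)

# One week of daily disk usage (percent), growing one point per day.
history = [72.0, 73.0, 74.0, 75.0, 76.0, 77.0, 78.0]
forecast = projected_usage(history, days_ahead=14)
will_breach = forecast > 90.0
print(forecast, will_breach)  # 92.0 True
```

A CI like this one would be written out as a prediction record, giving the team two weeks of lead time.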
This is the ITOM promise fulfilled: not just reactive incident management, but operations that stay ahead of the curve.
Common Pitfalls
Not enough clean data. Predictive Intelligence needs structured historical records to learn from. If your CMDB is full of duplicate or stale CIs, or your incident data is inconsistent, the model will produce noise. Invest time in data quality first.
Ignoring the confidence threshold. Every prediction comes with a confidence score. Routing every prediction to the team will create alert fatigue. Set a threshold (0.75–0.85 is a good starting point) and only act on high-confidence predictions.
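One way to pick a threshold is to count how many predictions would reach the team at each candidate value. The sketch below does that over a made-up list of confidence scores; the helper name `alerts_at` and the data are illustrative only.

```python
def alerts_at(scores, threshold):
    """Count predictions whose confidence meets or exceeds the threshold."""
    return sum(1 for s in scores if s >= threshold)

# Illustrative confidence scores from one nightly scoring run.
scores = [0.55, 0.62, 0.71, 0.78, 0.81, 0.84, 0.90, 0.93, 0.97]
volume = {t: alerts_at(scores, t) for t in (0.5, 0.75, 0.85)}
print(volume)  # {0.5: 9, 0.75: 6, 0.85: 3}
```

Here, moving from 0.5 to 0.85 cuts the alert volume by two thirds. Whether that trade is worth the missed lower-confidence predictions depends on how costly a miss is for the CI class in question.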
Forgetting to retrain. Operational patterns change — a model trained on last quarter's data may not reflect this quarter's infrastructure changes. Set a recurring schedule to retrain models, especially after major changes like migrations or upgrades.
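The retraining rule can be expressed as a simple check: retrain when the model is older than the schedule allows, or when a major change has landed since the last training run. The function below is a hypothetical sketch of that policy, not a platform feature, and the 7-day default simply mirrors the weekly cadence mentioned earlier.

```python
from datetime import date, timedelta

def needs_retrain(last_trained, today, major_changes, max_age_days=7):
    """True if the model is stale or a major change happened after training."""
    if today - last_trained > timedelta(days=max_age_days):
        return True
    return any(change > last_trained for change in major_changes)

last_trained = date(2024, 3, 10)
today = date(2024, 3, 13)
migration = date(2024, 3, 12)  # illustrative major change

print(needs_retrain(last_trained, today, []))           # False
print(needs_retrain(last_trained, today, [migration]))  # True
```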
Getting Started
If you're on Orlando or a later release and have ITOM activated, Predictive Intelligence is available today. Start small: pick one CI class (servers, storage arrays, or network devices), define one prediction outcome (critical incident within 7 days), and build the workflow end-to-end. That's enough to demonstrate value and build confidence before scaling out.
The ML side sounds intimidating, but the ServiceNow implementation is built for practitioners — you configure, it learns, you act.
