ServiceNow Scheduled Jobs: Background Processing Patterns for Reliable Automation
Scheduled jobs are the backbone of automated maintenance in ServiceNow — from nightly data purges to SLA recalculations to external system syncs. Yet they're a common source of silent failures, duplicate processing, and performance headaches when done wrong. This guide covers the patterns that keep background jobs reliable, cluster-safe, and easy to monitor.
Scheduled Script Execution vs. System Jobs
ServiceNow exposes two main job types for developers:
- Scheduled Script Execution (SSE): Runs a specific server-side script on a defined schedule (daily, weekly, monthly, periodically, once, or on demand). This is what developers create when they need custom logic to run periodically.
- Scheduled System Jobs: Built-in jobs that drive platform features — workflow engines, email notifications, report generation, plugin maintenance tasks. These run automatically and are visible under System Definition > Scheduled Jobs.
Most custom automation work uses SSEs. The key to a reliable SSE is treating it like production code: error handling, logging, idempotency, and cluster awareness aren't optional — they're requirements.
Building a Reliable Scheduled Script Execution
Here's the basic anatomy of an SSE that won't silently fail:
(function scheduledScript() {
    var jobName = 'Weekly_Data_Purge';
    try {
        // 1. Early-exit guard. The property name is a placeholder for whatever
        //    flag your instance uses to signal a maintenance window.
        if (gs.getProperty('u_jobs.maintenance_active', 'false') === 'true') {
            gs.log('SKIP: Maintenance window active', jobName);
            return;
        }
        // 2. Cluster-safe singleton lock: skip if the lock was touched in the last 5 minutes.
        //    This block only checks the lock; acquiring it is covered in the next section.
        var lockGR = new GlideRecord('sys_lock');
        lockGR.addQuery('name', jobName);
        lockGR.query();
        if (lockGR.next()) {
            var updated = new GlideDateTime(lockGR.getValue('sys_updated_on'));
            var elapsedMinutes = (new GlideDateTime().getNumericValue() - updated.getNumericValue()) / 60000;
            if (elapsedMinutes < 5) {
                gs.log('SKIP: Another instance holds the lock', jobName);
                return;
            }
        }
        // 3. Process with progress logging
        gs.log('START: Processing weekly data purge', jobName);
        var purged = 0;
        var gr = new GlideRecord('u_temp_records');
        gr.addQuery('u_created', '<', gs.daysAgo(30));
        gr.query();
        while (gr.next()) {
            gr.deleteRecord();
            purged++;
        }
        gs.log('DONE: Purged ' + purged + ' records', jobName);
    } catch (e) {
        // JavaScript errors expose .message; fall back to the thrown value itself
        gs.logError('FAILED: ' + (e.message || e), jobName);
    }
})();
The pattern here does three things right:
- Maintenance window guard — exits cleanly during blackout periods
- Singleton lock — prevents duplicate processing across cluster nodes
- Structured logging — makes debugging from system logs straightforward
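To make the structured-logging point concrete, here is a minimal sketch of a message formatter. The helper name formatJobLog and the key=value layout are assumptions for illustration, not a platform API; in a ServiceNow script the returned string would simply be passed to gs.log(message, source).

```javascript
// Hypothetical helper: builds one consistent log line per job phase.
function formatJobLog(jobName, phase, details) {
    var parts = [phase + ':', '[' + jobName + ']'];
    if (details) {
        var keys = Object.keys(details).sort(); // stable key order keeps lines easy to search
        for (var i = 0; i < keys.length; i++) {
            parts.push(keys[i] + '=' + details[keys[i]]);
        }
    }
    return parts.join(' ');
}

var line = formatJobLog('Weekly_Data_Purge', 'DONE', { purged: 42, elapsedMs: 1830 });
// line is 'DONE: [Weekly_Data_Purge] elapsedMs=1830 purged=42'
```

Keeping the phase keyword (START, SKIP, DONE, FAILED) at the front of every line means a single log filter on the job name shows the full lifecycle of each run.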
Understanding the Singleton Lock Pattern
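On a clustered ServiceNow instance, scheduled work can be picked up by any active node. A job that is configured to run on all nodes, is triggered manually, or is still running when its next interval arrives can overlap with another execution. Without coordination, you risk duplicate processing: the same operation running more than once at the same time.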
The singleton lock pattern solves this by writing a lock record and checking whether another node has locked it recently:
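function acquireLock(jobName, ttlMinutes) {
    var lock = new GlideRecord('sys_lock');
    lock.addQuery('name', jobName);
    lock.query();
    if (lock.next()) {
        var created = new GlideDateTime(lock.getValue('sys_created_on'));
        var ageMinutes = (new GlideDateTime().getNumericValue() - created.getNumericValue()) / 60000;
        if (ageMinutes < ttlMinutes) {
            return false; // lock is fresh
        }
        // Stale lock: delete it so the insert below gets a new sys_created_on
        lock.deleteRecord();
    }
    // Acquire the lock (the locked and retain fields are assumed to exist on this table).
    // Note: check-then-insert is not atomic, so two nodes could pass the check at the same instant.
    lock.initialize();
    lock.name = jobName;
    lock.locked = new GlideDateTime().getValue();
    lock.retain = ttlMinutes;
    lock.insert();
    return true;
}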
Use ttlMinutes to set a maximum expected runtime. If the lock is older than ttlMinutes, treat it as stale and proceed — the previous execution probably crashed.
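The TTL rule above boils down to a three-way decision, which is easier to reason about (and test) when separated from the database access. The sketch below is plain JavaScript with a hypothetical function name, decideLock; in a real job the timestamps could come from GlideDateTime's getNumericValue(), which returns milliseconds.

```javascript
// Pure decision function, kept apart from GlideRecord so it can be tested in isolation.
// lockTimestampMs: when the existing lock was written, or null if no lock record exists.
function decideLock(lockTimestampMs, nowMs, ttlMinutes) {
    if (lockTimestampMs === null) {
        return 'acquire';   // nobody holds the lock
    }
    var ageMinutes = (nowMs - lockTimestampMs) / 60000;
    if (ageMinutes < ttlMinutes) {
        return 'skip';      // lock is fresh: another run is presumably in progress
    }
    return 'takeover';      // lock outlived its TTL: treat it as stale
}

var now = Date.UTC(2024, 0, 1, 12, 0, 0);
var noLock = decideLock(null, now, 30);                 // 'acquire'
var freshLock = decideLock(now - 10 * 60000, now, 30);  // 'skip'
var staleLock = decideLock(now - 45 * 60000, now, 30);  // 'takeover'
```

Choosing ttlMinutes is a trade-off: too short and a slow but healthy run gets its lock taken over, too long and a crashed run blocks the job for the rest of the TTL.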
Designing for Idempotency
A scheduled job should be safe to run multiple times. Idempotent jobs are easier to recover from failures and don't accumulate side effects over time.
Non-idempotent (bad):
// Every run increments the counter, so a rerun or retry double-counts
var inv = new GlideRecord('u_inventory');
if (inv.get('u_item_id', itemId)) {
    inv.u_count = parseInt(inv.getValue('u_count'), 10) + 1;
    inv.update();
}
Idempotent (good):
// Check for an existing record first, so reruns create nothing new
var inv = new GlideRecord('u_inventory');
inv.addQuery('u_item_id', itemId);
inv.query();
if (!inv.next()) {
    inv.initialize();
    inv.u_item_id = itemId;
    inv.u_count = 1;
    inv.insert();
}
If your job creates records, use a coalescing key or a unique identifier check before inserting. If it updates records, always filter to the specific set you intend to modify rather than touching everything.
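The coalescing-key idea can be illustrated without the platform. The sketch below uses a plain object as a stand-in for a table with a unique key; upsertByKey and the sample key are hypothetical names, and a real job would do the lookup with GlideRecord instead.

```javascript
// In-memory stand-in for a table with a unique (coalescing) key.
// Returns true if the call changed anything, false if the rerun was a no-op.
function upsertByKey(store, key, fields) {
    var changed = false;
    var existing = store[key];
    if (!existing) {
        existing = store[key] = { key: key };
        changed = true; // record did not exist yet
    }
    for (var name in fields) {
        if (fields.hasOwnProperty(name) && existing[name] !== fields[name]) {
            existing[name] = fields[name];
            changed = true;
        }
    }
    return changed;
}

var table = {};
var first = upsertByKey(table, 'ITEM-001', { u_count: 1 });  // true: record created
var second = upsertByKey(table, 'ITEM-001', { u_count: 1 }); // false: same input, nothing to do
```

Because the second call returns false and leaves the store unchanged, running the job twice produces the same end state as running it once, which is the property the section is after.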
Handling Errors Gracefully
A scheduled job that swallows exceptions will run silently forever — you'll only discover the failure when the data is wrong or a customer complains.
Always log errors with gs.logError(), which writes to the system log and can feed email alerts when those are configured. For jobs that need to notify on failure, use the Notifications system and trigger one from an Error Handler - SQE job:
/sys_emailer_list.do?sysparm_query=u_subjectLIKEScheduled Job Error
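One way to make sure no job swallows an exception is to route every job body through a small wrapper. This is a sketch under assumptions: runJob and the logger object are hypothetical, and in ServiceNow the logger's info and error functions could delegate to gs.log and gs.logError.

```javascript
// runJob: runs the job body, logs the outcome, and returns a status object.
function runJob(jobName, body, logger) {
    logger.info('START', jobName);
    try {
        var summary = body();
        logger.info('DONE: ' + summary, jobName);
        return { ok: true, summary: summary };
    } catch (e) {
        // JavaScript errors expose .message; fall back to String(e) for anything else thrown
        var message = (e && e.message) ? e.message : String(e);
        logger.error('FAILED: ' + message, jobName);
        return { ok: false, error: message };
    }
}

// In-memory logger so the example runs anywhere
var lines = [];
var memoryLogger = {
    info: function (msg, source) { lines.push('INFO ' + source + ' ' + msg); },
    error: function (msg, source) { lines.push('ERROR ' + source + ' ' + msg); }
};
var outcome = runJob('Nightly_Sync', function () {
    throw new Error('endpoint unreachable');
}, memoryLogger);
// outcome.ok is false and lines ends with 'ERROR Nightly_Sync FAILED: endpoint unreachable'
```

Returning a status object instead of rethrowing lets the caller decide whether a failure should also raise an event or notification.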
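For long-running jobs, break the work into batches so that each execution does a bounded amount of work. Long transactions can be cancelled by transaction quota rules, and the applicable limit depends on how your instance is configured. If a job risks exceeding that limit, split it across multiple executions or use a cursor pattern: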
(function batchProcess() {
    var BATCH_SIZE = 500;
    // The u_processed flag acts as the cursor: each run picks up the next
    // unprocessed rows, so no offset has to be stored between executions.
    var gr = new GlideRecord('u_large_table');
    gr.addQuery('u_processed', false);
    gr.setLimit(BATCH_SIZE);
    gr.query();
    var processed = 0;
    while (gr.next()) {
        // work here
        gr.u_processed = true;
        gr.update();
        processed++;
    }
    if (processed === BATCH_SIZE) {
        // A full batch suggests more rows remain; queue an event to trigger the next batch
        gs.eventQueue('u/reprocess_batch', null, null, null);
    } else {
        gs.log('Batch job complete', 'Weekly_Batch_Process');
    }
})();
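A fixed batch size bounds the number of records but not the elapsed time. A complementary sketch, assuming a hypothetical helper named processWithBudget, stops when a time budget is spent and reports how much work is left. The clock is injected so the cutoff can be demonstrated deterministically; in a job it could be function () { return new Date().getTime(); }.

```javascript
// processWithBudget: handles items until the list is exhausted or the time budget is spent.
function processWithBudget(items, handler, budgetMs, nowFn) {
    var startedAt = nowFn();
    var done = 0;
    while (done < items.length) {
        if (nowFn() - startedAt >= budgetMs) {
            break; // out of time: leave the rest for the next execution
        }
        handler(items[done]);
        done++;
    }
    return { processed: done, remaining: items.length - done };
}

// Fake clock that advances 40 ms on every call
var tick = 0;
var fakeNow = function () { tick += 40; return tick; };
var seen = [];
var result = processWithBudget([1, 2, 3, 4, 5], function (x) { seen.push(x); }, 100, fakeNow);
// result is { processed: 2, remaining: 3 }
```

A non-zero remaining count is the signal to schedule a follow-up execution rather than keep going in the current one.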
Monitoring Job Health
The System Logs > Scheduled Jobs page (/sys_auto_refresh_logs.do) shows execution history, duration, and success/failure status. Use this to spot jobs that are taking longer than expected.
For programmatic monitoring, query sys_history_line filtered by job name to get execution timing over time:
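// Field names are as given in this guide; check them against the table's dictionary on your instance
var stats = new GlideRecord('sys_history_line');
stats.addQuery('element', 'STARTSWITH', 'Weekly_Data_Purge');
stats.addQuery('field', 'DURATION');
stats.query();
// Aggregate average duration by day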
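The aggregation step itself is plain arithmetic and can be sketched independently of the query. The input shape below (an executed timestamp string and a durationSeconds number) is an assumption for illustration; map your query results into it before calling the hypothetical averageDurationByDay helper.

```javascript
// runs: array of { executed: 'YYYY-MM-DD HH:mm:ss', durationSeconds: Number }
function averageDurationByDay(runs) {
    var totals = {};
    for (var i = 0; i < runs.length; i++) {
        var day = runs[i].executed.substring(0, 10); // date part of the timestamp
        if (!totals[day]) {
            totals[day] = { sum: 0, count: 0 };
        }
        totals[day].sum += runs[i].durationSeconds;
        totals[day].count++;
    }
    var averages = {};
    for (var d in totals) {
        if (totals.hasOwnProperty(d)) {
            averages[d] = totals[d].sum / totals[d].count;
        }
    }
    return averages;
}

var averages = averageDurationByDay([
    { executed: '2024-03-04 02:00:00', durationSeconds: 30 },
    { executed: '2024-03-04 14:00:00', durationSeconds: 50 },
    { executed: '2024-03-05 02:00:00', durationSeconds: 90 }
]);
// averages['2024-03-04'] is 40 and averages['2024-03-05'] is 90
```

Comparing each day's average against a rolling baseline is usually enough to spot a job that is gradually slowing down.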
You can also expose job health via a simple REST endpoint in your scoped app:
// In a scoped REST API
(function processRequest(request, response) {
    var result = [];
    var job = new GlideRecord('sys_auto_refresh_log');
    job.addQuery('name', 'STARTSWITH', 'Weekly_');
    job.orderByDesc('sys_created_on');
    job.setLimit(10);
    job.query();
    while (job.next()) {
        result.push({
            name: job.name.getDisplayValue(),
            duration: job.getValue('duration'), // copy the value, not the GlideElement reference
            state: job.state.getDisplayValue(),
            executed: job.sys_created_on.getDisplayValue()
        });
    }
    response.setContentType('application/json');
    response.setBody(result);
})(request, response);
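Execution history like the list returned above can also drive a simple alerting rule. The sketch below counts the current streak of failures, assuming runs are ordered newest first and that the state values are 'success' and 'failure'; both the state values and the function names are hypothetical.

```javascript
// runs: newest first, each { state: 'success' | 'failure' }
function countRecentFailures(runs) {
    var streak = 0;
    for (var i = 0; i < runs.length; i++) {
        if (runs[i].state !== 'failure') {
            break; // streak ends at the most recent non-failure
        }
        streak++;
    }
    return streak;
}

function shouldAlert(runs, threshold) {
    return countRecentFailures(runs) >= threshold;
}

var history = [
    { state: 'failure' },
    { state: 'failure' },
    { state: 'failure' },
    { state: 'success' },
    { state: 'failure' }
];
var alertNow = shouldAlert(history, 3); // true: the three most recent runs all failed
```

Alerting on a streak rather than on every failure avoids paging someone for a one-off transient error that the next run recovers from.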
Best Practices Summary
- Always use error handling: try/catch + gs.logError() on failure
- Cluster-safe your jobs: singleton lock pattern or ECC queue
- Make jobs idempotent: safe to run more than once without side effects
- Batch long operations: respect transaction timeouts
- Log at start and end: structured logs make debugging trivial
- Monitor execution history: set up alerts for repeated failures
- Use naming conventions: Weekly_, Nightly_, Hourly_ prefixes make job management cleaner
Scheduled jobs that follow these patterns run reliably for months without intervention. The upfront investment in cluster safety and idempotency pays for itself the first time a node restarts or a job runs during a peak traffic spike.
