TiDB TTL stale rows: 'job is timeout' and all-zero job_history (root cause & fixes)
A redacted incident: TTL jobs keep hitting 'job is timeout' because the scan uses an expensive plan; covers the mechanism, diagnosis, remediation, and a tuning checklist
1. Symptoms
Two huge tables (billions of rows) had TTL enabled (e.g. `TTL = created + INTERVAL 6 MONTH`, `TTL_JOB_INTERVAL='1h'`), but expired rows kept accumulating.
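For reference, the TTL declaration on such a table has roughly this shape. This is a hedged sketch with placeholder table/column definitions, not the redacted production schema:

```sql
-- Illustrative only: a TTL declaration of the shape described above (names are placeholders).
CREATE TABLE db.t (
    id BIGINT PRIMARY KEY,
    created DATETIME NOT NULL,
    payload VARCHAR(255)
) TTL = `created` + INTERVAL 6 MONTH
  TTL_ENABLE = 'ON'
  TTL_JOB_INTERVAL = '1h';
```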
One important reminder: TiDB TTL does not guarantee “delete immediately at the deadline”. Backlog is allowed by design. The real issue is when the backlog does not trend down over time.
A typical starting point is:

```sql
SELECT *
FROM mysql.tidb_ttl_job_history
ORDER BY create_time DESC
LIMIT 10;
```

And the job summary kept looking like this (which easily reads as “TTL deletes nothing”):

```json
{"total_rows":0,"success_rows":0,"error_rows":0,"total_scan_task":0,"scheduled_scan_task":0,"finished_scan_task":0,"scan_task_err":"job is timeout"}
```

You may also see one or both of the following:
- Slow log / statement summary shows internal TTL scan SQL (`Is_internal=true`) that is extremely slow, and `Index_names` is often `created(created)` (see the slow-log sketch below).
- TiDB log shows scan statements getting interrupted (often from the cancel/kill path):

  ```
  execute query for ttl scan task failed ... [tikv:1317] Query execution was interrupted
  ```
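To confirm the slow-scan symptom from the slow log side, a query along these lines can surface the internal TTL scan statements. It assumes TiDB's `INFORMATION_SCHEMA.SLOW_QUERY` table (swap in `CLUSTER_SLOW_QUERY` to cover every TiDB node) and a placeholder time window:

```sql
-- Hedged sketch: list the slowest internal TTL-style scans from the local slow log.
-- Column names follow INFORMATION_SCHEMA.SLOW_QUERY; adjust the window to your incident.
SELECT Time, Query_time, Index_names, Process_keys, Query
FROM INFORMATION_SCHEMA.SLOW_QUERY
WHERE Is_internal = 1
  AND Query LIKE '%LOW_PRIORITY SQL_NO_CACHE%'
  AND Time > NOW() - INTERVAL 6 HOUR
ORDER BY Query_time DESC
LIMIT 10;
```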
2. Quick diagnosis
This checklist is intentionally “outcome-first”, to avoid being misled by the all-zero job summary.
- Outcome evidence (most reliable): periodic `COUNT(*)` with a clear cutoff (expired rows going down, or at least no longer increasing).

  ```sql
  SELECT COUNT(*) AS expired_rows
  FROM db.t
  WHERE created < NOW() - INTERVAL 6 MONTH;
  ```

- Eliminate “it’s not running” (global switch / schedule window / table-level enable):

  ```sql
  SELECT @@tidb_ttl_job_enable;
  SHOW VARIABLES LIKE 'tidb_ttl_job_schedule_window%';
  SHOW CREATE TABLE db.t;  -- check TTL_ENABLE='ON' and TTL_JOB_INTERVAL
  ```
- Inspect runtime state, not only history (strongly recommended while a job is running):
  - Table status (current/last job):

    ```sql
    SELECT table_schema, table_name, current_job_id, current_job_owner_id, current_job_owner_hb_time,
           last_job_id, last_job_finish_time, last_job_summary_text
    FROM mysql.tidb_ttl_table_status
    WHERE table_schema='db' AND table_name='t';
    ```

  - Task-level evidence (TTL cleans `mysql.tidb_ttl_task` after finishing a job, so you must capture it during runtime):

    ```sql
    SELECT job_id, scan_id, status, owner_id, owner_hb_time, state
    FROM mysql.tidb_ttl_task
    WHERE job_id = '<current_job_id>'
    ORDER BY scan_id;
    ```
- External evidence: internal TTL scan/delete statements in slow logs / statement summary, plus Grafana TTL throughput trends.
  - If you rely on statement summary, enable internal SQL:

    ```sql
    SET GLOBAL tidb_stmt_summary_internal_query = ON;
    ```
- Key comparison: take a real TTL scan SQL and compare the `EXPLAIN ANALYZE` default plan vs an id-driven plan (`PRIMARY(id)` or `(id, created)`); see the comparison sketch at the end of this section.
  - If the default plan uses `created(created)` with a huge `total_process_keys` and long latency, that’s often the bottleneck.
  - If `PRIMARY(id)` / `(id, created)` reduces scanned keys/latency by an order of magnitude, the plan choice is the primary issue.
- Interpretation of “all-zero + job is timeout” (a common v8.5.1 pitfall): on the job timeout path, TiDB may write only `scan_task_err="job is timeout"` into `summary_text` and keep counters at `0`. This is an observability gap, not proof that nothing was scanned/deleted. Job finish also cleans `mysql.tidb_ttl_task`, making post-mortem analysis harder.
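The comparison itself can be as simple as running the captured scan statement twice with `EXPLAIN ANALYZE` (note that `EXPLAIN ANALYZE` actually executes the statement). The id range and expire timestamp below are placeholders; substitute values captured from a real TTL scan:

```sql
-- Default plan chosen by the optimizer (often created(created)-driven on a huge backlog).
-- Compare total_process_keys and execution time between the two plans.
EXPLAIN ANALYZE
SELECT LOW_PRIORITY SQL_NO_CACHE id
FROM db.t
WHERE id > 0 AND id < 1000000000
  AND created < FROM_UNIXTIME(1750000000)
ORDER BY id LIMIT 500;

-- Id-driven plan for comparison:
EXPLAIN ANALYZE
SELECT LOW_PRIORITY SQL_NO_CACHE id
FROM db.t FORCE INDEX (PRIMARY)
WHERE id > 0 AND id < 1000000000
  AND created < FROM_UNIXTIME(1750000000)
ORDER BY id LIMIT 500;
```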
3. Quick mitigation
The fastest way to get a predictable cleanup rate (without changing the table schema) is usually:

- Apply a global plan binding to force the TTL scan to be id-driven (`PRIMARY(id)`); see the appendix for the full binding statement.
- Keep `tidb_ttl_delete_rate_limit` conservative at first (stability), then increase it gradually while watching CDC/downstream lag.
If a one-time schema change is acceptable, a very common improvement for this TTL scan shape is a covering composite index, plus binding TTL scans to it:

```sql
ALTER TABLE db.t ADD INDEX idx_id_created (id, created);
```

Reason: the TTL scan needs both `ORDER BY id` and `created < expire`. `(id, created)` often keeps scans id-ordered, avoids unnecessary lookups, and enables early-stop.
4. Full resolution
- Ensure the scan plan stays correct over time (binding consistently hit; verify via evidence). Target: scans no longer hit long-tail timeouts/cancels.
- Only after scans are fast, tune the knobs based on the real bottleneck (see the sketch after this list):
  - `tidb_ttl_scan_worker_count` / `tidb_ttl_scan_batch_size`
  - `tidb_ttl_delete_worker_count` / `tidb_ttl_delete_batch_size` / `tidb_ttl_delete_rate_limit`
  - `tidb_ttl_running_tasks` (make concurrency bounded and predictable)
- Close the observability loop: don’t rely solely on `job_history.deleted_rows`; prefer cutoff counts + TTL metrics/slow logs + runtime task states.
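As a hedged illustration only (the values mirror the redacted incident in the appendix, not a prescription), the knob adjustments are plain global variable changes:

```sql
-- Illustrative values; raise one knob at a time and watch TTL metrics / CDC lag after each step.
SET GLOBAL tidb_ttl_scan_worker_count   = 8;    -- scan workers per TiDB node
SET GLOBAL tidb_ttl_scan_batch_size     = 1020; -- LIMIT used by each scan statement
SET GLOBAL tidb_ttl_delete_worker_count = 8;    -- delete workers per TiDB node
SET GLOBAL tidb_ttl_delete_batch_size   = 200;  -- rows per DELETE statement
SET GLOBAL tidb_ttl_delete_rate_limit   = 1000; -- per-node delete throttle; 0 means unlimited
SET GLOBAL tidb_ttl_running_tasks       = 8;    -- cap cluster-wide concurrent TTL tasks (-1 = auto)
```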
5. Root-cause chain
5.1 Summary
TTL jobs repeatedly hit 'job is timeout' (the job-level timeout is ~6 hours). The direct cause is an expensive scan plan: the TTL scan is planned as `created(created)`-driven. With a huge backlog and low selectivity on `created < expire`, `ORDER BY id LIMIT 500` degenerates into scanning a massive range (often with TopN/sort). Slow scans push jobs into timeouts/cancels, which kill in-flight scan statements and leave `job_history` looking “all-zero”.
5.2 Detailed mechanism
This is typically a combination of insufficient execution throughput and misleading observability.
- A simplified end-to-end TTL flow:
  - the leader timer fires and creates a job for each physical table;
  - a job is split into multiple scan tasks (often ~64) in `mysql.tidb_ttl_task`;
  - the TaskManager on every TiDB instance claims tasks and runs scan workers + delete workers using internal SQL;
  - runtime progress lives mainly in `mysql.tidb_ttl_task.state` (JSON); final summaries go to `mysql.tidb_ttl_job_history.summary_text`.
- The TTL scan SQL (simplified) contains both `created < expire` and `ORDER BY id LIMIT 500`:

  ```sql
  SELECT LOW_PRIORITY SQL_NO_CACHE id
  FROM t
  WHERE id > ? AND id < ? AND created < FROM_UNIXTIME(?)
  ORDER BY id
  LIMIT 500;
  ```
- If the plan is `created(created)`-driven, it may scan an enormous `created < expire` range and then satisfy `ORDER BY id LIMIT 500` via TopN/sort, making scans slow and the job scan-limited.
- Slow scans surface as two kinds of signals:
  - job-level timeout: a job exceeding the ~6-hour threshold is force-finished with `'job is timeout'`;
  - statement cancel/kill: on timeout/cancel, TiDB kills in-flight scan statements (often logged as `[tikv:1317] ... interrupted`).
- On the timeout path, `job_history.summary_text` may record only `scan_task_err="job is timeout"` and keep counters at zero; job finish also cleans `mysql.tidb_ttl_task`, so partial-progress evidence must be captured during runtime.
5.3 Visualization
```mermaid
flowchart TB
backlog["Huge backlog + low selectivity on created<expire<br/>→ stale rows accumulate"]
scan_sql["scan.go<br/>TTL scan SQL shape<br/>WHERE id range AND created<expire<br/>ORDER BY id LIMIT N"]
plan_bad["Bad plan choice: index=created(created)<br/>→ scans massive keys + TopN/sort"]
slow_scan["Very slow scan batches<br/>high Cop_time / Process_keys<br/>→ scan-limited throughput"]
timeout_job["job_manager.go:updateHeartBeatForJob<br/>job-level timeout (~6h)<br/>→ finish-by-timeout path"]
kill_stmt["session.go:KillStmt / cancel<br/>→ in-flight scan interrupted"]
tikv1317["tidb.log<br/>[tikv:1317] Query execution was interrupted"]
obs_gap["Observability gap in job_history<br/>summary_text='job is timeout' + counters may stay 0<br/>job.go:finish cleans tidb_ttl_task"]
no_progress["Unstable or near-zero effective progress<br/>→ expired rows do not trend down (stale rows persist)"]
fix["Fix: bind scans to PRIMARY(id)/(id,created)<br/>or add idx(id,created)<br/>→ early-stop → stable scan throughput"]
backlog --> scan_sql --> plan_bad --> slow_scan --> timeout_job --> kill_stmt --> tikv1317
timeout_job --> obs_gap
slow_scan --> no_progress
obs_gap --> no_progress
fix -.-> plan_bad
fix -.-> no_progress
```

Appendix: Global plan binding (force PRIMARY)
If `EXPLAIN ANALYZE` proves `FORCE INDEX(PRIMARY)` is dramatically faster, apply a global binding so internal TTL statements also match it:

```sql
CREATE GLOBAL BINDING FOR
SELECT LOW_PRIORITY SQL_NO_CACHE `id`
FROM `db`.`t`
WHERE `id` > ? AND `id` < ?
AND `created` < FROM_UNIXTIME(?)
ORDER BY `id` LIMIT ?
USING
SELECT LOW_PRIORITY SQL_NO_CACHE `id`
FROM `db`.`t` FORCE INDEX (`PRIMARY`)
WHERE `id` > ? AND `id` < ?
AND `created` < FROM_UNIXTIME(?)
ORDER BY `id` LIMIT ?;
```

Validation options:

- `SHOW GLOBAL BINDINGS;` (the binding should be listed as enabled)
- Run the original SQL once, then `SELECT @@LAST_PLAN_FROM_BINDING;` should return `1` (see the sketch below)
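A minimal validation sequence (placeholder literals for the id range and timestamp) might look like this:

```sql
-- 1) Confirm the binding exists and is enabled.
SHOW GLOBAL BINDINGS;

-- 2) Run a statement with the bound shape (placeholder values)...
SELECT LOW_PRIORITY SQL_NO_CACHE `id`
FROM `db`.`t`
WHERE `id` > 0 AND `id` < 1000000
  AND `created` < FROM_UNIXTIME(1750000000)
ORDER BY `id` LIMIT 500;

-- 3) ...then check whether the last plan came from a binding (1 = yes).
SELECT @@LAST_PLAN_FROM_BINDING;
```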
Appendix: Incident timeline (redacted)
This timeline is included to make the “evidence → hypothesis → verification → remediation” path reusable. Exact numbers/settings are not prescriptions.
- 2026-01-20 ~ 2026-01-21: stale rows/backlog became obvious; `mysql.tidb_ttl_job_history` repeatedly showed `scan_task_err="job is timeout"` with an all-zero summary.
- Same window: TiDB logs showed TTL scan tasks being interrupted (`[tikv:1317] Query execution was interrupted`), suggesting a cancel/kill path rather than an SQL/privilege issue.
- Config evidence: the global TTL switch and schedule window were OK, but throughput settings were conservative (example: low `delete_rate_limit`, scan/delete `worker_count=4`, small batch sizes).
- 2026-01-22: captured a real TTL scan SQL from the slow log and compared the `EXPLAIN ANALYZE` default plan vs an id-driven plan (`PRIMARY(id)`), confirming the default often picked `created(created)` and scanned an enormous number of keys.
- Remediation:
  - apply a global plan binding to force an id-driven scan plan (`PRIMARY(id)` or `(id, created)`), making scans fast and stable;
  - then gradually raise throughput knobs (example: scan/delete `worker_count` from 4 → 8, `delete_batch_size` 100 → 200, `scan_batch_size` 500 → 1020, and increase `delete_rate_limit` as needed).
- Validation: use the cutoff `COUNT(*)` as the primary truth, and use TTL metrics + internal SQL (slow log / statement summary) to confirm throughput and plan stability.
Appendix: Runtime evidence checklist (bookmark this)
- Quick view of all TTL-related sysvars:

  ```sql
  SHOW VARIABLES LIKE 'tidb_ttl%';
  ```

- Capture runtime tasks (because job finish cleans `mysql.tidb_ttl_task`):

  ```sql
  SELECT table_schema, table_name, current_job_id
  FROM mysql.tidb_ttl_table_status
  WHERE table_schema='db' AND table_name='t';

  SELECT scan_id, status, owner_id, owner_hb_time, state
  FROM mysql.tidb_ttl_task
  WHERE job_id = '<current_job_id>'
  ORDER BY scan_id;
  ```
- Don’t rely on a single signal:
  - `job_history` answers “timeout/finish patterns”
  - `task.state` (JSON) answers “are we progressing right now?” (see the JSON sketch at the end of this checklist)
  - slow logs / TTL metrics answer “real throughput and bottleneck shifts”
- System tables cheat sheet (the 4 you’ll use most):
  - `mysql.tidb_ttl_table_status`: one row per physical table (partitioned tables have more); current/last job + owner heartbeat + summaries
  - `mysql.tidb_ttl_task`: runtime tasks + `state` (JSON) (cleaned when the job finishes)
  - `mysql.tidb_ttl_job_history`: final job summaries (watch the all-zero timeout pitfall)
  - `mysql.tidb_timers`: persisted TTL scheduling (interval/enable/watermark)
- Common sysvars (from “running” to “fast enough”):
  - switches/windows: `tidb_ttl_job_enable`, `tidb_ttl_job_schedule_window_start_time`, `tidb_ttl_job_schedule_window_end_time`
  - concurrency: `tidb_ttl_scan_worker_count`, `tidb_ttl_delete_worker_count`, `tidb_ttl_running_tasks`
  - batch/throttling: `tidb_ttl_scan_batch_size`, `tidb_ttl_delete_batch_size`, `tidb_ttl_delete_rate_limit`
  - internal SQL visibility: `tidb_stmt_summary_internal_query` (controls whether internal statements appear in statement summary)
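Finally, if you want quick per-task progress numbers while a job is running, the `state` JSON can be unpacked with JSON functions. The keys below (`total_rows`, `success_rows`, `error_rows`) are an assumption based on the job-summary shape shown in section 1; check a raw `state` value on your version before relying on them:

```sql
-- Assumption: state JSON keys mirror the job-summary counters; verify against a raw
-- `state` value first, since the exact keys can differ across TiDB versions.
SELECT scan_id,
       status,
       JSON_UNQUOTE(JSON_EXTRACT(state, '$.total_rows'))   AS scanned_rows,
       JSON_UNQUOTE(JSON_EXTRACT(state, '$.success_rows')) AS deleted_rows,
       JSON_UNQUOTE(JSON_EXTRACT(state, '$.error_rows'))   AS error_rows
FROM mysql.tidb_ttl_task
WHERE job_id = '<current_job_id>'
ORDER BY scan_id;
```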