TiKV CDC component: LazyWorker and task model (Legacy notes)

Notes on TiKV CDC endpoint internals: LazyWorker/Worker, scheduling, and representative task types.

This page is a “code-reading note” for the TiKV CDC component. Use it as a map of where to look when you are debugging CDC behavior.

1. What “lazy worker” means

In TiKV, a “lazy worker” pattern generally means:

  • The worker runtime is created early, but no work runs until tasks arrive: execution is driven entirely by scheduled messages/tasks.
  • A scheduler (sender) feeds tasks into a receiver loop.
  • The worker runs tasks on an internal thread pool (e.g. Yatp), with backpressure controls.
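The sender/receiver split above can be sketched with the standard library alone. This is a minimal illustration, not TiKV's implementation: `Task`, `Scheduler`, `spawn_worker`, and `run_demo` are hypothetical names, and a single thread stands in for the internal thread pool.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical task type; real CDC tasks carry region and event data.
#[derive(Debug)]
enum Task {
    Work(u64),
    Stop,
}

/// Hypothetical scheduler handle: the only way producers reach the worker.
#[derive(Clone)]
struct Scheduler {
    sender: Sender<Task>,
}

impl Scheduler {
    /// Returns false if the receiver loop has already exited.
    fn schedule(&self, task: Task) -> bool {
        self.sender.send(task).is_ok()
    }
}

/// Creates the channel and starts the receiver loop on its own thread.
/// The loop blocks in `recv()` and does nothing until a task arrives.
fn spawn_worker() -> (Scheduler, JoinHandle<Vec<u64>>) {
    let (sender, receiver) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut processed = Vec::new();
        while let Ok(task) = receiver.recv() {
            match task {
                Task::Work(id) => processed.push(id),
                Task::Stop => break,
            }
        }
        processed
    });
    (Scheduler { sender }, handle)
}

/// Usage: schedule three tasks, stop the loop, and collect what it ran.
fn run_demo() -> Vec<u64> {
    let (scheduler, handle) = spawn_worker();
    for id in 1..=3 {
        assert!(scheduler.schedule(Task::Work(id)));
    }
    scheduler.schedule(Task::Stop);
    handle.join().expect("worker thread panicked")
}
```

The point of the sketch is the ownership split: producers only ever hold a cloneable `Scheduler`, while the receiving end lives inside the loop, so every piece of work is visible as a message at one place.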

2. Worker / LazyWorker (conceptual)

Key ideas:

  • A worker usually wraps a thread pool plus scheduling primitives.
  • “Pending capacity” is often a soft guardrail: it warns when overloaded but does not always strictly prevent enqueuing.
  • Task counters can help detect overload and tune concurrency.
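A soft pending-capacity check combined with a task counter might look like the sketch below. It assumes the "warn but still enqueue" behavior described above; `SoftScheduler`, `Scheduled`, `run_one`, and the capacity of 2 are all hypothetical, and the actual TiKV worker may reject tasks in some code paths instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::Arc;

/// Outcome of a schedule call in this sketch.
#[derive(Debug, PartialEq)]
enum Scheduled {
    Ok,
    /// Enqueued anyway, but the pending count exceeded the soft limit.
    OverCapacity(usize),
}

/// Hypothetical scheduler with a shared pending-task counter.
struct SoftScheduler {
    sender: Sender<u64>,
    pending: Arc<AtomicUsize>,
    pending_capacity: usize,
}

impl SoftScheduler {
    fn schedule(&self, task: u64) -> Scheduled {
        // fetch_add returns the previous value, so add 1 for the new depth.
        let depth = self.pending.fetch_add(1, Ordering::SeqCst) + 1;
        self.sender.send(task).expect("receiver dropped");
        if depth > self.pending_capacity {
            // A real worker would log a warning or bump a metric here.
            Scheduled::OverCapacity(depth)
        } else {
            Scheduled::Ok
        }
    }
}

/// Consumer side: take one task if available and decrement the counter.
fn run_one(receiver: &Receiver<u64>, pending: &AtomicUsize) -> Option<u64> {
    let task = receiver.try_recv().ok()?;
    pending.fetch_sub(1, Ordering::SeqCst);
    Some(task)
}

/// Usage: enqueue three tasks against a capacity of two, then consume one.
fn soft_capacity_demo() -> (Vec<Scheduled>, usize) {
    let (sender, receiver) = mpsc::channel();
    let pending = Arc::new(AtomicUsize::new(0));
    let scheduler = SoftScheduler {
        sender,
        pending: Arc::clone(&pending),
        pending_capacity: 2,
    };
    let outcomes: Vec<Scheduled> = (0..3).map(|t| scheduler.schedule(t)).collect();
    run_one(&receiver, &pending);
    (outcomes, pending.load(Ordering::SeqCst))
}
```

Because the counter is incremented on enqueue and decremented on consume, its value at any moment is the queue depth, which is the number to watch when tuning concurrency.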

3. CDC endpoint tasks (what you actually debug)

Operationally, most CDC issues boil down to how tasks are scheduled, processed, and backpressured.

Common categories of tasks include:

  • Region lifecycle: region updates, destroys, register/deregister
  • Periodic progression: advance resolved ts / register advance events
  • Data path: changelog events, scan locks, apply-index-based scanning
  • Configuration changes
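One way to keep these categories straight while reading the code is to picture the tasks as a single enum. The sketch below is illustrative only: the variant names, fields, and the `Category` grouping are assumptions made for this note, not the definitions in the TiKV source.

```rust
/// Hypothetical task enum mirroring the categories listed above.
#[derive(Debug)]
enum Task {
    Register { region_id: u64 },
    Deregister { region_id: u64 },
    RegionUpdated { region_id: u64 },
    RegionDestroyed { region_id: u64 },
    AdvanceResolvedTs,
    ChangeLog { region_id: u64 },
    ScanLocks { region_id: u64 },
    ChangeConfig,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Category {
    RegionLifecycle,
    PeriodicProgression,
    DataPath,
    Config,
}

impl Task {
    /// Maps each task to the category it belongs to in this note.
    fn category(&self) -> Category {
        match self {
            Task::Register { .. }
            | Task::Deregister { .. }
            | Task::RegionUpdated { .. }
            | Task::RegionDestroyed { .. } => Category::RegionLifecycle,
            Task::AdvanceResolvedTs => Category::PeriodicProgression,
            Task::ChangeLog { .. } | Task::ScanLocks { .. } => Category::DataPath,
            Task::ChangeConfig => Category::Config,
        }
    }

    /// Region the task targets, if it is scoped to a single region.
    fn region_id(&self) -> Option<u64> {
        match self {
            Task::Register { region_id }
            | Task::Deregister { region_id }
            | Task::RegionUpdated { region_id }
            | Task::RegionDestroyed { region_id }
            | Task::ChangeLog { region_id }
            | Task::ScanLocks { region_id } => Some(*region_id),
            Task::AdvanceResolvedTs | Task::ChangeConfig => None,
        }
    }
}

/// Usage: count the tasks in a batch that touch one region.
fn tasks_for_region(tasks: &[Task], region_id: u64) -> usize {
    tasks
        .iter()
        .filter(|t| t.region_id() == Some(region_id))
        .count()
}
```

Filtering a captured batch by region or by category is a quick way to see which kind of task dominates when a single region misbehaves.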

When debugging, focus on:

  • Where tasks are produced (who enqueues)
  • Where tasks are consumed (runner loop)
  • Any channel size / backpressure points
  • Latency and queue depth under load
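The last two points can be made concrete with a bounded channel and enqueue timestamps. This is a sketch under assumptions: it uses the standard library's `sync_channel` as the backpressure point rather than whatever channel the CDC endpoint uses, and `Timed`, `Stats`, and `backpressure_demo` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::time::{Duration, Instant};

/// Wraps a task with the time it was enqueued, so the consumer
/// can measure how long it waited in the queue.
struct Timed<T> {
    task: T,
    enqueued_at: Instant,
}

#[derive(Debug)]
struct Stats {
    accepted: usize,
    rejected: usize,
    processed: usize,
    max_wait: Duration,
}

/// Usage: push three tasks into a channel bounded at two, then drain it.
fn backpressure_demo() -> Stats {
    let (sender, receiver) = sync_channel::<Timed<u64>>(2);
    let mut stats = Stats {
        accepted: 0,
        rejected: 0,
        processed: 0,
        max_wait: Duration::ZERO,
    };

    // Producer side: try_send never blocks; a full queue is reported
    // to the caller, which is where backpressure becomes visible.
    for id in 0..3u64 {
        let timed = Timed { task: id, enqueued_at: Instant::now() };
        match sender.try_send(timed) {
            Ok(()) => stats.accepted += 1,
            Err(TrySendError::Full(_)) => stats.rejected += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    // Dropping the sender lets the drain loop below terminate.
    drop(sender);

    // Consumer side: record queueing latency for every task.
    for timed in receiver {
        let _task_id = timed.task;
        stats.max_wait = stats.max_wait.max(timed.enqueued_at.elapsed());
        stats.processed += 1;
    }
    stats
}
```

Rejected sends show where producers outpace the runner loop, and the maximum wait shows how stale a task can be by the time it is processed; both are worth tracking under load.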

References (code pointers)