TiKV CDC component: LazyWorker and task model (Legacy notes)
Notes on TiKV CDC endpoint internals: LazyWorker/Worker, scheduling, and representative task types.
This page is a "code-reading note" for the TiKV CDC component, meant as a map of where to look when you are debugging CDC behavior.
1. What “lazy worker” means
In TiKV, a “lazy worker” pattern generally means:
- The worker runtime is created early, but actual execution is deferred: work is driven by the messages/tasks that arrive later.
- A scheduler (sender) feeds tasks into a receiver loop.
- The worker runs tasks on an internal thread pool (e.g. Yatp), with backpressure controls.
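The pattern above can be sketched in a few lines. This is a minimal, hypothetical illustration using a plain channel and thread, not TiKV's actual `LazyWorker` API; the names (`LazyWorker`, `Task`, `schedule`) are chosen to mirror the concepts, and the real implementation runs on a pool (e.g. Yatp) rather than a single thread.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical simplified task type; the real CDC task enum is far richer.
#[derive(Debug)]
enum Task {
    Compute(u64),
    Stop,
}

// "Lazy" in the sense described above: the scheduler (sender) exists first,
// and the runner thread only executes work as messages arrive.
struct LazyWorker {
    scheduler: mpsc::Sender<Task>,
    handle: thread::JoinHandle<u64>,
}

impl LazyWorker {
    fn start() -> LazyWorker {
        let (tx, rx) = mpsc::channel();
        let handle = thread::spawn(move || {
            let mut acc = 0u64;
            // Receiver loop: execution is driven entirely by incoming tasks.
            while let Ok(task) = rx.recv() {
                match task {
                    Task::Compute(n) => acc += n,
                    Task::Stop => break,
                }
            }
            acc
        });
        LazyWorker { scheduler: tx, handle }
    }

    // In TiKV the scheduler handle is cloned and given to many producers.
    fn schedule(&self, t: Task) {
        self.scheduler.send(t).expect("worker stopped");
    }

    fn stop(self) -> u64 {
        self.schedule(Task::Stop);
        self.handle.join().expect("runner panicked")
    }
}

fn demo() -> u64 {
    let w = LazyWorker::start();
    for n in 1..=4 {
        w.schedule(Task::Compute(n));
    }
    w.stop() // 1 + 2 + 3 + 4 = 10
}

fn main() {
    println!("total = {}", demo()); // prints "total = 10"
}
```

The key property to notice: producers never block on execution; they only hand a message to the scheduler, which is why queue depth (not call latency) is what you watch under load.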
2. Worker / LazyWorker (conceptual)
Key ideas:
- A worker usually wraps a thread pool plus scheduling primitives.
- “Pending capacity” is often a soft guardrail: it warns when overloaded but does not always strictly prevent enqueuing.
- Task counters can help detect overload and tune concurrency.
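The "soft guardrail" idea can be made concrete with a small sketch. This is a hypothetical illustration, assuming a scheduler that tracks pending tasks with an atomic counter, warns past a capacity threshold, and still accepts the task; it is not TiKV's actual pending-capacity logic.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical sketch of a soft pending-capacity check: crossing the
// threshold emits a warning, but the task is never rejected here.
struct Scheduler {
    pending: AtomicUsize,
    capacity: usize,
}

impl Scheduler {
    fn new(capacity: usize) -> Scheduler {
        Scheduler { pending: AtomicUsize::new(0), capacity }
    }

    // Returns true when the enqueue exceeded the soft cap, so callers or
    // metrics can observe overload. Enqueuing always succeeds.
    fn enqueue(&self) -> bool {
        let now = self.pending.fetch_add(1, Ordering::SeqCst) + 1;
        let overloaded = now > self.capacity;
        if overloaded {
            eprintln!("warn: {} pending tasks exceed soft capacity {}", now, self.capacity);
        }
        overloaded
    }

    // Called by the runner when a task finishes.
    fn complete(&self) {
        self.pending.fetch_sub(1, Ordering::SeqCst);
    }

    fn depth(&self) -> usize {
        self.pending.load(Ordering::SeqCst)
    }
}

fn main() {
    let s = Scheduler::new(2);
    assert!(!s.enqueue()); // 1 pending: fine
    assert!(!s.enqueue()); // 2 pending: at cap, still fine
    assert!(s.enqueue());  // 3 pending: warns, but accepted anyway
    s.complete();
    println!("pending = {}", s.depth()); // prints "pending = 2"
}
```

A counter like this is cheap enough to sample continuously, which is why it is a better first debugging signal than tracing individual tasks.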
3. CDC endpoint tasks (what you actually debug)
Operationally, most CDC issues boil down to how tasks are scheduled, processed, and subjected to backpressure.
Common categories of tasks include:
- Region lifecycle: region updates, destroys, register/deregister
- Periodic progression: advance resolved ts / register advance events
- Data path: changelog events, scan locks, apply-index based scanning
- Configuration changes
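The categories above map naturally onto a task enum. The following is a hypothetical, heavily simplified sketch; the variant names and fields here are invented for illustration and do not match the real enum in TiKV's CDC endpoint code, which carries many more variants and payloads.

```rust
// Hypothetical simplified task enum mirroring the categories above.
#[allow(dead_code)]
enum Task {
    // Region lifecycle
    Register { region_id: u64 },
    Deregister { region_id: u64 },
    // Periodic progression
    AdvanceResolvedTs { ts: u64 },
    RegisterAdvanceEvent,
    // Data path
    ChangeLog { region_id: u64, entries: Vec<Vec<u8>> },
    InitScan { region_id: u64, apply_index: u64 },
    // Configuration
    ChangeConfig { key: String, value: String },
}

// A runner loop typically matches on the task variant; per-variant latency
// and dispatch frequency are what you instrument when debugging.
fn describe(t: &Task) -> &'static str {
    match t {
        Task::Register { .. } | Task::Deregister { .. } => "region lifecycle",
        Task::AdvanceResolvedTs { .. } | Task::RegisterAdvanceEvent => "progression",
        Task::ChangeLog { .. } | Task::InitScan { .. } => "data path",
        Task::ChangeConfig { .. } => "configuration",
    }
}

fn main() {
    let t = Task::ChangeLog { region_id: 7, entries: vec![] };
    println!("{}", describe(&t)); // prints "data path"
}
```

Grouping variants like this is also how you would bucket metrics: one histogram per category usually tells you which path is saturated before you dig into individual variants.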
When debugging, focus on:
- Where tasks are produced (who enqueues)
- Where tasks are consumed (runner loop)
- Channel sizes and backpressure points
- Latency and queue depth under load
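The last two bullets (queue depth and latency under load) can be measured by tagging each task with its enqueue time. This is a generic sketch under that assumption, not TiKV's instrumentation; the `Timed`/`Queue` names are hypothetical.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

// Hypothetical sketch: tag each task with its enqueue time so the runner
// can report queue latency (time spent waiting) and current queue depth.
struct Timed<T> {
    task: T,
    enqueued_at: Instant,
}

struct Queue<T> {
    inner: VecDeque<Timed<T>>,
}

impl<T> Queue<T> {
    fn new() -> Queue<T> {
        Queue { inner: VecDeque::new() }
    }

    fn push(&mut self, task: T) {
        self.inner.push_back(Timed { task, enqueued_at: Instant::now() });
    }

    fn depth(&self) -> usize {
        self.inner.len()
    }

    // Pop a task along with how long it waited in the queue.
    fn pop(&mut self) -> Option<(T, Duration)> {
        self.inner
            .pop_front()
            .map(|t| (t.task, t.enqueued_at.elapsed()))
    }
}

fn main() {
    let mut q = Queue::new();
    q.push("resolve-ts");
    q.push("change-log");
    let (task, waited) = q.pop().unwrap();
    println!("{} waited {:?}, depth now {}", task, waited, q.depth());
}
```

Rising queue latency with flat depth points at slow consumers; rising depth with flat latency per task points at bursty producers. Distinguishing the two is usually the first step in a CDC overload investigation.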
References (code pointers)
- TiKV server starts CDC worker (entry point varies by version)
- CDC endpoint task enum (for the list of task types)