Sitting Duck v1.7.4

Release date: 2026-04-14

A substantial release covering CSS selector correctness, a schema consolidation that unlocks major perf wins, and a 17× speedup on selector queries against large codebases.

Highlights

  • scope STRUCT replaces scope_id / scope_stack (breaking). One column replaces two and carries much more information — including precomputed scope.function, scope.class, scope.module shortcuts on every node. See Breaking changes below.
  • ~17× faster CSS selector queries on large codebases. Two independent bugs combined to make ast_select pathologically slow: compound filter leak (.func#name matched every node with that name, not just functions) and combinator EXISTS decorrelation (each query paid for all 4 combinator branches regardless of shape).
  • Call-graph queries on DuckDB-scale corpora went from 20+ seconds (or OOM) to sub-second. ast_callers / ast_callees now use the new scope.function column for direct hash lookups.
  • Six CSS selector bug fixes, full multi-language test coverage (687 CSS assertions across Python/JS/Rust/Go + Rosetta).
  • is_semantic_type() predicate speedup — replacing literal-pattern calls with semantic_type = 'FAMILY_NAME' (SEMANTIC_TYPE equality) is 29–73× faster; cascade reorder by call frequency gives another 11–19× on the remaining dynamic cases.

Breaking changes

scope_id + scope_stack → scope STRUCT

The pre-v1.7.4 schema had two top-level scope columns:

  • scope_id BIGINT: the nearest enclosing scope's node_id
  • scope_stack LIST<BIGINT>: on scope-creating nodes, the full chain of ancestor scope node_ids

These are replaced by a single STRUCT column:

scope: STRUCT<
    current   BIGINT,   -- replaces scope_id
    function  BIGINT,   -- new: nearest function ancestor's node_id
    class     BIGINT,   -- new: nearest class ancestor's node_id
    module    BIGINT,   -- new: nearest module/namespace ancestor's node_id
    stack     LIST<STRUCT<id BIGINT, kind SEMANTIC_TYPE>>
                        -- replaces scope_stack; now typed
>

Migration:

| Before | After |
| --- | --- |
| scope_id | scope.current |
| scope_stack (LIST) | list_transform(scope.stack, s -> s.id) for the same shape, or use scope.stack directly to get (id, kind) structs |
| Manual walk to find enclosing function | scope.function, now a single column read |

The three new named shortcuts (scope.function, scope.class, scope.module) are populated on every node by scanning the live scope stack during parsing, at no extra query-time cost. They turn common questions like "who calls this function?" from range joins over the full AST into hash lookups on a precomputed column.
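The idea of filling the shortcuts from a live scope stack can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the extension's implementation: the `Node` class, the `annotate` and `flatten` helpers, the kind strings, and matching calls to functions by bare name are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical node kinds that open a new scope in this toy model.
SCOPE_KINDS = ("module", "class", "function")


@dataclass
class Node:
    node_id: int
    kind: str
    name: str = ""
    children: list = field(default_factory=list)
    scope: dict = field(default_factory=dict)


def annotate(node: Node, stack: Optional[list] = None) -> None:
    """Fill node.scope from the live stack of enclosing (id, kind) scopes."""
    if stack is None:
        stack = []

    def nearest(kind: str) -> Optional[int]:
        # Innermost enclosing scope of the requested kind, if any.
        for scope_id, scope_kind in reversed(stack):
            if scope_kind == kind:
                return scope_id
        return None

    node.scope = {
        "current": stack[-1][0] if stack else None,
        "function": nearest("function"),
        "class": nearest("class"),
        "module": nearest("module"),
        "stack": list(stack),  # snapshot of ancestors, outermost first
    }
    opens_scope = node.kind in SCOPE_KINDS
    if opens_scope:
        stack.append((node.node_id, node.kind))
    for child in node.children:
        annotate(child, stack)
    if opens_scope:
        stack.pop()


def flatten(node: Node):
    """Yield the node and all of its descendants in document order."""
    yield node
    for child in node.children:
        yield from flatten(child)


tree = Node(1, "module", "m", [
    Node(2, "class", "A", [
        Node(3, "function", "f", [Node(4, "call", "g")]),
    ]),
    Node(5, "function", "g", [Node(6, "call", "f"), Node(7, "call", "g")]),
])
annotate(tree)
nodes = {n.node_id: n for n in flatten(tree)}

# One pass builds a hash index: callee name -> ids of the calling functions.
callers_of = defaultdict(set)
for n in nodes.values():
    if n.kind == "call" and n.scope["function"] is not None:
        callers_of[n.name].add(n.scope["function"])

# The old scope_stack shape, recovered from the typed stack (ids only).
old_scope_stack = [scope_id for scope_id, _ in nodes[4].scope["stack"]]
```

In this model, "who calls g?" is a single dictionary lookup (`callers_of["g"]`) instead of a containment test against every function's node range, which is the shape of the change described above. The last line mirrors the `list_transform(scope.stack, s -> s.id)` migration.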

Macros that previously used scope_id / scope_stack (ast_exports, ast_imports, ast_resolve, ast_callers, ast_callees, plus the CSS selector engine) have all been updated — no user-facing change there beyond naming in docs.

is_semantic_type() alias additions

Added NAMESPACE and NS as aliases for DEFINITION_MODULE, so callers from C++/Rust/Java contexts can write .namespace selectors (matches the same set as .module / .mod / .package). Not breaking — pure additions.
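As a rough illustration of what an alias addition means, the sketch below models alias resolution as a plain lookup table. The `resolve_class` helper, the lowercase selector spellings, the case-insensitive matching, and the assumption that `.ns` works as a selector the same way `.namespace` does are all hypothetical; only the alias names and the DEFINITION_MODULE target come from the notes above.

```python
# Selector-class spellings mentioned in these notes; in this model they all
# resolve to the same semantic type.
MODULE_ALIASES = ("module", "mod", "package", "namespace", "ns")
ALIAS_TO_TYPE = {alias: "DEFINITION_MODULE" for alias in MODULE_ALIASES}


def resolve_class(selector_class: str) -> str:
    """Map a selector class such as '.namespace' to its semantic type name."""
    # Case-insensitive matching is an assumption of this sketch.
    key = selector_class.lstrip(".").lower()
    if key not in ALIAS_TO_TYPE:
        raise KeyError(f"unknown selector class: {selector_class!r}")
    return ALIAS_TO_TYPE[key]


resolved = {alias: resolve_class("." + alias) for alias in MODULE_ALIASES}
```

Because the new names only add keys to the table, existing selectors keep resolving exactly as before, which is why the change is not breaking.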

Performance

Benchmarks on duckdb/src/**/*.{cpp,hpp} (~1.2M AST nodes, pre-parsed table via ast_select_from):

| Query | Before | After | Speedup |
| --- | --- | --- | --- |
| function_definition#X::callers | ~19.3s | ~0.6s | 32× |
| function_definition#X::callees | ~19.9s | ~0.6s | 32× |
| is_semantic_type(st, 'CALL') isolated | 232ms | 12ms | 19× |
| is_semantic_type(st, 'FUNCTION') isolated | 72ms | 6.5ms | 11× |

ast_callers / ast_callees on the same corpus (including parse time) dropped from OOM-or-multi-second to ~2.3s end-to-end, and are no longer quadratic in the number of functions.

CSS selector bug fixes

Six selector-engine bugs surfaced by the new multi-language test coverage, each with a tracker entry in tracker/bugs/008-css-selector-issues.md:

  1. type:has(.class) returned 0 results: the .class inside :has() args leaked into the top-level class_filter, making the base match demand the wrong semantic type. Fixed by anti-joining sel_arg_blocks in simple_class_candidates (parity with type/id candidates).
  2. :not(:has(.class)) had the same root cause: the not_has_conditions CTE was missing not_has_class. Added.
  3. Combinator + pseudo-class on the right side (e.g., class_def ~ class_def:has(X)) silently ignored the combinator because CSS parses it as pseudo_class_selector(sibling_selector, :has). Added sel_pseudo_class_unwrap CTE that re-roots to the combinator, with recursive handling for nested wraps like A > B:has(X):not(Y):first-child.
  4. ast_select_from missing from embedded macros: it was defined in src/sql_macros/css_selectors.sql but never embedded. Regenerated the embedded macros and fixed a latent language reference left over from the refactor.
  5. Compound filter leak across root selector types: .func#X (root = id_selector) ignored the .func class filter, matching every node named X (call sites, identifiers, the definition). Applied all three simple filters (type/name/class) uniformly from every root type.
  6. Combinator EXISTS decorrelated into always-on joins — DuckDB's optimizer hoisted the EXISTS subqueries inside combinator branches into full-AST hash joins, so every selector paid for all 4 combinator branches. Restructured matched_raw as a UNION ALL of per-sel_type sub-queries, each gated by sp.sel_type = 'X' at the FROM-WHERE level so unused branches short-circuit to zero rows.
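The compound filter leak in item 5 is easy to reproduce in a small model. The Python sketch below is not the selector engine (which is implemented as SQL macros); the node dictionaries, the class labels, and the `parse_compound`, `select`, and `select_leaky` helpers are hypothetical. It contrasts applying only the name filter, as the id-rooted path did before the fix, with applying type, class, and name uniformly.

```python
import re
from typing import NamedTuple, Optional


class Filters(NamedTuple):
    node_type: Optional[str] = None
    cls: Optional[str] = None
    name: Optional[str] = None


def parse_compound(selector: str) -> Filters:
    """Split a simple compound selector such as 'type.class#name' into filters."""
    parts = {"": None, ".": None, "#": None}
    # Each token is an optional '.' or '#' prefix followed by an identifier.
    for prefix, ident in re.findall(r"([.#]?)([A-Za-z_]\w*)", selector):
        parts[prefix] = ident
    return Filters(node_type=parts[""], cls=parts["."], name=parts["#"])


def matches(node: dict, f: Filters) -> bool:
    """Apply all three simple filters; a filter that is None matches anything."""
    return (
        (f.node_type is None or node["type"] == f.node_type)
        and (f.cls is None or node["cls"] == f.cls)
        and (f.name is None or node["name"] == f.name)
    )


def select(nodes: list, selector: str) -> list:
    f = parse_compound(selector)
    return [n["id"] for n in nodes if matches(n, f)]


def select_leaky(nodes: list, selector: str) -> list:
    # Models the pre-fix behaviour for an id-rooted selector: name filter only.
    f = parse_compound(selector)
    return [n["id"] for n in nodes if n["name"] == f.name]


NODES = [
    {"id": 1, "type": "function_definition", "cls": "func", "name": "main"},
    {"id": 2, "type": "call", "cls": "call", "name": "main"},
    {"id": 3, "type": "identifier", "cls": "identifier", "name": "main"},
    {"id": 4, "type": "function_definition", "cls": "func", "name": "helper"},
]

fixed = select(NODES, ".func#main")
leaky = select_leaky(NODES, ".func#main")
```

Here `leaky` contains the definition, the call site, and the bare identifier, while `fixed` contains only the definition. Beyond being wrong, the leaky result set is larger, so any later stage that consumes the matches has more rows to process.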

Other changes

  • Multi-language CSS selector test coverage (test/sql/css_selectors_multilang.test, 460 assertions across Python/JS/Rust/Go and Rosetta code examples).
  • tracker/bugs/007-duckdb-correlated-macro-bind-error.md updated with the DuckDB v1.5.2 timeline — PR #21913 merged and reverted on 2026-04-13, the v1.5.2 release does NOT contain the fix. PR #22033 now targets v1.5-variegata with a tightened version.
  • tracker/features/032-namespace-taxonomy-refinement.md — design note on the constraints of adding DEFINITION_NAMESPACE as a distinct semantic type (all four DEFINITION super-type slots are currently full; path forward is refining via language-specific bits).
  • Cleaner semantic_types.hpp comments distinguishing DEFINITION_MODULE (named module/namespace definitions) from ORGANIZATION_CONTAINER (structural file/program roots).
  • ast_select_rules / ast_select_list remain blocked by DuckDB upstream PR #22033; test/sql/_wip_ast_select_rules.test still only asserts registration. Will be unblocked when the upstream fix lands in a 1.5.x point release.

Key commits

Commit Description
693432c test: add multi-language CSS selector coverage and document bugs
ad64555 fix: semantic class inside :has() and :not(:has()) with type base
fd596d6 fix: unwrap pseudo_class_selector around combinator at selector root
4ead4ff fix: ast_select_from language reference and ship it in embedded macros
aa97512 fix: nested pseudo-class wraps around combinators
dde2791 perf: 17x faster ast_select on large codebases; fix compound filter leak
f5ca187 feat: consolidate scope_id + scope_stack into a single scope STRUCT column
9166c71 perf: rewrite ast_callers / ast_callees with scope.function
b789e7c perf: replace is_semantic_type() with SEMANTIC_TYPE equality on hot paths
e712d10 perf: reorder is_semantic_type cascade by call frequency
bcf613b perf: rewrite pe_callers in css_selectors.sql to use scope.function
465ffee feat: add NAMESPACE/NS aliases for DEFINITION_MODULE in is_semantic_type
c4d98c4 test: update fixture counts after css_selectors_test.* additions