Pseudo-Classes Reference¶
All pseudo-classes supported by ast_select, organized by category. Pseudo-classes compose freely — chain as many as you need on a single selector.
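As a sketch of what chaining looks like in practice, the query below stacks three pseudo-classes documented later on this page onto one selector. The file glob is a placeholder and the particular combination is only illustrative.

```sql
-- Async, decorated functions with no return statement anywhere inside.
-- Each pseudo-class narrows the match further; the order is assumed not to matter.
SELECT name, start_line
FROM ast_select('src/*.py', '.func:async:decorated:not(:has(return_statement))');
```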
Containment¶
:has() — Contains Descendant¶
-- Functions containing a return statement
SELECT name FROM ast_select('src/*.py', '.func:has(return_statement)');
-- Functions that call execute()
SELECT name FROM ast_select('src/*.py', '.func:has(.call#execute)');
:not(:has()) — Does Not Contain¶
-- Functions without a return statement
SELECT name FROM ast_select('src/*.py', '.func:not(:has(return_statement))');
-- Functions that never call execute()
SELECT name FROM ast_select('src/*.py', '.func:not(:has(.call#execute))');
Positional¶
:first-child / :last-child¶
-- First function definition among its siblings
SELECT name FROM ast_select('src/*.py', 'function_definition:first-child');
-- Last method in a class
SELECT name FROM ast_select('src/*.js', 'method_definition:last-child');
:nth-child(n)¶
1-based position among siblings:
-- Second function definition
SELECT name FROM ast_select('src/*.py', 'function_definition:nth-child(2)');
:empty¶
Nodes with no children (leaf nodes):
-- Empty blocks (pass-only functions, empty classes)
SELECT name FROM ast_select('src/*.py', 'block:empty');
:root¶
The top-level node (module/program).
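A minimal sketch, assuming :root can be used bare the same way :definition and :scope are elsewhere on this page. It should return one row per parsed file:

```sql
-- One top-level node per file (for Python, presumably type = 'module')
SELECT file_path, type
FROM ast_select('src/*.py', ':root');
```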
Structural¶
:named¶
Nodes with a non-empty name field. Filters out the many anonymous structural nodes:
-- Only named function definitions (excludes unnamed wrappers)
SELECT name, start_line FROM ast_select('src/*.py', '.func:named');
:syntax¶
Syntax-only tokens (keywords, punctuation). Useful for distinguishing keywords from their parent constructs:
-- Just the `if` keyword tokens (not if_statement)
SELECT * FROM ast_select('src/*.py', 'if:syntax');
-- if_statement constructs (not keywords) — use :not(:syntax) or exact type
SELECT * FROM ast_select('src/*.py', 'if:not(:syntax)');
:definition / :reference / :declaration¶
Query by NAME_ROLE flag:
-- All name-binding sites (functions, classes, variables, parameters)
SELECT name, type FROM ast_select('src/*.py', ':definition');
-- All name references (identifiers that use a name)
SELECT name, type FROM ast_select('src/*.py', 'identifier:reference');
-- Forward declarations only (C++ prototypes, TS signatures)
SELECT name FROM ast_select('src/*.cpp', ':declaration');
Scope¶
:scope (bare) — Is a Scope Boundary¶
-- All scope-creating nodes (functions, classes, loops, module)
SELECT type, name FROM ast_select('src/*.py', ':scope');
:scope(type) — Within Nearest Ancestor Scope¶
The most powerful pseudo-class. Matches nodes within the nearest ancestor of the given type, excluding subtrees of nested ancestors of the same type.
This solves the nested function problem:
-- Return statements within their DIRECT enclosing function
-- (not returns in nested inner functions)
SELECT peek, start_line
FROM ast_select('src/*.py', 'return_statement:scope(function)');
-- Calls within the nearest class (not from nested classes)
SELECT name FROM ast_select('src/*.py', '.call:scope(class)');
Without :scope(), ast_has reports outer_function as containing execute() even when the call is inside a nested inner_function. With :scope(function), only the direct enclosing function matches.
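One way to see the difference is a sketch built on the scope.function field described under Scope Columns. Each execute call is attributed to exactly one function, its nearest function ancestor, so a call inside a nested function is never credited to the outer one. The call name and file glob are placeholders.

```sql
-- Attribute every call to execute() to its direct enclosing function.
-- scope.function holds the node_id of the nearest function ancestor.
SELECT fn.name AS enclosing_function, c.start_line AS call_line
FROM read_ast('src/*.py') c
JOIN read_ast('src/*.py') fn
  ON fn.node_id = c.scope.function
 AND fn.file_path = c.file_path
WHERE c.semantic_type = 'COMPUTATION_CALL'
  AND c.name = 'execute';
```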
Scope Columns¶
Every node in read_ast output carries a scope STRUCT:
scope: STRUCT<
current BIGINT, -- nearest enclosing scope's node_id (-1/NULL at root)
function BIGINT, -- nearest function ancestor's node_id
class BIGINT, -- nearest class/struct/trait ancestor
module BIGINT, -- nearest module/namespace ancestor
stack LIST<STRUCT<id BIGINT, kind SEMANTIC_TYPE>>
-- scope nodes only: full ancestor chain with kinds
>
scope.current replaces the pre-v1.7.4 scope_id column. The
function/class/module shortcuts are the big win: they let you ask
"what function is this in?" as a single struct-field read instead of a
range join against the AST. scope.stack replaces the old
scope_stack (now typed — each entry carries its semantic kind).
-- Scope chain for a class method:
SELECT name, type, scope.current, list_transform(scope.stack, s -> s.id)
FROM read_ast('src/*.py')
WHERE type IN ('module', 'class_definition', 'function_definition')
ORDER BY node_id LIMIT 5;
-- module: scope.current=NULL, stack=[0]
-- Config: scope.current=0, stack=[0, 32]
-- __init__: scope.current=32, stack=[0, 32, 42]
-- "who calls ExecuteRecursivePipelines?" — O(1) per candidate
SELECT DISTINCT caller.name
FROM read_ast('src/**/*.cpp') call_site
JOIN read_ast('src/**/*.cpp') caller
ON caller.node_id = call_site.scope.function
AND caller.file_path = call_site.file_path
WHERE call_site.semantic_type = 'COMPUTATION_CALL'
AND call_site.name = 'ExecuteRecursivePipelines';
To get the full scope chain of a node that isn't itself a scope node,
look up the node identified by its scope.current, then read that node's scope.stack.
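That two-step lookup can be written as a self-join. This is a sketch using only the columns shown above; the WHERE clause is an arbitrary example filter.

```sql
-- Full scope chain for return statements, which are not scope nodes themselves:
-- follow scope.current to the enclosing scope node and read its stack.
SELECT n.start_line,
       list_transform(s.scope.stack, x -> x.id) AS scope_chain
FROM read_ast('src/*.py') n
JOIN read_ast('src/*.py') s
  ON s.node_id = n.scope.current
 AND s.file_path = n.file_path
WHERE n.type = 'return_statement';
```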
Scope Resolution Macros¶
Three macros build on scope.current, scope.function, and scope.stack
for name resolution:
ast_exports(source) — module-level public definitions:
ast_imports(source) — imported names with source module hints:
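Neither macro's output schema is spelled out on this page, so the sketch below selects only the columns that the cross-file join later in this section relies on (name, type, file_path for exports; imported_name, source_module for imports). Other columns may exist.

```sql
-- Public, module-level definitions across a package
SELECT name, type, file_path
FROM ast_exports('src/**/*.py');

-- Names imported by one file, with the module each is expected to come from
SELECT imported_name, source_module
FROM ast_imports('src/app.py');
```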
ast_resolve(source) — reference → definition binding via scope chain walk:
-- For each reference, find which definition it binds to
SELECT ref_name, ref_line, def_line, def_type, scope_hops
FROM ast_resolve('src/main.py');
Cross-file resolution: JOIN ast_imports with ast_exports to resolve imports:
SELECT im.imported_name, ex.file_path as resolved_file, ex.type
FROM ast_imports('src/app.py') im
JOIN ast_exports('src/**/*.py') ex
ON ex.name = im.imported_name
AND ex.file_path LIKE '%' || im.source_module || '%';
Call Graph¶
:calls(name) — Scope Contains a Call¶
Matches nodes whose scope contains a call to name. Unlike :has(.call#name), this uses scope resolution to avoid matching calls in nested functions.
-- Functions that call execute (direct scope only)
SELECT name FROM ast_select('src/**/*.py', '.func:calls(execute)');
-- Classes that call validate
SELECT name FROM ast_select('src/**/*.py', '.class:calls(validate)');
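To see where :calls and :has diverge, one option is to subtract one result set from the other. This sketch assumes both selectors return comparable file_path and name columns. Whatever remains are functions that reach execute only through a nested scope (an inner function, for example).

```sql
-- Functions matched by :has but not by :calls
SELECT file_path, name
FROM ast_select('src/**/*.py', '.func:has(.call#execute)')
EXCEPT
SELECT file_path, name
FROM ast_select('src/**/*.py', '.func:calls(execute)');
```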
:called-by(name) — Call Inside Function¶
Matches call nodes that are inside the function name:
-- All calls made by main()
SELECT name, start_line FROM ast_select('src/**/*.py', '.call:called-by(main)');
-- All calls inside process_request (filter on name to narrow to database calls)
SELECT name FROM ast_select('src/**/*.py', '.call:called-by(process_request)');
:is-called — Function Is Called¶
Matches function definitions that are called somewhere in the file:
-- Functions that are actually called
SELECT name FROM ast_select('src/**/*.py', '.func:is-called');
-- Unused functions (defined but never called)
SELECT name FROM ast_select('src/**/*.py', '.func:not(:is-called)');
:is-referenced — Definition Is Referenced¶
Matches definitions that are referenced somewhere:
-- Variables that are actually used
SELECT name FROM ast_select('src/**/*.py', '.var:is-referenced');
-- Dead code: defined but never referenced
SELECT name FROM ast_select('src/**/*.py', '.func:not(:is-referenced)');
:exported — Module-Level Public Definition¶
Matches definitions at module scope that are part of the public API:
-- Public API surface
SELECT name, type FROM ast_select('src/**/*.py', ':exported');
-- Exported but never referenced internally
SELECT name FROM ast_select('src/**/*.py', ':exported:not(:is-referenced)');
Pseudo-Elements (:: — Navigation)¶
Pseudo-elements return different nodes rather than filtering. They navigate FROM matched nodes to related nodes.
Tree Navigation¶
-- Parent node
SELECT * FROM ast_select('src/*.py', 'return_statement::parent');
-- Enclosing scope
SELECT * FROM ast_select('src/*.py', '.func#inner::scope');
-- Nearest enclosing definition (function, class, variable)
SELECT * FROM ast_select('src/*.py', 'return_statement::parent-definition');
-- Adjacent siblings
SELECT * FROM ast_select('src/*.py', '.func#validate::next-sibling');
SELECT * FROM ast_select('src/*.py', '.func#validate::prev-sibling');
-- ::previous-sibling is accepted as an alias for ::prev-sibling
Call Graph Navigation¶
-- Functions that call this function
SELECT name FROM ast_select('src/*.py', '.func#get_user::callers');
-- What this function calls
SELECT name FROM ast_select('src/*.py', '.func#main::callees');
Pseudo-Element Quick Reference¶
| Pseudo-element | Returns | Cardinality |
|---|---|---|
| ::parent | Parent node | 1 |
| ::scope | Enclosing scope node | 1 |
| ::parent-definition | Nearest enclosing definition | 1 |
| ::next-sibling | Next sibling | 1 |
| ::prev-sibling | Previous sibling | 1 |
| ::callers | Functions that call this | N |
| ::callees | Functions this calls | N |
Ordering¶
:precedes(type) — Before a Sibling¶
-- Comments that appear before function definitions
SELECT peek FROM ast_select('src/*.py', 'comment:precedes(function_definition)');
-- Import statements before class definitions
SELECT name FROM ast_select('src/*.py', 'import:precedes(class)');
:follows(type) — After a Sibling¶
-- Functions defined after a class definition (any earlier class sibling)
SELECT name FROM ast_select('src/*.py', 'function_definition:follows(class)');
-- Statements after imports (module-level constants, etc.)
SELECT peek FROM ast_select('src/*.py', 'expression_statement:follows(import)');
These provide the reverse direction that CSS combinators (~, +) can't express. A ~ B returns B; :precedes(B) returns the A nodes.
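A side-by-side sketch of the two directions, assuming the general sibling combinator ~ is written with surrounding spaces as in CSS:

```sql
-- Combinator form: returns the function definitions (the B side)
SELECT name FROM ast_select('src/*.py', 'comment ~ function_definition');

-- :precedes form: returns the comments (the A side)
SELECT peek FROM ast_select('src/*.py', 'comment:precedes(function_definition)');
```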
Modifiers¶
-- Async functions
SELECT name FROM ast_select('src/*.py', '.func:async');
-- Static methods
SELECT name FROM ast_select('src/*.java', '.func:static');
-- Abstract classes
SELECT name FROM ast_select('src/*.java', '.class:abstract');
-- Const/final variables
SELECT name FROM ast_select('src/*.js', '.var:const');
-- Access modifiers
SELECT name FROM ast_select('src/*.java', '.func:public');
SELECT name FROM ast_select('src/*.java', '.func:private');
SELECT name FROM ast_select('src/*.java', '.func:protected');
Annotations¶
-- Decorated functions (have any annotation/decorator)
SELECT name FROM ast_select('src/*.py', '.func:decorated');
-- Functions with type annotations
SELECT name FROM ast_select('src/*.py', '.func:typed');
-- Functions without a return type (void/None)
SELECT name FROM ast_select('src/*.py', '.func:void');
-- Functions with variadic parameters (*args, **kwargs, ...rest)
SELECT name FROM ast_select('src/*.py', '.func:variadic');
Pattern Matching¶
Two pseudo-classes parse their argument as real code and compare it structurally to the AST. They differ in what they compare against:
- :match("code"): the current node is the root of the parsed pattern. Strict: the target's type must equal the pattern root's type.
- :contains("code"): some descendant of the current node is the root of the parsed pattern. Equivalent to :has(:match("code")).
:match("code") — Current-Node Structural Match¶
Use :match when you know the type of node you're looking for and want to check it directly:
-- Find call expressions that are exactly db.execute() with no arguments
SELECT name FROM ast_select('src/*.py', 'call:match("db.execute()")');
-- Find return statements that return None
SELECT peek FROM ast_select('src/*.py', 'return_statement:match("return None")');
.func:match("db.execute()") returns zero rows — a function_definition is not a call node, so the types don't match. Use :contains for "function contains X".
:contains("code") — Subtree Structural Match¶
Use :contains when you want to find a pattern anywhere inside a larger node:
-- Functions that contain a db.execute() call somewhere in their body
SELECT name FROM ast_select('src/*.py', '.func:contains("db.execute()")');
-- Functions containing a specific return pattern
SELECT name FROM ast_select('src/*.py', '.func:contains("return None")');
-- Classes that contain a self.db assignment anywhere inside
SELECT name FROM ast_select('src/*.py', '.class:contains("self.db = ___")');
Wildcards and Semantics¶
Both pseudo-classes share the same pattern parser. Use ___ (triple underscore) as a wildcard for "any name":
-- Match any assignment to self.<something>
SELECT name FROM ast_select('src/*.py', '.func:contains("self.___ = ___")');
Both use DFS pre-order contiguity — a subtree is a contiguous slice of the node array, so structural matching becomes array substring matching. No recursive tree traversal. :match is a direct lookup on the current node; :contains scans descendants.
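The contiguity property can be illustrated in plain DuckDB without the extension. The toy table below is hypothetical: it numbers a simplified tree for def f(): return g() in pre-order and stores an assumed descendant_count per node, so any subtree is just a node_id range.

```sql
-- Hypothetical, simplified node array in DFS pre-order (not real parser output)
CREATE TEMP TABLE toy_nodes AS
SELECT * FROM (VALUES
    (0, 'module',              5),
    (1, 'function_definition', 4),
    (2, 'block',               3),
    (3, 'return_statement',    2),
    (4, 'call',                1),
    (5, 'identifier',          0)
) AS t(node_id, type, descendant_count);

-- The subtree rooted at a node is the slice [node_id, node_id + descendant_count]
SELECT c.node_id, c.type
FROM toy_nodes p
JOIN toy_nodes c
  ON c.node_id BETWEEN p.node_id AND p.node_id + p.descendant_count
WHERE p.type = 'return_statement'
ORDER BY c.node_id;
-- returns nodes 3, 4, 5
```

Because the slice is contiguous, comparing a pattern against a subtree reduces to comparing two runs of the same array, which is the idea behind the array substring matching described above.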
One Pattern per Selector¶
Only one :match or :contains is supported per selector. To combine multiple patterns, chain ast_select calls via a CTE:
WITH execute_callers AS (
SELECT * FROM ast_select('src/*.py', '.func:contains("db.execute()")')
)
SELECT f.name FROM execute_callers f
WHERE EXISTS (
SELECT 1 FROM ast_select('src/*.py', '.func:contains("return ___")') r
WHERE r.file_path = f.file_path AND r.node_id = f.node_id
);
Custom Predicates¶
Define your own pseudo-classes by registering macros with the ast_selector_predicate_<name> naming convention. Once registered, use them in selectors as :<name> or :<name>("arg").
Setup¶
Dynamic custom predicates require the func_apply extension for runtime dispatch, which is enabled with the sitting_duck_enable_dynamic_predicates pragma.
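Pieced together from the commands quoted under Error Messages further down, the setup presumably looks like this. Whether the INSTALL step is needed depends on your environment.

```sql
-- One-time install from the community repository (skip if already installed)
INSTALL func_apply FROM community;

-- Load func_apply and swap in the real dispatcher
PRAGMA sitting_duck_enable_dynamic_predicates;
```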
The pragma loads func_apply and replaces the internal dispatch stub with a real apply()-based dispatcher. If func_apply isn't installed, the pragma raises a clear error with next steps.
Defining a Predicate¶
A predicate macro takes two arguments: node (the AST row as a struct) and arg (the string argument from the selector, or NULL if none):
-- Predicate with an argument: :name_starts("prefix")
CREATE MACRO ast_selector_predicate_name_starts(node, prefix) AS (
node.name IS NOT NULL AND starts_with(node.name, prefix)
);
-- Predicate with no argument: :is_deep
CREATE MACRO ast_selector_predicate_is_deep(node, arg) AS (
node.depth >= 3
);
-- Predicate using naming conventions: :is_test
CREATE MACRO ast_selector_predicate_is_test(node, arg) AS (
node.name IS NOT NULL AND (
starts_with(node.name, 'test_')
OR starts_with(node.name, 'Test')
)
);
The node struct contains all columns from read_ast() — name, type, depth, semantic_type, peek, start_line, scope, etc. Use any column to build your predicate logic.
Using Custom Predicates¶
Custom predicates work exactly like built-in pseudo-classes:
-- With an argument
SELECT name FROM ast_select('src/*.py', '.func:name_starts("test_")');
-- Without an argument
SELECT name FROM ast_select('src/*.py', 'function_definition:is_deep');
-- Negation
SELECT name FROM ast_select('src/*.py', '.func:not(:is_test)');
-- Combined with built-in pseudo-classes
SELECT name FROM ast_select('src/*.py', '.func:named:is_deep');
SELECT name FROM ast_select('src/*.py', '.func:is_test:has(return_statement)');
Discoverability¶
All registered predicates are discoverable via DuckDB's function catalog.
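One plausible way to do this uses DuckDB's built-in duckdb_functions() table function and filters on the naming convention:

```sql
-- List user-registered selector predicates and their parameter names
SELECT function_name, parameters
FROM duckdb_functions()
WHERE function_type = 'macro'
  AND starts_with(function_name, 'ast_selector_predicate_');
```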
Example: FTS Predicate¶
Combine with DuckDB's full-text search to find nodes by content:
-- Build a text index on the AST
CREATE TABLE code AS SELECT * FROM read_ast('src/**/*.py');
PRAGMA create_fts_index('code', 'node_id', 'peek');
-- Predicate that checks peek for a case-insensitive substring
-- (as written, this uses ILIKE and does not consult the FTS index)
CREATE MACRO ast_selector_predicate_mentions(node, term) AS (
node.peek IS NOT NULL AND node.peek ILIKE '%' || term || '%'
);
-- Functions mentioning "database"
SELECT name FROM ast_select('src/**/*.py', '.func:mentions("database")');
Error Messages¶
If you use a custom pseudo-class without the required setup, you'll get targeted guidance:
- No func_apply installed: suggests INSTALL func_apply FROM community; then PRAGMA sitting_duck_enable_dynamic_predicates;
- func_apply loaded but no matching macro: reports the specific macro name that's missing
Advanced: Custom Dispatch Without func_apply¶
Under the hood, custom predicates are dispatched through the ast_dispatch_predicate(fn, node, arg) macro. When func_apply is loaded, Sitting Duck registers this as a call to apply(). Without func_apply, it's a no-op stub.
You can replace this macro with your own dispatcher — no func_apply required:
CREATE OR REPLACE MACRO ast_dispatch_predicate(fn, node, arg) AS (
CASE fn
WHEN 'ast_selector_predicate_is_test'
THEN node.name IS NOT NULL AND starts_with(node.name, 'test_')
WHEN 'ast_selector_predicate_mentions'
THEN node.peek IS NOT NULL AND node.peek ILIKE '%' || arg || '%'
ELSE false
END
);
This is useful in environments where you can't install community extensions, or when you have a fixed set of predicates and want to avoid the func_apply dependency.
Quick Reference¶
| Pseudo-class | Meaning |
|---|---|
| Containment | |
| :has(sel) | Contains descendant matching sel |
| :not(:has(sel)) | Does NOT contain descendant |
| :match("code") | Current node IS the parsed pattern root (direct match) |
| :contains("code") | Some descendant IS the parsed pattern root (subtree match) |
| Positional | |
| :first-child | First among siblings |
| :last-child | Last among siblings |
| :nth-child(n) | Nth sibling (1-based) |
| :empty | No children |
| :root | Top-level node (depth 0) |
| Structural | |
| :named | Has a non-empty name |
| :syntax | Syntax-only token (keyword, punctuation) |
| :definition | Introduces a name with implementation |
| :reference | Uses a name |
| :declaration | Introduces a name without implementation |
| Scope | |
| :scope | Is a scope boundary |
| :scope(type) | Within nearest ancestor of type (scope-aware) |
| Call Graph | |
| :calls(name) | Scope contains a call to name |
| :called-by(name) | This call is inside function name |
| :is-called | Function is called somewhere |
| :is-referenced | Definition is referenced somewhere |
| :exported | Module-level public definition |
| Ordering | |
| :precedes(type) | Before a sibling of type |
| :follows(type) | After a sibling of type |
| Modifiers | |
| :async | Has async modifier |
| :static | Has static modifier |
| :abstract | Has abstract modifier |
| :const | Has const/final modifier |
| :public / :private / :protected | Access modifiers |
| Annotations | |
| :decorated | Has decorators/annotations |
| :typed | Has type annotation/signature |
| :void | No return type |
| :variadic | Has variadic parameters (*args, ...rest) |
| Custom | |
| :<name> | User-defined predicate (requires func_apply) |
| :<name>("arg") | User-defined predicate with argument |
See Also¶
- CSS Selectors Overview — Combinators, compound selectors, API reference
- Node Type Selectors — Three tiers of type specificity
- Attribute Selectors — Query by name, modifier, annotation, and more
- Semantic Type Aliases — Full alias table for .semantic selectors