Semantic Types Reference¶
Complete reference for the semantic type system.
Overview¶
The semantic type system uses an 8-bit encoding to classify AST nodes into universal categories that work across all 27 supported languages.
The SEMANTIC_TYPE Column¶
The semantic_type column uses a custom DuckDB logical type called SEMANTIC_TYPE:
-- Displays as human-readable string
SELECT semantic_type FROM read_ast('file.py') LIMIT 1;
-- Returns: DEFINITION_FUNCTION (not 240)
-- Direct string comparison works
SELECT * FROM read_ast('file.py')
WHERE semantic_type = 'DEFINITION_FUNCTION';
-- Check the type
SELECT typeof(semantic_type) FROM read_ast('file.py') LIMIT 1;
-- Returns: SEMANTIC_TYPE
The type stores values efficiently as UTINYINT internally while displaying as readable strings.
Quick Reference¶
| Code | Name | Description |
|---|---|---|
| 32 | METADATA_COMMENT | Comments and documentation |
| 36 | METADATA_ANNOTATION | Decorators, annotations |
| 48 | EXTERNAL_IMPORT | Import/include statements |
| 52 | EXTERNAL_EXPORT | Export statements |
| 64 | LITERAL_NUMBER | Numeric values |
| 68 | LITERAL_STRING | String values |
| 72 | LITERAL_ATOMIC | Boolean, null |
| 80 | NAME_IDENTIFIER | Simple identifiers |
| 84 | NAME_QUALIFIED | Dotted names |
| 112 | TYPE_PRIMITIVE | Basic types |
| 144 | FLOW_CONDITIONAL | If/switch |
| 148 | FLOW_LOOP | For/while |
| 152 | FLOW_JUMP | Return/break/continue |
| 160 | ERROR_TRY | Try blocks |
| 164 | ERROR_CATCH | Catch blocks |
| 208 | COMPUTATION_CALL | Function calls |
| 212 | COMPUTATION_ACCESS | Member access |
| 220 | COMPUTATION_LAMBDA | Anonymous functions |
| 240 | DEFINITION_FUNCTION | Function definitions |
| 244 | DEFINITION_VARIABLE | Variable definitions |
| 248 | DEFINITION_CLASS | Class definitions |
| 252 | DEFINITION_MODULE | Module definitions |
Encoding Structure¶
8-bit encoding: [ss kk tt ll]
ss (bits 6-7): Super Kind (4 categories)
kk (bits 4-5): Kind (4 per super kind, 16 in total)
tt (bits 2-3): Super Type (4 per kind)
ll (bits 0-1): Language-specific
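To make the layout concrete, the four fields can be separated from a numeric code with shifts and masks. The following is a sketch in plain DuckDB SQL that uses no extension functions; the column aliases `ss`, `kk`, `tt`, and `ll` are illustrative names borrowed from the diagram, not columns that `read_ast` provides.

```sql
-- Split an 8-bit semantic type code into its four 2-bit fields.
-- The literal codes come from the Quick Reference table.
SELECT
    code,
    (code >> 6) & 3 AS ss,  -- bits 6-7: super kind
    (code >> 4) & 3 AS kk,  -- bits 4-5: kind within the super kind
    (code >> 2) & 3 AS tt,  -- bits 2-3: super type within the kind
    code & 3        AS ll   -- bits 0-1: language-specific
FROM (VALUES (240), (148), (84)) AS t(code);
```

For 240 (DEFINITION_FUNCTION) this gives `ss = 3, kk = 3, tt = 0, ll = 0`; for 148 (FLOW_LOOP) it gives `ss = 2, kk = 1, tt = 1, ll = 0`; for 84 (NAME_QUALIFIED) it gives `ss = 1, kk = 1, tt = 1, ll = 0`.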
Super Kinds¶
META_EXTERNAL (0x00-0x3F)¶
Metadata, parser constructs, and external references.
| Kind | Code Range | Description |
|---|---|---|
| PARSER_SPECIFIC | 0-15 | Syntax, delimiters |
| RESERVED | 16-31 | Future use |
| METADATA | 32-47 | Comments, annotations |
| EXTERNAL | 48-63 | Imports, exports |
DATA_STRUCTURE (0x40-0x7F)¶
Data representation and naming.
| Kind | Code Range | Description |
|---|---|---|
| LITERAL | 64-79 | Values |
| NAME | 80-95 | Identifiers |
| PATTERN | 96-111 | Patterns |
| TYPE | 112-127 | Type info |
CONTROL_EFFECTS (0x80-0xBF)¶
Program flow and execution.
| Kind | Code Range | Description |
|---|---|---|
| EXECUTION | 128-143 | Statements |
| FLOW_CONTROL | 144-159 | Conditionals, loops |
| ERROR_HANDLING | 160-175 | Try/catch |
| ORGANIZATION | 176-191 | Blocks, structure |
COMPUTATION (0xC0-0xFF)¶
Operations and definitions.
| Kind | Code Range | Description |
|---|---|---|
| OPERATOR | 192-207 | Operators |
| COMPUTATION_NODE | 208-223 | Calls, access |
| TRANSFORM | 224-239 | Queries, iteration |
| DEFINITION | 240-255 | Functions, classes |
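The code ranges in these tables follow directly from the top bits of the code: masking with 0xC0 (192) yields the start of the super kind range, and masking with 0xF0 (240) yields the start of the kind range. A sketch in plain DuckDB SQL, with hypothetical column aliases and literal codes taken from the Quick Reference table:

```sql
-- Map a code to the first value of its super kind and kind ranges.
SELECT
    code,
    code & 192 AS super_kind_base,  -- mask 0xC0 keeps bits 6-7
    code & 240 AS kind_base,        -- mask 0xF0 keeps bits 4-7
    CASE code & 192
        WHEN 0   THEN 'META_EXTERNAL'
        WHEN 64  THEN 'DATA_STRUCTURE'
        WHEN 128 THEN 'CONTROL_EFFECTS'
        ELSE 'COMPUTATION'
    END AS super_kind
FROM (VALUES (36), (84), (148), (212)) AS t(code);
```

For example, 212 (COMPUTATION_ACCESS) has `super_kind_base = 192` and `kind_base = 208`, which places it in COMPUTATION and in the COMPUTATION_NODE range 208-223.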
Helper Functions¶
semantic_type_to_string(code)¶
Convert code to name:
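A minimal call, assuming the function accepts an integer literal in the same way as the `is_control_flow` examples on this page. The expected name is the one the Quick Reference table lists for code 240.

```sql
-- 240 is listed as DEFINITION_FUNCTION in the Quick Reference table
SELECT semantic_type_to_string(240);  -- DEFINITION_FUNCTION
```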
get_super_kind(code)¶
Get super kind:
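A sketch of possible usage, assuming integer literals are accepted. This page does not state whether the result is a name or a numeric value, so the comments only note which range each code belongs to.

```sql
-- 240 (DEFINITION_FUNCTION) lies in the COMPUTATION range (0xC0-0xFF)
SELECT get_super_kind(240);

-- 148 (FLOW_LOOP) lies in the CONTROL_EFFECTS range (0x80-0xBF)
SELECT get_super_kind(148);
```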
get_kind(code)¶
Get kind:
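A sketch along the same lines as the other helpers, again assuming integer literals are accepted. The return representation is not documented here, so no output is annotated.

```sql
-- 240 (DEFINITION_FUNCTION) belongs to the DEFINITION kind (240-255)
SELECT get_kind(240);

-- 84 (NAME_QUALIFIED) belongs to the NAME kind (80-95)
SELECT get_kind(84);
```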
is_definition(code)¶
Check if definition:
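For illustration, two codes from the DEFINITION range of the Quick Reference table, written in the same style as the `is_control_flow` examples:

```sql
SELECT is_definition(240);  -- true (DEFINITION_FUNCTION)
SELECT is_definition(248);  -- true (DEFINITION_CLASS)
```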
is_call(code)¶
Check if function call:
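A one-line illustration using the COMPUTATION_CALL code from the Quick Reference table:

```sql
SELECT is_call(208);  -- true (COMPUTATION_CALL)
```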
is_control_flow(code)¶
Check if control flow:
SELECT is_control_flow(144); -- true (CONDITIONAL)
SELECT is_control_flow(148); -- true (LOOP)
SELECT is_control_flow(152); -- true (JUMP)
is_identifier(code)¶
Check if identifier:
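An illustrative call with the NAME_IDENTIFIER code from the Quick Reference table. Whether other codes in the NAME range, such as NAME_QUALIFIED (84), also match is not stated on this page.

```sql
SELECT is_identifier(80);  -- true (NAME_IDENTIFIER)
```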
Specific Type Predicates¶
Convenience macros for common semantic type checks:
Definition Predicates¶
-- Check for function definitions
SELECT * FROM read_ast('file.py') WHERE is_function_definition(semantic_type);
-- Check for class definitions
SELECT * FROM read_ast('file.py') WHERE is_class_definition(semantic_type);
-- Check for variable definitions
SELECT * FROM read_ast('file.py') WHERE is_variable_definition(semantic_type);
-- Check for module definitions
SELECT * FROM read_ast('file.py') WHERE is_module_definition(semantic_type);
-- Check for type definitions (typedef, type alias)
SELECT * FROM read_ast('file.py') WHERE is_type_definition(semantic_type);
Computation Predicates¶
-- Check for function/method calls
SELECT * FROM read_ast('file.py') WHERE is_function_call(semantic_type);
-- Check for member/property access
SELECT * FROM read_ast('file.py') WHERE is_member_access(semantic_type);
Literal Predicates¶
-- Check for string literals
SELECT * FROM read_ast('file.py') WHERE is_string_literal(semantic_type);
-- Check for number literals
SELECT * FROM read_ast('file.py') WHERE is_number_literal(semantic_type);
-- Check for boolean literals
SELECT * FROM read_ast('file.py') WHERE is_boolean_literal(semantic_type);
-- Check for any literal
SELECT * FROM read_ast('file.py') WHERE is_literal(semantic_type);
Control Flow Predicates¶
-- Check for conditionals (if/switch/match)
SELECT * FROM read_ast('file.py') WHERE is_conditional(semantic_type);
-- Check for loops (for/while/do)
SELECT * FROM read_ast('file.py') WHERE is_loop(semantic_type);
-- Check for jumps (return/break/continue/throw)
SELECT * FROM read_ast('file.py') WHERE is_jump(semantic_type);
Operator Predicates¶
-- Check for assignments
SELECT * FROM read_ast('file.py') WHERE is_assignment(semantic_type);
-- Check for comparisons
SELECT * FROM read_ast('file.py') WHERE is_comparison(semantic_type);
-- Check for arithmetic operations
SELECT * FROM read_ast('file.py') WHERE is_arithmetic(semantic_type);
-- Check for logical operations (and/or/not)
SELECT * FROM read_ast('file.py') WHERE is_logical(semantic_type);
External/Import Predicates¶
-- Check for import statements (import, from...import, use, require)
SELECT * FROM read_ast('file.py') WHERE is_import(semantic_type);
-- Check for export statements
SELECT * FROM read_ast('file.js') WHERE is_export(semantic_type);
-- Check for foreign function interface declarations
SELECT * FROM read_ast('file.rs') WHERE is_foreign(semantic_type);
Metadata Predicates¶
-- Check for comments
SELECT * FROM read_ast('file.py') WHERE is_comment(semantic_type);
-- Check for annotations/decorators
SELECT * FROM read_ast('file.py') WHERE is_annotation(semantic_type);
-- Check for preprocessor directives (#include, #define)
SELECT * FROM read_ast('file.c') WHERE is_directive(semantic_type);
Organization Predicates¶
-- Check for blocks/scopes
SELECT * FROM read_ast('file.py') WHERE is_block(semantic_type);
-- Check for lists/arrays/containers
SELECT * FROM read_ast('file.py') WHERE is_list(semantic_type);
Type Predicates¶
-- Check for primitive types (int, string, bool)
SELECT * FROM read_ast('file.go') WHERE is_type_primitive(semantic_type);
-- Check for composite types (struct, union, tuple)
SELECT * FROM read_ast('file.go') WHERE is_type_composite(semantic_type);
-- Check for reference/pointer types
SELECT * FROM read_ast('file.rs') WHERE is_type_reference(semantic_type);
-- Check for generic/template types
SELECT * FROM read_ast('file.ts') WHERE is_type_generic(semantic_type);
Filtering Patterns¶
By Exact Type¶
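Since string comparison against `semantic_type` works directly, an exact match needs only the type name. A small sketch that selects the `name` and `type` columns used elsewhere on this page:

```sql
-- Match a single semantic type by its name
SELECT name, type
FROM read_ast('file.py')
WHERE semantic_type = 'DEFINITION_CLASS';
```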
By Super Kind¶
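One way to select a whole super kind is to compare each node's super kind with that of a known code. This is a sketch under two assumptions: that `get_super_kind` accepts the `semantic_type` column the way `is_definition` does, and that it accepts an integer literal. Comparing against `get_super_kind(240)` avoids depending on how the result is represented.

```sql
-- All nodes in the same super kind as DEFINITION_FUNCTION (COMPUTATION)
SELECT name, type, semantic_type
FROM read_ast('file.py')
WHERE get_super_kind(semantic_type) = get_super_kind(240);
```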
By Kind¶
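A kind is identified by its super kind together with its kind bits, so a defensive filter can compare both against a reference code. The sketch below assumes `get_super_kind` and `get_kind` accept the `semantic_type` column as well as integer literals; it makes no assumption about the values they return.

```sql
-- All nodes in the same kind as FLOW_CONDITIONAL (FLOW_CONTROL, 144-159)
SELECT name, type, semantic_type
FROM read_ast('file.py')
WHERE get_super_kind(semantic_type) = get_super_kind(144)
  AND get_kind(semantic_type) = get_kind(144);
```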
Using Helper Functions¶
-- All definitions
SELECT * FROM read_ast('file.py')
WHERE is_definition(semantic_type);
-- All control flow
SELECT * FROM read_ast('file.py')
WHERE is_control_flow(semantic_type);
Semantic Refinements¶
Some types include refinements for more specific categorization.
Function Refinements¶
| Refinement | Description |
|---|---|
| REGULAR | Standard function |
| LAMBDA | Anonymous function |
| CONSTRUCTOR | Class constructor |
| GETTER | Property getter |
| SETTER | Property setter |
| ASYNC | Async function |
Variable Refinements¶
| Refinement | Description |
|---|---|
| MUTABLE | Mutable variable |
| IMMUTABLE | Constant |
| PARAMETER | Function parameter |
| FIELD | Class field |
Class Refinements¶
| Refinement | Description |
|---|---|
| REGULAR | Standard class |
| ABSTRACT | Abstract class/interface |
| ENUM | Enumeration |
| STRUCT | Struct type |
Loop Refinements¶
| Refinement | Description |
|---|---|
| ITERATOR | For/foreach loop |
| CONDITIONAL | While loop |
| INFINITE | Infinite loop |
Conditional Refinements¶
| Refinement | Description |
|---|---|
| BINARY | If/else |
| MULTIWAY | Switch/match |
| TERNARY | Ternary expression |
Universal Flags¶
In addition to semantic types, each node has a flags field for orthogonal properties.
Flag Values¶
| Flag | Value | Description |
|---|---|---|
| IS_CONSTRUCT | 0x01 | Semantic language construct (not punctuation) |
| IS_EMBODIED | 0x02 | Has body/implementation (definition vs declaration) |
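The two flag values are distinct bits, which suggests the field is a bitmask in which both flags can be set on the same node. A sketch in plain DuckDB SQL of how the bits combine, using literal values rather than real `flags` data; the column aliases are illustrative only:

```sql
-- Test each flag bit independently for every combination of the two flags.
SELECT
    flags,
    (flags & 1) <> 0 AS construct_bit,  -- IS_CONSTRUCT (0x01)
    (flags & 2) <> 0 AS embodied_bit    -- IS_EMBODIED (0x02)
FROM (VALUES (0), (1), (2), (3)) AS t(flags);
```

A value of 3 sets both bits, which would describe a semantic construct that also has a body.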
Flag Helper Functions¶
-- Check if node is a semantic construct
SELECT is_construct(flags) FROM read_ast('file.py');
-- Check if node has implementation body (definition vs declaration)
SELECT is_embodied(flags) FROM read_ast('file.cpp');
SELECT has_body(flags) FROM read_ast('file.cpp'); -- alias
Distinguishing Definitions from Declarations¶
-- Find only function definitions (with body), not forward declarations
SELECT name, file_path
FROM read_ast('**/*.cpp', ignore_errors := true)
WHERE semantic_type = 'DEFINITION_FUNCTION'
AND is_embodied(flags) -- Has implementation
AND name IS NOT NULL;
-- Find forward declarations only
SELECT name, file_path
FROM read_ast(['**/*.h', '**/*.hpp'], ignore_errors := true)
WHERE semantic_type = 'DEFINITION_FUNCTION'
AND NOT has_body(flags) -- Declaration only
AND name IS NOT NULL;
-- Using is_definition() for all definition types (functions, classes, variables)
SELECT name, semantic_type, file_path
FROM read_ast('**/*.cpp', ignore_errors := true)
WHERE is_definition(semantic_type)
AND has_body(flags)
AND name IS NOT NULL;
Cross-Language Examples¶
Functions Across Languages¶
-- Python: def, async def
-- JavaScript: function, arrow functions
-- Java: method_declaration
-- Go: function_declaration
-- All have semantic_type = 'DEFINITION_FUNCTION'
SELECT language, name, type
FROM read_ast(['**/*.py', '**/*.js', '**/*.java', '**/*.go'], ignore_errors := true)
WHERE semantic_type = 'DEFINITION_FUNCTION'
ORDER BY language, name;
Classes Across Languages¶
-- Python: class_definition
-- Java: class_declaration
-- TypeScript: class_declaration
-- C++: class_specifier
-- All have semantic_type = 'DEFINITION_CLASS'
SELECT language, name
FROM read_ast(['**/*.py', '**/*.java', '**/*.ts', '**/*.cpp'], ignore_errors := true)
WHERE semantic_type = 'DEFINITION_CLASS';
Next Steps¶
- Core Functions - Function reference
- Parameters - Parameter reference
- Cross-Language Analysis - Practical examples