SynthPath
Pattern-learning ARC solver. This system is a neural-guided program-synthesis solver that learns reusable patterns from solved tasks and compounds them across rounds.
92/400
TRAIN
8/400
EVAL
~14s
TIME
PIPELINE
PERCEPTION (39)
Each grid is parsed into a scene graph: objects detected via connected components, classified by role, linked by spatial relations, and summarized as a 64-dim feature vector for library recall.
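The object-detection step can be illustrated with a minimal sketch. It assumes 4-connectivity, same-color grouping, and color 0 as background; none of these choices is confirmed by the description above, and `connected_components` is a hypothetical name.

```python
from collections import deque


def connected_components(grid, background=0):
    """Return objects as (color, sorted list of (row, col) cells).

    Assumption: cells belong to the same object when they share a color
    and touch along an edge (4-connectivity).
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] == background:
                continue
            color = grid[r][c]
            cells, queue = [], deque([(r, c)])
            seen[r][c] = True
            # Breadth-first flood fill over same-colored neighbors.
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append((color, sorted(cells)))
    return objects


grid = [
    [1, 1, 0],
    [0, 0, 2],
    [3, 0, 2],
]
objects = connected_components(grid)
```

Role classification, spatial relations, and the 64-dim feature vector would be computed on top of the returned objects; they are not sketched here.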
CANDIDATE GENERATORS (60)
When no library pattern matches, beam search tries these generators. Each proposes candidate actions scored by minimum description length.
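A toy sketch of beam search with a description-length score follows. The action table, the per-action costs, and the residual term (one unit per mismatched cell) are illustrative assumptions, not the solver's actual generators or scoring.

```python
def flip_lr(g):
    return [row[::-1] for row in g]


def flip_ud(g):
    return g[::-1]


def transpose(g):
    return [list(r) for r in zip(*g)]


# Hypothetical action table: name -> (function, description-length cost).
ACTIONS = {
    "flip_lr": (flip_lr, 1.0),
    "flip_ud": (flip_ud, 1.0),
    "transpose": (transpose, 1.5),
}


def apply_program(program, grid):
    for name in program:
        grid = ACTIONS[name][0](grid)
    return grid


def residual(pred, target):
    """Cost of encoding what the program gets wrong (one unit per cell)."""
    if len(pred) != len(target) or len(pred[0]) != len(target[0]):
        return len(target) * len(target[0])
    return sum(p != t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))


def mdl_score(program, pairs):
    """Description length = program cost + residual error on training pairs."""
    cost = sum(ACTIONS[name][1] for name in program)
    return cost + sum(residual(apply_program(program, i), o) for i, o in pairs)


def beam_search(pairs, beam_width=3, max_depth=3):
    beam = [()]
    for _ in range(max_depth):
        expanded = sorted(
            (prog + (name,) for prog in beam for name in ACTIONS),
            key=lambda p: mdl_score(p, pairs),
        )
        # Verification stays exact: a program must reproduce every output.
        for prog in expanded:
            if all(apply_program(prog, i) == o for i, o in pairs):
                return prog
        beam = expanded[:beam_width]
    return None


pairs = [([[1, 2], [3, 4]], [[4, 3], [2, 1]])]  # 180-degree rotation
program = beam_search(pairs)
```

No single action solves this pair, so the search returns a two-step program that composes both flips.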
PATTERN DSL (57)
Learned patterns are stored as guard → bind → body programs. Guards check preconditions, binds extract task-specific values, and the body applies parameterized actions.
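The guard → bind → body shape can be sketched as three callables. The `Pattern` class, the helper names, and the recolor example are hypothetical and only illustrate the control flow; the real DSL representation is not shown in this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Grid = List[List[int]]


@dataclass
class Pattern:
    guard: Callable[[Grid, Grid], bool]            # precondition on a training pair
    bind: Callable[[Grid, Grid], Dict[str, int]]   # extract task-specific values
    body: Callable[[Grid, Dict[str, int]], Grid]   # apply the parameterized action

    def solve(self, train_in: Grid, train_out: Grid, test_in: Grid) -> Optional[Grid]:
        if not self.guard(train_in, train_out):
            return None
        params = self.bind(train_in, train_out)
        return self.body(test_in, params)


def same_dims(a, b):
    return len(a) == len(b) and all(len(x) == len(y) for x, y in zip(a, b))


def changed_pairs(a, b):
    """Set of (input color, output color) for every cell that changed."""
    return {(p, q) for ra, rb in zip(a, b) for p, q in zip(ra, rb) if p != q}


# Illustrative pattern: exactly one color is substituted by another.
recolor = Pattern(
    guard=lambda a, b: same_dims(a, b) and len(changed_pairs(a, b)) == 1,
    bind=lambda a, b: dict(zip(("src", "dst"), next(iter(changed_pairs(a, b))))),
    body=lambda g, p: [[p["dst"] if v == p["src"] else v for v in row] for row in g],
)

result = recolor.solve([[1, 0], [0, 1]], [[2, 0], [0, 2]], [[1, 1], [0, 3]])
rejected = recolor.solve([[1]], [[1, 1]], [[1]])  # guard fails: dims differ
```

Keeping the bind step separate is what lets one stored pattern apply to tasks that differ only in their concrete colors or sizes.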
CHANGELOG
2026-03-19 18:00 — Neural-guided search, tiered library, diagnostics infrastructure
HintNet multi-head classifier predicts task priors (size family, action kind, flags) to rerank library recall, candidates, and beam search — model only reranks, verification stays in control. Pattern library now uses candidate/promoted/rejected tiers: promotion requires ≥2 distinct provenance tasks and non-source success; failure-based auto-demotion for high-use low-success patterns; default recall returns only promoted patterns. New --diag mode in benchmark.py emits per-task JSON with stage timing, recall quality, candidate coverage, beam stats, failure taxonomy, and hint quality metrics. Bug fixes: pixel_infer now tries geometric extractors before color maps (fixes generalization failure on fliplr tasks); action-family grouping now consistent between training and runtime; expensive solver stages (pixel infer, body sweep, neural) now respect timeout budget.
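The tier rules above can be sketched as a small state update. The promotion condition follows the entry (at least 2 distinct provenance tasks plus a success on a non-source task); the demotion thresholds `min_uses` and `min_rate`, the choice to demote straight to the rejected tier, and all class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    CANDIDATE = "candidate"
    PROMOTED = "promoted"
    REJECTED = "rejected"


@dataclass
class PatternRecord:
    name: str
    provenance: set = field(default_factory=set)  # tasks the pattern was learned from
    uses: int = 0
    successes: int = 0
    non_source_success: bool = False
    tier: Tier = Tier.CANDIDATE

    def record_use(self, task_id, solved):
        self.uses += 1
        if solved:
            self.successes += 1
            if task_id not in self.provenance:
                self.non_source_success = True
        self.retier()

    def retier(self, min_uses=10, min_rate=0.2):
        # Hypothetical thresholds: demote high-use, low-success patterns.
        if self.uses >= min_uses and self.successes / self.uses < min_rate:
            self.tier = Tier.REJECTED
        elif (self.tier is Tier.CANDIDATE
              and len(self.provenance) >= 2
              and self.non_source_success):
            self.tier = Tier.PROMOTED


def recall(library, include_candidates=False):
    """Default recall returns only promoted patterns."""
    allowed = {Tier.PROMOTED}
    if include_candidates:
        allowed.add(Tier.CANDIDATE)
    return [p for p in library if p.tier in allowed]


good = PatternRecord("recolor", provenance={"t1", "t2"})
good.record_use("t3", solved=True)

noisy = PatternRecord("noisy", provenance={"t1"})
for i in range(10):
    noisy.record_use(f"x{i}", solved=False)

recalled = [p.name for p in recall([good, noisy])]
```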
2026-03-19 14:00 — Comprehensive solve diagnostics
Added SolveDiagnostics dataclass capturing per-task timing, recall quality, failure taxonomy, and hint accuracy metrics. New solve_with_diagnostics() function instruments all 8 pipeline stages with monotonic timing. Library gains recall_with_diagnostics() exposing cosine/hint_bonus/hybrid scores per recalled pattern. Benchmark gets a --diag flag that saves per-task JSON and prints an aggregated summary: solved-by-stage breakdown, failure taxonomy, recall hit rate, avg stage time, hint quality metrics. 25 new tests, 1,695 total, 100% coverage.
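Per-stage monotonic timing might look like the sketch below. The names SolveDiagnostics and solve_with_diagnostics come from this entry, but the fields, the `stage` context manager, the `stages` argument, and the failure label are assumptions; the real dataclass records more (recall quality, hint accuracy) than is shown here.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SolveDiagnostics:
    task_id: str
    stage_seconds: Dict[str, float] = field(default_factory=dict)
    solved_by: Optional[str] = None
    failure: Optional[str] = None

    @contextmanager
    def stage(self, name):
        # time.monotonic() is unaffected by system clock adjustments.
        start = time.monotonic()
        try:
            yield
        finally:
            self.stage_seconds[name] = time.monotonic() - start


def solve_with_diagnostics(task_id, stages):
    """Run (name, callable) stages in order; stop at the first answer."""
    diag = SolveDiagnostics(task_id)
    for name, fn in stages:
        with diag.stage(name):
            answer = fn()
        if answer is not None:
            diag.solved_by = name
            return answer, diag
    diag.failure = "no_stage_solved"  # hypothetical taxonomy label
    return None, diag


answer, diag = solve_with_diagnostics(
    "demo",
    [("library_recall", lambda: None), ("beam_search", lambda: [[1]])],
)
```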
2026-03-16 22:00 — Meta-strategy: auto-discover abstract rules
Built diff-driven synthesis that replaces blind beam search with structural analysis. Meta-strategy tries 10 abstract feature extractors (neighbor count, object size rank, border detection) and picks the simplest one consistent across all training pairs. Leave-one-out validation prevents overfitting. Key insight: pixel-level memorization doesn't generalize; abstract features (IS_BORDER, NEIGHBOR_COUNT, SIZE_RANK) do. Also: candidate filtering by same_dims/diff_dims, param-shape sub-grouping in extraction, first-step decomposition for partial pattern learning. 92/400 train (+4), 1,391 tests, 100% coverage.
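A reduced sketch of the meta-strategy: try extractors from simplest to most complex, keep the first whose (input color, feature) → output color rule is consistent on all training pairs, and confirm it with leave-one-out. Only two of the extractors are sketched, their definitions and ordering are assumptions, and the sketch covers same-dims tasks with at least two training pairs.

```python
def is_border(g, r, c):
    return r in (0, len(g) - 1) or c in (0, len(g[0]) - 1)


def neighbor_count(g, r, c):
    """Number of non-zero 4-neighbors (assumes 0 is background)."""
    return sum(
        1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if 0 <= r + dr < len(g) and 0 <= c + dc < len(g[0]) and g[r + dr][c + dc] != 0
    )


# Hypothetical ordering, simplest extractor first.
EXTRACTORS = [("IS_BORDER", is_border), ("NEIGHBOR_COUNT", neighbor_count)]


def learn_rule(extract, pairs):
    rule = {}
    for gin, gout in pairs:
        for r, row in enumerate(gin):
            for c, v in enumerate(row):
                key = (v, extract(gin, r, c))
                if rule.setdefault(key, gout[r][c]) != gout[r][c]:
                    return None  # same key maps to two outputs: inconsistent
    return rule


def apply_rule(extract, rule, g):
    # Unseen keys leave the cell unchanged.
    return [
        [rule.get((v, extract(g, r, c)), v) for c, v in enumerate(row)]
        for r, row in enumerate(g)
    ]


def passes_leave_one_out(extract, pairs):
    for i, (gin, gout) in enumerate(pairs):
        rule = learn_rule(extract, pairs[:i] + pairs[i + 1:])
        if rule is None or apply_rule(extract, rule, gin) != gout:
            return False
    return True


def pick_extractor(pairs):
    for name, extract in EXTRACTORS:
        if learn_rule(extract, pairs) is not None and passes_leave_one_out(extract, pairs):
            return name
    return None


border_task = [
    ([[0, 0, 0], [0, 0, 0], [0, 0, 0]], [[5, 5, 5], [5, 0, 5], [5, 5, 5]]),
    ([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
     [[5, 5, 5, 5], [5, 0, 0, 5], [5, 5, 5, 5]]),
]
chosen = pick_extractor(border_task)
```

A pixel-position rule would fail the held-out pair here because the two grids differ in width, while the IS_BORDER rule transfers unchanged.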
2026-03-16 18:00 — Learning loop fixes: 3 param coordination fixes
Three fixes to unblock the pattern learning loop. Fix 1: body executor normalizes short-form fold_symmetry modes (“lr”→“sym_lr”, etc.) so hypothesis params flow through the DSL pipeline. Fix 2: split mixed ActionKind groups — PATTERN_CONTINUATION, HOLLOW_RECT_OP, FRAME_FILL now have distinct signatures for cleaner extraction grouping. Fix 3: added selector_hint task property for property-based selector explanation in extract_object groups. 1,321 tests, 100% coverage.
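Fix 1 amounts to a small normalization step. The sketch below assumes every short form maps to its long form by adding a "sym_" prefix; only “lr”→“sym_lr” is documented above, and the function name is hypothetical.

```python
def normalize_fold_mode(mode: str) -> str:
    """Expand a short-form fold_symmetry mode to its long form.

    Assumption: long forms are the short form with a "sym_" prefix.
    Already-normalized modes pass through unchanged.
    """
    return mode if mode.startswith("sym_") else f"sym_{mode}"


short = normalize_fold_mode("lr")
already_long = normalize_fold_mode("sym_lr")
```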
2026-03-16 16:30 — 13 new BodyKinds for pattern generalization
Added 13 new BodyKind enum values with full implementations: UPSCALE_BLOCK, EXTRACT_BY_PREDICATE, STAMP_AT_MARKERS, SEPARATOR_SUMMARY, COLOR_SUBSTITUTE, STACK_CONCAT, DRAW_LINE, RIGID_SHIFT, DAMAGE_REPAIR, PATTERN_EXTEND, NEIGHBOR_RULE, SLIDE_TO_WALL, COUNT_ENCODE. These enable the learning loop to generalize hypothesis solutions into reusable patterns. MOVE and EXTEND_LINE unstubbed (delegate to RIGID_SHIFT and DRAW_LINE). Updated antiunify mappings for 16 action types. Pattern DSL body vocabulary: 21 → 34. 1,313 tests, 100% coverage.
2026-03-17 00:15 — Port 12 inference engines (batch 3)
Ported 12 more inference engines from ericagi as candidate generators: directed cross, gravity fill, bbox complement fill, translate to target, connect over bg, recolor to closest, diagonal stamp, object outline, row period fill/extend, col period extend, pattern substitution. 88/400 train (22.0%), 54 candidate generators, 1,168 tests, 100% coverage.
2026-03-16 23:30 — Output construction hypothesis families
Added 7 new hypothesis families targeting diff-dims tasks: separator cell summary, upscale block, stack objects, crop to colored region, select unique cell from separator grid, 1x1 summary, and input-as-template. These handle the 138 unsolved diff-dims tasks (crop, scale, summary, stacking categories). 88/400 train (+12), 17 hypotheses total, 1,226 tests, 100% coverage.
2026-03-16 22:45 — Hypothesis families + ported inference engines
Added 6 new hypothesis families (template stamp, color mapping, gravity, symmetry completion, fill enclosed, extract by predicate) for instant pattern recognition before beam search. Ported 5 inference engines as candidates (diagonal connect, cross extension, gravity align, gap fill, rigid shift). 76/400 train (+3), 10 hypotheses, 42 candidate generators, 1,050 tests, 100% coverage.
2026-03-16 20:30 — Role-aware candidate generators
Added 5 scene-graph-driven candidate generators that use role classification (FRAME, SEPARATOR, MARKER, TEMPLATE, LEGEND) to propose structured transformations: stamp template at markers, template recolor at markers, frame interior fill, separator grid operations, and legend color mapping. 73/400 train (+1), 951 tests, 100% coverage.