SynthPath

Pattern-learning ARC solver. This system is a neural-guided program-synthesis solver that learns reusable patterns from solved tasks and compounds them across rounds.

TRAIN 92/400

EVAL 8/400

TIME ~14s

PIPELINE

01 perceive: build scene graph, 64-dim feature vector
02 recall: cosine similarity against pattern library
03 apply: guard → bind → body → verify
04 search: beam search with candidate generators
05 extract: group solutions, anti-unify, validate
06 learn: add new patterns to library for next round
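
The recall step can be illustrated with a minimal sketch. It assumes the library is a plain mapping from pattern name to feature vector; the `recall` helper, its `top_k` and `min_score` parameters, and the 3-dim toy vectors (standing in for the 64-dim ones) are hypothetical and not taken from the project.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recall(query, library, top_k=3, min_score=0.0):
    """Return up to top_k (name, score) pairs, best match first. Hypothetical helper."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in library.items()]
    scored = [(score, name) for score, name in scored if score >= min_score]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, score) for score, name in scored[:top_k]]

# Toy library: three patterns with 3-dim features instead of 64-dim ones.
library = {
    "flip": [1.0, 0.0, 0.0],
    "recolor": [0.0, 1.0, 0.0],
    "crop": [0.7, 0.7, 0.0],
}
ranked = recall([1.0, 0.1, 0.0], library, top_k=2)
print(ranked)
```

Recalled patterns are only proposals in this sketch; the apply step would still have to verify them against the training pairs.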

PERCEPTION 39

Each grid is parsed into a scene graph: objects detected via connected components, classified by role, linked by spatial relations, and summarized as a 64-dim feature vector for library recall.

roles 6
FRAME, SEPARATOR, MARKER, TEMPLATE, LEGEND, CONTENT
relations 14
ADJACENT_4, ADJACENT_8, CONTAINS, SAME_ROW, SAME_COL, SAME_COLOR, SAME_SHAPE, ALIGNED_H, ALIGNED_V, ABOVE, BELOW, LEFT_OF, RIGHT_OF, INSIDE
properties 19
n_input_colors, n_output_colors, background_color, n_nonbg_colors, n_objects, h_ratio, w_ratio, has_symmetry, tile_mirror, max_object_size, min_object_size, smallest_object_color, largest_object_color, most_frequent_input_color, least_frequent_input_color, color_only_in_input, unique_output_color, changed_color, selector_hint
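
Object detection via connected components might look like the sketch below. It assumes 4-connectivity, same-color grouping, and background color 0; the source does not state which connectivity or background rule the solver uses, and the `connected_components` name and dict layout are illustrative.

```python
from collections import deque

def connected_components(grid, background=0):
    """Group same-colored, 4-connected, non-background cells into objects."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or grid[r][c] == background:
                continue
            color = grid[r][c]
            cells = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood over same-colored neighbors
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and grid[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append({"color": color, "cells": sorted(cells)})
    return objects

grid = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 0, 2],
]
objects = connected_components(grid)
print(objects)
```

Role classification and spatial relations would then be computed over these objects; that part is not sketched here.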

CANDIDATE GENERATORS 60

When no library pattern matches, beam search tries these generators. Each proposes candidate actions scored by minimum description length.

global 6
flip-h, flip-v, rotate-90, rotate-180, rotate-270, transpose
color 4
color-remap, object-recolor, object-removal, recolor-by-property
object 6
object-transform, per-object-mirror-rotate, sort-objects, extract-unique-object, neighbor-recolor, object-solver
extraction 5
crop-content, extract-by-role, frame-interior, minority-extraction, damage-extract
spatial 25
symmetry-completion, fill-enclosed, connect-same-color, gravity, cross-fill, line-extension, border-outline, flood-fill, diagonal-connect, cross-extension, gravity-align, gap-fill, rigid-shift, directed-cross, gravity-fill, bbox-complement-fill, translate-to-target, connect-over-bg, recolor-to-closest, diagonal-stamp, object-outline, row-period-fill, row-period-extend, col-period-extend, pattern-substitution
structural 9
tile, stamp-template, grid-decompose, pixel-rules, additive-extension, pattern-continuation, hollow-rect, damage-repair, upscale
role-aware 5
stamp-template-at-markers, template-recolor-at-markers, frame-fill, separator-grid-ops, legend-mapping
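
A toy version of beam search over generators is sketched below, using three of the global generators listed above. The score is program length plus the number of mismatched output cells, which is only a crude stand-in for minimum description length; the solver's actual scoring, beam width, and depth are not given in the source, so `beam_width`, `max_depth`, and all function names here are assumptions.

```python
def flip_h(grid):
    return [row[::-1] for row in grid]

def flip_v(grid):
    return [row[:] for row in grid[::-1]]

def transpose(grid):
    return [list(col) for col in zip(*grid)]

GENERATORS = {"flip-h": flip_h, "flip-v": flip_v, "transpose": transpose}

def run(program, grid):
    """Apply a tuple of generator names in order."""
    for name in program:
        grid = GENERATORS[name](grid)
    return grid

def residual(pred, target):
    """Number of wrong cells; a shape mismatch counts as every cell wrong."""
    if len(pred) != len(target) or len(pred[0]) != len(target[0]):
        return len(target) * len(target[0])
    return sum(p != t for prow, trow in zip(pred, target) for p, t in zip(prow, trow))

def description_length(program, pairs):
    """Crude two-part cost: program length plus cells still to be corrected."""
    return len(program) + sum(residual(run(program, i), o) for i, o in pairs)

def beam_search(pairs, beam_width=3, max_depth=3):
    beam = [()]
    for _ in range(max_depth):
        expanded = {prog + (name,) for prog in beam for name in GENERATORS}
        beam = sorted(expanded, key=lambda p: (description_length(p, pairs), p))[:beam_width]
        for prog in beam:
            if all(run(prog, i) == o for i, o in pairs):  # verification decides
                return prog
    return None

# One training pair: the output is the input rotated 90 degrees clockwise.
pairs = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]
program = beam_search(pairs)
print(program)
```

No single generator solves this pair, so the search has to compose two of them; a program is only returned once it reproduces every training output exactly.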

PATTERN DSL 57

Learned patterns are stored as guard → bind → body programs. Guards check preconditions, binds extract task-specific values, and the body applies parameterized actions.

guard 13
SAME_DIMS, DIFFERENT_DIMS, MIN_OBJECTS, MAX_OBJECTS, HAS_COLOR_CHANGE, HAS_SYMMETRY, HAS_SEPARATOR, HAS_MARKERS, IS_ADDITIVE, OBJECTS_VARY_BY, AND, OR, NOT
bind 10
CONSTANT, BACKGROUND_COLOR, ALL_OBJECTS, FILTER_OBJECTS, PROPERTY, RANK_BY, LEARN_COLOR_MAP, DIFF_COLORS, UNIQUE_IN_OUTPUT, MARKER_POSITIONS
body 34
RECOLOR, REMOVE, MOVE, MIRROR_OBJ, ROTATE_OBJ, MIRROR_GRID, ROTATE_GRID, TRANSPOSE, GRAVITY, FILL_ENCLOSED, FLOOD_FILL, EXTEND_LINE, CONNECT, CROP_CONTENT, TILE, FOLD_SYMMETRY, GRID_DECOMPOSE, FOR_EACH, SEQUENCE, CONDITIONAL, LOOKUP, UPSCALE_BLOCK, EXTRACT_BY_PREDICATE, STAMP_AT_MARKERS, SEPARATOR_SUMMARY, COLOR_SUBSTITUTE, STACK_CONCAT, DRAW_LINE, RIGID_SHIFT, DAMAGE_REPAIR, PATTERN_EXTEND, NEIGHBOR_RULE, SLIDE_TO_WALL, COUNT_ENCODE
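
The guard → bind → body → verify flow can be sketched with one hand-written pattern. The `Pattern` dataclass and `apply_pattern` function are hypothetical; the three callables are loosely modeled on the SAME_DIMS guard, LEARN_COLOR_MAP bind, and RECOLOR body named above, but their real signatures are not shown in the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]

@dataclass
class Pattern:  # hypothetical container, not the project's class
    name: str
    guard: Callable[[Grid, Grid], bool]
    bind: Callable[[List[Pair]], Optional[Dict[str, object]]]
    body: Callable[[Grid, Dict[str, object]], Grid]

def same_dims(inp: Grid, out: Grid) -> bool:
    return len(inp) == len(out) and len(inp[0]) == len(out[0])

def learn_color_map(train_pairs: List[Pair]) -> Optional[Dict[str, object]]:
    """Bind a per-color mapping; give up if any color maps to two outputs."""
    mapping: Dict[int, int] = {}
    for inp, out in train_pairs:
        for row_in, row_out in zip(inp, out):
            for a, b in zip(row_in, row_out):
                if mapping.setdefault(a, b) != b:
                    return None
    return {"color_map": mapping}

def recolor(grid: Grid, bindings: Dict[str, object]) -> Grid:
    cmap = bindings["color_map"]
    return [[cmap.get(v, v) for v in row] for row in grid]

def apply_pattern(pattern: Pattern, train_pairs: List[Pair], test_input: Grid) -> Optional[Grid]:
    if not all(pattern.guard(i, o) for i, o in train_pairs):   # guard
        return None
    bindings = pattern.bind(train_pairs)                       # bind
    if bindings is None:
        return None
    if any(pattern.body(i, bindings) != o for i, o in train_pairs):  # verify
        return None
    return pattern.body(test_input, bindings)                  # body on the test input

recolor_pattern = Pattern("recolor-by-map", same_dims, learn_color_map, recolor)
train = [([[1, 0], [0, 1]], [[2, 0], [0, 2]])]
prediction = apply_pattern(recolor_pattern, train, [[1, 1], [0, 0]])
print(prediction)
```

Verification here replays the body on every training input, so a pattern whose bindings do not explain all training outputs is rejected before it touches the test grid.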

CHANGELOG

2026-03-19 18:00 — Neural-guided search, tiered library, diagnostics infrastructure

HintNet, a multi-head classifier, predicts task priors (size family, action kind, flags) that rerank library recall, candidates, and beam search. The model only reranks; verification stays in control. The pattern library now uses candidate/promoted/rejected tiers: promotion requires ≥2 distinct provenance tasks and non-source success, high-use low-success patterns are auto-demoted on failure, and default recall returns only promoted patterns. A new --diag mode in benchmark.py emits per-task JSON with stage timing, recall quality, candidate coverage, beam stats, failure taxonomy, and hint quality metrics. Bug fixes: pixel_infer now tries geometric extractors before color maps (fixes a generalization failure on fliplr tasks); action-family grouping is now consistent between training and runtime; expensive solver stages (pixel infer, body sweep, neural) now respect the timeout budget.
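
The tier rules can be made concrete with a small sketch. The `PatternRecord` fields and the two demotion thresholds are assumptions chosen for illustration; the entry above states the promotion criteria but not the numbers behind "high-use" or "low-success".

```python
from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    CANDIDATE = "candidate"
    PROMOTED = "promoted"
    REJECTED = "rejected"

# Hypothetical thresholds; the real values are not given in the changelog.
MIN_USES_FOR_DEMOTION = 10
MIN_SUCCESS_RATE = 0.2

@dataclass
class PatternRecord:  # hypothetical record layout
    name: str
    provenance: set = field(default_factory=set)  # tasks the pattern was learned from
    solved: set = field(default_factory=set)      # tasks the pattern has solved
    uses: int = 0
    successes: int = 0
    tier: Tier = Tier.CANDIDATE

def update_tier(rec):
    """Demote heavily used, rarely successful patterns; otherwise promote if eligible."""
    non_source_success = bool(rec.solved - rec.provenance)
    if rec.uses >= MIN_USES_FOR_DEMOTION and rec.successes / rec.uses < MIN_SUCCESS_RATE:
        rec.tier = Tier.REJECTED
    elif len(rec.provenance) >= 2 and non_source_success:
        rec.tier = Tier.PROMOTED
    return rec.tier

def default_recall(records):
    """Default recall only sees promoted patterns."""
    return [r.name for r in records if r.tier is Tier.PROMOTED]

records = [
    PatternRecord("recolor", {"t1", "t2"}, {"t1", "t2", "t3"}, uses=3, successes=3),
    PatternRecord("crop", {"t1"}, {"t1"}, uses=1, successes=1),
    PatternRecord("tile", {"t1", "t2"}, {"t3"}, uses=20, successes=1),
]
for rec in records:
    update_tier(rec)
print(default_recall(records))
```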

2026-03-19 14:00 — Comprehensive solve diagnostics

Added a SolveDiagnostics dataclass capturing per-task timing, recall quality, failure taxonomy, and hint accuracy metrics. The new solve_with_diagnostics() function instruments every pipeline stage (8 stages) with monotonic timing. The library gains recall_with_diagnostics(), which exposes cosine/hint_bonus/hybrid scores per recalled pattern. The benchmark gets a --diag flag that saves per-task JSON and prints an aggregated summary: solved-by-stage breakdown, failure taxonomy, recall hit rate, average stage time, and hint quality metrics. 25 new tests, 1,695 total, 100% coverage.
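
Per-stage monotonic timing could be instrumented roughly as follows. `StageTimings` and `solve_with_timings` are hypothetical stand-ins, not the project's SolveDiagnostics or solve_with_diagnostics(); only `time.monotonic()` and the standard-library helpers are real APIs.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class StageTimings:  # hypothetical, much smaller than a full diagnostics record
    stage_seconds: dict = field(default_factory=dict)
    solved_by: Optional[str] = None

    @contextmanager
    def stage(self, name):
        """Accumulate wall time for one stage using a clock that never goes backwards."""
        start = time.monotonic()
        try:
            yield
        finally:
            elapsed = time.monotonic() - start
            self.stage_seconds[name] = self.stage_seconds.get(name, 0.0) + elapsed

def solve_with_timings(task, stages):
    """Run stages in order, timing each, and stop at the first that returns a result."""
    diag = StageTimings()
    for name, stage_fn in stages:
        with diag.stage(name):
            result = stage_fn(task)
        if result is not None:
            diag.solved_by = name
            return result, diag
    return None, diag

# Toy stages: recall finds nothing, search "solves" by reversing the list.
stages = [("recall", lambda task: None), ("search", lambda task: task[::-1])]
result, diag = solve_with_timings([1, 2, 3], stages)
print(result, diag.solved_by, sorted(diag.stage_seconds))
```

A monotonic clock is a reasonable choice for stage timing because system clock adjustments cannot produce negative or inflated durations.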

2026-03-16 22:00 — Meta-strategy: auto-discover abstract rules

Built diff-driven synthesis that replaces blind beam search with structural analysis. Meta-strategy tries 10 abstract feature extractors (neighbor count, object size rank, border detection) and picks the simplest one consistent across all training pairs. Leave-one-out validation prevents overfitting. Key insight: pixel-level memorization doesn't generalize; abstract features (IS_BORDER, NEIGHBOR_COUNT, SIZE_RANK) do. Also: candidate filtering by same_dims/diff_dims, param-shape sub-grouping in extraction, first-step decomposition for partial pattern learning. 92/400 train (+4), 1,391 tests, 100% coverage.
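
The selection idea can be sketched with two toy extractors. Everything here is an assumption for illustration: the extractor implementations, the rule form (a lookup from input color and feature value to output color), the use of list order as the measure of simplicity, and the restriction to same-size input/output grids.

```python
def is_border(grid, r, c):
    return r in (0, len(grid) - 1) or c in (0, len(grid[0]) - 1)

def neighbor_count(grid, r, c):
    """Number of non-zero 4-neighbors."""
    h, w = len(grid), len(grid[0])
    return sum(1 for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
               if 0 <= r + dr < h and 0 <= c + dc < w and grid[r + dr][c + dc] != 0)

# Ordered from simplest to most complex (ordering is an assumption).
EXTRACTORS = [("IS_BORDER", is_border), ("NEIGHBOR_COUNT", neighbor_count)]

def learn_rule(extract, pairs):
    """Table from (input color, feature) to output color; None if inconsistent."""
    table = {}
    for inp, out in pairs:
        for r, row in enumerate(inp):
            for c, v in enumerate(row):
                key = (v, extract(inp, r, c))
                if table.setdefault(key, out[r][c]) != out[r][c]:
                    return None
    return table

def predict(extract, table, grid):
    result = []
    for r, row in enumerate(grid):
        keys = [(v, extract(grid, r, c)) for c, v in enumerate(row)]
        if any(key not in table for key in keys):
            return None  # the rule has never seen this situation
        result.append([table[key] for key in keys])
    return result

def leave_one_out_ok(extract, pairs):
    """Each pair must be predicted by a rule learned from the other pairs only."""
    for k, (held_in, held_out) in enumerate(pairs):
        table = learn_rule(extract, pairs[:k] + pairs[k + 1:])
        if table is None or predict(extract, table, held_in) != held_out:
            return False
    return True

def pick_rule(pairs):
    for name, extract in EXTRACTORS:
        if learn_rule(extract, pairs) is not None and leave_one_out_ok(extract, pairs):
            return name
    return None

# Task: paint the border with color 5, leave the interior unchanged.
pairs = [
    ([[0] * 3 for _ in range(3)], [[5, 5, 5], [5, 0, 5], [5, 5, 5]]),
    ([[0] * 4 for _ in range(4)], [[5, 5, 5, 5], [5, 0, 0, 5], [5, 0, 0, 5], [5, 5, 5, 5]]),
]
chosen = pick_rule(pairs)
print(chosen)
```

A rule keyed on raw pixel coordinates would fit the 3x3 pair but fail on the held-out 4x4 pair, which is the kind of overfitting the leave-one-out check is meant to catch.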

2026-03-16 18:00 — Learning loop fixes: 3 param coordination fixes

Three fixes to unblock the pattern learning loop. Fix 1: body executor normalizes short-form fold_symmetry modes (“lr”→“sym_lr”, etc.) so hypothesis params flow through the DSL pipeline. Fix 2: split mixed ActionKind groups — PATTERN_CONTINUATION, HOLLOW_RECT_OP, FRAME_FILL now have distinct signatures for cleaner extraction grouping. Fix 3: added selector_hint task property for property-based selector explanation in extract_object groups. 1,321 tests, 100% coverage.
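
Fix 1 amounts to a small normalization step before dispatch. The sketch below generalizes from the single documented example ("lr" → "sym_lr") to a prefix rule; that generalization and the function name are assumptions, since the other mode names are not listed.

```python
def normalize_fold_mode(mode):
    """Map a short-form mode such as "lr" to its long form "sym_lr" (idempotent)."""
    prefix = "sym_"
    return mode if mode.startswith(prefix) else prefix + mode

print(normalize_fold_mode("lr"))
```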

2026-03-16 16:30 — 13 new BodyKinds for pattern generalization

Added 13 new BodyKind enum values with full implementations: UPSCALE_BLOCK, EXTRACT_BY_PREDICATE, STAMP_AT_MARKERS, SEPARATOR_SUMMARY, COLOR_SUBSTITUTE, STACK_CONCAT, DRAW_LINE, RIGID_SHIFT, DAMAGE_REPAIR, PATTERN_EXTEND, NEIGHBOR_RULE, SLIDE_TO_WALL, COUNT_ENCODE. These enable the learning loop to generalize hypothesis solutions into reusable patterns. MOVE and EXTEND_LINE unstubbed (delegate to RIGID_SHIFT and DRAW_LINE). Updated antiunify mappings for 16 action types. Pattern DSL body vocabulary: 21 → 34. 1,313 tests, 100% coverage.
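
Generalizing concrete solutions into a reusable pattern is typically done by anti-unification. The sketch below assumes solutions are encoded as nested tuples of the form (KIND, arg, ...); that encoding and the `?x` variable naming are illustrative and unrelated to the project's actual antiunify mappings.

```python
def anti_unify(a, b, holes=None):
    """Least general generalization of two terms; differing parts become variables."""
    if holes is None:
        holes = {}
    if a == b:
        return a
    same_shape = (isinstance(a, tuple) and isinstance(b, tuple)
                  and a and b and len(a) == len(b) and a[0] == b[0])
    if same_shape:
        # Same action kind and arity: keep the head, generalize the arguments.
        return (a[0],) + tuple(anti_unify(x, y, holes) for x, y in zip(a[1:], b[1:]))
    # The same pair of differing values always maps to the same variable.
    return holes.setdefault((a, b), f"?x{len(holes)}")

solution_1 = ("SEQUENCE", ("RECOLOR", 1, 2), ("MOVE", 1))
solution_2 = ("SEQUENCE", ("RECOLOR", 3, 2), ("MOVE", 3))
generalized = anti_unify(solution_1, solution_2)
print(generalized)
```

The shared variable in both steps records that the recolored color and the moved color co-vary across tasks, which is the kind of parameter a bind step would later fill in.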

2026-03-17 00:15 — Port 12 inference engines (batch 3)

Ported 12 more inference engines from ericagi as candidate generators: directed cross, gravity fill, bbox complement fill, translate to target, connect over bg, recolor to closest, diagonal stamp, object outline, row period fill/extend, col period extend, pattern substitution. 88/400 train (22.0%), 54 candidate generators, 1,168 tests, 100% coverage.

2026-03-16 23:30 — Output construction hypothesis families

Added 7 new hypothesis families targeting diff-dims tasks: separator cell summary, upscale block, stack objects, crop to colored region, select unique cell from separator grid, 1x1 summary, and input-as-template. These handle the 138 unsolved diff-dims tasks (crop, scale, summary, stacking categories). 88/400 train (+12), 17 hypotheses total, 1,226 tests, 100% coverage.
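
As one example of a diff-dims hypothesis, the upscale-block family can be sketched as below. It assumes a uniform integer factor in both dimensions; the function names are hypothetical and the project's implementation is not shown in the source.

```python
def upscale(grid, factor):
    """Replace every cell with a factor x factor block of the same color."""
    return [[v for v in row for _ in range(factor)]
            for row in grid for _ in range(factor)]

def infer_upscale_factor(inp, out):
    """Return the factor if out is an exact upscale of inp, else None."""
    h, w = len(inp), len(inp[0])
    out_h, out_w = len(out), len(out[0])
    if out_h % h or out_w % w or out_h // h != out_w // w:
        return None
    factor = out_h // h
    return factor if upscale(inp, factor) == out else None

small = [[1, 2]]
big = upscale(small, 2)
print(big, infer_upscale_factor(small, big))
```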

2026-03-16 22:45 — Hypothesis families + ported inference engines

Added 6 new hypothesis families (template stamp, color mapping, gravity, symmetry completion, fill enclosed, extract by predicate) for instant pattern recognition before beam search. Ported 5 inference engines as candidates (diagonal connect, cross extension, gravity align, gap fill, rigid shift). 76/400 train (+3), 10 hypotheses, 42 candidate generators, 1,050 tests, 100% coverage.

2026-03-16 20:30 — Role-aware candidate generators

Added 5 scene-graph-driven candidate generators that use role classification (FRAME, SEPARATOR, MARKER, TEMPLATE, LEGEND) to propose structured transformations: stamp template at markers, template recolor at markers, frame interior fill, separator grid operations, and legend color mapping. 73/400 train (+1), 951 tests, 100% coverage.
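
A stamp-template-at-markers generator might work as in this sketch. It assumes markers are single cells of a known color, the template is anchored at its top-left corner, zero cells in the template are transparent, and stamps are clipped at the grid edge; none of these choices are confirmed by the source.

```python
def stamp_at_markers(grid, template, marker_color):
    """Copy the template onto the grid at every marker cell."""
    out = [row[:] for row in grid]
    markers = [(r, c) for r, row in enumerate(grid)
               for c, v in enumerate(row) if v == marker_color]
    for r, c in markers:
        for dr, template_row in enumerate(template):
            for dc, v in enumerate(template_row):
                rr, cc = r + dr, c + dc
                # Skip transparent cells and anything that falls off the grid.
                if v != 0 and 0 <= rr < len(out) and 0 <= cc < len(out[0]):
                    out[rr][cc] = v
    return out

grid = [
    [5, 0, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 0],
]
template = [
    [1, 1],
    [0, 1],
]
stamped = stamp_at_markers(grid, template, marker_color=5)
print(stamped)
```

In the role-aware setting, the marker positions and the template would come from the scene graph's MARKER and TEMPLATE roles rather than being passed in by hand.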