Locy Notebook Plan: Semiconductor Yield Excursion Triage¶
This plan defines a flagship, notebook-first Locy story for non-automotive manufacturing. Goal: show how Locy helps process engineers move from noisy telemetry to explainable containment decisions.
1) Audience¶
- New Locy users: need clear schema-first setup and guided rule reading.
- Data/Process engineers: need realistic anomaly and containment logic.
- Advanced logic users: want
DERIVE,ASSUME,ABDUCE, andEXPLAINin one coherent workflow.
2) Data Backbone¶
Primary source data:
- UCI SECOM dataset (real semiconductor pass/fail + timestamp + sensor features)
- Citation DOI:
10.24432/C54305 - License: CC BY 4.0
Notebook-local prepared data:
examples/data/locy_semiconductor_yield_excursion/secom_lots.csvexamples/data/locy_semiconductor_yield_excursion/secom_feature_catalog.csvexamples/data/locy_semiconductor_yield_excursion/secom_excursions.csvexamples/data/locy_semiconductor_yield_excursion/secom_notebook_cases.csv
3) Story Arc¶
Narrative:
- Fab sees rising end-of-line failures.
- Engineers need to identify likely process module/tool contributors.
- They must decide hold/rework scope before scrap spreads.
- They need a defensible explanation for each action.
Notebook outcome:
- Surface likely excursion paths.
- Simulate containment options.
- Propose root-cause hypotheses.
- Explain each recommendation with rule-level provenance.
4) Notebook Structure (Detailed)¶
Section A: Setup and Context¶
- Explain business objective and what the reader will build.
- Import dependencies and open an isolated database.
- State that schema mode is required/recommended.
Section B: Schema-First Modeling¶
Define labels:
Lot(lot_id, yield_outcome, test_timestamp)Feature(feature_id, module, tool_id, effect_size, selected)Tool(tool_id, module)Module(name)
Define edges:
(:Lot)-[:OBSERVED_EXCURSION]->(:Feature)(:Feature)-[:MEASURED_ON]->(:Tool)(:Tool)-[:PART_OF]->(:Module)
Section C: Ingest Real Prepared Data¶
- Load CSV files.
- Insert lots, selected features, tools/modules, and excursion edges.
- Print small sanity checks:
- lot count
- fail count
- excursion edge count
Section D: Baseline DERIVE¶
Rules:
failed_lotfromLot(yield_outcome='FAIL')module_signalfrom failed lot excursions projected to module.tool_signalfrom failed lot excursions projected to tool.- Optional transitive relation
module_risk_pathfor module-level rollups.
Query examples:
- top modules by number of failed-lot excursions.
- top tools within the leading module.
Section E: Counterfactual ASSUME¶
Scenarios:
- Assume temporary hold on one tool.
- Assume tighter excursion threshold policy for one module.
Derivations:
contained_fail_lotunder the assumption.residual_fail_lotnot contained by the assumption.
Expected teaching point:
ASSUMElets readers test interventions before operational changes.
Section F: Root Cause Hypothesis with ABDUCE¶
Task:
- For a selected high-failure lot cohort, find smallest tool/module hypotheses that explain observed excursion pattern.
Expected output:
- ranked candidate hypotheses (for example, tool sets or module-level hypotheses).
Teaching point:
ABDUCEprovides plausible explanations, not guaranteed truth.
Section G: Decision Justification with EXPLAIN¶
Explain these:
- why a lot is classified high risk.
- why a lot remains residual risk after a chosen assumption.
Expected output:
- compact proof/trace path showing contributing rules and facts.
Teaching point:
- operational decisions become auditable and teachable.
Section H: Final Validation Checklist¶
- one module/tool should clearly dominate risk in this dataset slice.
- assumption should reduce impacted fail-lot set.
- abduced hypotheses should align with derived high-signal module/tool.
- explanations should reference concrete facts and derived relations.
5) Markdown-First Pedagogy Requirements¶
Each code cell must be preceded by markdown that states:
- what this cell does.
- why it matters operationally.
- what output shape to expect.
For Locy cells, add short reading tips:
- how to read each
CREATE RULE. - what relation keys represent.
- what would indicate a modeling bug.
6) Scope for v1 Notebook¶
In scope:
- end-to-end reproducible notebook using prepared SECOM-derived data.
- one full thread demonstrating
DERIVE->ASSUME->ABDUCE->EXPLAIN. - schema-first modeling and explicit expected outcomes.
Out of scope (v1):
- online feature engineering notebooks.
- training ML models in-notebook.
- real fab-identifiable tool naming.
7) Follow-on v2 Ideas¶
- Add temporal windows (
within last N hours) for excursion propagation. - Add maintenance logs as another fact source.
- Add multi-line manufacturing module support and comparative risk scoring.