# Semantic Association Metrics

*Measurement framework for semantic spread and fidelity*
A methodology for measuring semantic properties of word associations using embedding-based metrics. Implements the Divergent Association Task (DAT) framework for spread scoring and a novel fidelity metric for task validity. Provides the scoring foundation for the INS-001 Semantic Cartography instrument family, enabling measurement of how participants navigate conceptual space.
## Core Constructs
This methodology operationalizes four constructs from the INS-001 Semantic Cartography instrument family:
| Construct | Definition | Operationalization |
|---|---|---|
| Spread | How much conceptual territory responses cover | Mean pairwise semantic distance (DAT methodology) |
| Fidelity | How well clues jointly identify the bridging task | Coverage × efficiency of foil elimination |
| Communicability | Whether meaning survives transmission to another agent | Reconstruction accuracy by partner or LLM |
| Surprisal | How unexpected a semantic transition is | Negative log-probability against reference network |
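Of the four constructs, surprisal has the most direct formula: the negative log-probability of a semantic transition under a reference association network. A minimal sketch (the probability would come from the reference network; the values in the comment are illustrative):

```python
import math

def surprisal(transition_prob: float) -> float:
    """Surprisal in bits: negative log2-probability of a semantic
    transition under a reference association network."""
    return -math.log2(transition_prob)

# A common transition (p = 0.5) carries 1 bit of surprisal;
# a rare transition (p = 0.01) carries ~6.6 bits.
```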
## Theoretical Foundation
The scoring framework draws on two established research traditions:
**Divergent Association Task (DAT).** Olson et al. (2021) demonstrated that mean pairwise semantic distance between unrelated words correlates at r ≈ 0.40 with composite creativity measures. This provides empirical validation for using semantic spread as a proxy for divergent thinking capacity.
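The DAT-style spread score can be sketched as mean pairwise cosine distance over word embeddings. In practice the embeddings would come from the project's reference model; the pure-Python cosine distance below is for illustration only:

```python
import itertools
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def spread(embeddings):
    """DAT-style spread: mean pairwise cosine distance across
    all response embeddings."""
    pairs = list(itertools.combinations(embeddings, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)
```

Higher values mean responses cover more conceptual territory; identical embeddings score 0, orthogonal embeddings score 1 per pair.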
**Distributional Semantics.** Cosine similarity in embedding space serves as a standard measure of semantic relatedness. Hill, Reichart & Korhonen (2015) distinguish between similarity (categorical membership) and relatedness (associative connection); embeddings capture the latter more reliably.
For caveats regarding embedding validity and cultural bias, see LIB-002: Digital Validity.
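The fidelity construct is defined above as coverage × efficiency of foil elimination. The exact algorithm lives in MTH-002.1; the function below is one hypothetical reading of that decomposition, where each clue rules out a set of foils, coverage rewards eliminating many foils, and efficiency penalizes redundant eliminations:

```python
def fidelity(eliminations: list[set[str]], foils: set[str]) -> float:
    """Hypothetical fidelity sketch: coverage x efficiency.

    eliminations[i] is the set of foils ruled out by clue i
    (how these sets are derived is specified in MTH-002.1,
    not here)."""
    eliminated = set().union(*eliminations) & foils
    coverage = len(eliminated) / len(foils)
    # Total eliminations across clues, counting overlaps;
    # redundant clues lower efficiency.
    total_hits = sum(len(e & foils) for e in eliminations)
    efficiency = len(eliminated) / total_hits if total_hits else 0.0
    return coverage * efficiency
```

Two clues that jointly eliminate distinct foils score higher than two clues that eliminate the same ones.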
## Metric Architecture
Both INS-001.1 (Signal) and INS-001.2 (Common Ground) use exactly two primary metrics:
- Spread — How much semantic territory do the clues cover?
- Fidelity — How well do clues jointly identify the bridging task?
These metrics are empirically near-orthogonal (r = 0.055), yielding four distinct interpretation patterns:
| Pattern | Spread | Fidelity | Interpretation |
|---|---|---|---|
| Creative & precise | High | High | Wide-ranging associations that triangulate the target |
| Divergent but off-task | High | Low | Scattered responses that don’t constrain the solution |
| Conventional but effective | Low | High | Clustered but accurate identification |
| Constrained & off-task | Low | Low | Limited exploration, poor task engagement |
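The quadrant table above maps directly onto a two-threshold classifier. The cut points below are placeholder values, not the calibrated thresholds (those are derived from observed distributions in MTH-002.1):

```python
def classify(spread: float, fidelity: float,
             spread_cut: float = 0.5, fidelity_cut: float = 0.5) -> str:
    """Map a (spread, fidelity) score pair onto the four
    interpretation patterns. Cut points are illustrative
    placeholders, not calibrated thresholds."""
    if spread >= spread_cut:
        return ("creative & precise" if fidelity >= fidelity_cut
                else "divergent but off-task")
    return ("conventional but effective" if fidelity >= fidelity_cut
            else "constrained & off-task")
```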
## Studies in This Family
| Study | Title | Focus | Status |
|---|---|---|---|
| MTH-002.1 | Spread and Fidelity Scoring | Core scoring algorithms and thresholds | Published |
| MTH-002.2 | Communicability Metrics | Reconstruction accuracy measurement | Calibrating |
| MTH-002.3 | Surprisal Scoring | Transition probability against reference networks | Calibrating |
| MTH-002.4 | Population Normalization | Bootstrap null distributions and percentile ranking | Calibrating |
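MTH-002.4's bootstrap approach can be sketched generically: build a null distribution by scoring random word samples from a pool, then report where an observed score falls in that distribution. The `score_fn` and pool here stand in for the actual spread scorer and word list:

```python
import random

def bootstrap_null(score_fn, pool, n_words, n_iter=1000, seed=0):
    """Null distribution: scores of random n_words-sized samples
    drawn from a word pool. score_fn stands in for the actual
    spread scorer."""
    rng = random.Random(seed)
    return [score_fn(rng.sample(pool, n_words)) for _ in range(n_iter)]

def percentile_rank(observed, null_scores):
    """Fraction of null scores at or below the observed score."""
    return sum(s <= observed for s in null_scores) / len(null_scores)
```

A participant scoring above the 95th percentile of the null would be spreading more widely than random sampling from the pool.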
## Implementation
The scoring algorithms are implemented in:

- Primary: `scoring.py`
- Specification: `INS-001-scoring-metrics-v2.md`
## Limitations
| Limitation | Impact | Mitigation |
|---|---|---|
| Embedding model bias | Cultural and frequency effects in word representations | Document reference model; note Western/English bias |
| Threshold calibration | Interpretation bands require empirical validation | Derive from observed distributions; update with data |
| Single embedding model | Results depend on specific model choice | Use standardized model (text-embedding-3-small); note in methods |
## Changelog
| Version | Date | Changes |
|---|---|---|
| 2.0 | 2026-01-18 | Updated terminology: “relevance” → “fidelity”, “divergence” → “spread”; updated metric architecture to reflect orthogonal metrics |
| 1.0 | 2026-01-15 | Initial publication |