Gated Cross-Attention Matcher for Aligning Course with Relevant KUs

ABSTRACT

Curriculum alignment remains a persistent challenge in academic program design, accreditation processes, and assessment development. Course syllabi and outcome tables typically describe instructional intent at a broad level, whereas examinations evaluate fine-grained competencies through micro-assessments such as short-answer prompts, sub-topics, and specific skill checks. This cross-granularity mismatch, combined with inconsistent phrasing and boilerplate educational language, makes traditional keyword search and generic embedding retrieval methods unreliable.

To address this problem, we propose a retrieval-based alignment framework centered on a lightweight Gated Cross-Attention Module (GCAM) that improves robustness, adaptability, and interpretability. The system begins with a structured data layer that ingests raw curriculum artifacts, including syllabi and learning outcome tables. These documents are parsed and normalized into Knowledge Units (KUs), each represented by a title, a list of topics, and associated learning outcomes.

The model layer employs a frozen large language model encoder to produce contextual token representations for both queries and memory items. Rather than fine-tuning the full backbone, GCAM adapts the encoder output through low-rank projection layers and two interpretable gating mechanisms. Cross-attention is applied between query tokens and memory tokens to generate a query-aware memory representation. The Token Gate selectively suppresses irrelevant or noisy memory tokens while emphasizing concept-bearing terms. The Head Gate dynamically activates the most useful attention heads for a given query, allowing the model to adapt its alignment behavior to query intent and granularity. Together, these mechanisms convert standard cross-attention into an intent-adaptive and noise-resistant semantic matching module.

The model is trained using a contrastive objective that encourages aligned pairs to produce higher similarity scores than non-aligned pairs. During inference, course titles, topics, or micro-assessments are scored against the KU memory bank to retrieve the most relevant Knowledge Units and their associated learning outcomes.
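The scoring path described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the sigmoid-of-linear form of both gates, the mean-pooled cosine score, and the InfoNCE-style loss are all assumptions chosen for illustration, the low-rank projection layers are omitted, and the toy token vectors stand in for the output of the frozen encoder.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mean_pool(tokens):
    return [sum(col) / len(tokens) for col in zip(*tokens)]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0

def gated_cross_attention_score(query, memory, token_w, head_w, n_heads):
    """Score one query against one memory item (lists of token vectors).

    token_w (one vector) and head_w (one vector per head) are hypothetical
    gate parameters; in a trained model they would be learned.
    """
    dh = len(query[0]) // n_heads
    # Token Gate (assumed form): one value in (0, 1) per memory token.
    token_gate = [sigmoid(dot(token_w, m)) for m in memory]
    # Head Gate (assumed form): one value in (0, 1) per head, from the pooled query.
    q_pooled = mean_pool(query)
    head_gate = [sigmoid(dot(w, q_pooled)) for w in head_w]
    context = []
    for q in query:
        row = []
        for h in range(n_heads):
            lo, hi = h * dh, (h + 1) * dh
            # Scaled dot-product attention of this query token over memory tokens.
            logits = [dot(q[lo:hi], m[lo:hi]) / math.sqrt(dh) for m in memory]
            attn = softmax(logits)
            for k in range(lo, hi):
                # Memory values are damped by the token gate, then by the head gate.
                v = sum(a * g * m[k] for a, g, m in zip(attn, token_gate, memory))
                row.append(head_gate[h] * v)
        context.append(row)
    # Query-aware memory representation, pooled and compared with the pooled query.
    return cosine(q_pooled, mean_pool(context)), token_gate, head_gate

def info_nce(scores, positive_index, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed choice of objective)."""
    probs = softmax([s / temperature for s in scores])
    return -math.log(probs[positive_index])

# Toy data: 4-dimensional token vectors, 2 heads, untrained (all-zero) gate weights.
query = [[1.0, 0.0, 0.0, 1.0], [0.5, 0.5, 0.0, 0.0]]
memory_bank = {
    "KU-A": [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0]],
    "KU-B": [[-1.0, 0.0, 0.0, -1.0], [0.0, -1.0, 1.0, 0.0]],
}
token_w = [0.0] * 4
head_w = [[0.0] * 4, [0.0] * 4]

scores, gates = {}, {}
for name, tokens in memory_bank.items():
    score, tg, hg = gated_cross_attention_score(query, tokens, token_w, head_w, 2)
    scores[name] = score
    gates[name] = (tg, hg)

ranking = sorted(scores, key=scores.get, reverse=True)
loss = info_nce([scores["KU-A"], scores["KU-B"]], positive_index=0)
print(ranking, round(loss, 4))
```

With all-zero gate weights every gate sits at 0.5, so the ranking here comes from the attention scores alone. Training would move the gate parameters so that noisy memory tokens and unhelpful heads are pushed toward zero, and the returned gate values are what one would inspect for interpretability.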

This retrieval-based framework reduces manual alignment effort, improves consistency across instructors, and enhances interpretability through explicit gating behavior.

Aashish Pandey is a system, security, and network professional with 5 years of experience, currently pursuing a PhD in Computer Science at the University of North Texas.
Submitted by Katie Dey on