ICSE 2026 SEET Poster #20

Textbook-Embedded Virtual Teaching Assistant That Answers CS1 Questions With Verifiable Citations

University of Toronto, Edward S. Rogers Sr. Department of Electrical & Computer Engineering, Toronto, ON, Canada

Poster Summary

This work presents a textbook-embedded CS1 virtual teaching assistant built with retrieval-augmented generation. The assistant returns concise answers and section-level textbook citations so students can verify each response directly in course material.

Figures from Poster

Figure 1: Example VTA response with citation links.

Figure 2: Deployed retrieval workflow and system pipeline.

Figure 3: Reciprocal rank (RR) comparison on exam-related queries.

Figure 4: Distribution of query categories from usage logs.

Problem

CS1 students need just-in-time explanations, but course knowledge is scattered across the textbook, lecture notes, and exams. General web search and ungrounded AI responses can be pitched at the wrong level and are hard to verify, especially for subtle C language concepts.

Approach

The pipeline splits the textbook into chunks tagged with section metadata, retrieves the top N = 10 contexts for each query, and generates a grounded answer with clickable chapter and section citations. Exam-style queries take a hybrid path: a BM25 stage selects candidates, which are then reranked by cosine similarity.
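
The two-stage retrieval for exam-style queries can be illustrated with a minimal sketch. This is not the deployed implementation: the names `Chunk`, `retrieve`, and `candidates` are hypothetical, the BM25 parameters `k1` and `b` are common defaults rather than values reported on the poster, and plain term-count vectors stand in for whatever embedding the real system uses for cosine similarity.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    section: str  # hypothetical citation label, e.g. "2.1"
    text: str


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Okapi BM25 score of each chunk (non-negative IDF variant)."""
    docs = [tokenize(c.text) for c in chunks]
    n = len(docs)
    avg_len = (sum(len(d) for d in docs) / n) or 1.0
    # Document frequency: number of chunks containing each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, candidates=50, top_n=10):
    """Shortlist chunks with BM25, then rerank the shortlist by cosine."""
    if not chunks:
        return []
    scores = bm25_scores(query, chunks)
    order = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    shortlist = order[:candidates]
    q_vec = Counter(tokenize(query))

    def similarity(i):
        # Assumption: term counts replace the real embedding model.
        return cosine(q_vec, Counter(tokenize(chunks[i].text)))

    reranked = sorted(shortlist, key=similarity, reverse=True)
    return [chunks[i] for i in reranked[:top_n]]
```

Calling `retrieve(query, chunks)` returns up to ten `Chunk` objects, and the `section` field of each is what a generated answer would cite. The lexical stage is cheap and favors the exact keywords that exam questions tend to contain, while the similarity stage reorders that shortlist.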

Results

Retrieval (Exam-Related Queries)

  • BM25 + Cosine reranking: MRR 0.50
  • Cosine similarity: MRR 0.45
  • Keyword matching + Cosine: MRR 0.34
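
For readers unfamiliar with the metric, mean reciprocal rank (MRR) averages, over all queries, the reciprocal of the rank at which the first relevant result appears. A query whose first relevant section is at rank 2 contributes 0.5, and a query with no relevant result contributes 0. The sketch below assumes each query comes with a set of relevant section labels; the poster summary does not state how relevance was judged, and the function names are illustrative.

```python
def reciprocal_rank(ranked_sections, relevant_sections):
    """Return 1/rank of the first relevant section, or 0.0 if none appears."""
    for rank, section in enumerate(ranked_sections, start=1):
        if section in relevant_sections:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked list, relevant set) pairs."""
    if not runs:
        return 0.0
    total = sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs)
    return total / len(runs)
```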

Student usage emphasized conceptual questions. In the survey, 89% of respondents found the assistant helpful and 43% said they were likely to ask it conceptual questions, while 78% said they were unlikely to use it for problem-solving because of academic-integrity concerns.

Limitations and Future Work

The assistant is limited by the textbook's coverage and offers weaker support for visual explanations and broader programming guidance. Planned improvements include better handling of visual content, higher-level retrieval summarization, and policy-aware onboarding for students.

Takeaways

  • Answers are evidence-backed, with section-level citations.
  • Hybrid retrieval improves handling of keyword-dense exam queries.
  • Students prefer conceptual support over problem-solving use.