A senior project conducted in partial fulfillment of a B.S. in Computer Science from Yale University. Focused on LLMs, RAG, and English literature pedagogy.
Large language models (LLMs) have reshaped many fields in a relatively short period. LLMs are built on transformer networks that learn “context” and “meaning” by analyzing sequential relationships in input data (4). Popular LLMs, such as GPT-3, have been integrated into consumer-facing web applications that have drawn wide public engagement with AI.
These consumer applications have been widely used in personal and professional settings, and they may also suit educational contexts. Advancements in technology have long shaped pedagogy. For example, the launch of Khan Academy opened a chapter of global self-directed learning powered by the Internet. LLM-integrated software applications can contribute to even more personalized learning experiences, where a student can interact with a virtual instructor using the method best suited to that student’s learning profile.
This senior project investigates how LLMs can be paired with retrieval-augmented generation (RAG) and integrated into software applications to enhance English literature education at the undergraduate level. To view a more detailed project plan, visit the proposal document here.
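For readers new to RAG, the core idea is that relevant passages are retrieved at question time and placed in the model's prompt, while the model's weights stay unchanged. The snippet below is a minimal sketch of that idea, not the Verse implementation: the passages, the function names (`retrieve`, `build_prompt`), and the bag-of-words cosine similarity are all assumptions made for illustration. A real system would more likely use learned embeddings and would then send the prompt to an LLM, a step omitted here.

```python
import math
import re
from collections import Counter

# Hypothetical passages for illustration; not taken from the Verse corpus.
PASSAGES = [
    "Hamlet delays his revenge while he questions the ghost's honesty.",
    "Elizabeth Bennet revises her first impression of Mr. Darcy.",
    "Ishmael joins the crew of the Pequod under Captain Ahab.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(query, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=1):
    """Assemble a prompt that grounds the question in retrieved context."""
    context = "\n".join(retrieve(question, passages, k))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )


if __name__ == "__main__":
    print(build_prompt("Why does Hamlet delay his revenge?", PASSAGES))
```

Because the grounding text is supplied in the prompt, the source material can be swapped or extended without retraining the model.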
A software application called Verse has been created for this senior project. Source code, as well as documentation, can be found at the Verse repository here.
This repository contains a few folders that correspond to the main deliverables of the project:
- UX-Research-Report: Contains a PDF of the submitted UX research report. Includes a brief literature survey on the nexus of LLMs and pedagogy, as well as the results from a research survey.
- Retrieval-Augmented-Generation: Contains information on the Verse application and how to access source code and documentation.
- Class-Materials: Contains assignments completed for the CPSC 490 course, including: project proposal, presentation slides, and reports.
While LLMs offer the potential for a more productive pedagogical experience, it may not be wise to fully replace the human student-instructor dynamic. AI assistants may democratize study for those who cannot access high-quality academic spaces, and they can supplement in-person learning. But dialogue between human students and human teachers should not be replaced entirely. AI should enhance, rather than replace, a person’s relationship with their learning.
1. Ahuja, A.S., et al. “ChatGPT for Good? On Opportunities and Challenges of Large Language Models for Education.” Learning and Individual Differences, 9 Mar. 2023, www.sciencedirect.com/science/article/abs/pii/S1041608023000195.
2. Jeon, J., and S. Lee. “Large Language Models in Education: A Focus on the Complementary Relationship between Human Teachers and ChatGPT.” Education and Information Technologies, vol. 28, 2023, pp. 15873–15892, https://doi.org/10.1007/s10639-023-11834-1.
3. Lan, Yu-Ju, and Nian-Shing Chen. “Teachers’ Agency in the Era of LLM and Generative AI: Designing Pedagogical AI Agents.” Educational Technology & Society, vol. 27, no. 1, 2024, pp. I–XVIII. JSTOR, https://www.jstor.org/stable/48754837. Accessed 25 Feb. 2024.
4. “What Are Large Language Models?” NVIDIA, www.nvidia.com/en-us/glossary/large-language-models/. Accessed 29 Jan. 2024.