Undergrad Projects on LLM Agents

LLM Multi-Agent System for Accessible Video Description

Accessible Video Descriptions (AVDs) are narrative audio tracks that describe important visual content for blind and low-vision users. The project aims to develop a conversational agent (an LLM-based assistant) that interacts with users to generate relevant video descriptions on demand. The output will be an application demonstrating interactive AVD generation.

LLM Multi-Agent System for Competitive Programming

Online Judge (OJ) systems contain vast repositories of programming problems, typically organized into volumes or contest-based groupings. However, such arrangements lack fine-grained classification based on problem topics, difficulty levels, and required skills, making it difficult for learners to navigate effectively. This project aims to develop a classification method leveraging Large Language Models (LLMs) to automatically categorize programming problems based on problem statements. Using dataset insights from the work "Classification of Programming Problems based on Topic Modeling", the project will implement and compare LLM-based classifiers with traditional topic modeling approaches, evaluating accuracy, efficiency, and interpretability in problem categorization.
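The comparison described above could be organised around a shared evaluation harness. The following is a minimal sketch, not the project's method: the topic labels, keywords, and sample problems are hypothetical, and `keyword_classifier` is a simple stand-in baseline (not a topic model). An LLM-based classifier would be any function with the same signature, for example one that sends the output of `build_prompt` to a model and parses the reply.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical topic labels and keywords; a real label set would come from the dataset.
TOPIC_KEYWORDS: Dict[str, List[str]] = {
    "graphs": ["graph", "shortest path", "tree", "edge"],
    "dynamic programming": ["subsequence", "knapsack", "minimum cost"],
    "number theory": ["prime", "gcd", "modulo"],
}


def build_prompt(statement: str) -> str:
    """Build a zero-shot classification prompt for an LLM (wording is an assumption)."""
    topics = ", ".join(TOPIC_KEYWORDS)
    return (
        f"Classify the following programming problem into exactly one topic from: {topics}.\n\n"
        f"Problem:\n{statement}\n\nTopic:"
    )


def keyword_classifier(statement: str) -> str:
    """Baseline: pick the topic whose keywords occur most often in the statement.

    Ties (including no matches at all) resolve to the first topic in TOPIC_KEYWORDS.
    """
    text = statement.lower()
    scores = {
        topic: sum(text.count(keyword) for keyword in keywords)
        for topic, keywords in TOPIC_KEYWORDS.items()
    }
    return max(scores, key=scores.get)


def accuracy(classify: Callable[[str], str], labelled: List[Tuple[str, str]]) -> float:
    """Fraction of (statement, label) pairs that the classifier labels correctly."""
    correct = sum(classify(statement) == label for statement, label in labelled)
    return correct / len(labelled)


# Hypothetical labelled examples, for illustration only.
SAMPLES: List[Tuple[str, str]] = [
    ("Find the shortest path between two nodes in a weighted graph.", "graphs"),
    ("Compute the gcd of two numbers modulo a prime.", "number theory"),
    ("Find the longest increasing subsequence.", "dynamic programming"),
]

if __name__ == "__main__":
    print(accuracy(keyword_classifier, SAMPLES))
```

Because `accuracy` only depends on the classifier's signature, the same labelled set can score every approach under comparison.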

LLM Multi-Agent System for Querying Website Content

Many websites contain large amounts of information, making it challenging for users to find specific answers quickly. This project aims to develop an LLM-powered assistant capable of answering user queries based on the content of a specific website. The process involves retrieving relevant information from the target website's pages (e.g., via scraping) and using an LLM to understand the user's question and synthesize an accurate answer from the retrieved content. The ultimate goal is to create a functional prototype demonstrating improved information access and user experience (e.g., with a ReactJS UI; see https://ai-sdk.dev/).
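The retrieve-then-answer flow described above might look roughly like the sketch below. It makes several assumptions: the pages are already scraped into a dictionary of path to text, retrieval is plain word overlap (a real system would more likely use embeddings), and the page contents and prompt wording are hypothetical. The call to the LLM itself is left out; `build_prompt` only prepares its input.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, pages: Dict[str, str], k: int = 2) -> List[str]:
    """Return the k page paths whose text shares the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        pages,
        key=lambda path: len(question_words & tokenize(pages[path])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, pages: Dict[str, str], k: int = 2) -> str:
    """Assemble a prompt asking the LLM to answer only from the retrieved pages."""
    context = "\n\n".join(
        f"[{path}]\n{pages[path]}" for path in retrieve(question, pages, k)
    )
    return (
        "Answer the question using only the website content below. "
        "If the answer is not there, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical scraped pages, for illustration only.
PAGES: Dict[str, str] = {
    "/admissions": "Applications for the undergraduate programme close on 30 June.",
    "/contact": "Email the department office or visit us on campus.",
    "/research": "Our research groups study machine learning and accessibility.",
}

if __name__ == "__main__":
    print(build_prompt("When do undergraduate applications close?", PAGES, k=1))
```

Restricting the prompt to retrieved text, and asking the model to admit when the answer is missing, is one common way to keep answers tied to the website rather than to the model's general knowledge.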

LLM Multi-Agent System for Accessible Video Descriptions

An accessible video description (AVD) is a narrative audio track that describes important visual content, making the video accessible to blind and low-vision users. However, producing AVDs manually is expensive and time-consuming. Machine learning models can use video signals and metadata to create video descriptions, which may involve scene detection, image captioning, text-to-speech, and audio synchronisation. The project aims to develop a conversational agent that interacts with users to generate video descriptions. Ultimately, the project will deliver an application that renders the video and generates AVDs in response to user questions.
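As a rough illustration of how scene detection and captions could be tied to a user's question about a point in the video, consider the sketch below. It assumes per-frame difference scores have already been computed (for example from colour histograms), uses a simple fixed threshold to place scene cuts, and takes captions as given; the scores, captions, and function names are hypothetical, and image captioning and text-to-speech are not shown.

```python
from typing import List, Optional, Tuple

Scene = Tuple[float, float]  # (start_seconds, end_seconds), end exclusive


def detect_scenes(diffs: List[float], fps: float, threshold: float = 0.5) -> List[Scene]:
    """Split a video into scenes by thresholding frame-to-frame differences.

    diffs[i] is the difference between frame i and frame i + 1, so a video
    with n frames has n - 1 scores. A score above the threshold starts a new
    scene at frame i + 1.
    """
    n_frames = len(diffs) + 1
    cuts = [i + 1 for i, score in enumerate(diffs) if score > threshold]
    bounds = [0] + cuts + [n_frames]
    return [(start / fps, end / fps) for start, end in zip(bounds, bounds[1:])]


def describe_at(scenes: List[Scene], captions: List[str], t: float) -> Optional[str]:
    """Return the caption of the scene that contains time t, or None if t is out of range."""
    for (start, end), caption in zip(scenes, captions):
        if start <= t < end:
            return caption
    return None


# Hypothetical difference scores for a 7-frame clip at 2 frames per second.
DIFFS = [0.1, 0.9, 0.1, 0.1, 0.8, 0.2]
# Hypothetical captions, one per detected scene (a captioning model would produce these).
CAPTIONS = [
    "A woman opens a laptop.",
    "A chart appears on the screen.",
    "She closes the laptop and smiles.",
]

if __name__ == "__main__":
    scenes = detect_scenes(DIFFS, fps=2.0)
    print(scenes)
    print(describe_at(scenes, CAPTIONS, 1.2))
```

In a conversational setting, a question such as "what is happening now?" could be answered by passing the current playback time to `describe_at` and handing the caption to the agent or to a text-to-speech step.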