August 2025
A next-generation Alexa-style AI assistant with voice, vision, and video interactions powered by multimodal AI and agentic RAG.
This project reimagines the voice assistant experience by combining voice, vision, and retrieval-augmented generation into a unified multimodal AI system. Unlike traditional assistants, it can not only respond with natural speech but also leverage a webcam to see the user, recognize gestures, expressions, and objects, and provide context-aware answers. Users can talk to it, show things to it, or even interact with it in video-call style. Built with LangGraph, Groq, Google GenAI, OpenCV, SpeechRecognition, and ElevenLabs, it blends real-time multimodal awareness with conversational intelligence for a futuristic assistant experience.
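As a concrete illustration of how the pieces named above can fit together, here is a minimal sketch of a single voice turn: SpeechRecognition transcribes the microphone input, a Groq-hosted model drafts a reply, and ElevenLabs speaks it back. This is an assumption-laden sketch rather than the project's actual code: the model name, placeholder voice ID, and environment variables are illustrative, the real assistant routes the LLM call through a LangGraph agent instead of a bare chat completion, and the ElevenLabs calls shown match recent SDK releases, which have changed over time.

```python
# Minimal voice-turn sketch: microphone -> transcript -> LLM reply -> spoken audio.
# Assumes GROQ_API_KEY and ELEVENLABS_API_KEY are set; model and voice IDs are placeholders.
import os

import speech_recognition as sr
from elevenlabs import play
from elevenlabs.client import ElevenLabs
from groq import Groq


def voice_turn() -> None:
    # 1. Listen on the default microphone and transcribe the utterance.
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    query = recognizer.recognize_google(audio)  # free Google Web Speech API backend

    # 2. Ask a Groq-hosted model for a reply (the real project wraps this in a LangGraph agent).
    llm = Groq(api_key=os.environ["GROQ_API_KEY"])
    completion = llm.chat.completions.create(
        model="llama-3.3-70b-versatile",  # assumed model; any Groq chat model would do
        messages=[
            {"role": "system", "content": "You are a concise, friendly voice assistant."},
            {"role": "user", "content": query},
        ],
    )
    reply = completion.choices[0].message.content

    # 3. Synthesize the reply with ElevenLabs and play it through the speakers.
    tts = ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"])
    speech = tts.text_to_speech.convert(
        voice_id="your-voice-id",  # placeholder; pick any voice from your ElevenLabs account
        model_id="eleven_multilingual_v2",
        text=reply,
    )
    play(speech)


if __name__ == "__main__":
    voice_turn()
```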
Talk to the assistant and receive natural, human-like speech responses.
Understands shirt colors, gestures, objects, and facial expressions via webcam (see the vision sketch after this list).
Retrieves knowledge from personal sources for context-aware answers (see the agentic RAG sketch after this list).
Interact with the assistant in a video-call style interface.
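The vision features boil down to grabbing a webcam frame with OpenCV and handing it to a multimodal model to interpret. Below is a minimal sketch using the google-genai client; the Gemini model name and the prompt are assumptions, and the actual assistant presumably runs this continuously and feeds the description back into the conversation rather than printing it.

```python
# Sketch: capture one webcam frame with OpenCV and ask a Gemini model to describe it.
# GOOGLE_API_KEY must be set; the model name and prompt are illustrative assumptions.
import os

import cv2
from google import genai
from google.genai import types


def describe_webcam_frame() -> str:
    # Capture a single frame from the default webcam.
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Could not read a frame from the webcam")

    # JPEG-encode the frame so it can be sent as inline image bytes.
    ok, jpeg = cv2.imencode(".jpg", frame)
    if not ok:
        raise RuntimeError("JPEG encoding failed")

    # Ask a vision-capable Gemini model what it sees (shirt color, gesture, expression, objects).
    client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
    response = client.models.generate_content(
        model="gemini-2.0-flash",  # assumed model; any vision-capable Gemini model works
        contents=[
            types.Part.from_bytes(data=jpeg.tobytes(), mime_type="image/jpeg"),
            "Briefly describe the person in view: shirt color, visible gesture, "
            "facial expression, and any objects they are holding.",
        ],
    )
    return response.text


if __name__ == "__main__":
    print(describe_webcam_frame())
```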
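The agentic RAG side can be sketched with LangGraph's prebuilt ReAct agent: the model decides on its own when to call a retrieval tool over personal sources before answering. Everything below is a stand-in, assuming GROQ_API_KEY is set: the in-memory notes list and keyword-matching tool take the place of the project's real document store, and the ChatGroq model name is a guess.

```python
# Sketch: a ReAct-style LangGraph agent that may consult personal notes before answering.
# The notes list and keyword tool are stand-ins for a real retrieval layer; model name assumed.
from langchain_core.tools import tool
from langchain_groq import ChatGroq
from langgraph.prebuilt import create_react_agent

# Stand-in "personal sources"; the real project would query an actual document/vector store.
PERSONAL_NOTES = [
    "Dentist appointment on Friday at 3 pm.",
    "The car insurance renewal is due at the end of the month.",
    "Mom's birthday is on 12 October.",
]


@tool
def search_personal_notes(query: str) -> str:
    """Return personal notes that share at least one word with the query."""
    query_words = {w.lower().strip(".,?") for w in query.split()}
    hits = [
        note
        for note in PERSONAL_NOTES
        if query_words & {w.lower().strip(".,?") for w in note.split()}
    ]
    return "\n".join(hits) or "No matching notes found."


def build_agent():
    # Groq-hosted chat model (reads GROQ_API_KEY from the environment); model name is assumed.
    llm = ChatGroq(model="llama-3.3-70b-versatile")
    # The agent chooses whether and when to call the retrieval tool: agentic RAG in miniature.
    return create_react_agent(llm, tools=[search_personal_notes])


if __name__ == "__main__":
    agent = build_agent()
    state = agent.invoke(
        {"messages": [{"role": "user", "content": "When is my dentist appointment?"}]}
    )
    print(state["messages"][-1].content)
```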