RAG-based PDF Chatbot built with LangChain and Pinecone.
PaperLM was born out of the need to efficiently extract and interact with information from large sets of research papers. Traditional search methods fall short when you need specific answers buried in hundreds of pages of technical documentation.
The system processes PDF files by chunking them into manageable segments, generating vector embeddings for each chunk, and storing them in a Pinecone index. When a user asks a question, the query is embedded, the most similar chunks are retrieved via similarity search, and they are passed to the LLM, which generates a precise answer with citations back to the source documents.
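The chunk-embed-retrieve pipeline above can be sketched in plain Python. This is a minimal, library-free illustration: the toy term-frequency "embeddings" and in-memory index stand in for the real embedding model and Pinecone, and all function names and parameters here are illustrative, not the project's actual API.

```python
# Sketch of the chunk -> embed -> retrieve pipeline.
# Toy bag-of-words vectors replace real embeddings; an in-memory
# list replaces the Pinecone index.
from math import sqrt


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks


def embed(text: str) -> dict[str, float]:
    """Toy embedding: a term-frequency vector (stand-in for a real model)."""
    vec: dict[str, float] = {}
    for token in text.lower().split():
        vec[token] = vec.get(token, 0.0) + 1.0
    return vec


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, index: list[tuple[str, dict[str, float]]], k: int = 2) -> list[str]:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Build an in-memory "index" and query it.
doc = ("transformers use attention. attention weighs token relevance. "
       "rnns process tokens sequentially.")
index = [(c, embed(c)) for c in chunk_text(doc, chunk_size=40, overlap=10)]
top = retrieve("how does attention work", index, k=1)
```

In the real system, `retrieve` would be a Pinecone similarity query and the returned chunks would be stuffed into the LLM prompt as context.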
The system markedly sped up information retrieval for researchers, cutting the time spent searching through documents by approximately 60%.