This project is a research assistant built on Google Vertex AI and its PaLM 2 models. It aims to streamline searching for and accessing academic papers on Google Scholar. The assistant is implemented as a Streamlit application in which users enter their search specifications and browse the results without leaving the app.

A key feature is automatic scraping. Once the user provides search criteria, the application crawls multiple pages of Google Scholar results and retrieves the relevant papers. The scraped papers are organized into a dataframe that gives researchers a structured overview of the available literature. The application also selects papers that have a PDF available and offers those PDFs for download, so users can read the full text.

To go beyond search, the assistant integrates with Google Vertex AI and LangChain. Vertex AI is Google's machine learning platform and gives the application access to hosted AI models and tooling. With it, researchers can build a knowledge base from the downloaded papers, then extract insights and ask questions about their content. LangChain, an open-source framework for building applications on top of language models, supplies the components used for knowledge extraction, which lets researchers dig deeper into the papers and pull out the information they need.
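The multi-page scraping and dataframe steps described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the helper names `build_page_urls` and `to_records`, the `(title, authors, pdf_url)` tuple shape, and the page size of 10 are all assumptions made for the example. Fetching and HTML parsing are deliberately left out.

```python
from urllib.parse import urlencode

SCHOLAR_URL = "https://scholar.google.com/scholar"
RESULTS_PER_PAGE = 10  # assumption: default number of results per Scholar page


def build_page_urls(query: str, pages: int) -> list[str]:
    """Return one search URL per result page for the given query (hypothetical helper)."""
    urls = []
    for page in range(pages):
        # `start` is the offset of the first result shown on the page.
        params = {"q": query, "start": page * RESULTS_PER_PAGE}
        urls.append(f"{SCHOLAR_URL}?{urlencode(params)}")
    return urls


def to_records(raw_results):
    """Normalize scraped (title, authors, pdf_url) tuples into row dicts (hypothetical helper)."""
    records = []
    for title, authors, pdf_url in raw_results:
        records.append({
            "title": title.strip(),
            "authors": authors.strip(),
            "pdf_url": pdf_url or None,  # empty string means no PDF link was found
            "has_pdf": bool(pdf_url),    # lets the app filter to downloadable papers
        })
    return records
```

A list of row dicts like the one returned by `to_records` can be passed directly to `pandas.DataFrame(records)` to obtain the structured overview, and the `has_pdf` column gives a simple way to select the papers that can be downloaded.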
Category tags: Web Scraping & Data Extraction