Hi, I'm Pratyush Srivastava.
Self-driven and highly motivated Data Scientist with a deep passion for Generative AI, Agentic AI systems, and advanced Machine Learning. I thrive on building intelligent, autonomous solutions that combine LLMs, reasoning workflows, and data-driven insights. Curious by nature and analytical in approach, I enjoy transforming complex real-world problems into scalable AI-powered products that deliver measurable impact.
About
A Data Scientist and AI Engineer with an M.Tech from IIT Patna, specializing in Generative AI, Agentic AI, and MLOps.
I hold a Master's degree in Technology (M.Tech) in Artificial Intelligence and Machine Learning from IIT Patna and a Bachelor's degree in Computer Science from Galgotias University.
With 3.5+ years of professional experience, I have built strong expertise in Data Science, Machine Learning, Deep Learning, Generative AI, and NLP. My journey has involved working extensively on data preprocessing, model development, and deployment, leveraging technologies such as Flask, MySQL, PostgreSQL, and cloud platforms.
As a detail-oriented Data Scientist and AI Engineer, I specialize in developing intelligent, scalable, and end-to-end AI solutions that solve complex real-world problems. My core areas of expertise include Generative AI, Agentic AI, Multi-Agent Systems, Machine Learning, MLOps, and Cloud Technologies (AWS, GCP). I am particularly passionate about harnessing AI to extract insights from unstructured data, build predictive models, and create AI-driven applications that deliver measurable business value.
I am actively seeking opportunities to apply my skills in data-driven problem-solving, model optimization, and AI innovation. My goal is to contribute to impactful projects that leverage the transformative power of AI while continuously learning and advancing in this rapidly evolving field.
- Languages: Python, C, SQL (MySQL, PostgreSQL), HTML
- Core ML & DL Frameworks/Libraries: NumPy, Pandas, Scikit-learn, TensorFlow, Keras, PyTorch, XGBoost
- LLM & Generative AI Tools: Hugging Face Transformers, LangChain, CrewAI, RAG (Retrieval-Augmented Generation), Prompt Engineering, AI Agents, AutoGen, LangGraph, Tools
- Cloud Platforms & Services: Google Cloud Platform (Vertex AI, BigQuery, Cloud Storage, Cloud Functions), AWS (Bedrock, SageMaker, Lambda, EC2), Heroku
- Developer Tools & IDEs: Git, VS Code, PyCharm, IntelliJ, Eclipse, Anaconda
- Business & Productivity Tools: Power BI, MS Excel, MS Word, PowerPoint
- Core Competencies: Data Science, Machine Learning, Deep Learning, Generative AI, NLP, AI Agents, Statistical Analysis, Predictive Modeling, Business Analytics, Data-Driven Decision Making, Model Deployment, Customer Insights
Current Focus: Generative AI | Agentic AI | Multi-Agent Systems | MLOps | Cloud AI Engineering
Experience
Neuralgo.ai (goML) | Machine Learning Engineer
- Architect and deploy Generative AI and LLM-powered solutions using RAG, Transformers, and Agentic AI frameworks to solve complex business challenges.
- Fine-tune, optimize, and evaluate LLMs (OpenAI, Llama, Mistral, etc.) for domain-specific tasks, enhancing accuracy and contextual intelligence.
- Build scalable GenAI pipelines integrating LangChain, LlamaIndex, and vector databases (FAISS, Pinecone) for retrieval, reasoning, and orchestration.
- Develop autonomous multi-agent systems enabling intelligent decision-making, task planning, and adaptive learning.
- Implement robust MLOps and DevOps practices using Docker, Git, and AWS for reliable model deployment and lifecycle management.
- Stay at the forefront of LLM, Agentic AI, and Generative AI research, applying emerging techniques to production-ready applications.
- Tools: Python, Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Agentic AI, Prompt Engineering, LangChain, LlamaIndex, Transformers, Hugging Face, Vector Databases (FAISS, Pinecone), PyTorch, AWS, Docker, Git, MLOps
Infosys Limited | Digital Specialist Engineer (Analytics)
- Collaborated with cross-functional teams to define project requirements, ensuring AI solutions and analytics meet organizational goals.
- Partnered with project managers to identify analytics needs, establish critical metrics and KPIs, and provide actionable insights to drive informed decision-making.
- Designed, developed, and optimized generative AI models to improve performance, scalability, and efficiency in alignment with business objectives.
- Developed and maintained AI and data pipelines, handling data preprocessing, feature extraction, model training, and evaluation to ensure efficient data flow.
- Contributed to best practices for generative AI and data systems, troubleshooting and debugging issues while embracing new technologies to enhance analytical outcomes.
- Tools: Python (Core and Advanced), Machine Learning, NumPy, Pandas, Anaconda, Algorithms, Flask, LLM, Generative AI, Streamlit, RAG, Fine Tuning, Agentic AI, FastAPI
Professional Projects
Healthcare Document Intelligence Assistant (RAG-Based)
- Built an AI-driven medical document analysis system using a RAG pipeline with OCR-based text extraction and semantic indexing for multi-format medical files.
- Implemented a FAISS vector database enabling ultra-fast semantic search and evidence-grounded retrieval.
- Developed an LLM-powered QA and report generator (GPT-5) delivering accurate, context-aware answers with source citations.
- Created an automated medical report generator producing structured, professional PDF summaries.
- Integrated Google Drive, secure APIs, session memory, and an intuitive user interface to enable seamless document handling and real-time interactions for healthcare teams.
- Tools: GPT-5, FAISS, RAG, LangChain, Python, OCR, Google Drive
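As a rough illustration of the retrieval step described above (not the project's actual code), the sketch below ranks text chunks by cosine similarity to a query and returns the chunk indices, which can then serve as source citations. It uses plain Python in place of FAISS. The toy bag-of-words `embed` function, the sample chunks, and every function name are assumptions made for this example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[tuple[int, float]]:
    """Return (chunk index, score) pairs for the k chunks most similar to the query."""
    q = embed(query)
    scored = [(i, cosine(q, embed(c))) for i, c in enumerate(chunks)]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]


# Hypothetical document chunks; the index doubles as a citation handle.
chunks = [
    "patient reports mild fever and cough",
    "blood pressure measured at 120 over 80",
    "rest and fluids for fever",
]
top = retrieve("fever treatment", chunks)
for idx, score in top:
    print(f"[source {idx}] score={score:.3f} :: {chunks[idx]}")
```

In a production pipeline the `embed` step would be a learned embedding model and the linear scan would be replaced by a vector index such as FAISS; the ranking idea stays the same.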
AI-Powered Sales and Site Analytics Chatbot
- Collaborated with clients to gather requirements and design a highly efficient sales analytics chatbot solution, leveraging natural language processing (NLP) for enhanced user interaction.
- Engineered seamless integration of the sales analytics chatbot with Vertex AI and the Gemini LLM, achieving real-time data processing and cutting user query response times by an average of 3 seconds.
- Designed and developed a robust query framework to construct and execute SQL queries, streamlining data retrieval and improving user insights through automated summarization.
- Implemented an NLP pipeline to extract and map user entities, increasing accuracy in query interpretation and conversational responses.
- Converted user queries into SQL statements, transforming data into natural language insights and delivering dynamic visualizations to enhance decision-making in a conversational format.
- Tools: Python, NLP, LLM, Gemini, GCP, Vertex AI, BigQuery, SQL, LangChain
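The question-to-SQL flow described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the deployed system: keyword matching stands in for the Gemini-based NLP pipeline, an in-memory `sqlite3` database stands in for BigQuery, and the `sales` table, the `METRICS` and `REGIONS` vocabularies, and all function names are hypothetical.

```python
import sqlite3

# Hypothetical vocabulary mapping user words to whitelisted column names.
METRICS = {"revenue": "revenue", "sales": "revenue", "units": "units"}
REGIONS = {"north", "south", "east", "west"}


def extract_entities(question: str) -> dict:
    """Pick out a metric and an optional region with simple keyword matching."""
    words = question.lower().replace("?", "").split()
    metric = next((METRICS[w] for w in words if w in METRICS), None)
    region = next((w for w in words if w in REGIONS), None)
    return {"metric": metric, "region": region}


def build_query(entities: dict) -> tuple[str, tuple]:
    """Build a parameterized statement; column names come only from the whitelist."""
    if entities["metric"] is None:
        raise ValueError("no known metric in question")
    sql = f"SELECT SUM({entities['metric']}) FROM sales"
    params: tuple = ()
    if entities["region"]:
        sql += " WHERE region = ?"
        params = (entities["region"],)
    return sql, params


def answer(conn: sqlite3.Connection, question: str) -> str:
    """Run the generated query and phrase the result as a sentence."""
    entities = extract_entities(question)
    sql, params = build_query(entities)
    total = conn.execute(sql, params).fetchone()[0]
    scope = f" in the {entities['region']} region" if entities["region"] else ""
    return f"Total {entities['metric']}{scope}: {total}"


# Toy data standing in for the real sales warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", 100.0, 3), ("south", 250.0, 7), ("north", 50.0, 1)],
)
reply = answer(conn, "What were total sales in the north?")
print(reply)
```

User-supplied values are passed as query parameters and column names are taken only from a fixed whitelist, so the generated SQL never interpolates raw user text.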
AI-Powered Mortgage Document Understanding and Automation Platform
- Designed and implemented an AI-driven solution leveraging LLMs to automate data extraction and classification from complex mortgage documents within a scalable GCP cloud environment.
- Applied Retrieval-Augmented Generation (RAG) techniques to enhance information retrieval accuracy and streamline data population workflows in the user interface.
- Developed an enterprise-grade Conversational AI chatbot for document question-answering, utilizing LangChain, vector embeddings, and vector databases for efficient semantic search and document understanding.
- Delivered an end-to-end intelligent document processing pipeline, significantly reducing manual data entry efforts and improving operational efficiency.
- Tools: Python, LLMs, RAG, FAISS, LangChain, GCP, Vector Databases
Personal Projects
An Arcade Game Based on PyGame (Python)
- The Air Assault Game is a dynamic and challenging game where players control a copter flying through an obstacle-filled path. As soon as the game starts, the copter automatically moves forward, and the player’s goal is to balance it by navigating around obstacles to avoid crashes. The difficulty increases as the game progresses, keeping players on their toes. Developed using Python, Tkinter, and PyGame, the game provides a smooth, addictive, and engaging experience.
A Rain Prediction Machine Learning App
- This project involves the development of a Weather Prediction Model that forecasts whether the next day will be sunny or rainy. Using machine learning algorithms, the model analyzes historical weather data to make its predictions. It was built using Python, along with data manipulation libraries such as Pandas and NumPy, and trained using Scikit-learn. The model was then deployed on Heroku, making it accessible online for real-time predictions. This project demonstrates my proficiency in machine learning, data processing, and cloud deployment.
A Simple and Extensible Desktop Application Based on OpenCV
- This is a Student Attendance Management System, a desktop application designed to automatically detect and track student attendance using facial recognition technology. The system leverages OpenCV for real-time image processing and machine learning algorithms to identify students and mark their attendance accurately. Attendance records are stored and managed using Excel sheets, making it easy to maintain and update student data. Built with Python and Tkinter, the application provides an intuitive user interface, offering seamless integration of machine learning and image processing techniques.
A Mushroom Prediction Machine Learning App
- This project involves a Mushroom Classification Model that predicts whether a mushroom is poisonous or not based on various features such as its color, size, and shape. Using machine learning classification algorithms, the model analyzes the provided attributes and classifies the mushroom as either edible or toxic. The model has been deployed on Streamlit, providing an interactive and user-friendly web interface for real-time predictions. This project demonstrates my expertise in machine learning, data classification, and web deployment.
A Web App Based on Django
- The Blood Bank Management System is a web-based platform designed to streamline and automate the process of managing blood donations, recipients, and stocks. The system simplifies the search for blood during emergencies and helps in maintaining a detailed record of blood donors, donation events, and available blood units in the bank. Built with Django for the backend and SQLite for the database, the platform features an intuitive user interface created using HTML, CSS, and Bootstrap, with interactive elements powered by JavaScript. This project showcases my skills in web development, database management, and creating solutions to address real-world problems.
An Uber Ride Prediction Machine Learning Web App
- The Ride Prediction System is a machine learning web application designed to predict the number of rides a user will take per month. By analyzing historical data, the model forecasts monthly ride usage based on various features. The application is built using Flask, enabling a smooth integration of the machine learning model with a user-friendly web interface. The frontend is developed with HTML, CSS, and JavaScript to provide an interactive experience, while the machine learning model predicts ride patterns. The application has been deployed on Heroku, making it accessible for real-time predictions. This project demonstrates my expertise in machine learning, web development, and cloud deployment.
Skills
Programming Languages
Python
HTML5
SQL
C
Core Java
Databases
MySQL
PostgreSQL
MongoDB
Core ML & DL Frameworks/Libraries
NumPy
Pandas
Scikit-learn
TensorFlow
Keras
PyTorch
OpenCV
Streamlit
Chainlit
LLM & Generative AI Tools
Transformers
LangChain
CrewAI
LangGraph
LlamaIndex
AutoGen
Cloud Platforms
AWS
Azure
GCP
Heroku
Developer Tools & IDEs
Git
VS Code
PyCharm
IntelliJ
Eclipse
Anaconda
Web Frameworks
Flask
FastAPI
Business & Productivity Tools
Power BI
MS Excel
MS Word
PowerPoint
JIRA
Education
Indian Institute of Technology Patna
Patna, Bihar, India
Degree: Master of Technology in Artificial Intelligence and Data Science Engineering
CGPA: 9.49/10.0
Batch: July 2024 - July 2026
Relevant Coursework:
- Probability and Statistics
- Advanced Machine Learning
- Artificial Intelligence
- Deep Learning
- Natural Language Processing
Galgotias University
Greater Noida, Uttar Pradesh, India
Degree: Bachelor of Technology in Computer Science and Engineering
CGPA: 8.78/10.0
Batch: 2018-2022
Relevant Coursework:
- Database Management System
- Operating System
- Compiler Design
- Design and Analysis of Algorithms
- Software Engineering and Testing Methodologies
Lakhimpur Kheri, Uttar Pradesh, India
Degree: Intermediate (12th)
Percentage: 84.8%
Batch: 2016-2018
Relevant Coursework:
- Physics
- Chemistry
- Mathematics
- English
- Hindi
Lakhimpur Kheri, Uttar Pradesh, India
Degree: High School (10th)
CGPA: 10.0/10.0
Batch: 2014-2016
Relevant Coursework:
- Social Science
- Science
- Mathematics
- English
- Hindi
Certifications and Achievements
Research Paper and Publications
Resume
Contact
Prepared By
Pratyush Srivastava
Copyright © 2026 All Rights Reserved.

