Hi, I'm Pratyush Srivastava.
Self-driven and highly motivated Data Scientist with a curious mind, passionate about Machine Learning, NLP, and Data Analytics. Enjoys solving complex real-world problems through data-driven approaches and AI-powered solutions.
About
A Data Scientist and Machine Learning Enthusiast with a passion for NLP, Deep Learning, and Data-Driven Solutions.
I am a Computer Science Graduate Student at Galgotias University with a deep passion for Data Science, Machine Learning, and NLP. With 3+ years of professional experience, I have honed my expertise in Python, Machine Learning, Deep Learning, NLP, and Data Analytics. My journey has involved working extensively on data preprocessing, model development, and deployment, along with technologies like Flask, MySQL, PostgreSQL, and cloud platforms.
As a detail-oriented Data Scientist and Machine Learning Engineer, I specialize in building intelligent, scalable solutions that solve complex real-world problems. I have hands-on experience in Natural Language Processing (NLP), Generative AI, Machine Learning and Deep Learning, and I am particularly passionate about leveraging AI to extract insights from unstructured data, develop predictive models, and deploy AI-driven applications.
I am actively seeking opportunities to apply my skills in data-driven problem-solving, model development, and analytics in a challenging role. I am eager to contribute to innovative projects while continuously learning and growing in the field of AI and Data Science.
- Languages: Python, C, SQL (MySQL, PostgreSQL), HTML
- Cloud Technologies: Google Cloud Platform (GCP), AWS, Heroku
- Developer Tools: Git, VS Code, PyCharm, IntelliJ, Eclipse, Anaconda
- Libraries and Frameworks: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, TensorFlow, Keras, NLTK, LangChain
- Software Tools: MS Word, PowerPoint, MS Excel, Power BI
- Technologies: Data Analysis, Machine Learning, NLP, Deep Learning, Data Engineering, Web Scraping, Model Deployment, Cloud Computing
- Other Skills: Business Analytics, Data-Driven Decision Making, Statistical Analysis, Customer Insights, Predictive Modeling
Current Focus: Python / Machine Learning / Deep Learning / Data Science / Gen AI / Agentic AI
Experience
- Collaborated with cross-functional teams to define project requirements, ensuring AI solutions and analytics meet organizational goals.
- Partnered with project managers to identify analytics needs, establish critical metrics and KPIs, and provide actionable insights to drive informed decision-making.
- Designed, developed, and optimized generative AI models to improve performance, scalability, and efficiency in alignment with business objectives.
- Developed and maintained AI and data pipelines, handling data preprocessing, feature extraction, model training, and evaluation to ensure efficient data flow.
- Used Artificial Intelligence, NLP technologies, and cognitive machine learning to develop AI applications.
- Designed and implemented processes and strategies to enhance the end-user experience.
- Contributed to the development of best practices for generative AI and data systems, troubleshooting and debugging issues while embracing new technologies to enhance analytical outcomes.
- Tools: Python (Core & Advanced), Machine Learning & Deep Learning, Exploratory Data Analysis (EDA), Flask, Generative AI, RAG, NLP
Professional Projects
- Collaborated with clients to gather requirements and design a highly efficient sales analytics chatbot solution, leveraging natural language processing (NLP) for enhanced user interaction.
- Integrated the chatbot with Vertex AI, Gemini LLM model, and BigQuery, enabling real-time data processing, analysis, and actionable insights.
- Designed and developed a robust query framework to construct and execute SQL queries, streamlining data retrieval and improving user insights through automated summarization.
- Implemented an NLP pipeline to extract and map user entities, increasing accuracy in query interpretation and conversational responses.
- Converted user queries into SQL statements, transforming data into natural language insights and delivering dynamic visualizations to enhance decision-making in a conversational format.
- Tools: Python, NLP, SQL, Vertex AI, BigQuery, GCP, LangChain
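The query flow described in this project (extract entities from a question, build a SQL statement, return a readable answer) can be illustrated with a minimal sketch. This is not the production code: it assumes a hypothetical `sales` table, uses the standard-library `sqlite3` module in place of BigQuery, and swaps the Gemini LLM for simple keyword matching. The names `METRICS`, `REGIONS`, `make_demo_db`, `build_query`, and `answer` are invented for illustration, and the rows are toy data.

```python
import sqlite3

# Hypothetical entity maps: words in a question -> SQL fragments / filter values.
METRICS = {"revenue": "SUM(amount)", "orders": "COUNT(*)"}
REGIONS = {"north", "south", "east", "west"}


def make_demo_db():
    """Create an in-memory database with a few invented sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 100.0), ("north", 50.0), ("south", 70.0)],
    )
    return conn


def build_query(question):
    """Map entities found in the question to a parameterized SQL statement."""
    words = question.lower().replace("?", "").split()
    metric = next((METRICS[w] for w in words if w in METRICS), None)
    if metric is None:
        raise ValueError("no known metric in question")
    region = next((w for w in words if w in REGIONS), None)
    # The metric fragment comes only from the fixed METRICS map, never from
    # raw user text; the region value is passed as a bound parameter.
    sql = f"SELECT {metric} FROM sales"
    params = ()
    if region is not None:
        sql += " WHERE region = ?"
        params = (region,)
    return sql, params


def answer(conn, question):
    """Run the generated query and phrase the result as a short sentence."""
    sql, params = build_query(question)
    (value,) = conn.execute(sql, params).fetchone()
    return f"Result for '{question}': {value}"
```

Calling `answer(make_demo_db(), "Total revenue in the north?")` runs a filtered aggregate over the toy table. In a real deployment the keyword lookup would be replaced by the LLM-based entity extraction, with the same separation between trusted SQL fragments and bound user values.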
- Leveraged large language models (LLMs) to develop a scalable solution for automating data extraction from mortgage documents within an AWS cloud environment.
- Implemented a Retrieval-Augmented Generation (RAG) approach to classify and extract relevant information from mortgage document streams, streamlining the data population process into the user interface (UI).
- Developed a Conversational AI solution, incorporating a document QnA chatbot powered by enterprise-grade vector databases and document embeddings using LangChain, delivering an efficient and intelligent document processing workflow.
- Tools: Python, Large Language Models (LLMs), RAG, LangChain, AWS
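The retrieval step behind a RAG-style document QnA flow can be sketched in a few lines. This is an illustrative sketch only, not the project's implementation: it assumes word-count vectors as a stand-in for real document embeddings, a plain Python list in place of a vector database, and it stops at building the prompt rather than calling an LLM. The function names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=1):
    """Assemble the prompt an LLM would receive: retrieved context + question."""
    context = "\n".join(retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Given a list of document chunks, `build_prompt(question, chunks)` returns a prompt that contains only the most relevant chunk, which is the part of the pipeline that keeps the model's answer grounded in the source documents.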
Personal Projects

An Arcade Game Based on PyGame (Python)
- The Air Assault Game is a dynamic and challenging game where players control a copter flying through an obstacle-filled path. As soon as the game starts, the copter automatically moves forward, and the player’s goal is to balance it by navigating around obstacles to avoid crashes. The difficulty increases as the game progresses, keeping players on their toes. Developed using Python, Tkinter, and PyGame, the game provides a smooth, addictive, and engaging experience.

A Rain Prediction Machine Learning App
- This project involves the development of a Weather Prediction Model that forecasts whether the next day will be sunny or rainy. Using machine learning algorithms, the model analyzes historical weather data to make its predictions. It was built using Python, along with data manipulation libraries such as Pandas and NumPy, and trained using scikit-learn. The model was then deployed on the Heroku Cloud Server, ensuring it is accessible online for real-time predictions. This project demonstrates my proficiency in machine learning, data processing, and cloud deployment.

A Simple and Extensible Desktop Application Based on OpenCV
- This is a Student Attendance Management System, a desktop application designed to automatically detect and track student attendance using facial recognition technology. The system leverages OpenCV for real-time image processing and machine learning algorithms to identify students and mark their attendance accurately. Attendance records are stored and managed using Excel sheets, making it easy to maintain and update student data. Built with Python and Tkinter, the application provides an intuitive user interface, offering seamless integration of machine learning and image processing techniques.

A Mushroom Prediction Machine Learning App
- This project involves a Mushroom Classification Model that predicts whether a mushroom is poisonous or not based on various features such as its color, size, and shape. Using machine learning classification algorithms, the model analyzes the provided attributes and classifies the mushroom as either edible or toxic. The model has been deployed on Streamlit, providing an interactive and user-friendly web interface for real-time predictions. This project demonstrates my expertise in machine learning, data classification, and web deployment.
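The classification idea can be shown with a small sketch. The source does not say which algorithm the deployed model uses, so this example swaps in k-nearest neighbours with a Hamming distance over categorical features as an assumption. The rows in `TRAIN` are invented toy data for illustration, not real mushroom records and not a guide to edibility; `hamming` and `predict` are hypothetical names.

```python
from collections import Counter

# Invented toy rows (NOT real mushroom data). Features: (color, size, shape).
TRAIN = [
    (("brown", "small", "convex"), "edible"),
    (("white", "small", "flat"), "edible"),
    (("brown", "large", "flat"), "edible"),
    (("red", "large", "convex"), "poisonous"),
    (("white", "large", "bell"), "poisonous"),
    (("red", "small", "bell"), "poisonous"),
]


def hamming(a, b):
    """Number of feature positions where two samples differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(sample, train=TRAIN, k=3):
    """Return the majority label among the k training rows closest to sample."""
    nearest = sorted(train, key=lambda row: hamming(sample, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A call such as `predict(("brown", "small", "flat"))` returns the label shared by most of the three closest toy rows. A web front end like Streamlit would wrap a function of this shape, collecting the feature values from form inputs.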

A Web App Based on Django
- The Blood Bank Management System is a web-based platform designed to streamline and automate the process of managing blood donations, recipients, and stocks. The system simplifies the search for blood during emergencies and helps in maintaining a detailed record of blood donors, donation events, and available blood units in the bank. Built with Django for the backend and SQLite for the database, the platform features an intuitive user interface created using HTML, CSS, and Bootstrap, with interactive elements powered by JavaScript. This project showcases my skills in web development, database management, and creating solutions to address real-world problems.

An Uber Ride Prediction Machine Learning Web App
- The Ride Prediction System is a machine learning web application designed to predict the number of rides a user will take per month. By analyzing historical data, the model forecasts monthly ride usage based on various features. The application is built using Flask, which connects the machine learning model to a user-friendly web interface, and the frontend is developed with HTML, CSS, and JavaScript to provide an interactive experience. The application has been deployed on Heroku, making it accessible for real-time predictions. This project demonstrates my expertise in machine learning, web development, and cloud deployment.
Skills
Programming Languages
Databases
Libraries
Frameworks
Other
Education
Greater Noida, Uttar Pradesh, India
Degree: Bachelor of Technology in Computer Science and Engineering
CGPA: 8.78/10.0
Batch: 2018-2022
Relevant Coursework:
- Database Management System
- Operating System
- Compiler Design
- Design and Analysis of Algorithms
- Software Engineering and Testing Methodologies
Lakhimpur Kheri, Uttar Pradesh, India
Degree: Intermediate (12th)
Percentage: 84.8%
Batch: 2016-2018
Relevant Coursework:
- Physics
- Chemistry
- Mathematics
- English
- Hindi
Lakhimpur Kheri, Uttar Pradesh, India
Degree: High School (10th)
CGPA: 10.0/10.0
Batch: 2014-2016
Relevant Coursework:
- Social Science
- Science
- Mathematics
- English
- Hindi
Certifications and Achievements
Research Paper and Publications
Resume
Contact
Prepared By
Pratyush Srivastava
Copyright Ⓒ 2025 All Rights Reserved.