Term Deposit Prediction
A comprehensive end-to-end Machine Learning project designed to predict bank deposit subscriptions using the well-known “Bank Marketing” dataset with production-grade deployment techniques.
Data Science professional with 2+ years of experience wrangling massive datasets. I speak fluent ML, DL, CV, NLP, and more. Hackathon champion (top-10 finishes in 10+ hackathons) with a knack for unearthing hidden insights. Ready to tackle your toughest data challenges.
I transform messy data into actionable knowledge using expert wrangling and analysis techniques.
I build and deploy ML models and deep learning architectures to solve complex data problems.
I work with generative models to create new data, manipulate existing data, and solve practical use cases.
I deploy and manage ML models in production using MLOps best practices, ensuring smooth integration and real-world impact.
I wield expertise in computer vision, NLP, and time series analysis to unlock insights from diverse data.
I leverage my technical skills to build user-friendly and visually appealing websites using WordPress (great for side hustles!).
This project tackles the challenge of predicting bank deposit subscriptions using a comprehensive Machine Learning pipeline. It uses the popular “Bank Marketing” dataset to train and deploy a model optimized for F1-score and other key performance metrics, because the dataset is imbalanced.
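To illustrate why F1-score matters on imbalanced data, here is a minimal sketch in plain Python. The labels are invented for the example (10 subscribers out of 100 customers) and are not taken from the Bank Marketing dataset; the helper names `confusion_counts`, `accuracy`, and `f1_score` are hypothetical and not part of the project's code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; 0.0 when there are no positives at all."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up, imbalanced labels: 10 subscribers followed by 90 non-subscribers.
y_true = [1] * 10 + [0] * 90

# A model that always predicts "no subscription".
always_no = [0] * 100

# A model that finds 6 of the 10 subscribers and raises 5 false alarms.
finds_some = [1] * 6 + [0] * 4 + [1] * 5 + [0] * 85

for name, preds in [("always_no", always_no), ("finds_some", finds_some)]:
    print(name, round(accuracy(y_true, preds), 3), round(f1_score(y_true, preds), 3))
```

Both toy models score roughly 90% accuracy, yet only the second one identifies any subscribers, and F1-score is the metric that separates them.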
The project demonstrates a complete workflow, including data ingestion with CockroachDB, data exploration and feature engineering, model selection with hyperparameter tuning via MLflow, and deployment to the Microsoft Azure cloud using Docker containers and GitHub Actions for CI/CD. Additionally, a web application is built with PyWebIO and Flask for user interaction with the prediction model.
Data Storage & Retrieval
Data Preprocessing & Analysis
Model Selection & Model Training
Web Application Development
Model Deployment
CI/CD (Continuous Integration/Continuous Delivery)
HyperRez: An open-source Python library to make your blurry photos shine in just 2 lines of code!
Enhance and upscale your images effortlessly with HyperRez, a Python library crafted by Rauhan Ahmed Siddiqui. Leveraging the power of GFPGAN and Real-ESRGAN, state-of-the-art Generative Adversarial Networks, this library provides advanced image enhancement capabilities. Real-ESRGAN focuses on upscaling backgrounds, while GFPGAN excels in refining the faces of human subjects.
End-to-end, Python-based ML project focusing on forecasting multiple multivariate time series with production-grade deployment techniques.
This is a Python, ML-based API that achieves a Normalised Mean Squared Error of 0.039 and a Normalised Mean Absolute Error of 0.10 when forecasting multiple multivariate time series over a 15-day window. This POC aims to empower retailers with data-driven sales forecasting to optimize inventory management and improve profitability. This comprehensive solution predicts short-term sales (up to 30 days) with outstanding accuracy and performance metrics. By integrating seamlessly with existing workflows, it enables retailers to:
Want to grill a legend like Cristiano Ronaldo or Lionel Messi with questions? This project builds an AI that lets you chat with the 35 greatest footballers EVER! Ask anything about their careers, rivalries, or iconic moments, and this AI, powered by cutting-edge tech, will answer like a true football mastermind. Get ready to rewrite fan history!
This project delves into the world of football legends, offering a unique and interactive way to explore their careers and achievements. By leveraging cutting-edge AI techniques, it empowers users to ask insightful questions about any of the top 35 footballers of all time, as identified by a renowned source like The Guardian.
Here’s a breakdown of the system’s core elements:
In essence, this project offers a captivating example of how advanced AI techniques can be harnessed to create a truly engaging and informative experience. Football enthusiasts can now delve deeper into the stories of their favorite players through an interactive and intelligent interface, fostering a deeper appreciation for the sport’s rich history.
Led an end-to-end binary classification project using machine learning. Achieved more than 84% accuracy with CatBoost after data cleaning and analysis, Python integration with Astra DB (multi-cloud Cassandra), and hyperparameter tuning.
Designed, developed, and managed websites and landing pages, built complete e-commerce solutions with payment integration, and managed website infrastructure (security, hosting, domains).
Transformed raw data into insights using Python for cleaning, analysis (statistical tests), and visualization (Tableau dashboards, storyboards). Also presented findings to the mentor and supported a team of 10 interns.
Mentored aspiring computer science professionals in courses like Data Analytics, Database Management, Cloud Systems, and Predictive Modeling. Fostered a supportive learning environment to guide their development.
Collaboratively developed and deployed interactive virtual try-on systems using image segmentation and generative diffusion models. Also worked on Retrieval-Augmented Generation (RAG) systems utilizing advanced retrieval techniques.
B.Tech in Computer Science with a specialization in Data Science.
While the realm of data science encompasses various programming languages, Python has emerged as a clear frontrunner. Its claim to the “best” title, however, might raise eyebrows. This blog delves into the key strengths that solidify Python’s position as a powerful and versatile tool in the data scientist’s arsenal.
One of Python’s most significant advantages is its approachable syntax. Unlike languages with complex or cryptic structures, Python prioritizes clarity and readability. This characteristic makes it an ideal choice for beginners, allowing them to grasp concepts quickly and focus on the underlying logic rather than getting bogged down in intricate syntax rules.
The language utilizes English-like keywords and consistent indentation to convey the program’s flow, making the code intuitive and self-documenting. This not only simplifies the learning process but also enhances code maintainability for both the author and collaborators.
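As a small illustration of that readability, consider the sketch below. The function name `summarize_orders` and the sample data are made up for this example; the point is only that keywords such as `for`, `in`, `if`, `is`, `or`, and `continue` read almost like plain English, and that indentation alone marks which statements belong to the loop and to the condition.

```python
def summarize_orders(orders, minimum=0.0):
    """Add up valid order amounts and collect the names of skipped orders."""
    total = 0.0
    skipped = []
    for name, amount in orders:
        # Indentation shows that this check runs once per order.
        if amount is None or amount < minimum:
            skipped.append(name)
            continue
        total += amount
    return total, skipped


# Hypothetical sample data: one order has a missing amount.
orders = [("apples", 12.5), ("pears", None), ("plums", 7.5)]
total, skipped = summarize_orders(orders)
print(total, skipped)
```

Even without comments, a reader new to Python can usually follow what the loop does, which is the maintainability benefit described above.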
Python boasts a vast and ever-expanding collection of libraries and frameworks specifically designed for data science tasks. These pre-built tools provide a plethora of functionalities, enabling data scientists to efficiently handle various aspects of their workflow.
Here are some of the most prominent libraries that empower Python in data science:
These libraries not only expedite common data science tasks but also promote code reusability and consistency. Additionally, the open-source nature of these libraries fosters continuous development and community contributions, ensuring access to cutting-edge functionalities.
Python’s strength extends beyond data science. Its general-purpose nature makes it applicable to various domains, including:
This versatility offers several benefits:
Python enjoys a large and active community of developers, data scientists, and enthusiasts. This vibrant community translates into several advantages:
This strong community support system empowers individuals to learn, collaborate, and solve problems effectively, accelerating their growth and productivity in the field of data science.
While Python might not always be the champion in terms of raw execution speed, it offers a compelling balance between readability, maintainability, and performance for most data science tasks.
Several factors contribute to this balance:
Therefore, while Python might not be the absolute fastest language, its trade-off between efficiency and developer productivity makes it a compelling choice for many data science applications.
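One way to see this trade-off is to time a hand-written loop against a built-in that does the same job. The sketch below uses only the standard library; `loop_sum` is a hypothetical helper written for this comparison. In CPython the built-in `sum` is implemented in C, so it typically runs faster than the equivalent Python loop, although the exact timings depend on the machine and interpreter version.

```python
import timeit


def loop_sum(values):
    """Sum the values with an explicit Python-level loop."""
    total = 0
    for v in values:
        total += v
    return total


values = list(range(100_000))

# Both approaches must agree before the timings mean anything.
assert loop_sum(values) == sum(values)

# Run each approach 20 times and report the total elapsed seconds.
loop_time = timeit.timeit(lambda: loop_sum(values), number=20)
builtin_time = timeit.timeit(lambda: sum(values), number=20)

print(f"explicit loop: {loop_time:.4f} s")
print(f"built-in sum:  {builtin_time:.4f} s")
```

The same idea scales up: leaning on optimized built-ins and libraries keeps the code short and readable while recovering much of the speed that a pure Python loop gives up.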
Python’s reign in the realm of data science is not merely a fad. Its approachable nature, rich ecosystem of libraries, versatility, thriving community, and efficient balance of readability and performance make it an invaluable tool for data science.
Let's connect