Customer Behavior Modeling: Buy-til-you-Die Models

Predicting a Customer’s Lifetime Value, or how much they will spend over the next few years, is a very challenging problem. Some Data Scientists (me) have full-time jobs dedicated to solving this…

A Quick Way To Get On Top Of Your Company’s Data: Create a Spreadsheet-Based Data Dictionary!

If you’ve been on a data team for any amount of time, you’ve probably felt overwhelmed by the sheer number of data sources you have to keep track of. This is the sprawl of reports, tables, data…

Create A Quantum Bayesian Network

Bayesian networks are probabilistic models that represent knowledge about an uncertain domain, such as the survival of a passenger aboard the Titanic. Bayesian networks build on the same intuitions as…

Principles of GDPR and its impact on your Data Analytics Platform

The General Data Protection Regulation, or GDPR as it is known across the world, is enough to scare anyone collecting their customers’ personal data for their business. However, it is way beyond the…

Image Processing and Optimal Transport

What do a data scientist in computer vision and a child playing with a shovel at the beach have in common? Both transport mass at some point. For the child, this mass evidently is sand, but what…

Question Answering with Pretrained Transformers Using PyTorch

Question answering is a task in Natural Language Processing (NLP) to answer questions asked in natural language. We implement transformers for this task.

6 Development Habits for Increasing Your Cloud ML Productivity

My previous posts have been mostly technical, covering a range of topics on training in the cloud and advanced TensorFlow development. This post is different. You might consider it more of an opinion…

A Primer on Big Complex Systems

Space shuttles, telecommunication networks, and even new hip tech startups are home to big and complex systems. These systems employ thousands of software developers on different teams. Although big…

Extracting information from XML files into a Pandas dataframe

Real-world data is messy, and we know it. Not only does such data require a lot of cleaning, but the format in which we receive it is often not suited for analysis. This means that…

How to capture SQL queries generated by Power BI

Once upon a time, your scary old DBA walked into your office, red in the face, and asked you furiously: “What the hell have you done with your Power BI report?! It killed all of our workloads!!!”…