Posts

Featured Post

Curing writer's block with sunk cost fallacy

I paid $20 to renew this blog's domain in July. But the truth is, I had been suffering from writer's block ever since the start of this year and hadn’t posted a single thing. At one point, I was ready to give up on the blog altogether, but a voice in my head kept reminding me of all the time and money I’d already invested in this blog. So, this week, I sat down to write this imperfect, patchy article, about none other than that voice itself. Let me start with a classic scenario where you might have also encountered this voice. Suppose you’re at an Italian restaurant and have ordered some pasta and tiramisu. After finishing the pasta, you realize you’re full, and there’s no way your stomach can handle that delicious tiramisu sitting right in front of you. But then, that beautiful brain of yours reminds you that you’ll be paying for the tiramisu whether you eat it or not. In a desperate attempt to avoid wasting money, you reluctantly eat two quick bites. And just like that, my frien
Recent posts

What is SUTVA for A/B testing?

Imagine if person B’s blood pressure reading depended on whether person A receives the blood pressure medicine in a randomized controlled trial. This would violate the Stable Unit Treatment Value Assumption (SUTVA). SUTVA states that the treatment received by one individual should not influence the outcome we observe for another individual during the experiment. I know the initial example sounded absurd, so let me try again. Consider LinkedIn A/B testing a new ‘dislike’ reaction for its users, and the gods of fate chose you to be part of the initial treatment group that received this update. Excited after seeing this new update, you use this dislike reaction on my post and send a screenshot to a few of your connections asking them to do the same. Coincidentally, they are in the control group that did not receive the update. Your connections log in and engage with my posts to use this dislike reaction, but later get disappointed as this new update is not yet available to them. The offices of LinkedIn are tr

A causal inference problem faced by Uber

Many companies randomly assign a subset of their customers to treatment and control groups to run experiments that test discount strategies. They send discount coupons to all the customers in the treatment group via email and track the difference between the sales conversion rates of the treatment and control groups. On paper, this sounds straightforward. But in practice, the following four customer behaviors complicate this a great deal:
1. Defiers - This group will be negatively impacted upon receiving the coupon via email and will not make a purchase.
2. Always Takers - Despite being in the control group that did not receive the email, they will find a way to get their hands on the coupon and use the discount.
3. Never Takers - They would choose not to open the email and hence do not see the discount coupon. So they "choose" not to be treated.
4. Compliers - This group opens the emails and finds the discount coupon, i.e., they get treated.
Companies ideally want their

A practical piece of advice about building models

One of the most practical pieces of advice I recently learned about building models is counterintuitive. It suggests that we should not immediately jump into training models on the data. Instead, we should first try to create heuristic rules for the prediction problem at hand. For example, if we are trying to predict whether a customer will buy the latest edition of the iPhone, a simple heuristic rule would be that customers with an annual income greater than $80,000 and a history of purchasing Apple products have a higher probability of buying the new iPhone. You could write a simple SQL query to test such heuristic rules on your training and holdout sets and evaluate their effectiveness. This approach can sometimes help you create better features, identify inherent target leakage issues, and provide a baseline that your models should aim to beat.
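To make the idea concrete, here is a minimal Python sketch of scoring the iPhone rule on a holdout set. Everything in it is an assumption for illustration: the `Customer` fields, the five made-up holdout rows, and the choice of precision and recall as the metrics. The post itself suggests a SQL query; this is just one equivalent way to get the same baseline numbers.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical fields, invented for this illustration.
    annual_income_usd: float
    past_apple_purchases: int
    bought_new_iphone: bool  # the label we want to predict


def heuristic_predicts_purchase(customer, income_threshold=80_000):
    """Rule from the text: high income AND a history of Apple purchases."""
    return (
        customer.annual_income_usd > income_threshold
        and customer.past_apple_purchases > 0
    )


def evaluate_rule(customers):
    """Return precision and recall of the heuristic on labelled customers."""
    true_pos = false_pos = false_neg = 0
    for customer in customers:
        predicted = heuristic_predicts_purchase(customer)
        actual = customer.bought_new_iphone
        if predicted and actual:
            true_pos += 1
        elif predicted and not actual:
            false_pos += 1
        elif not predicted and actual:
            false_neg += 1
    flagged = true_pos + false_pos
    buyers = true_pos + false_neg
    return {
        "precision": true_pos / flagged if flagged else 0.0,
        "recall": true_pos / buyers if buyers else 0.0,
    }


# Made-up holdout rows, purely for demonstration.
holdout = [
    Customer(120_000, 3, True),
    Customer(95_000, 1, False),
    Customer(60_000, 2, True),
    Customer(85_000, 0, False),
    Customer(40_000, 0, False),
]

baseline = evaluate_rule(holdout)
print(baseline)
```

Whatever numbers the rule produces on your real holdout set become the bar a trained model has to clear. A rule that scores suspiciously well is also worth a second look, since that can be a hint of target leakage in one of the fields it uses.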

Solving Customer Churn with a hammer!

Learning when data should take a back seat and give way to domain knowledge is a valuable skill. Suppose you built a machine learning model on your customer data to predict churn risk. Now that you have a risk score for each customer, what do you do next? Do you filter the top n% based on the risk and send them a discount coupon in the hope that it will prevent churn? But what if price is not the factor driving churn for many of these customers? Customers might have been treated poorly by customer service, which drove them away from your company's product. Or an indirect competitor's product or service might have removed the need for your company's product altogether (this happened to companies like Blockbuster and Kodak in the past!). There could be a myriad of factors, but you get the point! Dashboards and models cannot guide any company's strategic actions directly. If companies try to use them without additional context, more often tha

The Gambler's fallacy

In a world riddled with conflicts and disagreements, we can all wholeheartedly agree that my articles going viral and the Bitcoin price seeing a 1000% increase are two events that are not only independent but also extremely unlikely. If I claim that these two events are dependent in an attempt to gain engagement from the large crypto community, does it not make me a conman? Or I could simply be a common man (or a conspiracy theorist) who mistakenly perceives independent events as somehow interconnected. Another group that commonly struggles with this issue is individuals with gambling addictions. Don’t we all have those friends (or in a few cases, we were those friends) who experienced consecutive losses in gambling but persisted because they believed their turn to win was imminent? It could be portrayed as a tale of remarkable persistence and unwavering determination when that friend miraculously wins a significant sum of money, potentially bankrupting the casino. However, there is o

Can you defeat Monty Hall to win a Batmobile?

You slipped after accidentally stepping on a banana peel and somehow fell into another dimension where people are in game shows all the time. As you dust yourself off and stand up, you realize you are in the 1960s version of the game show “Let’s Make a Deal.” The host of this show, the late Monty Hall, looks at you suspiciously at first but later presents three doors in front of you and asks you to choose one. You don’t trust strangers, so you demand to know what’s happening before you make your next move. Monty Hall patiently explains that there’s a brand new Batmobile behind one of the doors (yes, Batman is real in this dimension), and goats behind the other doors. You could own the Batmobile if you correctly guess the door behind which it is hidden. You pull your Batsuit out of your pocket to don the mask of the world’s greatest detective (as per DC Comics) and analyze the three doors with a careful gaze. You look meticulously for any minuscule details that might give away the