One of the most practical pieces of advice I recently learned about building models is counterintuitive.
It suggests that we should not immediately jump into training models on the data. Instead, we should first try to create heuristic rules for the prediction problem at hand.
For example, if we are trying to predict whether a customer will buy the latest edition of the iPhone or not, a simple heuristic rule would be that customers with an annual income greater than $80,000 USD and a history of purchasing Apple products would have a higher probability of buying the new iPhone.
You could write a simple SQL query to test out such heuristic rules on your training and holdout sets and evaluate their effectiveness.
This approach could sometimes help you create better features, identify inherent target leakage issues, and provide a baseline that you could aim to beat with the models.
Comments
Post a Comment