Wednesday, 6 June 2012

My big fat data-driven wedding: how to decide when to get married

There's all this hype about the importance of using data to support critical business decisions. But few things can be more important in someone's life than choosing the partner they plan to spend the rest of their lives with. So look no further for a data-driven approach to support this critical decision: when should you stop looking and finally get married?The formal setup of the problem is as follows: there are $$N$$ potential marriage partners out there, and I want to choose the one and only partner of my life. My goal of course is to settle with no-one but the very best partner to marry.I can 'date' each candidate sequentially in a random order. When I date someone, I can evaluate how she ranks in comparison to all previous partners I dated. Based on this, I can either go on and marry her in which case the process stops, or reject her and carry on dating more partners. Once a partner was rejected she cannot be called back again. The problem is hard, because it's assumed that I know nothing about a partner before I actually dated her. Some might immediately recognise the typical problem of exploration versus expoitation hidded in here. So when should I stop dating and marry someone?Let's say I want to maximise the probability of ending up with the best partner as wife. Firstly, there's no point in selecting someone who's not the best so far, because that also implies she cannot be the best overall. Also, the earlier you marry, the higher the chance of missing out on someone without even dating her. It turns out the optimal strategy is as follows: date the first $$k$$ candidates and reject them irrespective of their 'performance'. Then, keep on dating partners, and choose the first one, who ranks above everyone I dated before. The threshold $$k$$ depends on $$N$$, assymptotically it's about a third of all candidates ( $$k \approx \frac{1}{e}N$$ see slides for details).I gave a talk about this theory in our lab at Cambridge not long after getting married, and did a quick back-of-the-envelope calculation to see how close my marriage decision was to the optimal. The theory predicts based on my criteria I should've dated roughly 196,132 partners before getting married. Well, I leave it to you to guess whether I did that or not :) But at the end of the day I ended up making the best decision, so it doesn't matter.I got recently reminded of this problem, as I was thinking about making instant decisions in the PeerPerks platform, which as it turns out  can be modelled as a more complicated version of the optimal marriage problem, called probabilistic submodular multiple-choice secretary problem. In PeerPerks, we allow brands to offer freebies to influential users of social media. Our job is to make sure the promotion generates maximal success based on various indicators. In PeerPerks, every once in a while a user turns up on the website and authenticates using their twitter or facebook. At this point we have access to their social media profile, so it's like we've dated that person: we have a little insight into which topics she's interested in, what kind of people she's connected to, etc. Based on this instant evaluation our algorithm decides whether or not the person qualifies to receive a perk or not.