Estimating CTR does not seem like a difficult task: even if you have terabytes of logs with impressions and clicks, a small Hadoop cluster can cope with them without any problems. If the number of objects is not too large, for example hundreds of thousands of advertising campaigns, it is feasible to estimate CTR in real time. But the situation looks different if you need to estimate CTR in real time for hundreds of millions of objects, processing millions of events per second, and then use the results to evaluate tens of millions of candidates every second.
Dmitry will talk about how CTR estimation was implemented for the news feed at OK.ru, which technologies it was built on, and how they had to be adapted. He will also discuss the capabilities of streaming data analysis platforms in general, beyond CTR estimation.
Dmitry Bugaychenko, Odnoklassniki
Dmitry graduated from St. Petersburg State University in 2004 and received a PhD in formal logical methods in 2007. He spent almost 9 years in outsourcing without losing contact with the university and the research community. Big data analysis at Odnoklassniki gave Dmitry a unique chance to combine theoretical knowledge and a scientific foundation with the development of real, popular products. He gladly took that chance and joined the company in 2011.