
I spent a few cycles in media buying and later in sell-side ad tech. Say what you will about advertising and its effects on the web, but I will say this: it is a world of fascinating tech. As a buyer I experienced janky pacing all the time across various platforms, because this is a HARD problem. We had to manually adjust campaigns daily to keep pacing on track. It was common to stop a campaign and still overspend by hundreds of dollars while all of the caching spun down.

I'm fascinated to see they are running all of that on a single node. It's a massive amount of state aggregated from billions of events that needs to be served at extremely low latency, but couldn't it be partitioned somehow??? Google Fi/Spanner and BigTable have certainly been developed to support these issues. I've been trying to dig up what infrastructure powers Google AdX, but I haven't found anything. AdWords seems to be tied to Spanner, but AdX is/was an entirely different beast. In any case, I'm quite certain that it isn't running pacing on a single, gigantic node.



As an anecdotal data point: about two years ago I configured a test campaign on DoubleClick Bid Manager (now Google DV360) that I needed some quick exposure on. I set a budget cap of $100 just for safety and didn't do any targeting, so I was effectively bidding on half the world's ad inventory. What I didn't check or notice was that pacing wasn't set to even, but to Flight ASAP.

Suffice to say, I spent $730 within _seconds_, so fast that Google's systems couldn't switch off in time to prevent a 7.3x overspend. The only thing that saved stupid me from a five-digit spend was probably choosing an unusual ad size.
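
For anyone wondering how a hard cap can be blown through like that, here is a toy simulation. It is only a sketch under my own assumptions, not how DV360 actually works: the budget check sees spend a few ticks late (the hypothetical `feedback_lag`), and the function name, tick model and all numbers are made up for illustration.

```python
def simulate_flight(budget, spend_per_tick, feedback_lag, flight_ticks):
    """Total spend when the budget check sees spend `feedback_lag` ticks late."""
    cumulative = []  # actual cumulative spend after each tick
    spent = 0
    for tick in range(flight_ticks):
        # The cap only knows about spend that was reported `feedback_lag` ticks ago.
        seen_idx = tick - 1 - feedback_lag
        seen = cumulative[seen_idx] if seen_idx >= 0 else 0
        if seen >= budget:
            break  # the cap finally notices that the budget is gone
        spent += spend_per_tick
        cumulative.append(spent)
    return spent


BUDGET = 100        # dollars
FLIGHT_TICKS = 100  # hypothetical flight length, in ticks
LAG = 3             # hypothetical reporting delay, in ticks

# Even pacing: spread the budget over the whole flight.
even = simulate_flight(BUDGET, BUDGET // FLIGHT_TICKS, LAG, FLIGHT_TICKS)
# ASAP pacing: buy as fast as the inventory allows (here $50 per tick).
asap = simulate_flight(BUDGET, 50, LAG, FLIGHT_TICKS)
print(even, asap)  # prints "100 250"
```

With the same lag, even pacing lands on budget while ASAP overshoots, because the overspend is roughly the spend rate multiplied by the reporting delay.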

Fascinating stuff indeed :)


> It's a massive amount of state aggregated from billions of events that needs to be served at extremely low latency, but couldn't it be partitioned somehow???

The bidder/pacer state is not necessarily massive, and it certainly does not consist of all the gazillions of past events. Depending on the strategy/bidding model, it can range from a few MB to several GB, which can fit on a beefy node.

> Google Fi/Spanner and BigTable have certainly been developed to support these issues.

I doubt any external store can be used under such tight latency constraints (2-10 ms) and such high throughput (millions of RPS). Perhaps Aerospike, but even that is a stretch to put in the hot path. At this scale you're pretty much limited to keeping the state in memory and updating it asynchronously every few minutes or hours.
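
A minimal sketch of that pattern, assuming a single Python process: the hot path only does a dict lookup, while a background thread periodically loads a fresh snapshot and swaps it in. `PacerState`, `load_snapshot` and the campaign IDs are hypothetical names for illustration, not any real bidder's API.

```python
import threading


class PacerState:
    """Keeps bid multipliers in memory; a background thread swaps in fresh snapshots."""

    def __init__(self, load_snapshot, refresh_seconds):
        self._load_snapshot = load_snapshot  # hypothetical slow loader (DB, file, RPC)
        self._refresh_seconds = refresh_seconds
        self._snapshot = load_snapshot()     # blocking initial load at startup
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._refresh_loop, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _refresh_loop(self):
        # Event.wait returns False on timeout and True once stop() has been called.
        while not self._stop.wait(self._refresh_seconds):
            fresh = self._load_snapshot()
            # Rebind the reference in one step: readers see the old or the new
            # snapshot, never a half-updated one.
            self._snapshot = fresh

    def multiplier(self, campaign_id, default=0.0):
        # Hot path: a plain dict lookup, no network or disk I/O.
        return self._snapshot.get(campaign_id, default)


def load_snapshot():
    # Stand-in for the slow store; returns a brand-new dict on every call.
    return {"campaign-1": 0.8, "campaign-2": 1.0}


state = PacerState(load_snapshot, refresh_seconds=300)
state.start()
print(state.multiplier("campaign-1"))  # prints 0.8
state.stop()
```

The trade-off is staleness: between refreshes the bidder acts on old numbers, which is exactly where pacing overshoot comes from.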

Source: I also work in ad tech.


> Google Fi/Spanner

For anyone else confused: it's probably Google F1 and Spanner.



