Craig Silverman on Opening Google’s Black Box

Investigative journalist Craig Silverman was obsessed with something he thought few people would care about. It was advertisements on websites — those squares on the sides or bottom of the page, announcing car deals, “amazing getaways”, stunned doctors, and the like — that most people skim past. While people may not care about ads per se, ads fund the content they really care about, whether it is celeb news or sports scores.

“I have been obsessed with digital ads since about the end of 2016,” Silverman says. “I read a story in the New York Times about a digital ad fraud scheme that had supposedly stolen millions of dollars.”

He knew there was one big player in this market — a big gorilla called Google — that performed the task of hooking up websites offering pixels to brands looking to have their merchandise advertised. To a large extent, such ads fund the commercial internet; and Google, being the most dominant player, is the glue.

Silverman’s reporting career has focused on online misinformation. Before he became a professional journalist, he ran a blog focused on media inaccuracy called Regret The Error, which later became a book. Later, while at BuzzFeed News, he uncovered networks that pushed fake news over Facebook, some of which he traced to Macedonia and other countries. In fact, he is likely responsible for the term “fake news” becoming ubiquitous.

But this was different. This was bigger. It wasn’t just teens creating clickbait for seniors for a few bucks. This was, as he puts it, “fake traffic and fake audiences” that were stealing millions of dollars from real publications by spoofing them.

Digital ads run on a system that is so opaque that it took him years to educate himself on it.

“This is why there is so much fraud and so much theft in it, because it’s very hard to understand,” he says. “Even the brands who spend millions or billions of dollars every year are not looking very closely at how it all works.”

Since the spring of 2021 he had been at ProPublica, an outlet focused on investigative journalism. It is fairly atypical in giving its reporters the time and space to dig into complex investigations. After he discussed his interest in digital ads in his start-of-the-year “ideas” memo, he was contacted by Ruth Talbot, a data journalist also at ProPublica, who shared his obsession. By the start of 2022, they had formed a reporting team, along with computational journalist Jeff Kao (whose project on Jan. 6 videos had aided the select committee investigating the attack). They intended to dig into the opaque fortress that was the Google ad system.

In recent years, there has been an industry-wide move towards requiring transparency from sellers of digital ad space: the group that, to analogize to real life, rents out billboards around town to marketers. The non-profit consortium IAB Tech Lab leads the effort, promoting transparency standards so that brands can ensure their ads don’t show up on sleazy websites (those that feature violence or sexually explicit content, commit piracy, and so on) or on websites that get little traffic. One of the main tools of such transparency is a “sellers.json” file, where digital billboard sellers are supposed to present their full information under their real names.

Silverman’s group at ProPublica was surprised to find that despite Google’s prominent role in the consortium promoting transparency standards, its own sellers file had a high proportion of completely anonymous partners, identified merely by inscrutable strings of numbers. In fact, Google still permits its partners to sign up in “confidential” mode. The anonymity effectively creates an impenetrable zone that Silverman has called Google’s black box.
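For readers curious what such a file looks like in practice, here is a minimal Python sketch of how one might measure the share of confidential entries in a sellers.json file. It assumes the field names from the IAB Tech Lab sellers.json specification (`sellers`, `seller_id`, `seller_type`, `is_confidential`, `name`, `domain`). The sample records and the helper name `confidential_share` are invented for illustration; they are not taken from Google’s file or from ProPublica’s analysis.

```python
import json

# Invented sample in the shape described by the IAB Tech Lab sellers.json
# spec. The IDs, names, and domains are made up for this sketch.
SAMPLE = """
{
  "sellers": [
    {"seller_id": "pub-0000000000000001", "seller_type": "PUBLISHER",
     "name": "Example News Co", "domain": "example.com"},
    {"seller_id": "pub-0000000000000002", "seller_type": "PUBLISHER",
     "is_confidential": 1},
    {"seller_id": "pub-0000000000000003", "seller_type": "INTERMEDIARY",
     "is_confidential": 1}
  ]
}
"""


def confidential_share(sellers_json_text):
    """Return (confidential_count, total_count, share) for a sellers.json document."""
    sellers = json.loads(sellers_json_text).get("sellers", [])
    # An entry counts as confidential when is_confidential is 1;
    # a missing field is treated as 0 (not confidential).
    confidential = [s for s in sellers if s.get("is_confidential", 0) == 1]
    total = len(sellers)
    share = len(confidential) / total if total else 0.0
    return len(confidential), total, share


count, total, share = confidential_share(SAMPLE)
print(f"{count} of {total} sellers are confidential ({share:.0%})")
```

A confidential entry carries only an ID and a seller type, which is why the numbers on their own reveal nothing about who is behind them.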

Silverman’s group set out, as he puts it, to deanonymize Google sellers data.

The numbers themselves told no story at all; so they went about it in reverse: who, in the wider world of websites, was transacting with Google’s ad business? Doing so was no easy task. It was akin to starting with all citizens of a metropolis, then deducing who might shop at a certain grocery store.

But first, they had to have a list of the citizens of this digital metropolis: website domains. Through contacts in the industry, Silverman and his group came up with 7 million. Subsequently, Talbot wrote a “scraper”: a bot that could examine websites to see if they were sending real-time requests to the Google ad system as they loaded.

“It was in many ways much harder than I anticipated,” Talbot says by email. “Figuring out how to do that was a large amount of trial and error – going to pages running Google ads, inspecting them, comparing that to what I’d seen on other pages and slowly narrowing down the various kinds of ads that could be placed and how they got there.”
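The core check can be illustrated with a short Python sketch. This is not Talbot’s scraper, whose details the article does not describe. It assumes the URLs a page requested while loading have already been captured (for example, by a headless browser) and simply flags those whose hostname belongs to a Google ad-serving domain. The host list and the function names are assumptions chosen for illustration, and the list is far from complete.

```python
from urllib.parse import urlparse

# Illustrative, incomplete list of domains associated with Google ad serving.
# This is an assumption for the sketch, not the list ProPublica used.
GOOGLE_AD_HOSTS = ("doubleclick.net", "googlesyndication.com")


def is_google_ad_request(url):
    """True if the URL's hostname is one of the ad domains or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in GOOGLE_AD_HOSTS)


def google_ad_requests(request_urls):
    """Filter captured request URLs down to those sent to Google ad hosts."""
    return [u for u in request_urls if is_google_ad_request(u)]


# Hypothetical requests observed while one page loaded.
observed = [
    "https://example.com/index.html",
    "https://securepubads.g.doubleclick.net/gampad/ads?iu=/1234/example",
    "https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js",
    "https://notdoubleclick.net.example.org/pixel.gif",
]
hits = google_ad_requests(observed)
print(f"{len(hits)} of {len(observed)} requests went to Google ad hosts")
```

Matching on the parsed hostname rather than on a raw substring avoids false positives from look-alike domains such as the last entry in the sample. As Talbot’s account suggests, the hard part in practice is everything around this check: the many ways ads can be placed on a page.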

It was painstaking work that involved wading through floods of data. It took a year; along the way they had some help from contacts they had sought out in the industry, including a confidential source: a former Google leader who worked on trust and safety issues. 

The investigation discovered several ways in which Google’s ad system does not live up to its own rules of service, and several in which the ads it places can cause real-life harm.

Despite Google’s public eschewal of gun ads, they found that Google not only accepts gun ads but also directs them to highly inappropriate websites, including websites for kids, without the knowledge of the website owners. They found that under cover of anonymity, Google did business with a sanctioned Russian company that harvested user data. They found the Google ads platform funding porn, piracy, and disinformation.

Silverman was acutely aware that few people would care about something as opaque and as technical as digital ads; however, he was convinced of its importance.

“The challenge with an investigation like that,” he says, “is how do you actually not just find things that are worth reporting, but how do you also make people care? And we tried to do that by finding examples and by showing case studies that could make it come alive. And I still think of how we could have done that better.”
