The Questionable Analytics of Censorship
By Bill Franks, May 09, 2019
Historically, concerns about over-zealous censorship have focused on repressive governments. In the United States, free speech has been a pillar of our society since its founding. For the most part, government attempts at censorship or speech restrictions receive swift and successful pushback. In recent times, however, a new path to censorship has arisen in the form of search engine and social media companies that are building analytics-based censorship algorithms.
These organizations are using analytics to censor speech more aggressively than any past United States governmental effort and are somehow convincing a sizable portion of the population that it is a good thing. This post will outline why using analytics for centralized censorship is a slippery slope and also lay out an alternative that would let those same censorship analytics provide people with a choice rather than a dictate.
Where is the line?
Let’s assume for the sake of argument that we all agreed censorship was ethical and desirable (of course, we don’t all agree on that, but assume we do). Even under those terms, we still have to agree on exactly where to draw the line between what should be censored and what should not. Reaching such agreement would be as difficult as deciding to censor in the first place. But, for the sake of argument, let’s assume we could all magically agree on the exact same lines in the sand. Does that mean we’re ready to implement our censorship plan effectively? No!
Even after agreeing that we should censor information and agreeing on what to censor, we still have to build the analytical processes to flag the “bad” content. As we all know, no algorithm will be perfect. So, do we err on the side of censoring too much “legitimate” content to ensure we filter out all the “illegitimate” content? Or, do we make sure we allow all “legitimate” content through even though that will also let some “illegitimate” content sneak past? Once again, we’ll find it almost impossible to reach agreement.
No matter what analytics we agree to, the models will still make errors. Our censorship will never perfectly match our intentions, even if we agreed on those intentions. Inherently, therefore, using algorithms to censor information will lead to disparities between intent and outcome. Is this an effective or ethical use of analytics?
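The tradeoff described above is the familiar tension between false positives and false negatives in any classifier. A minimal sketch makes it concrete; the scores, labels, and thresholds here are entirely made up for illustration, not drawn from any real moderation system:

```python
# Hypothetical sketch of the censorship error tradeoff. A moderation
# model is assumed to output a score in [0, 1] estimating how likely
# content is "illegitimate"; anything at or above a chosen threshold
# gets censored.

def censor_counts(scored_posts, threshold):
    """Count legitimate posts wrongly censored (false positives) and
    illegitimate posts that slip through (false negatives)."""
    false_pos = sum(1 for score, bad in scored_posts
                    if score >= threshold and not bad)
    false_neg = sum(1 for score, bad in scored_posts
                    if score < threshold and bad)
    return false_pos, false_neg

# Made-up data: (model score, actually illegitimate?)
posts = [(0.9, True), (0.7, False), (0.6, True), (0.3, False), (0.2, True)]

# A middling threshold makes both kinds of error:
print(censor_counts(posts, 0.5))   # some legitimate posts censored,
                                   # some illegitimate posts allowed
# A very strict threshold lets more "illegitimate" content through;
# a very lenient one censors more "legitimate" content. No threshold
# eliminates both errors because the model's scores are imperfect.
```

Moving the threshold only trades one kind of error for the other, which is exactly why agreeing on where to set it is as contentious as agreeing on what to censor.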
The Reality Today
The concern today is that we have data science teams making up their own rules about what to censor and forcing us to accept it. The people drawing the lines in the sand are not representative of the general population and the people building the models won’t be any more successful than anyone else at effectively targeting the arbitrary lines drawn. This is a dangerous situation where unelected, anonymous people are deciding what information we see and who can speak.
This isn’t just an ideological issue, as some would suggest. Naturally, people will agree or disagree with the current censorship to varying degrees. But remember that even if you are comfortable with the decisions being made today because they align with your world view, totally different decisions might be made tomorrow when someone else is in charge. Once you accept the right of these organizations to censor, the tables can be turned on you at some point, even if today that is not the case.
Just think of the sticky situations we’ll get into based on the standards of today. If I post an April Fool’s article, do I risk being banned for spreading fake news? At what point is my view simply unpopular or contrarian and at what point is it “dangerous and illegitimate” and worthy of being censored, along with me also being completely banished? These are not decisions to be made lightly.
An Alternative Option to Centralized Censorship
Personally, I don’t believe in censorship. However, some people do. Why not give us all a choice to view information as we prefer? The same algorithms being built to censor information by force can be made available as options we can turn on or off, much like privacy settings. Let’s allow individuals to make the choice with regard to what they read, watch, or hear and what they don’t.
There can be various filters aimed at hate speech that differ based on how the user chooses to define hate speech and how strict the user wants the filter to be. There can also be filters that knock out political content of any type, for instance, if we just want a break from politics. When we want to catch up on politics, we can always turn the filter off. We can also have positive filters that elevate a topic we’re interested in. Perhaps a big sports event is coming up, so I turn on a filter that surfaces more content than usual about it.
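The user-controlled design described above can be sketched in a few lines. This is purely illustrative: the class names, fields, and topic labels are hypothetical, and a real system would use model scores rather than the toy values shown here. The key point is simply that the settings belong to the user, not to the platform:

```python
# Hypothetical sketch: user-controlled content filtering as an
# alternative to centralized censorship. All names and thresholds
# are illustrative assumptions, not any platform's actual API.
from dataclasses import dataclass, field

@dataclass
class FilterSettings:
    """Per-user preferences; the user, not the platform, sets these."""
    hide_hate_speech: bool = False
    hate_speech_strictness: float = 0.5  # threshold on a model score in [0, 1]
    hide_politics: bool = False
    boosted_topics: set = field(default_factory=set)

@dataclass
class Item:
    text: str
    topic: str         # e.g. "politics", "sports"
    hate_score: float  # hypothetical moderation-model output in [0, 1]

def apply_filters(item: Item, settings: FilterSettings) -> str:
    """Return 'hide', 'boost', or 'show' based on the user's own settings."""
    if settings.hide_hate_speech and item.hate_score >= settings.hate_speech_strictness:
        return "hide"
    if settings.hide_politics and item.topic == "politics":
        return "hide"
    if item.topic in settings.boosted_topics:
        return "boost"
    return "show"
```

For example, a user taking a break from politics before a big game might set `hide_politics=True` and `boosted_topics={"sports"}`; the same models the platform would otherwise apply by force now simply execute that user's preferences.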
Analytics can be used to filter any type of information in or out. We can make those analytics available for people to choose from instead of having faceless workers in Silicon Valley forcing their choices and their models on us all.
If we aren’t careful, we’ll soon slip into an Orwellian world of extreme censorship and suppression of information. Notably, the greatest risk today isn’t from the government but from the private corporations that control the flow of information. This is one example where analytics are being used in ways that could lead to disaster if we don’t have a broader conversation as a society about how we should proceed.
As outlined above, I’d love to see individuals empowered to make their own choices. Give us the ability to censor (or not) as we each see fit. There is no reason that the analytics of censorship can’t be steered in this direction of choice and away from the current dictatorial trajectory.
About the author
Bill Franks is IIA’s Chief Analytics Officer, where he provides perspective on trends in the analytics and big data space and helps clients understand how IIA can support their efforts and improve analytics performance. His focus is on translating complex analytics into terms that business users can understand and working with organizations to implement their analytics effectively. His work has spanned many industries for companies ranging from Fortune 100 companies to small non-profits.
Franks is the author of the book Taming The Big Data Tidal Wave (John Wiley & Sons, Inc., April, 2012). In the book, he applies his two decades of experience working with clients on large-scale analytics initiatives to outline what it takes to succeed in today’s world of big data and analytics. Franks’ second book The Analytics Revolution (John Wiley & Sons, Inc., September, 2014) lays out how to move beyond using analytics to find important insights in data (both big and small) and into operationalizing those insights at scale to truly impact a business. He is an active speaker who has presented at dozens of events in recent years. His blog, Analytics Matters, addresses the transformation required to make analytics a core component of business decisions.
Franks earned a Bachelor’s degree in Applied Statistics from Virginia Tech and a Master’s degree in Applied Statistics from North Carolina State University. More information is available at www.bill-franks.com.