Week 6: Flooded With Data

18 Sep 2018

Co-Author Anthony Morrell

Cathy speaks on the quality of algorithmic scoring systems, cutting the big data argument to its core. While everyone else seems to be talking about how frighteningly effective or how easily misused big data can be, Cathy's experience is vastly different: big data being misapplied, producing incorrect results, and perpetuating problems.

If you're blindly following the data, or if your algorithm's writers have any bias whatsoever, your data enforces the status quo. It strengthens stereotypes and prevents progress. Predictions create reality, because they determine where you look and how.
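That feedback loop can be made concrete with a toy simulation (all numbers here are hypothetical, chosen only to illustrate the mechanism): two neighborhoods have identical true incident rates, but a biased historical record decides where enforcement goes, and incidents are only recorded where someone is looking. The prediction manufactures its own confirmation.

```python
import random

random.seed(42)

TRUE_RATE = 0.10  # actual incident rate, identical in both neighborhoods


def simulate(rounds=20, patrols=100):
    # Biased prior: neighborhood A is over-represented in historical
    # records, so the model "expects" more incidents there.
    recorded = {"A": 30, "B": 10}
    for _ in range(rounds):
        total = sum(recorded.values())
        for hood in recorded:
            # Patrols are allocated in proportion to past records...
            share = recorded[hood] / total
            # ...and incidents are only recorded where patrols look,
            # even though the underlying rate is the same everywhere.
            observed = sum(random.random() < TRUE_RATE
                           for _ in range(int(patrols * share)))
            recorded[hood] += observed
    return recorded


print(simulate())
```

Despite equal true rates, neighborhood A's record stays inflated round after round: the data "proves" the model right because the model chose where the data came from.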

The biggest issue here is that without sufficiently educated oversight, black boxes will produce results that are at best accidentally incorrect… and at worst actively spitting out falsehoods in order to profit or destabilize.

However, while the largely misplaced trust put into algorithmic systems and big data is a major issue, it's not the only one. Big data is also about how that data is collected, where it comes from, how informed individuals are about what data is being collected, and how it can be used.

The scariest thing about big data is not when it's wrong, but when it's right. When it becomes so detailed and accurate that tracking the habits, locations, and lives of any individual you want becomes not just feasible but trivial. This is the future of big data. Secret algorithms are scary… but a spreadsheet with every detail of your life, from where you live to where you are right now, where you will be in 20 minutes, and all your banking and identifying data; details that are only as secure as the companies that collate, handle, and sell them? That's terrifying.

And Cathy is right: this data is being used to determine who gets or keeps a job, where laws get enforced, how much the system expects a person to make, and how likely they are to get in an accident. And it will only get more detailed. What about simply tracking family medical history rather than asking individuals to self-report it? Suddenly you have a case where someone might be denied healthcare, a job, or a lease due to information that is predictive but not definitive; and even if it were definitive, it's still discrimination based on factors outside their control.
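"Predictive but not definitive" is worth putting numbers on. A quick base-rate sketch (all figures hypothetical) shows why: even a risk model that looks "90% accurate" flags far more people who will never have the predicted outcome than people who will, whenever the outcome itself is rare.

```python
# Hypothetical numbers, chosen only to illustrate the base-rate effect.
population = 100_000
base_rate = 0.02           # 2% of people actually have the outcome
sensitivity = 0.90         # model catches 90% of true cases
false_positive_rate = 0.10 # model wrongly flags 10% of everyone else

true_cases = population * base_rate                     # 2,000 people
true_flags = true_cases * sensitivity                   # 1,800 caught
false_flags = (population - true_cases) * false_positive_rate  # 9,800 wrongly flagged

precision = true_flags / (true_flags + false_flags)
print(f"flagged: {true_flags + false_flags:.0f}, "
      f"of whom only {precision:.0%} actually have the outcome")
```

Under these assumptions, roughly five out of every six people the model flags are false positives, and every one of them could be the person denied the lease or the job.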

But the ideal of a well-applied predictive system, one with all the information, used to offer people the best suggestions, save their careers, and help them… that ideal is as impossible to let go of as any other dream that technology empowers.

The most important question is simply how best to apply big data, and where it should be restricted, because as a tool it is undeniably powerful and extremely useful. As Cathy herself says, she'd love to see how the justice system would change if algorithmic principles were applied to the reform side of law. When even such a strong ethical skeptic still wants to go forward, it seems like a good place to be.