Brooke: Hello everyone, and welcome to the podcast of The Decision Lab, a socially conscious applied research firm that uses behavioral science to improve outcomes for all of society. My name is Brooke Struck, research director at TDL, and I’ll be your host for the discussion. My guest today is Jordan Ellenberg, Professor of Mathematics at the University of Wisconsin at Madison, author of the 2014 book ‘How Not to Be Wrong’, and author of the recently published ‘Shape: The Hidden Geometry of Information, Biology, Strategy, Democracy, and Everything Else’. In today’s episode, we’ll be talking about a great topic from his latest book: humans, algorithms, and hybrid decision systems. Jordan, thanks for joining us.
Jordan: Thanks for having me on.
Brooke: Before we get started, please tell us a little bit about yourself and about this latest book, Shape.
Jordan: So I am a mathematician. I’m a math professor, and I am a geometer, in case you were like, “What’s it called when somebody does geometry?” Although what that means is probably very different from what a lot of people would imagine, and that’s one of my motivations in writing this book. Geometry is the theme of the book. That’s why it’s called Shape. And it’s not about triangles. Okay, there are some triangles in it. But it’s not mostly about triangles, because I feel like almost everything that’s mathematical or logical or quantitative or that has to do with optimization or searching or improving is geometric in some way. It’s wended its way into everything. So I found, actually, that there was too much to talk about rather than too little, and so in the book, I talk a lot about developments for artificial intelligence and machine learning. I talk about the very vexed problem of congressional redistricting, a very hot political issue in the United States. I talk about tournament level checkers. I talk about the spread of pandemics. I didn’t expect to be writing about that, but I suddenly, for some reason last year found myself interested in this topic, which is also a deeply geometric one. And then the occasional triangle. There’s a triangle or two.
Brooke: Yeah, and if I can just add my voice to the redeeming qualities of geometry, because of course it needs it, right? After everyone’s high school experience with geometry, it needs rehabilitation in terms of its image. If you’re not excited enough about geometry, if you’re excited about bitcoin or cryptocurrencies, all of the calculations that go into processing those transactions are done using GPUs, which are all about geometry. So there you go.
Jordan: Maybe you’ll explain that to me at some point and the relation to crypto, which I don’t touch in the book at all, but I would love to know more about it. You brought up your high school experience. What was your high school experience with geometry? Can I interview you for a second?
Brooke: Yeah, totally. I loved geometry and trigonometry. I was all over that. I was really a math kid.
Jordan: Because one thing I say in the book is that geometry is the cilantro of math. People are not neutral about it. There’s people who are like, “This is my top subject. Everything else was like, ‘Why am I doing this?’ And then geometry made sense.” And there’s other people who are like, “Everything made sense except geometry. Why did I do that? I liked computing stuff and doing calculus. But geometry, why were we doing that?”
And actually, I found that the people who feel one way often don’t even know that there’s half of the world that feels the other way. But for people who like it or for people who hate it, it’s clearly very different. It’s clearly very different from everything else we do in the high school curriculum, and it clearly sort of has its own mathematical vibe, which is one reason I thought it would be a good topic for a whole book.
Different Problems Need Different Solutions
Brooke: Okay, so let’s dive into the book a little bit. There are two types of problems that you treat there that from a decision science angle strike me as somewhat different, so let me start pulling those apart. One type of problem is one where there is a correct answer, and the second type of problem is one where there’s an appropriate answer. We might expect that for both of these kinds of questions, we should hope for a wide or high degree of agreement among people if you were to ask them what the answer is, probably higher degrees of agreement amongst specialists but ultimately, one has an objectively correct answer, and the other does not. If we think about two examples of these, playing ‘Go’, for instance, we can measure objectively which algorithm is better at playing Go than another because there’s a very clear set of rules that dictates who wins and who doesn’t, and which moves are allowed and which are not. If we think about redistricting, that’s a very, very different type of problem. There are many, many different solutions to that. And while we can perhaps stipulate some kind of condition for ranking those solutions, whatever kind of explicit rule we would want to give to that ranking is going to be extremely contentious. There’s not going to be one agreed upon, fully explicit rule about which districting solutions are better than others. Can you walk us through a little bit about that and how algorithms and geometry fit into those two different types of problems?
Jordan: Sure. And maybe I’ll start from the second part of the question, because I think this question of redistricting, which just to give the three-second sum up, is the way we divide a political region, like a US state, into smaller regions, each one of which has a representative. It might seem like, “Oh sure, just draw some boundary lines.” Well, it turns out how you draw them has a huge effect on who ends up sitting in the legislature, who ends up making the laws, among them deciding what the next map looks like. So you see the feedback loop problem pretty immediately. It’s very easy for people from my community of mathematicians to see that as a math problem. And I think one of the big challenges, a challenge I like to think that people have met, is really realizing that not every problem that has math in it is merely a math problem. Just as you say, if you go into this with the idea of like, “What is the right answer to how the districts should look?” I’m not even sure that question has an answer, but if it does, it certainly doesn’t have a purely mathematical answer, right? There’s an intertwining between the mathematics, the law, the philosophy if you like, the ethics, and the politics, all of which matter. So I think when people try to approach this purely mathematically, you just get stuff that’s kind of useless. That being said, also, if you try to approach it entirely non-mathematically, you also get things that don’t work. And I think that’s more traditional in politics, to think of this as a problem you can solve purely by political or legal reasoning. I think that’s in the end going to be just as unsatisfactory as trying to treat it as a math textbook exercise.
Now, when you talk about something like Go, just as you say, there’s much clearer metrics. We know there’s an absolutely agreed upon criterion for who has won and who has lost. But I think it’s also worth asking yourself, what is the actual goal? Is it to be the best at winning? I mean, that’s one thing you could mean by it, but it’s not the only thing you could mean by it. And so in the book, I talk some about Go, but I talk even more about checkers, which is farther along that path.
Checkers is a game where not only do machines outplay the best humans, but the game is solved. In other words, we literally know, as a mathematical theorem, that with two perfect checkers players, there’s no way for the first player to win, and there’s no way for the second player to win. It’s like tic-tac-toe. Two players who don’t screw up will always draw. And then there’s an interesting question. You might say, “Okay, it’s done. There’s literally no point in playing checkers.”
Well, you could say that if you wanted to, but in fact, people still do play checkers. There’s still a human world championship with checkers that people vie for. And I think you could approach that two ways. You could say, “Well, those people are just wasting their time because the game is solved.” Or you could say, “Wait, maybe there’s something to it that is not captured by this mathematical solution.” Maybe it’s a bias, but I tend to be in favor of, if people seem to be doing something, maybe my first guess should be there’s some worthwhile reason they’re doing it before declaring it an empty enterprise.
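The tic-tac-toe comparison above can be checked directly, because that game is small enough to search exhaustively. What follows is a minimal sketch, not the method used to solve checkers (that proof involved a vastly larger search and is not reproduced here). The board encoding and the names `winner` and `game_value` are assumptions made for illustration.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board stored as a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def game_value(board, player):
    """Value under perfect play: +1 if X wins, -1 if O wins, 0 for a draw.

    `player` is the side to move. X picks the move with the highest
    value, O picks the move with the lowest (plain minimax).
    """
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0  # board full, nobody won
    other = "O" if player == "X" else "X"
    values = [game_value(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == "."]
    return max(values) if player == "X" else min(values)

# Value of the empty board with X to move.
perfect_play_value = game_value("." * 9, "X")
print(perfect_play_value)
```

A value of 0 for the empty board is what "solved, and a draw" means: neither side can force a win if the other plays perfectly. The same idea does not scale to Go by brute force, which is the contrast the conversation turns to next.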
AI & Algorithms
Brooke: Yeah, from an AI perspective, the way that you create and train an algorithm to play checkers is diametrically opposed to the way that you train an algorithm to play Go. Can you say a little bit about that and the size of the possibility space?
Jordan: Well, they’re problems that are hugely different in size. And it’s interesting that you say that, because I think one thing is that because checkers is such an astronomically smaller problem than Go, checkers can be solved by a previous generation of techniques. Now, if that had not been the case, like let’s say if you were to try to design a checkers engine today, I wouldn’t be surprised if you did try to use the same kind of AlphaGo type methodology that was used to solve Go.
Now, because the game is solved, you would know, provably you can’t do any better than the… Actually, it’s a question I never thought about, and it’s a pretty interesting one. There exists an absolutely perfect, not approximately perfect, checkers player. Given what we know about Go and chess, it would be surprising if training on the same kind of protocols we used for Go and chess didn’t produce a very, very good checkers player that would beat top human players. Would it generate an actually perfect, a provably perfect checkers player? I’m not sure, actually. I’m not sure what I think. What do you think? Because they’re not designed for that, right?
Brooke: So let’s just back up on that and talk a little bit about how these algorithms are trained differently in these two circumstances. In the case of checkers, because the set of finite moves is so much smaller at each stage, it’s actually computationally possible to compute all of the possible end games for each move. And that’s what makes it possible to finish checkers, to solve checkers. When it comes to Go, Jordan, I’ll ask you to chime in here. What’s the number of possible permutations that there can be?
Jordan: Oh, I’m sure that number has not been computed. I suppose we could probably get a lower bound for it, but it’s a number so big there’s no point in stating it. Does that make sense?
Brooke: Right. So more or fewer permutations than there are stars in the universe?
Jordan: Oh, way more. I think even checkers has more than that. Come on. That’s a – Stars are very large, and there’s not that much room for them.
Brooke: (laughter) Sorry, I apologize for not thinking bigger. But anyway, what that means is that for checkers, you start from this kind of computational approach where you say, “I’m going to calculate forward based on all of the possibilities.” And because that realm of possibilities is small enough that a computer can manage it, you can actually make progress with that approach. When it comes to Go, because the number of possibilities is so absolutely massive, the way that the current generation of Go algorithms were programmed and were trained was by exposing them to, I guess it’s got to be hundreds of thousands of games of Go. So you look at the actual sequence of moves from real games that were played by real, individual humans, and from there, the algorithm kind of extrapolates what are the strategies that are most successful. Now, that’s a very simplified approach, or a simplified description of the approach, but that’s a really different kind of computational problem than extrapolating forward based on logical possibilities.
Jordan: Yeah, and actually, one thing I write about in the book, and actually, this comes up actually in talking about pandemic modeling, and not around talking about checkers, but you’re drawing a connection that maybe I didn’t quite see, that there is a deep question, if you’re trying to design an algorithm for figuring out what to do or for predicting the future, which are very interrelated problems, as you know. You can do that by trying to incorporate information you know about the structure of the problem. So if you are designing a checkers engine, Jonathan Schaeffer, who designed Chinook, the computer champion of checkers, I promise you that guy knows a lot about checkers. He brings to it a lot of knowledge. Now, in some sense, the modern Go engines, you might say they don’t start with any prior knowledge about Go at all. They’d have to know the rules, of course. They have to know what you’re allowed to do and what counts as winning and what counts as losing, but that’s all. They don’t have to have any prior information built in as to what’s a good strategy or something like that.
They have this database of existing human games, and then of course they also have self-generated databases where they play against themselves many millions of times, and that can generate an even bigger database, different versions of itself testing itself against itself and honing its strategy. So those are actually two really different paradigms. Are you trying to understand the structure of the process and use that to predict, or are you just looking at previous results and trying to statistically interpolate from what you’ve observed?
I call the first thing reverse engineering and the second thing curve fitting. I don’t know why I say I call them. That’s what they’re called. (laughter) [inaudible 00:15:26] I set up in the book explicitly that tension between those two ways of analysis. And I think you see it most vividly, to me, in natural language processing, where if you’re trying to get a computer to do a language task, traditionally, any spoken language has lots of structure. Sentences are made of nouns and verbs that chunk together in predictable ways. We learn how to diagram a sentence. Did you do that when you were a kid? It was a very generational thing.
But modern language tools really don’t do that. They really just take these giant corpora of human language that humans have produced, and by looking at statistical regularities, try to figure out what word is most likely to come next after the previous five words. And they do it without anybody telling them what a noun is or what a verb is. Those are pretty different approaches.
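The "what word comes next" idea can be shown with a plain count table. This is a deliberately tiny sketch: modern language tools are far more sophisticated than a lookup of counts, the corpus below is made up, and the names `train` and `predict_next` are hypothetical. The context here is two words rather than five, just to keep the toy corpus small.

```python
from collections import Counter, defaultdict

def train(tokens, context_size=2):
    """Count which word follows each run of `context_size` words."""
    table = defaultdict(Counter)
    for i in range(context_size, len(tokens)):
        context = tuple(tokens[i - context_size:i])
        table[context][tokens[i]] += 1
    return table

def predict_next(table, context):
    """Return the most frequent continuation, or None if the context is unseen."""
    counts = table.get(tuple(context))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Invented toy corpus; nothing tells the model what a noun or a verb is.
corpus = ("the cat sat on the mat . "
          "the cat sat on the sofa . "
          "the cat sat on the mat .").split()

model = train(corpus)
print(predict_next(model, ["on", "the"]))
print(predict_next(model, ["purple", "dog"]))
```

The table never sees a grammar rule. It only records which words tended to follow which, which is the curve-fitting side of the contrast described above.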
Brooke: Yeah, for sure. And I want to tap in now to this idea of the interactions between humans and machines. If we think about Go, for instance, when AlphaGo was being trained, it was just looking at huge, huge numbers of games, initially games that were played by real human beings, then playing against itself and testing out strategies and that kind of thing. But the basis on which it was built was a set of human games of Go. And there’s this fascinating moment when AlphaGo was playing against the human world champion when there’s this move that AlphaGo makes that comes completely out of left field. And the world of Go, as you can imagine, sitting on the edge of its seat, trying to follow this great intrigue, was shocked by this one move that was just so revolutionary in the way that it suggests a completely different interpretation of the game.
If anyone here is a big fan of soccer or football or some human sport, the idea that there could be millions of people around the world sitting on the edge of their seat, waiting to see what happens in a game of Go may seem a little bit strange, but just take it on faith that that actually happened, and that there was this kind of collective gasp around the world as people went, “Wow. Something amazing just happened.”
And that’s a really cool phenomenon to emerge from something that fundamentally was only trained on human data, or started out only trained on human data. But from there, it managed to produce something that changed humans’ interpretation. If some human had made that move 100 years ago and that collective gasp had happened, then the subsequent 100 years of Go might have played out differently, which means that the inputs on which AlphaGo would have been trained would have been different. AlphaGo would have been a different algorithm if that had been done by a human 100 years prior. So these feedback loops are really, really fascinating.
Jordan: And actually, here’s an empirical question I don’t know the answer to, which your question raises. I mean, there are different Go communities that play with different styles, the same with chess or with any game. I don’t know if you train an engine on one community of play versus a different community of play, do they tend to converge on each other and arrive at roughly similar strategies, or would you see perhaps both very good but rather different in style engines trained on the two different corpora? Certainly we know in language, that’s very true. Different corpora, you train and you just get different kinds of text generation engines. And I actually don’t know if that kind of experiment has been done in game playing of this kind.
Can Machine-Printed Artwork Be Better?
Brooke: Yeah, I don’t know either, but I think about something like… There was an AI algorithm that was trained essentially to forge Rembrandts. I don’t know if you followed this. They basically did 3D scans of Rembrandt paintings and tried to create an algorithm that could build a new Rembrandt. Not just copy an existing work, but create a new work that would be essentially convincing enough to fool an expert.
And in fact, when they had some world renowned Rembrandt expert look at what was produced with a 3D printer so that it would have depth to the paint, which I found super cool, the master looked at it and said, “Oh, that’s very interesting. This looks like a Rembrandt from this period and this kind of thing, but I’ve never seen this one before.” And so indeed it was successful. But one of the things that the machine didn’t do, which a lot of artists do, is look back on their past works and say, “Well, this is total crap. I’m going to do something absolutely different.”
It reminds me of this great quote that, “Mediocre artists borrow, real masters steal.” You have to appropriate it for yourself. It has to become your own. And that’s something that you don’t see in the Rembrandt case, and that’s something that in natural language processing, from what I’ve seen of computer-generated story writing, it’s still in its infancy. I don’t see anything there where the computer has managed to become the master that steals and appropriates style for itself. But in the case of Go, it seems like we are kind of on that frontier, when that big collective gasp moment happened, that was the Go engine kind of jumping over Rembrandt.
Jordan: But here’s the thing, Brooke. Remember, who’s gasping in that scenario? It’s not the machine. It’s us. That feeling of surprise, that feeling that something is truly new here, that’s our feeling. So it’s not so clear to me. After all, in the case of Rembrandt, you’re saying the computer is doing what you might call interpolation, filling in between existing examples and making something that you might call, to diminish the whole process of painting in a way, the average Rembrandt, something that’s, as we would say in math, in the convex hull of a bunch of existing Rembrandts or something like that.
And so in that case, the master looks and says, “Yeah, this is something that fits into some existing cloud of Rembrandts.” But what the machine is doing is perhaps not that different, and so you might say that, “Oh, maybe what’s actually the case is that that startling move that everyone gasped at, maybe that move is in fact an interpolation that sits nicely inside the cloud of existing human moves, and we don’t happen to have perceived that.”
The Electoral Map and Gerrymandering
Brooke: Yeah, that’s interesting. I would ask also whether master artists, if I could ask Rembrandt, “Were you ever surprised by the things that you came up with?” Was Rembrandt a humble person who would say, “Yes, of course. This was very surprising to me. I never expected it to be such a hit.” Or would he be some arrogant SOB who was like, “Oh, this is just hogwash, and everyone’s impressed only because they are so feeble.”
But shifting now from moments when AI watches us and learns from us, to us watching AI and learning from it, let’s talk now about the gerrymandering example. In the book, you talked a lot about essentially teams of political analysts sitting down and carving up maps to make very advantageous ridings, as we would call them in Canada, or districts, as you would call them in the States, to essentially create a nice kind of buffer against adverse winds in voting patterns.
So if everyone’s leaning towards your party, then everything’s going to go well for you, but the winds have to blow quite strongly against you in order for it to start having a real effect. Tell us a little bit about the process of gerrymandering and how AI got brought into that sphere.
Jordan: Yeah, absolutely. By the way, it doesn’t happen in Canada because there were reforms put in to keep it from happening, but it certainly used to. In fact, in the book I include a poem about it from The Toronto Globe from the 19th century. It was a well-known enough practice that there was a poem called ‘Hive the Grits’. Are Liberals still called Grits in Canada today, or is that an outdated term?
Brooke: Yes, but only when we wear our long white nylons up to the knee and our black leather shoes. (laughter)
Jordan: (laughter) But yeah, “Hive the Grits, pack them all into hives so that their legislative power is reduced”, is how it used to be done in Canada, just like it’s currently done in the United States, before reforms were brought in. But yes, absolutely. This is something that has been, on and off, a very hot topic in American politics since before there was America, really. Since colonial times.
But it’s especially hot now partly because it’s grown more mathematical, both on the sides of the people who are doing it and on the side of the people who are trying to reform it away. So it’s funny, I’m not sure I would say it’s AI per se that’s involved, but something of the same flavor, that what you’re doing when you try to write a good Go engine is you’re trying to explore the space of all possible strategies for playing Go and try to find the best one, best by some metric.
As you say, in that case, the metric is very objective. What you need to do to understand gerrymandering is to somehow understand the space of all possible ways to split a state up into districts. And that, like the space of Go strategies, is just an insanely, unmanageably, uncomputably huge space. That’s what the two problems have in common, that and the fact that you’re trying to explore that space in order to find, if you’re the gerrymanderer, if you’re the person trying to get an advantage for your party, you’re trying to explore that space to maybe find the map that is best for your purposes, that’s best for winning.
So in that sense, it’s quite close. If you’re the reformer, you’re doing something rather different. You’re exploring the space at random, doing what’s called a random walk, which is the fundamental geometric idea that runs through the book in many, many, many different contexts, in order to say, “Hey look, I got my computer, who doesn’t care which party wins, to generate 20,000 different maps.” And in some of them, this party did a little better, and in some of them, this party did a little worse, but here’s the range between this many seats and this many seats is usually what happened. Like everything else in probability, you get a nice kind of bell shaped curve. So here’s what happened when we drew the districts without the aim of partisan advantage, and here’s the districts we actually have. They’re way over here. Of course we’re on SoundCloud, so you can’t see the extravagant gesture I’m making with my hand, but just imagine I’m pointing way over to the edge of the bell curve. And by usual statistical methods, saying, “Okay, this did not happen by chance. These districts were picked explicitly to give a huge mega advantage to one party.” And that’s what we can do with the geometric methods.
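The ensemble idea described above can be sketched on a toy state. Everything here is an assumption made for illustration: the state is a single row of 24 equal-population precincts (real work uses two-dimensional maps and more careful sampling), the vote shares are invented, and the function names are hypothetical. The walk moves one district boundary by one precinct at a time and rejects moves that make a district too small or too large.

```python
import random

def seats_won(votes_a, cuts):
    """Count districts where party A has more than half the vote.

    votes_a[i] is A's vote out of 100 in precinct i; `cuts` lists the
    precinct indices where a new district begins.
    """
    bounds = [0] + list(cuts) + [len(votes_a)]
    return sum(1 for lo, hi in zip(bounds, bounds[1:])
               if 2 * sum(votes_a[lo:hi]) > 100 * (hi - lo))

def is_valid(cuts, n, min_size, max_size):
    """Every district must contain between min_size and max_size precincts."""
    bounds = [0] + list(cuts) + [n]
    return all(min_size <= hi - lo <= max_size
               for lo, hi in zip(bounds, bounds[1:]))

def random_walk_ensemble(votes_a, start_cuts, steps, min_size, max_size, seed=0):
    """Random walk over district plans; record A's seat count at each step."""
    rng = random.Random(seed)
    cuts, n, samples = list(start_cuts), len(votes_a), []
    for _ in range(steps):
        proposal = cuts.copy()
        proposal[rng.randrange(len(cuts))] += rng.choice((-1, 1))
        if is_valid(proposal, n, min_size, max_size):
            cuts = proposal  # otherwise stay where we are
        samples.append(seats_won(votes_a, cuts))
    return samples

def share_at_most(samples, enacted_seats):
    """Fraction of sampled maps giving A no more seats than the enacted map."""
    return sum(1 for s in samples if s <= enacted_seats) / len(samples)

# Invented data: a 50/50 state with party A concentrated in six "city" precincts.
votes = [80] * 6 + [40] * 18
enacted = [6, 12, 18]
samples = random_walk_ensemble(votes, enacted, steps=2000, min_size=4, max_size=8, seed=1)
print(sorted(set(samples)), share_at_most(samples, seats_won(votes, enacted)))
```

If the enacted map sits far out in the tail of the sampled seat counts, that is the "way over here" gesture from the conversation. In this toy the walk's starting point is the enacted plan itself, so a real analysis would also need enough steps for the walk to forget where it began.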
Brooke: Yeah, I really enjoyed that chapter, if for no other reason than stylistically there were so many bait-and-switch moments, where it was like, “Well, what if we tried this metric?” And as I’m reading it, I’m like, oh yeah, that metric sounds like it’s really going to be good. And then I read on a few pages it’s like, “Here are all the reasons that metric is terrible.” Okay, well, I’ve been let down. But wait, there’s this other metric. I’m like, “Oh, okay, cool. Here’s the one that’s going to be great.” I read on a few pages, and it turns out that one’s terrible as well. So I don’t want to ruin it for anyone who’s going to sit down and read the book, so spoiler alert, the metric that you come to is about vote efficiency. Can you talk a little bit about vote efficiency?
Jordan: Well, so what it means is essentially – I mean, to give a short primer on how gerrymandering works: much as you might want to, you don’t get to control how many people in your state are going to vote for your party or going to vote for the other party. Well, I guess in some crazy ideal world, you could take on popular policies that people like and thus affect the number of people who voted for your party, but let’s say that’s off the table. (laughter) Let’s say you’re committed to doing whatever it is you’ve decided to do, irrespective of the public will. And so you have to ensure your legislative majority in some other way, against potentially the will of a possibly hostile electorate. And so the way you can best do that is to ensure that there are districts where the opposition party has some huge proportion of the vote. You want to use up a lot of their voters in these districts that almost entirely belong to them, where 80, 90% of the voters belong to one party, leaving a residue of many districts where you have a modest advantage, enough so that you’re going to win almost all of them.
So if your opponents have a small number of districts where they have 90% of the vote and you have a lot of districts where you win 55% of the vote, you’re going to have a huge legislative majority even though, statewide, there might be an equal number of voters. And actually, I don’t even know, in Canadian politics, if there are these monocultural ridings where one party absolutely dominates to a 75%, 80, 85% level. That does exist in the United States.
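The 90% and 55% figures above can be turned into a quick arithmetic check. The state below is invented: nine districts of 100 voters each, chosen so the statewide vote comes out exactly even. The wasted-vote count follows one common formalization of vote efficiency (the efficiency gap: losing votes plus winning votes beyond half); that specific formula is an assumption here, since the conversation does not spell one out.

```python
# Toy state, invented for illustration: (votes for "your" party, votes for the opposition).
# One district packed with opposition voters, eight where you hold a modest edge.
districts = [(10, 90)] + [(55, 45)] * 8

def summarize(districts):
    """Return statewide vote totals and seat counts for both parties."""
    your_votes = sum(y for y, _ in districts)
    their_votes = sum(t for _, t in districts)
    your_seats = sum(1 for y, t in districts if y > t)
    their_seats = sum(1 for y, t in districts if t > y)
    return your_votes, their_votes, your_seats, their_seats

def wasted_votes(districts):
    """Wasted votes: every vote for the loser, plus the winner's votes beyond half.

    A tied district is treated as an opposition win, a simplification
    that does not arise in this toy data.
    """
    your_wasted = their_wasted = 0.0
    for y, t in districts:
        half = (y + t) / 2
        if y > t:
            your_wasted += y - half
            their_wasted += t
        else:
            your_wasted += y
            their_wasted += t - half
    return your_wasted, their_wasted

your_votes, their_votes, your_seats, their_seats = summarize(districts)
your_wasted, their_wasted = wasted_votes(districts)
efficiency_gap = (their_wasted - your_wasted) / (your_votes + their_votes)

print(your_votes, their_votes)    # statewide votes for each side
print(your_seats, their_seats)    # seats for each side
print(your_wasted, their_wasted, round(efficiency_gap, 3))
```

With an even statewide vote, the party that drew this map takes eight of nine seats, because the opposition's votes are used up either in one lopsided win or in eight narrow losses.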
Brooke: Yeah, it does exist in Canada as well. If I think about Alberta and Saskatchewan, for instance, not a single Liberal candidate was elected in the last election in either of those two provinces. I think there’s only one Liberal west of Ontario before you reach BC, so you’ve got this massive blue Conservative chunk in the middle. So you do have exactly this kind of thing. But yeah, I mean, we can get into Canadian politics on another show, perhaps over a beer. (laughter)
This is an interesting thing, right, where now what we’re talking about is districting, which is an activity that’s undertaken by humans, and the suggestion is not being put forward that an AI algorithm or even any kind of fully explicit algorithmic approach, whether driven by AI or not, should be used. Rather, AI is being used as a way to put up some guardrails for human action. So rather than AI watching us and learning to play on its own, as is the case with Go, what we’re doing here is essentially just letting AI roam free, and we are watching it to learn about our own strategies, and to help us to inform decision making. In this instance, it’s about taking this random walk, as you mentioned, creating thousands upon thousands of possible maps, and then looking at where human-drawn maps fall within that distribution to figure out the extent to which it’s plausible that such a map could have been drawn without being really, really strongly driven by partisan advantage.
Jordan: Right, and actually it’s interesting, because you’re drawing out this opposition, which I feel like is a very useful way to think about it, but I also think that in any real world problem, both of those tendencies are going to be there, right? So you described the thing about game playing as the machine learning from us, learning from this corpus of humanly played games that it has. But we are also looking at the way the machine plays and learning from it and getting ideas from it, so there is flow in both directions.
And in the same way… Well, let’s see if I can make this case for the gerrymandering situation. I suppose it’s true that the machine is, in some sense, not supposed to take into account too much about the way that the humans have previously made decisions about districting. I mean, people do live where they live, and this is actually one thing that makes the problem so difficult, that when people previously tried to come up with these very abstract metrics of, “is this map fair?”, I think people would say something like the following. They would say, “Well, look, we didn’t make it be that there were cities that were 80% Democratic, and we didn’t make it be that there were regions that were 80% Republican. That just happened. People live where they live. They choose to live where they live. They may choose to live around people who are politically like them. Why are you saying there’s something nefarious about this? This is maybe how it is.”
Well, one of the great things about this random walk technique is that it does take something about human choices, because it does take as input where people actually live and who they voted for in the past. And so what that means is that we can separate out those two effects. Wisconsin is a great example. I wrote a lot about Wisconsin, my state, in the book, both because it’s a place with very high profile gerrymandering and court cases about gerrymandering, but also because it’s a state whose population traditionally has been very close to evenly split between the two main US parties. And what the analysis shows is that the fact that many Democrats are packed into the city where I live, Madison, and into Milwaukee, the other large city in Wisconsin, it does give Republicans in Wisconsin an advantage that does not come from intentional gerrymandering. But it also allows you to show how big that advantage is, and it’s about half the size of the advantage they actually get. So they double the size of their advantage by strategic line drawing. And so that’s what’s cool. I think prior to these techniques, a reformer would have said, “You clearly gerrymandered,” and the Republicans would have said, “Nope, that’s just how it was because of where people chose to live.” And now we can actually sort of distinguish between those two hypotheses.
Brooke: Yeah, the other thing about the random walk approach is that as people start to cluster in this ‘politically sorting’ type of way, the bell curve itself will move with it.
Brooke: So it’s there. It’s in the numbers. It’s not like there’s supposed to be some magic number in the middle. It’s responsive. It’s sensitive to those kinds of sorting outcomes.
The Power of Electoral System Design
Jordan: The people who work on this, and I’m a popularizer, I talk to the mathematicians who actually do this work and develop these algorithms, and definitely the practitioners don’t see what they’re doing as, “Oh, we’re just here to provide evidence for court cases about districts.” They see what they’re doing as a kind of novel way of doing social science and understanding questions like, what would be the political effect of certain kinds of migrations and certain kinds of shifts of population?
Another really interesting question, and actually, this is squarely in decision science, is that most of the work so far sort of takes for granted that the way elections are run is what it is: the very simple, so-called first-past-the-post system used in almost all US elections, where everybody votes for their first-choice candidate, and whoever gets the most votes wins. Very simple, and it means that anyone who’s not from one of the two big parties is essentially not a factor, except insofar as they may draw votes off from one of the two major party candidates.
So I think a really interesting thing to look to for the future is that, sort of to my surprise, actually, mathematicians and academic political scientists have been talking about alternative voting systems for years, and that idea is gaining some traction in the United States. New York City is switching to ranked-choice voting for all its elections this year.
The entire state of Maine switched last year. I think a really interesting future question is, how does radically changing the way in which we vote affect the ability to gerrymander? And I think one cool thing about these random walk techniques is it kind of gives you the ability to predict that in advance rather than just letting it happen.
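As a small illustration of how the two counting rules can diverge, here is a hedged sketch that tallies the same ballots under first-past-the-post and under a simplified instant-runoff form of ranked-choice voting. The ballots are invented, the tie-breaking rule is an assumption made only to keep the example deterministic, and the real rules in any given jurisdiction have details this does not model.

```python
from collections import Counter


def plurality_winner(ballots):
    """First-past-the-post: only each ballot's first choice counts."""
    tally = Counter(ballot[0] for ballot in ballots if ballot)
    return tally.most_common(1)[0][0]


def instant_runoff_winner(ballots):
    """Simplified ranked-choice count: repeatedly eliminate the last-place
    candidate and transfer those ballots to their next surviving choice.
    Assumes at least one non-empty ballot."""
    remaining = {candidate for ballot in ballots for candidate in ballot}
    while True:
        tally = Counter()
        for ballot in ballots:
            for candidate in ballot:
                if candidate in remaining:
                    tally[candidate] += 1  # highest-ranked surviving choice
                    break
        leader, votes = tally.most_common(1)[0]
        if 2 * votes > sum(tally.values()) or len(remaining) == 1:
            return leader
        # Assumed tie-break: fewest votes, then alphabetical order.
        loser = min(remaining, key=lambda c: (tally[c], c))
        remaining.remove(loser)


# Invented electorate: B and C appeal to the same 60% of voters and split it.
EXAMPLE_BALLOTS = (
    [("A",)] * 40
    + [("B", "C")] * 35
    + [("C", "B")] * 25
)

if __name__ == "__main__":
    print("first-past-the-post winner:", plurality_winner(EXAMPLE_BALLOTS))
    print("instant-runoff winner:", instant_runoff_winner(EXAMPLE_BALLOTS))
```

With these invented ballots, candidate A wins under first-past-the-post because B and C split their shared support, while B wins the instant-runoff count once C is eliminated and C's ballots transfer.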
Brooke: Unbeknownst to yourself, you are sticking your finger in a very sensitive wound for Canadians. (laughter)
Jordan: Oh, because you had a thing in BC, right? Wasn’t there a referendum in British Columbia?
Brooke: Oh, yeah. There is the British Columbia example as well, and it seems like they made a bit more progress there, but federally in 2015, the current government ran on a platform that included electoral reform, which then famously died before getting to the floor of the Parliament.
Jordan: Because people reneged who had said they were for it?
Brooke: Ah…I don’t want to get too much into that, (laughter) but I mean, the reasons behind it… Part of it is that alternative systems are harder to understand. Not necessarily inherently, but if you’re looking at a situation where the electoral base is not super engaged, and there are questions within the population about even how the relatively simple current system operates, there’s very much a behavioral angle to this, asking how much would a more complicated system that allows us to map better these kind of complex contours of voter preference create noise within the data because people actually don’t understand how their inputs are fed into the system? And so that was one of the issues here in Canada, is that people worried that that noise issue, I mean, it was never framed in this language of course, but that the noise of a more complex system would actually kind of negate the benefits of being able to map contours more closely.
Jordan: So I don’t want to stick my thumb further into the wound and move it around or anything, but were Canada’s leading decision scientists such as yourself consulted on what the effects would be from a human system interaction level?
Brooke: I’m a young man, Jordan. (laughter) Even in 2015, I wasn’t the renowned decision scientist that I am today (laughter). Okay, but let’s shift gears here a little bit.
Hi there, and welcome back to the Decision Corner. In this episode of the podcast, I’m talking all things mathematics and geometry with Jordan Ellenberg; best-selling author and Professor of Mathematics at the University of Wisconsin at Madison. So far we’ve spoken about the different types of problems that can be addressed by human beings and Artificial Intelligence, including the very pertinent issue of electoral system design and gerrymandering. Coming up, we’ll dig a little deeper into the tension between ‘Man’ and ‘Machine’, and ask whether there is potential for us to coexist and maybe even learn from each other. Stay tuned.
Man Vs Machine – The Case of Autonomous Vehicles
Brooke: Yeah, so let’s shift gears a little bit. If we’re thinking about different circumstances where we can deploy AI tools, or any kind of algorithmic tools, what are the characteristics of these situations that we should be paying attention to so that moving forward we can know whether we should be leaning towards setting up an ‘us watching the machine’ situation or a ‘machine watching us’ kind of situation?
Jordan: Yeah, it’s a good question, and I think my instinct is to say, and I’m kind of working this out as I say it, that the situations where the machines maybe require less oversight are those, exactly as you say, where the goals are more clearly delineated and agreed upon. In something like a game, a game like checkers or chess or Go, where we’ve settled the matter of what it means to win, and we’ve settled the matter of what it means to be successful. I think that in a situation where the goals themselves are kind of contested by humans, I think that’s where, in some sense, we can be watching the computer and learning from that about trying different things and seeing if the machine we build, seeing what it achieves and seeing if it achieves something we want. But in the end, there’s going to be some element of human judgment and human supervision that you don’t actually want to get rid of.
So one thing in terms of the voting, and then I’m going to switch to a different topic, one thing many people who think about this problem say, is they’ll say, “Well, look, why don’t we just let the computer draw the maps, given that it can do it, and given that we can program it not to care which party gets ahead?” And the answer is a political one. And sometimes in my community, political means bad, but political is not bad. It’s actually a question about politics. Why shouldn’t it be political? And so the answer is that I think neither voters, nor elected officials, nor courts are really going to accept handing off this responsibility to a machine. And actually, I’ve come to see that that’s quite valid. We don’t actually want to do that without human supervision. I think an interesting intermediate case, tell me what you think, is the question of autonomous vehicles. Because there, what counts as success is probably more clear and less contested than voting, but not as clear and not as uncontested as games we’re playing right? There’s certainly some clear failure modes of autonomous driving. And I think for that very reason, I think you see it hotly argued about, is the goal to have a true autonomous car that really just, without a human in it, it goes places and does things? Or is it always meant to be something where there’s some interaction between human and machine, human-supervised driving? I’m not going to even weigh into that debate, but it certainly is a debate.
Brooke: Yeah. I think my position on that is that the end game is one perhaps that is easier to state, where if we were to say, “We want to completely rethink transit from the ground up,” as though thousands and thousands of years of human history have not brought us to the point that we are right now, the answer would not be to have most cars on the road driven by humans and some vehicles on the road driven by machines.
The problem in this situation is us. We would never think of a system where we have machine drivers, where there are individual cars that need to make decisions. A machine would solve this as a systems problem. There is a system that makes determinations for the system as a whole, not a whole bunch of independent actors who need to propriocept each other. The challenge is that the machines are introduced now into a circumstance where humans are acting individually and propriocepting each other with the benefit of a lot of track record to get us to where we are, and also a lot of behaviorally well-documented biases that make us okay with lots and lots of death on the road. Whereas a single fatality from an autonomous vehicle just looms so large in the public psyche. One question that I have not seen asked is, of all the miles driven by autonomous vehicles, if those miles had been driven by humans, how many deaths should we have expected to see, and how does that compare to what autonomous vehicles have done? That’s just totally not the narrative.
Jordan: Oh, if you haven’t seen that question asked, that shows that you don’t hang out on Elon Musk’s fan Twitter. I promise you that very issue is discussed quite a lot.
Brooke: And what’s the answer?
Jordan: Oh, I don’t remember, because I’m not really part of Elon Musk’s fan Twitter, but I mean, one sees it. It sort of appears in one’s feed. No, I think it’s pretty low. Actually, I just was watching a discussion on this. I mean, you can do that computation, but of course, right now, autonomous vehicle trips are anything but a random sample of all trips.
Jordan: Right? When is that being used? It’s being used obviously in a fairly new car, because the car that has it is new, so a car which in other ways has better safety features. It’s being used under different conditions. People are probably not turning on full autonomy in the middle of a thunderstorm in the dark.
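The point about autonomous trips not being a random sample can be sketched with a little arithmetic. The example below compares a naive "expected deaths if humans had driven those miles" figure, which applies the overall human fatality rate, against one that matches the conditions the autonomous miles were actually driven in. Every rate, share, and mileage here is invented purely for illustration, and the two driving conditions are a hypothetical simplification.

```python
# Hypothetical human fatality rates, in deaths per 100 million miles.
HUMAN_RATE = {"clear_highway": 0.5, "night_storm": 3.0}

# Hypothetical share of all human-driven miles in each condition.
HUMAN_SHARE = {"clear_highway": 0.7, "night_storm": 0.3}

# Hypothetical autonomous miles (in millions), skewed toward easy conditions.
AV_MILES_MILLIONS = {"clear_highway": 95, "night_storm": 5}


def naive_expected_deaths(human_rate, human_share, av_miles_millions):
    """Apply the overall human rate to every autonomous mile."""
    overall_rate = sum(human_rate[c] * human_share[c] for c in human_rate)
    return overall_rate * sum(av_miles_millions.values()) / 100


def matched_expected_deaths(human_rate, av_miles_millions):
    """Apply the human rate for each condition to the miles driven in it."""
    return sum(
        human_rate[c] * av_miles_millions[c] / 100 for c in av_miles_millions
    )


if __name__ == "__main__":
    naive = naive_expected_deaths(HUMAN_RATE, HUMAN_SHARE, AV_MILES_MILLIONS)
    matched = matched_expected_deaths(HUMAN_RATE, AV_MILES_MILLIONS)
    print("naive human baseline:", round(naive, 3))
    print("condition-matched human baseline:", round(matched, 3))
```

Under these made-up numbers the naive baseline is about twice the condition-matched one, so comparing autonomous fatalities against the naive figure would flatter the machines simply because they were driven on easier trips.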
Brooke: Yeah. The other thing is the type of errors that autonomous vehicles make are ones that humans are very unlikely to make, but autonomous vehicles perform exceptionally well in avoiding the errors that humans make at a much larger rate. For instance, and going back to the chess example, I think one of the earlier algorithms that was designed to play chess, the strategy that a human player exploited in order to beat that algorithm was to essentially make the moves that come with the highest computational cost for the algorithmic opponent. And so what ended up happening is the algorithm took so much time in its earliest moves when the computational requirements were so high, that by the time it came to later moves, it had to apportion its computational time. And so the human player just kind of dragged it along and dragged it along and dragged it along until the computer was out of time, and then it needed to move very, very erratically in a way that was computationally not optimal.
This is the kind of thing that at this point, the computational power has overcome that type of limit. It’s hard for a human to use that kind of strategy anymore because computers just run so quickly. But it’s the type of thing where a human exploited something that another human would have picked up on and taken some kind of compensation strategy. But computers learn in a different way than we do, and so things that seem obvious to us don’t always come out so clearly in the algorithm.
Jordan: Yeah, and actually this really ties in with something I talk about in the book, which is this notion of difficulty. Maybe this is where I get a little philosophical, but what does it mean for a problem to be difficult? And I think we have a kind of mental model, which is pretty impoverished and not good enough, of difficulty as kind of like a line. Things are arranged from least to most difficult. But in fact, this is not a very good description of what difficulty is, because different kinds of tasks are difficult in different ways, and different kinds of tasks are difficult for different kinds of thinking beings. So I think there’s a way in which people see something like AlphaGo and say, “Crap, this is it. The computers are smarter than us now, in this very hard thing, being good at Go. For the computer it’s easy, so computers are better than us.”
No. That’s one task. There’s other tasks that humans, at this moment, find very easy and machines find extremely difficult. So I think it’s very salutary, actually, because this difference between different ways of learning, as you say, it really just puts right in our face that there’s no linear nature of difficulty. Every time you say, “A problem is difficult,” you’ve got to say, “Difficult for whom?”
Brooke: Yeah, that raises a really interesting thing about the way that AI is deployed in the market. If we think about the way that Amazon warehouses are run, it’s an AI algorithm that dispatches the orders to people to go and collect things, but the reason that there are still people going to collect things is that people are still the most efficient ‘computer’ to move to a designated spot, open a box, identify visually, very, very rapidly what’s inside the box, grab only the relevant thing, close the box, and put it back.
Jordan: Huh. I didn’t know that. Is that to a machine a hard problem?
Brooke: Yeah. It’s not yet economically efficient to build a machine to replace that worker, which also says a lot about how we treat that worker.
Jordan: I think I read that folding a shirt is another problem like that, that it’s something we think of as easy, but that is apparently incredibly hard to get a mechanical device to do.
Brooke: Yeah, washing dishes as well. The correct amount of pressure to apply to a dish, to hold it while washing, is extremely difficult.
Jordan: Oh my God. Are there hilarious blooper reels of robots attempting to wash dishes and smashing them all? I would watch that for hours. (laughter)
Machines Learning From Us, and Us Learning From Machines
Brooke: Well, after you’re done here, you know what you’re up to for the next few hours. Okay, so how do we create optimal conditions for humans and machines to learn from each other? My sense is that part of it has to do with embracing this notion of difficulty along different dimensions, difficult for whom and in which ways. And part of it-
Jordan: And by the way, when you say dimensions, already the geometry is there. It’s very much you’re making a geometric statement when you’re saying there’s more than one dimension there. Sorry for interrupting. Had to say it. Carry on.
Brooke: Yeah, for sure. And the other is maybe about, as we discussed before, how explicitly we can state the goal condition, for instance, and maybe the considerations that go into it as well. So if we think about, for instance, decisions about to whom credit should be extended, we might say, “The goal condition can be quite explicitly stated: don’t run risk above a certain threshold,” but that’s not all that goes into decisions about extending credit. Credit, especially for home ownership, the history of who gets credit and who doesn’t in terms of home ownership in the United States as well as in Canada, has got very clear racial dimensions to it. There’s an equity issue and a historical issue that really needs to come into that consideration. Equity is deeply embedded in there, even if on the surface there seems to be something that can be very clearly and explicitly stated as a rule that a machine can operate with. So that’s one. And the other is around the input data that you use to inform the decision. There’s been lots and lots written about that as well, that if you include someone’s postal code or their zip code in calculating their risk, you’re just going to wash a whole bunch of often racially biased data, not necessarily intentionally, but just along with postal code comes so many highly correlated factors that you will end up producing racially biased results even if race never was explicitly integrated into your model.
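The postal code point can be illustrated with a deliberately tiny sketch. The rule below never sees group membership and uses only a hypothetical per-zip default rate, yet because the invented applicants are unevenly distributed across zip codes, approval rates still differ sharply by group. The zip codes, groups, rates, and threshold are all made up for the example.

```python
from collections import defaultdict

# Hypothetical historical default rate for each zip code.
ZIP_DEFAULT_RATE = {"zip_1": 0.05, "zip_2": 0.12}
RISK_THRESHOLD = 0.10  # hypothetical rule: do not run risk above this level

# Invented applicants as (zip_code, group); group is never given to the rule.
APPLICANTS = (
    [("zip_1", "X")] * 80
    + [("zip_1", "Y")] * 20
    + [("zip_2", "X")] * 20
    + [("zip_2", "Y")] * 80
)


def approve(zip_code, zip_default_rate, threshold):
    """Credit rule that looks only at the applicant's zip code."""
    return zip_default_rate[zip_code] <= threshold


def approval_rate_by_group(applicants, zip_default_rate, threshold):
    """Audit the rule afterwards by group, using data the rule never saw."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for zip_code, group in applicants:
        total[group] += 1
        if approve(zip_code, zip_default_rate, threshold):
            approved[group] += 1
    return {group: approved[group] / total[group] for group in total}


if __name__ == "__main__":
    rates = approval_rate_by_group(APPLICANTS, ZIP_DEFAULT_RATE, RISK_THRESHOLD)
    for group, rate in sorted(rates.items()):
        print(group, rate)
```

In this toy data the rule approves 80% of group X and 20% of group Y, even though group membership never enters the decision. The disparity arrives entirely through the zip code.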
Jordan: Yeah, and I think what you say exactly speaks to why there’s never going to be a final answer to ‘who’s watching who’, like who’s in charge and who’s subservient. I think in the end, there’s always going to be an interplay between the human and the device, just for the reasons that you say. You don’t want to do what you might call blame washing, which is when something bad happens, but you somehow gave the decision making to a being who, by definition, has no moral status and can’t be blamed for anything, a machine. And you could say, “Well, don’t blame me.” Who should I blame? Not the machine, because you can’t blame the machine because it’s a machine, right? So then you’ve sort of made the blame disappear somehow. But not really, it’s like the rabbit was always there, even if you put it in the hat. So it’s a complicated question, obviously. But I think part of it is just as you say, I think you should give up on the idea that it’s going to eventually be a one-way street and you have to figure out which way the street goes. I think you have to accept that there’s going to be iterations of responsibility and supervision between the human and the machine. And maybe one thing I would say, and this is one of the parts of machine learning I find most interesting, is that we talked about the flow of information from the machine to the human as humans watching the machine and learning from its output. We don’t have to settle for that, and I don’t think we should settle for that.
There are people working on what’s called legible machine learning, which asks, “To what extent can we understand not just what’s being output by the machine, not just what it’s telling us is a cat and what it’s telling us is a dog, but what are the guts? What’s it actually doing on the inside?” And of course that’s incredibly complicated. I like to think of it in terms of debugging, something that any one of us who has programmed knows what that is. And a kind of program that I used to write when I was a kid, like 60 lines of BASIC, the issue was, find the line of code that’s creating the screwed up output. Find the place where I typed 50 GOTO 50. “Oops, no, don’t do that.” In a modern machine learning algorithm, we can’t really be like, “Oh, this neuron, that’s the problem. That’s the one that’s screwing up.” It’s not modular in the same way.
This is just a dream I guess, but I feel like the future of debugging, the future of trying to understand what’s going on under the hood, is work with a system that’s clearly carrying out some complicated, and I would go so far as to say ‘cognitive’ task, but which is not neatly broken up into modules which you can look at one at a time, and yet maybe what it’s doing is clearly dysfunctional, and you’ve got to try to figure out, “Okay, what intervention can I do to make it more functional?” Does that sound like anything? That’s clinical psychology. That’s literally what that entire field is. And I do think we’re probably looking at a future where there is something that looks more like clinical psychology, and maybe even psychotherapy. Won’t that be a cool development with AI? I mean, the machines will need therapists for sure, because you can’t debug them the way that you can debug a 60-line BASIC program.
Brooke: Yeah, especially those algorithms that are learning from us, they’re definitely going to need therapy.
Jordan: Right. (laughter)
Brooke: But in terms of the practical steps, for someone who’s thinking about how to use algorithms and how to use AI to get to the next best solution, not just to say, “We need the one solution that’s going to rule them all for all time,” but “We’re standing where we are now, and we want to be somewhere better tomorrow.” The starting point from our discussion seems like the first thing is to diagnose what kind of problem is it that you’re solving?
Is it the kind of problem that has a success condition that can be made fully explicit, or is it one where there are going to be kind of mediating factors that inherently need to go into figuring out whether something is the right solution, or which one wins for instance? If you are looking at a situation where you have no mediating factors, for sure AI and algorithms are going to be a really good bet. If you have few mediating factors, AI with some human supervision is probably a good place to start.
As you start to have more mediating factors that you can’t explicitly articulate and weigh off against the others as you create this kind of cloud of conditions that all need to be met, that’s where you move more and more towards something where you’re probably going to continue relying on human decision makers. But then you should be asking yourself, “Well, of this cloud of conditions, are there certain ones in there that I can render in a very explicit way that I can parcel out and send to the algorithm to do its work? And then I will use that as a way to reflect on what it is that human decision makers are doing.” Does that sound about right?
Jordan: Yeah, I think so. Maybe I’ll say this; the one thing I would add as a kind of slogan is that I think there’s a certain asymmetry, which as you say, some problems are more like checkers, and some problems are more like redistricting. But I think, to be honest, most of the world is not so much like checkers. It’s an incredibly important testing ground, but I think just looking around the world and the way people talk about things, I think there’s more of a problem with people mistaking non-checkers-like things for checkers than there are people mistaking checkers for non-checkers-like things.
So I would say that a real kind of humility is required. And if you think you know exactly what the objective is, it’s always good to constantly be taking a step back from that and asking yourself if the objective is really as well-defined and uncontested, as checkers-like, as you think it is.
Brooke: I think that’s a brilliant note to end on. Jordan, thank you very much for this conversation. It’s been great.
Jordan: Thank you. I learned a lot. It was awesome.