Cogut Institute for the Humanities

4. Uncovering the Humanities in Data Science

What ideas and assumptions about human social life underlie data science and new media? How might scholars in and beyond the humanities work together to diagnose and respond to the algorithmic frameworks of digital culture, especially those that reinscribe or reinforce forms of division and discrimination?

Episode Transcript

Amanda Anderson: From the Cogut Institute for the Humanities at Brown University, this is Meeting Street. I’m Amanda Anderson, the show’s host and director of the Institute.

The rise of new media and big data has had profound impacts on social and political life. Today I talk with media scholar Wendy Hui Kyong Chun about the ways in which these proliferating systems have affected our relations to one another and undermined democratic institutions and aspirations.

We explore the ideas and assumptions that have led to this situation, such as the notion that we all most want to be in communication with people like ourselves. As Wendy Chun argues, homophily, or love of the same, grounds network science. But why should it? And what are the consequences? We discuss why scholars in the humanities are particularly well suited to diagnose and respond to the informing assumptions of new media and data analytics.

Wendy Chun has been doing important work on these questions as the founding director of the Digital Democracies Institute at Simon Fraser University, where she holds the Canada 150 Research Chair in New Media. Prior to joining Simon Fraser University in 2018, she was professor and chair of the modern culture and media department here at Brown University.

Her most recent book, which has been widely influential since its publication in 2016, is titled Updating to Remain the Same: Habitual New Media [MIT Press].

Wendy, welcome to Meeting Street!

Wendy Hui Kyong Chun: Oh, it’s a pleasure to be here.

Amanda Anderson: So in Updating to Remain the Same you called new media “wonderfully creepy.” What do you mean by that?

Wendy Hui Kyong Chun: So I came up with the term “wonderfully creepy” when I was teaching new media at Brown, and I needed a term to register the fact that the promise and threat of new media can’t be separated. And it’s not because new media is exceptional but rather because it’s not. Because like any form of communication it operates via vulnerability, or via leaking.

So to be as concrete as possible, think of how your network card operates. So right now your wireless card is downloading all data. You’re downloading your neighbors’ data, all your students’ data when you’re on campus, and then it erases all the data that’s not specifically directed to you. So this technically means that your network card is always operating in what’s called “promiscuous mode,” although promiscuous mode just means that you get to see what your network card is doing.
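
[A minimal sketch in Python of the filtering just described: on Linux, a raw packet socket can watch every frame the machine hears and reproduce, in user space, the card's own logic of keeping only frames addressed to it. The interface name "eth0" is a placeholder, the script needs root, and actually switching a card into promiscuous mode takes an extra step such as `ip link set eth0 promisc on`.]

import socket

ETH_P_ALL = 0x0003  # ask the kernel for frames of every protocol

# Raw packet socket: Linux-only, requires root.
sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
sock.bind(("eth0", 0))
my_mac = sock.getsockname()[4]  # this machine's hardware (MAC) address
BROADCAST = b"\xff" * 6

while True:
    frame, _ = sock.recvfrom(65535)
    dest = frame[:6]  # Ethernet destination address
    if dest not in (my_mac, BROADCAST):
        continue  # what the card silently does with everyone else's traffic
    print(f"frame addressed to us: {len(frame)} bytes")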

So in a network there is no monogamous mode, because a monogamous mode would mean that it's just you and your computer and you're not communicating with anybody. And so I wanted to give my students the sense that they're constantly sending and receiving information.

And this is both creepy and wonderful. I also wanted to gloss the fact that what was important was to see that this is happening all the time, because when this is hidden from us, which it usually is, we can fall into paranoid fantasies of security or just complete vulnerability. So we needed this engagement to open up the wonderfulness of this creepiness.

Amanda Anderson: That’s terrific, a really helpful explanation. Since publishing Updating to Remain the Same you’ve launched the Digital Democracies Institute at Simon Fraser. What are the aims of this institutional venture, and what are you hoping to achieve with it? And I guess I’d also be curious to know, is it responding to a different set of conditions than those that preoccupied you in writing Updating to Remain the Same?

Wendy Hui Kyong Chun: So it is very much a post-2016 venture, and so the Canada 150 Chair, which I hold, was actually put in place by Canada in order to attract professors from the U.S. and the U.K. and other places post-2016 who would find Canada a wonderful place to be.

And so, most broadly, the Digital Democracies Institute seeks to understand and counter misinformation, polarization, abusive language, and discriminatory algorithms. Although we’re based at SFU, we’re the hub of a global network. So there are 19 partner institutions in the U.S. and Europe and 21 affiliated researchers external to SFU, and the external researchers are in computer science, theater and performance studies, media studies, network science ...

And we’re global and multidisciplinary and multi-sector because we have to be. So no one discipline has solved these problems. They still remain. And we have to work across boundaries because too often we’re reinventing the wheel or getting stuck in our tracks. So each of our research streams starts with a hard problem and a roadblock. And I can give you a concrete example of that, if that would be helpful.

Amanda Anderson: That would be great. Yes, please do!

Wendy Hui Kyong Chun: OK, so Beyond Verification, which is our project on misinformation, begins with the fact that fact-checking is important but it’s not enough, right? So fact-checking will clearly always lag behind the spread of misinformation or disinformation, but more importantly, people often don’t care if something is correct or incorrect.

We share something because it’s compelling or true, and the more certain politicians lie, the more authentic they appear. So rather than treating this as an unfortunate circumstance, this is where we start. So rather than asking, “Is this correct or incorrect?” we ask, “Under what circumstances do users find something to be true and compelling regardless of its facticity?” So why do users who come to distrust CNN move to trusting Breitbart? How does one move from being suspicious to, like, being a conspiracy theorist and trusting certain sources? So who benefits from authenticity?

Clearly a lot of work on authenticity has been done in the humanities and in the social sciences, right, so we could think of Lionel Trilling, but also Avril Bell and Sarah Banet-Weiser. So one project that’s part of this larger rubric is building a model of authenticity: taking these definitions of transgression, etc., from the humanities, trying to build a coding scheme of what that would look like, and then comparing an article’s authenticity with its facticity and trying to understand the relationship between the two.

Amanda Anderson: That’s fascinating. I’m curious, what do you think drives the desire for or the susceptibility to claims to authenticity in new media?

Wendy Hui Kyong Chun: I think there are a lot of things. One is just the advertising model. Because it’s click based, you want things to be as outrageous as possible, right? And so there’s an accentuation towards the affectively charged.

As well, there’s a move towards microtargeting. So you want people to actually click on the ads, and in order for them to click on the ads, you want to choose something that they care about. And so there’s a lot of work going into segmenting individuals and finding how they deviate from the norm. So clearly, the fact that a lot of people like Harry Potter — not that interesting. But you might like Napoleon Dynamite — that puts you in a different category. So the idea is to find those little quirks and accentuate them in order to make you more receptive. And that’s linked to what you find true and compelling rather than what you find to be correct.
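
[A back-of-the-envelope sketch of the Harry Potter / Napoleon Dynamite point: the rarer a preference, the more information it carries about you, which is why the quirks are what segment you. The like-rates below are invented for illustration.]

import math

like_rate = {
    "Harry Potter": 0.60,        # widely shared: tells a targeter little
    "Napoleon Dynamite": 0.03,   # a quirk: tells a targeter a lot
}

for item, p in like_rate.items():
    bits = -math.log2(p)  # information content of "this user likes it"
    print(f"{item}: {bits:.1f} bits")

# Harry Potter: ~0.7 bits. Napoleon Dynamite: ~5.1 bits -- the quirk
# places you in a small, precisely targetable segment.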

And there’s also a lot of work done on authenticity and marketing. So it’s become algorithmic. You’re told to be authentic in order to be a real leader. It’s linked to notions of transparency, where an authentic person is allegedly transparent: their outside and their inside coincide, not because they’re nice people all the time, with outer and inner selves aligning in some sincere and nice self, but rather because their outside shows these deep, dark secrets within. So in order to be authentic at work, for instance, you’re supposed to reveal a secret about yourself or show up in your pajamas on pajama day. It’s this constant call to transgress in order to make you more predictable.

Amanda Anderson: Right, and it’s interesting because what you’re describing is largely a kind of market logic. And yet, at the same time, this very same model is driving the political sphere or the dynamics of political life online. At this point, is there a way to distinguish between sort of the register of the economic and the register of the political in the realm of new media? Or have they become sort of collapsed in on one another?

Wendy Hui Kyong Chun: I think they’ve become collapsed. I think that Sarah Banet-Weiser makes the best argument about the ways in which they have been collapsed, which means for her, importantly, that authenticity isn’t a concept that’s completely corrupt, but it’s a concept from which you begin in order to understand the ways in which the citizen has also become the consumer, or the way the citizen-consumer has come together. So it becomes a point of intervention rather than a point of giving up.

Amanda Anderson: Fascinating. I want to return for a moment to your reference to the importance of the humanities within this very complex collaborative project that your institute is pursuing. You are quite explicit as an institute in seeking to promote a collaboration between data science and the humanities. And I would love to hear you talk just a little bit more about what role you think the humanities has to play and why it’s important that it’s a primary named collaborator or partner.

Wendy Hui Kyong Chun: I think the humanities are key to every project, and importantly, they are because they’re already part of every technology. So the way a lot of people talk about the humanities and technology is to say, “OK, we have to add the humanities to it,” or, “We have to add ethics to it. We have to add the social sciences to it.” And this ignores the fact that the humanities and social sciences are already embedded. The problem is it’s usually bad social sciences or bad humanities that are embedded within these technologies.

So to be as concrete as possible, consider the notion of homophily. So homophily is the notion that similarity breeds connection, and it’s axiomatic within social media networks. It’s considered to be absolutely true [that] if you like this, you’ll like that; if you like people like this, etc., etc. Which means that echo chambers aren’t an unfortunate accident. They’re the goal. But the term homophily itself comes from sociology, and it comes from studies of biracial yet segregated post-World War II housing. In particular, it comes from white residents’ attitudes towards living in biracial housing.

And what’s key is that the whole situation, even back then, was far more complicated. So [Paul F.] Lazarsfeld and [Robert K.] Merton also came up with the term heterophily, and their initial survey, which was done by Patricia West and others as well, showed that things were far more complicated. So people might not have had best friends outside their own racial group, but they had a lot of acquaintances and friendships across groups. So all of this is completely erased in this notion of homophily.

So homophily substitutes for both racism and community. And clearly the humanities has thought beyond notions of similarity. All the work within critical race studies, and thinking about difference, and the importance of difference, and the importance of other kinds of relations — this is key to building more complex networks because these networks often put in place the world they imagine. So they’re disruptive because they close down the future.
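
[A small simulation of homophily as an axiom: if the wiring rule itself favors same-group ties, segregation is the output by construction. All numbers are invented for illustration.]

import random

random.seed(0)
N = 200
group = [i % 2 for i in range(N)]  # two groups of 100 nodes each
P_SAME, P_DIFF = 0.10, 0.01       # the axiom: like connects to like

edges = [(i, j) for i in range(N) for j in range(i + 1, N)
         if random.random() < (P_SAME if group[i] == group[j] else P_DIFF)]

in_group = sum(1 for i, j in edges if group[i] == group[j])
print(f"{in_group / len(edges):.0%} of ties are in-group")
# Roughly 90% with these settings: the echo chamber isn't an accident
# of the network; it's what the wiring rule was told to produce.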

The humanities are also key because machine learning depends on a really retrograde notion of history. It’s homogeneous, empty time à la Benjamin. It’s the notion that, you know, the past, the present, and the future follow one another in a linear progression, so the future cannot be radically different from the past. So this is why, when machine learning programs are trained on the past, they make racist predictions. But what this also means is that they’re not verified as correct unless they make racist predictions.

So truth equals consistency, and this is, of course, something that Hannah Arendt has critiqued. And so the humanities, I think, need to be in discussion with people developing machine learning models so we can have other forms of history and of relations between the past and present embedded within these models and in terms of verifying these models.

Amanda Anderson: That leads me to ask you whether, in the conversations that you’ve participated in, where you’ve had data scientists and humanists interacting with one another — did those go well? Are there sometimes forms of incomprehension or resistance that develop? And if so, can you give an example?

Wendy Hui Kyong Chun: So what’s key about any interdisciplinary collaboration is that you have to be able to respect the other person and their methods, and you have to care about something. And so, just to draw an example from what happened at Brown, at one point there was talk of a big data initiative, and they brought me together with the head of computer science and with somebody in biostats who worked on AIDS. And we looked at each other and said, “Why are we here? How can we work together? And what can we do together that we can’t do on our own?” And it was actually the person from biostats who said, “I can show almost any correlation to exist. I can show almost anything is real. But I don’t know what’s true.”

And then from there, I was like, “Well, we … even if something is true, it doesn’t mean that it will inspire action” — global climate change is a perfect example of that — and so we decided that was the problem that we would focus on. So I think what’s important is that you go for a larger problem that you know you can’t solve on your own, that everyone cares about, and then when things start unfolding, you have to be able to question each other’s first principles in the nicest possible way, and to be open to having your first principles challenged.

Amanda Anderson: Yes. In your current project, you talked about how there are lots of partners involved, and I’m curious to know: who do you see as your most important partners, whether inside or outside the university — or among your most important partners? I’m sure they’re all important.

Wendy Hui Kyong Chun: Yes, they’re all important. Let me choose two emblematic ones then. And again, these are people who do very different work than us, which is why collaboration is so important, right? Because if we all do the same work then we don’t need to work together. So the Social Science Research Council, which is currently headed by Alondra Nelson — we’ve been working with their Media and Democracy group, which is headed by Jason Rhody, and we’ve been putting on workshops around questions of authenticity and disinformation as well as conferences.

We’re going to be working on journal publications. And what’s great is that, as one of the oldest social science research councils, they already have in place a fantastic network of social scientists working on these issues. So it’s been great to collaborate with them and to think through other ways of engaging authenticity.

Another key partner has been Kalina Bontcheva’s group at Sheffield [University], and she is in computer science — in particular, natural language processing. And she’s done a lot of work on abusive language directed at politicians — rule-based detection of abuse. So we’re working with her to look at the hard cases, the gray cases, and to figure out the importance of context for natural language processing approaches to abusive language.

Amanda Anderson: Can you explain a little bit more about that? I mean, first of all, what is natural language processing, and how is it important to analyzing and responding to abusive language?

Wendy Hui Kyong Chun: So natural language processing is basically, as bizarre as it sounds, statistical — usually machine learning based, but not always — algorithmic analysis of language. And so what she works on are various rules to say, OK, if this word appears — so a lot of it can be word based — and it’s in this kind of grammatical structure, then it’s most likely abusive language.
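
[A toy sketch of that rule-based approach, in Python; it is not Bontcheva’s system, and the word list is invented. A lexicon plus one crude grammatical pattern catches some directed abuse, and its false positives preview the context problem discussed next.]

import re

LEXICON = {"idiot", "trash", "vile"}  # stand-in abusive-word list

def looks_abusive(text: str) -> bool:
    lowered = text.lower()
    # Rule 1: word + structure ("you are <word>") suggests directed abuse.
    m = re.search(r"\byou(?:'re| are)\s+(?:an?\s+)?([a-z]+)", lowered)
    if m and m.group(1) in LEXICON:
        return True
    # Rule 2: bare lexicon match, which is much noisier.
    return any(t in LEXICON for t in re.findall(r"[a-z']+", lowered))

print(looks_abusive("You're an idiot."))                  # True
print(looks_abusive("He called me trash, and it hurt."))  # True -- a false
# positive: the speaker is reporting abuse, not committing it.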

Amanda Anderson: And so you mentioned a moment ago that one of the things that her group is focusing on is how important context is. Again, can you give an example and show the kind of work that you think that she’s doing or the advances she’s making that are so important?

Wendy Hui Kyong Chun: So we’re working together on context. So what they’ve worked on is a really great way of detecting the language, but, of course, a word or a phrase might mean one thing in one context and another thing in another, right? And so what’s happening is that, for instance, Black English gets regularly targeted as abusive language when it’s not. People talking about racism get censored because of these systems which are automatically detecting certain kinds of words. So clearly, context matters.

But context is hard to program in, right? So part of it is trying to say, OK, NLP [Natural Language Processing] tools may not be perfect, but they’re good enough to say, look, something’s happening. So rather than making an automatic decision when something is flagged, how can we have tools or ways of trying to understand who’s saying it and how it’s been said, not in terms of an individual person but in terms of larger trends?

Also, rather than saying we need these tools for the platforms so the platform can shut down conversation, developing these tools for users so when something is directed at them, they can have a sense of “OK, how often has this been used? How frequently? How can we engage with this? How have other people engaged with this?” So what we’re really interested in is counter speech, and how we can understand and facilitate counter speech within a space like this.
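
[A sketch of that user-facing alternative: instead of auto-removal, flagged messages are aggregated into counts and trends the targeted person can inspect. The messages are invented, and the upstream detector is assumed to exist.]

from collections import Counter
from datetime import date

# (day, phrase) pairs assumed to come from some upstream detector
flagged = [
    (date(2021, 3, 1), "go back where you came from"),
    (date(2021, 3, 1), "you people"),
    (date(2021, 3, 2), "go back where you came from"),
    (date(2021, 3, 3), "go back where you came from"),
]

by_phrase = Counter(phrase for _, phrase in flagged)
by_day = Counter(day for day, _ in flagged)

print("How often has this been used?")
for phrase, n in by_phrase.most_common():
    print(f"  {n}x {phrase!r}")
print("How has it trended?")
for day in sorted(by_day):
    print(f"  {day}: {by_day[day]} flagged messages")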

Amanda Anderson: So this is connected to your next book, what you’re describing: how can we best respond to the kinds of conditions of discrimination that animate network systems? Can you talk a little bit about the book that you have completed and, I understand, is forthcoming?

Wendy Hui Kyong Chun: Yes, it’s in copyedits, which is really exciting. So as far as I can tell, every book takes five years. It’s been a while. And so, Discriminating Data looks at the ways in which segregation, eugenics, as well as multiculturalism are embedded within current network algorithms. So again, back to the discussion: We’re talking about how it’s not an issue of adding social sciences or political theory to these algorithms, but the ways in which these certain concepts are already within them. And so from that, trying to understand what we can do about it. And so it follows a five-step program.

The first [step] is to expose how difference amplifies discrimination. So many of the machine learning programs that are currently being sued for being discriminatory, such as COMPAS, which is a program that’s used by some U.S. courts to determine the risk of recidivism, were actually launched during the Obama administration as a way to counter racism. Because somehow these machines wouldn’t have race as a factor and so therefore wouldn’t be racist. Clearly this is incorrect. It’s what I call the dangers of hopeful ignorance, because race is there via proxies, etc., and so part of it is to figure out the ways that’s operating.

And then the next [step] is to interrogate the default assumptions and axioms. So again, homophily and the impact of homophily as well as eugenics. So big data was sold as ending theory, as changing the world because of correlation. But correlation was developed by and for 20th-century eugenicists. And if you go back to the historical record, the same claims that were made about big data in the 21st century were made by these eugenicists in the 20th century about correlation.

And what’s important about correlation and these tools is that they’re part of this closure of the future. You figure out what doesn’t change in the past, i.e., what correlates, in order to determine the future in the same way, so that things cannot be radically different. So part of that has been figuring out why, when, and how these predictions work.
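
[A worked sketch of that logic: correlation extracts what held steady in past data, and prediction projects exactly that pattern forward. The numbers are invented for illustration.]

past_x = [1, 2, 3, 4, 5]             # e.g., time steps observed
past_y = [2.1, 3.9, 6.2, 7.8, 10.1]  # an outcome that tracked them

n = len(past_x)
mx, my = sum(past_x) / n, sum(past_y) / n
cov = sum((x - mx) * (y - my) for x, y in zip(past_x, past_y))
var_x = sum((x - mx) ** 2 for x in past_x)
var_y = sum((y - my) ** 2 for y in past_y)

r = cov / (var_x * var_y) ** 0.5     # Pearson's correlation coefficient
slope = cov / var_x                  # least-squares line through the past
intercept = my - slope * mx

print(f"r = {r:.3f}")                                    # ~0.999
print(f"prediction at x=6: {intercept + slope * 6:.1f}")
# The predicted "future" is, by construction, the past's pattern
# extended: the model cannot say anything the correlation didn't
# already contain.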

And then the other stage is to think through and to use these discriminatory algorithms a little perversely. So to be as concrete as possible: Amazon recently scrapped its hiring tool because it was shown to be discriminatory against women. So if you had “woman” anywhere on your CV, you lost points. Rumor has it if you had “Chad” somewhere, you gained points. And so Amazon just stopped using its program. But what if, instead of scrapping these programs, we used them as evidence of discriminatory hiring, right?

Almost every hiring decision Amazon made went into training that model. So what if we just thank them for spending all this money documenting their discrimination and so use these programs like global climate change models? Global climate change models show us the most probable future based on the past, not so that we’ll blindly accept it, but so we’ll change it.
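
[A sketch of that “perverse” use: probe a model with paired CVs that differ only in one gendered token and record the score gap. The scorer below is a hypothetical stand-in with the reported flaw baked in; the audit logic, not the fake model, is the point.]

def score_cv(text: str) -> float:
    """Stand-in for a trained CV-ranking model (hypothetical)."""
    score = 5.0
    if "women's" in text.lower():
        score -= 1.5  # the learned penalty we want to surface
    return score

cv_pairs = [
    ("captain, chess club", "captain, women's chess club"),
    ("soccer team, 4 years", "women's soccer team, 4 years"),
]

gaps = [score_cv(plain) - score_cv(marked) for plain, marked in cv_pairs]
for (plain, marked), gap in zip(cv_pairs, gaps):
    print(f"{plain!r} vs {marked!r}: gap = {gap:+.1f}")
print(f"mean penalty for the gendered token: {sum(gaps) / len(gaps):.1f}")
# A consistent, reproducible gap is exactly the kind of documented
# pattern one could treat as evidence of discriminatory training data.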

Amanda Anderson: That’s wonderful. One of the elements of your analyses that I find really striking is the emphasis on forms of psychology. In Updating to Remain the Same, you talk about shame and vulnerability. In your current work, authenticity is a key concept. And I’m wondering, particularly with respect to looking at what we can do to change things, or how we can respond to the effects of these complex and proliferating systems, what psychological approaches do you think are important, or how does psychology play a role in attempting to respond to the challenges of new media and big data?

Wendy Hui Kyong Chun: One thing I argue in the book that I just completed is that big data is the bastard child of eugenics and psychoanalysis. And this is because big data allegedly reveals what people really want, and they say this is true because your actions are more true than your speech, or the moments when you’re searching for something are the moments when you’re most truthful. And big data, again, is all about correlations. It’s about figuring out how X and Y move together. So even if you can’t get at this hidden truth directly, if you know that X stands in for Y, you know this thing, right? And psychoanalysis as well is all about correlations.

But whereas for big data these tend to be linear, for psychoanalysis they are far more complex, right? So Lacan talks about the slippery relationship between signifier and signifie[d]. Drawing from Freud, he talks about condensation and displacement or, you know, the classic literary terms, metaphor and metonymy. So I think that, at that level, psychoanalysis, or the notion of analyzing X to get to Y (latent and manifest), is embedded within and drives big data.

And so one thing that the humanities or psychoanalysis can do is understand the ways in which things aren’t linear — or, if they are linear, they’re only linear at the point at which people are so agitated or so triggered that they’ll act in a predictable way. So there is this other entire realm of correlation and action. And so what I think is really key is to get to these other kinds of relations, which are intimately tied with the notion of indifference. Where by “indifference” I mean not, you know, not caring, but living in difference — like, what do you need to care about in order to live indifferently?

Amanda Anderson: Yes, I think that’s one of the things that I found most striking in your essay “Queering Homophily,” where you emphasize how important it is to dwell in certain forms of discomfort, right? That to the extent that we can be lured into feelings of comfort when we are with people who are like us, or when we feel that we are, you know, in a homogeneous context, it’s important to resist that, and to be able to experience certain forms of living within difference, as you put it, that might not be comfortable, but that might be profoundly transformative in other ways, or that might introduce new forms of experience that become pleasurable.

And I think that’s really important. I think that that connects with a lot within the humanities, where there’s a kind of invitation to dwell with difficulty or, as Donna Haraway says, stay with the trouble, right? And that, to me, seems like a really important sort of psychological dimension or sort of psychological invitation, I would say, of your work. I think absolutely that the psychoanalytic dimension powerfully informs your systemic analyses and the principles that guide your research.

But I just was really struck by that, because it seems to me so pertinent to so much that’s going on today, you know — in struggles for racial justice, and in trying to understand a polarized political landscape — that ability to live with difference. I think that’s an amazing aspect of your work and a real contribution.

Wendy Hui Kyong Chun: Thank you.

Amanda Anderson: I also just want to say it’s been an absolute pleasure having you on the show today. Thank you so much for joining us.

Wendy Hui Kyong Chun: Thank you. It’s been my pleasure.

Amanda Anderson: Thanks for listening to this episode of Meeting Street. Wendy Chun’s new book, Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition, will be published later this year by MIT Press. You can find a transcript of today’s show on our podcast page. Our show is produced by the Cogut Institute for the Humanities at Brown University. Our sound editor is Jake Sokolov-Gonzalez. We hope you will join us for our next show.