Feminist Data Set

Facilitated by Caroline Sinders

Feminist Data Set is a multi-year project that uses intersectional feminism as an investigatory framework to interrogate every step of the AI process: data collection, data labeling, data training, selecting an algorithm to use, the algorithmic model, and designing how the model is placed into a chatbot (and what the chatbot looks like).

Every step exists to question and analyze the pipeline of creating with machine learning: is each step feminist, is it intersectional, does each step have bias, and how can that bias be removed?
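One way to keep these questions attached to each step is a simple review checklist. The sketch below is only an illustration, not the project's own tooling: the stage names are taken from the description above, while `StageReview`, `build_review`, and `open_questions` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field

# Stage names follow the steps listed in the project description.
PIPELINE_STAGES = [
    "data collection",
    "data labeling",
    "data training",
    "algorithm selection",
    "algorithmic model",
    "chatbot design",
]

# The questions asked of every step.
QUESTIONS = [
    "Is this step feminist?",
    "Is this step intersectional?",
    "Does this step have bias, and how can that bias be removed?",
]


@dataclass
class StageReview:
    """Hypothetical record of the answers collected for one pipeline stage."""
    stage: str
    notes: dict = field(default_factory=dict)  # question -> written answer

    def answer(self, question: str, text: str) -> None:
        """Store an answer; only the questions listed above are accepted."""
        if question not in QUESTIONS:
            raise ValueError(f"unknown question: {question!r}")
        self.notes[question] = text

    def open_questions(self) -> list:
        """Return the questions that have not been answered yet, in order."""
        return [q for q in QUESTIONS if q not in self.notes]


def build_review() -> list:
    """Create one empty review per pipeline stage."""
    return [StageReview(stage) for stage in PIPELINE_STAGES]
```

A workshop group could call `build_review()` once, fill in answers as discussion proceeds, and use `open_questions()` to see which stages still need attention.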

Making Critical Ethical Software

In machine learning, data is what defines the algorithm: it determines what the algorithm does. In this way, data is activated: it has a particular purpose, and it is as important as the code of the algorithm. But so many algorithms exist as proprietary software, as black boxes that are impossible to unpack.
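A toy example can make this concrete. The sketch below is a deliberately tiny word-count classifier, not a method used by the project; the data sets, labels, and function names are all hypothetical. The same code is trained twice, and only the data changes.

```python
from collections import Counter


def train(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(model, text):
    """Return the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in model.items()}
    return max(scores, key=scores.get)


# Two hypothetical, hand-written data sets with opposite labels.
data_a = [
    ("community archive", "relevant"),
    ("targeted advertising", "not relevant"),
]
data_b = [
    ("community archive", "not relevant"),
    ("targeted advertising", "relevant"),
]

model_a = train(data_a)
model_b = train(data_b)

# Identical code, different data, different behavior.
print(predict(model_a, "community archive"))
print(predict(model_b, "community archive"))
```

Nothing in `train` or `predict` changes between the two runs; the labels in the data alone decide what the model treats as relevant.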

Data inside of software, and especially in social networks, comes from people. What someone likes, when they talk to friends, and how they use a platform is human data: it’s not cold, mechanical, or benign. Data inside of social networks is intimate data, because conversations and social interactions, be they IRL or online, are varying forms of intimacy. How people interact with each other, and what they like, post, and dislike, are “things” that are completely human; those “things” are also data.

Feminism and Technology

Feminist Data Set imagines data creation, as well as data sets and archiving, as an act of protest.

At a time when so much personal data is caught and hidden by large technology companies, used for targeted advertising and algorithmic suggestions, what does it mean to make a data set about political ideology, one designed for use as protest? How can data sets come from creative spaces, and how can they be communal acts and works? What is a data set about a community that is made by that community? It can be a self-portrait, it can be protest, it can be a demand to be seen, it can be intervention or confrontation, or all of the above. It can be incredibly political. What about how a system then interprets that data? What if that system were also open to critique as well as community input?

Thinking Through Feminist Data Collection and Creation: A Feminist Data Set Design Thinking Exercise

Intersectional Feminism as Framework

The goal of this exercise is to explore intersectional feminist thinking in data, but also in society. Who technology is designed for, and how it acknowledges harms, pains, and pleasures, can have political implications.

Intersectional feminism, a term coined by Professor Kimberlé Crenshaw, is the acknowledgment that overlapping marginalizations, across multiple identities and aspects of identity, can lead to different experiences for groups of people, and thus these differences need to be taken into account. For example, race cannot be separated from gender in the case of a black woman (as Crenshaw wrote in a 1991 law paper that developed the concept of intersectionality, a term she introduced in 1989). Intersectional feminism acknowledges that the experiences of a black woman would differ from those of a white woman or a man of color, and that race and gender must be viewed as intersecting and overlapping identities. Intersectional feminism asks that we confront our biases and differences, and acknowledge the nuances of identity and privilege when investigating systems.

Thinking through Data

Creating intersectional feminist data doesn’t have to mean data ‘about feminism’ or data that uses the word feminism, but it can be data from a feminist point of view. An example of intersectional data could be an article on wage inequality. An intersectional feminist article would go beyond a binary view of wage inequality (cis men being paid more than cis women) and acknowledge that cis women of different races also face varying pay discrimination. An intersectional feminist article would outline what women of different races make, since white women are paid more than black women, Latinx/Hispanic women, Asian women, and indigenous women.

But to do this, we first need to think about what we want to explore in the Feminist Data Set. How can we find the ‘right’ topic? We can use design thinking exercises to start exploring a topic.

This is an exercise I run at the beginning of the Feminist Data Set workshop to help get participants thinking more deeply about the topics they will be exploring and researching. This works as an icebreaker or first activity because it’s quiet, iterative, and really fast. Set a timer for two minutes and start writing down any ideas or topics that come to mind. Try to use as few words as possible: not full sentences, just a pair of words to describe an idea. What are you interested in exploring? Is it wage inequality, the pink tax, transgender rights, the history of indigenous activists, misogyny in music, copyright, citations and women scholars? Really, it can be anything. What’s important to note is that we’ll be doing this exercise multiple times, and you should view this as a quick way to get some ideas out there. Not all ideas will be your final ideas. But we are viewing this as a way to cast a net, and then whittle down to the ideas we want to explore. So start writing down any ideas that come to mind. When the timer goes off, spend one minute looking at what you’ve collected. Pick your top two ideas. If you’re in a group setting and feel comfortable, everyone could go around the room and read those top two ideas.

Now set a timer, and repeat the exercise, but try to go deeper on your ideas. Let’s say you selected ‘trans rights.’ That’s a big topic area, so what specifically are you interested in? Is it how trans women are discriminated against? If so, what are the different kinds of discrimination? Are you thinking of an experience you’ve had or a friend has had or a story you’ve read? Try to write that down. Another idea may be pay inequality. What interests you about that? Is there a specific slice or subtopic? If you’re generally interested in a broad topic (like pay discrimination, the pink tax, trans rights, or bias in technology), try to write down all of the smaller and more specific ideas or topics within that broad topic.

Repeat this exercise up to three or four times.

Afterwards, take your sticky notes and start to organize them by theme. How are they related? If you’re not sure yet, just start making small piles. After making these groupings, try naming them. This will help you organize your thoughts to then start searching for data. The more cohesive or concise your thoughts, the easier it is to find text on that topic, such as a blog post, an interview, an article, or an archive.
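If the notes are typed up after the workshop, the same grouping can be sketched in a few lines of code. This is only an illustration under stated assumptions: the note texts, theme names, and the helpers `group_notes` and `search_phrases` are hypothetical, and the exercise itself works perfectly well on paper.

```python
from collections import defaultdict


def group_notes(notes):
    """Group (note, theme) pairs into named piles, keeping the order of notes."""
    piles = defaultdict(list)
    for note, theme in notes:
        # Normalize the theme name so "Pay inequality" and "pay inequality " match.
        piles[theme.strip().lower()].append(note.strip())
    return dict(piles)


def search_phrases(piles):
    """Turn each pile into short phrases to use when searching for text."""
    return [f"{theme}: {note}" for theme, notes in piles.items() for note in notes]


# Hypothetical sticky notes, each tagged with the pile it was placed in.
notes = [
    ("wage gap by race", "Pay inequality"),
    ("discrimination against trans women", "Trans rights"),
    ("pay transparency", "pay inequality "),
]

piles = group_notes(notes)
for phrase in search_phrases(piles):
    print(phrase)
```

Naming the piles first, as the exercise suggests, is what makes the resulting phrases specific enough to search with.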

The Facilitator

Caroline Sinders is a machine-learning-design researcher and artist. For the past few years, she has been examining the intersections of technology’s impact in society, interface design, artificial intelligence, abuse, and politics in digital, conversational spaces. Sinders is the founder of Convocation Design + Research, an agency focusing on the intersections of machine learning, user research, designing for public good, and solving difficult communication problems. As a designer and researcher, she has worked with Amnesty International, Intel, IBM Watson, the Wikimedia Foundation, and others.

Sinders has held fellowships with the Harvard Kennedy School, the Mozilla Foundation, Yerba Buena Center for the Arts, Eyebeam, STUDIO for Creative Inquiry, and the International Center of Photography. Her work has been supported by the Ford Foundation, Omidyar Network, the Open Technology Fund, and the Knight Foundation. Her work has been featured in the Tate Exchange in Tate Modern, Victoria and Albert Museum, MoMA PS1, LABoral, Ars Electronica, the Houston Center for Contemporary Craft, Slate, Quartz, and Wired, as well as others. Sinders holds a master’s degree from New York University’s Interactive Telecommunications Program.