Your face dataset cannot be “racially” diverse

What does it mean to describe a face dataset as racially diverse? This label carries hidden assumptions about the nature of race and its relationship to facial appearances. Unpacking these assumptions raises a 🚩.


The face industrial complex

Face datasets are often developed and used by scientists interested in understanding or engineering how human or machine perception functions. Such datasets gather images of people’s faces depicted in various configurations (facial expressions, lighting conditions, angles, etc.). These images come from many sources. In-person photographs might be taken to build databases dedicated to specific regional or demographic representations; pictures scraped widely from the internet and social media might be used for databases meant to reflect a comprehensive array of human appearances and social groups.

Once a face dataset is collected, it gets released into the world where it can influence various endeavors. These datasets circulate in broader knowledge and political economies. Whether it’s expanding research stimulus sets, advancing technology, or powering algorithmic surveillance, face datasets have been and will continue to be transformed into nourishment for scientific, profit-driven, and imperial projects. For example, larger face sets are being developed under the guise of “national security” by companies eager to profit from amassing publicly available images for surveillance. But… maybe it’s not all doom. Maybe face datasets can serve the purely pro-social purposes they often promise… but what are these purposes, and where do they go wrong?

Marketing diversity

One prosocial benefit face datasets commonly claim to provide is representation: they’re developed in the name of diversity. Calls for more diverse face datasets beyond “White” faces emerged in the context of a broader push for diversity and inclusion across many areas of life, and alongside well-documented phenotypic & colorist racism in face algorithms. In this context, dataset developers frame their efforts as an ethical imperative to make face research and algorithms fairer and more inclusive. This push for representation likely fed a pressure to develop and market face datasets as “racially diverse”.

Check out the following examples; some datasets are new and some are widely used in face research:

  • Hawaiian face set: “a racially and ethnically diverse set of facial stimuli/ facial databases should be comprised of more culturally, ethnically, and racially diverse stimuli to keep pace with growing diversity.”
  • FairFace: “a face image dataset which is race balanced”
  • RADIATE: “The racially diverse affective expression (RADIATE) face stimulus set/ the diversity of this stimulus set reflects census data showing a change in demographics in the United States from a white majority to a nonwhite majority by 2020”
  • MR2: “this face database provides a valuable resource to psychological science. With its variety of mega-resolution photographs of diverse races, it enables researchers to conduct studies with both experimental and mundane realism”
  • Diverse Face Images: “The stimulus set includes high-quality still images of female faces that are racially and ethnically representative”
  • AfricanMaskedFaces: “most of available masked faces dataset are not racially balanced/ If you need to balance your dataset, your can use the AfricanMaskedFaces dataset”
  • Chicago Face Database: “Increasing the racial diversity of available databases makes a significant contribution to many areas of study/ In addition to racial diversity, the CFD also offers a large number of targets within each racial category”

At first glance, these examples may seem unproblematic, even necessary. However, these seemingly benign descriptions hide a deeper problem: no matter how inclusive a face dataset aims to be, it cannot truly be “racially” diverse in the way it is often claimed. This is because facial diversity cannot be “racial” diversity if race is not a biological (nor social) reality.

Race realism in face datasets

The concept of racecraft exposes how commonsense social practices and ways of thinking repeatedly conjure the pervasive illusion that races are real entities in the world (much like witchcraft’s belief in witches). The function of racecraft is to compel people to rehearse a conjuring trick: transforming “racism, something an aggressor does, into race, something the target is, in a sleight of hand that is easy to miss”. This habitual conjuring trick ensures that the material disparities produced by capitalism, which are imposed on people classified into different race categories, are misattributed to the supposedly inherent traits of these illusory races. From this perspective, racism’s operations do not require races, but only the belief in races. I have previously published and blogged about how racecraft manifests in standard scientific analyses that try to characterize and compare the psychologies of differently racialized people, but end up instead producing caricatures and actively racializing those same people. I have also published on how racecraft manifests in face research, what I call facecraft: “the practices and supporting ideologies that mobilize faces and their features to serve as representations of or evidence for the realness of social categories while obscuring the processes that map categories onto faces”.

In face datasets, race labels are assigned to faces in various ways: by facial morphology, by the self-identified race of the person depicted, and/or by race category judgments from third-party observers. These labels are then “validated” using metrics grounded in the logic of accuracy, which assumes there exists a correct answer to what a face’s race truly is or should be. For example, training algorithms to “detect” the race of a face or measuring how “accurate” people are at perceiving a face’s race assumes that race is an objective property that can be identified from a face. But accuracy requires a target, so we need to ask: what is the “racial” referent being measured? The methods and marketing of the above face datasets reveal two referents. One is a biological referent they frame as “racial”: variation in genetic ancestry and/or facial features. The other is a social referent they also frame as “racial”: variation in personal or social identity and/or demographic categories. Both are markers of racecraft.
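To make the point about accuracy needing a target concrete, here is a minimal sketch with made-up data. Everything in it is hypothetical: the placeholder category labels ("A", "B", "C"), the three labeling sources, and the `agreement` helper are assumptions for illustration, not drawn from any of the datasets above. It only shows that an “accuracy” score is a measure of agreement with whichever label source someone has decided to treat as the reference.

```python
from typing import Sequence


def agreement(judgments: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of faces where a judgment matches the chosen reference label."""
    if len(judgments) != len(reference) or len(judgments) == 0:
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(j == r for j, r in zip(judgments, reference))
    return matches / len(judgments)


# Hypothetical category labels for the same five faces from three sources.
self_identified = ["A", "A", "B", "C", "B"]  # label chosen by the person depicted
observer_panel = ["A", "B", "B", "C", "C"]   # label imposed by third-party raters
classifier_out = ["A", "B", "B", "B", "C"]   # label output by some algorithm

# The same classifier output gets a different "accuracy" depending on
# which source is declared to be the ground truth.
vs_self = agreement(classifier_out, self_identified)
vs_panel = agreement(classifier_out, observer_panel)

print(f"'accuracy' against self-identification: {vs_self:.1f}")
print(f"'accuracy' against observer judgments: {vs_panel:.1f}")
```

In this toy case the classifier agrees with the self-identified labels on 2 of 5 faces and with the observer labels on 4 of 5. Neither number says anything about what the faces “are”; each only reports agreement with one act of labeling.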

Biological racial diversity

On the biological side, it should be obvious that marking different genetic ancestries as racial diversity rehearses racism’s production of racial naturalism (race as a biological reality). So much ink has been spilled on this problem, more than I can ever cover here. What’s interesting to me is that some of the datasets show inconsistent recognition of this problem, only to revert to race realism when it comes to their marketing. For example, the MR2 paper frames the faces as “photographs of diverse races”, but their download page describes the face set instead by ancestry: “74 full-color images of men and women of European, African, and East Asian descent”. Similar slips occur for the Chicago Face Database, which waffles between racial, ethnic, and ancestry descriptors across its paper and download page. The Hawaii paper uses terms like “White ancestry” and “multiple minority ancestry”, while its download page states the faces represent “eight different racial groups”. These equivocations are examples of how biological race realism creeps in when these databases are packaged and promoted. Yet many of these datasets are also conceptualized, designed, and marketed with racecraft from the start.

One sneaky example relates to the other biological referent, which frames physical facial diversity as racial diversity. This racist alchemy relies on sorting logic: if faces can be physically or perceptually sorted into race categories, this seemingly validates the working scientific assumption that race is in the face (and can be accurately read from it) rather than read into a face (as compelled by regimes of power that shape sensory experiences). This sorting logic commits Edwards’ fallacy: the fact that people or algorithms can classify biological features into race categories is not evidence for the realness of those categories. An analogy: anthropologists can successfully sort skulls into folk race categories. Does this mean that skulls are inherently “racial”? That skull diversity reflects racial diversity? “Forensic anthropologists are able to classify human remains according to “race” because “racial” categories are also geographical categories, and the little amount of human biological diversity that exists is distributed fairly smoothly along geographical lines”. It follows that faces can be sorted into folk race categories because facial features vary by geography (although displacement and migration patterns likely disrupt smooth clinal patterns), and geography is a major feature used for assigning race categories, for racializing people. Our biology itself is not racial. Face datasets are diversifying geographic & geopolitical, not racial, representations.

Social racial diversity

One might attempt to sidestep the above racecraft critique by shifting the referent, arguing that “racial” diversity is instead about the personal or social identity of the face owners. This can be seen in the Hawaii and RADIATE face datasets above, which are promoted as necessary to better represent growing or shifting U.S. racial demographics. The logic here is that naturally changing “racial” population dynamics alter the visual-facial landscape of the U.S., which researchers should respond to by diversifying their stimulus sets to visually represent these demographic changes. However, this social referent also biologizes race through another form of racecraft: demographic naturalism.

As Michael Rodríguez-Muñiz warns: “Demographic naturalism holds three major assumptions. First, it views populations as “real,” natural, and actually existing entities. In politics, populations are regularly conflated with peoples and attributed collective agency and coherence. Second, it conceives of population trends as akin to natural forces with the potential to affect social and political life unmediated by modes of perception. For instance, demographic anxieties and fears are regularly depicted as automatic, seemingly unavoidable, outcomes of population dynamics rather than as sentiments of political cultivation. Third, it believes that demographic knowledge—as a product of science—more or less reflects or approximates said demographic realities […] Population is not an observable object but a way of organizing social observations […] population trends cannot be studied, known, or managed apart from the political relations, social imaginaries, and statistical techniques and conventions through which we constitute populations… Naturalistic assumptions pervade public discussions about ethnoracial population change. Claims are routinely made about “racial” populations that presume their actual existence and treat racial statistics as plainly objective. But such claims not only express naturalized assumptions about demography, they also rest on and further reify assumptions about race… Demographic rationalities tend to ‘essentialize’ social relations by ascribing fixed characteristics or properties to specific population groups and by introducing reductionist and reifying forms of analysis.”

The race realism gets revealed once we ask: Why would (ostensibly) naturally changing “racial” demographics change facial appearances? The answer is that “racial” groups are assumed to be the source of facial variation, rather than understanding that race classifications are mapped onto people and their faces through racializing political projects. Facial variation will always naturally exist. What is potentially diversifying or changing is how facial variations (people) are sorted into race categories. “What must be asserted vigorously is that there is nothing inherent about individuals, peoples, or groups that makes them “racial” or be seen as “racial” by others. Race does not rest in the eye of the beholder or on the body of the objectified”.

This same problem occurs in datasets that frame personal identity as racial diversity, for instance, datasets that use the racial identity of the face owner as the face’s racial ground truth (exemplified by phrases like “the black faces” or “the multiracial faces”, which presume no other possibility for those faces). But a predictable relationship between someone’s racial identity and their facial appearance can’t be presumed. It should be widely known that facial appearance varies among people who share a racial identity. This is because people who vary in their facial appearance self-racialize (i.e., take on a race label) for various reasons, with facial appearance being only one possible ingredient out of many (e.g., genetic testing, cultural connections, experiences of exclusion, etc.). Likewise, there are various sources of racialization: a race label can be imposed by the person themselves, by their social networks, or by researchers, and it can shift dynamically across times and spaces based on cues beyond facial appearance (context changes the relevant cues). So no matter how diverse the face dataset is in terms of the “racial” identities of the owners, the faces themselves will continue to be subjected to perceptual racialization processes that are context- and perceiver-specific and not necessarily anchored to the identity.

If we understand that race is not in the face because race is neither a biological nor a social reality, then we can understand that assigning a race label to a face is itself an act of racialization. It cannot be “accurate” in a general sense; it reflects how that face is perceived by the specific assigner (the researcher or a survey respondent) in that specific context, which will enact specific consequences for the face’s owner.

Facial diversity is an effective canvas that continues to be (re)signified by racist regimes of power. Framing it as racial diversity does the work of racism.

Stop scientific racialization

Ultimately, the language of diversity and representation in face datasets often operates as a sleight of hand, masking the deeper entanglements between facial classification, race realism, and histories of scientific racialization. In this dynamic, science effectively performs the labor of maintaining human/non-human distinctions (a problem I will write about soon). The intention may be to create more equitable resources or technologies, but the violence of racial categorization remains intact. When researchers are careless about or ignore the potential racecraft and facecraft in their methods, the whole project (the researchers, the faces, the race labels, the marketing) transforms into a racializing assemblage that (re)cycles racecraft back into the face industrial complex. Filtered through the framing of diversity and representation, race labels in face datasets establish a regulatory caricature or benchmark of what races are supposed to look like for other face datasets and for current or future users.

If researchers are truly committed to ensuring their work has a positive impact, this problem must be confronted with clarity about the racializing assumptions underlying common research practices. The challenge is not just to refine existing methods but to imagine new ones that take their political and societal entanglements seriously. That may mean considering the possibility of having to start over by escaping our current epistemic foundations and order, or even the impossibility of successful methodological reform in this world. This will require a willingness to reflect on a tough question: could my research and good intentions be reinforcing the racism I aim to dismantle?


p.s. can’t wait for the day when “white face” or “black face”, which are so commonly written in social science papers, start feeling as problematic as “fag face” (the symbolic face presumed by experimental psychology research on the supposed accuracy of facial gaydar).