A modeler's manifesto: Synthesizing modeling best practices with social science frameworks to support critical approaches to data science

In the face of the "crisis of reproducibility" and the rise of "big data" with its associated issues, modeling needs to be practiced more critically and less automatically. Many modelers are discussing better modeling practices, but to address questions about the transparency, equity, and relevance of modeling, we also need the theoretical grounding of social science and the tools of critical theory. I have therefore synthesized recent work by modelers on better practices for modeling with social science literature (especially feminist science and technology studies) to offer a "modeler's manifesto": a set of applied practices and framings for critical modeling approaches. Broadly, these practices involve 1) giving greater context to scientific modeling through extended methods sections, appendices, and companion articles, clarifying quantitative and qualitative reasoning and process; 2) greater collaboration in scientific modeling via triangulation with different data sources, gaining feedback from interdisciplinary teams, and viewing uncertainty as openness and invitation for dialogue; and 3) directly engaging with justice and ethics by watching for and mitigating unequal power dynamics in projects, facing the impacts and implications of the work throughout the process rather than only afterwards, and seeking opportunities to collaborate directly with people impacted by the modeling.

© Eitzel M. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Introduction
Data science has the potential to work towards a more sustainable, more equitable world. However, whether this potential is realized depends on how our modeling practices move us towards or away from those goals. For example, methods from predictive advertising in industry are being enthusiastically applied to academic, governmental, and nongovernmental contexts (Brown 2015) and to more and more knowledge domains and contexts. But these models (used uncritically) tend to create the conditions that erode trust in modeling because only some people have access to this opaque modeling infrastructure which can impact all of us (Brown 2015). Increasingly, statistical analysis is in the hands of private companies who have no motivation to be transparent (Davies 2017). These conditions mean that modeling can maintain unjust situations or make them worse, exacerbating historical inequities in resources and power (O'Neil 2016). The public may simply not believe or rely on scientists for knowledge, not seeing them or their models as a trustworthy source of information about pressing contemporary concerns (see, for example, the success of climate change denial factions, Dunlap and Brulle 2020).
Modelers have created manifestos for better individual and collective practices (e.g. Derman and Wilmott 2009, Munafò et al. 2017), but the broader field of data science can benefit by grounding these practices in theory and framings from social sciences. In particular, data science can benefit from critical approaches to research, where "the goal is to critique and challenge, to transform and empower," (Merriam and Tisdell 2015) questioning who has power, what structures reinforce that power distribution, and how to make changes if necessary. There are a variety of critical approaches and traditions in different social science disciplines, and in this piece, I engage from a data science perspective with feminist Science and Technology Studies (STS) and related fields to give theoretical grounding and support for critical practices and ideas that many modelers are already discussing.
The paper is organized as follows: I first review several technical and ethical concerns about working with high-volume and high-variety big data (and particularly issues with machine learning techniques). I then describe the methodology behind and research questions motivating this review. Finally, I place modeling literature in interdisciplinary conversation with feminist STS and related literatures, collecting framings and practices into my own modeler's manifesto. This interdisciplinary collection of practices is organized around three themes: 1) context (epistemic consistency, data biographies, and mixed methods), 2) collaboration (triangulation, uncertainty as openness, interdisciplinary fluency), and 3) justice (power dynamics, impacts and implications in society, community-based modeling).

Technical Concerns with High-Volume Big Data
While there are definite advantages to incorporating large amounts of data into modeling, many statistical issues multiply for large datasets ("high-volume big data"), including collinearity, false positives, and significant but tiny effects. Collinearity arises when several predictor variables that are themselves correlated are used to predict an outcome of interest: the group of variables may predict the outcome, but one may not be able to tell which variable is really responsible. There is no way to remove collinearity, though efforts to work with it are long-standing (see Dormann et al. 2013 for a review). The big data attitude of throwing many predictor variables at the model and allowing the data to "speak for themselves" can be problematic when the large number of predictor variables in the model are highly correlated with each other. The larger the number of correlated variables, the larger the problem with collinearity, and the less one can say about which individual variables really predict the phenomenon of interest.
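As a minimal illustration of the collinearity problem described above, the following Python sketch simulates two predictors and computes a variance inflation factor (VIF), one common diagnostic for how much a coefficient's variance is inflated by correlation among predictors. The simulated data, the noise level, and all function names are assumptions chosen for illustration; they do not come from any of the studies cited here.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with only two predictors, R^2 is the squared
    correlation between them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x1 = [rng.gauss(0, 1) for _ in range(500)]
# Hypothetical second predictor that is almost a copy of the first.
x2_correlated = [a + 0.1 * rng.gauss(0, 1) for a in x1]
# Hypothetical second predictor drawn independently of the first.
x2_independent = [rng.gauss(0, 1) for _ in range(500)]

vif_correlated = vif_two_predictors(x1, x2_correlated)
vif_independent = vif_two_predictors(x1, x2_independent)
print(f"VIF with a near-duplicate predictor: {vif_correlated:.1f}")
print(f"VIF with an independent predictor:   {vif_independent:.2f}")
```

A VIF near 1 suggests the predictors carry largely separate information, while a large VIF signals that the model cannot cleanly attribute the outcome to either variable alone.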
If one chooses to use hypothesis-testing methods, the approach of including many predictor variables also means a higher chance of having a "multiple testing" problem. Getting meaning from hypothesis-testing depends on the idea of controlling the rate of false positives (the alpha level or Type I error). This means that an individual statistical test might have a 5% chance of accidentally suggesting that something of interest is going on (when nothing is going on). There are ways to correct for this problem (e.g. "False Discovery Rate," Benjamini and Yekutieli 2001), and different inferential frameworks one could use (model selection, for example Akaike's Information Criterion, Burnham and Anderson 2002); however, the problem can magnify for larger datasets. As the number of variables increases, the number of tests increases, and the number of Type I errors can increase. In addition, sometimes very large datasets have high statistical power, which means that they are likely to detect significant effects because the large number of data points brings down the error in the calculation of test statistics. However, though these effects are statistically significant, the magnitude of the estimated effects can be very small (Spiegelhalter 2017). This amounts to a variable or process of interest being statistically significant but not practically significant. The more data, the more likely that there could be many very small effects which represent very little of interest in reality.
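The growth of false positives with the number of tests can be sketched in a few lines of Python. The first function assumes the tests are independent, which is a simplification; the second is a minimal sketch of the Benjamini-Yekutieli step-up procedure for controlling the false discovery rate. The function names and the example p-values are hypothetical and are not drawn from any dataset discussed here.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent
    tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def benjamini_yekutieli(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Yekutieli step-up procedure at FDR level q."""
    m = len(p_values)
    # Correction factor that makes the procedure valid under dependence.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / (m * c_m):
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


for n in (1, 10, 100):
    print(n, "tests:", round(familywise_error_rate(0.05, n), 3))

example_p_values = [0.001, 0.2, 0.03, 0.8]  # hypothetical values
print(benjamini_yekutieli(example_p_values))
```

Note that the p-value of 0.03 in the example would pass an uncorrected 5% threshold but is not rejected once the correction accounts for the number of tests.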
There are many techniques for handling these issues, but my point is that more data is not automatically better (Boyd and Crawford 2012): methods and handling of statistical concerns still matter. Of course, these problems are also present in smaller datasets, but claims based on smaller datasets tend to be more modest than those currently being made by users of "big data."
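The distinction between statistical and practical significance raised above can be made concrete with a small worked example. The sketch below assumes a one-sample z-test with a known standard deviation and, for simplicity, treats the observed mean difference as exactly equal to a hypothetical effect of one hundredth of a standard deviation; the numbers are illustrative only.

```python
import math


def z_statistic(mean_difference, sd, n):
    """z statistic for a mean difference given a known standard deviation."""
    return mean_difference / (sd / math.sqrt(n))


def two_sided_p_from_z(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


effect = 0.01  # hypothetical effect: 1% of a standard deviation
sd = 1.0

p_small_sample = two_sided_p_from_z(z_statistic(effect, sd, 100))
p_large_sample = two_sided_p_from_z(z_statistic(effect, sd, 1_000_000))

print(f"n = 100:       p = {p_small_sample:.2f}")
print(f"n = 1,000,000: p = {p_large_sample:.1e}")
```

The effect size is identical in both cases; only the sample size changes. With a million data points the same negligible effect becomes overwhelmingly "significant," which is why the magnitude of an effect should be reported alongside its p-value.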

Context Issues with High-Variety Big Data
Data collected over multiple decades by a large and constantly changing set of people raise issues of how to handle high-variety big data. This kind of big data is characterized more by the lack of control over its source than its quantity (Platin et al. 2017, Salmond et al. 2017). Technical issues can include changes in experimental design, missing data, and observer effects. Finding ways to curate, harmonize, and re-purpose long-term datasets is a difficult and iterative process, as demonstrated in the establishment and enactment of a data standard for the Long-Term Ecological Research sites (Millerand et al. 2013) and the assembly of data on model organisms into large databases (Leonelli 2014). Statistical models of these datasets may have to be tailored to the changes in experimental design and observation processes, or else a subset of the data that is sufficiently complete will have to be used (essentially "throwing away" some of the data). To create a model which explicitly represents the observation process, analysts need detailed information on the methods of the study ("pre-reproducibility;" Stark 2018).
Unfortunately, Porter (1996), summarizing a number of historians, notes that "books and journal articles must necessarily be inadequate vehicles" for sufficient information to actually reproduce another person's study. Stodden et al. (2014) point out that "traditional materials and methods sections of most journal publications are simply too short to allow for the inclusion of critical details that make up an analysis. Often, seemingly innocuous details can have profound impacts on the results." Porter even goes so far as to say that "experimental regularities should perhaps be interpreted in terms of human skill rather than of stable underlying entities." So while modeling the observation process can be crucial to re-purposing high-variety data, this can be difficult if there is insufficient information available. This is particularly problematic in situations where many ecological datasets are "going dark," meaning that they are unpublished work which may be forgotten and lost, metadata and all (Heidorn 2008).
Accounting for measurement methods is also a part of using participatory citizen science data, another form of potentially high-variety big data which is currently growing in popularity. (Note that I use the term "citizen science" in the broadest possible sense, referring to participatory research of all kinds, e.g. Eitzel et al. 2017.) Though some professional scientists may continue to doubt the quality of participatory data (Burgess et al. 2017), these projects can produce data of similar quality to those collected by professional scientists (Danielsen et al. 2014). In fact, a wide variety of data validation procedures are used by participatory projects to account for both observer effects and the measurement processes themselves, including methods which apply before, during, and after data are collected (Wiggins et al. 2011). One example of a long-standing participatory project is the contributions of birders to ornithological studies, in which the data are used to study changes in bird populations over temporal and spatial scales that would have been impossible without public participation (Link and Sauer 2007). And because of the potential for participatory science to democratize knowledge production (and the facilitating role scientists can play in the process, Lave 2015), this source of big data is particularly important for justice reasons as well as technical reasons.

Limitations of and Justice Issues with Predictive Models
Faced with enormous quantities of data and many variables, many analysts turn to predictive accuracy as a metric to avoid some of the issues raised above regarding high-volume big data. Concerning oneself only with the model's ability to predict known data can be a robust way to check models (Breiman 2001); however, the way these methods are currently practiced in data science can be problematic.
First, commonly-used predictive machine learning methods are notoriously opaque, though this can vary for the specific model in question. Many of them epitomize Latour's (1987) "black box": the results do not speak to why a phenomenon is happening, or how the modeling result is produced. I agree with Efron (2001) that "the whole point of science is to open up black boxes, understand their insides, and build better boxes," whether that is in order to create a new technology, to improve human or ecosystem health, or to better understand the fundamental nature of matter and reality. And though some proponents of machine learning models even claim that we are witnessing the "end of theory" (Anderson 2008) because these models can successfully make predictions without representing mechanism, this proclamation sweeps the underlying assumptions under the rug (Salmond et al. 2017). Often these assumptions are simplistic: that the data are independently and identically taken from the same statistical distribution (Breiman 2001), or that underlying relationships between variables or categories are linear (though the results and model behavior may be nonlinear). And even "automatic" methods require an analyst to choose an algorithm and to select values for its parameters (Bechmann and Bowker 2019). Furthermore, in many of the pressing problems of the 21st century, we are heading outside the range of our historical data, and we cannot rely on linear predictive assumptions as we move beyond known system behavior (Cox 2001). We must have some idea of mechanism if we want to make guesses about behavior outside the range of conditions we already know about. There are efforts underway to gain insight into these types of models (e.g. Azodi et al. 2020), and predictive models can still be useful even when we are interested in mechanism, for example as part of data exploration (Moore et al. 2019), but they are not beyond theory.
Not only do black box models not help us understand the answers they give; they can also be dangerous societal tools. O'Neil (2016) describes situations in which algorithms become self-fulfilling prophecies which disproportionately affect vulnerable populations, with no recourse to appeal opaque decisions due to the often proprietary nature of the models. She gives the example of questionnaires given to people convicted of crimes, asking about their friends, family, and neighborhoods, then correlating this information with the probability of being incarcerated again (recidivism), and basing the length of their sentencing on their probability of recidivism. Upon completion of their incarceration, the person then has a record with a long prison sentence, may not be able to obtain employment, and may then end up back in prison. The model validates itself, and the person cannot appeal to the black box. These algorithms are also often scalable to ever larger populations of people and transferable to many different domains. These factors, along with the conflation of quantification with "truth" via big data, entrench any biases inherent in the input data ("garbage in, garbage out"), often reinforcing systemic injustice.
In addition, personal data gathered for one purpose is often re-purposed, potentially in ways which violate commitments to privacy and ethics (Boyd and Crawford 2012). Certain groups may be disproportionately impacted by these practices but (for a variety of reasons) have no access to the tools to either criticize existing analyses or to create their own counter-analyses (Mah 2017). All of these justice issues are swept under the rug of the assumption that large volumes of data will benefit the greater good.
I have been especially critical of predictive, algorithmic models. It should be noted that there are efforts underway to audit these kinds of algorithms (e.g. Bodo et al. 2018); not all predictive modeling is unjust, and not all mechanistic modeling is just. My point here is to raise the issue that modeling, regardless of the type, should be done more critically and transparently, in ways that make sense for that type. And all kinds of modeling can benefit from being placed in their larger context: Rather than more mathematical validation tools, modeling needs more contextual validation tools. Interdisciplinary engagement between modelers, with technical skills for working with big data, and STS scholars, with skills for critiquing the processes of science, could be a fruitful way to move towards these kinds of tools. In particular, data scientists could work to make modeling more transparent and accessible to critique, and STS scholars could work to gear critique towards applied changes to practice. The need to fill this interdisciplinary gap led me to seek social science training that could point me towards modeling practices that might address some of the problems I have described.

Data Resources for Review
The issues I have raised in the introduction led me to seek out discussions with colleagues in a wide range of social science disciplines, including anthropology, communications, environmental justice, feminist studies, geography, psychology, science and technology studies, and sociology. During 2015-2018, through one-on-one meetings, participation in group events, auditing courses, and joining reading groups, I compiled a bibliography of pieces that addressed my questions regarding "better" modeling that could mitigate or avoid some of the issues I'd encountered. In reading this material, I reflected on how these social science framings (especially those from feminist STS) could inform my own modeling practices. From 2017-2020, I thematically coded these reflections via an iterative writing process (Gibbs 2015) into a modeler's manifesto of potential practices distilled from the range of materials I was reading. Because this method of synthesis is subject to my own biases about what practices are most effective and what changes in modeling are most important, I provide an appendix detailing my background and values and how they influenced my choices (Suppl. material 1). This strategy follows the feminist STS practice of "situating oneself" (Haraway 1988), which I will describe in more detail in the Discussion under "Contextualizing Modeling." Below, I outline the questions behind the project and my methods for evaluating and validating the results.

Research Questions
The questions that drove my literature review and synthesis process were the following:

1. How do we make self-consistent epistemic choices based on quantitative modeling results (i.e. understanding how the methods are designed to produce knowledge and evaluating how well we follow those methods)? How do we use both qualitative and quantitative information and reasoning to do so?
2. What is an effective way to re-purpose existing data, and is the gain from using this data worth the effort it takes to properly model a variety of quirks in experimental design or observation process?
3. How can we use social science framings, and Science and Technology Studies in particular, in an applied way to work with issues of irreproducibility (the inability to confirm the results of published research) and declining trust by the public in scientists in general and modelers in particular?
4. How can we create and use models, especially algorithmic and big data models, in a more just way (specifically, to ensure equitable benefit for those who are impacted by the models)?
5. How are modelers already discussing and addressing these issues?

Evaluation/validation Process
Assembling the manifesto practices was a qualitative research process; I therefore followed best practices for assessing qualitative research in checking the credibility, consistency, and transferability of the manifesto (Merriam and Tisdell 2015). I used the following cross-checking processes:

1. I grounded the manifesto practices in literature from a variety of social sciences (in particular, triangulating between anthropology, geography, sociology, science and technology studies, and feminist studies). (This cross-checking establishes believability or credibility, as a qualitative version of internal validity.)
2. I checked my practices with the published writings of other modelers who were discussing ways of making modeling better and more trustworthy. (Establishing logical consistency by checking with outside reasoning, as a qualitative version of reliability.)
3. I cross-checked my thinking with others both in and out of my own personal network of modelers and social science scholars: both colleagues as well as editors and anonymous peer reviewers gave insightful comments on the practices and literature I had assembled. Both modeling and social science colleagues indicated they would use the manifesto in teaching their students, indicating good potential for transferability (qualitative generalizability).
4. I continued to read new literature suggested by colleagues until the practices had stabilized at the nine presented below; that is, new literature I encountered tended to fit in with one of the existing practices. This resembles saturation in qualitative sampling (similar to ensuring a large enough sample size in quantitative work).

As a tool for modelers to apply to their own practice, for those developing data science pedagogy, and as a conversation-starter, I believe the manifesto is relatively rigorous as per the above validation processes.

Discussion
I have already noted situations with big data and predictive algorithms where modeling involves hidden and not-so-hidden biases. Because both critical scholars and practicing modelers are aware that modeling and data science can be subjective and deeply entangled with justice issues, we need modeling practices and ways of thinking that enable us to articulate and address these aspects of modeling. In the sections that follow, I draw from a multi- and interdisciplinary range of literature to collect critical data science and modeling practices that could serve this need. The organizing themes of these practices draw from my engagement with feminist STS, so I begin by briefly reviewing important ideas from Haraway's (1988) "Situated Knowledges" and putting them in conversation with allied thinking from modelers.
One way to try to understand the world through a lens distorted by known and unknown biases is to use more than one kind of lens, noting how each may distort the picture. This strategy is a key component of "feminist objectivity," which is "about limited location and situated knowledge" (Haraway 1988). "Feminist" here refers to the discipline in which this idea originated and to its focus on how our identities and social positioning shape how we know things; the approach applies well beyond women or feminism. "Limited location and situated knowledge" refers to the context-sensitive nature of our claims to knowledge. If researchers are transparent about the kinds of biases in our particular view of the world, we can bring our knowledge together with other views to create a more complete picture of reality, or "better accounts of the world," as Haraway puts it. She suggests that each way of knowing, no matter how technologically mediated, is "a wonderfully detailed, active, partial way of organizing worlds." So, to create a better account of the world, more than one of these partial accounts or partial knowledges is necessary. She means, "not partiality for its own sake but, rather, for the sake of the connections and unexpected openings situated knowledges make possible." We can therefore potentially see things with multiple partial knowledges that we could not see with only one perspective.
This framing of each model as a partial picture of the world aligns with modelers' thinking as well, particularly the understanding that "all models are wrong, but some are useful" (Box and Draper 1987). Further, Shackley et al. (1998) conclude that "in any particular practical application, a variety of models are likely to be required, each of which is tailored to a different, fairly specific objective and defined at an appropriate scale of behaviour and measurement." So, simpler models may be used to generalize and inspire questions for further investigation, while more complex models may be used to represent a particular system and help make management decisions (Holling 1966).
Below, I explore these ideas of models as partial knowledges further and outline nine proposed practices, merging my own observations as a practicing modeler with advice and thoughts from both modelers and social scientists from many different disciplines. I name the disciplines of the authors I cite in the manifesto practices (largely based on their affiliations), to give a sense of the breadth of the advice base. The practices are organized into a set of three themes: context, collaboration, and justice.

Theme 1: Contextualizing Modeling
In my pursuit of science and technology studies training and my examination of my own modeling experiences, it became clear to me that contextualizing modeling was a key practice, in a broad sense: both from the perspective of feminist objectivity ("situating" models) and in the sense of detailing methods to improve reproducibility (research question 3). The practices I group under "context" allow for better use of high-variety data (research question 2), and also speak to the evaluation of technical considerations associated with high-volume big data (research question 1).

Practice 1a) Epistemic consistency: Know the epistemology that underlies methods and draw conclusions appropriately
A key context for quantitative modeling is the epistemological background of a method: what is the underlying reasoning for how we believe we can learn about the world from the method? In my experience, understanding this is surprisingly difficult when canned software (whether open-source or proprietary) makes it easy to apply methods without being familiar with the underlying assumptions. Fortunately, I have found abundant online resources and training courses regarding the underlying assumptions of models and have been able to ask computer science and statistics colleagues to explain the often impenetrable documentation and underlying assumptions accompanying canned procedures. In more deeply investigating the epistemological background of statistical methods I have used in my work, I found that the history of the methods is important. For example, Fisher (1926), in the famous statistics paper that originated the criterion of p < 0.05, originally mentioned that criterion as a guideline in illustrating a point; Fisher even used the phrase "personally, the writer prefers." Perhaps in a particular case, p < 0.01 is more appropriate, or p < 0.10. Consulting the history of science literature, I found Porter (1996)
A final note on epistemic guidelines: once I felt confident in the epistemic underpinnings of a method, if the restrictions of a particular quantitative epistemology felt uncomfortable, I have investigated alternatives. Hypothesis testing uses a construction of "null" versus "alternative," and either of these terms could be interpreted negatively or dismissively, while model selection approaches (Burnham and Anderson 2002) do not label or conceptually privilege one model over another, instead ranking a set of models based on predictive fit and parsimony. This conceptual framing may be desirable if some of the models reflect the knowledge of a marginalized group (e.g. Indigenous people, women, people of color, etc.) because small differences in framing can have subtle but important impacts on research participants and collaborators. Also, Bayesian inferential philosophy in the form of "updating" prior knowledge with data appealed to me because it could more closely mirror aspects of learning (e.g. constructivist educational psychology, in which learners build on what they already know, McInerney 2013) than hypothesis-testing strategies.
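The two alternatives to hypothesis testing mentioned above can be sketched briefly in Python. The first part ranks a set of candidate models by Akaike's Information Criterion and converts the differences into Akaike weights, so that no model is labeled "null"; the second part shows Bayesian updating in its simplest conjugate form, a Beta prior on a proportion updated with binomial data. The model names, log-likelihoods, parameter counts, prior, and data are all hypothetical values chosen for illustration.

```python
import math


def aic(log_likelihood, n_parameters):
    """Akaike's Information Criterion: lower values indicate a better
    trade-off between fit and parsimony."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model in the candidate set (sums to 1)."""
    best = min(aic_values)
    relative = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def update_beta_prior(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior plus binomial data gives a
    Beta(a + successes, b + failures) posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical candidate models: (name, log-likelihood, parameter count).
candidates = [("model A", -100.0, 2), ("model B", -99.5, 3)]
aic_values = [aic(ll, k) for _, ll, k in candidates]
weights = akaike_weights(aic_values)
for (name, _, _), a, w in zip(candidates, aic_values, weights):
    print(f"{name}: AIC = {a:.1f}, weight = {w:.2f}")

# Hypothetical prior belief centred on 0.5, updated with 7 successes
# and 3 failures.
posterior_a, posterior_b = update_beta_prior(2, 2, successes=7, failures=3)
print(f"prior mean = {beta_mean(2, 2):.2f}, "
      f"posterior mean = {beta_mean(posterior_a, posterior_b):.2f}")
```

In the model selection part, both models remain in the ranked set with graded support rather than one being rejected in favor of the other. In the Bayesian part, the posterior mean lies between the prior mean and the observed proportion, which is the sense in which prior knowledge is "updated" rather than replaced.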

Practice 1b) Data biographies: Report the details and learn the ethnography of modeling
Another way of conceptualizing "giving context to modeling" is to provide more detail on the experimental apparatus, broadly construed, as a "data biography." Feminist studies scholar Barad (2007) points out that researchers of all kinds really study phenomena rather than objects. She quotes physicist Niels Bohr's conclusions regarding the wave-particle duality of quantum physics: "the unambiguous account of proper quantum phenomena must, in principle, include a description of all relevant features of the experimental arrangement." Barad extends this to the political and social contexts of an experiment along with the description of experimental equipment. Along the same lines as this broader "apparatus list," science and technology studies scholar Taylor (2010) suggests mapping the connections of one's work: "things that motivated, facilitated, or constrained [your] inquiry and action." In addition, sociologist Clarke (2005) suggests that the experimenter themselves is a part of the apparatus to be described, and Indigenous scholars Walter and Andersen (2016)
Both engineers, who suggest describing modeling "paths" (Lahtinen et al. 2017), and geographers and environmental scientists, who seek to reveal the modeling "improvisation" process (Landström et al. 2013), recognize that modeling frequently does not proceed in a linear, logical fashion and that recording the reasons for various decision points may aid understanding of how the final result came about. Some methods have been proposed to give more background detail on models, for example TRACE (TRAnsparent and Comprehensive Ecological modelling documentation, from complexity modelers Grimm et al. 2014). Writing a data biography is also a way of "re-centring the agency of human actors [scientists] in generating, collecting, and collating big data," (from geographers and environmental scientists Salmond et al. 2017) or reporting on the "relational empiricism" of your work: being attentive to "the relations that constitute its objects of study, including the investigator's own practices" (from science and technology studies and feminist studies scholars Verran 2001, Kenney 2015). Being aware of and representing one's own biases and values will help with triangulation processes so that other knowledge on the topic with different biases can be compared and contrasted, as argued by epidemiologists Munafò and Smith (2018).
Methods for writing such "data biographies" could include extending a methods section to describe more details of the work or creating appendices with additional detail. Some journals now offer or suggest venues to publish more detailed accounts of methods (for example, www.protocols.io). One could publish a companion paper which describes the contexts in which the model arose, how the people involved interacted, and what each of their backgrounds and perspectives were. In order to write a data biography, one may need to keep a modeler's notebook or journal to keep track of choices and the reasons for them (proposed by modelers Grimm et al. 2014, Lahtinen et al. 2017). The presence of thorough data biographies would help immensely in the analyses of high-variety big data and reproducibility of existing studies (helping to meet the requirement of "pre-reproducibility," according to statistician Stark 2018), and could also be valuable for historians and science and technology studies scholars. Perhaps interdisciplinary collaborations could provide guidance for modelers about which details could be most useful to record, including details they might not themselves think to mention.
Several issues arise from this idea of a data biography. First, how does one know which details of a context to include as part of the "apparatus list" (raised by geographer Bergmann 2016), and where does one stop? One proposal is to conduct a thought experiment: would the results change if one substituted something different? If so, it could be mentioned, at least briefly, in the data biography. A similar practice could be employed for describing one's own biases and background, highlighting those aspects that most influenced the choices made in the work. A second issue is that in all forms of scientific transparency, the issue of vulnerable populations and data privacy cannot be ignored. In those cases, it is appropriate to work directly with the populations themselves to determine how to balance their privacy and sovereignty with scientific transparency.
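One possible way to keep the modeler's notebook described above is as a structured, machine-readable log that can later be turned into an appendix or companion paper. The Python sketch below is only one hypothetical layout: the field names, the example entry, and the choice of JSON are my own assumptions, not a format prescribed by TRACE or by any of the authors cited.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class NotebookEntry:
    """One decision point in a modeling project (hypothetical schema)."""
    date: str
    decision: str
    reason: str
    alternatives_considered: list = field(default_factory=list)
    people_involved: list = field(default_factory=list)


def append_entry(notebook, entry):
    """Add an entry to the notebook as a plain dictionary."""
    notebook.append(asdict(entry))
    return notebook


notebook = []
append_entry(
    notebook,
    NotebookEntry(
        date="2020-01-15",  # illustrative date only
        decision="Excluded plots with fewer than three visits",
        reason="Too few repeat visits to estimate observer effects",
        alternatives_considered=["Impute missing visits", "Keep all plots"],
        people_involved=["analyst", "field crew lead"],
    ),
)

# Serialize so the notebook can be archived alongside data and code.
notebook_json = json.dumps(notebook, indent=2)
print(notebook_json)
```

Recording the alternatives considered and the people involved, and not only the final choice, is what lets a later reader apply the substitution thought experiment to each decision.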
As a final note, I have sometimes been on the other side, trying to re-purpose data or conduct a meta-analysis when the original study does not provide enough detail to model the observation process. In these cases, I have found myself using qualitative research methods (often learning them inefficiently "on the job"), e.g. interviewing the researchers about their methods. I would have benefited from more formal training in ethnographic methods, either via interdisciplinary instruction on mixed methods, getting advice from colleagues with these skills, or directly collaborating with them, in order to more effectively obtain the needed information in an appropriate timeframe.

Practice 1c) Mixed-methods analysis: Frame modeling processes as including quantitative and qualitative components
There can be an element of qualitative data synthesis at work behind quantitative models, which can also be valuable context to acknowledge. Many modelers openly recognize that data can be quantitative or qualitative (Grimm et al. 2014, Elsawah et al. 2017, Munafò and Smith 2018) and there are many modeling methods which can incorporate a wide range of data (examples: agent-based modeling from anthropologist Agar 2003; expert solicitation of Bayesian priors from statisticians and ecologists Kuhnert et al. 2010). Would it be a stretch to acknowledge qualitative reasoning as well? For example, one could qualitatively integrate one's prior (possibly quantitative) knowledge and past experience into the qualitative aspects of models one creates (functional forms, variable choices). This idea is consistent with the "information processing" model of constructivist educational psychology, in which learners build on past experience as they take in new information (McInerney 2013). Acknowledging qualitative thinking and methods need not weaken the credibility of a model: according to educational researcher Le Roux (2017) "rigour does not lie in the chosen method per se, but in the judicious application of the method and explaining how the process was implemented." Clear accounting of qualitative reasoning and applying established qualitative analysis like the "constant comparative method" in Grounded Theory (described by sociologist Charmaz 2014) could improve the rigor of a study through greater transparency in and understanding of the process. And qualitative analysis could also be useful in assessing models: one could use individual cases as a kind of "ground-truthing" of heavily quantified modeling results, treating the model as a narrative (as described by geographers and environmental scientists Millington et al. 2012).
It is sometimes challenging, however, to retain and report the qualitative knowledge in a highly quantitative and possibly automated modeling process, particularly in typical Methods sections of journal articles, so modelers could find ways in appendices and other venues to include this information in their data biographies.
As I have reflected on how to define quantitative and qualitative analysis, I found that I iterate between both ways of thinking. As I searched for guidance on what distinguished quantitative from qualitative methodology and how to combine quantitative and qualitative data, I eventually concluded that the two are surprisingly commingled, even in existing modeling practice. I encountered many different definitions of "qualitative," and even having learned qualitative research methods, I see easy ways to flexibly apply quantitative assessments to qualitatively-generated data. Similarly, I can identify many points in a quantitative analysis which involve qualitative assessments. For example: looking at a graph of quantitative measurements and then proceeding based on a qualitative observation about the shape of the graph or the clustering of the points; integrating quantitative knowledge about many different sources upfront into a qualitative sense of what to expect from a model result or model performance indicator; or using qualitative arguments to explain quantitative results. And qualitative approaches to data collection can provide valuable insight into designing model structure or determining model parameters. In many ways, the distinction between qualitative and quantitative methods is less about theoretical differences and more about analytical cultural differences. I am an example of a quantitatively trained researcher who has since learned qualitative methods, and I find that the ability to choose between methods in either category --and sometimes combine them --lends richness and rigor to my analysis. And there may be cases where qualitative analysis is taken more seriously if the researcher can speak the language of quantitative analysis as well (e.g. anthropologist Levin 2019).

Theme 2: Collaboration with Other Partial Knowledges
Giving modeling better context allows better perspective on why phenomena might appear a certain way through a given modeling process, but this way of seeing is still only a partial view of the object of interest. Feminist objectivity suggests that we must also find ways to bring a given model into dialogue with multiple other partial views or knowledges in order to get a more complete picture (as per Haraway's "Situated Knowledges"). The practices I group under "collaboration" address ways to implement this strategy. They can help in assessing the technical issues with high-volume big data and lay the groundwork for addressing the issues of algorithmic knowledge (research question 1); this set of practices also connects with issues of reproducibility (research question 3).

Practice 2a) Triangulation: Because no modeling technique is truly objective, seek ways to check models using other knowledge
All models are partial representations, so one way to get a more complete picture is to use "triangulation," a method used by social scientists to bring multiple different datasets or sources of knowledge to bear on the same question (see educational researchers Merriam and Tisdell 2015). Borrowing metaphorically from the mathematical method of locating a point based on distances from other points, this kind of comparative data analysis allows researchers to see where different data connect, substantiate each other, or offer different perspectives. Feminist studies scholar Haraway (1988) suggests that "the knowing self is partial in all its guises, never finished, whole, simply there and original; it is always constructed and stitched together imperfectly, and therefore able to join with another, to see together without claiming to be another." When bringing together multiple partial knowledges, one possible set of outcomes centers on how and where the knowledges intersect or agree (see engineers and epidemiologists Lahtinen et al. 2017, Munafò and Smith 2018). Historical ecology (in which multiple kinds of historical accounts are used to reconstruct how landscapes appeared in earlier times) is a good example of triangulation of different datasets to find agreement (e.g. Grossinger 2012).
Bringing multiple knowledges together does not always result in consensus, however, and allowing for knowledges not to eliminate each other when they do not agree is critical: collaborative modeling should not erase difference (see geographers and interdisciplinary scholars Klenk and Meehan 2015). Where data or analyses do not agree, we can strive to represent the multiple stories that emerge from the models, possibly in the data biography. When accounts conflict, we can benefit from knowing why. Is it because we have not fully understood each account? Is there missing information from one or both? Has one side been marginalized and the other enabled, so that there are political and social reasons why these accounts do not intersect? Is the mismatch due to some difference in the epistemic underpinnings of the two accounts? By investigating these differences we may come to understand more about the system as well as the people studying it, and the methods they (or we) use.
There are established methods for triangulation in the form of meta-analysis. One prominent example from health research is the Cochrane method for combining information from different studies (Higgins et al. 2019). In addition, models can incorporate the observation process explicitly: for example, statistical ecologists Link and Sauer (2007) represent the improvement of birders in identifying birds by including an "observer effect" in which each birder's count can increase over time independent of the actual number of birds present. Models like these position the observer as a force in the model rather than ignoring the human process of observation. This could be a way of situating the knowledge of multiple studies relative to each other, and while it may begin as a mainly methodological choice (e.g. representing the uncertainty of the measurement process), explicitly modeling observation processes could provide an entry point for other aspects of situating knowledge (e.g. asking questions about why things were measured, who was involved, and so on). Some possible ways to bring together different datasets and models to see how they agree or disagree include: comparing modeling results or initial data exploration with other researchers' data or with other groups' knowledge (for example, Indigenous or local peoples' knowledge), or with other datasets; formal sensitivity analysis of model parameters and assumptions or subsets of data to determine what model assumptions change the results; using more informal "analytical flexibility" (as per statisticians, epidemiologists, and psychologists Munafò et al. 2017) to check the robustness of results by trying a variety of inputs and choices and comparing the outputs; and/or using the literature review component of the project's write-up as a way to triangulate, reorienting a "background" or "discussion" section of a paper to emphasize the comparison of the results with what other studies have found using different data or methods. These suggestions are not novel modeling or writing practices -- rather, I am suggesting that we frame them for ourselves more explicitly as triangulation processes.
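As one concrete illustration of the meta-analytic side of triangulation discussed above, the following is a minimal sketch of inverse-variance (fixed-effect) pooling together with Cochran's Q and the I² statistic, which are standard summaries of how strongly independent estimates disagree. It is not a reproduction of the full Cochrane method; the function name `pool_fixed_effect` and the three study estimates are hypothetical, chosen only to show how disagreement between sources becomes visible rather than being averaged away.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Pool independent estimates with inverse-variance weights.

    Returns the pooled estimate, its standard error, Cochran's Q
    (weighted squared deviations from the pooled estimate), the degrees
    of freedom, and I^2 (share of variation beyond what chance predicts).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: large values relative to df signal disagreement.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"pooled": pooled, "se": pooled_se, "Q": q, "df": df, "I2": i_squared}


# Hypothetical effect estimates from three data sources (assumed values).
estimates = [0.30, 0.25, 0.80]
std_errors = [0.10, 0.10, 0.10]

result = pool_fixed_effect(estimates, std_errors)
print(result)
```

In this made-up example the third source sits far from the other two, so Q is large relative to its degrees of freedom. Read in the spirit of triangulation, that is a prompt to ask why the accounts differ, not only a reason to down-weight one of them.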
Triangulation also has something to offer the reproducibility debate. As described in the Introduction, actually reproducing studies requires more than reading methods (from historians of science and statisticians Porter 1996, Stodden et al. 2014, Stark 2018). The technical aspects of reproducibility are challenging enough, and many researchers are working to make this easier (see environmental scientists Boettiger et al. 2015, Kitzes et al. 2017, Ram et al. 2018) -- however, there is a level at which replication may be theoretically impossible because the researchers, the place, and the time are all different from the original. Therefore, more broadly, we could re-frame scientific reproducibility as a triangulation process, explicitly acknowledging that each study is necessarily different. Then, if the aim is to replicate results, at best a study may only be partially confirmatory of the original.

Practice 2b) Uncertainty as openness: Reframe uncertainty as an invitation for collaboration rather than failure
Multiple stories are another way to reframe modeling uncertainty, as well. Environmental scientists Petersen et al. (2011) outline several different attitudes towards uncertainty: the "deficit view" in which uncertainty is itself a number to be quantified, reduced, and eventually eliminated; the "evidence evaluation view" which "focuses on generating robust conclusions and widely shared interpretations of the available limited knowledge" through building scientific consensus; and finally, the "post-normal view" (drawing on the definition of "post-normal science" from philosophers of science Funtowicz and Ravetz 1993), in which uncertainty is an inherent property of complex systems. Anthropologist Mol (2002) describes this last view as the "permanent possibility of alternative configurations," calling the potential for different ways of defining and viewing a scientific object "doubt." However, embracing uncertainty does not prevent us from acting: Mol reassures us that "the permanent possibility of doubt does not lead to an equally permanent threat of chaos" and that "open endings do not imply immobilization." Furthermore, modeling practices (and many other research processes) benefit from recognizing that "capacity-building in the face of uncertainty has to be a multidisciplinary exercise, engaging history, moral philosophy, political theory and social studies of science, in addition to the sciences themselves" (from science and technology studies scholar Jasanoff 2007). Therefore, rather than viewing model uncertainty as a problem, we could view it as an opportunity to engage stakeholders and interdisciplinary teams, an invitation to study a particular aspect of a system more closely, or a call to triangulate the analysis with other knowledge. Petersen et al. (2011) also suggest that their three views of uncertainty are complementary and not mutually exclusive.
Uncertainty could guide us towards what to investigate further (whether quantitatively or qualitatively). For example, sensitivity analysis in population viability models involves investigating which demographic parameters like growth or survival have the largest impact on overall population growth or decline. The results of the sensitivity analysis can therefore point to biological quantities that are important to know precisely in order to accurately assess population viability. If these parameters are not well known, the analysis helps direct research priorities towards better constraining them (see mathematical ecologist Caswell 2001). We could construct models like these that invite us to find those places to look for more information (this is closest to Petersen et al.'s "deficit view" of uncertainty), or point us to places where we should work especially carefully to triangulate with different datasets and knowledges (similar to their "evidence evaluation view"). There are also established methods for summarizing the certainty of meta-analyses, especially when decisions must be made based on this collection of evidence --for example Grading of Recommendations Assessment, Development and Evaluation (GRADE) in medical studies (Meader et al. 2014).
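The population viability example above can be sketched in code. This is a minimal illustration of the classical eigenvalue sensitivity analysis for matrix population models (the approach covered in Caswell 2001), in which the sensitivity of the growth rate λ to matrix entry a_ij is v_i w_j / (v · w), with w the stable stage distribution and v the reproductive values. The three-stage projection matrix and the function name `growth_rate_sensitivity` are hypothetical and chosen only for illustration; the sketch also assumes a nonnegative, primitive matrix so that the dominant eigenvalue is real and positive.

```python
import numpy as np


def growth_rate_sensitivity(matrix):
    """Return (lambda, sensitivities, elasticities) for a projection matrix."""
    a = np.asarray(matrix, dtype=float)

    # Right eigenvector of the dominant eigenvalue: stable stage distribution.
    eigvals, right_vecs = np.linalg.eig(a)
    k = np.argmax(eigvals.real)
    lam = eigvals[k].real
    w = np.abs(right_vecs[:, k].real)

    # Left eigenvector (right eigenvector of the transpose): reproductive values.
    eigvals_t, left_vecs = np.linalg.eig(a.T)
    k_t = np.argmax(eigvals_t.real)
    v = np.abs(left_vecs[:, k_t].real)

    # Sensitivity: change in lambda per unit change in each entry a_ij.
    sensitivities = np.outer(v, w) / np.dot(v, w)
    # Elasticity: proportional version, which sums to 1 over all entries.
    elasticities = sensitivities * a / lam
    return lam, sensitivities, elasticities


# Hypothetical three-stage life cycle (juvenile, subadult, adult).
# Top row: fecundities; other entries: survival and transition rates.
projection = [
    [0.0, 1.5, 2.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.8, 0.9],
]

lam, sens, elas = growth_rate_sensitivity(projection)
print("growth rate:", round(lam, 3))
print("elasticities:\n", np.round(elas, 3))
```

Entries with the largest elasticities are the demographic rates whose uncertainty matters most for the projected growth rate, so they are natural candidates for further data collection or for triangulation with other knowledge.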
Engaging multiple interested parties with an attitude of openness and a knowledge that we may not be able to resolve the uncertainty but must act anyway (the "post-normal" view) might be a way forward for some of the difficult contemporary issues of global concern (see geographers and environmental scientists Voinov et al. 2016). We could seek ways in modeling and writing to quantify uncertainty, explore consensus, and recognize that we may not be able to resolve some things -- and to take an attitude of opportunity and openness in this process. Particularly in situations where we have smaller amounts of data, rather than viewing the lack of predictive power as a problem, perhaps we can use the situation as a starting place for a conversation between different "stakeholders" or people who care about the topic. For example, collaborative modeling under uncertainty can include scenario modeling with expert knowledge elicitation (Voinov et al. 2016). And though communicating uncertainty to the public while maintaining a sense of authority is tricky, this kind of transparency is critical in establishing (or renewing) trust (economist Shafik 2017).

Practice 2c) Interdisciplinary fluency: Be aware of epistemological, normative, and vocabulary differences in diverse collaborations
Seeking interdisciplinary training and experience is key in working on critical contemporary problems (potentially assisting with triangulation, according to epidemiologists Munafò and Smith 2018, and increasing statistical power, according to statisticians, epidemiologists, and psychologists Munafò et al. 2017). Choosing to do interdisciplinary work is also a prescription for encountering differences in epistemology, vocabulary, and research norms (see environmental scientist and ecological economist Lélé and Norgaard 2005) -- meaning that having ways to manage the experience of difference is also important. Difficulties can arise from differences in ways of knowing, recognizing and incorporating values into research, defining the questions of interest, and referring to the objects of study, among many others (Lélé and Norgaard 2005). Learning how to talk to each other is a first step in what geographer Wyly (2009) calls "trust" or "deference" between specialists which can allow for effective collaborations. One specific challenge when communicating between disciplines is working with differing vocabularies. For example, the term "community" is used differently by ecologists and participatory action researchers. Usage of discipline-specific technical terminology should always be accompanied by a willingness to slow down and clarify, whether it is an unusual term (geology: "terrane") or a common term with a particular meaning (statistics: "estimating," sociology: "bracketing").
In learning interdisciplinary collaboration skills, practical experience is key, and explicit training can be invaluable. Then, when one later encounters differences while working in intellectually diverse groups, one has some training to fall back on to manage those experiences. Interdisciplinary work can involve more time and effort than monodisciplinary work, but it opens up many possibilities -- and training in how to do this work supports many of the other practices outlined here. Interdisciplinary collaboration should also enable modelers to address ethical issues at all stages of a project, either by collaborating with ethicists (see biologist and statistician Boden and McKendrick 2017), or by being more aware of ethical issues themselves; to triangulate between different kinds of knowledge; to do ethnography of the datasets they use if the data biographies are incomplete, or to work with ethnographers who can do so; to work with science and technology studies scholars in writing data biographies; and to work more effectively in collaborations outside the academy when in search of more just and democratic modeling processes. Often, where discipline and epistemology differ, interest in or commitment to a particular place, people, technology, or issue can bring colleagues together (Science & Justice Research Center (Collaborations Group) 2013).
Another key point for interdisciplinary collaboration is to include people with different backgrounds from the start: Rather than bringing a statistician in to analyze data after it is collected, or a social scientist in to bring a "social perspective" after the study is already designed, involve everyone in the project from the beginning (Andrade et al. 2014).
As we collectively and mindfully re-imagine contemporary modeling practice, we need interdisciplinary teams of practicing modelers and critical scholars. We can strive to be patient with different timelines and create an environment of mutual respect and trust where collaborators are able to admit ignorance and ask questions. These practices can be fostered at multiple levels, by both institutions and individuals.

Theme 3: Engaging with justice implications of modeling processes
When facing the reality of collaboratively bringing together different kinds of knowledge, issues of justice -- especially whose knowledge counts -- quickly crop up. This phenomenon is especially apparent in issues of algorithmic injustice (see mathematician and economist O'Neil 2016), but it can arise with any kind of modeling. The practices I have grouped under "justice" raise both general and specific questions about how modelers can engage with these issues (research question 4).

Practice 3a) Power dynamics: Watch for and work to mitigate unjust interactions in collaborations
The ways in which different collaborators' relative circumstances emphasize their knowledge production over others can be critical in the success or failure of research. Feminist studies scholar Haraway (1988) suggests that "we need... the ability partially to translate knowledges among very different -- and power-differentiated -- communities." ("Power" here refers to the critical theory sense of the word, not "statistical power," which has a specific mathematical definition; see above.) For example, philosopher Fricker (2007) describes epistemic injustice as "a wrong done to someone specifically in their capacity as a knower," and details a kind of testimonial injustice in which prejudice causes some people's knowledge to have lesser credibility than others'. Unfair treatment in knowledge production can relate to financial resources, workload, authorship and credit, among other dimensions. Power imbalances can arise between academics from different disciplines (in which some are marginalized due to normative assumptions about rigor or usefulness, see environmental scientist and ecological economist Lélé and Norgaard 2005), between individuals with different job titles or seniority levels, and between academics and communities they work with (where communities may be at a disadvantage due to perceived academic authority).
For example, in writing and working on interdisciplinary grants and projects, collaborators from different disciplines do not necessarily receive equal financial (and other) benefits. Even within a discipline, the majority of the labor can often be pushed onto less senior, more vulnerable team members (e.g. graduate students and postdoctoral researchers). Ensuring that greater labor and responsibility comes with appropriate authority, compensation, and credit could help to mitigate this imbalance. Even in explicitly participatory research, many authors do not give authorship credit to the communities they engage with (see interdisciplinary scholars Sarna-Wojcicki et al. 2017). Authorship, along with other benefits of the research, should be explicitly discussed with community partners. In addition, sensitivity to labels and terminology can be important (see interdisciplinary scholars Eitzel et al. 2017): some terms can marginalize collaborators because of preexisting systemic or structural disadvantages. Attention should also be paid to specific histories of injustice, for example colonial histories and negative past relationships between Indigenous people and universities.
Learning to look through different lenses and to perceive the impacts of power differentials between collaborators is not typically part of modelers' training, but modelers can learn to facilitate discussions of these issues with colleagues --explicit facilitation training could be useful here. Even being aware of or receptive to hearing about problems can be an important first step, and documenting these issues in the data biography may also be appropriate. Modelers and their collaborators may need to experiment with ways to mitigate power imbalances, even potentially pushing back on institutional or structural barriers that constrain them, where possible.

Practice 3b) Impacts and implications: Engage with what the work does in the world
Modeling does not take place in a vacuum, and engaging with its justice implications should involve asking who will be harmed and who will benefit from models, or as political ecologists might put it, who are the "winners and losers?" (Robbins 2011). Feminist studies scholar Haraway (1988) says that "feminist objectivity ... allows us to become answerable for what we learn to see." Therefore, we should be responsible for being aware of and open to our own relationships to what we study, for understanding how we are located in a web of power relations, and for being accountable for what we observe, including its impacts and implications -- to be "noninnocent" as Haraway puts it. We can strive for our science to be "response-able" (i.e. able to respond to the situations it uncovers or finds itself intentionally or unintentionally embedded in, sociologist Reardon 2013). Further, if we want to do what sociologist Thompson (2013) calls "good science," we need to do science "with" ethics, in which ethical, legal, and social implications of research are examined when research is planned and in process, not just as an after-the-fact or downstream step (Raji et al. 2020). Frequently, people who should benefit from the production of knowledge and have a hand in directing it are shut out of the process and potentially harmed by it (see sociologist Mah 2017). How can researchers engage with the political contexts of inequities that their work may exacerbate? While we cannot completely control how the world will use our work, that does not absolve us of responsibility in considering these questions before and during our research processes (see economists Derman and Wilmott 2009, and biologist and statistician Boden and McKendrick 2017).
There can be a sense that scientists should stay out of politics in order to be "disinterested authorities" -- but feminist objectivity already points out that we are not objective in the sense of "seeing from nowhere" (Haraway 1988). Recently, Access Now and Amnesty International (2018) have created a Declaration "Protecting the rights to equality and non-discrimination in machine learning systems." Modelers and organizations can subscribe to this declaration and others like it as part of a commitment to working with the impacts and implications of their work. Interdisciplinary collaboration with critical scholars can also help modelers to investigate potential benefits and harms when designing and implementing projects.

Practice 3c) Community-based modeling: Find opportunities to collaborate directly with people who will be affected by modeling results
Where possible, modelers can work directly with the people who will be impacted by their results (though one challenge is identifying not just intended end-users of models, but others who will be impacted as well). Mathematician and economist O'Neil's (2016) examples of algorithmic injustice are so damaging because the algorithms are proprietary and therefore not accountable to those that they harm. Re-imagine O'Neil's recidivism example if the modelers engaged with incarcerated people to find out what interventions or information they thought would keep them from returning to prison when designing models that represent the relationship between the length of prison sentences and the chance of being imprisoned again later. Finding ways to solicit user input on the results of predictive algorithmic models, which could then feed into the improvement of the algorithms, could be a way to reduce the harm done by these models. Even better than holding black-box models accountable, however, is creating models in partnership with the people they will be applied to from the start; environmental scientist Étienne (2013) calls this "companion modelling." This kind of modeling enables researchers to create "better accounts of the world" (feminist studies scholar Haraway 1988). Collaborative modeling also engages people with the modeling process and products and invites the opportunity for more just modeling. This approach incorporates some of the concepts of Participatory Action Research (PAR), in which research is "generally not done on participants; it is done with participants" (educational scholars Merriam and Tisdell 2015) and "seeks to develop and maintain social and interpersonal interactions that are nonexploitive and enhance the social and emotional lives of all people who participate" (participatory scholar Stringer 2013).
Therefore, participatory modeling can be one way to bring together multiple partial knowledges in a more just way. The community could even direct the modeling process from the beginning, focusing on the questions that are most important to them, working with data they collect, and validating and analyzing the results with them as well (interdisciplinary scholars Eitzel et al. 2020a, Eitzel et al. 2020b). Communities ideally should be involved in ongoing future developments and applications of the model, as well (interdisciplinary scholars Eitzel et al. 2021).
While it may not be feasible to engage with users during all phases of modeling, making an effort to increase engagement where possible could be a way to improve modeling transparency and trustworthiness. Many modelers are encouraging working with model users (see mathematicians, geographers and environmental scientists Jakeman et al. 2006, Voinov et al. 2016) whether this means decision-makers (e.g. complexity modelers Grimm et al. 2014), journalists (e.g. statistician Spiegelhalter 2017), or others. As this trend continues, the modeler's role as an expert may shift, but perhaps this is ultimately beneficial: according to geographer Lave (2015), "we might achieve more of our political and intellectual goals by embracing the progressive aspects of our reduced authority than by fighting its erosion." If modelers want to improve equity in modeling, we may need to face a future scientific process in which we are not the gatekeepers of knowledge production.

Conclusions
I set out to understand how I could improve my modeling, and ultimately found that modelers are already engaging with many of the issues I had discovered from my work and my training in science and technology studies (research question 5). I also found that there was solid theoretical support from a wide range of social sciences for the practices modelers were already proposing and implementing. My manifesto reflects this movement-in-progress and also the potential for further work, and is meant to grow and change and be elaborated on: There are many different types of modeling, and the issues raised in the introduction and throughout the manifesto practices may apply differently to simulations, mathematical models, statistical models, and machine-learning driven predictive models.
Future work should therefore involve investigating which practices apply to what kinds of modeling, and to which stages of modeling: for example, project development and choice versus implementation, evaluation, and/or publication (I have made an initial attempt in Fig. 1). And how do other contexts change the applicability of the practices? What does assessing model accuracy or justice look like in physics versus ecology versus sociology? What constraints are there on modelers who work directly with decision-makers? How can models help shape policy and how are they shaped by it? How are these conditions different for models created in industry or the non-governmental sector? How do these practices of contextualizing, triangulating, and collaborating work (or not work) for these different groups and modeling applications? And finally, in what contexts are different manifesto practices politically appropriate? In a political climate which can be openly hostile to scientific knowledge, critical modeling practice should work not to weaken model-based knowledge production but to strengthen it, i.e. "strategic positivism" (Wyly 2009), ideally by reinforcing its empirical basis for knowing and by creating openings for more engagement and better justice for a wider range of people. As we acknowledge our values in our modeling, are we able to triangulate with others who hold different values to create better, more accountable modeling knowledge and restore public trust? Modelers using different methods and coming from different disciplines all think about these questions, often in different ways, but there are commonalities, and where there are differences, comparison and discussion could be fruitful.

To address these questions, I and other modelers could engage in reflective methods like autoethnography, a process of self-observation and analysis that could help reveal how the manifesto practices play out in real modeling situations (Ellis et al. 2011, Newman and Farren 2018, Mendonça et al. 2017, Mulvenna et al. 2018, Bodo et al. 2018, Caesarius and Johansson 2013, Levin 2019). In addition, it may be valuable to investigate if and how this process of creating a manifesto, in itself, shifted my thinking.

Modeler's manifestos have also pointed to the importance of institutional change as well as individual change (Munafò et al. 2017). My ability to create the manifesto was directly supported by the US National Science Foundation's interdisciplinary Science, Engineering and Education for Sustainability fellowship program, which has since been discontinued. So perhaps even bigger questions remain as to how these on-the-ground practices can be supported by larger institutional structures. How can we collectively shift organizations and infrastructures to support better modeling?

Finally, these structures can be supported by greater interdisciplinarity. I have made suggestions that should make modeling practice more transparent to non-technical readers, and suggestions for places where STS critique could be applied to modeling practices. There is great potential for critical STS scholars and practicing modelers to link together across disciplinary divides to address the ethical challenges facing data science. It will take action at many scales and from many perspectives to realize the potential of data science to help build a better world.

Figure 1. Workflow diagram for manifesto practices, showing which project stages may benefit from which practices. Interdisciplinary fluency, engaging with community-based modeling, and paying attention to power dynamics as well as impacts and implications are all important at all stages of modeling work. Epistemic consistency is important throughout model development (the three middle steps of model choice, construction, and description) and communication, while triangulation and mixed methods contribute largely to model development.
The data biography is most important in the model description stage, though one may need to keep a journal and track details of the model development process in order to create the data biography. Treating uncertainty as openness is most important in model communication and application; however, this could feed back into iterative model development steps as well, or one could design models to aid in treating uncertainty as openness.