FAccT 2021. Automated decision-making, causal accountability, and robustness

Summary of Day 2 at FAccT 2021. Automated decision-making, accountability and recourse, and model robustness
conference
explainability
causality
robustness
responsible AI
ML
Author

Hongsup Shin

Published

March 9, 2021

Keynote: In Praise of (Flawed) Mathematical Models

One of the reasons I try to attend the FAccT conference is that I always get inspired by this community’s novel approaches to quantifying and measuring social concepts such as fairness. However, this approach has its own pitfalls: these attempts can get lost in the weeds of abstract and theoretical aspects of the models, which can easily alienate researchers from the real world. Today’s keynote by Katrina Ligett from Hebrew University addressed this issue.

During the talk, she discussed the pros and cons of using mathematical models in general, but also in the context of studying FAccT-related topics. Many of the positive aspects of mathematical models she mentioned were exactly the reasons I have been drawn to this area: mathematical models are explicit about their assumptions, such as what’s included and what’s not, usually come with provable guarantees, and provide solid foundations for future research. On the other hand, mathematical models can easily become a fig leaf by letting bad actors hide behind their specificity and complexity. People also hold the misperception that mathematical models are objective (and thus have authority).

She talked a lot about this duality during the Q&A. It was nice to hear from Katrina that theoreticians have a tendency to be drawn to intellectually stimulating problems and sometimes lose touch with the main purpose our models serve. She also talked about how to bridge the gap between people who build mathematical models and people who do not have the background but are impacted by the models. She first talked about the importance of translational work that explains theoretical concepts in real-world terms. This makes it much easier to critique a model’s basic assumptions, which is where many problems begin.

She ended the talk on a positive note: even with their flaws, our approach should not be to ban mathematical models, but rather to encourage modelers to acknowledge the shared responsibility within the community they serve, to take advantage of the rigor built into mathematical approaches to carefully examine the problems, and to constantly critique the models.

Automated Decision-Making

Reviewable Automated Decision-Making: A Framework for Accountable Algorithmic Systems presents an easy-to-follow and straightforward guide for reviewing automated decision-making (ADM) systems. Being able to review a system makes it easier to hold companies accountable for its consequences, and ADM systems already suffer from a serious lack of accountability. The authors borrow the idea from administrative law’s approach to reviewing human decision-making and present a practical, holistic framework. It covers stages such as commissioning, model building, decision-making, investigation, group deliberation, and assessment. I liked that their idea originates in law, because this will help legislators regulate ADM systems.

Speaking of regulation, An Action-Oriented AI Policy Toolkit for Technology Audits by Community Advocates and Activists was an excellent paper in that it provides an algorithmic equity toolkit designed for diverse stakeholders. I assume this was possible because the authors come from diverse backgrounds: computer science, social and political science, and activism. The toolkit aims to increase public participation in AI-policy action and comes in various formats, including flowcharts, maps, and questionnaires. Through these diverse methods, it delivers comprehensive information about an AI technology, such as its type, functionality, and potential impact. This looks like a handy way to facilitate communication between many different stakeholders, such as communities, government officials, and even technologists.

Accountability and Recourse

Designing Accountable Systems suggests that accountability requires a causal model, because when something happens we need to trace who is accountable for the consequences. The authors draw an interesting distinction between forward- and backward-looking causality. When we build causal models, we use the forward-looking approach, as in “A causes B.” When we need causality for accountability, however, we lean on the backward-looking approach, as in “the fact that A happened last year caused B to die yesterday.” Thus, the authors use Structural Causal Models to evaluate a system’s design for accountability.
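
To make the distinction concrete, here is a minimal sketch of my own (a toy two-variable model, not an example from the paper). The forward-looking part is the structural equation itself; the backward-looking part is the “but-for” question we ask after an outcome has already been observed, holding the exogenous noise fixed.

```python
# Toy structural causal model (my own illustration, not the paper's example):
# forward-looking: the structural equation "B := 2*A + noise" says A causes B.
# backward-looking: given an outcome that already happened, ask whether B
# would have been different had A not occurred (a "but-for" check).
import numpy as np

rng = np.random.default_rng(0)

def structural_B(a, noise):
    return 2 * a + noise

# Observed world: A = 1 occurred and we recorded B.
noise_b = rng.normal()                  # exogenous factors, held fixed below
b_factual = structural_B(1, noise_b)

# Counterfactual world: same noise, but A set to 0.
b_counterfactual = structural_B(0, noise_b)

print(f"factual B: {b_factual:.2f}, counterfactual B: {b_counterfactual:.2f}")
print("A was a but-for cause of B's value:",
      not np.isclose(b_factual, b_counterfactual))
```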

In a similar context, Algorithmic Recourse: from Counterfactual Explanations to Interventions also emphasizes the importance of causal models. Many previous studies on counterfactuals as a means of algorithmic recourse overlooked how the counterfactual scenarios could actually be achieved. Without addressing feasibility, algorithms may provide nonsensical counterfactual explanations such as “in order to get a loan, change your ethnicity.” This study shows that causal models can mitigate this issue, because recommendations derived from causal interventions avoid such nonsensical explanations from the beginning.
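
Below is a rough sketch of the general idea as I understand it (my own toy example with made-up structural equations and a made-up lending model, not the paper’s algorithm): search only over interventions on actionable features, propagate their downstream effects through the causal graph, and re-score the result.

```python
# Toy causal-recourse sketch (my own illustration, not the paper's algorithm):
# intervene only on an actionable feature (income), propagate the intervention
# through assumed structural equations, and re-score; immutable attributes such
# as ethnicity are never part of the search.
def propagate(person, intervention):
    p = {**person, **intervention}
    p["savings"] = 0.3 * p["income"] + p["savings_base"]  # assumed: income -> savings
    return p

def approved(p):
    # Stand-in for a lender's model: a simple linear score with a threshold.
    return 0.5 * p["income"] + 1.0 * p["savings"] > 60

person = {"income": 40, "savings_base": 10, "savings": 22, "ethnicity": "X"}

# Search over increasingly large actions until the decision flips.
for extra in range(0, 101, 5):
    candidate = propagate(person, {"income": person["income"] + extra})
    if approved(candidate):
        print(f"Recourse found: raise income by {extra}")
        break
```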

Finally, Epistemic Values in Feature Importance Methods: Lessons From Feminist Epistemology looks at explainable AI (particularly feature importance) through the lens of feminist epistemology, which I found highly enlightening. The authors say explanations and epistemology (the study of human knowledge) are inseparable; useful explanations have epistemic values. Providing AI explanations in a feminist way means focusing on situated knowledge and avoiding totalization. To elaborate, explanations should respect different perspectives, particularly those of marginalized individuals, and encourage active conversation about the knowledge. Looking back on how I have used feature importance as an explainable-AI tool, it is true that it was usually static, one-way communication that lacked dialogue with stakeholders. The authors suggest that it’s crucial to evaluate explanations in context and keep them interactive. More importantly, since explanations can be wielded in an authoritative way, it’s important to encourage users not to over-trust a single interpretation, which I found a thoughtful insight.

Robustness

It’s well known that proxy features can contain information about sensitive attributes such as race and gender, and people often try to remove these spurious features from the dataset. This practice is not limited to applications with sensitive attributes: ML practitioners drop correlated or spurious features to build more robust models. Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately shows that removing such features can lead to a performance drop. The authors mitigate the problem with a technique called Robust Self-Training, and they also investigate how the performance drop manifests across different groups in the data.
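
Here is a quick synthetic sketch of the kind of measurement involved (my own toy data and model, not the paper’s setup or its Robust Self-Training procedure): train with and without a feature that is correlated with the label, and compare accuracy per group.

```python
# Synthetic sketch (not the paper's experiment): dropping a correlated feature
# can reduce accuracy, and the drop can differ across subgroups.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, n)                       # hypothetical subgroup label
core = rng.normal(size=n)                           # genuinely predictive feature
y = (core + 0.5 * rng.normal(size=n) > 0).astype(int)
proxy = y + rng.normal(scale=0.5 + group, size=n)   # correlated feature, noisier for group 1

X_full = np.column_stack([core, proxy])
X_core = X_full[:, [0]]                             # correlated feature removed

for name, X in [("with proxy", X_full), ("without proxy", X_core)]:
    X_tr, X_te, y_tr, y_te, _, g_te = train_test_split(
        X, y, group, test_size=0.5, random_state=0)
    clf = LogisticRegression().fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    accs = [(pred[g_te == g] == y_te[g_te == g]).mean() for g in (0, 1)]
    print(f"{name:>15}: group 0 acc = {accs[0]:.3f}, group 1 acc = {accs[1]:.3f}")
```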

Fairness Through Robustness: Investigating Robustness Disparity in Deep Learning is an interesting paper in which the authors define group fairness in terms of a model’s vulnerability to adversarial attacks. They first show that in deep learning models, certain subgroups can be more vulnerable to these attacks, meaning the model is less robust for those subgroups. You can picture this as a classification problem with a decision boundary: if the model is not robust and its boundary shifts under attack, samples that are closer to the original boundary are more likely to be misclassified. The authors note that for deep learning models this is a complex matter, because it involves data drift and the model’s learning process. As a first attempt, however, they added a simple regularization term to equalize the robustness bias across subgroups in some of their examples.
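
As a rough proxy for this picture (my own sketch, not the paper’s robustness-bias metric): for a linear classifier, the distance from a sample to the decision hyperplane indicates how large a perturbation is needed to flip its prediction, so we can compare that distance per subgroup.

```python
# Rough sketch (not the paper's metric): compare subgroups' average distance to
# a linear decision boundary, a simple proxy for adversarial vulnerability.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, n)
# Place group 1 closer to where the boundary will end up (purely for illustration).
centers = np.where(group == 1, 0.5, 2.0)[:, None] * np.ones((1, 2))
X = rng.normal(loc=centers, scale=1.0)
y = (X.sum(axis=1) + rng.normal(scale=0.5, size=n) > 2.0).astype(int)

clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]
dist = np.abs(X @ w + b) / np.linalg.norm(w)       # distance to the hyperplane

for g in (0, 1):
    print(f"group {g}: mean distance to boundary = {dist[group == g].mean():.2f}")
```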