Research Ideas and Outcomes :
Conference Abstract
Corresponding author: Ronit Purian (purianro@tauex.tau.ac.il)
Received: 03 Oct 2022 | Published: 12 Oct 2022
© 2022 Ronit Purian, Natan Katz, Batya Feldman
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Purian R, Katz N, Feldman B (2022) Explainability Using Bayesian Networks for Bias Detection: FAIRness with FDO. Research Ideas and Outcomes 8: e95953. https://doi.org/10.3897/rio.8.e95953
In this paper we aim to provide an implementation of the FAIR Data Points (FDP) specification that will apply our bias detection algorithm and automatically calculate a FAIRness score (FNS). The FAIR metrics would themselves be represented as FDOs, could be presented via a visual dashboard, and would be machine accessible.
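As a minimal sketch of what an automatically calculated FAIRness score could look like, the snippet below computes an FNS as the weighted fraction of FAIR metrics a digital object passes. The metric names and weights are purely illustrative assumptions, not part of the FDP specification or of our algorithm.

```python
# Hypothetical sketch: an FNS as the weighted fraction of FAIR metrics
# that a digital object passes. Metric names and weights are invented
# for illustration; they are not defined by the FDP specification.
def fairness_score(results, weights):
    """results: metric name -> bool (passed); weights: metric name -> float."""
    total = sum(weights.values())
    passed = sum(w for m, w in weights.items() if results.get(m, False))
    return passed / total

weights = {"F1_persistent_id": 1.0, "A1_retrievable": 1.0,
           "I1_formal_language": 0.5, "R1_rich_metadata": 0.5}
results = {"F1_persistent_id": True, "A1_retrievable": True,
           "I1_formal_language": False, "R1_rich_metadata": True}
print(fairness_score(results, weights))  # 2.5 / 3.0
```

Such a score, stored as an FDO itself, could then be displayed on a dashboard or queried by machine.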
First, we discuss the context of this topic with respect to Deep Learning (DL) problems: why are Bayesian Networks (BN, explained below) beneficial for such problems?
What are Bayesian Networks?
The motivation for using Bayesian Networks (BN) is to learn the dependencies within a set of random variables. The networks themselves are directed acyclic graphs (DAGs) that mimic the joint distribution of the random variables.
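To make this concrete, consider a toy DAG A → B, A → C (an illustrative example, not taken from the paper): the DAG implies the factorization P(A, B, C) = P(A) · P(B|A) · P(C|A), which the network's conditional tables encode directly.

```python
# Toy example (not from the paper): for the DAG A -> B, A -> C, the joint
# distribution factorizes as P(A, B, C) = P(A) * P(B|A) * P(C|A).
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # P_B_given_A[a][b]
P_C_given_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # P_C_given_A[a][c]

def joint(a, b, c):
    return P_A[a] * P_B_given_A[a][b] * P_C_given_A[a][c]

# The factorized joint must still sum to 1 over all assignments.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(round(total, 10))  # 1.0
```

Structure learning searches for the DAG whose factorization best fits the observed data.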
Real-World Example
In this paper we present a way of applying the Python package bnlearn to tabular data from the DL engine. Since this project is commercial, the variable names were masked; thus, they appear with meaningless names.
Constructing Our DAG
We begin by finding our optimal DAG.
import bnlearn as bn
DAG = bn.structure_learning.fit(dataframe)
We now have a DAG. It has a set of nodes and an adjacency matrix that can be found as follows:
print(DAG['adjmat'])
The outcome has the form shown in the figure.
Rows are sources (i.e., arcs point from the row label to the column labels) and columns are targets (i.e., the column header receives the arcs). Drawing the obtained DAG for one set of variables yields the image in the figure.
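The row-to-column convention can be sketched as follows. bnlearn returns the adjacency matrix as a pandas DataFrame of booleans with this orientation; plain dicts are used here only to keep the illustration dependency-free, and the variable names are made up.

```python
# Sketch of the row -> column arc convention: a True cell means the row
# variable is the source and the column variable is the target.
adjmat = {
    "A": {"A": False, "B": True,  "C": True},
    "B": {"A": False, "B": False, "C": False},
    "C": {"A": False, "B": False, "C": False},
}

def arcs(adjmat):
    """Return (source, target) pairs read off the adjacency matrix."""
    return [(src, tgt) for src, row in adjmat.items()
            for tgt, flag in row.items() if flag]

print(arcs(adjmat))  # [('A', 'B'), ('A', 'C')]
```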
We can see that the target node (in the rectangle) is a source for many nodes, yet it still points arcs to two nodes itself. We return to this point in the Discussion.
So, we know how to construct a DAG. Now we need to train its parameters. Code-wise we perform this as follows:
model_mle = bn.parameter_learning.fit(DAG, dataframe, methodtype='maximumlikelihood')
We can replace 'maximumlikelihood' with 'bayes' to use Bayesian parameter estimation instead. The outcome of this training is a set of factorized conditional distributions that reflect the DAG's structure; for a given variable it has the form shown in the figure.
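As a minimal sketch of what maximum likelihood estimation does under the hood, the snippet below estimates one conditional table P(child | parent) by frequency counting over observed (parent, child) pairs. The data are made up for illustration.

```python
from collections import Counter

# Sketch of MLE for one conditional table P(child | parent):
# each entry is the count of (parent, child) divided by the parent count.
data = [("yes", 1), ("yes", 1), ("yes", 0), ("no", 0), ("no", 0), ("no", 1)]

pair_counts = Counter(data)
parent_counts = Counter(parent for parent, _ in data)

cpt = {(p, c): pair_counts[(p, c)] / parent_counts[p]
       for (p, c) in pair_counts}

print(cpt[("yes", 1)])  # 2/3
```

The 'bayes' option differs by adding prior pseudo-counts before normalising, which smooths estimates for sparse parent configurations.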
Discussion
In this paper we have presented some of the theoretical concepts of Bayesian Networks and their use in constructing an approximated DAG for a set of variables. In addition, we presented a real-world example of end-to-end DAG learning: constructing the DAG using BN structure learning, training its parameters using maximum likelihood estimation (MLE), and performing inference.
FAIR metrics, represented as FDOs, can also be visualised and monitored, ensuring data FAIRness.
Keywords: Bayesian networks (BN), causal inference, conditional entropy, deep learning, dimension reduction, directed acyclic graph (DAG), neural networks, tagging behavior
Presenting author: Ronit Purian
Presented at: First International Conference on FAIR Digital Objects, poster