Counting Sand

Machine Learning: Your Right to Explainability

Episode Summary

Technology has been helping us make decisions for decades. Sometimes those decisions can be easily explained. But in machine learning we take all this data and mix it together, and it becomes hard to find the thread that led to a given decision. Guest Nikos Myrtakis, a PhD candidate at the University of Crete, joins Angelo to talk through making explainability possible.

Episode Notes

How do we make the next generation of machine learning models that are explainable? How do you start finding new kinds of models that might be explainable? Where do you even start thinking about that process from a research perspective?

Nikos begins with a discussion on how we make decisions in general.  In the scientific world, we mostly reason through statistical or cause-and-effect type scenarios.  We can predict outcomes and train our models to produce the results we traditionally expect.

He then discusses early pioneers of this work. Back in the 1970s, for example, rule-based engines were developed to help clinicians make diagnoses, and it turned out that human reasoning is very complex and hard to codify. Dr. Charles Forgy wrote his thesis on the Rete algorithm, from which modern-day rules-based engines stem.

After the AI winter period came neural networks that would encode the rules themselves, which raised an explainability issue: we no longer know why a rule was created. A neural network builds a mathematical model of weighted connections evaluated against the outcome, and without the ability to open up the network and determine why some data was weighted more heavily than other data, explaining the results we see remains a challenge.

There is also a concern arising from the European Union's General Data Protection Regulation (GDPR), under which a person has the right to obtain meaningful information about the logic involved in an automated decision, commonly interpreted as the right to an explanation.

We can look at explainability from two points of view: local and global. The global objective is to extract a general summary of the model that is representative of a specific data set, so we explain the whole model and not just individual decisions. The local objective is to explain a single prediction for a single observation in the data: given a decision from a neural network, a classifier, or a regression algorithm, we explain just that one observation.

There are five problems that present themselves in explainability:  Instability, Transparency, Adversarial Attacks, Privacy, and Analyst Perspective.

Instability: attribution methods such as heat maps are very sensitive to hyperparameters, meaning the way we tuned the network. How we adjusted it changes the resulting interpretation.

Transparency: explainability becomes more difficult the more accurate machine learning gets. Models such as neural networks are black boxes with very high dimensionality, and their explainability tends to be inversely proportional to their prediction accuracy.

Adversarial Attacks: interpretability might enable people or programs to manipulate the system. If someone knows that, for instance, having three credit cards increases their chance of getting a loan, they can game the system, increasing their chance of getting the loan without really increasing the probability of repaying it.

Privacy: explanations usually require access to the original data, which you may not have, especially in complex systems where boundaries exist between companies.

Analyst Perspective: when a human gets involved to explain the system, important questions include where to start and how to ensure the interpretation aligns with how the model actually behaved. Some ML systems have multiple uses, and the analyst has to understand from which perspective of use a given result should be read.

These are some of the specific factors that create the complexity and challenges in explaining machine learning models.

We continue to learn and adjust based on those learnings.  This is a very interesting and important topic that we will continue to explore.

 

Citations

Charles Forgy (1979), On the Efficient Implementation of Production Systems, PhD dissertation, Carnegie Mellon University, ProQuest Dissertations Publishing, 7919143

Nadia Burkart, Marco F. Huber (2020), A Survey on the Explainability of Supervised Machine Learning, arXiv:2011.07876 [cs]

 

Further Reading

https://openaccess.thecvf.com/content_CVPR_2019/papers/Pope_Explainability_Methods_for_Graph_Convolutional_Neural_Networks_CVPR_2019_paper.pdf

https://towardsdatascience.com/explainable-deep-neural-networks-2f40b89d4d6f

 

Nikos' Papers:

https://www.mdpi.com/2079-9292/8/8/832/htm

https://link.springer.com/article/10.1007/s11423-020-09858-2

https://arxiv.org/pdf/2011.07876.pdf

https://arxiv.org/pdf/2110.09467.pdf

 

Host: Angelo Kastroulis

Executive Producer: Kerri Patterson

Producer: Leslie Jennings Rowley

Communications Strategist: Albert Perrotta

Audio Engineer: Ryan Thompson

Music: All Things Grow by Oliver Worth

Episode Transcription

Angelo: Technology has been helping us make decisions for decades, whether that decision is which medication we should take or which restaurant we should eat at. Sometimes the decisions they help us with can be easily explained. For example, in rule-based systems, we can just traverse the tree by which the decision was recommended and find out why.

But in machine learning that is a much more difficult problem because we take  all this data that we use to make the decision. We munge it together. We might  apply some statistical analysis or put it into millions of neurons or maybe turn it  into multidimensional space. And so it becomes hard for us to find the thread  that caused this decision to come out of these technologies. That is an open  research topic.  

Joining me today is Nikos Myrtakis, a PhD candidate from the University of Crete who's studying that very thing. How do we make the next generation of machine learning models that are explainable? I'm your host, Angelo Kastroulis, and this is Counting Sand.

First, let's start with the etymology of the word explain. It's Latin in origin. It's really two words: ex-, which means out, and planus, which means flat or plain. So in other words, an explanation spreads a topic out flat so that we can see it plainly. And it makes a lot of sense if we're trying to understand the details behind a particular decision.

For example, say a decision tree told me that I should not take this medication, and it went down some path in the tree: the reason this medication won't work is because it interacts with another medication that you took. That explanation makes sense and it's actionable.

However, in machine learning, as we mentioned earlier, you take this one piece of data along with perhaps millions of other data elements, and when we combine all this data mathematically it sometimes becomes impossible to find which are the most important facets of it. Or we can know it, but the dimensional space is so large that we as humans can't wrap our heads around 300 dimensions, for example.

So Nikos from a computer scientist or a data scientist perspective, how do you  start finding new kinds of models that might be explainable? Where do you start  even thinking about that process from a research perspective? 

Nikos: So from the scientific perspective, before we try to explain what a machine learning model is doing, we have deductive explanations, which are more like causal explanations. We try to find the possible causes that lead to a specific phenomenon: for instance, all gases expand when heated; this gas was heated, so we know that this gas expanded.

Afterwards, we have statistical explanations. So for example, most people that  use tobacco contract cancer. This person used tobacco so this person contracted  cancer. These are statistical explanations. And unfortunately there is no clear  agreement regarding what an explanation is, nor what a good explanation  entails. In the machine learning context, we have a model that is fitted using  some data.  

So it summarizes, somehow, the data that we provide. Then we need to  understand the underlying mechanisms that lead to specific decisions of the  model. In other words, we have a very complex, for instance, decision surface  and we try to do the reverse engineering and understand what this decision  actually means. 

Angelo: So that's a really important point. We don't really have consensus on what exactly a good explanation is or how you go about getting one. I like this idea of there being two major categories: as you mentioned, the causal explanations and the statistical ones. In machine learning we tend to rely on the statistical ones, and I should say not just in machine learning, but in research, because the example you gave about tobacco and cancer doesn't give you the answer as to what exactly in tobacco causes the cancer. It just says that there's some statistical correlation. And then we can use that to abstract higher-level, more difficult concepts.

Because, as you mentioned, the decision surface is so big that there could be millions of variables that contributed to something. So one way we could do this is to try to figure out which variables mattered, as you mentioned. So now let's go back in time a little bit. Let's see if we can figure out where this all started, with expert systems. Nikos, what's an expert system?

Nikos: They are AI systems that are embedded with domain-specific knowledge, so experts use the systems to enhance their decisions. The first ones, built in the Seventies, were for health care: rule-based systems with a knowledge base built on the observation that several doctors may have different diagnoses, and the goal was to find the most probable one for a specific patient, given some data.

Angelo: Yes, exactly. Diagnosing, though, is interesting, because a diagnosis in a human is actually impossibly hard and complex. But it would make sense for computer scientists to want to deconstruct the thinking process of a human being, of a clinician, as they come up with a diagnosis, and to try to deconstruct the decision tree that they use to get there.

While I think using a rule-based approach to diagnose won't ultimately work, it  is an interesting case because diagnoses are very complicated. We don't know  exactly what caused the disease but what we can do is try to follow the thought  process that a human uses to be able to make their decisions. So if we can kind  of mimic what a clinician is doing then we can produce a system that does that.  

Back in 1979, Dr. Charles Forgy was writing his PhD thesis. He is, by the way, the father of modern rules engines: he invented the Rete algorithm, which is the progenitor of many rules engines that we use today.

We call this kind of technology an expert system because an expert has to encode the rules, but it's exceptionally good at executing those rules. Now, we've talked a lot about these, rules engines, in the past, but the one thing I think is very interesting about expert systems is that, because somebody encoded the rules, we can follow the path of execution in order to find out why a decision was made.

So let's talk a little bit about the kinds of explanations that a rule-based expert  system creates. 

Nikos: There were two types of explanations in expert systems back then. The first one is called the line of reasoning. This is nothing more than a trace of the way that the inference rules were used to produce a decision. The next is the problem-solving-activity type of explanation, which builds on a line of reasoning: the inference rules are the same as before, but now we have a story that a human can understand. It is a storytelling form of explanation, not just something technical. It tries to communicate with you on a human level.

Angelo: Okay. So now we can frame what we were talking about before: this path of execution, that's the line of reasoning, and then there's this other storytelling method. So now, talking about artificial neural networks and machine learning, we can't really benefit from the things that expert systems can, because an expert encoded the rule, whereas machine learning is trying to find the mathematical strength between each of the different factors, and that becomes harder to explain, right?

Nikos: So in the Nineties, after the AI winter, people were not yet willing to believe in or devote themselves to neural networks.

Unfortunately, computing resources were not sufficient back then, but people understood the necessity of explaining the very, very complex surface that a neural network builds to separate the data in a classification problem.

Angelo: Back in the late Nineties and early 2000s researchers were studying,  how could we use artificial neural networks to extract rules? In other words,  rules that would normally be coded by an expert, but instead, now become  coded by a neural network. 

And so now we have explainability concerns because we don't know where  those rules came from or why they were created that way. 

After that, research continued a bit, right?

Nikos: So in game applications you have a simulated environment, a simulated reality, and usually we have AI systems performing a specific task. The whole objective, back in 2004, was to decrypt, in some sense, why an AI took a particular decision. And then in 2018, very recently, a conference was established specifically on fairness, accountability, transparency, and explainability.

And this conference, called F.A.T.—fairness, accountability, and transparency—actually focuses on sociotechnical systems, and many of them include artificial intelligence. So we have now formalized this concept, in some sense.

Angelo: Before we move on to the next one, there are a few things you said that I want to explore. So, we've spent a lot of time building expert systems. In fact, a lot of what I do is build healthcare-specific expert systems. And the explainability of rule-based systems is kind of encoded in the rule, the way that you said it, right?

Then you talked a little bit about the AI winter—that's Rosenblatt and the perceptron, and then the MIT book on perceptrons, which was full of false information and bad data and bad research.

And then everybody just said, yeah, perceptrons will never work—and it turns out they're actually really good, and those claims were disproven about 20 years later. So that's really interesting. There was a little bit of work in neural networks that we missed, but neural networks are actually very hard to explain.

So you mentioned that a little bit of work on explainability was done back then. How did they actually explain neural networks?

Nikos: One of the first settings was images, because an image is easily perceivable by a human being. What we actually try to do is understand which pixels played a significant role in the decision. So you pass the picture through the network until you reach the output layer.

And then you go back in again, and you try to understand which features of the picture played a significant role. So you get a heat map where some of the pixels are really shiny and some pixels are really dark. This is also called a saliency map.

Angelo: This gets into a very interesting area of explainability, because explanations are not simply traces, like in a tree where, say, you went down this path, or this medication isn't working because it interacts with this other medication. This is a completely different way of thinking about it. Saliency maps, those heat maps, are very interesting, and we'll put one in the show notes so you can see it.

But what's interesting about this heat map is that in a neural network you can't just pull the network apart to see what's inside very easily, because again, it's all mathematical weights, and it's hard to think of things in that way. But images are two-dimensional things that we can see.

So, for example, if you feed a picture of something into the neural network, look somewhere in between the layers, take the weights, and create a mask over the top of the image that lightens the pixels that mattered to the ultimate classification and darkens the ones that didn't, you would see which parts of the image the network found more important and which ones it didn't. Now that registers with us as humans.

And we can see it. So you have this picture of a cat, and maybe its nose and its ears light up more, meaning that those parts of the picture were more important to the classification, and the network kind of factored the rest of it out. So that shows us there are more dimensions to this kind of explainability than simply listing out what the model thought.
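For listeners who want to see how one of these heat maps comes about, here is a minimal sketch of a gradient-based saliency map in Python using PyTorch. It illustrates the general idea discussed above rather than any specific method from the episode; the pretrained ResNet and the image file name are placeholder assumptions.

```python
# Minimal sketch of a gradient-based saliency map (assumes PyTorch and
# torchvision are installed; the pretrained ResNet and the image file name
# are stand-ins, not the specific setup discussed in the episode).
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("cat.jpg").convert("RGB")   # hypothetical input image
x = preprocess(img).unsqueeze(0)             # shape (1, 3, 224, 224)
x.requires_grad_(True)                       # track gradients w.r.t. the pixels

scores = model(x)                            # forward pass to the output layer
top_class = scores.argmax(dim=1).item()
scores[0, top_class].backward()              # backward pass down to the input

# Saliency: per-pixel gradient magnitude, taking the max over color channels.
saliency = x.grad.abs().max(dim=1).values.squeeze()   # a 224 x 224 heat map
print(saliency.shape, saliency.max().item())
```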

And so now that takes us in time to your paper, and we'll have that also in the show notes. I encourage everyone to take a look at it and read it when they get a chance. So I'd love to hear your thoughts: why do you think the research you're doing today is important?

Nikos: There is also a matter of concern in the European Union's General Data Protection Regulation, or GDPR for short. There is a specific article in the GDPR stating that a human has the right to obtain meaningful information about the logic involved—commonly interpreted as the right to an explanation.

So this is why it is important. You have to deal with algorithmic bias and discrimination, with unfairness, and then there is privacy and the soundness of the decision made.

Angelo: So, I'm glad that you mentioned the GDPR. On the previous episode,  we talked about the right to be forgotten and how that's a computer science  challenge. But now we're talking about the right to an explanation, which is also  a computer science challenge depending on the kind of technology that is used. 

So thinking about that and why it's so important: it's absolutely true that we should have a right to know what our data was used for and why it led to major decisions about, for example, our credit or our healthcare. We do deserve to have that information.

And providing that is an extremely difficult problem in machine learning, because how do you say how these 20 million neurons came together to reach some conclusion? It's kind of hard to explain that.

One of the papers you sent me was the Burkart paper, a survey on explainability. Interesting, and this'll be in the show notes.

The interesting thing about that paper is that it makes a really strong case, I think, for why explainability is something humans tend to need at certain times, while at other times we're totally fine not getting an explanation. For example, aircraft collision avoidance has had no explainability and a complete lack of human interaction with it.

And we're okay with it because it has worked that way for years and it carries with it completeness: it does what it needs to do, the factors involved are known, and we trust it. But whenever a decision carries with it some form of incompleteness, we need an answer as to why. Healthcare is a great example. There is no way you can know all of the factors involved in something, so we have to ask: which ones did you consider? If it says you should not prescribe this medication, of course I should know why not. And if the answer is that it will interact with this other medication you're taking—okay, that's a reasonable explanation. But if it doesn't have any of that information, you just don't know.

What are some of the things we should be looking for as we look for explanations in machine learning?

Nikos: Okay. So we first need to define the two basic types of explanation. The first type is the local explanation. The objective is to explain a single prediction, a single individual observation, in your data. You have a decision according to a neural network or a classifier or a regression algorithm, and the objective is to explain just that single observation.

Next, we have the global explanation, where the objective is to extract a general  summary that is representative for some specific data set. So we explain the  whole model and not just this local decision.  
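To make the local explanation concrete, here is a minimal sketch in the spirit of local surrogate methods such as LIME: perturb a single observation, query the black-box model, and fit a simple weighted linear model whose coefficients serve as the explanation. The synthetic dataset and the random-forest black box are illustrative assumptions, not the models discussed in the episode.

```python
# Minimal local-explanation sketch (LIME-style): explain one prediction of a
# black-box model by fitting a weighted linear surrogate around that point.
# The synthetic data and the random forest are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

x0 = X[0]                                        # the single observation to explain
rng = np.random.default_rng(0)
perturbed = x0 + rng.normal(scale=0.5, size=(500, X.shape[1]))  # neighborhood samples
probs = black_box.predict_proba(perturbed)[:, 1]                # black-box outputs

# Weight neighbors by proximity to x0, then fit a simple linear surrogate.
distances = np.linalg.norm(perturbed - x0, axis=1)
weights = np.exp(-(distances ** 2) / 2.0)
surrogate = Ridge(alpha=1.0).fit(perturbed, probs, sample_weight=weights)

# The coefficients are the local explanation: which features pushed this one
# prediction up or down in the neighborhood of x0.
for i, coef in enumerate(surrogate.coef_):
    print(f"feature {i}: {coef:+.3f}")
```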

Angelo: Okay, that's great. Now I don't want to paint the picture that no  machine learning is interpretable. There are interpretable machine learning  models, for example, decision trees or some linear models.  

Now, just for our listeners, a decision tree is a machine learning model, much like a decision tree that a human would make, except the machine learning model figured it out itself. Say, for example, I was trying to predict the risk score for a heart attack. The model could determine that cholesterol and whether I am diabetic are the two most important contributing factors, so it would weight them the most, and if those were the two things in the tree, it would be a decision tree with a depth of two.
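Here is a minimal sketch of the kind of depth-two tree Angelo describes, using scikit-learn. The tiny dataset, its labels, and the feature names are made up purely for illustration.

```python
# Minimal sketch of a depth-two decision tree like the one described above.
# The tiny synthetic dataset and its labels are made up for illustration.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: cholesterol (mg/dL), diabetic (0/1); label: high heart-attack risk (0/1)
X = np.array([[180, 0], [240, 1], [260, 1], [200, 0],
              [250, 0], [190, 1], [270, 1], [210, 0]])
y = np.array([0, 1, 1, 0, 1, 0, 1, 0])

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The learned tree is directly readable -- this is what makes it interpretable.
print(export_text(tree, feature_names=["cholesterol", "diabetic"]))
```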

Nikos: So it is very easy to understand why such a system came up with a specific decision. On the other hand, we have more complex, nonlinear, highly parameterized models, such as SVMs, neural networks, random forests, etc. These models are not interpretable by nature, so what we do is explain them in a post-hoc fashion.

Angelo: Okay. So for our listeners, an SVM is a support vector machine. Where  what it does is it tries to draw a decision line between two parts of data and then  it draws some supporting lines on either side and says basically, if you come  inside this buffer you're now on the other side. So it tries to draw a line and  you're on one side or the other. 

Neural networks we've talked about. A random forest is basically a grouping of many decision trees, and we either average them or weight them or do something like that to bring them all together. That's what makes them hard to interpret, because now you've, again, munged it all together and lost that definition.

So because these models do that, you can think of them as just black boxes,  because you can't really see inside that easily to figure out what it is  mathematically that they're doing. For example, an average loses all of the  fidelity of an individual decision. So then how do we know or say that this  model is better than another?  

One way to do that is to use accuracy. We give the model some data and measure the proportion of answers it gets correct—that proportion is the accuracy. The better a model does on data it has never seen before, unseen data, the better we can say the model is. However, accuracy doesn't tell us, at the individual point, why something was decided.
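A quick sketch of that idea, under the assumption of a synthetic dataset: hold out some data the model has never seen and measure the proportion of correct answers.

```python
# Minimal sketch of judging a model by accuracy on unseen (held-out) data.
# The synthetic dataset is an illustrative assumption.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
preds = model.predict(X_test)

# Proportion of correct answers on data the model has never seen --
# it tells us how good the model is, but not *why* any single prediction was made.
print("held-out accuracy:", accuracy_score(y_test, preds))
```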

Nikos: So imagine that we try to explain the mechanism, why this model doesn't work or why this model took this decision. We have to be sure that the explanation method we are going to use follows the decisions that the model makes on the input. So imagine that we have a person, and the model says, okay, this person will have cancer in five years, for instance.

And we say, okay, we need to explain why. So normally we train a different model on the output of the complex one, one that tries to approximate the decision surface for this particular human. Okay. Imagine now that this explanation says: I don't think this person will get cancer in five years.

Angelo: Okay, so here you're talking about using one machine learning model  to train or to explain another machine learning model. What happens though, if  the machine learning models disagree? 

Nikos: Then the explainer deviates from the model that we want to explain, from the decision that we want to shed some light on. This is something that, in our work for instance, we say is not necessarily a bad thing. So fidelity means that the explainer must follow the model that we want to explain.

Stability represents how similar the explanations are for similar instances. So if I take, say, nearest neighbors, we don't assume that the explanations should be the same, but they should inherit some common properties.

Explanation methods today are not actually very stable. And then we have the explanation types. The explanation type I chose to talk about is the global surrogate model, which is explaining machine learning with machine learning. Imagine that we have an SVM and the SVM produces some predictions. We then train a global surrogate model on the output of the SVM that tries to approximate what the SVM considered as the decision surface, using a very simple model like a decision tree.

Angelo: Okay, so let me get this right. We build a complex model, like a support vector machine, an SVM, and we use that to find the decision surface. Then we train another model on that SVM's output—an explainable machine learning model, like a decision tree—that tries to come up with decisions as close as it can to the SVM's.
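Here is a minimal sketch of that global surrogate idea: fit a black-box SVM, train a shallow decision tree on the SVM's predictions, and measure how faithfully the tree mimics it. The synthetic dataset is an illustrative assumption.

```python
# Minimal sketch of a global surrogate: fit a black-box SVM, then train a
# shallow decision tree on the SVM's *predictions* and check how faithfully
# the tree mimics it. The synthetic dataset is an illustrative assumption.
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=5, random_state=0)

svm = SVC(kernel="rbf").fit(X, y)          # the complex, hard-to-interpret model
svm_labels = svm.predict(X)                # its decisions become the surrogate's targets

surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, svm_labels)

# Fidelity: how often the interpretable surrogate agrees with the black box.
print("fidelity:", accuracy_score(svm_labels, surrogate.predict(X)))
print(export_text(surrogate))
```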

Niko, that's a very big idea. Now, these two aren't always going to match, right? You're using a machine learning model to train another machine learning model.

For example, what if I was denied a loan? There could have been many reasons in my credit profile, but the decision tree simplified it down to just income. My income was $30,000, and the tree says that if I increase my income to $45,000 I would have been offered the loan. It then gives me a prescription for how to circumvent the machine learning model, but in reality I might not have been offered the loan, because there are probably many other factors that played into it and may have tipped the decision either way; they just aren't represented in the tree, because the two models are not going to be identical.

Nikos: So this actually requires user constraints, and our objective is to decouple the explanation from hard-coded rules, as we had in the past with rule-based systems. Explanations should be able to explain whatever is provided as input and be decoupled from it.

Angelo: So decoupling these seems really important, and we get to understand the models so that we can effect change and change the outcome of the model. In previous episodes we talked about computing your biological age. You can't change, for example, your physical age if the tree is telling you that that's the most important thing, but there are some other things we could do to affect that age and turn back the clock, or make changes to other kinds of decisions.

So then, has this solved the problem of providing explanations? Are there any  other problems with providing explanations to machine learning models today?

So this hasn't fully solved the problem of providing explanations in machine learning. There are five problems we're going to talk about. We've already talked about instability. For example, with the heat map we talked about how the input pixels can be put on a map, and we can highlight the ones that provide evidence for and against the classification by brightening and darkening the pixels, so we can get an idea of what the network is seeing. But those attribution methods, like a heat map, are very sensitive to the hyperparameters, meaning the way that we tuned that network; how we adjusted it then adjusts the sensitivity of the interpretation. What other problems have we run into with explainability?

Nikos: So the next thing is transparency. In a simple sentence, the explainability of machine learning is usually inverse to its prediction accuracy: the higher the prediction accuracy, the lower the model explainability. Neural networks are state-of-the-art in predictive performance for several tasks, from speech recognition to image recognition, object detection, natural language processing, and the list goes on. But the problem is that they are very high dimensional and their decision surfaces are so complex that no one can understand them. On the other hand, as we said, a decision tree is much simpler and you can interpret it pretty much straightforwardly.

Angelo: And we call that transparency because machine learning models, neural networks, are black boxes with very high dimensionality; as you mentioned, they're so complex. What's interesting is that we can say their explainability is inversely proportional to their prediction accuracy.

So the better they get at predicting—the more complicated they get and the more factors they can take into account—the less simple they become to interpret.

Nikos: Next, another very subtle issue is adversarial attacks. Imagine that interpretability might enable people or programs to manipulate the system. So if one knows that, for instance, having three credit cards can increase his chance of getting a loan...

Angelo: Ah, so then they game the system, basically increasing their chance of  getting the loan without really increasing the probability of repaying the loan. 

Nikos: It can be very dangerous in the hands of someone with adversarial purposes.

The next thing is privacy. To explain anything, we need access to the original  data. So imagine that you have a database and you're provided with a model that  you get from a different company. So, we give the company the data in an  encrypted manner. 

For instance, you anonymize the feature names. So actually the company  doesn't know which features correspond to a real-world instance. And then for  the model to be explained you need access to the original data, otherwise there  is no way to explain it.  

Angelo: Ah, yes. And this goes back again to privacy—a recurring theme that  we've had over several episodes. In order to explain it, I need access to the  original data so that I can explain it to you. But by doing that, I leaked private  data. 

We've actually seen some of this happening in the use of high-dimensional models. For example, when you type in the Google search bar, you'll notice that the type-ahead suggestions are a different thing from the results you get. You'll type three or four words, right, and it'll try to complete your sentence. That is a completely different model than the search.

So that machine learning model, they use word vectors to produce that. And  Google will produce 300 dimensional word vector data. And if you haven't had  a chance to play with Word2Vec, W-O-R-D-2-V-E-C, I encourage the audience  to play with it. It's very interesting. But Google will compile this for us and give  it out to the world. 

So you can actually download the machine-learned version of all their search data, so that you can do these little word-vectorization, type-ahead kinds of things yourself. But what's interesting is that because it is well known, it can be reverse engineered. So if you use that same set to, say, do something with patient data, someone could reverse engineer it, because they know the 300-dimensional data.
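For anyone who wants to play with those pre-trained 300-dimensional vectors, here is a minimal sketch using the gensim library. The file name below is an assumption; the Google News vectors are a separate, large download, and the exact path on your machine may differ.

```python
# Minimal sketch of playing with pre-trained 300-dimensional word vectors
# using gensim. The file name is an assumption -- the Google News vectors
# must be downloaded separately and your local path may differ.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

print(vectors["doctor"].shape)                 # a 300-dimensional vector
print(vectors.most_similar("doctor", topn=5))  # nearest words in vector space
```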

You can't crack it. However, we all know how it was created. And so we can  reverse engineer it. What's the last item, the fifth item that you think is a  problem? 

Nikos: The last thing is the analyst perspective, and it remains an open question which observations in a dataset an analyst should explain first. Imagine that we do the global explanation, where we try to explain the whole model: it is not trivial to decide what an analyst should examine first to start his investigation. We propose an approach in our paper, where we tried to explain anomalies, for instance.

The first thing you see after you have explained the anomalies is the  disagreement between the explainer and the model. The disagreement captures a  lot of information. And then you can actually understand why. 

Angelo: And you're saying that at this point we need to get a human involved, because the human has to be able to forensically, kind of, do the spelunking to figure out what it is that's gone wrong. And that's where your research is centered, which is amazing. Nikos, really fascinating. I want to thank you so much for joining us today.

I think what you're doing is really great and appreciate your insights. 

But before I let you go, I just want to ask you one more question. Where do we  take it from here? 

Nikos: The first thing is that we don't know exactly how we should evaluate the explanation methods. Normally it's either by human inspection or by isolating some relevant features in a dataset. So imagine that we have a dataset where the predictive task is to assess whether or not a person will get heart disease.

So we inject, for instance, artificial noise in this data and then we try to see  whether the explainer was able to identify only the initial features and not the  noise. So we try to perform, in some sense, some feature selection knowing that  the relevant features are there and assuming that they are indeed relevant. 
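Here is a minimal sketch of that evaluation idea: inject artificial noise features and check whether the explainer ranks only the original features highly. The synthetic dataset and the use of a random forest's impurity-based feature importances as a stand-in "explainer" are illustrative assumptions.

```python
# Minimal sketch of evaluating an explainer by injecting artificial noise
# features: the explainer should rank the original (relevant) features above
# the injected noise. The dataset and the use of impurity-based feature
# importances as the "explainer" are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1500, n_features=5, n_informative=5,
                           n_redundant=0, random_state=0)
noise = rng.normal(size=(X.shape[0], 5))   # 5 pure-noise features
X_aug = np.hstack([X, noise])              # columns 0-4 real, 5-9 noise

model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_aug, y)

# Rank features by the explainer's importance scores and check whether the
# top 5 are the original features rather than the injected noise.
ranking = np.argsort(model.feature_importances_)[::-1]
top5 = set(ranking[:5])
print("top-5 features:", sorted(top5))
print("explainer recovered only real features:", top5 == {0, 1, 2, 3, 4})
```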

Angelo: I see. So we don't really have a way of testing the quality of this kind  of thing as a standard as we do in other data systems. 

Nikos: And the next thing is a more futuristic one—to have adaptive explanations. Imagine that each human understands the world differently. So for you, maybe an explanation that highlights the feature relevance in an image is good, but for me, maybe a storytelling explanation is better.

So it would be interesting if intelligent systems interacted with the user or the analyst, learning information about their mental model through some question-answering procedure, and then adapted the provided explanation, building a story in a stage-wise manner. They learn your mental model in order to approximate how you are best served, and then decide the method. This is like building AI to choose which explanation method fits your psychological profile better.

Angelo: I have seen this kind of thing firsthand. As I've mentioned a few times, we've built a lot of rules-based engines in the past, and especially in the clinical setting we have found that it's important to have explainability of the rule. And different consumers of this engine will have different reasons for explainability.

For instance, a computer scientist who wants to track down bugs, or a data scientist who wants to find out how the data was fed into the system so they can see what it does, will want certain kinds of explainability from the system. They want to see the data, with a tree or some other kind of view, to help them debug it.

A clinician, though, will get inundated with that kind of stuff—you know, stack traces and things. They don't want to see that. They want to see what the key piece of data was that caused them to see this piece of information. And then again, for an end user, I like what you were saying about stories.

So for the listener, when we talk about stories, it's something like, it might say, I  understand you're having a problem with pain and I recommend you do the  following actions and then, you know, this could work and if this doesn't work,  consult this doctor. It provides more of a narrative for you to know what you  should be doing next. 

Now, depending on who the consumer of this engine is, the same engine could be consumed many times, and so you'd want to have different kinds of explanations. That is an amazing and open research topic.

Niko, thanks again for joining us. I really appreciate all of the research that you're doing and look forward to seeing how it's going to continue to grow and contribute back to all of what we're doing.

I also want to take a moment to thank the listener for joining us on this journey of our first season of the show. This is the final episode of the season, but we'll have a recap episode coming up that I hope you will just love. I'm your host, Angelo Kastroulis, and this has been Counting Sand. Please take a minute to follow, rate, and review the show on your favorite podcast platform so that others can find us too.

Thanks again for listening.