Counting Sand

The Promise of AI: Opportunities and Obstacles

Episode Summary

AI is heralded as the future. Indeed, it has had a lot of success and promise, but what prevents it from fulfilling its true potential? What are the hidden dangers lurking in AI, and how do we regulate it? In this episode of Counting Sand, Angelo is joined by Dr. Eric Daimler, who leads MIT's first-ever spin-out from its math department. He co-founded six tech companies and was a presidential innovation fellow during the Obama administration, where he helped drive the agenda for US leadership in research, commercialization, and public adoption of AI.

Episode Notes

This show often discusses artificial intelligence and ideas to consider as technology progresses. We have discussed the deep tech of how it works and its implications on privacy. In this episode, we'll talk about the complex and controversial topic of AI policy and speak about some of the things we should be worried about regarding its future.


In a time crunch? Check out the time stamps below:

[01:15] - Guest Intro 

[03:38] - Western technology leadership

[04:50] - Regulating AI

[11:00] - The promise of self-driving cars

[13:05] - Auditing AI data models

[17:50] - Neural networks to train AI

[19:00] - Shifting mathematical foundations and the AI bottleneck

[20:35] - What is in the way of the promise of AI

[24:20] - Eric Daimler's upcoming book

[27:50] - The uses of trained AI models

[29:30] - Health care industry data usage

[33:25] - AI to speed up research

[33:50] - Rule engines and symbolic AI


Guest Links:

https://www.linkedin.com/in/ericdaimler/

https://conexus.com/


Our Team:

Host: Angelo Kastroulis

Executive Producer: Náture Kastroulis

Producer: Albert Perrotta

Communications Strategist: Albert Perrotta

Audio Engineer: Ryan Thompson

Music: All Things Grow by Oliver Worth

Episode Transcription

Angelo: AI has been heralded as the future. Indeed, it has had a lot of success, but what is preventing it from fulfilling its true potential? What are the hidden dangers lurking in AI and how do we regulate it?

Thanks for joining us today. I'm your host Angelo Kastroulis, and this is Counting Sand.

Joining me today is Dr. Eric Daimler, who leads MIT's first-ever spin-out from its math department. He also co-founded six tech companies, and as a presidential innovation fellow during the Obama administration, he helped drive the agenda for US leadership in research, commercialization, and public adoption of AI. So, Eric, first of all, thank you for joining me. I'm really excited about our conversation.

On this show, we talk a lot about artificial intelligence and some of the things to think about around the topic, be it the deep tech of how it works or some of the implications on privacy. Today, we're going to talk about policy related to it, maybe some of the things we should be worried about.

I'm really excited about your perspective on that, because you have a really unique perspective, working on it from the policy side during the Obama administration. I'd love to hear more about that. Your upcoming book is really interesting, and we're hopefully going to get into that just a little bit. It fits right into the kinds of things we do.

Before we get into the policy side of things, I think it makes sense to set the stage for a second and talk about some of the things that are a little bit dangerous, or maybe things that can be misused, in artificial intelligence.

Now, our listeners have probably heard me say a lot of times that AI introduces bias; it's just inherent in what it does. Artificial intelligence is also usually very opaque: it's hard to figure out what those things actually do, it's kind of a black box. But what I want to get your opinion on is this: what do you think is the biggest danger lurking inside most AI implementations? Lurking sounds a little sinister, but I think it's a fair way to describe it. What do you think?

Eric: Y'know, it's funny how often I hear that. The biggest risk in AI implementations, I think, is from us not understanding the limitations of AI deployments. There are so many ways in which these can go badly that people can often misattribute where their weaknesses are. You know, the old adage that, if you're not knowledgeable in the art, what you think is hard is actually easy and what you think is easy is actually hard, really applies to AI implementations. I think the biggest issue right now is the bottleneck for AI, which is bringing the right data to bear. You know, everyone seems to have gotten the memo that data's the new oil and all that.

So, we find that many, many organizations are collecting more data, but they're not actually making use of that data. They may bring some data scientists in, but that creates even more data. And those that are not digging into it deeply will sometimes flounder; I see this in some of the organizations with whom we work.

I saw it in the government. They're trying to find the needle in the haystack, and they will just add another bale of hay to get to the needle. Or, you know, there's an XKCD cartoon that says, well, if I just keep stirring the haystack, the answer will come out sooner or later. That actually happens a little too often, I think, in data science: you know, if I just apply enough statistics, I'll get something and I'll pass it off as an answer. That's actually the biggest danger in AI.

It's not from Nick Bostrom's excellent work around AI consciousness, which I agree with. It's not in the issues of bias, which I agree with, and I admire the people working to solve that problem. It's not in explainability, which I agree with, and I admire the people working to solve that problem. I think it's actually in the misunderstanding of its limitations, and therefore the misapplication of these technologies, with some very talented people, you know, doing their best.

That's where I think the biggest danger is. The outcome of that could be, then, resistance, a reluctance to adopt this tech. And I think that is the biggest danger to American technical leadership, maybe even, say, Western technical leadership: if we as a society resist the adoption of, sometimes, these lifesaving technologies. We want more people to be engaged and to understand. Not necessarily to be programmers themselves, but to understand enough to have a conversation about where and how these life-saving technologies can be deployed.

Angelo: Yeah, first of all, I really appreciate the XKCD reference. That was great. But I think that's a really good point. You know, AI is an extremely important technology. And I think that sometimes, if we're not careful, we can become too afraid to implement it because we look at all the negatives and all the downsides.

There are certain areas where it's going to make a very quick impact with little effort, but you're right: adding more data grows the set of problems. We all know that probably the most important piece of data for any AI is the data we didn't include. We just didn't know to include it.

That was probably the biggest factor. So I appreciate that. I know there's been a lot of talk; Elon Musk, for example, has been talking for a very long time about regulating AI, and that's challenging. What are your thoughts on the idea? How do we even do that? How would we regulate AI?

Eric: Yeah, I appreciate the conversations we have around regulation. I chaired many of these when I was working during the Obama administration. The things I can share publicly I put out in an open letter to the new administration about how we can approach this regulation. I can lay out a couple of those issues.

The first easy one, just for people to visualize, not necessarily to implement, is separating the data from the data model. You know, we can have bias in the data. We can have bias in the data models, the algorithms. But separating those out can give us some purchase in thinking about what we want to have happen.

The data that we collect: think about an easy example, where Delta Airlines, where I often find myself, will know my height, and my age, and maybe where I fly from and to. But then they have their data model that may determine some configuration of that data for their election to give me some sort of benefit, an upgrade, call it. Those separations, I think, are going to be really useful in where we want to regulate.

So the data on me and characteristics of me may, in some sense, become easier to find and therefore somewhat meaningless. But the data model, the behavior I exhibit when I, say, walk past a coffee shop or have a potato chip, you know, some of these companies will end up knowing more about my physiology than me.

Some companies will end up knowing more about my behavior than I do myself. That is their knowledge. That's a data model. That's increasingly valuable. These companies will be data firms, AI firms, even though they may own airplanes or own coffee stores or make potato chips, because they will have learned about the relationships between these data as represented in the data model.
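To make that separation concrete, here is a minimal Python sketch of the idea. All names (SubjectData, BehaviorModel) are invented for an airline-style illustration, not drawn from any real system: the declarative facts about a person live in one artifact, and the learned behavioral model lives in another, so each can be regulated, retained, or audited on its own terms.

```python
from dataclasses import dataclass

# Raw, declarative facts about a person: easy to enumerate, disclose, or delete.
@dataclass
class SubjectData:
    height_cm: int
    age: int
    home_airport: str

# The learned behavioral model is a separate artifact. It holds relationships
# inferred *from* data, and can be versioned, disclosed, or audited on its own.
class BehaviorModel:
    def __init__(self, weights: dict):
        self.weights = weights  # e.g., toy upgrade-propensity coefficients

    def upgrade_score(self, subject: SubjectData) -> float:
        # A toy scoring rule; a real model would be fit from historical behavior.
        return (self.weights.get("age", 0.0) * subject.age
                + self.weights.get("height", 0.0) * subject.height_cm)

# Regulation can then address the two artifacts separately: retention rules
# for SubjectData, audit or disclosure rules for BehaviorModel.
passenger = SubjectData(height_cm=183, age=45, home_airport="PIT")
model = BehaviorModel(weights={"age": 0.2, "height": 0.01})
print(model.upgrade_score(passenger))
```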

So, we have a data model and we have data. We can then look to see where these data models are linked. These data models can be, essentially, a set of automations. Will we have these automations operate in discrete units or be linked? That is a question right now being answered team by team. You know, people often think about Meta or Google operating as these large organizations, which they certainly are, but really how their technologies get expressed is through the product managers' quarterly objectives.

And through the 18 million programmers around the world who are just working in their own self-interest and their best understanding of how these technologies can provide value. We then, as citizens and as policy makers, can be thinking about where we want to put in circuit breakers. Now, we do this occasionally, and it's kind of implicit in some of our thinking, but we need to make it more and more explicit.

So an easy example is the automated car rolling down the street, coming across a crosswalk. I'm here in San Francisco, where I see Zoox cars and Waymo cars about every 10 minutes, you know, testing out the different lighting in the neighborhood. As one of those Zoox or Waymo cars comes upon a crosswalk, does it slow down? Does it stop? Or does it keep going, interpreting the light as a person or a shadow or some random tumbleweed that happened to find its way into the city? That is an instance of a circuit breaker opportunity; "does the driver intervene" is how we would describe that.

That is not just a conversation for the manufacturer. That's a conversation for us as a society: where do we want people to interact? We can place those circuit breakers across a lot of different links of data models, whether that's in autonomous cars or in the granting of credit. You know, there is another good example of this, which is to think of us applying for a mortgage on a house.

So a lot of people during COVID were looking at redoing their housing configurations, moving into bigger spaces with their own offices and so forth. What that often entails is people searching for homes and then separately searching for a mortgage, and a lot of other different machinations.

If you were to link these sorts of automations, then it might be a little uncomfortable to, say, for instance, enter your income, enter your employment, and then up comes a list of houses. That would be a little weird. I'd say, wait, I missed the intermediate step. Just because you can link automations, just because you can link data models, just because you can link, we'll say, these learning algorithms, doesn't necessarily mean that we want to as a society.
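As a rough illustration of the circuit-breaker idea, here is a short, hypothetical Python sketch, not any vendor's actual control logic: a checkpoint between linked automations that defers to a human whenever the upstream model's confidence falls below a threshold.

```python
from typing import Callable

# A "circuit breaker" between two linked automations: when the upstream model's
# confidence falls below a threshold, the chain halts and defers to a human.
class CircuitBreaker:
    def __init__(self, threshold: float, ask_human: Callable[[str], bool]):
        self.threshold = threshold
        self.ask_human = ask_human

    def check(self, label: str, confidence: float, context: str) -> bool:
        if confidence >= self.threshold:
            return True  # automation proceeds on its own
        # Low confidence: break the chain and require explicit human sign-off.
        return self.ask_human(f"{context}: model says '{label}' ({confidence:.0%})")

def human_review(prompt: str) -> bool:
    print("NEEDS REVIEW:", prompt)
    return False  # fail safe: without sign-off, the downstream automation stops

breaker = CircuitBreaker(threshold=0.95, ask_human=human_review)

# Crosswalk example: the perception model thinks the shape is a shadow,
# but only at 70% confidence, so the car must stop rather than roll through.
proceed = breaker.check("shadow, not pedestrian", 0.70, "crosswalk at 5th & Main")
print("proceed:", proceed)
```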

Angelo: That's great. So there's a lot of thought around that kind of thing. A great example you used is self-driving cars. The same can be true elsewhere: I've seen a lot of times where we'll use artificial intelligence to help clinicians, not necessarily come to a decision, because that still is a complicated problem, but to assist them in very small parts of the decision-making process.

And there's this interesting phenomenon where a fatigue happens over time. When it's right 99% of the time, you end up trusting it so much that you just stop paying attention to it, and then the 1% of the time, you just take the answer. Cars can be like that too. I would imagine that when it's almost always right, we'll have some fatigue: we'll just always trust it, and we won't always be there to intervene. Is that part of what you're thinking when you say, you know, putting these circuit breakers in place? Is that part of it?

Eric: That is absolutely part of it. We are getting glimpses of this today. There's an often-quoted phrase that the future is already here, it's just not evenly distributed. And we read in the press, or at least the listeners to this show would read and hear in ordinary discourse, about how automated cars will require driver intervention. And they'll require it in different ways.

So some cars require a little shake on the steering wheel. Others have haptic sensors now; that's going to be the latest tech, where you can just touch the wheel, but you don't need to shake it. You know, that's, I think, suggestive of the future of how we're going to need to claim alertness. You know, Mercedes now has new tech that tracks your eyes.

So the passenger's video, which is now kind of a weird thing to say, but it's going to be a common phrase, the passenger's video will turn black if the car senses the driver being distracted, right? That's a future of sensors that interject themselves into our world, and we will have to display a level of attention. This is all something that we're going to experiment with as a society, whether or not this works. You know, another part of our future that didn't exist, I would say, what, five, six years ago, is electric cars now making a little bit of a noise. That's, again, communicating where the car is.

You know, the car is declaring that it is moving and moving in your direction. That's where we're going to be over the next decade. I'll tell you, additionally, around regulation: we can begin to audit these data models. If you separate the data from the data model, then the data model itself can be subject to an audit.

We're getting a lot of productivity increases from the deployment of these AIs. We can take a little bit off of that to ensure that they're doing what we said they would do. So we deploy a data model, we have the authors declare what it was intended to do, and then we can have people expert in the art audit those and say, well, is this algorithm doing what the author said it would do or not?

And that can either be in a transparent way, you know, there are many algorithms that are just open for examination, or we can have that in a zero-knowledge-proof way, such as the way FICO scores, credit scores, are determined. We don't really know what's in a credit score exactly, but we run enough models through it that we have a pretty good sense.

And we begin to have trust in the outcomes relative to the inputs. So those are two ways to do regulation. One is through circuit breakers. Another is audit. But you gotta start with separating the data from the data model. That's how society can begin to have a conversation.
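One way to picture that kind of audit is as black-box probing: the author declares a property the model should satisfy, and an auditor checks the declaration against many probe inputs without ever opening the model. The Python sketch below is illustrative, with a made-up scoring function; it is not how FICO actually works.

```python
import random

# Zero-knowledge-style audit sketch: we never open the model, we only probe it.
# The author *declares* a property ("all else equal, more income never lowers
# the score") and the auditor checks that declaration on many probe inputs.
def audit_monotone_in_income(score_model, trials: int = 1000) -> bool:
    rng = random.Random(0)
    for _ in range(trials):
        applicant = {"income": rng.uniform(20_000, 200_000),
                     "debt": rng.uniform(0, 50_000)}
        richer = dict(applicant, income=applicant["income"] * 1.1)
        if score_model(richer) < score_model(applicant):
            return False  # behavior contradicts the declared intent
    return True

# A stand-in model for demonstration; a real audit would call a sealed API.
def toy_score(applicant: dict) -> float:
    return 0.002 * applicant["income"] - 0.004 * applicant["debt"]

print("passes declared-intent audit:", audit_monotone_in_income(toy_score))
```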

Angelo: Yeah, I love that. I love that we're thinking about that, because it could be frightening to think, well, okay, let's just give the stock market over to the AI. That is not what anyone's talking about here. What we're talking about is using models in a measured way, and I like this idea of interjecting other sensory capability into the technology, bolting other technology onto this AI so that the AI is not alone in making these decisions.

There are other things that provide feedback. The other thing that you said that I really want to make sure we bring out here a little bit more is that you have these models, and it's okay to back off of accuracy a tiny bit to get a better-understood, provable model, you know, something that we can test.

And I think that's interesting because it reminds me of the Netflix prize. Remember the Netflix prize? They wanted a better algorithm for matching movies, and the winning one was an ensemble of 640 models. And then they said, okay, that's great, we'll give you the million-dollar check, but we're never going to implement this because it's too complicated.

It's impossible to implement. So instead we'll take one that's just a little bit less accurate, and it's extremely explainable, simple, and straightforward. I thought that was a really good point that you made, and I want to make sure we highlight it.
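The trade-off Angelo describes can be stated as a simple selection rule: prefer the explainable model whenever its accuracy is within some tolerance of the best candidate. A toy Python sketch follows; the accuracy numbers are invented for illustration, not the actual Netflix-prize figures.

```python
# Choosing the simpler model when its accuracy is "close enough": a sketch of
# the Netflix-prize lesson. Numbers below are made up for illustration.
candidates = [
    {"name": "640-model ensemble", "accuracy": 0.882, "explainable": False},
    {"name": "single interpretable model", "accuracy": 0.871, "explainable": True},
]

TOLERANCE = 0.02  # accuracy we are willing to give up for explainability

best = max(candidates, key=lambda m: m["accuracy"])
deployable = [m for m in candidates
              if m["explainable"] and best["accuracy"] - m["accuracy"] <= TOLERANCE]

# Prefer the explainable model whenever it is within tolerance of the best.
choice = deployable[0] if deployable else best
print("deploying:", choice["name"])
```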

Eric: Really, there is a bigger theme to what you just said. Angelo, you're really defining, I think, perhaps a larger epoch that your listeners may benefit in hearing about, which is that, as scale increases, complexity necessarily increases, right? That's just a truism. And that complexity increases to such a point that we can no longer begin to understand it.

You know, this represents itself in prosaic ways, such as encryption. It also represents itself in technologies such as smart contracts on the blockchain. It also expresses itself in quantum computing. Quantum computing is certainly a nascent field. You know, it's not widely deployed yet, and not really even useful yet in many ways, but it has reached a sufficient level of complexity such that we don't understand it, or we can't actually verify whether it's true, because it's so complex. You know, the quantum compilers have to depend on a new type of mathematics, mathematics based in categorical algebra, that category theory, type theory, just to even interpret the quantum computers.

That's a function of complexity. That's a different epoch than the one preceding it, which is one of logic. The logic that served as a framework for us to be where we are today, building computers and relational databases and all that, is shifting to one of compositionality in order to handle this increased complexity. That's the big, big change: compositionality.
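A minimal way to see compositionality in code is typed composition: pieces carry explicit interfaces, they compose only when the interfaces agree, and composition is associative, so the composite is itself a piece that can keep composing. The Python sketch below is a loose illustration of that categorical flavor, with invented names; it is not Conexus's actual technology.

```python
# Compositionality in miniature: mappings with explicit source/target "types"
# compose only when the interfaces line up, and composition is associative,
# which is the categorical property that lets systems grow without limit.
class Mapping:
    def __init__(self, source: str, target: str, fn):
        self.source, self.target, self.fn = source, target, fn

    def then(self, other: "Mapping") -> "Mapping":
        assert self.target == other.source, "interfaces must agree to compose"
        return Mapping(self.source, other.target,
                       lambda x: other.fn(self.fn(x)))

# Two small translations...
f = Mapping("CelsiusRecord", "KelvinRecord", lambda c: c + 273.15)
g = Mapping("KelvinRecord", "RankineRecord", lambda k: k * 9 / 5)

# ...compose into a bigger one, and ((f;g);h) equals (f;(g;h)) by construction.
print(f.then(g).fn(25.0))  # 536.67
```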

Angelo: I'm glad you mentioned quantum. It's an area that we've really been focusing on a lot lately. When you think about machine learning and where it's evolved, and how we came back from the winter, the AI winter, neural networks are making their resurgence. Some of the ideas we're bringing back, we're kind of reincarnating, because of the way that we can now compute things.

And so, what we're ultimately seeing here is that most of the innovations lately in traditional computing have been around ways to reduce the amount of time things take to train, or to create some kind of computational efficiency. But thinking about quantum: right now, at least, annealing is still in its infancy, but it's the most promising, immediate way to implement quantum.

And so we can try to use it to optimize some of these things. Do you see any interplay, I know this is really far away, but do you see any interplay between quantum and training artificial neural networks or AI? Or do you see it as creating a whole new kind of artificial intelligence and leaving the old behind?

Eric: So I am not of the religious view, and that's what I'll call it, that all AI is now going to be about neural networks, deep neural networks. I am of the view that the AI winter was certainly caused by a little bit of overstretching and over-promising, but it was also a result of the lack of scalability of the deterministic AI models of that era. You know, RDF and then OWL came about to try to alleviate some of the problems of integrating data in the early 2000s, after the scale of the public internet made this an even more widely known problem. But those have failed; that technology is fading. I am of the view that many, many solutions will benefit from the interplay of machine learning AI and non-machine-learning AI.

So the machine learning AI being, you know, including the subset of neural networks and deep learning. You'll have symbolic and probabilistic AIs working in harmony on different applications. That's what I see as the future. And in symbolic AI, the rule engines will find themselves expressed differently than the sort of knowledge-base commercial failures such as Watson. Those are different today, not just because of compute and the other bottlenecks that I know you've talked about in previous podcasts of yours.

It's really because of discoveries in math that are more foundational. Much of what we live with today is based on mathematics from the 19th century that helped fuel an industrial revolution. But calculus, and maybe even geometry and trigonometry, are fading. They're going to become a little bit more like Latin: interesting, but useful in fewer and fewer contexts.

And that's going to be replaced by statistics and probability, which is what we see, and the mathematics of category theory or categorical algebra.

Angelo: Yeah, that's a really good point. A friend of mine who does a lot of work in quantum right now has said, really, it's about math PhDs: it's about finding a way to represent this in a very complicated mathematical model. And he said that will one day be a thing of the past; that's not how we're going to build these kinds of systems.

So I definitely agree with you on that. I think we're both of the same mind there. So there's all this promise, then, in AI and neural networks. What do you think is in the way? What's keeping us from really realizing it?

Eric: You know, as we've talked about a little bit now, and as you've talked about in previous podcasts, these blockages emerge and fade. You know, the problems get solved, and then the blockages move to other areas. What I now see as the blockage, one that is not yet widely identified, though a few organizations are confronting it, is around bringing the data to bear in a way that's useful.

So everybody, as we've talked about, got the memo about collecting data, but there's some layer between data science and data engineering, where my firm, Conexus, works, that is becoming more appreciated as representing the bottleneck. We work with this firm Uber, everybody's aware of them. You know, they had grown up not with an ideal IT infrastructure, but just in the way of trying to be responsible for their business, in their case city by city. That then didn't allow for the AIs, the other sorts of learning algorithms, to be properly applied. They couldn't do a privacy lattice the way they wanted to: driver's licenses versus license plates, varying by jurisdiction.

They couldn't answer basic business intelligence questions: driver supply, rider demand. They had to do statistical comparisons between different cities, and that produces a certain amount of friction, but not in a good way. You know, we're talking about friction being maybe interesting for regulatory purposes, but in these areas of trying to answer business questions, you don't want friction in either accuracy or speed. They happened to have had, maybe it's an extreme, but 300,000 databases.

I know I'm not saying anything out of school by disclosing this, but 300,000 databases. So we, Conexus, worked with them to consolidate that into an analytical framework that allowed them to save, you know, some 10 million plus a year, and that reduced friction and increased the alacrity with which they were able to organize their business. So the answer to your question is that the big bottleneck in AI deployments is something that I began to see during my time working in the Obama White House, which is that these massive data infrastructures still represent the 90s, or maybe even the 80s. They don't represent the 2020s or 2030s, where the data scientists need to bring the data to bear at the rate of their intuition.

Angelo: Okay, great. Thank you for that. I'm going to brag on you just a little bit here. I have a long relationship with Carnegie Mellon. In fact, in my original grad school work, Carnegie beat me to publication, so I had to change my, uh, thing. So I have a little history there, and definitely a lot of respect for that school.

And that's where you got your PhD, and you served as an Assistant Dean there and an assistant professor in computer science. And one of the things that I really like is that your area of research was not just machine learning, but also computational linguistics. Which, you know, for those listeners wondering: what's computational linguistics?

It's not necessarily NLP; that's in the family. It's a big family of computing technologies that represents things like NLP, text to speech, and anything that kind of fits in there. But graph theory as well, which lately has shown a lot of promise, because it's taking over some of the places that neural networks used to go into, especially fraud detection and things like that.

You know, so the point of bragging on you, I guess, was that I wanted to talk a little bit about your upcoming book, The Coming Composability: The Roadmap for Using Technology to Solve Society's Biggest Problems. Now, that is always something that I've felt really strongly about.

And I hope anyone who works in computer science feels that computer science can help solve, well, it's not the thing that solves every problem, but it can certainly help solve some of our biggest problems. So, anyway, I want to talk a little bit about your book. Can you tell me a little bit about it?

Eric: Yeah. I mean, I can certainly talk about the motivation. You know, coming out of government, people were asking me to write about my experience, and then it's taken me a few years to formulate my thoughts enough that it would be worth reading, you know, two years from now, three years from now, which is pretty hard in a fast-moving area such as the one we're talking about. The greater theme, one that seems to represent a future I think will be relevant 10 or even 20 years from now, is composability. This is the notion that we can expand infinitely in any direction. It's a framework that I think describes our digital world better than any other that I've come up with. To describe compositionality relative to modularity: people often think about Lego blocks, for example.

But you can think of something better than Lego blocks: think of a train. A train can interchange boxcars, the way a container ship can interchange Conex boxes. But a train can only get so long. It goes in two dimensions, maybe three, but we'll say two dimensions.

And it can only go, you know, a couple miles long. But the train system is something of a different order. A train system at any point can be infinitely expressed, and it can infinitely expand. That's the nature of compositionality relative to modularity. We have expressions of this compositionality, as we talked about earlier, in smart contracts. In quantum computing. Maybe even in Minecraft. You know, Minecraft infinitely expands; you can even have Minecraft inside of Minecraft, right? That's an infinitely expanding world. There are many, many, many emerging expressions of compositionality in the world.

What I work to do in the book, The Coming Compositionality, is describe this phenomenon and how people can participate in these systems' interaction and in their creation, as we more and more see these compositional systems emerge. Those could then become an infinite supply of data towers of Babel, but the math we talked about earlier, category theory, categorical algebra, that's the math of compositionality.

That's the math that helps analyze compositionality. The application of that math allows for a proof that there is integrity to these models. That's why it's required in quantum computing. That's why it defines smart contracts. And this will be applied more and more, from commercial aerospace, where this is actually the technical solution to disasters like the 737 MAX, to other areas that we might find, also life-and-death contexts. So it's the coming compositionality that defines this new world for the next 10 or 20 years.

Angelo: Okay, so that's interesting. I know it's probably not exactly what you're talking about, but we've been talking for a long time, I think, about creating models that are reusable within other models, for example, in the ML and AI space. Thinking of, you know, if you create a network that's been trained on something and the data is very sensitive; especially in healthcare, the data is extremely sensitive.

So we can't just make this data available for anyone to create new models. But maybe what we can do is use the trained model in other ways. And we've been talking about that in terms of composition. I don't know if that's exactly what you're talking about here, but it feels like a piece of it. Is that right? Does it kind of fit the model?

Eric: Angelo, I think you actually have it exactly right. I think the future will be owned by those who can recognize these configurations of different building blocks and then redeploy them in different contexts. You know, you and I can talk about these sophisticated models inside machine learning and maybe non-machine-learning AIs, but we can think more prosaically about how that shows up right now in some direct-to-consumer goods. We have many consumer goods, really an explosion of consumer goods, that are available to us now that weren't available 10 years ago because of this redeployment of building blocks. You know, I can start a company that is just based on a particular subject matter where I feel I can provide value, but I can outsource my marketing.

I can outsource my analytics. I can certainly outsource my fulfillment, my manufacturing. You know, Shopify handles the commerce. I don't have to do much at all except focus on my particular part of that e-commerce stack. That is an everyday example of how the redeployment of these blocks will represent the future.

Angelo: And that's so critically important. I think, for anyone listening: yes, okay, there's a lot of information proliferated on the internet, and everyone knows a lot about us; they're re-collecting the same information. But the reality is there's really key, critical information that only exists in certain contexts, right?

The people who are training car algorithms have certain pieces of information. It's not about privacy; it's about getting to the right information to build these models. The same is true in healthcare. There are very few organizations, but they have hundreds of millions of patients' worth of information, and if we could enable this kind of compositional approach, then we could really advance everyone.

And that's the hard part of this: they're sitting on this data, but they don't know how to use it. If we could, instead of boiling the ocean, look at a very small slice of it and say, well, if you could provide this solution, then we can build on it in many different ways. I think that is so tremendously important, not only to the evolution of the country, but of the world.

Eric: There's a lot to say there. You know, my firm, Conexus, works on these issues in healthcare, and the scale still boggles my mind. It gives me goosebumps even to talk about it, when I say that one of our clients handles 1.4 trillion transactions a year, right? You can't go at those sorts of problems with a traditional proof of concept.

Oh, hey, we can handle one or two records, or even one or two thousand, or a couple of hundred thousand; it just doesn't work. When you start scaling to the trillions, a phase change happens where you need to be thinking about the problem differently. There's just no amount of compute you can throw at that.

It just becomes infeasible, and that's where we solve those problems uniquely, with our software expressions based on the discoveries in math. So the math of categorical algebra is going to be what powers these revelations in healthcare. You can't double-check all of the possible ways in which 1.4 trillion transactions could go wrong.

So you risk leaking data and violating HIPAA regulations, or getting a patient's data wrong and then prescribing the wrong drug. Something we find ourselves involved in is even around drug research, where we want to make sure that every population gets represented in the side effects of a particular drug.

And that comes down to having these 10-year longitudinal studies maintain guaranteed data integrity during those transformations. There's really no other way to do that in a way that you can prove for accuracy. That's where healthcare is going.

Angelo: Yeah, and I love this because we're bookending healthcare a little bit here, I think. This season, actually, we're going to talk about using technology to simulate compounds and situations so that we could help develop drugs. But then, once you do that, you still need the information in retrospect, where we have to analyze what actually happened in real life.

And that's what we're talking about here. That's what you're mentioning: taking all the things that actually happened to humans and then seeing if we can feed that back into the system to produce, could be, better drugs, but certainly better outcomes, or maybe changes in care, addressing gaps in care.

There are so many things that we don't normally think about. It's not just taking a pill; it's also access to the pills, and other things. We have to solve so many problems in this. And I think that's what we're talking about: both sides of this. We also need to use it to speed up research, you know, after all this happens.

Eric: We work with several organizations, and I can tell you one story that still surprises me. We worked with a big hospital group in New York, where we found they represented their definition of diabetes differently throughout the organization. And that sounds weird, because you think, well, can't you just look up diabetes in Webster's or the Oxford English Dictionary? Diabetes is what it is.

But you'll find that, either temporally or by context, the needs for the definition of diabetes change. In one area, the organization may represent diabetes as yes/no. But others may say, well, diabetes: how am I treating it? Another one may be diabetes: how long ago? Another one might have some well-meaning clinicians who will put that whole thing in a cell.

Well, Eric had diabetes, and then we treated it this way, and then he didn't. You know, how do you bring that data together within one hospital system so that you maintain the richness that you spent money collecting, the fidelity of that data, while not having it be a big ETL burden on your data engineers? You know, that's an issue.

And then we work with their research group, where you don't want to have the data misinterpreted; that can have some really serious consequences. You know, H2O cannot afford to get misinterpreted as O2H; bad things can happen in chemistry. So that sort of data-integrity guarantee is critical in many areas, but in healthcare it's acute.

Angelo: Yeah, yeah, that's a really good point. I have spent a lot of my career both on the standards side, helping to develop these things, and on the implementer side, trying to implement them. And we have a saying, and that is: if you've seen one representation of data, you've seen one representation of data. There is no single diabetes or pneumonia; even within a standard, there are myriad ways to represent them. And so they do kind of grow locally, and you have to find a way to normalize, because that's normally the problem we have with gaining insights. You have to create one method to kind of normalize all the data to something, so that we can then run analytics on it. And that is a tremendous challenge.

Eric: I'll tell you where that's going, Angelo: rule engines. You've talked in other podcast episodes about rule engines. The rule engines now are powerful enough, and, unfortunately for some, this is symbolic AI, you know, it's not all ML. Symbolic AI rule engines are now enabling a mathematical relationship, a formal relationship, between those representations. That's the short answer, the shorthand for this audience. That's how you would represent those differing definitions of diabetes.
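To give a feel for what such a rule engine might look like, here is a toy Python sketch: each rule formally relates one local representation of diabetes to a shared canonical form. The field names are invented for illustration; they are not drawn from any real hospital schema or from Conexus's products.

```python
# A toy rule engine: each rule relates one local representation of "diabetes"
# to a shared canonical form {"has_diabetes": bool, "treatment": str}.
def rule_yes_no(record: dict):
    if "diabetes" in record:                    # e.g., {"diabetes": "yes"}
        return {"has_diabetes": record["diabetes"] == "yes", "treatment": ""}
    return None

def rule_treatment(record: dict):
    if "diabetes_rx" in record:                 # e.g., {"diabetes_rx": "metformin"}
        return {"has_diabetes": True, "treatment": record["diabetes_rx"]}
    return None

RULES = [rule_yes_no, rule_treatment]

def normalize(record: dict) -> dict:
    # Apply the first rule that covers the record; fail loudly if none does,
    # rather than silently guessing, which is the data-integrity point above.
    for rule in RULES:
        result = rule(record)
        if result is not None:
            return result
    raise ValueError(f"no rule covers record: {record}")

print(normalize({"diabetes": "yes"}))
print(normalize({"diabetes_rx": "metformin"}))
```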

Angelo: And what you're describing there is what we call expert systems and we will talk a lot about that on the show. I mean, that's part of a lot of the things that are dear to my heart. How do we marry artificial intelligence with expert systems? So I'm glad we got there. Eric, I want to thank you so much for joining us today.

Thank you for your time. Really enjoyed the conversation. Looking forward to following you and your company and seeing where you guys go.

Eric: Thanks, Angelo, this has been fun.

Angelo: And thank you all for joining us. I'm your host, Angelo Kastroulis, and this has been Counting Sand. Before you go, please take a moment to subscribe; we appreciate your listening to the podcast. Feel free to rate us so that others can find us, and follow us on Twitter @AngeloKastr, or you can follow my company @BallistaGroup. Also feel free to reach out on LinkedIn, AngeloK1.

And you can follow Eric on Twitter @EAD, or you can find him on LinkedIn as Daimler. Also, take a look at the show notes for other ways to connect with Eric and his company.