Podcast Episode 13

What Fascinates Tempest Van Schaik, PhD?

Pondering racial bias in machine learning, the fallibility of humans as test subjects, widespread soil-sample technology for farmers and strategies for tapping into the wealth of tech talent in Africa.


AI fairness in practice. Wearables to fight depression. Innovations at Microsoft. What an episode we have for you today! Tempest Van Schaik is a renowned biomedical engineer who’s passionate about improving our lives using sensors, data and AI. But that’s really just a jumping off point for the projects she’s spearheaded over the last two decades. She joined me virtually from her home in Washington D.C.

Transcript

Bobby: Welcome to Loka’s podcast, What Fascinates You?—conversations with entrepreneurs, engineers, and visionaries who are driven to bring innovations to life. I'm your host, Bobby Mukherjee, and with me today is Tempest van Schaik, PhD, from her home in Washington DC. Tempest is a senior machine learning engineer at Microsoft and a leader in healthtech innovation.

Some of the ways she's driven impact are through projects like SoilCards, Cognition Kit, and Project Fizzyo. She's also graced the TEDx and South by Southwest stages, is a member of CSE’s Responsible AI Board, and is an Ambassador for Diversity and Inclusion. I'm very excited to talk with Tempest, so let's jump right in.

Welcome to the show. It's a pleasure to have you here. Go ahead, please introduce yourself to our audience.

Tempest: Thanks very much. So my name is Tempest van Schaik, and I am currently a senior machine learning engineer at Microsoft, in a team called Commercial Software Engineering. My background is in biomedical engineering, my PhD is in biomedical engineering, so I mainly do machine learning for healthcare.

Bobby: Terrific. Yeah. Something that's near and dear to my heart as well. So in preparing for this podcast, I was looking at your LinkedIn and I noticed that you had a picture with Her Royal Highness, Princess Anne, and there aren't speech bubbles, so we have no idea what the conversation was, but I'm assuming surely you were asking her if she was gonna have a bigger role in the Netflix show The Crown.

Tempest: Yeah, absolutely. That was top of mind. It was the Women in Science and Engineering Awards, where I was nominated for a project called SoilCards. So she came around and spoke to all of the winners, and she was asking me about soil chemistry, and we had a nice discussion about agriculture, which was pretty unexpected but lovely.

Bobby: Wow, that sounds super interesting, because these are people you don't really get to know, and when they ask really thoughtful, profound questions, it can be fun in a different kind of way. So it sounds like a great evening. And I think congratulations are in order: I believe you gave a conference keynote on responsible AI in health, principles in practice. We'd love to learn a bit more about that. Why was this important to AI and health? What was the main takeaway from the session?

Tempest: So I've been working in healthcare for a long time and recently started working in responsible AI and how that relates to healthcare. So we recently published a paper discussing a model that was used to predict survival in the ICU.

This model was not just an academic model, not just something theoretical, but a real-world model that is actually used to benchmark ICUs. So we benchmark how the ICU really performed versus how we predicted it to perform in terms of patient survival. And it's really important when a model like that goes into production to think about responsible AI in terms of monitoring data drift, having full observability of how the model is doing in the wild or potentially years into the future, and also monitoring the fairness of the predictions of the model. So how does this model behave for different patient groups, patients of different races and genders and different diagnoses? And rather than assessing fairness once off during model development, we actually productionized fairness monitoring so that it becomes routine. So when the data scientist goes to check the accuracy of the model, they see the fairness metrics right there up front, with the same kind of visual importance as the traditional metrics.

And so we look at not only the overall accuracy of the model, but how accurate is the model for each patient group? What types of errors is it making for different patient groups? And we also showed how you can actually use these fairness metrics to compare models in new ways. Not just saying this model's more accurate, but specifically, how has it improved in terms of how it treats different patients fairly? So that was a piece of work that I had recently been speaking about.
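
To make the idea concrete, here is a minimal sketch of reporting per-group fairness metrics alongside the usual aggregate ones. It uses the open-source Fairlearn library as one possible tool and entirely synthetic data; it illustrates the pattern rather than the actual ICU benchmarking pipeline described above.

```python
# Minimal sketch: per-group fairness metrics reported next to the usual
# aggregate metrics. Fairlearn is one open-source option; the labels,
# predictions and patient groups below are synthetic and purely illustrative.
import numpy as np
from sklearn.metrics import accuracy_score, recall_score
from fairlearn.metrics import MetricFrame

rng = np.random.default_rng(0)
n = 1_000
y_true = rng.integers(0, 2, n)                             # e.g. a survival label
y_pred = rng.integers(0, 2, n)                             # model predictions
group = rng.choice(["group_a", "group_b", "group_c"], n)   # sensitive feature kept for auditing

mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "recall": recall_score},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=group,
)

print(mf.overall)       # the traditional, aggregate view
print(mf.by_group)      # the same metrics broken down per patient group
print(mf.difference())  # largest gap between groups; handy for comparing models
```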

Bobby: What would you say, were there things you discovered that surprised you after the fact, after you went through and did this work?

Tempest: The main learning I had during this project was not to shy away from topics like race and gender in healthcare. Because yes, we know that race and gender can bring bias into healthcare, but they can also be really predictive, right?

Like we've seen with Covid: it affects different races and genders differently. So it's a difficult one in healthcare, where you have these sensitive features, but they are predictive. Some approaches have been to just leave these features out of the model.

Like, we don't train the model on race and sex, we just leave it out, so it's colorblind. But this approach doesn't actually work. We've seen models that excluded race specifically but were still very racially biased, because they used features like cost of healthcare, and in the US, cost of healthcare bakes in a lot of other biases, like racial bias.

So if you're not careful, the model will be very racially biased even though you've excluded race, and if you exclude race, you have no way of tracking it. Whereas if you actually include race, you know the race of the different groups and you can audit your model afterwards.

So what I learned from this project is actually to capture that data, to have it, and then to use it for auditing afterwards, instead of shying away from it completely. That was a big learning for me, and a surprising one.

Bobby: Fascinating. But it does make sense. If you're at least capturing those features, you can then do ethically responsible things with them. If you're not capturing them, you don't even have that option, I would suppose. Then you're flying blind.

Tempest: Yeah. So I would recommend keeping those metrics front and center so you can audit them afterwards and see how the model is performing for these different groups.

Bobby: So I know you typically keep a pretty full plate, and you have a number of projects that you're quite passionate about, and I thought it might be fun for you to give a little introduction to some of these: why you're excited about them and what you would want people to know about them. There are a couple of these that we've learned about that you're involved with, so I'll go through them and you can give us a bit of a tutorial on each.

The first one I wanted to talk about and understand was SoilCards. Tell us more about that.

Tempest: So SoilCards is something I was working on when I was at a startup, and it started as a medical device, because that's my background and I've always had that lens for medical devices. I wanted to build a kind of rapid, low-cost test, and that was many years before Covid. I think now everyone fully appreciates why rapid, low-cost tests are very important, but a couple of years ago that's simply what I was interested in. However, I found that working in a startup, it's very difficult to get approval for and develop a medical device. It costs a lot of money to do trials, we didn't have a lab, and I couldn't handle human blood in a startup office, so that was quite challenging. And then I pivoted to agriculture instead, which was a huge pivot after spending my whole life in healthcare.

I realized that rapid, low-cost tests and diagnostics are super valuable in agriculture too, for many of the same reasons, and I started developing what ultimately became a rapid, low-cost soil test. So whereas I would've been measuring biomarkers in blood, I was now measuring chemical markers of soil health in a soil sample, partly because it's really easy to get some soil from a potted plant and test it.

So that's how it came about, and many of the same reasons that I was interested in medical diagnostics made this a good diagnostic for agricultural health. For example, people in rural settings find it difficult to get to a clinic, so point-of-care diagnostics are really valuable for them.

Likewise, farmers in the middle of a field in a rural area find it difficult to get to the national laboratory for soil testing, so for the very same reasons a rapid soil test is useful. And doing some research, I found that there are more than a billion farmers living below the poverty line, and it's really difficult to get a good yield from your crops. You want to fertilize, but fertilizer is expensive, and you need to know exactly what nutrients you need so you buy the right fertilizer and use it judiciously. This is also important for environmental reasons, so people aren't using excess fertilizer that runs off into the environment. If you can measure your soil chemistry precisely, then you know exactly which fertilizer to use and how much to use.

So that's SoilCards: these rapid tests. I used a technology called paper microfluidics to make the devices, which are small, about credit card-sized, and rapidly give you a result, which at the time was quite groundbreaking. Now, as I said, with Covid I think everyone appreciates these sorts of devices and why they need to be accurate and fast.

Bobby: That's hugely impactful. Was the offering aimed at farmers that were below the poverty line?

Tempest: Yes. We estimated that we could manufacture them at scale for a couple of cents each, because they're made of paper. So yes, the users would be these farmers, who could have a pack of them in the field and just use them once off.

Bobby: You use it once off, it gives you a result, and that result tells the farmer what type of nutrients and fertilizer they should use for their specific situation and crop.

Tempest: Exactly. And what's unique about it is that traditional chemistry tests have been around for decades, and you can get a kit on Amazon for $11, but they're very fussy to work with. You need lots of little containers and little spoons, you need to measure water, and they're really fiddly. Then at the end you get a gradient color change, and you look at it and you're like, is this purple or pink? I'm not sure. You measure it against a color chart, and it's a bit vague. So no one uses them. A big part of the project was the usability: redesigning these tests so that they're super easy to read, taking inspiration from a pregnancy test, which is one stripe, two stripes.

These actually had four stripes, to say which nutrients and how much. The difficult part was developing this much better, much easier-to-understand, more usable device.

Bobby: Yeah. As a software guy, this is really fascinating to me, because the first thing that comes to mind is the timeline: from the moment you and your team came up with the idea, let's draw inspiration from pregnancy tests, let's have, say, four stripes and this is what the stripes will mean, to the moment the first beta got into the hands of a farmer or someone to try it. I don't even have any idea: how long would that take to develop and get out there?

Tempest: It took a couple of months, actually. I strung together some innovation grants and then used them to go back to my old PhD lab to do some lab work. But this was also new for me; I didn't know anything about soil chemistry or paper microfluidics, so I had to figure it out. So I guess it took a couple of months to a year to actually get it right in the lab, sort of part-time while doing my normal full-time job, and then the software development after that, I would say.

Bobby: So before you started this project, you didn't have experience with paper microfluidics?

Tempest: That's right. I had seen paper microfluidics in the lab, but I'd not done it myself. So I thought, oh, that looks like cool technology, and then, years later, thought, oh, that technology would be very useful right now. And so I had to figure it out.

Bobby: That's amazing. I feel like I should do an episode just about paper microfluidics, if nothing else to learn to say it properly. But again, I just have so much respect for the hard sciences and what happens inside labs, because I feel like we're quite spoiled in software sometimes. Things can move so quickly: we can get together in a room, whiteboard a concept and an idea, and have a beta very quickly. Whereas with the hard sciences, you have to be a bit more patient. In this case, it sounds like you managed to get from a standing start to something out in the field in a fairly aggressive period of time. I think that's hopeful for a lot of the more difficult problems out there in the world that need innovation.

Tempest: Yeah. Actually, I think investing in the hardware, in the sensors, makes the software engineering and the data science easier. Traditional gradient color changes are really hard to pick up with cameras: whether you're in daylight or sunlight or shade makes the color look different, and then you have to build a box around the device so you can take a reliable picture, which is really difficult in different environments. But if you have discrete stripes, it's really trivial to capture that kind of information.

Doing that also makes the computer vision, and later the data processing, much easier. So I'm a big fan of investing in getting the data capture and the sensors right to make the data science easier down the line.
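
As a rough illustration of why discrete stripes simplify the computer vision, here is a small NumPy sketch that counts dark bands along a strip using a simple intensity profile. The threshold, image layout and synthetic strip are hypothetical; a real SoilCards reader would be more involved.

```python
# Sketch: counting discrete stripes from a cropped, grayscale image of the
# test window (a 2-D NumPy array). The threshold and the synthetic strip are
# hypothetical; the point is that bands are far easier to read than a gradient.
import numpy as np

def count_stripes(strip_img: np.ndarray, dark_threshold: float = 0.5) -> int:
    """Count dark bands along the strip's long axis."""
    profile = strip_img.mean(axis=0)                      # 1-D intensity profile
    profile = (profile - profile.min()) / (np.ptp(profile) + 1e-9)
    is_dark = np.concatenate(([False], profile < dark_threshold, [False]))
    rising_edges = np.diff(is_dark.astype(int)) == 1      # start of each band
    return int(rising_edges.sum())

# Toy example: a light strip with four dark bands.
strip = np.ones((20, 200))
for start in (20, 70, 120, 170):
    strip[:, start:start + 15] = 0.1
print(count_stripes(strip))  # -> 4
```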

Bobby: The old adage in computer science of garbage in, garbage out continues to be true many years after it was first coined.

So let's move on to the next project. Tell us a bit more about Cognition Kit.

Tempest: Cognition Kit is a suite of tests that measure cognition and mood using everyday wearable devices. Traditionally, to test your cognitive function, you would go to a lab and do it in a very controlled environment, maybe with a pencil-and-paper test.

And the kinds of people who would want to test their cognition would be anyone with dementia, Alzheimer's, ADHD, bipolar disorder, or depression. Depression affects not only your mood but your cognitive function too. So what we were developing were wearable tests that assess cognition in everyday life.

We wanted to get a fuller picture of cognition in everyday life, between those lab visits. So my job was translating these really old-school cognitive tests onto new devices like the Apple Watch, thinking about how they could fit into everyday life, and also creating some mood tests.

In the case of depression, they wanted to test mood and cognition every single day. When I was involved, it was the early stages of developing those tests, and they have gone on to be used in a couple of clinical trials.

Bobby: Now, was it limited to the Apple Watch or were you trying a bunch of different wearable type devices?

Tempest: We tried a couple of different ones. We tried the Microsoft Band, we tried the Apple Watch. I think it was just those two in the end.

Bobby: What biosignals were you looking for? Is it heart rate or something else?

Tempest: You can obviously get heart rate, but what we were measuring were the results from these custom tests. So tests that measure, for example, your short-term memory: you get a notification that says it's time to test your memory, and then almost like a puzzle or a game appears, and it's done fairly quickly, in under a minute. And that's captured however many times a day. The mood is also measured through questionnaires and things on your wearable.

So I think we were capturing some biometrics, but it was mainly the results of these cognitive tests.

Bobby: Gotcha. So one of our ML leads recently worked with NASA on wearables for astronauts, and I was just curious, from your experience with Cognition Kit, what you felt was the trickier thing to solve with wearables in general: the hardware side, the software side, or something in between?

Tempest: Yeah. So I've done a couple of wearables projects, not really intentionally; wearable projects just seem to find me. But I'd say it's probably the human factors that are the hardest. We did a project with children wearing Fitbits, and kids have really small wrists, and the Fitbit just doesn't fit snugly around a five-year-old's wrist.

So the signal you get from it is just a bit flaky, and that's a human factor you have to accept. Also, in various trials I've seen, including the one with the kids and the cognitive function one, people forget to charge their wearables because they're human beings. The data just drops off sometimes.

There are also sometimes miscommunications when you're running a trial, where you say, please wear the device every day for 12 hours a day, and someone will carry it in their handbag thinking that counts as wearing the device. And you'll get a really weird signal and spend hours trying to figure out what it means.

It's because someone had misunderstood the definition of wearing a wearable. So I think even great hardware that works really well in the lab can struggle in everyday life, just because of humans. And this makes the data processing really challenging, as I'm sure you know, because either you've got raw sensors, which are incredibly noisy and have this drop-off, or you have consumer or commercial wearables.

But then as a data scientist, you're at the mercy of the device API: you get data whenever it wants to give you data, it's semi-processed, and you have no control over that. The human factors lead to a lot of software and data science challenges, even if the sensors are great.
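
Here is a small pandas sketch of the kind of bookkeeping this implies: making non-wear gaps in a wearable time series explicit rather than silently averaging over them. The sampling rate, gap definition and data are all hypothetical.

```python
# Sketch: making non-wear gaps in wearable data explicit with pandas.
# The sampling rate, gap rule, and heart-rate values are hypothetical.
import numpy as np
import pandas as pd

# Simulate a 12-hour day of minute-level heart-rate samples...
idx = pd.date_range("2021-06-01 08:00", periods=12 * 60, freq="min")
hr = pd.Series(70 + 5 * np.random.randn(len(idx)), index=idx, name="heart_rate")

# ...with a three-hour hole, e.g. the device sat uncharged or in a handbag.
hr.loc["2021-06-01 13:00":"2021-06-01 16:00"] = np.nan

# Resample to 5-minute bins and flag bins that contain no usable data,
# so downstream analysis sees the gap instead of quietly ignoring it.
binned = hr.resample("5min").mean()
flagged = binned.to_frame().assign(non_wear=binned.isna())

print(f"{flagged['non_wear'].mean():.0%} of 5-minute bins have no data")
```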

Bobby: Yeah, which is just so common, human nature being what it is.

So were there any tips, tricks, hacks, best practices that you came across that helped with the human compliance side of the equation?

Tempest: I think if you're running a trial with lots of people wearing wearables all the time, it's really important to have good communication with everyone in the trial, like a good hotline and regular check-ins to see, how's it going?

Or you know what, if you're not charging it, why are you not charging it? How can we help you? And just understanding why people are not wearing it. And you have to just accept at some point that kids take their wearables off and go run around the garden or go for a swim. There's nothing you can do about that.

And it can be frustrating when it comes time to write up the research, because you have these gaps that you have to explain. But that's just the nature of human-subject research, so you have to accept that the dataset is gonna be full of holes at the end and not be too upset about it.

Bobby: Yeah, patience helps. Okay. The next project was Fizzyo.

Tempest: So this project started with Microsoft Research where they built these special sensors for kids with cystic fibrosis. Cystic fibrosis is a disease that affects the lungs and there's a buildup of mucus, and so children need to do these physiotherapy exercises to expel the mucus.

That's the main treatment: physiotherapy. The physiotherapy is a huge burden for these children, and some of them have said that they dislike the physiotherapy more than they dislike the disease, because they have to do this coughing, huffing, and expelling of mucus many times a day.

So the idea from Microsoft Research was to make the physiotherapy a little more bearable and maybe even make it fun. When they're doing the physiotherapy, the kids have a device that they have to blow into, so Microsoft Research put special pressure sensors into the device. That way, when the kids are doing their physiotherapy and blowing, the device can actually control a computer game, and the kids can jump from platform to platform in all these different fun games that were developed.

What I was doing was analyzing the data coming from the trial. We were running a trial with UCL and Great Ormond Street Hospital with, I think, about 130 to 140 kids, and that is still ongoing. So that was really interesting, seeing how they do their physiotherapy in the wild.

Again, this is a situation where normally, to check how you're doing your physiotherapy, you have to go to the hospital and the physiotherapist watches you. But once you go home, there's no visibility into how you're doing it. So as far as we know, this was the first time that physiotherapists could see what kids are actually doing at home.

Are they doing the physiotherapy we think they're doing, and what is the right way to do physiotherapy? What has the best effect? So we're seeing whether the games are helping with adherence, but also seeing, for the first time, this real-world data about physiotherapy.

Bobby: Yeah, that's fascinating. Having worked with physical therapists in the past, I'm pretty sure they'd wanna have one of those for me.

Between sessions, for exactly the reasons you said: it all sounds great in the physical therapist's office, and the moment you leave, they have no idea what's really going on. I'm seeing that, by the way, as a pattern across a lot of different healthcare-oriented things. The provider sees a patient when they're in front of them and can get labs right away, so there's a certain level of care they can provide, but what they worry about is what happens when the patient leaves the point of care and goes home, because things can trend negative and they'd like to know sooner rather than later. And patients can't necessarily keep on top of their biomarkers as efficiently. I'm getting the sense that in the next five to 10 years, you're gonna see more and more of this kind of thing.

Tempest: Yeah. And one of the surprising things was we wanted to know, what makes a good physiotherapy session?

Okay, so they have to exhale for this long at this kind of frequency, and they need to do it in sets of 10, pause, sets of 10, pause. So we wanted to see, okay, let's try to stratify the patients by how well they adhere to the physiotherapy: how well do they do their sets?

And we found that almost no one was doing sets. People are prescribing, go and do your sets, and in reality it's not really being done. This is from the preliminary data, but that's a major learning: you're prescribing something, and in reality it's not happening at all. That was a really surprising finding.
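
To show roughly what that adherence analysis involves, here is a sketch that segments exhalations from a breath-pressure trace and groups them into sets separated by pauses. The sampling rate, pressure threshold and pause length are hypothetical choices, not the actual Fizzyo analysis.

```python
# Rough sketch: segmenting exhalations from a breath-pressure trace and
# grouping them into sets separated by pauses. Sampling rate, threshold and
# pause length are hypothetical, not the actual Fizzyo analysis.
import numpy as np

def find_exhalations(pressure: np.ndarray, threshold: float = 5.0):
    """Return (start, end) sample indices of breaths above the pressure threshold."""
    above = np.concatenate(([False], pressure > threshold, [False]))
    edges = np.diff(above.astype(int))
    starts = np.where(edges == 1)[0]    # first sample of each breath
    ends = np.where(edges == -1)[0]     # one past the last sample of each breath
    return list(zip(starts, ends))

def group_into_sets(breaths, fs: float, max_pause_s: float = 10.0):
    """Split consecutive breaths into sets wherever the pause exceeds max_pause_s."""
    sets, current = [], []
    for start, end in breaths:
        if current and (start - current[-1][1]) / fs > max_pause_s:
            sets.append(current)
            current = []
        current.append((start, end))
    if current:
        sets.append(current)
    return sets

# Toy trace: ten one-second blows, three seconds apart, sampled at 50 Hz.
fs = 50.0
t = np.arange(0, 60, 1 / fs)
pressure = np.zeros_like(t)
for k in range(10):
    pressure[(t >= 3 * k) & (t < 3 * k + 1)] = 12.0

breaths = find_exhalations(pressure)
sets = group_into_sets(breaths, fs)
print(len(breaths), "breaths in", len(sets), "set(s)")   # -> 10 breaths in 1 set(s)
```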

So it'll be really interesting to see, when the full analysis is done, what they find there.

Bobby: Yeah, it'll be interesting to see how that plays out. So I'm just curious, with your day job and all the things that you have to deal with, how do you continue to learn and stay on top of things within your role and the industry at large?

Tempest: I'm quite lucky in that my job forces me to stay on top of things, because I work with lots of different customers on different projects and they're using different tech stacks. I constantly have to stay up to date; we're constantly learning new languages, new frameworks, new Python libraries.

Some customers like to use TensorFlow, some use other frameworks, so we have to be quite flexible. What I enjoy about our team is that we always check what the new and best way to do something is before making a decision about which technology to use. We always go and do a review of what's out there, the pros and cons, and only then say: therefore, we're doing this one.

That's very different from academia, where you default to, how have we always done this? Let's continue to do it that way. So I really enjoy that practice, and my job forces me to stay on top of these things. However, it is very challenging; data science and machine learning are very fast-moving.

It's quite overwhelming how many new technologies are coming out: new big-data tools, papers, and algorithms. It's incredibly hard to stay on top of all of that. But one thing I've learned is to accept that I can't be an expert in everything and to make peace with that. I cannot keep up to date with everything that comes out, and that's okay.

Bobby: I would definitely agree. I'm guessing that you have a great set of teammates that can give you scale as well.

Tempest: Yes, exactly. We can't all be experts on everything, but we can rely on each other, because people have different interests. Working in a big team, that is quite reassuring.

Bobby: So you're known to promote positive change as a leader in the industry. I'm curious, what advice might you have for companies and entrepreneurs who are expanding or building the foundation of their business, so that diversity and inclusion are baked into what they're doing?

Tempest: I guess first there's attracting diverse talent, and then there's retaining diverse talent. When it comes to attracting diverse talent, I would say definitely expand your search to look in new places instead of always hiring from the same grad programs.

Really review your hiring process too. Do you have diverse interviewers? I've been interviewed by very homogeneous, all-white or all-male panels before, and it made me feel like the odd one out in the room and made me wonder, are there any women in this company? One of the things I enjoyed about my current team was that there were diverse people in front of me.

That was a cue to me that the team was more diverse. Subtle things like that send a really strong signal to the people being interviewed. I also think that software engineering and data science interviews need to be revamped a bit. The fact that you need to take time off to study your first-year textbook and refresh on obscure first-year algorithms that you don't use in your current job, I find really frustrating.

It takes a lot of time, and it goes to show that you're not hiring this person for what they can do in the actual job if they have to go back to some old textbook. So I think it would be more inclusive to have more project-based interviews: tell us about a project that you worked on.

Step us through your thought process and challenges. I think that's a much more inclusive way to interview people. And then in terms of retaining diverse talent: once you've got these diverse people, creating an inclusive environment doesn't seem to come naturally or organically, as we've seen.

If you just let companies carry on the way they have been, it just doesn't work. You need to actively invest in diversity and inclusion and be able to point to initiatives that proactively do something. It could be programs that come from the company at large, or it could come from individuals, but you need to work on it actively rather than passively assuming the company will be inclusive.

And then I guess for entrepreneurs, building a diverse team from the start would be a good thing to do. As you're growing, build that diverse team from the very first few members you have, and then think about how you're gonna actively work on D&I rather than just crossing your fingers that it works out.

Bobby: Yeah, that definitely resonates with me. I think if you're serious about this as a startup entrepreneur or founder, you really have to do it on day one. It's not something you can kick down the road and worry about in a couple of years; it's too late at that point.

Tempest: Exactly.

Bobby: So with that in mind, I'm sure both of us are constantly looking for great new ML, AI and software engineering minds to bring to the team, and recruiting continues to be challenging. We've certainly made a habit of not looking only in the traditional grad programs for talent, like you said, and expanding to different areas. One idea that we briefly touched on, but that I would love for you to expand on, is expanding the global talent pool and starting to look at new places. I know you have a strong connection to the community in Africa.

What should people know about the current state of machine learning talent in Africa?

Tempest: What people should know is that it's a really untapped talent pool, and what people might not be aware of is how much activity is going on. Some of the really big tech companies have realized this and have started investing in that talent pool.

And I think it might be interesting for smaller companies to know about too. Some of the initiatives I'm really excited about: there's a Microsoft Africa Development Center in Nairobi and Lagos, and they're doing really cool AI and VR work, really cutting-edge stuff. There's also a Microsoft Africa Research Institute in Kenya, which hires AI researchers.

And the first major cloud data center to open in Africa was the Microsoft one in South Africa. Then Google and Facebook have partnered to create a master's in machine learning with the African Institute for Mathematical Sciences, so there's that master's program coming out, and I think they have lecturers from Google and Facebook.

DeepMind has started a scholarship for master's degrees in machine learning with a couple of South African universities. That was just a couple of months ago, so it's really new; they're nurturing talent there. There's the Google AI Center in Ghana. And there are some really brilliant homegrown AI startups, like InstaDeep in Tunisia, which does reinforcement learning.

And then there's this wonderful conference called the Deep Learning Indaba, which really brings a lot of these people and organizations together. Each year it's hosted in a different African country, with amazing world-class speakers. There's really a lot going on, and I would love for more people to be aware of it and also to get involved.

Bobby: It seems like, with the names you mentioned, that will surely start a very positive snowball in growing the ML community in Africa. We're probably just at the dawn of what that community could be, right?

Tempest: I think so. And the sponsorship and investment is really helping. At some of these conferences, students can get sponsored to go to Europe, and that can be a really life-changing prize for someone. Some of the students who attend face incredible difficulties that kids in Silicon Valley might not. I've met students who want to print their poster to present at the conference, but there are no large-format poster printers in the whole country, so they have to actually print their posters overseas.

And I've met students who can't find a professor who can supervise their master's or PhD project in their country, so they have supervisors in the neighboring country. And then even applying to a university in the US, just the application fee can be prohibitive.

Never mind the cost of conferences and flying to Canada for NeurIPS; those costs are completely prohibitive. So these kinds of initiatives that sponsor travel and study make a huge difference.

Bobby: I can definitely believe that; it's an opportunity to really enrich their lives and grow. So you mentioned some really heavy-hitting names, the usual names investing in that region: Microsoft, Google, DeepMind, which is part of Google. But one of the constituents that's near and dear to my heart is startups, the entrepreneurs who are trying to break through and do something really innovative. They might be ex-Google or ex-Microsoft people. I'd like to believe that it isn't necessarily the case that only a Google or a Microsoft can participate in the ML community in Africa, that there might be a way for startups to access this talent and maybe give that talent exposure to the Silicon Valley startup type of adventure.

So if I'm a startup entrepreneur and I'd like to tap into this fantastic ML talent pool in Africa, what might be a good first step?

Tempest: The first step would be to follow up on any of these initiatives that I've mentioned: get to the conferences where these students are, get involved in these centers where you find the talent. Because we don't just do it for social good.

There's real business impact in hiring diverse people with diverse perspectives and an understanding of emerging markets. So this is really beneficial for businesses, which is why the big companies are doing it. But certainly there is a really vibrant startup community too, of homegrown startups.

And I guess one thing I'd like to highlight: there was an article that came out in The Guardian last year called "Silicon Valley has deep pockets for African startups - if you're not African," and it was super interesting. They mentioned that basically a lot of the African startups are not led by African people, or are not headquartered in Africa at all.

They say that of the top 10 African-based startups that received the highest amount of venture capital in Africa last year, eight were led by foreigners. And in Kenya, only 6% of startups that received more than a million dollars in 2019 were led by locals. So I think that's really interesting.

So if you are building a project or a product that is related to Africa, targeted at Africa, you need to have African representation in those companies, and it needs to be more homegrown, basically. I thought that was a really interesting article.

Bobby: Yeah, I'll have to check it out, and maybe we'll put a link to the article in the show notes.

Bobby: So switching gears and talking a little bit about responsible AI in health: I know the conversation has shifted from can we do it to should we do it. What does that process of getting to an answer look like, and where do you see the biggest pushback?

Tempest: So in my team we deal with a lot of really complex machine learning problems and projects, and they're all really different: we work in different industries, we work with computer vision, we work with language, we work with structured data. As we see lots of different projects coming in, we have to decide, yes, we can do this project, but should we be doing this project? We have started a responsible AI ethics review process, and this gives us a sounding board to explore the pros and cons of doing the project, how people feel about it, and whether it's something that we should be doing.

We build a responsible AI document where we describe everything, and we work through it with our customers. So we ask questions like: can this problem be solved without technology at all? Is this a social problem? If you must use technology, can it be solved without machine learning?

Could this just be a simple SQL query? Do you need machine learning? Because if you do use machine learning, there's responsibility that comes with that.

We also do something that I call Black Mirror brainstorming. There's a UK TV show called Black Mirror, which explored how technology goes horribly wrong, so we think about what could go wrong with this technology if it's used in a way it wasn't intended. We think about the limits of the datasets we're working with: how would that limit the model? How should this model be used in production? So we ask a bunch of different questions.

Also, who are the stakeholders? Who would this machine learning system affect: end users, regulators? Are there any vulnerable groups it might affect, like children or immigrants? So we map out all the stakeholders, with company reputation even counted as a stakeholder, and then map out the benefits and potential harms to each of them.
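
One lightweight way to capture the outcome of such a review is a structured record like the sketch below. The fields and example entries are hypothetical; they simply mirror the kinds of questions described here, not any official template.

```python
# Sketch: a lightweight, structured record of a responsible AI review.
# The fields and example entries are hypothetical and only mirror the kinds
# of questions described above; they are not an official review template.
from dataclasses import dataclass, field

@dataclass
class StakeholderImpact:
    stakeholder: str                    # e.g. end users, regulators, children, immigrants
    benefits: list[str] = field(default_factory=list)
    potential_harms: list[str] = field(default_factory=list)

@dataclass
class ResponsibleAIReview:
    project: str
    needs_technology: bool              # could this be solved without technology at all?
    needs_ml: bool                      # or would a simple query or rules suffice?
    misuse_scenarios: list[str] = field(default_factory=list)   # "Black Mirror" brainstorm
    data_limitations: list[str] = field(default_factory=list)
    stakeholders: list[StakeholderImpact] = field(default_factory=list)

review = ResponsibleAIReview(
    project="Hypothetical triage model",
    needs_technology=True,
    needs_ml=True,
    misuse_scenarios=["used to deny care rather than to prioritise it"],
    data_limitations=["training data drawn from a single hospital system"],
    stakeholders=[
        StakeholderImpact("patients", ["faster triage"], ["biased prioritisation"]),
        StakeholderImpact("company reputation", [], ["harm from a public failure"]),
    ],
)
print(review.project, "-", len(review.stakeholders), "stakeholders mapped")
```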

So we have a really methodical way of thinking it through and understanding whether the project needs to be reformulated, limited in scope, or completely revised. And you asked where I see the biggest pushback: the pushback is actually not from our customers. Our customers are usually really keen to be involved in this responsible AI process.

They wanna know what the best industry practices are; they don't wanna be building products that are gonna be harmful. So they're actually really keen to see our process. The biggest pushback is actually a misconception that if you're not developing a model, if you're just putting a model into production, you don't need to think about responsible AI.

People often ask: I'm not developing it, the model's done, I'm just putting it into production, do I still need to think about responsible AI? And the answer is definitely yes, you do still need to think about it, because the model could become stale. A model that's been running for 10 years without retraining? That's not responsible AI; the data has shifted a lot since then. So there are lots of considerations around responsible AI to think about when a model gets productionized.
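
As a minimal illustration of the kind of drift check that keeps a productionized model honest, the sketch below compares a feature's training distribution with recent production data using a two-sample Kolmogorov-Smirnov test. The feature, data and threshold are hypothetical.

```python
# Minimal sketch: checking one feature for drift between training data and
# recent production data with a two-sample KS test. Feature, data and the
# significance threshold are hypothetical.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
train_age = rng.normal(62, 12, 5_000)   # feature distribution at training time
prod_age = rng.normal(68, 12, 1_000)    # the production population has shifted

stat, p_value = ks_2samp(train_age, prod_age)
if p_value < 0.01:
    print(f"Drift detected in 'age' (KS statistic={stat:.3f}); consider retraining.")
else:
    print("No significant drift detected in 'age'.")
```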

Bobby: I think those are some great ideas for folks to drill into, to do a better job in that area. Tempest, it's been an absolute pleasure. I'm so grateful for you making the time. I know you have a lot on your plate and I've certainly learned a lot and I know our audience will as well. Thank you so very much.

Tempest: Thanks very much for having me!

Bobby: That was Tempest van Schaik, PhD, senior machine learning engineer at Microsoft, leader in health tech innovation, and an incredibly impressive and infectious force for change. If you're interested in learning more about Tempest, her LinkedIn is actually a great jumping off point. And if you enjoyed our podcast, please like and rate us.

Tempest Van Schaik
Senior Machine Learning Engineer at Microsoft

Loka's syndication policy

Free and Easy

Put simply, we encourage free syndication. If you’re interested in sharing, posting or tweeting our full articles, or even just a snippet, just reach out to medium@loka.com. We also ask that you attribute Loka, Inc. as the original source, and if you post on the web, please link back to the original content on Loka.com. Pretty straightforward stuff. And a good deal, right? Free content for a link back.

If you want to collaborate on something or have another idea for content, just email me. We’d love to join forces!