Transcript of "Tackling data challenges to empower public sector transformation"

FACILITATOR: Thank you and welcome back to Tech UK’s Building the Smarter State. Hopefully you’ve had a chance to refresh the coffees and the teas, do a bit of networking. I think some of the previous panel have just managed to escape the grasps of some of the suppliers in the room, it’s fair to say. But we now have a great session to follow up on this morning with Made Tech giving us an outline of how to tackle data challenges in government and how to empower digital transformation. So I think really timely following on from what we heard earlier. So I’d very much like to welcome Jim Stamp, Head of Data, and Haroon Ahmed, Commercial Lead from Made Tech, to the stage. Thank you very much.

[applause] 

JIM STAMP: Hi everyone, my name’s Jim Stamp, I’m the Head of Data here at Made Tech, looking after digital transformation projects, data transformation projects and – just waiting for the slides to come up, there we go. So, quick introductions. So Made Tech has been around for a while, we’ve focused recently on public sector, so we’re a public sector delivery specialist largely focused on the digital transformation but more recently moving into the data space. We successfully delivered projects for the public sector in the local government space, central government space, just starting to work in the health space as well; and it’s really starting to pick up. So today we’d like to talk through some of the issues that we’ve seen across multiple projects and multiple customers, talk through how we’ve solved those and really just I guess a call to arms across all of the different sectors of government of how we need to change and how we need to upskill and shift our view of how data is used, and that sort of area.

I’ll just let Haroon introduce himself. 

HAROON AHMED: Hi guys, my name’s Haroon Ahmed, and that is probably a catfish picture of me. [laughter] My background is I’ve been working in data for about eight years now across the public and private sector, that includes pharmaceuticals, healthcare, working in the UK, central Europe, Middle East, a little bit of work in the States and some North African work as well. My background as a human is, I used to be a lawyer, I used to be a barrister, for my sins, we’re still trying to work out if we’re going on strike next week or not; and now I’m working at Made Tech as a commercial lead trying to do interesting and fascinating data projects. I’ll talk you through the challenges we face and what we think the solutions are. 

JIM STAMP: I just wanted to do a quick plug as well- Plug, is it plug within our talk or 

HAROON AHMED: I don’t know, you can call it a plug. 

JIM STAMP: Call it a plug. We’re thinking about writing a book about how to solve problems in data, and we’re sort of going to try and do a thing where Haroon asks me a stupid question and I try and answer it for him but we were wondering if you could come up with any questions that we think we might have missed. So we’re going to have a section of time at the end, if you have any questions of any technical nature or non-technical nature about building data platforms, structuring data, any kind of interaction around data that you can think of, fire them at us. I think we’re recording the session so we’ll get a record of it, and we might be able to add them to the book; and hopefully I might be able to answer some of your questions as well. So, yeah, have a think while we’re talking whether there’s anything that occurs to you that you might want to ask us.

Do you want me to loom behind you or sit down? 

HAROON AHMED: I think you should sit down. 

JIM STAMP: Okay. 

HAROON AHMED: You’re quite a bit taller than me. [laughter] 

JIM STAMP: Oh, it’s not a uniform either, we just didn’t swap our messages 

HAROON AHMED: Yeah, I feel like mini-me stood next to him really. [laughter] I am average height, he’s just a lot taller than me, just for background. In regards to the book, one thing to mention is we want this book to be something that’s developed ongoing and when senior stakeholders are going into a meeting to talk about complicated data ethics or governance, they have no idea what it is, they should be able to read like a chapter of it really quickly and then be informed when they’re speaking. 

So starting off with the challenges, what we find is the question nobody’s asking themselves is, are we solving the problem? I see loads of teams getting wrapped up in conversation around data quality, data access, internal teams are too slow, they’re siloed and everybody- We have too many requests coming in and we’ve got a backlog. Cloud computing and hosting, which BI tool should we use? Should we use QuickSight, and should we use Power BI? And rarely do we see sort of the teams having conversations about what are the problems we’re trying to solve, who the users are? And there’s no consideration for the sort of real users of the problem, or even who the users are sometimes.

We probably need to take a step back, the best projects that we’ve seen yield the best results are the ones that concentrate on the user or the customer. What do they need? How quickly do they need it? What does their workflow look like? Are we just creating tools for the sake of creating tools? Are we creating dashboards that sit on the shelf and then are never looked at again? Let’s talk to the service designers, let’s talk to the data scientists, let’s talk to the delivery managers, and let’s understand these users. And it is essential that we start looking at this now, and we need to embed this culture of the users now before we get into the really complicated stuff with data around predictive government and those sorts of things where if we don’t concentrate on the users, we’ll be creating even more tools that just sit on the shelf and nobody looks at. Now they’re smart and there’s data science in them and there’s maths, and we’ve ticked that box, but what else, what does it yield? And that’s not thought about enough.

I’ve spoken about the shelf dashboards and a lot of the time you ask people, okay, why have you made this dashboard and why have you carried on updating it? It’s a bit like, we’ve always done this. Who uses it? Well, we don’t know. It’s just a bit of a waste of time. We need to be clear on what the value proposition of what we’re doing is, that’s how we build things that are sticky, that’s how we build things that work; and one of the examples for the local government folk in the room is a data dashboard sort of plug, we built it with Hackney, but the reason it works is not because it’s the first data platform in the world, it’s not because it’s the most complicated data platform in the world, it’s because Hackney focused on the users and how they’re going to use it and built into the workflow of how data’s being used, and that’s why it’s sticky, and that’s why it works.

So the message here is focus on the users first and then think about solutionising, discuss the data, discuss the tools, technology, it’s all really important but start with the users, start with the real issues, what the real problems are that our users are facing. 

Mind the gap. Now we are all really, really well aware that the government has- Well, all government departments have an abundance of data, right? Really valuable data. But I mean I don’t speak to many people who really know how to access it, how to get to it, what they can do with it, and what it means. And that’s really important. We know data expertise is at a premium right now, and we hear from clients the great work and the great ambition and all the data strategies in the world and what they’re trying to do in the next five years without a real sort of plan on how they’re going to do it. There’s a lack of skills in internal teams that plays a part. There’s an inability to hire, and the big R word which is not recession but retention, keeping the staff that they’ve trained up and they’ve skilled.

Lack of understanding of data, where is it, how is it held, how is it structured, what does it mean, where does it come from, where was it collected, why was it collected? Training staff’s not been a priority, we hear this again and again from departments, that’s from local government to central government to healthcare, it’s all the same and of course the inability to keep up with salaries. And some of these things we can address, other things we can’t. 

Now if you combine that with the low rate of data literacy that we see in government and limited understanding around how to access data, data lineage, data interaction, interpretation. It’s all good having data in dashboards and stuff but unless someone understands what it means and how to make decisions using that data, it’s meaningless. One of my favourite phrases is, data is nonsense without intelligence, and that’s really apparent when we work with our clients.

JIM STAMP: We mean intelligence that comes with the data, not the intelligence of the people interpreting the data. 

HAROON AHMED: There we go, that’s why we work with him. [laughter] 

Now the compounding impact of that is we have senior civil servants that are being left behind, we have requests coming from ministers to summarise things in ways they’ve never done before, we have requests that are trying to reduce really complex bits of data into a sentence or soundbites or oddly structured KPIs, as we like to say them, that mean, we don’t know- Sometimes we find that analysts are being asked to develop data platforms, so it’s the whole request of, can you go and learn Terraform overnight and then figure this out, which is the wrong way of doing things. We should be giving these jobs to the data engineers and data architects.

What happens is we’re failing to deliver on really simple things, things that are done quite easily, quite successfully, on time in the private sector. So there’s no reason we shouldn’t be doing this in the public sector. Now tools may be the answer but there’s still a severe gap in knowledge and understanding that needs to be addressed; and there’s quite some sort of sticky upskilling challenges that we find.

I think that’s about it on this bit.

JIM STAMP: All of these slides are really interrelated but we’re going to try and tease out a thread. So aspiration versus ability. Most of our customers come to us and say, “We want to do data science,” and that is their data strategy. We want to do data science and we sort of stop them and say, “Well what have you done so far?” And they go, “Well someone’s learnt some Python and they’re running some data on their laptop.” And it’s like, “How are you going to productionise that?” Sort of, “Well we’re not quite sure, that’s why we’ve come to you, you’re the experts.” 

At which point we have to rewind and we have to rebuild their strategy, think through what they actually want to do. Back to the users again. What is it your users need. What is it that you think data science can answer, and this isn’t small customers, this is department level data strategies, you know? This is big customers coming to us and just saying, “We want to do data science,” and it feels like we need to really rewind and go back to those users, look at the underpinning problems that you’re trying to solve, try and look at the ways you want to work with your data and with your customers and your users, and try and build that view of what that data science can do. 

Once you’ve got that and you’ve answered those questions, then you can start doing some data engineering. You can start cleaning some data, you can store some data, you start extracting some value and some features from that data; and building that understanding of what your data can do to you outside of that data science bit is more important than anything else. I think there’s a lot of value that you can extract just from data without data science, and building your strategies around one single way of analysing your data is pretty dangerous. 

We have an issue within data that as a software engineer, historically, it feels like we’re not professionals in data. I think that there is a, from criticisms from my ex-software colleagues that data is seen as a bit of a cowboy hack is in part true; and I think we probably need to be a bit more aware of the ways that we work, of the ways that we treat data, how it’s stored, how we look at quality and start building that into the ways that we work as well.

That maps into data ownership, so one of the current sort of fashions for architecture design is data mesh and one of the things that they push quite heavily on that is data ownership. So, rather than having a warehouse where people throw their data and it sits there and rots for years, actually the team that generates the data, owns the data, they look after it so if you are looking after housing in local gov or providing a service in the NHS, then the data that you’ve produced is owned by you, not by a centralised team. Don’t move it somewhere else for someone else to understand and model and use; own it yourself, focus on the quality of the data, work out again who your users are specifically for your data, for the data output of the product and service that you’re supporting. 

It’s so important to make sure you understand how it’s going to be used, who should have access to it, why they are using it, whether they should be using it in that way, providing that metadata so that people can understand the provenance and lineage of that dataset so that they know the legal right to use it, making sure that you can share it.

It comes down to sharing. So sharing data is the thing that will fix the problems. We all know that we want to share data. I think we’ve heard again and again so far this morning about sharing data and how hard it is, and building technologies that will cope with it. I would say on the projects that we’ve worked on the aspiration to share is blocked, almost always, by the fear of the risks and the impact of leaked data or misused data or, you know, everything that everyone’s been scared of when you mention the, we’re going to share the data with someone else. And this could be department level, agency level, you know, across the board, sharing data is hard- Well, no, it’s perceived to be hard. I think we are used to, within an organisation, having governance structures that allow us to share data and do it in a confident way but when it comes to inter-organisation sharing I think we get nervous.

I think as an engineer and as a technology, we have ways of doing it, we can control this, this is something that is maybe not a fully fixed thing but it’s definitely understood and known, and we can get there. I think where we’re missing at the moment is joining the governance colleagues with the engineering colleagues. Governance people are seen as, you know, blockers to innovation, to making things slow and making, you know, adding process to things that don’t need to be there. Engineers are seen as cowboy hackers by the other side of that balance, and I think by joining together and talking to one another and building technological solutions to governance problems is going to be how we open those doors to sort of data sharing across. I think this is going to be core to everything that we want to do over the next five years. I think the strategy is going to fail unless we get those two communities to start working together.  

I think that maps on nicely to data lock in. I was shocked when I first started working for Made Tech, we were engaged by an NHS ICS to work on some children, young people’s mental health work. Brilliant project, felt amazing, we loved it, we felt like we were having a massive impact, it was just in the middle of the lockdown. And I got told we couldn’t get a hold of the data. We couldn’t see where patient records were, we couldn’t get access to where they were in waiting lists or what they wanted or even who they were seeing or what case notes they had. And that was coming from – not from the NHS but from the supplier that the NHS had paid to provide them with the IT system to store those notes. The bill that that supplier sent to us was enormous, to get hold of that data, and it broke everyone’s heart because we had spent so long on the output and the discovery, building these amazing systems that, you know- And we had, you know, there was an API, it was available but the bill was just enormous and it felt frankly, criminal at the time. [laughs] It really didn’t feel comfortable at all and everyone was heartbroken. 

So I think that was the first one and it just really opened my eyes to the problems that we had. Then again local government context. Local government told me they didn’t have access to their council tax data, they couldn’t analyse their council tax data. It was like, really, what? Yeah, we don’t have access to our council tax data, all we can do is run this report to get the export from the system that we’ve bought. It was like, wow, okay, that’s not good. That is the core of your system. That’s what, you know, that’s the only person record that you can guarantee for each of your houses, right, that is it. And they didn’t have access to it, they couldn’t pull out those records and, again, just shocking. 

Central government, we had a project recently where we were asked to come and supply some data to their BI system and the supplier who was looking after their BI system refused to work with us, they said that’s not part of the contract, we don’t have to do that and it just, just feels so wrong. And I think this is the bit that I’m going to spend the most on and be the most passionate about because I don’t like it, I really don’t like it. [laughs] It just feels so wrong. We need to get better at buying services, at placing contracts, at talking to the people that we work with and making sure that they are aligned with those aspirations and those ways of working. This is at the moment the biggest problem that I’ve seen, just getting access to data, it just feels so wrong and broken.

So, yeah, I’d love some questions about this if you’ve got any, how we can fix this, or just some insight into how you’ve dealt with these problems. So, yeah, back to Haroon now. 

HAROON AHMED: Yeah, what about privacy and data ethics? When you think about it, it all comes down to trust right? We need to build services that use the data that we have in a way that instils trust in the public. People need to know that we’re using their data for good in the right way and that it’s protected. We need to show how much good we can do by using this data, in particular around data privacy, there needs to be an understanding that just because you have a statutory right to use data doesn’t mean there aren’t still some moral obligations on how you should be using that data. I think there’s a big difference that people need to understand. Anonymisation, segmentation, pseudo-anonymisation, is still best practice even when you’ve got statutory rights to use that data, and you’re not legally required to do so. So best practice needs to be kept in mind. 

Linking data also needs to be considered. Data looks very different once it’s linked, it’s able to do other things and this adds more identifiable fields that need to be considered when looking at data privacy. We see a common theme that within government data tends to be really locked down until it isn’t and then it’s fair game, it’s just like you can do what you like with it now. And that’s just wrong, we still need to anonymise, pseudo-anonymise, segment data and really protect it. It’s essential when we’re building services to do those things to instil the trust that we need to instil.

A lot of the time data privacy gets conflicted with data access. We still need to access data, let’s just develop the right frameworks and the safeguards and best practice that’s required, like you don’t need my date of birth and my full name and my postcode to design the best public services for me and my community. That’s never really needed, especially when we’re looking at data at an aggregate level. 
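
A minimal sketch of the idea above, that aggregate analysis rarely needs a full name, date of birth and postcode. It assumes a keyed hash is an acceptable pseudonym and that postcodes are written with a space. The key, field names and sample records are hypothetical, and none of this is Made Tech’s actual method.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key for illustration; a real key would be managed separately from the data.
SECRET_KEY = b"example-key-not-for-real-use"


def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def outward_code(postcode: str) -> str:
    """Keep only the first part of a space-separated postcode, e.g. 'AB1 2CD' -> 'AB1'."""
    return postcode.strip().upper().split()[0]


def birth_year_band(date_of_birth: str, width: int = 10) -> str:
    """Coarsen an ISO date of birth (YYYY-MM-DD) into a band of years."""
    year = int(date_of_birth[:4])
    start = year - (year % width)
    return f"{start}-{start + width - 1}"


# Made-up records, purely for illustration.
records = [
    {"name": "Alex Example", "dob": "1984-03-12", "postcode": "AB1 2CD"},
    {"name": "Sam Sample", "dob": "1988-11-30", "postcode": "AB1 9ZZ"},
    {"name": "Jo Placeholder", "dob": "1993-07-01", "postcode": "XY9 8WV"},
]

# Drop the direct identifiers and keep only coarse fields for analysis.
safe_records = [
    {
        "person": pseudonymise(r["name"] + r["dob"]),
        "area": outward_code(r["postcode"]),
        "birth_band": birth_year_band(r["dob"]),
    }
    for r in records
]

# Aggregate-level view: how many people per area and age band.
counts = Counter((r["area"], r["birth_band"]) for r in safe_records)
```

Pseudonymised records can still be re-identified when linked with other datasets, which is why the coarsening and aggregation steps sit alongside the hashing rather than replacing it.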

Now data ethics. I was talking to an ethicist here, we don’t meet with many of them. There is now a DDaT role, that’s a data ethicist and I was speaking to them and I can just see him in the back, and the description of the data ethicist for the DDaT role I think we collectively will know about two people on earth that meet that description, and they probably have six PhDs each. So that’s unhelpful. 

Now, what is data ethics? It’s about building public services that don’t discriminate or disadvantage people, communities or the environment. Really that’s what we’re trying to do at a basic level. We’re not trying to right now get into that argument about civil liberties or my data, his data, her data, my right, your right – I think that’s a slippery slope. It is an argument I can have over a beer or a coffee happily but probably not one we want to have when we’re talking about data ethics at an organisational level. 

We need to look at ethics by design. That is what we should be having. It shouldn’t be an afterthought, we should be having data ethics frameworks that we look at, at every stage of the data and we don’t just consider it when we’re looking at personal data, we should be looking at every stage of sort of data collection, data sharing, data use, and the algorithms that we’re building to make sure they’re not biased or racist or sexist.

Ethics is not privacy. I think people need to remember that as well, there is a difference. Privacy doesn’t consider when there’s a bias in a data model, it doesn’t consider when there is practices that reinforce stereotypes, for example, or propagate falsehoods sometimes, or when things are just plain wrong. For example, I worked on a project – not in the UK – when we were looking at data around where to build bus stops and the data was really good. So we found out where the demand is and where we should have bus stops, but we soon realised that the data only came from affluent neighbourhoods because they knew how to submit the data, they had access to the technologies that were going to do that. So if we just relied on that data we would have built bus stops for the wrong community, for the wrong groups of people. Not for the people that are marginalised or less well off in society. A data ethics framework to refer to for that organisation would have solved that issue before we got to it, because that’s around data collection or the data we use. 
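
One way to picture the kind of check a data ethics framework might prompt at the collection stage is to compare where submissions came from with where people actually live. This is a sketch only: the area names, figures and the 0.5 threshold are all invented for illustration and are not taken from the project described.

```python
def representation_ratios(submissions, population):
    """For each area, divide its share of submissions by its share of population.

    A ratio near 1 means the area is represented roughly in proportion;
    a ratio well below 1 means its voice is largely missing from the data.
    """
    total_submissions = sum(submissions.values())
    total_population = sum(population.values())
    ratios = {}
    for area, people in population.items():
        submission_share = submissions.get(area, 0) / total_submissions
        population_share = people / total_population
        ratios[area] = submission_share / population_share
    return ratios


def under_represented(submissions, population, threshold=0.5):
    """List areas whose representation ratio falls below a chosen threshold."""
    ratios = representation_ratios(submissions, population)
    return sorted(area for area, ratio in ratios.items() if ratio < threshold)


# Invented figures: the largest area submitted the least demand data.
population = {"Northside": 10_000, "Riverside": 10_000, "Eastfield": 20_000}
submissions = {"Northside": 600, "Riverside": 300, "Eastfield": 100}

flagged = under_represented(submissions, population)
```

With these made-up numbers Eastfield holds half the population but supplies a tenth of the submissions, so it is the area flagged before any bus stops are planned.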

In short what we need to do is create data frameworks but not just stick to them, iterate them, develop them as technology develops, as our understanding improves and really empower team building- We don’t want data ethics to be a roadblock, we don’t want it to stop your projects, we want it to empower your projects to make sure we don’t look at data ethics when things go wrong. Let’s look at them so things don’t go wrong.

I think that’s it on data ethics. 

JIM STAMP: Yeah, so quick summary. 

So I think the four things that we probably take away from this is sharing data is a doable thing, we can share data, it doesn’t need to be the blocker that it’s perceived to be. It’s strange the difference between hearing people talk at this kind of session compared to what we see when we’re actually trying to deliver services. Sharing data is a scary thing and it stops projects from happening. 

Procurement, we need to fix procurement. We need to fix procurement of, you know, contracting suppliers like Made Tech to come and build bespoke solutions, we need to change the way that we buy off-the-shelf solutions as well. We need to build in those open sharing end points. We need to talk about how we export data. 

Privacy and ethics need to be built in from the start alongside quality, alongside all of the other things that we know are necessary to build data platforms and meshes and all the rest of it. Privacy and ethics have been left out over and over again on projects that we’ve seen and we’ve had to fight to get them put in there, and that feels wrong. 

Skills, skills, skills. We’ve heard it again and again and again this morning. I think skills – and it’s not just have we got enough data engineers or data scientists, you know, the DDaT roles are important but it’s data skills for everyone across government, making sure that, you know, senior policy advisors, ministers etc, understand the basics of what it is that they’re seeing as summarised up to them. So if you’re sat there looking at a dashboard and you don’t understand it, shout out, ask, get some training, work out what it means, it’s so important that everyone understands it. Or, get the team which built the dashboard to then do it in a different way so that you can understand it, add the context in, give that insight so that you have that intelligence embedded within dashboards or reports. Everyone needs to be confident in the data that they use and how they talk about it at the level that they talk about it; and so data literacy isn’t something just for the technical roles, it is across the board for everyone to have and use. 

HAROON AHMED: And not everybody needs a dashboard, right? Can we just stop building dashboards [laughter] like 

JIM STAMP: I hate dashboards. 

HAROON AHMED: The best example I have is, I was working with a Trust once and the CEO said, “Just don’t build me a bloody dashboard again,” pardon my tone, and that’s what he said, and I was like, “What do you need?” He was like, “Just let me know when things go wrong.” And that’s just good insight. He just wants to know when things are going wrong but someone else in the department may need that dashboard but just building a dashboard for everything is just silly. [laughter] 

JIM STAMP: It is the answer that most people want though, it’s, “Can we have a dashboard for this?” It’s like, “Sure, we could just tell you the answer, it’s not going to change over six months. It’ll be the same answer tomorrow when you check the dashboard, it’s just the dashboard isn’t going to help.” But they still want a dashboard and they look at it once every six months and then it sits there and it doesn’t change and then they get bored of it and then they don’t look at it again and, yeah, it’s a waste of time and money. 

Right, on to questions and also book as well. So has anyone got any questions about what we’ve spoken about now and do you also have any questions about technologies or data or data platforms or anything, ethics?

Go on there, yeah, one in the back there. 

JOHN HERWITZ: Hello, John Herwitz, working for the MOD. I’ve got a question regarding metadata, you were talking about the requirement of data owners to provide metadata. I was wondering if there was any standards on metadata, if you have any particular requirements for it, would you be advising that people have a level of metadata so that it can be taken forward to those dashboards and you can roll over a piece of data and it will tell you about it. Or if it’s something that you would be wanting to provide for integration purposes so that when there’s two different kinds of data with the same name and similar ideas, there’s enough information there to integrate them and get a common idea of what’s going on or any other concepts on metadata that you care to share with us? 

JIM STAMP: I can talk for a long time on this subject if you want me to. I had an hour-and-a-half long chat with an architect from DIT about semantic reasoning or across a data mesh, hierarchy, the other day which most of my data engineers just went, I have no idea what you just said but it sounds fascinating. 

So metadata is vital to everything that we do, it’s data about data, fixes most of the problems that we have. That’s how we fix sharing, that’s how we fix lineage, so when you’re looking at a dashboard, as you alluded to- Ah, dashboard again. If you’re looking at an item of data again and you don’t know where it’s come from, what it’s based on, what the quality of those datasets are then how can you know whether you can trust that number? If I’m looking at a single number with nothing around it, it’s just a number, it doesn’t mean anything. I need to know where it’s come from. 

So that metadata that says, this number is based on these things, these things can be described as this, they’re based on those and then you can say the quality of that coming up that lineage gives you that provenance of that number and whether you can trust it; and that confidence in whether something is trustworthy is so important when it comes to data. I’ve seen numbers being thrown around – even in our own organisation, I’ll be honest with you. Where you just go, well I can see that same number twice with two different values in two different reports, so which one is true? And if you can’t trust it then what’s the point? It doesn’t help.
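
The "this number is based on these things, which are based on those" idea can be sketched as a small data structure. This is an illustration under assumptions, not an existing metadata standard: the field names, the 0-to-1 quality score and the example datasets are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Descriptive metadata for one dataset or reported figure (hypothetical fields)."""
    name: str
    description: str
    owner: str
    quality: float  # assumed score between 0 (untrusted) and 1 (fully trusted)
    derived_from: List["DatasetMetadata"] = field(default_factory=list)


def lineage(item: DatasetMetadata) -> List[str]:
    """Return the item's name followed by everything upstream of it, depth-first."""
    names = [item.name]
    for parent in item.derived_from:
        for name in lineage(parent):
            if name not in names:
                names.append(name)
    return names


def weakest_quality(item: DatasetMetadata) -> float:
    """A figure is only as trustworthy as the weakest dataset it is built on."""
    return min([item.quality] + [weakest_quality(p) for p in item.derived_from])


# Invented example: a reported number built on two upstream datasets.
register = DatasetMetadata("patient_register", "Registered patients", "Records team", 0.7)
referrals = DatasetMetadata("referrals", "Referral events", "Service team", 0.9, [register])
report = DatasetMetadata(
    "waiting_list_report", "Average wait in weeks", "Analytics team", 0.95, [referrals]
)

print(lineage(report))
print(weakest_quality(report))
```

Taking the minimum quality along the lineage is one simple design choice; a real scheme might weight or combine scores differently.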

Are there any frameworks? Sadly, no. Lots of people have come up with lots of different ways of doing it, there’s a few around that have been around for a long time that we’re thinking of adopting. So as I’ve mentioned, some of the semantic web stuff for describing attributes, it’s really powerful and it hasn’t really made its way into big data. So linked data and big data seem to have missed each other, so that’s why I was having that conversation with one of the architects from DIT. How do we join big data and linked data so that we can say that this data item, you know, if it says name I know exactly what it means because it’s been referenced by that definition over there. There’s this huge dictionary of definitions of data items that we should be using and for some reason we’ve missed a chance to, so, yeah, that’s part of that.

Yeah, anything more than that I don’t think there’s anything open, there’s a few products out there that are open-source ways of describing schemas but the lineage and the confidence and the trustworthiness perhaps isn’t part of that yet. There’s some good data hubs and catalogues that you can start using to summarise that but, yeah, I think we’re not quite there yet. Some people are working on

Did that answer your question? 

JOHN HERWITZ: Yes. 

JIM STAMP: Sorry, I could talk for a lot longer if you wanted me to. Anything else? Any other questions? Free consultancy, come on. [laughs] 

AUDIENCE MEMBER: Hello, I work for Northern Ireland’s Regional Economic Development Agency and cyber security is really very strong in Northern Ireland. It’s a quick question. I was at the Cyber Expo two days ago, and Ciaran Martin the former NCSC head talked about corporate responsibility for cyber security. So, question is to what extent do you see cyber as embedded as part of your data strategy versus just a separate issue?

JIM STAMP: It’s the same strategy. It is the same strategy. I don’t, I can’t – as a software engineer I can’t separate those two things from each other. It doesn’t make sense to separate them. We have a colleague who has come from a background where he won’t tell us what he used to work on, we’ve seen his CV, it’s got names of some organisations that we sort of go, well roughly speaking we can guess what you worked on but he has some very strong views on how data platforms are structured. Not just from a privacy point of view, his is more around the how do you access and share data but also about the networking etc. 

And I think they are just the same strategy. I don’t know why you would want to separate them because if you separate them then they’re not aware of each other. You know, even at a technology level, if you separate the cyber team from your data team or your software team then if they’re not aware and working with each other then they can’t deliver a solution that’s coherent. The same with your strategy, if your strategy isn’t connected- Digital, data, cyber, all need to be part of the same strategy. I think building a data strategy is great but in isolation it’s going to miss huge chunks. Digital and data have to go together, right? You can’t do one without the other these days, I think we’ve kind of finally got there, cyber is the one that sometimes gets missed and I think it shouldn’t.

Anyone else? Yeah- Oh, there’s more hands coming up now. 

PATRICK KING: Hi, Patrick King, CGI. Here’s a question for you that you might want to consider for your book. We’re seeing the rise of artificial intelligence, quote/unquote, and algorithms being used to make decisions instead of human beings. Some of those algorithms are making decisions based upon data, and it can affect people’s lives, so for example, can you have a mortgage, are you likely to commit a crime, will we admit you to this country and so on. How do you address those sorts of issues in your book? 

HAROON AHMED: To start with nobody can get a mortgage right now anyways- [laughter] 

JIM STAMP: There are no mortgages. [laughs] I think – Haroon’s going to shoot me for saying this. Ethics aside, I worked for Autotrader and we spent a long time building a valuations engine to tell car retailers how much their cars are worth, and when you get it wrong and a car retailer has bought 50 of the same car and all of a sudden you tell him that they’ve all dropped in price by 10% overnight because someone’s just run an auction and devalued the market, they don’t like it. 

Now that’s fine, you can justify that. Where the difficulty came is when I was asked to explain why the model did what it did, and I think for me my biggest learning in building machine learning or AI-based solutions is the explainability part of it. Don’t build black boxes. Don’t build things where you can’t go, why, what’s the reasoning behind this change in the model or in this outlier. 

Machine learning is only as good as the data, and we all know the data is not good enough, right? You know, spreadsheets don’t cover it. Most data comes through spreadsheets before it hits. I saw a talk from Matt Parker the other day where he was saying that Excel automatically converts certain strings into dates, it automatically converts them. He went through a whole load of data and found that it had automatically scrubbed all of the entrants and they had to fundamentally change the way that it worked because someone was submitting the data in Excel. We all saw with COVID, right, 64,000 or 65,000 COVID cases. It was like, well, hang on, that sounds like a suspiciously binary round number, let’s just check that it’s not Excel, the end of Excel. Oh, it was. Right, okay, well let’s fix that problem, right? 

So if we can’t work out where the errors come from and we can’t work out what impact they have on those AI models, then how can we answer those questions? Now me working for a car company is one thing but if you’re talking about whether someone gets a visa application, that’s a fundamentally life-changing thing. I live with a Ukrainian refugee at the moment and it took a long time to get a visa sorted and that was down to a de-duping problem. You know, these are fundamental issues that fundamentally affect people’s lives and we’ve got to be so careful how we do it, from a technology point of view never mind the ethics involved, right? 

We’ve got more fundamental issues. There will always be outliers, the data is rubbish, right? You can throw billions of lines of data at it, but there will always be outliers and if you can’t explain why something happens- Yeah, and I think GDPR and the “you have a right to request a human review of a decision if it’s based on AI” is fundamental. We need that right in there, and I don’t think it’s being removed, I hope it’s not. Everything that I understand about the new process, legislation and the new bill that’s coming out, I don’t think they’ve removed it, and I was crossing my fingers because I’ve seen it happen. ML is not always the answer and we need humans in the loop.

Do you want to say anything on the ethics point? 

HAROON AHMED: No. 

JIM STAMP: I think it’s fairly obvious- 

HAROON AHMED: No, absolutely, I think the human in the loop thing is important, right, as long as you keep humans in the loop, because AI automated algorithms making decisions on our lives are here, if not coming. At a government level they’re probably coming, right? So, we have to make sure that they’re valid, right, and human in the loop is the way to do it. 

JIM STAMP: Is that enough? More, less? About right? [laughs] 

HAROON AHMED: We’ll stand around the corner. 

JIM STAMP: Yeah, come and speak to us. 

CHARLOTTE CLAY: Thank you, Charlotte Clay, NHS England Transformation Directorate working in digital productivity. This is not really a question but more of just a sharing the pain. In the NHS I think two of the sort of biggest issues we have are access to data and integration of data, and I think part of that is, one, the NHS traditionally is very risk averse, and, two, that a lot of the data we have particularly around patients is as you mentioned looked after by suppliers. [laughs] So, it was more just to say I think what you’re doing is really great and something that, from a perspective of something like the NHS is something we desperately need the answer to how to fix that because it’s going to drastically impact on how we transform the way that we do things and it has been for the seven years that I’ve been working in digital transformation in the NHS, so, yeah, not a question but just more of a sort of reflection. 

HAROON AHMED: Yeah, I think the NHS went in its shell with the data stuff after the care.data stuff that went on, right, and that was not good for anybody but it doesn’t mean we don’t still need linked data, shared data. So if I go from one hospital to the other, or to my GP, they shouldn’t get my treatment data from my hospital visit three weeks or two months after. I mean most of the time- I’ve got a mother who has a rare lung condition, she gets treated at three different hospitals and none of them know what’s going on. How’s that good enough, right? And it’s down to us not being able to effectively share data, people going into their privacy shells because they don’t want to touch a really complicated issue. It’s being addressed now, this is a good thing but it should have been addressed about 10 years ago. I think that’s where the frustration comes from. 

JIM STAMP: Yeah, that’s it. Thank you very much. 

HAROON AHMED: Thanks guys. 

JIM STAMP: Come and speak to us at the stand. 

[applause] 

FACILITATOR: Thank you to Jim and Haroon for, I think, really fascinating insight and I’m sure plenty of ideas which will take us into lunch. So you’re now released to go and network, chat to people, see the exhibitors as well and I’m just going to ask you to be back in here in a little over an hour, so twenty to two please, back in the room, but thank you all and see you soon.
