Streaming live November 29 and 30, PrivSec Global unites experts from both Privacy and Security, providing a forum where professionals across both fields can listen, learn and debate the central role that Privacy, Security and GRC play in business today.
With over 10 years’ experience in data protection and compliance, Erin Francis Nicholson is currently Global Head of Data Protection and Privacy at the multinational tech consultancy Thoughtworks.
In addition to consulting on data protection for a number of organisations, Erin has delivered influential data protection change programmes in local government and the civil service, as well as the tech, energy, and finance sectors. Erin holds an LLM in Information Law and Practice and an ongoing MBA with a specialisation in Information Systems Strategy & Governance.
Technology: ChatGPT: the data privacy nightmare? - Day 2, Thursday 30th November, 11:30am - 12:15pm GMT
Could you briefly outline your career pathway so far?
After leaving university, I went straight into a job as an admin assistant in local council social care, and helped with their records management. From there I was poached by the data protection team. This was pre-GDPR, and I was working on things like Subject Access Requests, elements of policy and Freedom of Information Requests.
I had a taste of data protection and really enjoyed it, so from there I did an LLM in Information Law and Practice at Northumbria, and then went into a job at NHS Digital.
Following this, I set up my own consultancy, working with clients such as Scottish Power. I subsequently took up a consultancy role with Thoughtworks, where I’m currently employed. I had worked with Thoughtworks before, on a project in Stockport Council, and so when Thoughtworks needed a Data Protection Officer, they got in touch with me as I had the necessary experience, and that’s where I find myself today.
I’ve been in the industry for around 12 years now, and I’m currently undertaking an MBA specialising in Information Systems Strategy and Governance. When I complete this, I’m thinking of taking on a Computer Science degree just to round off my knowledge and skills base to add to my legal background.
What are the key data privacy issues that arise out of the accelerating use of ChatGPT?
It may be a slightly controversial opinion, but I think that the privacy concerns around ChatGPT have been overemphasised, at least when you compare it to IoT devices that may have gone under the radar, such as smart doorbells or Amazon’s Alexa – monitoring devices that governments have access to if they submit a request. Furthermore, people can use these devices for tracking; you can turn Alexa into a microphone and listen to somebody’s conversation remotely.
Sci-fi films have us convinced that AI will turn into our overlord, but I don’t think that’s necessarily the case. However, there are obviously some privacy concerns, one of them being the questionable legal basis for the collection of personal data in the first instance.
ChatGPT was trained on large swathes of the internet, which obviously contains so many people’s personal data, and ChatGPT processes that data. The owners might cite legitimate interests for that processing, but they haven’t notified people about it; I certainly didn’t get a notification or an opt out choice. I don’t think that the initial training data was particularly legal to scrape.
It’s not just ChatGPT that does this; marketing companies and data brokers do the same, just scraping whatever data they can off the internet. And I think that it’s a bit of a legislative black hole at the moment. Our data is getting scraped all of the time. Nobody’s notifying anybody, nobody’s giving you an opt out.
The other major privacy element concerns how the models are trained. Once ChatGPT has learned something, it’s very difficult to get it to unlearn it. So, even if I asked for my data to be deleted, while that may be actioned, the creators can’t overwrite the model in this way, because that’s not how the model works right now. You’ve also got things like identity theft, so we’re seeing really convincing deep fakes, and that makes phishing a lot easier.
ChatGPT learns from its inputs. If I put confidential information into ChatGPT – how my company, Thoughtworks, operates, say – and if everybody from Thoughtworks were using ChatGPT to do their project plans, then the model would learn from that. An independent user might then be able to ask ChatGPT how to manage a project like Thoughtworks, or how to code like Thoughtworks, and they would be given that knowledge. This would obviously represent quite an issue from a consultancy, IP, and confidentiality perspective.
Furthermore, you can “jailbreak” ChatGPT – give it instructions in a way that gets around its restrictions on what it will and won’t tell you. So, if you ask the model in a specific way, it will tell you information that shouldn’t ideally be readily available, such as how to make a napalm bomb. It isn’t a privacy concern, per se, but it’s a societal concern.
On top of these issues, there’s accuracy, because ChatGPT only knows what it’s learned and can’t discriminate fact from fiction. If it has scraped material that’s twenty years old, depending on the industry, then that information might not be accurate anymore. These models may also have taken things from open forums, such as Reddit – places where people just say anything, and the technology just repeats what it’s heard, like a story from down the pub, and it will present that story as fact.
A positive with ChatGPT is that it may curtail confirmation bias through how it displays information. If you search for information using Google, you’re presented with a list of choices and you click on the link that appeals to you most, and this enables confirmation bias. With ChatGPT, more rounded responses are given to requests, which can be very useful. The downside is that ChatGPT isn’t trying to be accurate; it’s just generating information that looks right, and so it can give misinformation.
What are the challenges that lie ahead if we are to use such technologies in a way that upholds privacy?
I think it would be helpful if ChatGPT had retrieval-augmented generation (RAG), which retrieves data from external sources of knowledge to give context to its responses, alongside the natural language responses. This way, users would be able to track where information comes from, helping to ensure credibility, accuracy and trust. It may also help with servicing data deletion requests, as the model may know where its information comes from.
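As a rough illustration of the retrieval step described above, the following Python sketch picks the passages most relevant to a question and keeps track of where each one came from. It is a minimal sketch under stated assumptions, not how ChatGPT or any particular product works: the `Passage` type, the word-overlap scoring and the source labels are all hypothetical, and a real system would typically use a search index or embeddings and then pass the prompt to a language model.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical label for where the text came from
    text: str


def tokenize(text: str) -> set[str]:
    """Lower-case the text and split it into a set of bare words."""
    return {w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")}


def retrieve(question: str, passages: list[Passage], top_k: int = 2) -> list[Passage]:
    """Rank passages by how many words they share with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(p.text)), p) for p in passages]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(question: str, passages: list[Passage]) -> tuple[str, list[str]]:
    """Build a prompt that carries the retrieved context, plus the list of sources used."""
    hits = retrieve(question, passages)
    context = "\n".join(
        f"[{i + 1}] {p.text} (source: {p.source})" for i, p in enumerate(hits)
    )
    prompt = (
        "Answer using only the context below and cite the sources.\n"
        f"{context}\n\nQuestion: {question}"
    )
    return prompt, [p.source for p in hits]


# Illustrative store; contents and file names are made up for the example.
store = [
    Passage("policy.txt", "Subject access requests must be answered within one month."),
    Passage("faq.txt", "The office is closed on public holidays."),
]
prompt, sources = build_prompt("How long to answer subject access requests?", store)
```

Because the answer is assembled from identifiable passages, the list of sources can be shown to the user, and in a setup like this, removing a passage from the store would stop it being retrieved for future answers.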
I think data scraping needs to be addressed too – some sort of privacy engineering is needed to redact personal information in a better way before it is fed into the model. That would improve the legalities around the data processing that takes place. We also need to have better assurances over data accuracy, and ensure that data the model is trained on is as up-to-date as possible.
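To make the idea of redacting personal information before training a little more concrete, here is a small Python sketch that masks email addresses and phone numbers in scraped text. It is only an illustration under simple assumptions: the two patterns and the `redact` function are hypothetical, they catch only obvious formats, and names or addresses would need more capable tooling (for example named-entity recognition) plus human review.

```python
import re

# Simplified patterns, assumed for illustration; real-world formats vary widely.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


print(redact("Contact jane.doe@example.com or +44 20 7946 0958."))
# prints: Contact [EMAIL] or [PHONE].
```

A step like this would sit between scraping and training, so that the obvious identifiers never reach the model in the first place.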
I think it’s good for regulators and laws to be technology agnostic because they have to stand the test of time. But I do think US President Joe Biden’s latest executive order and the legislation coming out of the EU are not specific enough to get under the skin of how models are trained, what techniques should be used in training them, and how to get around inherent bias.
For example, if a mortgage company were to use these models to train on their algorithms, with outputs being whether an application has passed or failed, then you need to be able to explain how the algorithm reached that decision, and there has to be a human override. This is in current legislation, and so is not a question of legislation, but of enforcement.
Facial recognition technology is now commonly used on most personal devices and to help criminal investigations, and it’s one of the most advanced technologies of recent years.
However, following the European Parliament’s recent passing of its version of the Artificial Intelligence Act, the use of facial recognition remains one of artificial intelligence’s riskiest uses.
As facial-recognition tools are full of biases, lawmakers are ready to fight to avoid any risk of mass surveillance. Maybe the solution lies not in banning facial recognition or any one type of biometric, but in regulating how we consent to the sharing and tracking of identifying data.
Get up to speed on the key issues, only at PrivSec Global.
Also on the panel:
- Elisabeth Mackay, PA Consulting
- Victoria van Roosmalen, CISO & DPO, Coosto
- Fiorella Arevalo, Senior Global Privacy Officer, Media.Monks
- Elizabeth Smith, DPOrganizer
- Session: Technology: ChatGPT: the data privacy nightmare?
- Time: 11:30am – 12:15pm GMT
- Date: Day 2, Thursday 30 November 2023
Discover more at PrivSec Global
As regulation gets stricter – and data and tech become more crucial – it’s increasingly clear that the skills required in each of these areas are not only connected, but inseparable.
Exclusively at PrivSec Global on 29 & 30 November 2023, industry leaders, academics and subject-matter experts unite to explore these skills and the central role they play within privacy, security and GRC.