Blog articles

Security reviews and AI models: How to decide what to greenlight

May 08, 2023

Generative AI and Large Language Models (LLM) are here in a big way, and there are organizations keen to jump into the water, often without checking the depth first. However, many vendors have not done security reviews of their AI products at all, often basing their work on open source models.

When it comes to evaluating these AI products, cybersecurity professionals are stuck with the decision to approve or disapprove of them. Theoretically, it’s easy, but practically, it’s not so cut and dry. In this article, we’ll discuss these issues, as well as provide some advice on how to make the call on what to allow.

Table of contents

The issue: AI models being signed off without due diligence

Generative AI and LLMs (like ChatGPT) are free, they’re here, and people are excited about them. Why wouldn’t they be? They can see the potential for a tool to take some of the stress off their working lives: helping them write code, write business presentations, or come up with policy ideas. So it’s no wonder they come to security professionals, eager to have the AI tool signed off and approved for use. Again, it’s free, and what’s more important than that?

Right now, there are security professionals who simply say “Go ahead” and rubber-stamp them or are at a loss on how to even review these products. One of the biggest organizational concerns is about sensitive data getting leaked: will our users be putting client data or sensitive information into the LLM? To fix this, organizations simply take out the data concession which lets vendors use their data, mainly for training their AI, and then just onboard it.

The problem is, this only deals with the issue of sensitive data being actively shared with third parties. What about other threats, such as LLMs introducing code vulnerabilities into your products, or exploits in the cutting-edge service you are using?

A vulnerability example: Indirect Injection attacks on Bing Chat

One example of this is an “Indirect Prompt Injection”, a new class of vulnerabilities that affect language models like ChatGPT when interfaced with other applications. LLMs as a shared well can be “poisoned” by bad actors, so when a user requests something, compromised LLMs can be remote controlled or be used to exfiltrate or change user data, among other things.

Indirect injection attack diagram on LLMs

To learn more about this type of attack, you can read the paper or check out the GitHub repo which offers a proof of concept for the findings. But this is just one example of the type of security threats that these AI models can introduce. It’s one that many cybersecurity professionals may not be aware of, because the overwhelming narrative is about LLMs being dangerous in terms of sensitive data leaks, and bad actors using them to create phishing emails and malware.

The truth is, the technology is so new, so fringe, that every week something new is being discovered. Security professionals who are aware of the risks are left with a conundrum: what’s the balance between just letting everything in or not when it comes to AI? Being a blocker is a role that most cybersec pros are uncomfortably familiar with, and rarely relish.

How to evaluate if an AI model is safe for your organization to use

Alright, let’s stop talking about the problems, and start talking about the solutions. You’ve got an AI model you need to review, so how do you do that?

1. Check the AI model cards

A model card is a document that provides information about an AI model’s capabilities, limitations, and potential biases. They are designed to not only help consumers, but other developers and regulators as well. By using these model cards, security professionals can start to compare same-type AI services.

Some of the things you may find in a model card include:

  • Model details: The type of AI model used, data used to train the model, and the input and output formats

  • Intended use: How the model is meant to be used (and sometimes, what it shouldn’t be used for — more on that below)

  • Performance metrics: The accuracy and precision of the model, other relevant statistics

  • Ethical considerations: Any potential biases, limitations, or ethical considerations associated with the model

  • Data and privacy considerations: How the data used to train and test the model was collected, and how user privacy is protected

Model cards were proposed in a 2019 paper, and companies such as NVIDIA and Google are now using them. According to Daniel Rohrer, VP Software Product Security at NVIDIA, model cards are part of an initiative to be transparent about a model’s capabilities.

“There are model cards that say ‘Hey, this model is good for this, it’s not good for this, and it has never been tested for this’. For negative cases (where it hasn’t been tested)… we don’t know how it’s going to behave, and we’re being very clear about it,” he said during a recent keynote at RSA Conference 2023.

2. Check to see if it’s labeled as “research”

“Research” and “Product” are not the same thing, and there is a slippery slope between the two. Dr. Rumman Chowdhury, founder of Bias Buccaneers, gave a great explanation of this divide at the same RSAC keynote.

“Research can be put out into the world very incomplete, and that is the nature of research. But now because of the rapid pace of AI adoption, literally things that are called research are productized overnight, and pushed out as if they are products but still called research,” she said.

ChatGPT is research example

Dr. Chowdhury said it was likely a result of vendors sitting with their legal team, and their legal teams saying not to call it a product, because then there would be product liability. However, if it is called research, then there is no responsibility to make the product problem-free.

“(This issue) actually predates LLMs, we’re just seeing it explode with this adoption… It's open source being pushed out into the world, being pushed into product with insufficient review, oversight, and understanding, and is allowed to be out there and pushed under the hood into product because it is called research.”

The key takeaway? Be careful of any AI model that has the word “research” attached to it.

3. Evaluate their maturity via their security pages

Like any other product, evaluating AI models should still hit the same beats of supply chain diligence. Most companies have a page called “Security” or “Trust Center” in their footer (If they don’t, that’s kind of a warning sign). Check these pages, and start asking questions like:

  • Do they perform third-party penetration tests and audits?

  • Do they have vulnerability management?

  • Do they have a Secure Software Development Lifecycle? (SDLC)?

  • Do they have SOC2 Type 2 or ISO27001 certification? Are they GDP compliant?

  • Do they have a bug bounty program?

  • Do they offer Single Sign On (SSO), Two-factor or Mutli-Factor Authentication?

  • How do they say the data is handled? Do they encrypt data at rest, or perform end-to-end encryption (E2EE)?

… And similar indications that they take security seriously. Give them a rating based on what their security maturity level is.

4. Separate the near-term risks vs the long term risks

According to Vijay Bolina, the CISO of Deep Mind, not all risks are equal, and separating them into categories can be beneficial.

“I think it’s extremely important from the perspective of people exploring these technologies, or developing them from a product and service standpoint, to understand what their threat model is and take appropriate risk-based approaches or a risk management approach to addressing what could be considered near-term risks that need to be addressed versus long term emerging risks that maybe could be tabled for a bit,” he said.

5. Ask yourself if a LLM is actually needed to solve the business problem

Look, LLMs are great, but is it what you really need? According to Daniel Rohrer, the same problems can often be solved by using something other than a large language model.

“The very large models are interesting… but I think for the vast majority of use cases, you don’t even need or want these, because it’s very costly to run these huge models,” he said.

“You can have a much more narrowly constrained, task-specific model that I think would perform for a lot of use cases, and is actually more appropriate.”

For example, LLMs have a tendency to “hallucinate” and perpetuate falsehoods, which is not good for many business processes, Daniel said.

“If I ask the model the same thing three days apart, do I get a different answer? Like if I’ve got an AI copilot helping me in the Security Operations Center (SOC), I really want it to be consistent in its evaluation models of how it’s behaving in the system. Likewise, (there’s issues with) accuracy, toxicity, those other things you have to curate down with Reinforcement Learning from Human Feedback (RLHF).”

Daniel said while LLMs were a “great workplace amplifier for a lot of disciplines”, it was a question of finding the right fit for it.

How to mitigate the risks around AI models

There is no such thing as a perfect AI model. We won’t expect software to be bug-free, so why hold AI to a separate standard? It needs to meet a reasonable expectation, though. That’s where risk mitigation comes in.

No model is perfect, so build resilient processes around them

Daniel said there is often an overemphasis on what a model can do, rather than the system it is embedded into, which can be used to mitigate risk.

“There’s a lot of opportunity for processing on the front and back end of models, and building systems that even if the model has challenges… you can buffer out some of those consequences with really good engineering, and I think that’s true of any complex system,” he said.

He said when operationalizing AI and enforcing security, the mindset of engaging in preventative defense, but “assuming breach” is effective.

“You need to build systems where you assume drift. Even if they (AI models) were perfect today, and I see this of my teenager who brings me new words… LLM will need to learn new things, and it will inevitably be wrong at some point,” he said.

“And so if you shift your mind to that mindset, you start building the monitoring, the feedback loops, the human interventions into the system, which makes it robust and resilient in the face of what might be otherwise inconclusive. Because it can’t be perfect, it’s always moving.”

Keep humans in the loop

“I think there’s always going to be a human in the loop,” Vijay said. “To validate what comes from these systems, whether it’s a validation or a recommendation. I think that’s something we should all make sure we’re doing correctly.”

Conclusion: Don’t rubber-stamp AI, do your research

By shopping around and comparing AI models, you can present a reasoned argument to your stakeholders, and find the best possible fit for your organization in terms of security and meeting customer needs. That being said, the best possible fit, may be not at all, and that’s still a reasonable answer.


Michael Teske

Michael Teske is a Principal Author Evangelist-Cloud Engineer/Security at Pluralsight who loves helping people build out their skill toolkit. He has over 25 years of experience in the IT industry, including 17 of those years as an IT instructor focusing on Microsoft server infrastructure solutions and enterprise applications, including automation using PowerShell.