Playing Safe With AI

Everything related to generative AI is evolving incredibly fast. AI chatbots, tools, assistants, services and agents are becoming more capable every day. With all this buzz and excitement it's easy to lean to one side of the convenience versus safety trade-off.

This rapidly expanding space creates many opportunities for the bad guys to exploit weaknesses in security and data privacy. This could affect your personal data, your finances, or your company's data.

Being aware of what could go wrong is your best defense. In this article, we'll go through the most common risks, with recommendations to help you play safe with AI.

AI Safety

Those Terms & Conditions Matter

Most generative AI service providers offer different plans: free versions with limited capability and paid pro/business/enterprise tiers with advanced features. While paid services typically include robust data protection measures, free services often come with a hidden cost. Companies offering free AI services need something in return, and their terms of service often state they may use your data to improve their AI models.

Consider this scenario: you use a free AI service to create a product roadmap for your management team. If the service provider trains their next AI model on that information, your competitors could gain insight into your business strategy the next time they use that service for competitor analysis.

Improper usage of AI services can also lead to violations of data privacy regulations like GDPR, HIPAA, and data residency laws.

Minimise the Risk

Prompt Injection

Prompt injection is arguably the most serious and fundamental vulnerability facing AI systems today, affecting every application category from chatbots to agentic web browsers. These attacks exploit a core weakness: Large Language Models (LLMs) struggle to distinguish between legitimate prompts from trusted users and malicious instructions hidden in untrusted external content.

Since LLMs process both your prompts and external content as plain text in the same context window, it's quite easy for attackers to inject malicious instructions through websites, emails, PDFs, APIs, MCP server responses, or even images.

How Attackers Hide Instructions

Attackers use various subtle methods to conceal malicious instructions:

When an AI agent processes seemingly harmless content containing hidden instructions, it can execute dangerous actions, such as running commands with your permissions or leaking your private data.

Minimise the Risk

Using MCP Servers

The Model Context Protocol (MCP) is a standardised framework for connecting Large Language Models (LLMs) to other systems and data sources. Often described as the "USB-C for AI applications," it allows AI agents to access information and execute commands on your behalf. MCP has been rapidly adopted since its introduction, but it was designed primarily for functionality, not robust security, creating numerous security blind spots (although the spec is evolving).

When an AI agent uses an MCP server, it acts on your behalf with all of your permissions. If an AI misinterprets a request, it might execute MCP tools that cause unintended consequences. For example, a request to remove old database records could result in deleting all data.

MCP server responses can also contain malicious instructions (see Prompt Injection above).

It's incredibly tempting to connect everything together without properly considering the security and data privacy risks, unintentionally giving AI agents excessive agency.

"it's probably a certainty that probabilistic systems will bite you undeterministically"

It's like handing a bored kid your phone to play a game, next thing you know, they're deleting your photos and messaging your boss!

Key Risks

MCP servers are deployed either locally on your laptop or remotely on a server. Both configurations present significant risks:

Minimise the Risk

IT and software engineering teams must implement stringent security practices when deploying MCP servers:

Agentic Frameworks

Agentic systems leverage LLMs and orchestration frameworks to autonomously or semi-autonomously perform tasks, make decisions, and interact with external systems. They combine multiple components including language models, tools, orchestration layers, access protocols (MCP), and agent-to-agent (A2A) communication.

The Autonomy Challenge

The more autonomous the agent, the higher the potential safety risk. This creates a familiar trade-off between convenience and security, introducing specific risks and vulnerabilities:

Minimise the Risk

AI Web Browsers

AI-enabled web browsers and browser extensions are of particular concern. They enable agentic browsing, allowing the AI to navigate websites, fill forms, click buttons, and complete multi-step tasks on your behalf. These tools create a wide attack surface because they get unprecedented access to your digital life, including login credentials, browsing history, and cookies.

Key Risks

Minimise the Risk

Conclusion

AI is transforming how we work and play. The goal is not to discourage the use of AI, but to harness its benefits while minimising the risks. By reviewing the critical areas of AI security and data privacy, two key themes emerge for building safe AI practices:

Risk Matrix
Here is simple way to think about the risks when using AI:

Finally, it’s important to stay informed as the AI landscape is continuously evolving.

Further Reading