Show HN: An MCP Gateway to block the lethal trifecta

42 points by 76SlashDolphin 11 hours ago | 22 comments

Comments

I think the "lethal trifecta" framing is useful, and I'm glad that attempts are being made at this! But there are two big, hard-to-solve problems here:

1. The "lethal trifecta" is also the "productive trifecta" - people want to be able to use LLMs to operate in this space since that's where much of the value is: using private / proprietary data to interact with (do I/O with) the real world.

2. I worry that there will soon be (if there isn't already) a fourth leg to the stool - latent malicious training within the LLMs themselves. I know the AI labs are working on this, but trying to ferret out Manchurian Candidates embedded within LLMs may very well be the greatest security challenge of the next few decades.

sebastiennight 6 hours ago

I'm trying to wrap my head around this:

1. How are you defending against the case of one MCP poisoning your firewall LLM into incorrectly classifying other MCP tools?

2. How would you make sure the LLM shows the warning, given that LLMs are non-deterministic?

3. How clear do you expect MCP specs to be in order for your classification step to be trustworthy? To the best of my knowledge there is no spec that outlines how to "label" a tool for the 3 axes, so you've got another non-deterministic step here. Is "writing to disk" an external comm? It is if that directory is exposed to the web. How would you know?
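
To make the labeling question concrete, here is a minimal sketch of what a three-axis label and a trifecta check might look like. Everything in it (the `ToolLabel` class, its field names, the example tools) is hypothetical and is not taken from the gateway or from any MCP spec. Note that the `write_file` label has to be set by hand, which is exactly the ambiguity raised above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolLabel:
    """Hypothetical three-axis label for one MCP tool (not part of any MCP spec)."""
    name: str
    reads_private_data: bool
    ingests_untrusted_content: bool
    communicates_externally: bool


def completes_trifecta(labels):
    """Return True if the tools, taken together, cover all three axes."""
    labels = list(labels)
    return (
        any(t.reads_private_data for t in labels)
        and any(t.ingests_untrusted_content for t in labels)
        and any(t.communicates_externally for t in labels)
    )


# Assumed example labels. Whether write_file counts as external communication
# depends on deployment details (e.g. a web-exposed directory) that the tool
# description alone does not reveal.
read_inbox = ToolLabel("read_inbox", True, True, False)
write_file_local = ToolLabel("write_file", False, False, False)
write_file_web_exposed = ToolLabel("write_file", False, False, True)

print(completes_trifecta([read_inbox, write_file_local]))        # False
print(completes_trifecta([read_inbox, write_file_web_exposed]))  # True
```

The same tool name flips the result depending on one boolean that nothing in the tool's own description can confirm.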

aaronharnly 9 hours ago

"without risk", "solves", and "Guaranteed" are big words – you might want to temper them.

doctoboggan 8 hours ago

Wouldn't the LLM running in the gateway also be susceptible to the same jailbreaks?

pamelafox 6 hours ago

How do you determine if the tools access private data? Is it based solely on their tool description (which can be faked) or by trying them in a sandboxed environment or by analyzing the code?

datadrivenangel 6 hours ago

So is any combination of MCP servers basically going to require human in the loop approval for everything?

Sounds like it defeats the point.
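
One way approval could be limited, sketched under assumptions: track which legs a session has already touched and ask for a human only when a call would complete all three. The `Session` class, the leg names, and the example tools below are hypothetical and do not describe how this gateway actually behaves.

```python
LEGS = frozenset({"private_data", "untrusted_content", "external_comm"})


class Session:
    """Hypothetical per-session tracker of which trifecta legs have been touched."""

    def __init__(self):
        self.touched = set()

    def needs_approval(self, tool_legs):
        """True if running a tool with these legs would complete all three."""
        return (self.touched | set(tool_legs)) >= LEGS

    def record(self, tool_legs):
        """Mark the legs of a tool call that was allowed to run."""
        self.touched |= set(tool_legs) & LEGS


session = Session()
calls = [
    ("read_private_notes", {"private_data"}),
    ("fetch_web_page", {"untrusted_content"}),
    ("send_email", {"external_comm"}),
]
for name, legs in calls:
    if session.needs_approval(legs):
        print(f"{name}: ask a human")
    else:
        session.record(legs)
        print(f"{name}: allowed")
```

Under this policy only the call that would close the trifecta triggers a prompt. Whether that is good enough depends on how trustworthy the leg labels are in the first place.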

warthog 10 hours ago

I saw a hack using the WhatsApp MCP recently - this seems promising.