Letting the Vampires in
I’ve been wanting to play around with Claude Code and Cowork for a while. I enjoy using LLMs in certain contexts where I’ve found their drawbacks to be less prevalent, and up until now I’ve been pretty content accessing them via the web client/mobile app to keep them off my local systems. However, it’s obvious there are some advantages to having the agentic variants more integrated with my systems. (I’m particularly thinking about using Cowork to help me organize some files on my NAS, and it might be cool to see if I can use Claude Code to automate more of my network diagram generation.)
I found this paper linked in a library repo posted by its author da5ch0 while browsing X the other day. After reading it (which, I gotta be honest, is a bit out of my depth), I feel a bit more comfortable mentally threat modelling these agents and understanding where I’m compromising by introducing them into my network.
From what I’ve gathered, the conclusion is to focus on the “thresholds” you’re letting them through. The author makes the point that the very act of letting one past these “thresholds” inherently introduces the power imbalance that’s foundational to the classic vampire stories.
If you can analogise setting up Defense-in-Depth measures with medieval castle-building, then I like to think this is akin to building a vampire prison in my castle and placing a vampire inside it. I wonder if in hindsight this will be as bad an idea as it’s often portrayed in fiction lol.
Note: Where the vampire analogy breaks down is that unlike vampires, LLMs don’t necessarily have their own will (depending on how you look at it I guess). The natural extension to this is “golem”, which represents the prompt injection vector well, as classically their intentions can be altered by rewriting their original inscription. So my resulting mental models are:
- Golems: for modelling how the LLM can be manipulated into using its capabilities.
  Countermeasures: control injection avenues (access to their inscriptions), i.e. inputs/sanitization and access control.
- Vampires: for modelling the extent of what they can do if compromised.
  Countermeasures: threshold controls (letting them in the room), layered isolation, access control, physical isolation (e.g. a Faraday cage).
UPDATE: I’ve been thinking about this more, and where my own castle prison analogy breaks down is that I don’t plan to leave this VM turned on 24/7 (largely because my Proxmox cluster resources are limited anyway). So unlike a vampire held in a castle, who is always present and able to observe anything that passes by, you could also look at them like a djinn in a lamp: dangerous when let out, but bound by the nature of its container and the whim of the summoner who deliberately chooses to summon it.
Initial setup
To get started I’ve made another Debian VM on my Proxmox cluster (which will probably change to a physical host if I decide to take the physical isolation aspect more seriously in future).
- Created a separate LLM user (Carmilla) for Claude Code + Cowork
- Created input, output, and workspace directories owned by Carmilla and added read and write access for my user
- Created a VAMPIRE VLAN dedicated to LLM agents
- Limited VLAN internet access so Claude can only reach Anthropic servers and nothing else (unless added explicitly)
- Avoided creating new inter-VLAN firewall rules — instead leveraged existing SERVER VLAN rules to remote-viewer into Proxmox and access the VM
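The "Anthropic servers and nothing else" rule above is a default-deny allowlist. The real rule lives on the firewall, so the snippet below is only a minimal sketch of the decision it makes, written in Python so it's easy to poke at. The `anthropic.com` suffix and the `egress_allowed` helper are assumptions for illustration; the exact set of hostnames Claude Code needs may well be different.

```python
# Hypothetical allowlist: the suffix below is an assumption for illustration,
# not a verified list of the endpoints Claude Code actually needs.
ALLOWED_SUFFIXES = frozenset({"anthropic.com"})


def egress_allowed(host: str, allowed_suffixes: frozenset = ALLOWED_SUFFIXES) -> bool:
    """Default-deny: allow a host only if it is an allowed domain or a subdomain of one."""
    # Normalise case, surrounding whitespace, and a trailing root dot.
    host = host.strip().rstrip(".").lower()
    if not host:
        return False
    # Match on a label boundary so "notanthropic.com" and
    # "anthropic.com.evil.example" are both rejected.
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in allowed_suffixes
    )


if __name__ == "__main__":
    for candidate in ("api.anthropic.com", "example.com", "anthropic.com.evil.example"):
        verdict = "ALLOW" if egress_allowed(candidate) else "DENY"
        print(f"{candidate}: {verdict}")
```

Anything not explicitly listed is denied, which matches the "unless added explicitly" part: opening up a new destination means adding an entry to the allowlist rather than removing a block.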
Next steps
I’ve got a lot else to prioritize right now, so for now I’m posting this largely to share my initial thoughts and da5ch0’s excellent writing. Like my previous posts, I’ll come back and update it when I’ve made further progress on this and landed on something I’m happy with.
- Decide on a method for file transfer to/from the host that ideally doesn’t require adding firewall rules