Privacy-Preserving LLM Inference with Hardware-Attested TEEs
Our research on deploying Large Language Model (LLM) inference within Trusted Execution Environments, verified through cryptographic remote attestation. We run self-hosted DeepSeek models on Azure Confidential VMs with Intel TDX.
Abstract
We present an open-source infrastructure for deploying Large Language Model (LLM) inference within Trusted Execution Environments (TEEs) with cryptographic remote attestation. Our implementation runs self-hosted DeepSeek models on Azure Confidential VMs with Intel TDX, providing hardware-enforced memory encryption and verifiable privacy guarantees.
We introduce a remote attestation API that enables clients to cryptographically verify TEE execution before submitting sensitive prompts. Our production deployment demonstrates practical feasibility: 12 tokens/second on a CPU-only TEE today, with a projected 150+ tokens/second on GPU TEEs using NVIDIA H100 Confidential Computing.
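As a concrete illustration of this verify-before-send flow, here is a minimal client sketch. The endpoint paths, JSON field names, and the single measurement comparison are our illustrative assumptions, not the API's actual contract; a production client must also validate the quote's signature chain up to Intel's root of trust (for example via Microsoft Azure Attestation or Intel Trust Authority).

```python
import requests  # third-party: pip install requests

# Hypothetical endpoints -- placeholders, not the project's documented API.
ATTESTATION_URL = "https://tee-host.example.com/v1/attestation"
INFERENCE_URL = "https://tee-host.example.com/v1/completions"

# Measurement of the approved TEE image, obtained out of band
# (e.g., by reproducibly building the image and recording its TD measurement).
EXPECTED_MEASUREMENT = "<approved-td-measurement-hex>"

def verify_then_prompt(prompt: str) -> str:
    # 1. Fetch attestation evidence from the service running inside the TEE.
    evidence = requests.get(ATTESTATION_URL, timeout=30).json()

    # 2. Refuse to proceed unless the reported measurement matches the
    #    approved image. (A real client also verifies the quote signature.)
    if evidence["measurement"] != EXPECTED_MEASUREMENT:
        raise RuntimeError("TEE measurement mismatch: refusing to send prompt")

    # 3. Only after verification succeeds, submit the sensitive prompt.
    resp = requests.post(INFERENCE_URL, json={"prompt": prompt}, timeout=120)
    resp.raise_for_status()
    return resp.json()["text"]
```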
The complete infrastructure, including Terraform configurations and attestation services, is available at github.com/AiAgenteq/TrustedGenAi.
Key Contributions
Production TEE-LLM Infrastructure
End-to-end LLM inference on Azure Confidential VMs with Intel TDX (a guest-side environment check is sketched after this list)
Remote Attestation API
Cryptographic proof of TEE execution for client verification
Open-Source Implementation
Complete Terraform configs, attestation services, and examples
Performance Benchmarks
Empirical measurements on CPU TEEs and projections for GPU TEEs
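For the infrastructure item above, here is a minimal guest-side sanity check. It assumes a Linux guest with the upstream TDX guest driver (kernel 6.2+), which exposes /dev/tdx_guest; it only confirms that the kernel detected TDX, since anything checked locally could be spoofed by a malicious host, and actual trust still comes from remote attestation.

```python
import os
import platform

def running_in_tdx_guest() -> bool:
    """Best-effort check that this Linux VM exposes the TDX guest device.

    /dev/tdx_guest is created by the Linux TDX guest driver; its presence
    indicates the kernel detected an Intel TDX environment. This is a
    sanity check only -- real trust comes from remote attestation.
    """
    return platform.system() == "Linux" and os.path.exists("/dev/tdx_guest")

if __name__ == "__main__":
    print("TDX guest device present:", running_in_tdx_guest())
```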
Why Trusted Execution Environments?
Hardware-Level Isolation
TEEs create isolated memory regions that even the cloud provider cannot access. Your prompts and model outputs remain encrypted in memory.
Cryptographic Verification
Remote attestation allows clients to verify the exact code running inside the TEE before sending sensitive data; a sketch of binding this verification to the connection itself follows this section.
Regulatory Compliance
Meet SEC, FINRA, HIPAA, and GDPR requirements for data protection while still leveraging powerful cloud AI.
Zero Trust Architecture
Trust is established through cryptographic proofs rooted in hardware, not contractual promises. Privacy is guaranteed by the cryptography, not by the provider's word.
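One way to make "verify before trust" concrete is to bind the attestation quote to the TLS channel, so that a verified quote also rules out a man-in-the-middle between the client and the TEE. The sketch below assumes, as our illustration rather than a documented feature of this project, that the in-TEE service writes sha256(its TLS certificate) into the 64-byte report_data field when requesting its TDX quote; parsing report_data out of a raw quote is left to standard quote-verification tooling.

```python
import hashlib
import ssl

TEE_HOST = "tee-host.example.com"  # hypothetical host
TEE_PORT = 443

def fetch_server_cert_hash(host: str, port: int) -> bytes:
    """Retrieve the server's TLS certificate and hash its DER encoding."""
    pem = ssl.get_server_certificate((host, port))
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).digest()

def quote_binds_channel(quote_report_data: bytes) -> bool:
    """Check that the quote's report_data commits to the TLS certificate.

    Assumes the in-TEE service placed sha256(server cert) into the first
    32 bytes of the 64-byte report_data field (zero-padded) when it
    requested the quote.
    """
    cert_hash = fetch_server_cert_hash(TEE_HOST, TEE_PORT)
    return quote_report_data[:32] == cert_hash
```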