Feature Proposal: Agent-VFS (Universal Internet-as-a-File Interface)
Description
I propose implementing Agent-VFS, a standardized virtual file system layer that abstracts heterogeneous internet services (HTTP APIs, Databases, Cloud Infrastructure) into a unified, POSIX-compliant file system structure.
Instead of requiring Agents to use diverse SDKs or write complex HTTP requests, this feature allows Agents to perform actions via standard file operations:
- Reading a file (`read()`) maps to GET requests or database queries.
- Writing to a file (`write()`) maps to POST/PUT requests or database updates.
- Listing a directory (`ls`/`readdir`) maps to resource discovery or API endpoint exploration.
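As a minimal sketch of the mapping listed above (the table and the `http_method_for` helper are hypothetical names, and the POST-versus-PUT rule is an assumption rather than part of the proposal), the translation could look like this:

```python
# Hypothetical lookup table: file operation -> HTTP method, following the list above.
FILE_OP_TO_HTTP = {
    "read": "GET",      # reading a file fetches the resource
    "readdir": "GET",   # listing a directory fetches a collection
}


def http_method_for(op: str, resource_exists: bool = False) -> str:
    """Return the HTTP method a file operation would translate to.

    Assumption: a write to an existing resource becomes PUT,
    a write that creates a resource becomes POST.
    """
    if op == "write":
        return "PUT" if resource_exists else "POST"
    try:
        return FILE_OP_TO_HTTP[op]
    except KeyError:
        raise ValueError(f"unsupported file operation: {op}") from None
```

Database backends would need an analogous table that maps the same operations to queries and updates.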
Motivation
Current software interaction interfaces (APIs) are designed for human developers, not AI Agents, leading to significant inefficiencies:
- Context Window Waste: Agents consume excessive tokens retrieving and reading API documentation, handling authentication (OAuth/JWT), and debugging schema errors.
- Fragmentation: Protocols (REST, GraphQL, gRPC) and formats (JSON, XML) vary widely, increasing the "cognitive load" and error rate for Agents.
- Lack of State Perception: Stateless API calls make it difficult for Agents to "sense" their environment compared to the spatial intuition provided by a file system (e.g., `cd` into a context, `ls` to see available tools).
By adopting an "Everything is a File" philosophy (similar to Plan 9), we can reduce context switching costs and allow Agents to utilize mature, language-agnostic file I/O libraries.
Proposed Solution
The solution involves building a Filesystem in Userspace (FUSE) daemon, preferably in Rust for performance and safety, that bridges local file operations and remote services.
1. Core Architecture
- FUSE Implementation: Use the FUSE kernel module to intercept system calls (`open`, `write`, `close`) and route them to a user-space daemon.
- Protocol: Map standard POSIX operations to network protocols. The daemon manages connection pooling, authentication injection, and payload serialization.
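To illustrate the authentication injection and payload serialization the daemon would handle, here is a rough Python sketch. It only builds the request object and sends nothing. The `build_request` helper, the bearer-token scheme, and the `api.example.com` host in the usage note are assumptions made for illustration, and connection pooling is left out.

```python
import json
import urllib.request


def build_request(base_url, path, method="GET", payload=None, token=None):
    """Translate one file operation into an (unsent) HTTP request.

    The caller (the hypothetical daemon) supplies the token, so the
    Agent never has to handle credentials itself.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        # Payload serialization: Python object -> JSON bytes.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token is not None:
        # Authentication injection (bearer scheme assumed).
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

For example, `build_request("https://api.example.com", "/repos/owner/repo/issues", method="POST", payload={"title": "Bug"}, token="...")` yields a POST request with the JSON body and the `Authorization` header already attached.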
2. Interaction Patterns
- REST-FS (API Mapping):
  - Map URL paths to directory structures (e.g., `/mnt/github/owner/repo/issues/`).
  - Data Retrieval: `cat .../issue/1/body` triggers a GET request and returns the content.
  - Creation (Magic Files): To create resources (POST), the Agent writes JSON to a special `.../new` file. Upon `close()`, the daemon sends the POST request and renames the file to the new Resource ID.
  - Querying: Use "Control Files" (e.g., writing "state=open" to a `query` file) to dynamically populate a results directory, handling pagination via lazy loading.
- DB-FS (Database Mapping):
  - Map databases to directories: `/mnt/postgres/db_name/table_name/row_id.json`.
  - Support transactional integrity by performing operations in a temporary "transaction directory" and committing via a specific trigger file.
- MCP Bridge:
  - Integrate the Model Context Protocol (MCP). The VFS acts as a generic adapter, mounting any existing MCP Server (e.g., Google Drive, Slack) as a local folder, instantly leveraging the MCP ecosystem.
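The REST-FS pattern above can be sketched in plain Python, without FUSE, to show the path mapping and the write-then-POST-on-close behaviour of a "Magic File". Everything named here is hypothetical: `API_BASE` is a placeholder host, `url_for` and `MagicNewFile` are illustrative names, and the `send` callable stands in for whatever HTTP client the daemon would use.

```python
import json
from urllib.parse import quote

MOUNT_ROOT = "/mnt/github"            # mount point used in the examples above
API_BASE = "https://api.example.com"  # placeholder host, not a real endpoint


def url_for(path: str) -> str:
    """Map a path under the mount point to a URL under API_BASE."""
    if path != MOUNT_ROOT and not path.startswith(MOUNT_ROOT + "/"):
        raise ValueError(f"{path!r} is outside {MOUNT_ROOT}")
    relative = path[len(MOUNT_ROOT):].strip("/")
    segments = [quote(s, safe="") for s in relative.split("/") if s]
    return "/".join([API_BASE, *segments])


class MagicNewFile:
    """Buffers writes to a '.../new' file and issues one POST on close()."""

    def __init__(self, path, send):
        if not path.endswith("/new"):
            raise ValueError("magic files must be named 'new'")
        self.collection_url = url_for(path[: -len("/new")])
        self._send = send       # callable(method, url, payload) -> resource id
        self._chunks = []
        self.resource_id = None

    def write(self, text):
        # Nothing is sent yet; the payload may arrive in several writes.
        self._chunks.append(text)
        return len(text)

    def close(self):
        # Parse the buffered JSON and create the resource in one request.
        payload = json.loads("".join(self._chunks))
        self.resource_id = self._send("POST", self.collection_url, payload)
        return self.resource_id
```

Buffering until `close()` means a partially written JSON document never reaches the API. A real daemon would also have to report a failed POST back to the caller, for instance as an error from `close()`.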
3. Usage Example (Python)
Instead of importing requests and handling headers:

    # Agent reads a GitHub issue body directly
    with open('/mnt/github/openai/gym/issues/1/body', 'r') as f:
        content = f.read()

    # Agent creates a new issue via "Magic File"
    with open('/mnt/github/openai/gym/issues/new', 'w') as f:
        f.write('{"title": "Bug Report", "body": "..."}')
    # VFS handles the POST request on file close

Alternatives Considered
- Python-based VFS (fusepy):
  - Pros: Easier ecosystem integration for AI libraries.
  - Cons: Rejected due to the Global Interpreter Lock (GIL) limiting concurrency and high memory overhead, which is detrimental when an Agent opens hundreds of "files" simultaneously.
- Language-Specific SDK Generation:
  - Pros: Native code execution.
  - Cons: Requires the Agent to write and debug code for thousands of different APIs. It does not solve the context window consumption caused by reading documentation.
Additional Context
- Security: The system should support "View Isolation" using Linux Namespaces to ensure Agents only access authorized API "directories." Read-only mounts should be used for untrusted Agents.
- Consistency: To prevent data loss during crashes, the implementation should support Direct I/O (bypassing kernel cache) or explicit `fsync()` requirements to ensure data is confirmed by the remote API before returning success.
- Vision: This is a foundational step toward a true "Agent OS," where the Agent perceives no browser or apps, only a file system representing the world.
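On the Consistency point above, this is roughly what an explicit `fsync()` requirement could look like from the Agent's side. The `write_and_confirm` helper is a hypothetical name, and the idea that the daemon blocks `fsync` until the remote API confirms is an assumption about how the proposal would be implemented.

```python
import os


def write_and_confirm(path: str, data: str) -> None:
    """Write data and return only after the filesystem acknowledges it.

    On an Agent-VFS mount, the daemon is assumed to hold the fsync call
    until the remote API has confirmed the write.
    """
    with open(path, "w", encoding="utf-8") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer to the OS
        os.fsync(f.fileno())  # ask the filesystem to confirm the write
```

If the remote call failed, the daemon could surface it as an `OSError` from `os.fsync`, so the Agent sees the failure instead of a silent loss.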