Agent-VFS (Universal Internet-as-a-File Interface) #519

@phodal

Feature Proposal: Agent-VFS (Universal Internet-as-a-File Interface)

Description

I propose implementing Agent-VFS, a standardized virtual file system layer that abstracts heterogeneous internet services (HTTP APIs, Databases, Cloud Infrastructure) into a unified, POSIX-compliant file system structure.

Instead of requiring Agents to use diverse SDKs or write complex HTTP requests, this feature allows Agents to perform actions via standard file operations:

  • Reading a file (read()) maps to GET requests or database queries.
  • Writing to a file (write()) maps to POST/PUT requests or database updates.
  • Listing a directory (ls/readdir) maps to resource discovery or API endpoint exploration.
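
The mapping above can be sketched as a small translation function. This is only an illustration, not a proposed implementation: the `translate` function, the `RemoteRequest` type, the operation table, and the `https://api.example.com` base URL are all assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical operation-to-method table; the proposal also allows PUT for writes.
OPERATION_TO_METHOD = {"read": "GET", "write": "POST", "readdir": "GET"}


@dataclass(frozen=True)
class RemoteRequest:
    method: str
    url: str


def translate(operation: str, path: str, mount_point: str, base_url: str) -> RemoteRequest:
    """Describe the remote request that a file operation on `path` would trigger."""
    mount_point = mount_point.rstrip("/")
    # Accept the mount point itself or anything strictly below it.
    if path != mount_point and not path.startswith(mount_point + "/"):
        raise ValueError(f"{path} is not under {mount_point}")
    relative = path[len(mount_point):].strip("/")
    base = base_url.rstrip("/")
    url = f"{base}/{relative}" if relative else base
    return RemoteRequest(OPERATION_TO_METHOD[operation], url)
```

For example, `translate("read", "/mnt/github/owner/repo/issues/1/body", "/mnt/github", "https://api.example.com")` would describe a GET against the corresponding path under the assumed base URL.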

Motivation

Current software interfaces (APIs) are designed for human developers, not AI Agents, leading to significant inefficiencies:

  • Context Window Waste: Agents consume excessive tokens retrieving and reading API documentation, handling authentication (OAuth/JWT), and debugging schema errors.
  • Fragmentation: Protocols (REST, GraphQL, gRPC) and formats (JSON, XML) vary widely, increasing the "cognitive load" and error rate for Agents.
  • Lack of State Perception: Stateless API calls make it difficult for Agents to "sense" their environment compared to the spatial intuition provided by a file system (e.g., cd into a context, ls to see available tools).

By adopting an "Everything is a File" philosophy (similar to Plan 9), we can reduce context switching costs and allow Agents to utilize mature, language-agnostic file I/O libraries.

Proposed Solution

The solution involves building a Filesystem in Userspace (FUSE) daemon, preferably in Rust for performance and memory safety, which acts as a bridge between local file operations and remote services.

1. Core Architecture

  • FUSE Implementation: Use the kernel's FUSE module to forward file operations on the mount point (open, read, write, close) to a user-space daemon.
  • Protocol: Map standard POSIX operations to network protocols. The daemon manages connection pooling, authentication injection, and payload serialization.
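
As a rough sketch of the protocol layer described above, the daemon could inject credentials and serialize payloads before sending, then translate HTTP status codes back into errno values for the caller. The status-to-errno table, the bearer-token scheme, and both function names are assumptions; the proposal does not specify any of them.

```python
import errno
import json
from typing import Any, Dict, Optional

# Assumed status-to-errno table; anything unlisted falls back to EIO.
STATUS_TO_ERRNO = {
    401: errno.EACCES,
    403: errno.EACCES,
    404: errno.ENOENT,
    409: errno.EEXIST,
    429: errno.EAGAIN,
}


def errno_for_status(status: int) -> int:
    """Return 0 for a 2xx response, otherwise an errno the daemon could report."""
    if 200 <= status < 300:
        return 0
    return STATUS_TO_ERRNO.get(status, errno.EIO)


def prepare_request(method: str, url: str, payload: Optional[Any], token: str) -> Dict[str, Any]:
    """Build a request description with injected auth and a JSON-serialized body."""
    headers = {"Authorization": f"Bearer {token}"}  # assumed bearer-token auth
    body = None
    if payload is not None:
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return {"method": method, "url": url, "headers": headers, "body": body}
```

Keeping credentials inside the daemon means the Agent never sees tokens; it only sees a file operation succeed or fail with a familiar error code.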

2. Interaction Patterns

  • REST-FS (API Mapping):
    • Map URL paths to directory structures (e.g., /mnt/github/owner/repo/issues/).
    • Data Retrieval: cat .../issue/1/body triggers a GET request and returns the content.
    • Creation (Magic Files): To create resources (POST), the Agent writes JSON to a special .../new file. Upon close(), the daemon sends the POST request and renames the file to the new Resource ID.
    • Querying: Use "Control Files" (e.g., writing "state=open" to a query file) to dynamically populate a results directory, handling pagination via lazy loading.
  • DB-FS (Database Mapping):
    • Map databases to directories: /mnt/postgres/db_name/table_name/row_id.json.
    • Support transactional integrity by performing operations in a temporary "transaction directory" and committing via a specific trigger file.
  • MCP Bridge:
    • Integrate the Model Context Protocol (MCP). The VFS acts as a generic adapter, mounting any existing MCP Server (e.g., Google Drive, Slack) as a local folder, instantly leveraging the MCP ecosystem.
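
The two REST-FS patterns above (the "Magic File" and the query "Control File") can be modelled in memory to make the intended behaviour concrete. This is a sketch under stated assumptions, not FUSE code: `MagicNewFile`, `apply_query`, the `post` callback, and the `key=value` query syntax are hypothetical names and choices made for illustration, and pagination is left out.

```python
import json
from typing import Callable, Dict, List
from urllib.parse import parse_qsl


class MagicNewFile:
    """In-memory stand-in for a '.../new' file: buffers writes, POSTs on close."""

    def __init__(self, directory: Dict[str, bytes], post: Callable[[dict], str]):
        self._directory = directory  # entry name -> file content
        self._post = post            # hypothetical callback that performs the POST
        self._buffer = bytearray()
        self._closed = False

    def write(self, data: bytes) -> int:
        if self._closed:
            raise ValueError("write to closed file")
        self._buffer.extend(data)
        return len(data)

    def close(self) -> str:
        if self._closed:
            raise ValueError("file already closed")
        payload = json.loads(self._buffer.decode("utf-8"))  # rejects invalid JSON
        resource_id = self._post(payload)
        # "Rename" the new file: expose the content under the returned ID.
        self._directory[resource_id] = bytes(self._buffer)
        self._closed = True
        return resource_id


def apply_query(control_text: str, resources: Dict[str, dict]) -> List[str]:
    """Return the IDs whose fields match every key=value pair in the control text."""
    filters = dict(parse_qsl(control_text.strip()))
    return sorted(
        rid for rid, fields in resources.items()
        if all(str(fields.get(key)) == value for key, value in filters.items())
    )
```

In this model, writing `state=open` to the query file corresponds to calling `apply_query("state=open", resources)`, and the returned IDs are the entries that would appear in the results directory.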

3. Usage Example (Python)

Instead of importing requests and handling headers:

# Agent reads a GitHub issue body directly
with open('/mnt/github/openai/gym/issues/1/body', 'r') as f:
    content = f.read()

# Agent creates a new issue via "Magic File"
with open('/mnt/github/openai/gym/issues/new', 'w') as f:
    f.write('{"title": "Bug Report", "body": "..."}')
# The VFS sends the POST request when the file is closed at the end of the with block
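
Because the POST is deferred until the file is closed, a failed request can only reach the Agent as an error from the write or the close. The sketch below assumes the daemon reports such failures as an `OSError`; that behaviour and the `create_resource` helper are assumptions, not part of the proposal.

```python
import json


def create_resource(new_file_path: str, payload: dict) -> bool:
    """Write a JSON payload to a 'new' file; return False if open, write, or close fails."""
    try:
        # Leaving the with block closes the file, which is when the POST would be sent.
        with open(new_file_path, "w", encoding="utf-8") as f:
            json.dump(payload, f)
    except OSError as exc:
        print(f"create failed: {exc}")
        return False
    return True
```

Serializing with `json.dump` rather than a hand-written string also avoids sending malformed JSON to the remote API.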

Alternatives Considered

  • Python-based VFS (fusepy):
    • Pros: Easier ecosystem integration for AI libraries.
    • Cons: Rejected due to the Global Interpreter Lock (GIL) limiting concurrency and high memory overhead, which is detrimental when an Agent opens hundreds of "files" simultaneously.
  • Language-Specific SDK Generation:
    • Pros: Native code execution.
    • Cons: Requires the Agent to write and debug code for thousands of different APIs. It does not solve the context window consumption caused by reading documentation.

Additional Context

  • Security: The system should support "View Isolation" using Linux Namespaces to ensure Agents only access authorized API "directories." Read-only mounts should be used for untrusted Agents.
  • Consistency: To prevent data loss during crashes, the implementation should either support Direct I/O (bypassing the kernel page cache) or require an explicit fsync(), so that a write reports success only after the remote API has confirmed it.
  • Vision: This is a foundational step toward a true "Agent OS," where the Agent perceives no browser or apps, only a file system representing the world.
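
For the Consistency point above, the Agent-side half of an explicit fsync() contract might look like the following sketch. It assumes the daemon would hold fsync() until the remote API has acknowledged the data; `durable_write` is a hypothetical helper, and on an ordinary local file system the same code only flushes to disk.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Assumed contract: the VFS returns from fsync() only after remote confirmation.
        os.fsync(fd)
    finally:
        os.close(fd)
```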
