Kernkonzepte

VOIX ermöglicht es KI-Assistenten, mit Websites über eine einfache, aber leistungsstarke Architektur zu interagieren. Dieses Handbuch erklärt, wie die verschiedenen Teile zusammenarbeiten.

Überblick

VOIX besteht aus drei Hauptkomponenten, die zusammenarbeiten:

Ihre Website - Deklariert, was die KI tun kann, und stellt den aktuellen Zustand bereit
Chrome-Erweiterung - Überbrückt die Lücke zwischen Ihrer Website und der KI
Benutzer + KI - Natürlichsprachliche Schnittstelle zur Interaktion mit Ihrer Website

Wie es funktioniert

1. Website-Deklaration

Ihre Website deklariert Fähigkeiten mithilfe von HTML-Elementen:

Werkzeuge - Aktionen, die die KI ausführen kann
Kontext - Informationen zum aktuellen Zustand

html

<!-- Eine Aktion deklarieren -->
<tool name="create_task" description="Eine neue Aufgabe erstellen">
  <prop name="title" type="string" required/>
</tool>

<!-- Aktuellen Zustand bereitstellen -->
<context name="user">
  Name: John Doe
  Rolle: Admin
</context>

2. Erkennung durch die Erweiterung

Wenn ein Benutzer VOIX auf Ihrer Seite öffnet:

Die Erweiterung sucht nach allen <tool>- und <context>-Elementen
Sie erstellt einen Katalog der verfügbaren Aktionen und des aktuellen Zustands
Diese Informationen werden dem KI-Assistenten präsentiert

3. Benutzerinteraktion

Der Benutzer tippt oder spricht natürlich:

"Erstelle eine Aufgabe namens 'Pull-Requests überprüfen'"

4. KI-Verständnis

Die KI:

Liest die verfügbaren Werkzeuge und deren Beschreibungen
Versteht den aktuellen Kontext
Bestimmt, welches Werkzeug mit welchen Parametern verwendet werden soll

5. Werkzeugausführung

Wenn die KI beschließt, ein Werkzeug zu verwenden:

VOIX löst ein call-Ereignis auf dem Werkzeugelement aus
Ihr JavaScript-Handler empfängt die Parameter
Ihr Code führt die Aktion aus

javascript

document.querySelector('[name=create_task]').addEventListener('call', (e) => {
  const { title } = e.detail;
  // Erstellen Sie die Aufgabe in Ihrer Anwendung
  createTask(title);
});

Vorteile der Architektur

Für Entwickler

VOIX verwendet HTML-Elemente, um KI-Fähigkeiten zu definieren. Sie fügen <tool>- und <context>-Tags zu Ihren Seiten hinzu und hängen Ereignis-Listener an, um Werkzeugaufrufe zu behandeln. Es ist keine API-Integration oder SDK erforderlich. Der Ansatz funktioniert mit jedem JavaScript-Framework oder reinem JavaScript, und Sie können diese Elemente zu bestehenden Seiten hinzufügen, ohne anderen Code zu ändern.

Sie kontrollieren, auf welche Daten die KI zugreifen kann, indem Sie auswählen, was Sie in Ihre Werkzeug- und Kontextelemente aufnehmen. Ihre Website empfängt nur die Anfragen zur Werkzeugausführung mit ihren Parametern. Das Gespräch zwischen dem Benutzer und der KI bleibt privat - Sie sehen nie, was der Benutzer getippt oder wie er seine Anfrage formuliert hat.

Für Benutzer

Sie interagieren mit Websites über natürliche Sprache. Die KI liest die verfügbaren Werkzeuge und den aktuellen Kontext, um zu verstehen, welche Aktionen sie ausführen kann. Sie können Anfragen wie "lösche das dritte Element" oder "zeige nur aktive Aufgaben" stellen, und die KI wird die entsprechenden Werkzeuge mit den richtigen Parametern ausführen.

Sie konfigurieren Ihren eigenen KI-Anbieter in den Erweiterungseinstellungen. Dies kann OpenAI, Anthropic, lokal ausgeführtes Ollama oder jeder OpenAI-kompatible Endpunkt sein. Ihre Konversationsdaten gehen direkt von der Erweiterung zu Ihrem gewählten Anbieter. Die Website empfängt niemals Ihre Nachrichten, nur die resultierenden Werkzeugaufrufe, die ausgeführt werden müssen.

Die Rolle der einzelnen Teile

Die Rolle Ihrer Website

Werkzeuge deklarieren - Definieren, welche Aktionen möglich sind
Kontext bereitstellen - Informationen zum aktuellen Zustand teilen
Ereignisse behandeln - Aktionen ausführen, wenn Werkzeuge aufgerufen werden
Benutzeroberfläche aktualisieren - Änderungen in Ihrer Oberfläche widerspiegeln

Die Rolle der Erweiterung

Erkennung - Werkzeuge und Kontext auf der Seite finden
Kommunikation - Ihre Website mit der KI verbinden
Ereignisverteilung - Werkzeugaufrufe basierend auf KI-Entscheidungen auslösen
Datenschutz - Alle Daten lokal im Browser halten

Die Rolle des Benutzers

Natürliche Eingabe - Beschreiben, was sie erreichen möchten
Konversation - Anfragen bei Bedarf klären oder verfeinern
Überprüfung - Aktionen bei Bedarf bestätigen

Datenflussbeispiel

Verfolgen wir eine vollständige Interaktion, um zu sehen, wie die Daten durch das System fließen:

Benutzereingabe → KI-Verarbeitung → Werkzeugauswahl → Ereignisverteilung → Ihr Handler → UI-Aktualisierung

Benutzer besucht Ihre Aufgabenverwaltungs-App

html

<tool name="mark_complete" description="Eine Aufgabe als erledigt markieren">
  <prop name="taskId" type="string" required/>
</tool>

<context name="tasks">
  Aktive Aufgaben: 5
  Aufgaben-IDs: task-1, task-2, task-3, task-4, task-5
</context>

Benutzer öffnet VOIX und tippt
```
"Markiere Aufgabe-3 als erledigt"
```
KI verarbeitet die Anfrage
- Sieht das mark_complete-Werkzeug
- Liest den Kontext, der zeigt, dass Aufgabe-3 existiert
- Entscheidet, das Werkzeug mit taskId: "task-3" aufzurufen

Ihr Code behandelt das Ereignis

javascript

tool.addEventListener('call', (e) => {
  const { taskId } = e.detail;
  markTaskComplete(taskId);
  updateTaskList();
});

Benutzer sieht das Ergebnis
- Aufgabe in der Benutzeroberfläche als erledigt markiert
- Kontext wird automatisch aktualisiert
- Bereit für die nächste Interaktion

Jeder Schritt findet im Browser statt. Die einzige externe Abhängigkeit ist der KI-Anbieter, der vom Benutzer konfiguriert und als vertrauenswürdig eingestuft wird. Die Website fungiert rein als Fähigkeitsanbieter und sieht niemals das Gespräch zwischen dem Benutzer und seinem KI-Assistenten.

Nächste Schritte

Erfahren Sie mehr über Werkzeuge, um interaktive Fähigkeiten zu erstellen
Verstehen Sie Kontext zum Teilen des Anwendungszustands

This is the documentation for Voix. Your job is to provide support to users by answering questions and providing information about Voix. Be concise and helpful.  # Tools Tools declare actions that the AI can perform on your website. They define what the AI can do and what parameters it needs to provide. ## Basic Structure ```html <tool name="toolName" description="What this tool does"> <prop name="paramName" type="string" required/> </tool> ``` ## Tool Attributes ### `name` (required) - Unique identifier for the tool - Used in JavaScript event handlers - Examples: `create_task`, `search_products`, `update_status` ### `description` (required) - Clear explanation of what the tool does - Helps AI understand when to use the tool - Examples: "Create a new task", "Search for products", "Update user status" ### `return` (optional) - Indicates that the tool will return data to the AI - If present, VOIX waits for a `return` event before proceeding ## Defining Parameters Tools use `<prop>` elements to define parameters: ```html <tool name="create_meeting" description="Schedule a new meeting"> <prop name="title" type="string" required/> <prop name="date" type="string" description="Date in YYYY-MM-DD format" required/> <prop name="duration" type="number" description="Duration in minutes"/> </tool> ``` ### Parameter Attributes - `name` - Parameter identifier - `type` - Data type (`string`, `number`, `boolean`) - `description` - Explains the parameter, especially format requirements - `required` - Makes the parameter mandatory ### Parameter Descriptions Use descriptions to clarify format requirements or constraints: ```html <prop name="email" type="string" description="Valid email address" required/> <prop name="phone" type="string" description="Phone number in E.164 format"/> <prop name="startTime" type="string" description="Time in HH:MM format (24-hour)"/> <prop name="amount" type="number" description="Amount in USD (up to 2 decimal places)"/> ``` ### Arrays and Objects For complex data structures: ```html <tool name="invite_users" description="Invite users to project"> <array name="users" required> <dict> <prop name="email" type="string" description="Valid email address" required/> <prop name="role" type="string" description="Either 'viewer', 'editor', or 'admin'"/> </dict> </array> </tool> ``` ## Handling Tool Calls When the AI calls a tool, VOIX triggers a `call` event on the tool element: ```javascript document.querySelector('[name=search_products]').addEventListener('call', async (e) => { const { query } = e.detail; const results = await searchAPI(query); updateUI(results); }); ``` ## Tools That Return Data Sometimes a tool’s primary purpose is to *send information back* to the AI rather than (or in addition to) updating the UI. To enable this, add the `return` flag to the `<tool>` element and dispatch a **`return`** custom event. ```html <tool name="get_weather" description="Retrieve current weather data" return> <prop name="location" type="string" required/> </tool> ``` Adding the `return` attribute (no value needed) tells VOIX that the tool **will** emit data for the AI to consume. This means VOIX will wait for a `return` event before proceeding to respond. ### Dispatching a `return` Event Inside your event handler, package the data you want the AI to see in the event’s `detail`, then dispatch a `return` event on the same element: ```javascript document .querySelector('[name=get_weather]') .addEventListener('call', async (e) => { const { location } = e.detail; const weather = await fetchWeather(location); // your data-fetching logic // Send the weather data back to the AI e.target.dispatchEvent( new CustomEvent('return', { detail: weather }) ); // (Optional) update the UI for the user as usual updateWeatherWidget(weather); }); ``` This can also be used to return success or error messages: ```javascript document.querySelector('[name=submit_form]').addEventListener('call', async (e) => { try { const response = await submitForm(e.detail); // Notify AI of success e.target.dispatchEvent(new CustomEvent('return', { detail: { success: true, message: 'Form submitted successfully.' } })); } catch (error) { // Notify AI of failure e.target.dispatchEvent(new CustomEvent('return', { detail: { success: false, error: error.message } })); } }); ``` --- With the `return` flag and the `return` event, you can build tools that *query APIs, perform calculations, or gather context* and pass those results straight back to the AI—enabling richer, data-driven conversations. ## Common Examples ### Simple Action ```html <tool name="toggle_theme" description="Switch between light and dark theme"> </tool> <script> document.querySelector('[name=toggle_theme]').addEventListener('call', (e) => { document.body.classList.toggle('dark-theme'); }); </script> ``` ### Form Submission ```html <tool name="submit_contact" description="Submit contact form"> <prop name="name" type="string" required/> <prop name="email" type="string" description="Valid email address" required/> <prop name="message" type="string" description="Message content (max 500 characters)" required/> </tool> <script> document.querySelector('[name=submit_contact]').addEventListener('call', async (e) => { const { name, email, message } = e.detail; await fetch('/api/contact', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ name, email, message }) }); // Show success message to user showNotification('Contact form submitted!'); }); </script> ``` ### Search Function ```html <tool name="search_items" description="Search for items"> <prop name="query" type="string" required/> <prop name="category" type="string"/> </tool> <script> document.querySelector('[name=search_items]').addEventListener('call', async (e) => { const { query, category } = e.detail; const results = await performSearch(query, category); updateUI(results); }); </script> ``` ## Best Practices ### Use Clear Names Choose descriptive, action-oriented names: ```html  <tool name="create_task" description="Create a new task"> <tool name="delete_item" description="Delete an item">  <tool name="do_thing" description="Do something"> <tool name="action1" description="Action"> ``` ### Provide Helpful Descriptions Write descriptions that clearly explain what the tool does: ```html  <tool name="filter_products" description="Filter products by category and price">  <tool name="filter_products" description="Filter"> ``` ### Handle Errors Gracefully Always include error handling: ```javascript tool.addEventListener('call', async (e) => { try { // Your logic here await performAction(e.detail); } catch (err) { console.error('Tool execution failed:', err); showErrorMessage('Action failed. Please try again.'); } }); ``` ### Keep Tools Focused Each tool should do one thing well: ```html  <tool name="add_to_cart" description="Add item to shopping cart"> <tool name="remove_from_cart" description="Remove item from cart">  <tool name="manage_cart" description="Add, remove, or update cart"> ``` ## Framework Examples ### React ```jsx function TaskManager() { const toolRef = useRef(); useEffect(() => { const handleCall = async (e) => { const { title } = e.detail; await createTask(title); refreshTaskList(); }; toolRef.current?.addEventListener('call', handleCall); return () => toolRef.current?.removeEventListener('call', handleCall); }, []); return ( <tool ref={toolRef} name="create_task" description="Create a new task"> <prop name="title" type="string" required/> </tool> ); } ``` ### Vue ```vue <template> <tool name="update_status" @call="handleUpdate" description="Update status"> <prop name="status" type="string" required/> </tool> </template> <script setup> async function handleUpdate(e) { const { status } = e.detail; await updateStatus(status); refreshUI(); } </script> ``` ### Svelte ```svelte <script> async function handleAdd(e) { const { item } = e.detail; await addItem(item); showMessage(`${item} added`); } </script> <tool name="add_item" on:call={handleAdd} description="Add an item"> <prop name="item" type="string" required/> </tool> ``` ## Testing Tools Create a simple test page: ```html <!DOCTYPE html> <html> <body> <tool name="test_tool" description="Test tool for debugging"> <prop name="message" type="string" required/> </tool> <script> document.querySelector('[name=test_tool]').addEventListener('call', (e) => { console.log('Tool called with:', e.detail); alert(`Tool received: ${e.detail.message}`); }); </script> </body> </html> ``` Then: 1. Open the page with VOIX installed 2. Ask the AI to "use the test tool with message 'Hello'" 3. Check the console for output ## Summary Tools in VOIX are simple HTML elements that declare what actions the AI can perform. They use `<prop>` elements to define parameters and JavaScript event listeners to handle execution. Keep tools focused, provide clear descriptions, and always handle errors gracefully. ## Next Steps - Learn about [Context](./contexts.md) for providing state information - Review [Getting Started](./getting-started.md) for setup instructions   # The VOIX Vision ## A New Paradigm for Web-AI Interaction VOIX represents a fundamental shift in how we think about AI integration on the web. Rather than each website building its own chatbot or AI features, VOIX proposes a simple yet powerful idea: websites should declare what they can do, and users should choose how to interact with them. ## Core Principles ### 1. Websites as Capability Providers In the VOIX model, websites become **capability providers** rather than complete solutions. They declare: - **Tools**: Actions that can be performed (`<tool>` elements) - **Context**: Current state and information (`<context>` elements) This is similar to how websites already provide: - Semantic HTML for screen readers - RSS feeds for content aggregators - Open Graph tags for social media ### 2. User Sovereignty Users maintain complete control over: - **Which AI they use**: OpenAI, Anthropic, Google, or local models - **Their data**: Conversations never touch the website's servers - **Their experience**: Choose the interface that works best for them This mirrors how users already choose: - Their web browser (Chrome, Firefox, Safari) - Their email client (Gmail, Outlook, Thunderbird) - Their password manager (1Password, Bitwarden, browser built-in) ### 3. Decentralized Innovation VOIX is a standard. Just as anyone can build: - A web browser that reads HTML - An RSS reader that consumes feeds - A screen reader that interprets ARIA labels Anyone can build a VOIX-compatible interface that: - Discovers tools and context on websites - Connects to different AI providers - Offers unique user experiences ## The Future Web ### Standardization Imagine if `<tool>` and `<context>` became part of the HTML standard, joining elements like `<nav>`, `<article>`, and `<aside>`. Websites would naturally include AI capabilities as part of their semantic markup: ```html <!DOCTYPE html> <html> <head> <title>Online Store</title> </head> <body> <nav>...</nav>  <tool name="search_products" description="Search our catalog"> <prop name="query" type="string" required/> </tool> <tool name="add_to_cart" description="Add items to shopping cart"> <prop name="productId" type="string" required/> <prop name="quantity" type="number"/> </tool> <context name="cart_status"> Items in cart: 3 Total: $47.99 </context> <main>...</main> </body> </html> ``` ### Multiple Implementations Just as we have multiple web browsers, we could have multiple VOIX implementations: #### Browser-Native Integration - **Chrome with Gemini**: Google Chrome could integrate Gemini directly, discovering tools on every page - **Firefox with Local AI**: Mozilla could integrate local, privacy-focused models - **Safari with Apple Intelligence**: Apple could provide on-device AI processing #### Extension Ecosystem - **VOIX for Chrome**: The current implementation - **Claude Extension**: Anthropic could create an official Claude integration - **ChatGPT Companion**: OpenAI could offer their own interface - **Corporate Solutions**: Companies could build internal versions connected to their AI infrastructure #### Specialized Interfaces - **Accessibility-First**: Interfaces optimized for screen readers and voice control - **Developer Tools**: Versions with debugging capabilities and tool testing - **Mobile Apps**: Native applications that browse the web with AI assistance - **Command Line**: Terminal-based interfaces for power users ### Beyond Browsers The VOIX standard could extend beyond traditional web browsers: #### Smart Assistants Voice assistants like Alexa or Google Assistant could navigate websites using VOIX tools: - "Hey Google, order my usual from Pizza Palace" - "Alexa, check if the library has this book available" #### Automation Platforms Services like Zapier or IFTTT could use VOIX tools as triggers and actions: - When a product price drops, add it to cart - Every morning, generate a summary of new content ## Privacy by Architecture ### Data Flow In the VOIX model, data flows directly between: 1. **The user's browser** (reading page content) 2. **The user's chosen AI** (processing requests) 3. **Back to the browser** (executing tools) The website never sees: - What questions users ask - Their conversation history - Their AI preferences ### Trust Model Users need only trust: - Their chosen AI provider (which they already do) - Their browser (which they already do) - Open-source implementations can be audited Websites need only trust: - Standard web security (same-origin policy, CORS) - Their own tool implementations ## Economic Implications ### For Developers - **Lower barrier to AI features**: No need to integrate AI APIs - **Reduced costs**: No AI inference charges - **Simplified development**: Just declare capabilities - **Broader reach**: Works with any AI provider ### For AI Providers - **Expanded ecosystem**: Their AI can interact with any VOIX-enabled site - **Competitive marketplace**: Users choose based on quality - **Specialization opportunities**: Different AIs for different tasks ### For Users - **No lock-in**: Switch AI providers anytime - **Cost control**: Use free, paid, or local models as needed - **Privacy control**: Choose providers that align with their values ## Technical Evolution ### Current State - Browser extension implementation - Manual tool discovery via DOM scanning - Custom event system for tool execution ### Long-term Vision - Native browser support - W3C specification for tool and context elements - Integration with existing web standards (ARIA, Schema.org) ## Call to Action VOIX is more than a Chrome extension—it's a vision for how the web could work. To realize this vision, we need: ### Developers - Add VOIX markup to your websites - Experiment with tool patterns - Share feedback and use cases ### AI Companies - Build VOIX-compatible interfaces - Support the standardization effort - Innovate on user experiences ### Standards Bodies - Consider tool and context elements for HTML - Explore privacy-preserving AI integration - Build on existing semantic web efforts ### Users - Try VOIX and provide feedback - Demand AI features that respect privacy - Support websites that implement VOIX ## Conclusion VOIX envisions a web where: - Every website can offer AI capabilities - Every user controls their AI experience - Innovation happens at every layer - Privacy is guaranteed by architecture This isn't just about making websites work with AI—it's about preserving the open, decentralized nature of the web in the age of artificial intelligence. Just as the web democratized publishing, VOIX can democratize AI integration, ensuring that the future of web-AI interaction remains open, private, and user-controlled. The revolution doesn't require everyone to switch at once. It starts with a single `<tool>` tag, one website at a time, building toward a future where AI assistance is as natural and universal as hyperlinks—and just as decentralized.  # Getting Started with VOIX ## Installation ### Chrome Web Store Installation 1. Visit the [VOIX Chrome Extension](https://chromewebstore.google.com/detail/voix/agmhpolimgfdfnlgciajhbkdapkophie) on the Chrome Web Store 2. Click "Add to Chrome" 3. Confirm the installation when prompted 4. The VOIX icon will appear in your Chrome toolbar ### 1. Open VOIX Settings Click the VOIX extension icon in your toolbar and select "Options" or right-click the extension icon and choose "Options". ### 2. Configure Your AI Provider VOIX supports multiple AI providers. Choose one and configure it: #### OpenAI - **Base URL**: `https://api.openai.com/v1` - **API Key**: Your OpenAI API key (starts with `sk-`) - **Model**: `gpt-4` or `gpt-3.5-turbo` #### OpenAI-compatible (e.g. Azure OpenAI) - **Base URL**: Your OpenAI-compatible endpoint (e.g. `https://api.example.org/v1`) - **API Key**: Your API key - **Model**: Your model name #### Local (Ollama) - **Base URL**: `http://localhost:11434/v1` - **API Key**: Not required (leave empty) - **Model**: `qwen3`, `mistral`, or your installed model ### 3. Configure Voice Input (Optional) VOIX uses OpenAI's Whisper API for voice transcription: - **Language**: Select your preferred language or use "Auto-detect" - **Model**: `whisper-1` (default) - **Base URL**: Leave empty to use the same as your AI provider, otherwise specify another OpenAI-compatible endpoint - **API Key**: Your API key (if using a separate endpoint) ### 4. Test Your Configuration 1. Click "Test Connection" to verify your API settings 2. You should see a success message if everything is configured correctly 3. Save your settings ## Using VOIX ### Opening the Chat Panel - Click the VOIX icon in your toolbar - The chat interface will appear on the right side of your browser ### Basic Chat 1. Type your message in the input field 2. Press Enter or click the send button 3. VOIX will respond based on the current page context ### Voice Input 1. Click the microphone button 🎤 2. Speak your message 3. Click the microphone again to stop recording 4. Your speech will be transcribed ### Live Voice Mode 1. Click the live voice button 🎯 2. VOIX will continuously listen and respond 3. Click again to disable live mode ### Thinking Mode 1. Click the lightbulb button 💡 2. If compatible, the AI will respond after reasoning for some time 3. Useful for complex tasks ## Interacting with VOIX-Compatible Websites VOIX automatically detects compatible elements on websites: ### Tools When a website has `<tool>` elements, VOIX can: - Execute actions on the page - Fill forms - Click buttons - Extract data ### Context VOIX reads `<context>` elements to understand: - Current page state - User information - Application data > [!INFO] > **This documentation itself is an example of a VOIX-compatible page**, using `<tool>` to allow chat based navigation and `<context>` elements to provide the full API documentation. Try it out by asking questions like: > ```plaintext > "How do I use tools in VOIX?" > "Navigate to the context documentation" > "How do I integrate VOIX with Svelte?" > ``` ## Next Steps - Learn about [Core Concepts](./core-concepts.md) to understand how VOIX works - Explore [Demos](./demo-weather.md) to see VOIX in action  # Core Concepts VOIX enables AI assistants to interact with websites through a simple yet powerful architecture. This guide explains how the different parts work together. ## Overview VOIX consists of three main components working together: 1. **Your Website** - Declares what the AI can do and provides current state 2. **Chrome Extension** - Bridges the gap between your website and AI 3. **User + AI** - Natural language interface for interacting with your site ## How It Works ### 1. Website Declaration Your website declares capabilities using HTML elements: - **Tools** - Actions the AI can perform - **Context** - Current state information ```html  <tool name="create_task" description="Create a new task"> <prop name="title" type="string" required/> </tool>  <context name="user"> Name: John Doe Role: Admin </context> ``` ### 2. Extension Discovery When a user opens VOIX on your page: 1. The extension scans for all `<tool>` and `<context>` elements 2. It builds a catalog of available actions and current state 3. This information is presented to the AI assistant ### 3. User Interaction The user types or speaks naturally: ``` "Create a task called 'Review pull requests'" ``` ### 4. AI Understanding The AI: 1. Reads the available tools and their descriptions 2. Understands the current context 3. Determines which tool to use and with what parameters ### 5. Tool Execution When the AI decides to use a tool: 1. VOIX triggers a `call` event on the tool element 2. Your JavaScript handler receives the parameters 3. Your code performs the action ```javascript document.querySelector('[name=create_task]').addEventListener('call', (e) => { const { title } = e.detail; // Create the task in your application createTask(title); }); ``` ## Architecture Benefits ### For Developers VOIX uses HTML elements to define AI capabilities. You add `<tool>` and `<context>` tags to your pages and attach event listeners to handle tool calls. No API integration or SDK is required. The approach works with any JavaScript framework or vanilla JavaScript, and you can add these elements to existing pages without modifying other code. You control what data the AI can access by choosing what to include in your tool and context elements. Your website receives only the tool execution requests with their parameters. The conversation between the user and AI remains private - you never see what the user typed or how they phrased their request. ### For Users You interact with websites through natural language. The AI reads the available tools and current context to understand what actions it can perform. You can make requests like "delete the third item" or "show only active tasks" and the AI will execute the appropriate tools with the correct parameters. You configure your own AI provider in the extension settings. This can be OpenAI, Anthropic, Ollama running locally, or any OpenAI-compatible endpoint. Your conversation data goes directly from the extension to your chosen provider. The website never receives your messages, only the resulting tool calls that need to be executed. ## The Role of Each Part ### Your Website's Role 1. **Declare Tools** - Define what actions are possible 2. **Provide Context** - Share current state information 3. **Handle Events** - Execute actions when tools are called 4. **Update UI** - Reflect changes in your interface ### Extension's Role 1. **Discovery** - Find tools and context on the page 2. **Communication** - Connect your site with the AI 3. **Event Dispatch** - Trigger tool calls based on AI decisions 4. **Privacy** - Keep all data local in the browser ### User's Role 1. **Natural Input** - Describe what they want to accomplish 2. **Conversation** - Clarify or refine requests as needed 3. **Verification** - Confirm actions when necessary ## Data Flow Example Let's trace through a complete interaction to see how data flows through the system: ``` User Input → AI Processing → Tool Selection → Event Dispatch → Your Handler → UI Update ``` 1. **User visits your task management app** ```html <tool name="mark_complete" description="Mark a task as complete"> <prop name="taskId" type="string" required/> </tool> <context name="tasks"> Active tasks: 5 Task IDs: task-1, task-2, task-3, task-4, task-5 </context> ``` 2. **User opens VOIX and types** ``` "Mark task-3 as complete" ``` 3. **AI processes the request** - Sees the `mark_complete` tool - Reads the context showing task-3 exists - Decides to call the tool with taskId: "task-3" 4. **Your code handles the event** ```javascript tool.addEventListener('call', (e) => { const { taskId } = e.detail; markTaskComplete(taskId); updateTaskList(); }); ``` 5. **User sees the result** - Task marked as complete in the UI - Context automatically updates - Ready for the next interaction Each step happens in the browser. The only external dependency is the AI provider, which is configured and trusted by the user. The website acts purely as a capability provider, never seeing the conversation between the user and their AI assistant. ## Next Steps - Learn about [Tools](./tools.md) to create interactive capabilities - Understand [Context](./contexts.md) for sharing application state  # Svelte Integration Guide This guide covers patterns and best practices for integrating VOIX with Svelte applications. ## Configuration ### Global Styles Add these styles to hide VOIX elements from the UI: ```css /* app.css or global styles */ tool, prop, context, array, dict { display: none; } ``` ## Tools and Context Declare tools and context in your Svelte components using the `<tool>` and `<context>` elements. Use the `oncall` handle to execute tools. ```svelte <script lang="ts"> let count: number = $state(0) const increment = (n: number) => { if (n !== undefined) count += n else count += 1 } </script> <button onclick={() => increment(1)}> count is {count} </button> <tool name="increment_counter" description="Increments the counter by n" oncall={(event: CustomEvent) => increment(event.detail.n)}> <prop name="n" type="number" required></prop> </tool> <context name="counter"> The current count is {count}. </context> ``` ## Conditional Tools and Context You can conditionally render tools and context based on application state. Use Svelte's reactivity to manage visibility, for example to show admin tools only to authorized users: ```svelte <script lang="ts"> let isAdmin = false; const secretAdminAction = () => { console.log('Admin action performed'); }; </script> {#if isAdmin} <tool name="secret_admin_action" description="Perform admin action" oncall={secretAdminAction}> <prop name="secret_prop" type="string" required></prop> </tool> <context name="admin_info"> Admin tools are available. </context> {/if} <div> <button onclick={() => isAdmin = !isAdmin}> Toggle Admin Mode: {isAdmin ? 'On' : 'Off'} </button> </div> ``` ## Tools that fetch data You can also create tools that fetch data from APIs or perform asynchronous operations. ```svelte <script lang="ts"> const getWeather = async (location: string) => { try { const geoRes = await fetch(`https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(location)}`); const geoData = await geoRes.json(); if (!geoData.results?.length) { throw new Error('Location not found'); } const { latitude, longitude, name, country } = geoData.results[0]; const wxRes = await fetch(`https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m,weather_code&daily=weather_code,temperature_2m_max,temperature_2m_min&timezone=auto`); const wxData = await wxRes.json(); return { ...wxData, city: name, country }; } catch (e: any) { return { error: e.message }; } }; </script> <tool name="get_weather" description="Fetch weather by city name. For example: 'Berlin' or 'New York'." return oncall={async (event: CustomEvent) => { const { location } = event.detail; const data = await getWeather(location); event.target?.dispatchEvent(new CustomEvent('return', { detail: data })); }} > <prop name="location" type="string" description="City name to fetch weather for"></prop> </tool> ``` # React Integration Guide This guide covers patterns and best practices for integrating VOIX with React applications, including component-level tools, context management, and React-specific optimizations. ## Configuration ### TypeScript Configuration If using TypeScript, declare the custom elements in your type definitions: ```typescript // e.g. vite-env.d.ts declare namespace JSX { interface IntrinsicElements { 'context': React.DetailedHTMLProps<React.HTMLAttributes<HTMLElement> & { // Add any custom props here name?: string; }, HTMLElement>; // You can add more custom elements here if needed 'tool': React.DetailedHTMLProps<React.HTMLAttributes<HTMLElement> & { name: string; description?: string; return?: boolean; }, HTMLElement>; 'prop': React.DetailedHTMLProps<React.HTMLAttributes<HTMLElement> & { name: string; type: string; required?: boolean; description?: string; }, HTMLElement>; 'array': React.DetailedHTMLProps<React.HTMLAttributes<HTMLElement> & {}, HTMLElement>; 'dict': React.DetailedHTMLProps<React.HTMLAttributes<HTMLElement> & {}, HTMLElement>; } } ``` ### Global Styles Add these styles to hide VOIX elements from the UI: ```css /* index.css or global styles */ tool, prop, context, array, dict { display: none; } ``` ### Tool component You can create a reusable Tool component to encapsulate tool definitions so you can use the onCall prop to handle tool calls easier: ```tsx // src/components/Tool.tsx import React, { useRef, useEffect } from 'react'; // Define the type for the event handler prop type ToolCallEventHandler = (event: CustomEvent) => void; // Define the component's props type ToolProps = { name: string; description: string; onCall: ToolCallEventHandler; return?: boolean; children?: React.ReactNode; } & Omit<React.HTMLAttributes<HTMLElement>, 'onCall'>; export function Tool({ name, description, onCall, children, // Rename the 'return' prop to avoid conflict with the JS keyword return: returnProp, ...rest }: ToolProps) { const toolRef = useRef<HTMLElement | null>(null); useEffect(() => { const element = toolRef.current; if (!element || !onCall) return; const listener = (event: CustomEvent) => onCall(event); element.addEventListener('call', listener); return () => { element.removeEventListener('call', listener); }; }, [onCall]); // Re-attach the listener if the onCall function changes return ( <tool ref={toolRef} name={name} description={description} return={returnProp} {...rest} > {children} </tool> ); } ``` Then, you can use this component in your application like this: ```tsx import { useState } from 'react' import './App.css' import { Tool } from './Tool' function App() { const [count, setCount] = useState(0) function handleIncrement(event: Event) { const details = (event as CustomEvent).detail; console.log('Incrementing counter:', details); setCount((prevCount) => prevCount + details.n); } return ( <> <context name="counter"> The current count is {count} </context> <Tool name="increment_counter" description="Increments the counter by n" onCall={(event) => handleIncrement(event)}> <prop name="n" type="number" required description="The number to increment the counter by" /> </Tool> <button onClick={() => setCount((count) => count + 1)}> count is {count} </button> </> ) } export default App ``` React will likely warn you about the `tool`, `prop`, `array`, `dict` and `context` elements not being recognized. You can safely ignore these warnings: ``` The tag <tool> is unrecognized in this browser. If you meant to render a React component, start its name with an uppercase letter. ``` # Vue.js Integration Guide This guide covers advanced patterns and best practices for integrating VOIX with Vue.js applications, including component-level tools, dynamic context management, and Vue-specific optimizations. This guide uses the idiomatic `@call` syntax for handling tool events. ## Configuration ### Vite Configuration First, configure Vite to recognize VOIX custom elements. This step is crucial for Vue to correctly interpret the custom tags in your templates. ```javascript // vite.config.js import { defineConfig } from 'vite' import vue from '@vitejs/plugin-vue' export default defineConfig({ plugins: [ vue({ template: { compilerOptions: { // Tell Vue to ignore VOIX custom elements isCustomElement: (tag) => ['tool', 'prop', 'context', 'array', 'dict'].includes(tag) } } }) ] }) ```` ### Global Styles To ensure that VOIX elements do not interfere with your application's layout, add the following CSS to your global stylesheet. ```css /* App.vue or global styles */ tool, prop, context, array, dict { display: none; } ``` ## Component-Level Tools Defining tools within your components encapsulates functionality and keeps your code organized. By using the `@call` event binding, you can directly link a tool to its handler method in your script. ```vue <template> <div class="user-profile"> <h2>{{ user.name }}'s Profile</h2>  <tool :name="`update_${userId}_profile`" description="Update this user's profile" @call="handleUpdateProfile" > <prop name="field" type="string" required description="name, email, or bio"/> <prop name="value" type="string" required/> </tool> <tool :name="`toggle_${userId}_notifications`" description="Toggle email notifications" @call="handleToggleNotifications" > </tool>  <context :name="`user_${userId}_state`"> Name: {{ user.name }} Email: {{ user.email }} Bio: {{ user.bio }} Notifications: {{ user.notifications ? 'Enabled' : 'Disabled' }} </context>  <div class="profile-details"> <p>Email: {{ user.email }}</p> <p>Bio: {{ user.bio }}</p> <p>Notifications: {{ user.notifications ? 'On' : 'Off' }}</p> </div> </div> </template> <script setup> import { ref } from 'vue' const props = defineProps({ userId: { type: String, required: true } }) const user = ref({ name: 'John Doe', email: 'john@example.com', bio: 'Software developer', notifications: true }) // Tool handlers function handleUpdateProfile(event) { const { field, value } = event.detail if (field in user.value) { user.value[field] = value event.detail.success = true event.detail.message = `Updated ${field} to ${value}` } else { event.detail.success = false event.detail.error = `Invalid field: ${field}` } } function handleToggleNotifications(event) { user.value.notifications = !user.value.notifications event.detail.success = true event.detail.message = `Notifications ${user.value.notifications ? 'enabled' : 'disabled'}` } </script> ``` ## Advanced Patterns ### Unique Tool Names All tools in your application must have a unique `name`. When creating reusable components that contain tools, incorporate a unique identifier (like a prop) into the tool's name to prevent conflicts. ```vue <template> <div class="product-card">  <tool :name="`add_to_cart_${product.id}`" description="Add this product to the cart" @call="handleAddToCart" > <prop name="quantity" type="number" required/> </tool> <tool :name="`toggle_favorite_${product.id}`" description="Toggle the favorite status of this product" @call="handleToggleFavorite" > </tool> <h3>{{ product.name }}</h3> <button @click="addToCart">Add to Cart</button> </div> </template> <script setup> const props = defineProps({ product: { type: Object, required: true } }) function handleAddToCart(event) { const { quantity } = event.detail; // Your logic to add the product to the cart console.log(`Adding ${quantity} of product ${props.product.id} to cart.`); event.detail.success = true; event.detail.message = `Added ${quantity} of ${props.product.name} to your cart.`; } function handleToggleFavorite(event) { // Your logic to toggle favorite status console.log(`Toggling favorite for product ${props.product.id}.`); event.detail.success = true; } function addToCart() { // Your logic for the regular button click console.log(`Button clicked to add product ${props.product.id} to cart.`); } </script> ``` ### Conditional Tool Availability You can use `v-if` to conditionally render tools based on application state, such as user roles or permissions. This ensures that tools are only available to the AI when they are relevant and authorized. ```vue <template> <div>  <tool v-if="user.isAdmin" name="delete_users" description="Delete selected users" @call="handleDeleteUsers" > <prop name="userIds" type="array" required/> </tool>  <tool name="export_data" description="Export your data" @call="handleExportData" > <prop name="format" type="string" description="csv or json"/> </tool> <context name="permissions"> Role: {{ user.role }} Admin: {{ user.isAdmin ? 'Yes' : 'No' }} </context> </div> </template> <script setup> import { useUserStore } from '@/stores/user' const user = useUserStore() function handleDeleteUsers(event) { const { userIds } = event.detail; // Logic to delete users console.log('Deleting users:', userIds); event.detail.success = true; event.detail.message = `Successfully deleted ${userIds.length} users.`; } function handleExportData(event) { const { format } = event.detail; // Logic to export data console.log('Exporting data as:', format); event.detail.success = true; } </script> ``` ### Real-Time Updates Vue's reactivity system makes it easy to keep `<context>` elements up-to-date. Simply bind your reactive data within the context tag, and it will automatically update whenever the data changes. ```vue <template> <div class="dashboard">  <context name="metrics"> Users online: {{ onlineUsers }} Last update: {{ lastUpdate }} </context> <tool name="refresh_data" description="Manually refresh dashboard data" @call="refreshData" > </tool> <h3>Dashboard</h3> <p>Users Online: {{ onlineUsers }}</p> <p>Last Update: {{ lastUpdate }}</p> </div> </template> <script setup> import { ref, onMounted } from 'vue' const onlineUsers = ref(0) const lastUpdate = ref(new Date().toLocaleTimeString()) function refreshData(event) { onlineUsers.value = Math.floor(Math.random() * 100) lastUpdate.value = new Date().toLocaleTimeString() if (event) { event.detail.success = true; event.detail.message = "Dashboard data has been refreshed."; } } // Simulate real-time updates from a server onMounted(() => { setInterval(refreshData, 5000) }) </script> ```

Kernkonzepte ​

Überblick ​

Wie es funktioniert ​

1. Website-Deklaration ​

2. Erkennung durch die Erweiterung ​

3. Benutzerinteraktion ​

4. KI-Verständnis ​

5. Werkzeugausführung ​

Vorteile der Architektur ​

Für Entwickler ​

Für Benutzer ​

Die Rolle der einzelnen Teile ​

Die Rolle Ihrer Website ​

Die Rolle der Erweiterung ​

Die Rolle des Benutzers ​

Datenflussbeispiel ​

Nächste Schritte ​