# API Concept for Laravel LLM Gateway

**Version:** 1.0
**Date:** 2025-11-19
**Status:** 📋 Concept

---

## 🎯 Objectives

Develop a modern RESTful API for the Laravel LLM Gateway with the following main goals:

- **Self-service:** Gateway users can query all relevant information themselves
- **Transparency:** Full insight into usage, costs, and budget
- **Provider-agnostic:** A uniform interface for all LLM providers
- **OpenAPI documentation:** Complete Swagger/Scramble documentation

---

## 📊 Current Situation

### ✅ What works:

- Chat completions endpoint (`/api/chat/completions`)
- API key authentication
- Basic user info endpoint

### ❌ What is missing:

- Provider discovery (which providers are available?)
- Model discovery (which models exist per provider?)
- Budget management endpoints
- Usage statistics endpoints
- Pricing information endpoints
- Credentials management for gateway users
- Rate limit information

---

## 🏗️ Proposed API Structure

### Overview of Endpoint Categories

```
/api
├── /providers      # Provider discovery
├── /models         # Model discovery
├── /credentials    # Manage own credentials
├── /chat           # LLM interaction
├── /budget         # Budget management
├── /usage          # Usage statistics
├── /pricing        # Model pricing
└── /account        # Account information
```

---

## 📋 Detailed API Specification

### 1. Provider Management

#### `GET /api/providers`

Lists all supported LLM providers.
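A minimal Python sketch of how a client might build an authenticated request to this endpoint, using the Bearer header described in the authentication section. The base URL `https://gateway.example.com`, the helper name `build_request`, and the sample key are illustrative assumptions; the request is only constructed here, not sent.

```python
import urllib.request

# Assumption: the gateway's base URL; replace with the real host.
BASE_URL = "https://gateway.example.com"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the gateway API key as a Bearer token."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical key, for illustration only.
request = build_request("/api/providers", "gw_example_key")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send the request; the same helper could serve the other `GET` endpoints by changing the path.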
**Response:**

```json
{
  "data": [
    {
      "id": "openai",
      "name": "OpenAI",
      "status": "available",
      "has_credentials": true,
      "credentials_status": "active",
      "last_tested": "2025-11-19T10:30:00Z",
      "supported_features": ["chat", "streaming"],
      "models_count": 12
    },
    {
      "id": "anthropic",
      "name": "Anthropic",
      "status": "available",
      "has_credentials": true,
      "credentials_status": "active",
      "last_tested": "2025-11-19T10:25:00Z",
      "supported_features": ["chat", "streaming"],
      "models_count": 8
    },
    {
      "id": "gemini",
      "name": "Google Gemini",
      "status": "available",
      "has_credentials": false,
      "credentials_status": null,
      "last_tested": null,
      "supported_features": ["chat"],
      "models_count": 6
    }
  ]
}
```

**Use Cases:**

- Dashboard: Display provider status
- Setup: Which providers can I use?
- Integration: Which features does a provider support?

---

#### `GET /api/providers/{provider}`

Details on a specific provider.

**Response:**

```json
{
  "data": {
    "id": "openai",
    "name": "OpenAI",
    "description": "OpenAI GPT Models",
    "status": "available",
    "has_credentials": true,
    "credentials_status": "active",
    "credentials": {
      "api_key_format": "sk-...",
      "organization_id_required": false,
      "last_tested": "2025-11-19T10:30:00Z",
      "test_status": "success"
    },
    "supported_features": ["chat", "streaming", "function_calling"],
    "api_documentation": "https://platform.openai.com/docs/api-reference",
    "models": [
      {
        "id": "gpt-4-turbo",
        "name": "GPT-4 Turbo",
        "context_window": 128000,
        "max_output_tokens": 4096,
        "supports_streaming": true,
        "supports_function_calling": true,
        "pricing": {
          "input_per_1k": 0.01,
          "output_per_1k": 0.03,
          "currency": "USD"
        }
      }
    ],
    "rate_limits": {
      "requests_per_minute": 10000,
      "tokens_per_minute": 2000000
    },
    "statistics": {
      "total_requests": 1250,
      "total_cost": 45.67,
      "total_tokens": 2500000,
      "last_used": "2025-11-19T11:45:00Z"
    }
  }
}
```

---

### 2. Model Discovery

#### `GET /api/models`

Lists all available models across all providers.
**Query Parameters:**

- `provider` (optional): Filter by provider
- `supports_streaming` (optional): Only streaming-capable models
- `max_price` (optional): Maximum price per 1k tokens
- `min_context` (optional): Minimum context window size
- `sort` (optional): `price`, `context`, `popularity`

**Response:**

```json
{
  "data": [
    {
      "id": "gpt-4-turbo",
      "provider": "openai",
      "provider_name": "OpenAI",
      "name": "GPT-4 Turbo",
      "description": "Most capable GPT-4 model",
      "context_window": 128000,
      "max_output_tokens": 4096,
      "supports_streaming": true,
      "supports_function_calling": true,
      "supports_vision": true,
      "pricing": {
        "input_per_1k_tokens": 0.01,
        "output_per_1k_tokens": 0.03,
        "currency": "USD"
      },
      "availability": "available"
    },
    {
      "id": "claude-3-5-sonnet-20241022",
      "provider": "anthropic",
      "provider_name": "Anthropic",
      "name": "Claude 3.5 Sonnet",
      "description": "Most intelligent Claude model",
      "context_window": 200000,
      "max_output_tokens": 8192,
      "supports_streaming": true,
      "supports_function_calling": true,
      "supports_vision": true,
      "pricing": {
        "input_per_1k_tokens": 0.003,
        "output_per_1k_tokens": 0.015,
        "currency": "USD"
      },
      "availability": "available"
    }
  ],
  "meta": {
    "total": 42,
    "filtered": 2,
    "providers_count": 5
  }
}
```

**Use Cases:**

- Model selection: Find the best model for a use case
- Price comparison: Determine the cheapest option
- Feature check: Which models support vision/functions?

---

#### `GET /api/models/{provider}/{model}`

Detailed information about a specific model.
**Response:**

```json
{
  "data": {
    "id": "gpt-4-turbo",
    "provider": "openai",
    "provider_name": "OpenAI",
    "name": "GPT-4 Turbo",
    "full_name": "OpenAI GPT-4 Turbo",
    "description": "Most capable GPT-4 model with improved instruction following",
    "release_date": "2024-04-09",
    "deprecation_date": null,
    "status": "active",
    "capabilities": {
      "context_window": 128000,
      "max_output_tokens": 4096,
      "supports_streaming": true,
      "supports_function_calling": true,
      "supports_vision": true,
      "supports_json_mode": true,
      "training_data_cutoff": "2023-12"
    },
    "pricing": {
      "input_per_1k_tokens": 0.01,
      "output_per_1k_tokens": 0.03,
      "currency": "USD",
      "last_updated": "2024-11-01T00:00:00Z"
    },
    "performance": {
      "avg_response_time_ms": 1250,
      "p95_response_time_ms": 2800,
      "success_rate": 99.8
    },
    "your_usage": {
      "total_requests": 145,
      "total_tokens": 250000,
      "total_cost": 3.75,
      "avg_tokens_per_request": 1724,
      "last_used": "2025-11-19T11:30:00Z"
    },
    "documentation": "https://platform.openai.com/docs/models/gpt-4-turbo",
    "best_for": ["complex reasoning", "code generation", "analysis"]
  }
}
```

---

### 3. Credentials Management

#### `GET /api/credentials`

Lists all of the user's own provider credentials.
**Response:**

```json
{
  "data": [
    {
      "id": 1,
      "provider": "openai",
      "provider_name": "OpenAI",
      "api_key_preview": "sk-proj-...xyz",
      "organization_id": null,
      "is_active": true,
      "status": "verified",
      "last_used": "2025-11-19T11:45:00Z",
      "last_tested": "2025-11-19T10:30:00Z",
      "test_result": {
        "status": "success",
        "message": "Connection successful",
        "tested_at": "2025-11-19T10:30:00Z"
      },
      "created_at": "2025-11-10T08:00:00Z",
      "updated_at": "2025-11-19T10:30:00Z"
    },
    {
      "id": 2,
      "provider": "anthropic",
      "provider_name": "Anthropic",
      "api_key_preview": "sk-ant-...abc",
      "organization_id": null,
      "is_active": true,
      "status": "verified",
      "last_used": "2025-11-19T11:30:00Z",
      "last_tested": "2025-11-19T10:25:00Z",
      "test_result": {
        "status": "success",
        "message": "Connection successful",
        "tested_at": "2025-11-19T10:25:00Z"
      },
      "created_at": "2025-11-12T09:00:00Z",
      "updated_at": "2025-11-19T10:25:00Z"
    }
  ]
}
```

---

#### `POST /api/credentials`

Adds new credentials for a provider.

**Request:**

```json
{
  "provider": "openai",
  "api_key": "sk-proj-abc123...",
  "organization_id": null,
  "test_connection": true
}
```

**Response:**

```json
{
  "data": {
    "id": 3,
    "provider": "openai",
    "provider_name": "OpenAI",
    "api_key_preview": "sk-proj-...xyz",
    "organization_id": null,
    "is_active": true,
    "status": "verified",
    "test_result": {
      "status": "success",
      "message": "Connection successful",
      "model_tested": "gpt-3.5-turbo"
    },
    "created_at": "2025-11-19T12:00:00Z"
  },
  "message": "Credentials successfully added and verified"
}
```

---

#### `PUT /api/credentials/{id}`

Updates existing credentials.

**Request:**

```json
{
  "api_key": "sk-proj-new-key-...",
  "organization_id": "org-123",
  "is_active": true,
  "test_connection": true
}
```

---

#### `DELETE /api/credentials/{id}`

Deletes credentials.

**Response:**

```json
{
  "message": "Credentials successfully deleted"
}
```

---

#### `POST /api/credentials/{id}/test`

Tests credentials without changing them.
**Response:**

```json
{
  "status": "success",
  "message": "Connection successful",
  "details": {
    "provider": "openai",
    "model_tested": "gpt-3.5-turbo",
    "response_time_ms": 245,
    "tested_at": "2025-11-19T12:05:00Z"
  }
}
```

---

### 4. Chat Completions (Extended)

#### `POST /api/chat/completions`

LLM chat completion request (OpenAI-compatible).

**Request:**

```json
{
  "model": "gpt-4-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 500,
  "stream": false
}
```

**Response:**

```json
{
  "id": "req_abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4-turbo",
  "provider": "openai",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 8,
    "total_tokens": 23
  },
  "cost": {
    "input_cost": 0.00015,
    "output_cost": 0.00024,
    "total_cost": 0.00039,
    "currency": "USD"
  },
  "metadata": {
    "gateway_request_id": "req_abc123",
    "response_time_ms": 1245,
    "budget_remaining": 98.75
  }
}
```

---

### 5. Budget Management

#### `GET /api/budget`

Retrieves the current budget and limits.
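The response of this endpoint carries several derived fields (`remaining_budget`, `budget_percentage`, and the `alerts` flags). A hedged sketch of how they might be computed from the total and used budget follows; the function name `budget_status` and the rule that a flag is set once usage reaches its threshold are assumptions, not part of this concept.

```python
def budget_status(total: float, used: float) -> dict:
    """Derive remaining budget, usage percentage and alert flags."""
    percentage = round(used / total * 100, 2) if total > 0 else 0.0
    return {
        "remaining_budget": round(total - used, 2),
        "budget_percentage": percentage,
        "alerts": {
            # Assumption: a flag is set as soon as usage reaches the threshold.
            "threshold_50_percent": percentage >= 50,
            "threshold_75_percent": percentage >= 75,
            "threshold_90_percent": percentage >= 90,
        },
    }


# Values taken from the sample budget of 100.00 with 45.67 used.
status = budget_status(total=100.00, used=45.67)
print(status)
```

With the sample values, no threshold is reached, which agrees with the all-`false` alert flags in the sample response. The `projected_spend` field is left out on purpose because the concept does not state how the projection is calculated.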
**Response:**

```json
{
  "data": {
    "total_budget": 100.00,
    "used_budget": 45.67,
    "remaining_budget": 54.33,
    "budget_percentage": 45.67,
    "currency": "USD",
    "period": "monthly",
    "period_start": "2025-11-01T00:00:00Z",
    "period_end": "2025-11-30T23:59:59Z",
    "days_remaining": 11,
    "projected_spend": 95.34,
    "projected_overspend": false,
    "limits": {
      "daily_limit": 10.00,
      "daily_used": 2.45,
      "daily_remaining": 7.55,
      "request_limit_per_minute": 100,
      "token_limit_per_request": 10000
    },
    "alerts": {
      "threshold_50_percent": false,
      "threshold_75_percent": false,
      "threshold_90_percent": false,
      "approaching_daily_limit": false
    },
    "breakdown_by_provider": [
      {
        "provider": "openai",
        "provider_name": "OpenAI",
        "spent": 25.30,
        "percentage": 55.4,
        "requests": 850
      },
      {
        "provider": "anthropic",
        "provider_name": "Anthropic",
        "spent": 20.37,
        "percentage": 44.6,
        "requests": 400
      }
    ],
    "updated_at": "2025-11-19T12:00:00Z"
  }
}
```

**Use Cases:**

- Dashboard: Display budget status
- Alerts: Warn when the budget is exceeded
- Planning: Projection for the end of the month

---

#### `GET /api/budget/history`

Budget history over time.

**Query Parameters:**

- `period` (optional): `daily`, `weekly`, `monthly` (default: `daily`)
- `days` (optional): Number of days to look back (default: 30)

**Response:**

```json
{
  "data": [
    {
      "date": "2025-11-19",
      "total_cost": 2.45,
      "total_requests": 45,
      "total_tokens": 45000,
      "breakdown": [
        { "provider": "openai", "cost": 1.35, "requests": 25 },
        { "provider": "anthropic", "cost": 1.10, "requests": 20 }
      ]
    },
    {
      "date": "2025-11-18",
      "total_cost": 3.67,
      "total_requests": 68,
      "total_tokens": 78000,
      "breakdown": [
        { "provider": "openai", "cost": 2.10, "requests": 40 },
        { "provider": "anthropic", "cost": 1.57, "requests": 28 }
      ]
    }
  ],
  "meta": {
    "period": "daily",
    "days": 30,
    "total_cost": 45.67,
    "total_requests": 1250,
    "avg_daily_cost": 1.52,
    "avg_daily_requests": 41.7
  }
}
```

---

### 6. Usage Statistics

#### `GET /api/usage/summary`

Summary of usage statistics.
**Query Parameters:**

- `period` (optional): `today`, `week`, `month`, `all` (default: `month`)
- `provider` (optional): Filter by provider

**Response:**

```json
{
  "data": {
    "period": "month",
    "period_start": "2025-11-01T00:00:00Z",
    "period_end": "2025-11-30T23:59:59Z",
    "summary": {
      "total_requests": 1250,
      "successful_requests": 1235,
      "failed_requests": 15,
      "success_rate": 98.8,
      "total_tokens": 2500000,
      "prompt_tokens": 1800000,
      "completion_tokens": 700000,
      "total_cost": 45.67,
      "avg_cost_per_request": 0.0365,
      "avg_tokens_per_request": 2000,
      "avg_response_time_ms": 1450
    },
    "by_provider": [
      {
        "provider": "openai",
        "provider_name": "OpenAI",
        "requests": 850,
        "tokens": 1700000,
        "cost": 25.30,
        "avg_response_time_ms": 1350,
        "success_rate": 99.2
      },
      {
        "provider": "anthropic",
        "provider_name": "Anthropic",
        "requests": 400,
        "tokens": 800000,
        "cost": 20.37,
        "avg_response_time_ms": 1650,
        "success_rate": 98.0
      }
    ],
    "by_model": [
      {
        "model": "gpt-4-turbo",
        "provider": "openai",
        "requests": 450,
        "tokens": 900000,
        "cost": 15.50,
        "avg_tokens_per_request": 2000
      },
      {
        "model": "claude-3-5-sonnet-20241022",
        "provider": "anthropic",
        "requests": 400,
        "tokens": 800000,
        "cost": 20.37,
        "avg_tokens_per_request": 2000
      }
    ],
    "top_hours": [
      { "hour": 14, "requests": 150, "cost": 5.45 },
      { "hour": 10, "requests": 140, "cost": 5.10 }
    ]
  }
}
```

---

#### `GET /api/usage/requests`

List of individual requests with details.
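This endpoint is paginated, and the pagination metadata it returns can be derived from the total item count together with the `page` and `per_page` parameters. The sketch below illustrates that arithmetic; the function name `pagination_meta` and the choice to clamp `per_page` into the range 1 to 100 (rather than rejecting out-of-range values) are assumptions.

```python
import math


def pagination_meta(total: int, page: int = 1, per_page: int = 20) -> dict:
    """Compute pagination metadata for a result set of `total` items."""
    # Assumption: out-of-range values are clamped to the documented maximum of 100.
    per_page = max(1, min(per_page, 100))
    total_pages = math.ceil(total / per_page)
    return {
        "current_page": page,
        "per_page": per_page,
        "total": total,
        "total_pages": total_pages,
        "has_more": page < total_pages,
    }


# 1250 requests at the default page size of 20 items.
meta = pagination_meta(total=1250)
print(meta)
```

For 1250 items and 20 items per page this yields 63 pages, which agrees with the sample `meta` object of this endpoint.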
**Query Parameters:**

- `page` (optional): Page number (default: 1)
- `per_page` (optional): Items per page (default: 20, max: 100)
- `provider` (optional): Filter by provider
- `model` (optional): Filter by model
- `status` (optional): `success`, `failed`, `all` (default: `all`)
- `date_from` (optional): Start date (ISO 8601)
- `date_to` (optional): End date (ISO 8601)
- `sort` (optional): `created_at`, `cost`, `tokens`, `response_time` (default: `-created_at`)

**Response:**

```json
{
  "data": [
    {
      "id": "req_abc123",
      "provider": "openai",
      "model": "gpt-4-turbo",
      "status": "success",
      "prompt_tokens": 150,
      "completion_tokens": 85,
      "total_tokens": 235,
      "input_cost": 0.0015,
      "output_cost": 0.00255,
      "total_cost": 0.00405,
      "response_time_ms": 1245,
      "created_at": "2025-11-19T11:45:00Z",
      "metadata": {
        "has_streaming": false,
        "has_function_calling": false,
        "temperature": 0.7
      }
    }
  ],
  "meta": {
    "current_page": 1,
    "per_page": 20,
    "total": 1250,
    "total_pages": 63,
    "has_more": true
  },
  "summary": {
    "total_cost": 45.67,
    "total_tokens": 2500000,
    "avg_response_time_ms": 1450
  }
}
```

---

#### `GET /api/usage/requests/{id}`

Details on a specific request.

**Response:**

```json
{
  "data": {
    "id": "req_abc123",
    "gateway_user_id": "usr_xyz",
    "provider": "openai",
    "provider_name": "OpenAI",
    "model": "gpt-4-turbo",
    "status": "success",
    "request": {
      "messages": [
        {
          "role": "user",
          "content": "What is the capital of France?"
        }
      ],
      "temperature": 0.7,
      "max_tokens": 500
    },
    "response": {
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    },
    "usage": {
      "prompt_tokens": 15,
      "completion_tokens": 8,
      "total_tokens": 23
    },
    "cost": {
      "input_cost": 0.00015,
      "output_cost": 0.00024,
      "total_cost": 0.00039,
      "currency": "USD",
      "pricing_at_request": {
        "input_per_1k": 0.01,
        "output_per_1k": 0.03
      }
    },
    "performance": {
      "response_time_ms": 1245,
      "time_to_first_token_ms": 450
    },
    "metadata": {
      "ip_address": "192.168.1.1",
      "user_agent": "MyApp/1.0",
      "stream": false,
      "has_function_calling": false
    },
    "created_at": "2025-11-19T11:45:00Z",
    "completed_at": "2025-11-19T11:45:01Z"
  }
}
```

---

#### `GET /api/usage/charts`

Chart data for visualizations.

**Query Parameters:**

- `type`: `daily_cost`, `provider_distribution`, `model_usage`, `hourly_pattern`
- `days` (optional): Number of days to look back (default: 30)

**Response for `type=daily_cost`:**

```json
{
  "data": {
    "type": "daily_cost",
    "labels": [
      "2025-11-01",
      "2025-11-02",
      "..."
    ],
    "datasets": [
      {
        "label": "Daily Cost",
        "data": [1.45, 2.30, 1.85, "..."],
        "backgroundColor": "rgba(59, 130, 246, 0.5)",
        "borderColor": "rgba(59, 130, 246, 1)"
      },
      {
        "label": "7-day Moving Average",
        "data": [1.50, 1.87, 1.98, "..."],
        "borderColor": "rgba(239, 68, 68, 1)",
        "type": "line"
      }
    ]
  }
}
```

---

### 7. Model Pricing

#### `GET /api/pricing`

Current prices for all models.
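Prices are quoted per 1,000 tokens, so the cost of a request follows directly from its token counts. The sketch below assumes simple linear pricing without minimum charges or provider-specific rounding; `request_cost` is a hypothetical helper, and the currency is fixed to USD as in the samples. The numbers reproduce the sample chat completion (15 prompt tokens and 8 completion tokens at the GPT-4 Turbo prices).

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_1k: float,
    output_per_1k: float,
) -> dict:
    """Compute request cost from token counts and per-1k-token prices."""
    input_cost = input_tokens / 1000 * input_per_1k
    output_cost = output_tokens / 1000 * output_per_1k
    return {
        # Rounded to six decimals to suppress floating-point noise.
        "input_cost": round(input_cost, 6),
        "output_cost": round(output_cost, 6),
        "total_cost": round(input_cost + output_cost, 6),
        "currency": "USD",  # assumption: all sample prices are in USD
    }


# Sample chat completion: 15 prompt tokens, 8 completion tokens.
cost = request_cost(15, 8, input_per_1k=0.01, output_per_1k=0.03)
print(cost)
```

A production implementation would more likely use decimal arithmetic for monetary values; floats are used here only to keep the sketch short.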
**Query Parameters:**

- `provider` (optional): Filter by provider
- `sort` (optional): `price`, `name`, `provider`

**Response:**

```json
{
  "data": [
    {
      "provider": "openai",
      "provider_name": "OpenAI",
      "model": "gpt-4-turbo",
      "model_name": "GPT-4 Turbo",
      "pricing": {
        "input_per_1k_tokens": 0.01,
        "output_per_1k_tokens": 0.03,
        "currency": "USD"
      },
      "last_updated": "2024-11-01T00:00:00Z"
    },
    {
      "provider": "anthropic",
      "provider_name": "Anthropic",
      "model": "claude-3-5-sonnet-20241022",
      "model_name": "Claude 3.5 Sonnet",
      "pricing": {
        "input_per_1k_tokens": 0.003,
        "output_per_1k_tokens": 0.015,
        "currency": "USD"
      },
      "last_updated": "2024-10-22T00:00:00Z"
    }
  ],
  "meta": {
    "total_models": 42,
    "providers_count": 5,
    "last_sync": "2025-11-19T10:00:00Z"
  }
}
```

---

#### `GET /api/pricing/calculator`

Cost calculator for hypothetical requests.

**Query Parameters:**

- `model`: Model ID
- `input_tokens`: Number of input tokens
- `output_tokens`: Number of output tokens

**Response:**

```json
{
  "data": {
    "model": "gpt-4-turbo",
    "provider": "openai",
    "input_tokens": 1000,
    "output_tokens": 500,
    "pricing": {
      "input_per_1k": 0.01,
      "output_per_1k": 0.03,
      "currency": "USD"
    },
    "calculation": {
      "input_cost": 0.01,
      "output_cost": 0.015,
      "total_cost": 0.025,
      "currency": "USD"
    },
    "examples": {
      "10_requests": 0.25,
      "100_requests": 2.50,
      "1000_requests": 25.00
    }
  }
}
```

---

### 8. Account Information

#### `GET /api/account`

Information about the user's own gateway account.
**Response:**

```json
{
  "data": {
    "user_id": "usr_xyz789",
    "name": "API Client App",
    "email": "api@example.com",
    "status": "active",
    "created_at": "2025-11-01T08:00:00Z",
    "last_login": "2025-11-19T11:45:00Z",
    "api_keys": [
      {
        "id": "key_123",
        "name": "Production Key",
        "key_preview": "gw_...xyz",
        "created_at": "2025-11-01T08:00:00Z",
        "last_used": "2025-11-19T11:45:00Z",
        "expires_at": null
      }
    ],
    "providers_configured": 3,
    "budget": {
      "total": 100.00,
      "used": 45.67,
      "remaining": 54.33,
      "currency": "USD"
    },
    "statistics": {
      "total_requests": 1250,
      "total_tokens": 2500000,
      "total_cost": 45.67,
      "first_request": "2025-11-05T10:00:00Z"
    },
    "rate_limits": {
      "requests_per_minute": 100,
      "tokens_per_request": 10000,
      "daily_budget_limit": 10.00
    }
  }
}
```

---

#### `GET /api/account/activity`

Recent activity log.

**Query Parameters:**

- `limit` (optional): Number of items (default: 20, max: 100)
- `type` (optional): `request`, `credential_change`, `budget_alert`, `all` (default: `all`)

**Response:**

```json
{
  "data": [
    {
      "id": 1,
      "type": "request",
      "action": "chat_completion",
      "details": {
        "provider": "openai",
        "model": "gpt-4-turbo",
        "cost": 0.00405,
        "status": "success"
      },
      "timestamp": "2025-11-19T11:45:00Z"
    },
    {
      "id": 2,
      "type": "credential_change",
      "action": "credentials_added",
      "details": {
        "provider": "anthropic"
      },
      "timestamp": "2025-11-12T09:00:00Z"
    },
    {
      "id": 3,
      "type": "budget_alert",
      "action": "75_percent_reached",
      "details": {
        "budget_used": 75.00,
        "budget_total": 100.00
      },
      "timestamp": "2025-11-18T14:30:00Z"
    }
  ],
  "meta": {
    "total": 156,
    "limit": 20,
    "has_more": true
  }
}
```

---

## 🔐 Authentication & Authorization

### API Key Authentication

All API endpoints require a valid API key in the header:

```
Authorization: Bearer gw_abc123xyz789...
```

### Response for missing/invalid authentication:

```json
{
  "error": {
    "code": "unauthorized",
    "message": "Invalid or missing API key",
    "status": 401
  }
}
```

---

## 📝 Standard Response Formats

### Success Response

```json
{
  "data": { /* Resource data */ },
  "meta": { /* Optional metadata */ }
}
```

### Error Response

```json
{
  "error": {
    "code": "error_code",
    "message": "Human-readable error message",
    "status": 400,
    "details": { /* Optional additional details */ }
  }
}
```

### Pagination

```json
{
  "data": [ /* Array of items */ ],
  "meta": {
    "current_page": 1,
    "per_page": 20,
    "total": 1250,
    "total_pages": 63,
    "has_more": true
  },
  "links": {
    "first": "/api/endpoint?page=1",
    "last": "/api/endpoint?page=63",
    "prev": null,
    "next": "/api/endpoint?page=2"
  }
}
```

---

## ⚠️ Error Codes

| Code | HTTP Status | Description |
|------|-------------|-------------|
| `unauthorized` | 401 | Missing or invalid authentication |
| `forbidden` | 403 | Access denied |
| `not_found` | 404 | Resource not found |
| `validation_error` | 422 | Validation error in request data |
| `rate_limit_exceeded` | 429 | Rate limit exceeded |
| `budget_exceeded` | 402 | Budget exhausted |
| `provider_error` | 502 | Error at the LLM provider |
| `provider_unavailable` | 503 | Provider unreachable |
| `internal_error` | 500 | Internal server error |

---

## 🚀 Implementation Order

### Phase 1: Foundation (Priority: HIGH)

1. ✅ Base API structure with Scramble
2. ✅ Error handling & response formats
3. Provider endpoints (`/api/providers`, `/api/providers/{provider}`)
4. Model endpoints (`/api/models`, `/api/models/{provider}/{model}`)

### Phase 2: Core Features (Priority: HIGH)

5. Credentials management (`/api/credentials/*`)
6. Budget endpoints (`/api/budget`, `/api/budget/history`)
7. Pricing endpoints (`/api/pricing`, `/api/pricing/calculator`)

### Phase 3: Analytics (Priority: MEDIUM)

8. Usage summary (`/api/usage/summary`)
9. Request history (`/api/usage/requests`)
10. Chart data (`/api/usage/charts`)

### Phase 4: Account (Priority: LOW)

11. Account info (`/api/account`)
12. Activity log (`/api/account/activity`)

---

## 📊 Benefits of the New API Structure

### For gateway users:

- ✅ **Self-service:** All information can be retrieved by the users themselves
- ✅ **Transparency:** Complete overview of costs and usage
- ✅ **Flexibility:** Providers and models can be chosen freely
- ✅ **Control:** Budget management in the user's own hands

### For developers:

- ✅ **RESTful:** Standards-compliant API structure
- ✅ **Documented:** Complete Swagger/Scramble docs
- ✅ **Consistent:** Uniform response formats
- ✅ **Scalable:** Easy to extend

### For the system:

- ✅ **Modular:** Clear separation of responsibilities
- ✅ **Maintainable:** Clearly organized controller structure
- ✅ **Testable:** Every endpoint can be tested individually
- ✅ **Extensible:** New providers/features are easy to integrate

---

## 🎯 Next Steps

1. **Review:** Go through the concept and finalize it
2. **Implementation:** Implement controllers and routes
3. **Testing:** Write comprehensive API tests
4. **Documentation:** Add Scramble annotations
5. **Deployment:** Take the API live

---

**Feedback welcome!** 💬