# ARCA v1.2.0 "WorkspaceArchitecture": Agent-Based Research and Document Analysis
> **As of:** 2026-06-03 | **Codename:** WorkspaceArchitecture | **Models:** GLM-OCR + Qwen2.5-VL-3B (permanently in VRAM), Qwen-7B + LM Studio (fallback)
A local, modular AI platform for historical research and document analysis. It runs 100% offline on your own GPU.
## What's New in v1.2.0
- **Workspace DB architecture**: ARCA is the framework; each `data/db/*.db` is a separate workspace with its own `workspace_settings` and `installed_components` tables
- **Persistent workspace switch**: the last active DB is remembered in `.env` (toggle: `REMEMBER_LAST_WORKSPACE`)
- **Document save pipeline test**: end-to-end verification in 5 stages ([scripts/test_document_save_pipeline.py](scripts/test_document_save_pipeline.py))
- **Flask dev mode**: `FLASK_DEBUG=1` plus `--debug` auto-reload in `start.bat` / `ARCAStart.bat`
- **LLM JSON hardening**: two-stage retry with `response_format=json_object` before the regex fallback
- **Complete UI/API map**: all 14 blueprints documented ([docs/UI_API_Connections.md](docs/UI_API_Connections.md))
- **Bug sweep**: #14, #15, #18 fixed (see [docs/buglist.md](docs/buglist.md))
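
The LLM JSON hardening item can be pictured roughly as follows. This is a minimal sketch, not ARCA's actual code: it assumes "two-stage" means two attempts with `response_format=json_object`, and the names `parse_llm_json` and `call_llm` (including its signature) are hypothetical.

```python
import json
import re
from typing import Callable, Optional

# Hypothetical signature: call_llm(prompt, response_format) -> raw reply text.
LLMCall = Callable[[str, Optional[dict]], str]


def parse_llm_json(call_llm: LLMCall, prompt: str, retries: int = 2) -> dict:
    """Request a JSON object up to `retries` times, then fall back to a regex."""
    raw = ""
    for _ in range(retries):
        raw = call_llm(prompt, {"type": "json_object"})
        try:
            parsed = json.loads(raw)
            if isinstance(parsed, dict):
                return parsed
        except json.JSONDecodeError:
            pass  # malformed reply: try again
    # Regex fallback: take the span from the first "{" to the last "}".
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            parsed = json.loads(match.group(0))
            if isinstance(parsed, dict):
                return parsed
        except json.JSONDecodeError:
            pass
    raise ValueError("no JSON object found in LLM reply")
```

Keeping the regex as the last resort means a well-formed reply never passes through the lossy extraction path.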
## Architecture
ARCA is the **framework**; there is no `arca.db`. The workspace DBs live beneath it:
```
ARCA (framework: api/, agents/, core/, components/, gui_electron/)
└─ data/db/
   ├─ Weser.db ── default workspace (Science & Research)
   ├─ Koold.db ── agency context
   ├─ Circat.db ── Personal & Studio
   ├─ TestSuite.db ── test sandbox
   └─ <user-defined>.db
```
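
Since every `*.db` file under `data/db/` is its own workspace, discovering the available workspaces can be sketched as a plain directory scan. The helper name `list_workspaces` is hypothetical and not taken from the ARCA codebase.

```python
from pathlib import Path


def list_workspaces(db_dir: Path) -> list[str]:
    """Return the workspace names (file stems) of all *.db files in db_dir."""
    return sorted(p.stem for p in db_dir.glob("*.db"))
```

With the layout shown above, `list_workspaces(Path("data/db"))` would include names such as `Weser` and `Koold`.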
Each workspace DB contains:
- `workspace_settings` (key/value): base settings
- `installed_components`: available/installed components with `config_json`
- Data tables: `documents`, `entities`, `relations`, `spatial`, RAG chunks, `chat_summaries`, …
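
To make the two bookkeeping tables concrete, here is a minimal SQLite sketch. Only the table names, the key/value shape, and the `config_json` column come from this README; the remaining columns (`name`, `installed`) and all function names are assumptions for illustration.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS workspace_settings (
    "key" TEXT PRIMARY KEY,
    value TEXT
);
CREATE TABLE IF NOT EXISTS installed_components (
    name        TEXT PRIMARY KEY,            -- assumed column
    installed   INTEGER NOT NULL DEFAULT 0,  -- assumed column
    config_json TEXT NOT NULL DEFAULT '{}'
);
"""


def init_workspace(conn: sqlite3.Connection) -> None:
    """Create both tables if missing; safe to run repeatedly."""
    conn.executescript(SCHEMA)


def set_setting(conn: sqlite3.Connection, key: str, value: str) -> None:
    """Insert or overwrite a single key/value setting."""
    conn.execute(
        'INSERT OR REPLACE INTO workspace_settings ("key", value) VALUES (?, ?)',
        (key, value),
    )


def get_component_config(conn: sqlite3.Connection, name: str) -> dict:
    """Return the decoded config_json for a component, or {} if unknown."""
    row = conn.execute(
        "SELECT config_json FROM installed_components WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else {}
```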
Workspace switch: `POST /api/db/switch`. With `REMEMBER_LAST_WORKSPACE=True` (the default), `ARCA_DB_PATH` is persisted to `.env`.
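
The persistence step could look roughly like the sketch below: rewrite the `ARCA_DB_PATH` line in `.env` when remembering is enabled, and leave the file alone otherwise. The function name `persist_workspace` and its parameters are hypothetical; how ARCA actually writes the file is not specified here.

```python
from pathlib import Path


def persist_workspace(env_file: Path, db_path: str, remember: bool = True) -> bool:
    """Store ARCA_DB_PATH=<db_path> in env_file; return True if the file was written."""
    if not remember:
        return False
    lines = env_file.read_text(encoding="utf-8").splitlines() if env_file.exists() else []
    entry = f"ARCA_DB_PATH={db_path}"
    for i, line in enumerate(lines):
        if line.strip().startswith("ARCA_DB_PATH="):
            lines[i] = entry  # replace the existing entry in place
            break
    else:
        lines.append(entry)  # no entry yet: append one
    env_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True
```

Note that this sketch replaces the whole line, so an inline comment after the old value is dropped.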
## Requirements
- **Python 3.13** (3.14 has no CUDA wheels)
- **CUDA 12.4** for GPU support
- **8 GB+ RAM**, **6 GB+ VRAM** (keeping GLM-OCR + Qwen-3B permanently loaded needs ≈ 8 GB)
- **Node.js LTS** (for the Electron UI)
- **Windows 10+ / Linux / macOS**
## Quick Start
### 1. Repository
```bash
git clone https://github.com/circat/BASEAGENT.git
cd BASEAGENT
```
### 2. Python venv + Dependencies
```bash
python -m venv .venv_arca
.venv_arca\Scripts\activate # Windows
# source .venv_arca/bin/activate # Linux/macOS
pip install -r requirements.txt
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
```
### 3. Initialize the Workspace DBs
```bash
python scripts/migrate_workspace_tables.py
```
### 4. Start the App
**Electron desktop (recommended):**
```bash
ARCAStart.bat
```
**Backend only (dev mode with auto-reload):**
```bash
set FLASK_DEBUG=1
# export FLASK_DEBUG=1 # Linux/macOS
python -m flask --app api run --host 127.0.0.1 --port 5000 --debug
```
Backend: `http://127.0.0.1:5000` · The Electron UI starts automatically via `ARCAStart.bat`.
## API Overview (14 Blueprints)
Full documentation: [docs/UI_API_Connections.md](docs/UI_API_Connections.md)
| Prefix | Purpose |
|---|---|
| `/api/documents` | Upload, CRUD, OCR save, image analysis |
| `/api/rag` | Retrieval, answer synthesis |
| `/api/chat` | Master chat (sessions, stream, attach, permissions) |
| `/api/entities` | Entity CRUD + search |
| `/api/network` | Relation graph |
| `/api/spatial` | Geo points |
| `/api/research` | Multi-source search (NARA, EHRI, OpenAlex, Archive.org) |
| `/api/llm` | OCR, summarize, and extract calls |
| `/api/system` | Discovery + status |
| `/api/db` | Workspace switching |
| `/api/skills` | Skill install / activate / inject |
| `/api/workspaces` | Workspace management |
| `/api/components` | Component registry, health |
| `/api/components/doc-rag` + `/web-research` | Component-specific pipelines |
## Tests
The tests are CPU-first and run without a GPU:
```bash
# Document save pipeline (5 stages)
python scripts/test_document_save_pipeline.py
# Document analysis (mock LLM)
python scripts/test_doc_analysis_full.py
# Interactive test suite
test_pipelines.bat
```
Pytest:
```bash
.venv_arca\Scripts\python.exe -m pytest tests/ -v
```
## LLM Stack
| Role | Model | Status |
|---|---|---|
| Primary OCR | GLM-OCR 0.9B | warm in VRAM (t+3 s) |
| Primary Chat | Qwen 2.5-VL-3B-Instruct | warm in VRAM (t+8 s) |
| Heavy Chat | Qwen-7B | lazy, 4-bit |
| External Fallback | LM Studio `localhost:1234` | optional |
VRAM budget: ~8 GB permanently allocated out of 24 GB (RTX 3090).
## Project Structure
```
ARCA/
├── api/            # Flask REST API (14 blueprints)
├── agents/         # LLM agents, OCR, chat, search
├── components/     # Modular components (document_rag, web_research, …)
├── core/           # Config, logger, component registry
├── db/             # SQLite manager (workspace DBs)
├── gui_electron/   # Electron + React UI (TypeScript)
├── gui/ui_v2/      # Legacy web UI v2
├── models/         # LLM weights (gitignored)
├── data/db/        # Workspace databases (gitignored)
├── scripts/        # Migrations, pipeline tests, tools
├── tests/          # Pytest suite
├── docs/           # Roadmap, HANDOFF, UI/API map, bug list
├── launcher.py     # PyWebView/Electron entry point
├── start.bat       # Legacy launcher
└── ARCAStart.bat   # Electron one-click launcher
```
## Configuration
`.env` (copied from `.env.example`):
```ini
ARCA_DB_PATH=data/db/Weser.db # default workspace
REMEMBER_LAST_WORKSPACE=True # persist workspace switches
FLASK_HOST=127.0.0.1
FLASK_PORT=5000
FLASK_DEBUG=True # dev mode
LM_STUDIO_URL=http://localhost:1234 # optional
```
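
For illustration, the settings above could be read with a small stdlib-only parser like the one below. It is a sketch under assumptions: ARCA may well use a dedicated library for this, and the names `parse_env` and `as_bool` are hypothetical. It treats ` #` as the start of an inline comment and accepts both `True` and `1` as truthy, which covers the two spellings of `FLASK_DEBUG` used in this README.

```python
TRUTHY = {"1", "true", "yes", "on"}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.split(" #", 1)[0].strip()  # drop inline comments
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as 'True' or '1'."""
    return value.strip().lower() in TRUTHY
```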
## Roadmap
Done:
- [x] Step 1: Ingestion + fallback sequences
- [x] Step 2: Multi-context databases
- [x] Step 3: User overwrite safeguard
- [x] Step 4: Dynamic UI architecture (Electron + React)
- [x] **v1.2.0: Workspace architecture (`workspace_settings` + `installed_components` tables)**
Open:
- [ ] Step 5: AG-Grid database editor
- [ ] Step 6: Watchdog & deduplication
- [ ] Step 7: MiroFish swarm analysis (multi-agent NARA + Arolsen + EHRI)
Details: [docs/Roadmap.md](docs/Roadmap.md)
## Documentation
| File | Contents |
|---|---|
| [CLAUDE.md](CLAUDE.md) | Project rules + architecture |
| [docs/UI_API_Connections.md](docs/UI_API_Connections.md) | Complete API map |
| [docs/buglist.md](docs/buglist.md) | Bug tracking & fixes |
| [docs/HANDOFF.md](docs/HANDOFF.md) | Handoff documentation |
| [docs/Roadmap.md](docs/Roadmap.md) | Development roadmap |
| [docs/WorkspaceArchitecture.md](docs/WorkspaceArchitecture.md) | Workspace concept |
| [docs/database_rules.md](docs/database_rules.md) | DB conventions |
## License
MIT License. Free for private and commercial use.
---
**Version:** 1.2.0 "WorkspaceArchitecture" · **Release:** 2026-06-03