🧠 Cortex Engine

Action-Tag Execution & Validation Core

Cortex Engine transforms inline directives embedded in conversation text into real-world side-effects and quality gates. The current catalogue of supported tags is below; use the exact syntax shown, as Cortex parsing is strict. A minimal parsing sketch follows the catalogue.

ECHO_TO_AI
~<ECHO_TO_AI entity="Name" message="Text">~
Send new message (authored by Cortex) to another AI.
SQL
~<SQL>SELECT ...~
Execute safe SQL and inject results.
WEB_REQUEST
~{WEB_REQUEST}https://...~
Fetch remote JSON/HTML (whitelisted).
APPROVE
~<APPROVE>~
Signal that the reply passes validation.
BOUNCE
~<BOUNCE entity="AI" message="Feedback">~
Request revision from target AI.
ESCALATE
~<ESCALATE reason="...">~
Flag severe issue for human review.
SET_METADATA
~<SET_METADATA temperature="0.6" max_tokens="800">~
Override conversation-specific OpenAI parameters.
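
A minimal sketch of how the engine might extract these tags from a reply is shown below; the regex, the simplified attribute grammar, and the function name `extract_action_tags` are illustrative assumptions, not the production parser.

```python
import re

# Illustrative sketch: pull action tags of the form <NAME key="value" ...> out of a
# message. A real parser would also handle the surrounding ~...~ delimiters and
# tag bodies such as <SQL>SELECT ...
TAG_RE = re.compile(r'<(?P<name>[A-Z_]+)(?P<attrs>[^>]*)>')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

def extract_action_tags(text: str) -> list[dict]:
    """Return every action tag found in `text` as {'name': ..., 'attrs': {...}}."""
    tags = []
    for match in TAG_RE.finditer(text):
        attrs = dict(ATTR_RE.findall(match.group("attrs")))
        tags.append({"name": match.group("name"), "attrs": attrs})
    return tags

# extract_action_tags('<BOUNCE entity="AI" message="Feedback">')
# -> [{'name': 'BOUNCE', 'attrs': {'entity': 'AI', 'message': 'Feedback'}}]
```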

📌 Usage Notes

🤖 Planned *Special-Agent* AIs

These dedicated agents sit between raw content and the user, collaborating to refine prompts, route messages, and guarantee quality.

RouteMaster
Chooses recipients & activates on-demand AIs (uses <ENGAGE_AI>).
TokenTuner
Optimises temperature, max_tokens, etc. via <SET_METADATA>.
AnswerJudge
Checks whether replies answer the ask; can <BOUNCE> or <APPROVE>.
CharacterGuardian
Ensures persona consistency; bounces out-of-character text.
FactChecker
Detects hallucinations, fetches citations using {WEB_REQUEST}.
SafetySentinel
Scans for policy violations; may <ESCALATE> critical issues.
BounceLimiter
Monitors bounce loops; overrides limits or force-approves to break stalemates.

🔭 Possible Future Agents

Concepts under exploration—these agents could further enhance the flow once core specialists are stable.

SentimentShaper
Detects audience mood and tweaks tone (temperature, wording) to keep conversation empathetic.
ContextCompressor
Generates ultra-dense summaries of long threads; injects them to cut token usage.
MultilingualBridge
Auto-translates user input/output & adjusts metadata locale settings.
ComplianceAuditor
Scans for industry regulations (GDPR, HIPAA) and redacts or escalates as needed.
AccessibilityEnhancer
Adds alt-text, simplification, or captioning directives to improve accessibility.

🛠️ General Agent Action Directives (Evaluation Set)

Candidate directives that any AI could embed; Cortex will translate them into SQL statements or internal calls (a dispatch sketch follows the list).

UPDATE_MEMORY
~<UPDATE_MEMORY id="42" content="..." importance="0.8">~
UPDATE_THOUGHT
~<UPDATE_THOUGHT id="77" content="..." ranking="0.6">~
DELETE_MEMORY
~<DELETE_MEMORY id="42">~
DELETE_THOUGHT
~<DELETE_THOUGHT id="77">~
ADD_DIRECTIVE
~<ADD_DIRECTIVE priority="0.9" content="...">~
SET_PROJECT
~<SET_PROJECT name="Alpha" priority="0.7" goal="...">~
ADD_TASK
~<ADD_TASK project="Alpha" description="..." due="2025-07-01">~
LOG_EVENT
~<LOG_EVENT level="info" message="...">~
REQUEST_SUMMARY
~<REQUEST_SUMMARY limit="10">~
REQUEST_VALIDATION
~<REQUEST_VALIDATION type="security">~
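
As a rough illustration of the "translate into SQL" path, the sketch below maps a few parsed directives onto parameterised statements. The table and column names (memories, thoughts, event_log) are assumptions, not the actual schema.

```python
# Minimal dispatch sketch: translate a parsed directive into a parameterised SQL
# statement executed through a DB-API cursor.
DIRECTIVE_SQL = {
    "UPDATE_MEMORY":  ("UPDATE memories SET content=%s, importance=%s WHERE id=%s",
                       ("content", "importance", "id")),
    "DELETE_MEMORY":  ("DELETE FROM memories WHERE id=%s", ("id",)),
    "DELETE_THOUGHT": ("DELETE FROM thoughts WHERE id=%s", ("id",)),
    "LOG_EVENT":      ("INSERT INTO event_log (level, message) VALUES (%s, %s)",
                       ("level", "message")),
}

def dispatch_directive(cursor, name: str, attrs: dict) -> bool:
    """Execute a known SQL-backed directive; return False if it must be handled
    elsewhere (internal call, Matrix routing, etc.)."""
    if name not in DIRECTIVE_SQL:
        return False
    sql, keys = DIRECTIVE_SQL[name]
    cursor.execute(sql, tuple(attrs[k] for k in keys))
    return True
```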

🚀 Additional Special-Agent Ideas

MemoryCurator
Automates pruning, merging, and surfacing of memories via UPDATE/DELETE_MEMORY tags.
DirectiveManager
Creates and reprioritises directives to steer long-running projects.
DialogueStylist
Adjusts linguistic style guidelines per audience; may tweak personae.
RiskProfiler
Assesses business or legal risk, lowering temperature or escalating when high.
LatencyOptimizer
Dynamically toggles streaming, `n`, and token limits to keep responses snappy.

Supporting directives these agents could rely on:

REQUEST_COMPLETION_CHECK
~<REQUEST_COMPLETION_CHECK>~
SET_STREAMING
~<SET_STREAMING value="true">~
SPLIT_RESPONSE
~<SPLIT_RESPONSE parts="3">~
SET_RESPONSE_STYLE
~<SET_RESPONSE_STYLE style="bullet">~
SAVE_SUMMARY
~<SAVE_SUMMARY importance="0.8" content="...">~

Agents that build on these directives:

CompletenessVerifier
Bounces answers until all user questions are addressed; leverages `REQUEST_COMPLETION_CHECK` + `BOUNCE`.
ChunkComposer
Breaks very long outputs into numbered chunks using `SPLIT_RESPONSE` + streaming (a splitting sketch follows this list).
ToneEqualizer
Ensures consistent tone across multi-AI replies; adjusts style metadata.
TokenEconomist
Predicts token cost; downscales `max_tokens` or summarises context when budget tight.
InsightMiner
Extracts key insights & stores them via `SAVE_SUMMARY` and `UPDATE_MEMORY` for future leverage.
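
To make `SPLIT_RESPONSE` concrete, here is one way ChunkComposer's directive could be honoured. The paragraph-based splitting heuristic and the part labels are assumptions.

```python
def split_response(text: str, parts: int) -> list[str]:
    """Split a long reply into roughly `parts` numbered chunks on paragraph
    boundaries. The heuristic is illustrative, not the production rule."""
    paragraphs = text.split("\n\n")
    size = max(1, -(-len(paragraphs) // parts))   # ceiling division
    chunks = ["\n\n".join(paragraphs[i:i + size])
              for i in range(0, len(paragraphs), size)]
    return [f"[Part {i + 1}/{len(chunks)}]\n{chunk}"
            for i, chunk in enumerate(chunks)]
```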

🧩 End-to-End Use-Case Walk-Through

The flow below shows how a single user question can be enriched by multiple Special-Agents before the final answer reaches the user.

1️⃣  Incoming user message
──────────────────────────
"What's our month-to-date revenue by region and how can we improve next quarter?"

2️⃣  RouteMaster intercepts ➜ decides recipients:
   Adds ~<ENGAGE_AI ...>~ to bring DataAnalyst into the conversation
   Adds ~<ECHO_TO_AI ...>~ to forward the user's question to it

3️⃣  DataAnalyst drafts reply containing:
   ~<SQL>SELECT region, SUM(amount) FROM sales WHERE date >= DATE_FORMAT(CURDATE(), '%Y-%m-01') GROUP BY region~

4️⃣  Cortex_Engine executes SQL ➜ injects table results (see the injection sketch after this walk-through).

5️⃣  CompletenessVerifier checks draft ➜ sees no strategic improvement section ➜
   issues ~<BOUNCE entity="DataAnalyst" message="...">~

6️⃣  DataAnalyst revises; TokenTuner sets a longer max_tokens
   ~<SET_METADATA max_tokens="...">~

7️⃣  FactChecker scans output ➜ OK
   CharacterGuardian verifies tone ➜ OK

8️⃣  AnswerJudge ➜ ~<APPROVE>~

9️⃣  Output delivered to user with full revenue table + strategy plan.
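
Step 4️⃣ above is the interesting mechanical bit. A minimal sketch of SQL-tag execution and result injection could look like the following, assuming the ~<SQL>...~ syntax from the catalogue and a DB-API cursor; the plain-text table formatting is illustrative.

```python
import re

# Illustrative sketch: run each ~<SQL>...~ tag in a draft and splice the result
# table back into the text.
SQL_TAG_RE = re.compile(r'~<SQL>(?P<query>.*?)~', re.DOTALL)

def inject_sql_results(draft: str, cursor) -> str:
    """Replace every ~<SQL>...~ tag with a plain-text table of the query's rows."""
    def run(match: re.Match) -> str:
        cursor.execute(match.group("query"))
        header = " | ".join(col[0] for col in cursor.description)
        rows = "\n".join(" | ".join(str(value) for value in row)
                         for row in cursor.fetchall())
        return f"{header}\n{rows}"
    return SQL_TAG_RE.sub(run, draft)
```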
        

🏛️ Cortex Pillars & Relationships

All processing is dynamic: every decision—from routing to SQL execution—is done by AIs through tags; Python merely enforces order and safety.

Meta-parameter Governance

If a Special-Agent writes max_tokens=1500 but the agent's cortex_personality.response_metadata caps it at 800, Matrix clamps to 800. Agents can only dial down within allowed limits.
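
A minimal sketch of that clamping rule, assuming the caps arrive as a plain dict parsed from cortex_personality.response_metadata (the clamped-key list and merge behaviour are assumptions):

```python
# Agent-requested parameters may only dial values down within the personality's caps.
CLAMPED_KEYS = {"max_tokens", "temperature"}  # assumption: only numeric caps apply

def clamp_metadata(requested: dict, personality_caps: dict) -> dict:
    """Return the requested parameters, limited to the personality's caps."""
    merged = dict(personality_caps)
    for key, value in requested.items():
        cap = personality_caps.get(key)
        if key in CLAMPED_KEYS and cap is not None:
            merged[key] = min(value, cap)     # e.g. 1500 requested, 800 cap -> 800
        else:
            merged[key] = value
    return merged

# clamp_metadata({"max_tokens": 1500}, {"max_tokens": 800})  ->  {"max_tokens": 800}
```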

🔧 Engine Enhancements Needed for Some Agents

These items are flagged in the roadmap & will be built as agents graduate from concept to production.

🗄️ Configuring Special-Agents via cortex_config

Agent activation and ordering are data-driven. Each agent has a JSON row in cortex_config:

```sql
INSERT INTO cortex_config (config_name, config_value)
VALUES (
  'SpecialAgent.RouteMaster',
  '{"enabled": true, "role": "input", "priority": 10}'
);
```

Planned dedicated schema:

| Column | Type | Purpose |
| --- | --- | --- |
| id | INT PK | Row identifier |
| agent_name | VARCHAR(64) | Name, e.g. RouteMaster |
| processor_type | ENUM('input','output','both') | Where in the pipeline it hooks |
| priority | INT | Run order (1 = highest) |
| metadata | JSON | Default OpenAI params or agent settings |
| enabled | TINYINT | Soft toggle |

Matrix fetches active agents with:

```sql
SELECT agent_name, metadata
FROM cortex_config
WHERE processor_type = 'input' AND enabled = 1
ORDER BY priority ASC;
```

This avoids hard-coding order in Python and allows conversation-level overrides in the future.
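
A possible shape for the loader that wraps this query (a simplified version of the `load_agents` helper mentioned in the checklist below; the cursor-based signature and the JSON default are assumptions):

```python
import json

def load_agents(cursor, processor_type: str) -> list[dict]:
    """Fetch enabled Special-Agents for one pipeline tier, ordered by priority.
    Assumes the planned dedicated schema above."""
    cursor.execute(
        """
        SELECT agent_name, metadata
        FROM cortex_config
        WHERE processor_type = %s AND enabled = 1
        ORDER BY priority ASC
        """,
        (processor_type,),
    )
    return [{"name": name, "metadata": json.loads(metadata or "{}")}
            for name, metadata in cursor.fetchall()]

# input_agents = load_agents(cursor, "input")   # e.g. RouteMaster, TokenTuner, ...
```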

🌐 Holistic Flow & the Role of Cortex_AI

Cortex_AI is the conversational face of the entire pipeline. Think of it as the maître-d' who can step in, explain what's happening, translate system feedback into human-friendly language, and make executive overrides when the specialist swarm hits an impasse.

End-to-End Sequence (high-level)

  1. User Input enters Matrix ➜ stored raw.
  2. Input-Agent tier (priority ASC): RouteMaster → SkillBroker → SentimentShaper etc.
    They may inject ENGAGE_AI tags, adjust metadata, or rewrite portions.
  3. Augmented message passes to Cortex_Engine for tag execution (SQL, WEB_REQUEST).
  4. Matrix routes the cleaned prompt to selected participant AIs.
  5. Each AI produces a draft response (OpenAI).
  6. Output-Agent tier: FactChecker → CompletenessVerifier → CharacterGuardian → ToneEqualizer → AnswerJudge.
  7. AnswerJudge emits <APPROVE> or <BOUNCE>.
    • If BOUNCE ➜ the draft is returned to the authoring AI with feedback; BounceLimiter monitors the loop (max N bounces; see the loop sketch after this list).
  8. Approved response goes through SafetySentinel for final policy check.
  9. Matrix delivers polished answer to user.
  10. Cortex_AI remains available to:
    • Clarify why an answer was delayed/bounced.
    • Summarise conversation so far (`REQUEST_SUMMARY`).
    • Accept meta-commands from admins (e.g., enable/disable agents on the fly).
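
Steps 5–8 amount to a bounded revision loop. A compact sketch is below, where the drafting and judging callables, the verdict strings, and the MAX_BOUNCES value stand in for the real AI calls and BounceLimiter configuration.

```python
from typing import Callable, Optional

MAX_BOUNCES = 3  # assumption: BounceLimiter's configured cap

def validate_with_bounce_loop(
    prompt: str,
    draft_fn: Callable[[str, Optional[str]], str],  # authoring AI: (prompt, feedback) -> draft
    judge_fn: Callable[[str], tuple],                # output tier: draft -> (verdict, feedback)
) -> str:
    """Bounce a draft back to its author until it is approved or the cap is hit."""
    draft = draft_fn(prompt, None)
    for _ in range(MAX_BOUNCES):
        verdict, feedback = judge_fn(draft)
        if verdict == "APPROVE":
            return draft
        draft = draft_fn(prompt, feedback)           # BOUNCE: revise with feedback
    return draft                                     # force-approve to break the stalemate
```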

Dynamic Loops & Governance

Design Principles

Result: Users interact with a single, coherent voice—empowered by a dynamic micro-society of AIs working behind the scenes, continuously optimising quality, safety, and insight.

🧬 Naming the Specialists

Clear naming accelerates adoption. Below are purpose statements and alternative codenames (some inspired by brain regions) for each key agent:

| Current Name | Purpose | Alt-Names (pick 1) |
| --- | --- | --- |
| RouteMaster | Selects recipients & manages participation modes. | CingulateRouter, RelayNucleus, PathFinder, Switchboard, MessageDispatcher |
| TokenTuner | Optimises OpenAI parameters per message. | ThalamusRegulator, ParamMaestro, DialSetter, CostBalancer, ThermostatAI |
| AnswerJudge | Checks completeness & relevance. | RealityCheck, VerdictAI, PrefrontalReviewer, IntegrityPanel, ResponseReferee |
| CharacterGuardian | Ensures persona/style consistency. | LimbicCustodian, VoiceKeeper, PersonaWarden, StylisticSentinel, ToneProtector |
| FactChecker | Detects factual errors & adds citations. | HippocampusVerify, TruthSeeker, SourceScout, CitationMiner, RealityAnchor |
| SafetySentinel | Policy & compliance enforcement. | AmygdalaGuard, PolicyShield, RiskWatcher, SafeguardAI, ComplianceGate |
| BounceLimiter | Prevents infinite correction loops. | CycleBreaker, LoopGuard, FeedbackThrottle, ReboundCap, IterationBrake |
| CompletenessVerifier | Bounces until all parts of the question are answered. | CoverageAgent, GapDetector, HolisticCheck, FullScope, SatiationSensor |
| ChunkComposer | Splits long outputs into numbered segments. | SectionSmith, ParcelWriter, Segmentor, SliceArtist, CorticalChunker |
| InsightMiner | Stores key insights for future leverage. | GoldProspector, IdeaHarvester, PatternDigger, HippocampusScribe, KnowledgeSeeder |

🛠️ Advanced Example – On-the-Fly Fitness Tracker

User: "Act as my fitness coach. Track my workouts and show weekly summaries. I did 30 push-ups today."

1. RouteMaster ➜ decides a DataSchemaDesigner AI is needed ➜ ~<ENGAGE_AI ...>~

2. DataSchemaDesigner replies with:
   "Creating table" + ~CREATE TABLE workouts (id INT AUTO_INCREMENT, date DATE, exercise VARCHAR(50), reps INT)~

3. Cortex_Engine executes SQL (table now exists).

4. DataAnalyst logs workout:
   ~<SQL>INSERT INTO workouts (date, exercise, reps) VALUES (CURDATE(), 'push-ups', 30)~

5. CompletenessVerifier ensures insert success ➜ APPROVE.

6. User later asks: "How many reps this week?"
   DataAnalyst uses ~<SQL>SELECT SUM(reps) FROM workouts WHERE YEARWEEK(date)=YEARWEEK(CURDATE())~ ➜ returns 30.

7. InsightMiner saves a high-level memory: ~<SAVE_SUMMARY importance="..." content="...">~
        

🛠️ Mixed Modal Example – SQL ✚ Web Scraping

User: "Compare our current Bitcoin wallet balance with today's BTC/USD price and show total USD value."  

1. RouteMaster routes to FinanceAnalyst.  
2. FinanceAnalyst queries internal ledger:  
   ~<SQL>SELECT balance_btc FROM wallets WHERE id=1~ ➜ returns 2.5 BTC.  
3. FinanceAnalyst embeds a web tag (see the whitelist sketch after this example):  
   ~{WEB_REQUEST}https://api.coindesk.com/v1/bpi/currentprice/USD.json~ ➜ Cortex injects price = $70,000.  
4. FinanceAnalyst calculates USD = 2.5 × 70,000 = $175,000 and responds.  
5. FactChecker verifies exchange rate parity; AnswerJudge approves.  
6. InsightMiner stores a summary: ~<SAVE_SUMMARY importance="..." content="...">~
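
A minimal sketch of how the engine might execute a whitelisted {WEB_REQUEST} tag; the host whitelist, the timeout, and the JSON-only assumption are illustrative, not the production policy.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: the whitelist is a simple host set; the real list would live in config.
ALLOWED_HOSTS = {"api.coindesk.com"}

def execute_web_request(url: str, timeout: float = 10.0) -> dict:
    """Fetch whitelisted JSON for a {WEB_REQUEST} tag; reject other hosts."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"Host not whitelisted: {host}")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```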
        

🛰️ Deep-Dive Prompt Transformation (Step-by-Step)

The chain below shows every prompt variant as it travels through agents, Cortex_Engine, and back out.

USER ➜ Matrix (raw)
-------------------------------------------------
"What's our month-to-date revenue by region and how can we improve next quarter?"

① RouteMaster (input agent)  
   Adds routing + engages analysts → modified prompt stored as:
   [UserMsg] ~<ENGAGE_AI ...>~ ~<ECHO_TO_AI ...>~  
   (Matrix records this as source_prompt)

② Cortex_Engine executes action-tags: none yet (ECHO & ENGAGE are tags for Matrix, not engine)  
   Passes cleaned text (UserMsg unchanged) to **DataAnalyst**

-------------------------------------------------
DataAnalyst receives prompt:
"What's our month-to-date revenue… ( + context )"
Produces draft reply (v0):
"Sure, here are the numbers: ~SELECT region,SUM(amount) FROM sales WHERE date>=CURDATE()-30 GROUP BY region~"

③ Cortex_Engine post-processing  
   • Runs SQL → injects table  
   • Adds log entry
Draft v1 (SQL resolved):
"Sure, here are the numbers:\nREGION | TOTAL\nUS | 1.2M …"

④ CompletenessVerifier (output agent)  
   Scans → detects no improvement suggestions.  
   Emits ~<BOUNCE entity="DataAnalyst" message="...">~
   Matrix sets validation_status = bounced.

-------------------------------------------------
Bounce loop back to DataAnalyst with feedback.
DataAnalyst drafts v2:
"Numbers (table)… Recommendations:
 1. Upsell premium plans in EMEA…"  
   Adds ~<SET_METADATA max_tokens="...">~ (TokenTuner advice embedded)

⑤ Cortex_Engine again: applies SET_METADATA.

⑥ FactChecker → OK, CharacterGuardian → OK
⑦ AnswerJudge → ~<APPROVE>~ (validation_status = approved)
⑧ SafetySentinel final check → passes.

Matrix publishes FINAL RESPONSE to user:
-------------------------------------------------
"Month-to-date revenue by region:\nREGION | TOTAL\nUS | $1.2M…\n\nHow to improve next quarter:\n1. Upsell premium plans in EMEA…"
        

🔄 Runtime Modes via cortex_active_level

Instead of a simple on/off flag, cortex_active_level (INT) in matrix_conversations allows experiment-friendly toggling of how much orchestration is applied:

| Level | Pipeline Behaviour | Use-case |
| --- | --- | --- |
| 0 – Direct | No Cortex processing, no validation; messages pass directly between user ↔ selected AI(s). Only minimal logging. | Benchmark raw LLM quality & latency, fallback mode when Cortex offline. |
| 1 – Standard | Current production flow: Cortex_Engine tag execution + Cortex_AI validation/bounce loop. Special-Agents OFF. | Stable conversations with quality assurance but low overhead. |
| 3 – Advanced | Full Special-Agent stack (RouteMaster, TokenTuner, FactChecker…); input & output tiers obey cortex_config priority. | R&D mode, rich orchestration; compare against Level 1 for uplift measurement. |
| ≥4 – Experimental | Reserved for future prototypes (e.g. multimodal agents, generative UI plugins). | A/B testing, canary releases. |

Matrix reads cortex_active_level at runtime and assembles the processing chain accordingly. This makes it trivial to run the same conversation under different orchestration regimes and measure impact on cost, latency, and user satisfaction.
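
A sketch of how Matrix could branch on the level when assembling the chain; the returned flags are placeholders for real processor objects, and level 2 is left undefined, as in the table above.

```python
def assemble_pipeline(cortex_active_level: int) -> dict:
    """Pick the processing chain for a conversation based on its cortex_active_level."""
    if cortex_active_level == 0:       # Direct: raw LLM pass-through
        return {"engine": False, "validation": False, "special_agents": False}
    if cortex_active_level == 1:       # Standard: tag execution + bounce loop
        return {"engine": True, "validation": True, "special_agents": False}
    if cortex_active_level >= 3:       # Advanced / Experimental: full agent stack
        return {"engine": True, "validation": True, "special_agents": True}
    raise ValueError(f"Unsupported cortex_active_level: {cortex_active_level}")
```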

📝 Implementation Checklist – Introducing cortex_active_level

  1. DB Migration
    ALTER TABLE matrix_conversations ADD COLUMN cortex_active_level TINYINT NOT NULL DEFAULT 1;
    Back-fill: UPDATE matrix_conversations SET cortex_active_level = IF(cortex_active=0,0,1);
  2. Helper: add get_cortex_mode(conversation_id, cursor) in shared/utils.py (see the sketch at the end of this section).
  3. Validation flag: refactor get_current_cortex_active_flag to call helper and return mode >= 1.
  4. Router switch: in Matrix/matrix_bp.send_message branch on mode 0 / 1 / >=3 to assemble pipeline.
  5. Agent Loader: new shared/agent_loader.load_agents(processor_type, mode) pulling from cortex_config.
  6. Function Naming: standardise wrappers — cortex_mode_execute_input_agents, cortex_mode_execute_output_agents — for grep-able maintenance.
  7. UI/API: swap checkbox for select; expose /api/conversation/<id>/cortex_mode.
  8. Unit Tests: verify Mode 0 (direct), Mode 1 (current), Mode 3 (agent stack) flows.
  9. Rollback: safe — new column has default; ignore if needed.
  10. Optional: Insert each Special-Agent into cortex_personalities so they inherit metadata limits.

This "surgical" list isolates all required edits; existing production behaviour (mode 1) stays untouched during rollout.