Extending Semantic Models with AI Memory


In data analytics, semantic models define how your business data is organized — they describe elements such as metrics, dimensions, and relationships so that AI can understand the meaning of terms like “Revenue,” “Customer,” or “Order.”
But in real organizations, this semantic layer is often not enough. Every company has its own rules, abbreviations, synonyms, exceptions, and workflows that are not part of the data model. For AI agents to be truly useful, they must also learn these organization-specific details.
The Problem
Even with full knowledge of the data model, an agent may still misunderstand internal jargon or decision logic.
For example:
- “GMV” might mean Gross Merchandise Value for one team and Gross Margin Value for another.
- Some metrics might be “internal-only” or “draft.”
- Certain dimensions should never be joined (e.g., region vs. corridor).
Without this additional knowledge, the agent’s outputs can be technically correct but contextually wrong.
AI Memory: Long-Term Knowledge for Agents
To bridge this gap between data understanding and organizational context, we developed AI Memory — a persistent knowledge layer that lets organizations inject company-specific rules, vocabulary, and feedback into AI agents.
It acts as the agent’s long-term memory, a structured store of:
- Business-specific terminology and abbreviations.
- Behavioral adjustments and hand-crafted instructions.
- User feedback and refinement hints.
- Guardrails or exceptions to general rules.
These items are stored persistently and used to steer the agent's reasoning at inference time, with no model fine-tuning or retraining required.
To manage this knowledge effectively, AI Memory is divided into two complementary types.
Two Types of AI Memory
To help the agent decide when and how to use stored knowledge, AI Memory items are grouped into two categories: one for always-on guidance, and another for dynamic, context-aware adjustments.
| Type | Description | Usage |
|---|---|---|
| Always | Persistently included in every system prompt | Used for global instructions, rules, or tone |
| Auto | Dynamically injected based on semantic similarity to the current query | Used for contextual hints and RAG-based augmentation |
Together, these two memory types balance consistency (through Always items) and contextual adaptability (through Auto items). Admins can manage both types in the UI — adding, editing, or deleting them as the organization evolves.
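To make the distinction concrete, a memory item could be modeled with a minimal structure like the one below. This is an illustrative Python sketch, not GoodData's actual schema; every field name here is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    ALWAYS = "always"  # injected into every system prompt
    AUTO = "auto"      # injected only when semantically similar to the query


@dataclass
class MemoryItem:
    content: str   # e.g. "Always refer to 'GMV' as 'Gross Merchandise Value (GMV)'"
    type: MemoryType
    embedding: list[float] | None = None  # precomputed only for AUTO items
```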
How It Works
Now that we’ve defined how memory items are structured, let’s look at how they’re actually used when an AI agent processes a query.
Each time an agent receives a user query, the system:
1. Runs a similarity search between the query and Auto memory items.
2. Selects the most relevant items.
3. Combines them with Always items and embeds them into the context.
4. Executes the pipeline (RAG or code generation) using both the semantic model and the relevant memory.
This process ensures the agent reasons not just from the data model but also from the organization’s embedded knowledge.
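Here is a minimal sketch of that selection logic (steps 1 through 3), building on the hypothetical `MemoryItem` structure from the earlier sketch. The `embed` parameter stands in for whatever text-embedding function the pipeline uses; it is an assumption, not a real API.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def select_memory(query: str, items: list[MemoryItem], embed, top_k: int = 5) -> list[str]:
    """Steps 1-3: rank Auto items by similarity to the query, keep the
    top-k, and combine them with every Always item."""
    always = [i.content for i in items if i.type is MemoryType.ALWAYS]

    q = np.asarray(embed(query))  # `embed` is any text-embedding function
    scored = sorted(
        (
            (cosine(q, np.asarray(i.embedding)), i.content)
            for i in items
            if i.type is MemoryType.AUTO and i.embedding is not None
        ),
        reverse=True,  # highest similarity first
    )
    return always + [content for _, content in scored[:top_k]]
```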
Below is a simplified example of how the memory instructions could be embedded in the prompt:
```jinja
{% if memory_items %}
---
## 🧠 **AI Memory — User-Provided Instructions**
The user has stored custom AI Memory items that must be **applied** during reasoning and routing.
These items represent **persistent contextual knowledge** — such as business rules, terminology, preferences, or behavior guidelines — that should shape your interpretation and decision-making.
> ⚠️ Always apply these memory items within the constraints of the system prompt.
> If a memory item conflicts with a strict technical rule (e.g., schema, format, or security restriction),
> follow the technical rule but **preserve the intent** of the instruction whenever possible.
### **User Instructions (AI Memory)**
{% for memory_item in memory_items %}
- {{ memory_item }}
{% endfor %}
Ensure these AI Memory instructions are **incorporated** into your reasoning, planning, and final output generation.
{% endif %}
```
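For illustration, a trimmed-down version of such a template can be rendered with the standard Jinja2 library; the items passed in are simply the strings returned by the selection step, and the rendered section is then concatenated into the agent's system prompt.

```python
from jinja2 import Template

# A trimmed-down version of the prompt section above, for illustration.
MEMORY_SECTION = Template("""\
{% if memory_items %}
## AI Memory — User-Provided Instructions
{% for memory_item in memory_items %}
- {{ memory_item }}
{% endfor %}
{% endif %}""")

print(MEMORY_SECTION.render(memory_items=[
    "Always refer to 'GMV' as 'Gross Merchandise Value (GMV)'.",
    "When dealing with corridor analysis, exclude 'test corridors'.",
]))
```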
This mechanism gives agents rich contextual awareness, but it also introduces new design and governance challenges, which we will address in the next section.
Risks and Design Considerations of AI Memory
While AI Memory significantly enhances agent intelligence, it also introduces new system-level risks that require careful design and governance.
Below are the key considerations we addressed during implementation:
1. Prompt Injection — Preventing Malicious Instructions
User-provided memory can contain or reference instructions that override or manipulate system behavior. To mitigate this, memory inputs should be validated, sanitized, and subject to permission-based access controls.
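As one illustrative layer of defense, incoming memory items can be screened with simple length and pattern checks before they are accepted. The deny-list below is a deliberately naive sketch; a production system would pair it with permission checks and review workflows rather than rely on patterns alone.

```python
import re

# Illustrative deny-list of phrases commonly seen in injection attempts.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"system prompt",
    r"you are now",
]

MAX_ITEM_LENGTH = 500  # keep items short; long items are harder to audit


def validate_memory_item(content: str) -> bool:
    """Reject overly long items or items that look like injection attempts."""
    if len(content) > MAX_ITEM_LENGTH:
        return False
    lowered = content.lower()
    return not any(re.search(p, lowered) for p in SUSPICIOUS_PATTERNS)
```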
2. Conflict Resolution — Balancing Rules and Intent
Conflicts may occur between:
- System-level technical requirements (e.g., strict JSON output).
- User-defined memory items (e.g., “respond in Markdown”).
- Multiple memory items that provide contradictory guidance.
In these cases, system-level requirements take precedence.
The agent must still apply the intent of the user instruction, provided it does not violate technical constraints.
Rule of Thumb: Follow technical constraints strictly, but preserve user intent where possible.
3. Context Overflow — Managing Input Size and Relevance
Large or redundant memory increases input size and can push relevant context out of the model’s attention window.
To address this, we implemented:
- Relevance scoring to rank memory items.
- Dynamic selection to include only the most pertinent knowledge before each invocation.
This ensures that the model stays focused and efficient, even as memory grows over time.
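A sketch of that selection step might look like the following, assuming each Auto item arrives with a similarity score and using a rough characters-per-token estimate (both are simplifying assumptions, not the production logic):

```python
def fit_to_budget(scored_items: list[tuple[float, str]],
                  max_tokens: int = 800,
                  min_score: float = 0.35) -> list[str]:
    """Greedily keep the highest-scoring items until the token budget is spent."""
    selected, used = [], 0
    for score, content in sorted(scored_items, reverse=True):
        if score < min_score:
            break  # remaining items are below the relevance threshold
        est_tokens = len(content) // 4  # crude chars-per-token estimate
        if used + est_tokens > max_tokens:
            continue  # skip items that would overflow the budget
        selected.append(content)
        used += est_tokens
    return selected
```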
By managing these risks systematically, AI Memory can safely extend semantic models without compromising reliability, performance, or governance. The next section illustrates how these safeguards play out in practice.
Example: Applying AI Memory
To see these mechanisms in action, let’s walk through a simple example showing how AI Memory influences query generation and interpretation.
Memory items
- Always refer to 'GMV' as 'Gross Merchandise Value (GMV)' in all outputs.
- When dealing with corridor analysis, exclude 'test corridors'.
User query
“What was GMV per corridor in the last quarter?”
System behavior
When this query is received:
- The Auto memory item about corridors is matched by similarity search.
- Both relevant memory items (GMV definition and corridor exclusion) are inserted into the prompt.
- The agent then generates the output using these contextual rules automatically.
As a result, the output reflects both business rules without the user needing to restate them.
Resulting Query (Simplified):

```sql
SELECT corridor, SUM(gross_merchandise_value)
FROM sales
WHERE corridor NOT LIKE '%test%'
  AND quarter = 'Q3'
GROUP BY corridor;
```
This example highlights how AI Memory ensures data queries remain technically valid while staying aligned with organizational semantics and business rules.
Summary
AI Memory extends the semantic model with long-term organizational knowledge. It ensures that analytics agents:
- Understand company-specific terminology and rules.
- Adapt to user feedback over time.
- Stay consistent across use cases.
- Resolve instruction conflicts predictably.
By layering persistent business context on top of the semantic model, teams can refine agent behavior continuously, without retraining or redeploying models.
In short, AI Memory transforms static data understanding into living organizational intelligence, enabling AI agents to think and operate in your company’s own language.
At GoodData, we designed AI Memory to help enterprises turn their analytics layer into a living knowledge system: one that learns, adapts, and grows with the business. Talk to our team to discuss how it fits your data strategy.