LTM vs LLM: The Essential Guide for Financial Services Leaders

Here’s a clear explanation, grounded for financial services practitioners, of what an LTM (Large Tabular Model) is and how it differs from the development path of LLMs for tables and Gen‑AI so far.
What is an LTM?
An LTM (Large Tabular Model) is a type of AI foundation model designed specifically for structured data, such as:
- spreadsheets
- databases
- enterprise tables (classifications, cash flows, transactions, positions, strategy tags, business rules)
LTMs are built to natively understand, model, and generate tabular data, including heterogeneous column types, missing values, metadata, and cross‑dataset patterns.
Key idea: Unlike models that treat tables as text, LTMs are trained directly on structured datasets, allowing them to reason, predict, and generate new tabular data.
Source: Fast Company describes LTMs as models that extract insights from structured tabular data rather than the unstructured text that LLMs consume.
Research such as LaTable frames LTMs as the “tabular counterpart” to text/vision foundation models.
How are LTMs different from LLMs (the path so far)?
Below is the main conceptual split:
- Type of data they are built for
| Model Type | Primary Data | Notes |
|---|---|---|
| LLM (Large Language Model) | Unstructured text | Workarounds exist to serialize tables into text, but LLMs are not natively optimized for them. |
| LTM (Large Tabular Model) | Structured tabular data | Rows, columns, categorical fields, numerical features, metadata. |
LTMs directly serve the ~80% of enterprise data locked in spreadsheets, databases, and logs.
- How they understand tables
LLMs so far (pre‑LTM):
- Treat tables as text (e.g., Markdown or CSV serialization).
- Suffer from:
- loss of structural information
- sensitivity to row/column order
- difficulty with large tables that exceed context limits
- poor structural comprehension (e.g., cell lookups, hierarchical headers)
These weaknesses are documented in Microsoft’s benchmark study showing LLMs struggle with core structural table tasks.
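To make the order-sensitivity point concrete, here is a minimal Python sketch (toy data, illustrative only): the same logical table serialized to Markdown in two different row orders yields two different strings, so a text model sees two different inputs even though the table is unchanged.

```python
# Toy example: serializing a table to text is fragile because the
# resulting token sequence depends on row order.
rows = [("AAPL", "Equity", 150.0), ("T-Bill", "Cash", 99.8)]

def to_markdown(table):
    """Serialize (ticker, class, price) rows to a Markdown table."""
    lines = ["| Ticker | Class | Price |", "|---|---|---|"]
    lines += [f"| {t} | {c} | {p} |" for t, c, p in table]
    return "\n".join(lines)

original = to_markdown(rows)
reordered = to_markdown(list(reversed(rows)))

# Semantically the same table, but a different string for an LLM:
print(original == reordered)  # False
```

An LTM sidesteps this by modelling the set of rows and the column schema directly, rather than a particular linearization of them.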
LTMs:
- Natively model table structure, metadata, and heterogeneous feature spaces.
- Can reason over mixed datatypes, missing fields, domain conventions.
- Are being developed to exhibit foundation‑model‑style scaling laws for tabular data.
- What they enable
LLMs for tables have mostly been limited to:
- Querying tables (natural‑language → SQL)
- Basic summarization
- Light reasoning when tables are small enough
- Spreadsheet automation
LTMs aim to support:
- Cross‑dataset generalization (training across many unrelated tables)
- Generative tabular modelling: creating synthetic datasets for privacy, simulation, or augmentation
- Better prediction models using tabular features
- Few‑shot / zero‑shot table tasks, similar to how LLMs behave in text domains
They could become the “foundation model” for structured enterprise data.
- Difference in training challenges
LLMs:
- Train on massive unstructured text corpora.
- Benefit from clear ordering (tokens, sentences).
LTMs:
- Training is much harder, because:
- Features differ across datasets.
- No fixed order of columns or rows.
- Mixture of numerical, categorical, sparse, missing values.
- Need to integrate metadata, schemas, and domain conventions.
(All noted in LaTable and LTM research.)
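A toy illustration of the schema problem (both datasets below are made up): two tables describing similar instruments may share no columns at all, so there is no fixed feature order or vocabulary an LTM can assume, the way an LLM assumes a single token sequence.

```python
# Two hypothetical tabular datasets with overlapping meaning but
# disjoint schemas, mixed types, and a missing value.
dataset_a = [{"ticker": "AAPL", "px": 150.0, "sector": "Tech"}]
dataset_b = [{"isin": "US0378331005", "price": None, "region": "US"}]

schema_a = set(dataset_a[0])
schema_b = set(dataset_b[0])

# No shared columns to align on - cross-dataset training must learn
# that "px" and "price" play the same role from metadata and context.
print(schema_a & schema_b)  # set()
```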
- How this relates to Gen‑AI
Gen‑AI to date (LLMs, multimodal models):
- Excel at text, images, audio, code.
- Poor at structured data generation.
LTMs bring Gen‑AI capabilities to tabular data, enabling:
- Synthetic dataset generation
- Data augmentation
- Automated modelling
- Advanced reasoning over enterprise data
- Agents that act on spreadsheets and databases
This is seen as a major upcoming leap for enterprise AI.
In short
- LLM → handles unstructured text.
- LTM → handles structured tables.
LLMs were adapted to tables after the fact, with mixed success.
LTMs are being built from the ground up for tabular intelligence, predicted to unlock the majority of enterprise data that LLMs cannot effectively use.
See the infin8 blog on Cache Augmented Generation and Context Augmented Generation: “Why Context Augmented Generation (CAG) Is Becoming Essential for Financial Services AI” (AI infin8).
Below are clear, practical places where these techniques make agentic AI noticeably more useful in investment research and performance commentary—plus the guardrails you’ll likely need.
Practical implications of what “Agentic AI” means (in plain speak)
Let your agents remember facts, experiences, and rules across sessions—not just what’s in the current prompt. In agentic orchestration, that memory is consulted and updated as different “specialist” agents (data ingestion, analyst, risk, compliance, editor) hand off work. Modern frameworks distinguish semantic (facts), episodic (past interactions/results), and procedural (how to behave/style rules) memories, and provide primitives to store, retrieve, update, and forget them.
How LTM & CAG Strengthen an Agentic Workflow for Shadow AI (Cash Recs & Corporate Actions)
- Creates consistency over time:
- LTM stores validated reconciliation rules, exception patterns, and corporate action workflows so the AI behaves predictably.
- Reduces re‑explanation by users, boosting reliability.
- Improves accuracy in Cash Reconciliation:
- Learns recurring breaks, root‑cause patterns, and frequently used matching logic.
- CAG allows the AI to break reconciliation tasks into steps (ingest → classify → match → flag exceptions → propose resolutions).
- Enhances Corporate Actions processing:
- Remembers event‑specific nuances (elections, deadlines, tax treatments).
- CAG enables multi‑step reasoning: event detection → eligibility → instruction preparation → downstream booking.
- Builds trust gradually through controlled autonomy:
- Shadow AI operates silently at first (observe → propose-only → partial automation → full automation).
- LTM captures what human reviewers approve vs reject, improving decision alignment.
- Reduces operational risk:
- Memory reinforces correct interpretations of SWIFT/ISO messages and ledger codes.
- Agentic chains ensure auditability — every step is logged and reproducible.
- Improves explainability:
- LTM retains context for “why this match or instruction was suggested.”
- CAG provides step-by-step rationale, increasing transparency.
- Accelerates onboarding & scalability:
- Institutional knowledge becomes encoded, reducing reliance on individuals.
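As a sketch of the “observe → propose-only” stage of the cash reconciliation flow above, the matching step could look like the following. Field names and the match key are assumptions, not a real ledger schema; nothing is auto-booked, and unmatched items are surfaced as exceptions for a human reviewer.

```python
from dataclasses import dataclass

# Illustrative cash-reconciliation matcher (assumed schema): match
# ledger vs custodian entries on (reference, amount) and flag the rest.

@dataclass(frozen=True)
class CashEntry:
    reference: str
    amount: float

def reconcile(ledger, custodian):
    """Return (matched, exceptions). Propose-only: exceptions go to a
    human reviewer; nothing is booked automatically."""
    cust_keys = {(e.reference, round(e.amount, 2)) for e in custodian}
    matched = [e for e in ledger
               if (e.reference, round(e.amount, 2)) in cust_keys]
    exceptions = [e for e in ledger
                  if (e.reference, round(e.amount, 2)) not in cust_keys]
    return matched, exceptions

ledger = [CashEntry("SWIFT123", 1000.00), CashEntry("SWIFT124", 250.10)]
custodian = [CashEntry("SWIFT123", 1000.00)]
matched, exceptions = reconcile(ledger, custodian)
print(len(matched), len(exceptions))  # 1 1
```

In the agentic version, the LTM would supply learned matching rules and recurring break patterns, and reviewer approvals/rejections would be written back to memory.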
Investment research — high‑impact use cases
- Company/issuer dossiers that evolve with your thesis
Store management guideposts, catalysts, model assumptions, valuation snapshots, and “what changed” since the prior note. Agents use this dossier to (a) prep for calls, (b) highlight deltas vs. prior views, and (c) avoid repeating work. Practical win: faster initiation updates and fewer missed catalysts.
- Broker research entitlements & (re)bundling compliance
Keep a memory of who can read what (provider, sector, license), how it’s paid (P&L, RPA, or UK joint payment option), and the audit trail of consumption → cost allocations. Agents can auto‑route requests and block access if entitlements or disclosures aren’t in place. (Useful as the FCA’s “payment optionality” brought back a form of bundling under guardrails.)
- Source hygiene: deduplication, traceability, and retrieval choices
Persist source fingerprints (URL/DOI/provider) so agents cite consistently, avoid duplicates, and can re‑pull the exact source later. LTM complements RAG: use memory for interaction/state (what we decided, our house facts), and RAG for volatile external knowledge. This separation reduces hallucinations and keeps narratives grounded.
- IR/management engagement memory
Record Q&A themes, management’s stated milestones, follow‑ups, and access notes. The next “prep packet” agent pulls this history to propose sharp, non‑repetitive questions and check whether commitments were met. (Multi‑session chat and unbounded context memory are core strengths of agentic memory systems such as MemGPT.)
- Repeatable fundamental/ESG screens with exception memory
Agents remember screen definitions, false‑positive patterns, and prior overrides with justifications—so the universe narrows accurately over time and exceptions are consistent. This fits well with agentic workflows already piloted in finance.
- Backtest provenance & “p‑hacking” control
Persist which datasets, factors, lookbacks, and costs were used in each experiment, plus critiques from risk/compliance agents. This memory helps reproduce (or challenge) results later and flags repeated parameter fishing. (Memory frameworks emphasize consolidation, updating, and forgetting to manage this over time.)
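The “source fingerprints” idea can be sketched as follows. The field choices and normalization rules are assumptions for illustration: hash normalized source identifiers so agents recognize duplicates and cite the same source consistently across sessions.

```python
import hashlib

# Sketch of source hygiene: a stable fingerprint per source so agents
# deduplicate and re-cite consistently. Fields/normalization are assumed.

def fingerprint(url: str, doi: str = "", provider: str = "") -> str:
    """Stable fingerprint over normalized source identifiers."""
    key = "|".join(part.strip().lower() for part in (url, doi, provider))
    return hashlib.sha256(key.encode()).hexdigest()[:16]

seen = set()
sources = [
    {"url": "https://example.com/report", "doi": "10.1000/xyz"},
    {"url": "HTTPS://EXAMPLE.COM/REPORT ", "doi": "10.1000/xyz"},  # duplicate
]
unique = []
for s in sources:
    fp = fingerprint(s["url"], s.get("doi", ""))
    if fp not in seen:
        seen.add(fp)
        unique.append(s)
print(len(unique))  # 1
```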
Investment Performance commentary — high‑impact use cases
- House style, lexicon, and jurisdictional disclaimers
Agents keep a living memory of your tone guide, banned/promissory phrasing, and required disclosures by audience/region (e.g., SEC Marketing Rule requirements and disclosure expectations). Commentary drafts are automatically conformed before compliance review.
- Composite‑aware narratives that match reported numbers
LTM holds composite definitions, key exposures, and recurring drivers so commentary aligns with what’s actually reported under the GIPS Standards (fair representation, full disclosure). This reduces rework when marketing/performance teams integrate text and tables.
- PRIIPs KID alignment for retail share classes
For UCITS/AIFs in scope, agents remember the current KID inputs/methods and latest ESA Q&A clarifications (e.g., scenario methodology and MRM nuances). Commentary can reference or explain scenario changes without drifting from the official KID.
- Hypothetical/attribution guardrails (US marketing)
Where materials include model, extracted, or hypothetical performance, LTM stores which categories apply, what net/gross treatments are required, and the approved disclosures, so agents don’t output non‑compliant claims. (SEC’s amended rule broadened what’s allowed but tightened conditions and recordkeeping.)
- Quarter‑over‑quarter continuity
Agents keep a short “facts pack” of prior period narratives (top detractors/contributors, macro framing) to produce consistent “what changed and why” updates—improving readability without re‑explaining the franchise each month. (Short‑ vs. long‑term memory separation is explicitly supported in modern agent frameworks.)
- Approval workflow and audit trail
Store who edited/approved what and when, with reasons for material changes. That reduces scramble during regulator or verifier reviews and aligns with the SEC’s updated books‑and‑records requirements for marketing.
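The banned/promissory-phrase guardrail could be encoded in procedural memory roughly like this. The phrase lists and regions below are invented for illustration; a real rulebook would come from your compliance team.

```python
# Illustrative procedural-memory check: scan a draft commentary for
# promissory language before compliance review. Rules are made up.
BANNED = {
    "US": ["guarantee", "risk-free"],
    "UK": ["guarantee"],
}

def flag_phrases(draft: str, region: str):
    """Return the banned phrases (for the region) found in the draft."""
    text = draft.lower()
    return [p for p in BANNED.get(region, []) if p in text]

draft = "We guarantee consistent alpha across market cycles."
print(flag_phrases(draft, "US"))  # ['guarantee']
```

A Compliance Agent would run this pass (plus the approved-disclosure checks) before anything is published, and log the result for the audit trail.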
Orchestrating the agents with LTM (how to wire it simply)
- Roles & what they read/write to memory
- Data Ingestor: whitelists, entity maps, licensing/entitlements.
- Analyst: issuer dossiers, thesis deltas, open tasks.
- Risk: limit/threshold policies, flagged model risks.
- Compliance: regional rule snippets (SEC/GIPS/PRIIPs), phrasing guardrails.
- Editor: style guide, house lexicon, audience presets.
Orchestrators (e.g., LangGraph) persist state across steps/threads and provide LTM stores for agents to query/update.
- Memory types to implement on day 1
- Semantic: facts (e.g., “Fund A is composite X; UK retail requires Y disclosure”).
- Episodic: past calls/drafts/reviews (“Compliance asked us to avoid ‘guarantee’ last quarter”).
- Procedural: behavior/style rules (“Use net performance first; cite sources inline”).
- Tooling you can choose today
- LangGraph + LangMem for built‑in short/long‑term memory primitives and background consolidation.
- MemGPT/Letta to give agents OS‑like virtual context management for long, multi‑session research threads.
- Mem0 if you want a production‑focused memory layer with latency and token‑cost savings vs. full‑context approaches.
- AWS AgentCore Memory if you prefer managed persistence integrated with LangGraph.
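The semantic/episodic/procedural split can be sketched as a minimal in-process store. A real deployment would use the tooling above (LangMem, Mem0, AgentCore); this only shows the read/write shape the specialist agents would share.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Minimal sketch of a shared memory store with the three memory types.
# Not a real framework API - just the interface shape agents would use.

class MemoryStore:
    KINDS = ("semantic", "episodic", "procedural")

    def __init__(self):
        self._store = defaultdict(list)

    def write(self, kind: str, content: str):
        assert kind in self.KINDS, f"unknown memory type: {kind}"
        self._store[kind].append(
            {"content": content,
             "at": datetime.now(timezone.utc).isoformat()}
        )

    def read(self, kind: str):
        return [m["content"] for m in self._store[kind]]

mem = MemoryStore()
mem.write("semantic", "Fund A is composite X; UK retail requires Y disclosure")
mem.write("procedural", "Use net performance first; cite sources inline")
print(mem.read("semantic"))
```

Each agent (Analyst, Risk, Compliance, Editor) reads the types relevant to its role and writes back what it learned, which is what gives the hand-offs continuity.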
What improves (and how you’ll measure it)
- Faster first drafts of research notes and commentaries; fewer compliance redlines (track turnaround time and redline counts). Agentic AI in finance shows gains when tasks become multi‑agent and memory‑aware.
- Better continuity across months/quarters (track reuse of prior facts/themes without “copy‑paste drift”).
- Lower run‑costs vs. “stuff the whole history into the prompt”: structured LTM reduces tokens and latency.
Risks & controls (keep it simple but strict)
- Memory bloat & staleness → schedule summarization, consolidation, and forgetting; index with time and source reliability. (This is standard in modern agent memory taxonomies.)
- Over‑reliance on memory for facts → use RAG for external/volatile data, LTM for internal state/preferences.
- Regulatory mis‑statements in commentary → encode rule snippets + approved templates in procedural memory; require a Compliance Agent pass before publishing (SEC Marketing Rule/GIPS expectations).
- Auditability → persist citations and change logs; this aligns with the SEC’s recordkeeping updates tied to marketing.
Quick start blueprint (2–3 weeks)
- Define a minimal memory schema
- Issuer dossier: {ticker, thesis, catalysts, last‑changed, key sources}
- Style/compliance: {region → required disclaimers, banned phrases, examples}
- Composite facts: {name, definition, dispersion policy, benchmark, links to latest report}
Types = semantic (facts), episodic (drafts/approvals), procedural (style/rules).
- Stand up orchestration
Use LangGraph (threads + checkpoints) + a memory store (LangMem/Mem0). Start with two agents: Analyst and Compliance. Add Editor once drafts flow.
- Seed memory
Import last 6–12 months of notes/commentary and your current disclaimers/templates; tag by composite, audience, and region (US/UK/EU). Map KID‑relevant funds to their latest Q&A method choices.
- Pilot two workflows
- Earnings‑update memo (dossier continuity + source hygiene).
- Monthly commentary (style/compliance + composite alignment).
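The minimal memory schema from the first step could be written as typed records, for example as below. Field names mirror the bullets; the defaults and example values are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Sketch of the minimal day-1 memory schema as typed records.

@dataclass
class IssuerDossier:
    ticker: str
    thesis: str
    catalysts: list = field(default_factory=list)
    last_changed: str = ""          # ISO date of last thesis change
    key_sources: list = field(default_factory=list)

@dataclass
class CompositeFacts:
    name: str
    definition: str
    dispersion_policy: str
    benchmark: str
    latest_report_link: str = ""

dossier = IssuerDossier(
    ticker="AAPL",
    thesis="Services growth re-rating",   # hypothetical example values
    catalysts=["WWDC"],
    last_changed="2025-06-01",
)
print(dossier.ticker)
```

Starting with explicit record types keeps the memory auditable and makes the later consolidation/forgetting jobs much easier to write.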

The infin8 Opinion
Sources: fastcompany.com, arxiv.org, gipsstandards.org, sec.gov, esma.europa.eu, docs.langchain.com, blog.langchain.com, labelstud.io, mckinsey.com, docs.aws.amazon.com, research.memgpt.ai, reged.com, bsp.lu, ojs.aaai.org, fca.org.uk, simmons-simmons.com, microsoft.com
