Content Blocks
LLM messages aren't strings anymore. Modern models accept text, images, audio, tool calls, and tool results — often all in the same message. llm-ports represents this as a discriminated union of ContentBlock types.
import type { ContentBlock, MessageContent, LLMMessage } from "@llm-ports/core";
The five block types
type ContentBlock =
  | TextBlock        // { type: "text", text: string }
  | ImageBlock       // { type: "image", source: { kind: "base64"|"url", ... } }
  | AudioBlock       // { type: "audio", source: { kind: "base64", mediaType, data } }
  | ToolUseBlock     // { type: "tool_use", id, name, input }
  | ToolResultBlock; // { type: "tool_result", toolUseId, content, isError? }
MessageContent: string OR blocks
MessageContent = string | ContentBlock[]. A bare string is sugar for [{ type: "text", text: "..." }]. Adapters accept either form and normalize internally.
// These are equivalent:
const a: LLMMessage = { role: "user", content: "hello" };
const b: LLMMessage = { role: "user", content: [{ type: "text", text: "hello" }] };
Use a string when content is text-only. Use the array form when you need multimodal content or need to mix text with tool calls.
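Because every block carries a literal type field, TypeScript can narrow the union in a switch. The sketch below shows that pattern together with the string-to-array equivalence. The type definitions are local stand-ins inferred from the comments on this page (the real exports from @llm-ports/core may carry more fields, and tool_result content is assumed to be a string here); describeBlock and describeContent are hypothetical helpers, not part of the library.

```typescript
// Local stand-ins for the @llm-ports/core types (shapes assumed from this page).
type ImageBlockSource =
  | { kind: "base64"; mediaType: string; data: string }
  | { kind: "url"; url: string };
type TextBlock = { type: "text"; text: string };
type ImageBlock = { type: "image"; source: ImageBlockSource };
type AudioBlock = {
  type: "audio";
  source: { kind: "base64"; mediaType: string; data: string };
};
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResultBlock = {
  type: "tool_result";
  toolUseId: string;
  content: string; // assumption: string payload, as in the examples on this page
  isError?: boolean;
};
type ContentBlock = TextBlock | ImageBlock | AudioBlock | ToolUseBlock | ToolResultBlock;
type MessageContent = string | ContentBlock[];

// Hypothetical helper: a one-line summary of a block, e.g. for logging.
// The switch is exhaustive, so the compiler flags any block type added later.
function describeBlock(block: ContentBlock): string {
  switch (block.type) {
    case "text":
      return `text(${block.text.length} chars)`;
    case "image":
      return block.source.kind === "url"
        ? "image(url)"
        : `image(${block.source.mediaType})`;
    case "audio":
      return `audio(${block.source.mediaType})`;
    case "tool_use":
      return `tool_use(${block.name})`;
    case "tool_result":
      return `tool_result(${block.toolUseId})`;
  }
}

// Accepts either form of MessageContent and treats a bare string as one text block.
function describeContent(content: MessageContent): string[] {
  const blocks: ContentBlock[] =
    typeof content === "string" ? [{ type: "text", text: content }] : content;
  return blocks.map(describeBlock);
}
```

Under these assumptions, describeContent("hello") and describeContent([{ type: "text", text: "hello" }]) produce the same summary, which is the equivalence described above.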
Multimodal: images and audio
// Image (URL): the most common form
const message: LLMMessage = {
  role: "user",
  content: [
    { type: "text", text: "Describe this image:" },
    { type: "image", source: { kind: "url", url: "https://example.com/cat.jpg" } },
  ],
};
// Image (base64): when you have local data
const message2: LLMMessage = {
  role: "user",
  content: [
    { type: "text", text: "What's in this screenshot?" },
    {
      type: "image",
      source: {
        kind: "base64",
        mediaType: "image/png",
        data: base64EncodedImage,
      },
    },
  ],
};
// Audio (base64; URL audio not supported by current providers)
const message3: LLMMessage = {
  role: "user",
  content: [
    { type: "text", text: "Transcribe this audio:" },
    {
      type: "audio",
      source: { kind: "base64", mediaType: "audio/mp3", data: base64EncodedAudio },
    },
  ],
};
Adapter capability differences
Not all adapters support all block types. The adapter throws ContentBlockUnsupportedError if you send a block it can't handle.
| Block | Anthropic | OpenAI | Ollama | Vercel |
|---|---|---|---|---|
| text | ✓ | ✓ | ✓ | ✓ |
| image (base64) | ✓ | ✓ (data URI) | ✓ | partial (via SDK) |
| image (URL) | ✓ | ✓ | ✗ (Ollama doesn't fetch URLs) | partial |
| audio | ✗ (Anthropic chat doesn't accept audio) | ✓ (wav, mp3 only; ogg ✗) | ✗ | ✗ |
| tool_use (assistant) | ✓ | ✓ (as tool_calls) | ✓ | partial (single-turn in v0.1) |
| tool_result (user→tool message) | ✓ | ✓ (separate role: tool message) | ✓ | partial |
See the adapter feature matrix for the full breakdown.
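If you would rather detect an unsupported block before the adapter throws, a pre-flight check along these lines is one option. This is a sketch only: the SUPPORT map is transcribed by hand from the table above, and kindOf and unsupportedKinds are hypothetical helpers that do not exist in @llm-ports/core.

```typescript
type BlockKind =
  | "text" | "image_base64" | "image_url" | "audio" | "tool_use" | "tool_result";
type AdapterName = "anthropic" | "openai" | "ollama" | "vercel";
type Support = "yes" | "partial" | "no";

// Hand-copied from the capability table; keep it in sync with the real matrix.
const SUPPORT: Record<AdapterName, Record<BlockKind, Support>> = {
  anthropic: { text: "yes", image_base64: "yes", image_url: "yes", audio: "no", tool_use: "yes", tool_result: "yes" },
  openai: { text: "yes", image_base64: "yes", image_url: "yes", audio: "yes", tool_use: "yes", tool_result: "yes" },
  ollama: { text: "yes", image_base64: "yes", image_url: "no", audio: "no", tool_use: "yes", tool_result: "yes" },
  vercel: { text: "yes", image_base64: "partial", image_url: "partial", audio: "no", tool_use: "partial", tool_result: "partial" },
};

// Minimal block shape: only the fields this check needs to read.
type CheckableBlock =
  | { type: "text" | "audio" | "tool_use" | "tool_result" }
  | { type: "image"; source: { kind: "base64" | "url" } };

// Maps a block to its row in the table (images split by source kind).
function kindOf(block: CheckableBlock): BlockKind {
  if (block.type === "image") {
    return block.source.kind === "url" ? "image_url" : "image_base64";
  }
  return block.type;
}

// Returns the kinds in `blocks` that the table marks as unsupported (✗).
// "partial" entries are let through; the adapter decides at call time.
function unsupportedKinds(adapter: AdapterName, blocks: CheckableBlock[]): BlockKind[] {
  return blocks.map(kindOf).filter((kind) => SUPPORT[adapter][kind] === "no");
}
```

A caller could use the result to fall back to another adapter, or to download a URL image and resend it as base64 when targeting Ollama.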
Tool blocks: where the magic happens
When the model calls a tool, the assistant message contains a tool_use block. When you respond with the tool's output, you send a tool_result block back. This dance enables agent loops.
// Step 1: User asks the agent to do something
let conversation: LLMMessage[] = [
  { role: "user", content: "Search the inbox for invoices from Acme" },
];
// Step 2: Agent's response (assistant message), which includes a tool_use block
conversation.push({
  role: "assistant",
  content: [
    { type: "text", text: "Let me search." },
    {
      type: "tool_use",
      id: "toolu_01",
      name: "searchEmails",
      input: { query: "from:acme.com invoice" },
    },
  ],
});
// Step 3: You execute the tool and append the result
conversation.push({
  role: "user",
  content: [
    {
      type: "tool_result",
      toolUseId: "toolu_01",
      content: JSON.stringify({ found: 3, ids: [...] }),
    },
  ],
});
// Step 4: Send the conversation back to the agent for the next turn
const next = await llm.runAgent({ ... });
In practice, runAgent on the LLM port handles this loop for you in v0.1: declare the tools, the agent calls them, and the multi-turn message accumulation happens inside the adapter. A higher-level createAgent capability factory (matching the ergonomics of createClassifier / createDrafter) ships in v0.2. Until then, use llm.runAgent({ ... }) directly.
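For readers who want to see what Step 3 involves when done by hand, here is a minimal sketch of turning an assistant message's tool_use blocks into a user message of tool_result blocks. It is not how runAgent is implemented. The types are local stand-ins for the library's, and runToolCalls and ToolHandler are hypothetical names introduced for illustration.

```typescript
// Local stand-ins for the library types (shapes assumed from this page).
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResultBlock = {
  type: "tool_result";
  toolUseId: string;
  content: string;
  isError?: boolean;
};
type AgentBlock = TextBlock | ToolUseBlock | ToolResultBlock;
type AgentMessage = { role: "user" | "assistant"; content: string | AgentBlock[] };

// Hypothetical: a tool is any async function from parsed input to a JSON-able value.
type ToolHandler = (input: unknown) => Promise<unknown>;

// Executes every tool_use block in an assistant message and returns the user
// message to append next, or null when the assistant requested no tools.
async function runToolCalls(
  assistant: AgentMessage,
  tools: Record<string, ToolHandler>,
): Promise<AgentMessage | null> {
  if (typeof assistant.content === "string") return null;
  const calls = assistant.content.filter(
    (block): block is ToolUseBlock => block.type === "tool_use",
  );
  if (calls.length === 0) return null;

  const results: ToolResultBlock[] = [];
  for (const call of calls) {
    try {
      const handler = tools[call.name];
      if (!handler) throw new Error(`Unknown tool: ${call.name}`);
      const output = await handler(call.input);
      // Each result echoes the id of the tool_use block it answers.
      results.push({
        type: "tool_result",
        toolUseId: call.id,
        content: JSON.stringify(output),
      });
    } catch (err) {
      // Report the failure to the model instead of aborting the whole turn.
      results.push({
        type: "tool_result",
        toolUseId: call.id,
        content: err instanceof Error ? err.message : String(err),
        isError: true,
      });
    }
  }
  return { role: "user", content: results };
}
```

Reporting failures through isError rather than throwing is a design choice in this sketch: it lets the model see the error and retry or explain, which is usually what an agent loop wants.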
Adapter normalization
Internally, each adapter converts ContentBlock[] to its provider's wire format and back. Examples:
- Anthropic: image.source = { kind: "base64", mediaType, data } → { type: "base64", media_type, data } (note the field rename)
- OpenAI: image becomes { type: "image_url", image_url: { url: "..." or "data:image/...;base64,..." } }
- Ollama: images get split into a separate images: [base64, ...] field; text-only content: string
- Vercel: passes through (Vercel's content shape is similar to ours)
Tool blocks have similar per-adapter quirks. For example, OpenAI promotes assistant tool_use blocks into a separate tool_calls field on the message; user tool_result blocks become standalone role: "tool" messages. Each adapter handles this on the way in and out.
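To make the OpenAI case concrete, the sketch below converts one message into OpenAI chat-completions messages following the rules just described: images become image_url parts, assistant tool_use blocks move into tool_calls, and user tool_result blocks become standalone role: "tool" messages. It is an illustration of the mapping and not the adapter's actual code. toOpenAI and the local types are hypothetical, and audio and error handling are left out.

```typescript
// Local stand-ins for the library types (shapes assumed from this page).
type ImageBlockSource =
  | { kind: "base64"; mediaType: string; data: string }
  | { kind: "url"; url: string };
type PortBlock =
  | { type: "text"; text: string }
  | { type: "image"; source: ImageBlockSource }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; toolUseId: string; content: string };
type PortMessage = { role: "user" | "assistant"; content: string | PortBlock[] };

// The subset of the OpenAI chat message format used here.
type OpenAIPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };
type OpenAIToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};
type OpenAIAssistant = { role: "assistant"; content: string | null; tool_calls?: OpenAIToolCall[] };
type OpenAIMessage =
  | { role: "user"; content: OpenAIPart[] }
  | OpenAIAssistant
  | { role: "tool"; tool_call_id: string; content: string };

function toOpenAI(message: PortMessage): OpenAIMessage[] {
  const blocks: PortBlock[] =
    typeof message.content === "string"
      ? [{ type: "text", text: message.content }]
      : message.content;

  if (message.role === "assistant") {
    let text = "";
    const toolCalls: OpenAIToolCall[] = [];
    for (const block of blocks) {
      if (block.type === "text") text += block.text;
      if (block.type === "tool_use") {
        // OpenAI expects the arguments as a JSON string, not an object.
        toolCalls.push({
          id: block.id,
          type: "function",
          function: { name: block.name, arguments: JSON.stringify(block.input) },
        });
      }
    }
    const out: OpenAIAssistant = { role: "assistant", content: text.length > 0 ? text : null };
    if (toolCalls.length > 0) out.tool_calls = toolCalls;
    return [out];
  }

  const messages: OpenAIMessage[] = [];
  const parts: OpenAIPart[] = [];
  for (const block of blocks) {
    if (block.type === "tool_result") {
      // Each result becomes its own role: "tool" message.
      messages.push({ role: "tool", tool_call_id: block.toolUseId, content: block.content });
    } else if (block.type === "text") {
      parts.push({ type: "text", text: block.text });
    } else if (block.type === "image") {
      const src = block.source;
      const url = src.kind === "url" ? src.url : `data:${src.mediaType};base64,${src.data}`;
      parts.push({ type: "image_url", image_url: { url } });
    }
  }
  if (parts.length > 0) messages.push({ role: "user", content: parts });
  return messages;
}
```

Tool messages are emitted before any remaining user parts because OpenAI requires tool results to directly follow the assistant message that requested them.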
Helper functions
import { toBlocks, isStringContent, tryCollapseToText, extractText } from "@llm-ports/core";
// Always get the array form
toBlocks("hello"); // [{ type: "text", text: "hello" }]
toBlocks([{ type: "text", text: "hi" }]); // (passthrough)
// Check the runtime form
isStringContent("hello"); // true
isStringContent([{ type: "text", text: "hello" }]); // false
// Collapse text-only arrays to a string (returns null if non-text blocks present)
tryCollapseToText([{ type: "text", text: "a" }, { type: "text", text: "b" }]); // "ab"
tryCollapseToText([{ type: "text", text: "a" }, { type: "image", ... }]); // null
// Strip non-text content
extractText([{ type: "text", text: "describe " }, { type: "image", ... }, { type: "text", text: "this" }]);
// "describe this"
Use these in your own code when interfacing between user-supplied content and adapter calls.
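The documented behavior of these helpers is small enough to sketch. The versions below are illustrative reimplementations that match the examples above (text is concatenated with no separator). They are not the library's source, and the Sketch suffix marks them as hypothetical. Non-text blocks are reduced to their type field because that is all these helpers need to read.

```typescript
// Minimal local shapes: a text block, plus any other block identified only by type.
type TextBlock = { type: "text"; text: string };
type NonTextBlock = { type: "image" | "audio" | "tool_use" | "tool_result" };
type SketchBlock = TextBlock | NonTextBlock;
type SketchContent = string | SketchBlock[];

// True when the content is the bare-string form.
function isStringContentSketch(content: SketchContent): content is string {
  return typeof content === "string";
}

// Always returns the array form; arrays pass through unchanged.
function toBlocksSketch(content: SketchContent): SketchBlock[] {
  return isStringContentSketch(content) ? [{ type: "text", text: content }] : content;
}

// Joins text-only content into one string; null if any non-text block is present.
function tryCollapseToTextSketch(content: SketchContent): string | null {
  let text = "";
  for (const block of toBlocksSketch(content)) {
    if (block.type !== "text") return null;
    text += block.text;
  }
  return text;
}

// Concatenates the text blocks and silently drops everything else.
function extractTextSketch(content: SketchContent): string {
  let text = "";
  for (const block of toBlocksSketch(content)) {
    if (block.type === "text") text += block.text;
  }
  return text;
}
```

The difference between the last two matters at boundaries: tryCollapseToTextSketch tells you when information would be lost, while extractTextSketch loses it on purpose, for example when building a plain-text log line.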
Reading next
- createDrafter capability: handles thread history with mixed content
- Tool-use security: when blocks carry untrusted data