LLMs
Methods on this page are called as client.llms.<method>(...) where client is either a synchronous Goodmem or asynchronous AsyncGoodmem instance initialized below:
from goodmem import Goodmem
client = Goodmem(base_url='http://localhost:8080', api_key='gm_...')

from goodmem import AsyncGoodmem
client = AsyncGoodmem(base_url='http://localhost:8080', api_key='gm_...')

Create a new LLM
Creates a new LLM configuration for text generation services. LLMs represent connections to different language model API services (like OpenAI, vLLM, etc.) and include all the necessary configuration to use them for text generation.
DUPLICATE DETECTION: Returns HTTP 409 Conflict (ALREADY_EXISTS) if another LLM exists with identical {owner_id, provider_type, endpoint_url, api_path, model_identifier, credentials_fingerprint} after URL canonicalization. Uniqueness is enforced per-owner. Credentials are hashed (SHA-256) for uniqueness while remaining encrypted. The api_path field defaults to '/chat/completions' if omitted. Requires CREATE_LLM_OWN permission (or CREATE_LLM_ANY for admin users).
- display_name (str) — User-facing name of the LLM
- model_identifier (str) — When a known model, auto-fills provider_type, endpoint_url, max_context_length, and supported_modalities.
- api_key (str, optional) — A convenience shorthand for credentials. Converts a plain API key string to the full EndpointAuthentication structure (i.e. {"kind": "CREDENTIAL_KIND_API_KEY", "api_key": {"inline_secret": "sk-..."}}). At most one of api_key and credentials may be provided.
- api_path (str, optional, server default='/chat/completions') — API path for the chat/completions request (defaults to /chat/completions if not provided).
- capabilities (LLMCapabilities, optional) — LLM capabilities defining supported features and modes. The server infers capabilities from the model identifier if not provided.
- client_config (dict[str, Any], optional) — Provider-specific client configuration as a flexible JSON structure
- credentials (EndpointAuthentication, optional) — Structured credential payload describing how to authenticate with the provider. Can also be set via the convenience shorthand api_key. At most one of api_key and credentials may be provided.
- default_sampling_params (LLMSamplingParams, optional) — Default sampling parameters for generation requests
- description (str, optional) — Description of the LLM
- endpoint_url (str, optional) — Base URL for the LLM endpoint (OpenAI-compatible base, typically ends with /v1). Auto-inferred from provider_type for known providers; required when model_identifier is not in the registry.
- labels (dict[str, str], optional) — User-defined labels for categorization
- llm_id (str, optional) — Optional client-provided UUID for idempotent creation. If not provided, the server generates a new UUID. Returns ALREADY_EXISTS if the ID is already in use.
- max_context_length (int, optional) — Maximum context window size in tokens. Auto-inferred from model_identifier for known models; recommended when model_identifier is not in the registry.
- monitoring_endpoint (str, optional) — Monitoring endpoint URL
- owner_id (str, optional) — Optional owner ID. If not provided, derived from the authentication context. Requires CREATE_LLM_ANY permission if specified.
- provider_type (LLMProviderType, optional) — Provider backend, one of "OPENAI", "LITELLM_PROXY", "OPEN_ROUTER", "VLLM", "OLLAMA", "LLAMA_CPP", "CUSTOM_OPENAI_COMPATIBLE". Use "CUSTOM_OPENAI_COMPATIBLE" for third-party OpenAI-compatible endpoints such as Anthropic, Google Gemini, or Mistral. Auto-inferred from model_identifier for known models; required when model_identifier is not in the registry.
- supported_modalities (list[Modality], optional, server default=['TEXT']) — Modalities supported by this LLM (e.g. ["TEXT"]). Auto-inferred from model_identifier for known models; defaults to ["TEXT"] on the server if omitted.
- version (str, optional) — Version information
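When model_identifier is not in the model registry, provider_type and endpoint_url cannot be inferred and must be supplied explicitly (max_context_length is recommended as well). A sketch of such a request for a hypothetical self-hosted vLLM server; every value below is a placeholder:

```python
# Placeholder request for a model the registry does not know about.
request = {
    "display_name": "Local vLLM",
    "model_identifier": "my-org/my-model",       # unknown to the registry
    "provider_type": "VLLM",                     # required: cannot be inferred
    "endpoint_url": "http://localhost:8000/v1",  # OpenAI-compatible base, ends with /v1
    "max_context_length": 32768,                 # recommended for unknown models
    "api_key": "token-...",                      # shorthand for credentials
}

# With a configured client (as initialized at the top of this page):
# resp = client.llms.create(**request)
# llm = resp.llm
```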
resp = client.llms.create(
display_name="Doc LLM",
model_identifier="gpt-4o-mini",
api_key="sk-...",
labels={"env": "docs"},
)
llm = resp.llm

Get an LLM by ID
Retrieves the details of a specific LLM configuration by its unique identifier. Requires READ_LLM_OWN permission for LLMs you own (or READ_LLM_ANY for admin users to view any user's LLMs). This is a read-only operation with no side effects.
- id (str) — The unique identifier of the LLM to retrieve
LLMResponse — Returns the LLM configuration.
llm = client.llms.get(id="your-llm-id")
print(llm.display_name)

List LLMs
Retrieves a list of LLM configurations accessible to the caller, with optional filtering.
LABEL FILTERS: Label filters accept either label.<key>=<value> or label[key]=value (for example, label.environment=production or label[environment]=production). PERMISSION-BASED FILTERING: With LIST_LLM_OWN permission, you can only see your own LLMs (owner_id filter is ignored if set to another user). With LIST_LLM_ANY permission, you can see all LLMs or filter by any owner_id. This is a read-only operation with no side effects.
- label (dict[str, str], optional) — Filter by label key-value pairs. Label filters accept either label.<key>=<value> or label[key]=value (for example, label.environment=production or label[environment]=production).
- owner_id (str, optional) — Filter LLMs by owner ID. With LIST_LLM_ANY permission, omitting this shows all accessible LLMs; providing it filters by that owner. With LIST_LLM_OWN permission, only your own LLMs are shown regardless of this parameter.
- provider_type (LLMProviderType, optional) — Filter LLMs by provider type. Allowed values match the LLMProviderType schema.
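The two query-string forms accepted by the REST API encode the same key/value pair; the Python SDK takes a plain dict for the label parameter. A small sketch of the correspondence:

```python
# The SDK's label parameter is a plain dict.
label_filter = {"environment": "production"}

# Equivalent REST query-string encodings of the same filter:
dot_form = [f"label.{k}={v}" for k, v in label_filter.items()]
bracket_form = [f"label[{k}]={v}" for k, v in label_filter.items()]

# With a configured client, the filters combine (logical AND):
# for llm in client.llms.list(label=label_filter, provider_type="VLLM"):
#     print(llm.llm_id, llm.display_name)
```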
list[LLMResponse]

for llm in client.llms.list():
    print(llm.llm_id, llm.display_name)

Update an LLM
Updates an existing LLM configuration including display information, endpoint configuration, model parameters, credentials, and labels. All fields are optional - only specified fields will be updated.
IMPORTANT: provider_type is IMMUTABLE after creation and cannot be changed. Requires UPDATE_LLM_OWN permission for LLMs you own (or UPDATE_LLM_ANY for admin users).
- id (str) — The unique identifier of the resource to update.
- request (LLMUpdateRequest | dict) — The update payload. Accepts a LLMUpdateRequest instance or a plain dict with the same fields. Only specified fields will be modified.
LLMResponse — Returns the LLM configuration.
from goodmem.types import LLMUpdateRequest
# Option 1: typed request object
updated = client.llms.update(id="your-llm-id", request=LLMUpdateRequest(
display_name="Doc LLM (updated)",
merge_labels={"version": "2"},
))
assert updated.llm_id == "your-llm-id"
# Option 2: plain dict (validated via pydantic)
updated = client.llms.update(id="your-llm-id", request={
"display_name": "Doc LLM (updated)",
"merge_labels": {"version": "2"},
})
assert updated.llm_id == "your-llm-id"

Delete an LLM
Permanently deletes an LLM configuration. This operation cannot be undone; it removes the LLM record and securely deletes stored credentials.
IMPORTANT: This does NOT invalidate or delete any previously generated content using this LLM - existing generations remain accessible. Requires DELETE_LLM_OWN permission for LLMs you own (or DELETE_LLM_ANY for admin users).
- id (str) — The unique identifier of the LLM to delete

None

client.llms.delete(id="your-llm-id")

Async usage: client.llms exposes the same methods on AsyncGoodmem; use await / async for as needed.
Data Models
All data models are pydantic v2 models. Fields are shown with their Python attribute names; JSON responses use camelCase aliases (e.g., owner_id → ownerId).
LLMProviderType
String enum: "OPENAI" · "LITELLM_PROXY" · "OPEN_ROUTER" · "VLLM" · "OLLAMA" · "LLAMA_CPP" · "CUSTOM_OPENAI_COMPATIBLE"
LLMCapabilities
Capabilities and features supported by an LLM service
- supports_chat (bool, optional) — Supports conversational/chat completion format with message roles
- supports_completion (bool, optional) — Supports raw text completion with prompt continuation
- supports_function_calling (bool, optional) — Supports function/tool calling with structured responses
- supports_system_messages (bool, optional) — Supports system prompts to define model behavior and context
- supports_streaming (bool, optional) — Supports real-time token streaming during generation
- supports_sampling_parameters (bool, optional) — Supports sampling parameters like temperature, top_p, and top_k for generation control
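The server can infer these flags from model_identifier at creation time, and on update the whole set is replaced (all flags should be sent, per LLMUpdateRequest). A sketch of an explicit capability payload with illustrative values:

```python
# All six flags spelled out; on update, omitted flags are not preserved.
capabilities = {
    "supports_chat": True,
    "supports_completion": False,
    "supports_function_calling": True,
    "supports_system_messages": True,
    "supports_streaming": True,
    "supports_sampling_parameters": True,
}

# With a configured client:
# client.llms.create(..., capabilities=capabilities)
```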
LLMSamplingParams
Sampling and generation parameters for controlling LLM text output
- max_tokens (int, optional) — Maximum tokens to generate (>0 if set; provider-dependent limits apply)
- temperature (float, optional) — Sampling temperature 0.0-2.0 (0.0=deterministic, 2.0=highly random)
- top_p (float, optional) — Nucleus sampling threshold 0.0-1.0 (smaller values focus on higher-probability tokens)
- top_k (int, optional) — Top-k sampling limit (>0 if set; primarily for local/open-source models)
- frequency_penalty (float, optional) — Frequency penalty -2.0 to 2.0 (positive values reduce repetition based on frequency)
- presence_penalty (float, optional) — Presence penalty -2.0 to 2.0 (positive values encourage topic diversity)
- stop_sequences (list[str], optional) — Generation stop sequences (≤10 sequences; each ≤100 chars; generation halts on exact match)
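A sketch of default sampling parameters staying inside the documented ranges; the values are illustrative, and the plain-dict form is assumed to validate against LLMSamplingParams:

```python
# Illustrative defaults within the documented bounds.
sampling = {
    "max_tokens": 512,           # must be > 0 if set
    "temperature": 0.2,          # 0.0 (deterministic) .. 2.0
    "top_p": 0.9,                # nucleus threshold, 0.0 .. 1.0
    "frequency_penalty": 0.3,    # -2.0 .. 2.0
    "stop_sequences": ["\n\n"],  # at most 10 entries, each <= 100 chars
}

# Attach at creation time, with a configured client:
# client.llms.create(..., default_sampling_params=sampling)
```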
CreateLLMResponse
Response containing the created LLM and any informational status messages
- llm (LLMResponse) — The created LLM configuration
- statuses (list[GoodMemStatus], optional) — Optional status messages providing information about server-side operations performed during creation, such as capability inference results
LLMResponse
LLM configuration information
- llm_id (str) — Unique identifier of the LLM
- display_name (str) — User-facing name of the LLM
- description (str, optional) — Description of the LLM
- provider_type (LLMProviderType) — Type of LLM provider
- endpoint_url (str) — API endpoint base URL
- api_path (str, optional) — API path for chat/completions request
- model_identifier (str) — Model identifier
- supported_modalities (list[Modality]) — Supported content modalities
- credentials (EndpointAuthentication, optional) — Structured credential payload used for upstream authentication
- labels (dict[str, str]) — User-defined labels for categorization
- version (str, optional) — Version information
- monitoring_endpoint (str, optional) — Monitoring endpoint URL
- capabilities (LLMCapabilities) — LLM capabilities defining supported features and modes
- default_sampling_params (LLMSamplingParams, optional) — Default sampling parameters for generation requests
- max_context_length (int, optional) — Maximum context window size in tokens
- client_config (dict[str, Any], optional) — Provider-specific client configuration
- owner_id (str) — Owner ID of the LLM
- created_at (int) — Creation timestamp (milliseconds since epoch)
- updated_at (int) — Last update timestamp (milliseconds since epoch)
- created_by_id (str) — ID of the user who created the LLM
- updated_by_id (str) — ID of the user who last updated the LLM
ListLLMsResponse
Response containing a list of LLM configurations
- llms (list[LLMResponse]) — List of LLM configurations matching the request filters
LLMUpdateRequest
Request body for updating an existing LLM. All fields are optional - only specified fields will be updated.
- display_name (str, optional) — Update display name
- description (str, optional) — Update description
- endpoint_url (str, optional) — Update endpoint base URL (OpenAI-compatible base, typically ends with /v1)
- api_path (str, optional) — Update API path
- model_identifier (str, optional) — Update model identifier (cannot be empty)
- supported_modalities (list[Modality], optional) — Update supported modalities (if the array contains ≥1 elements, it replaces the stored set; if empty or omitted, no change)
- credentials (EndpointAuthentication, optional) — Update credentials
- version (str, optional) — Update version information
- monitoring_endpoint (str, optional) — Update monitoring endpoint URL
- capabilities (LLMCapabilities, optional) — Update LLM capabilities (replaces the entire capability set; clients MUST send all flags)
- default_sampling_params (LLMSamplingParams, optional) — Update default sampling parameters
- max_context_length (int, optional) — Update maximum context window size in tokens
- client_config (dict[str, Any], optional) — Update provider-specific client configuration (replaces the entire config; no merging)
- replace_labels (dict[str, str], optional) — Replace all existing labels with this set. An empty map clears all labels. Cannot be used with merge_labels.
- merge_labels (dict[str, str], optional) — Merge with existing labels: upserts with overwrite. Labels not mentioned are preserved. Cannot be used with replace_labels.
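The difference between the two label fields can be sketched with plain dict operations (the stored labels below are illustrative):

```python
# Labels currently stored on the LLM (illustrative).
existing = {"env": "docs", "version": "1"}

# merge_labels={"version": "2"}: upsert with overwrite; other keys are kept.
after_merge = {**existing, "version": "2"}

# replace_labels={"tier": "gold"}: the stored set is replaced wholesale;
# replace_labels={} would clear every label.
after_replace = {"tier": "gold"}
```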