LLMs
Methods on this page are called as client.llms.<method>(...) where client is either a synchronous Goodmem or asynchronous AsyncGoodmem instance initialized below:
from goodmem import Goodmem
client = Goodmem(base_url='http://localhost:8080', api_key='gm_...')

from goodmem import AsyncGoodmem
client = AsyncGoodmem(base_url='http://localhost:8080', api_key='gm_...')

Create a new LLM
Creates a new LLM configuration for text generation services. LLMs represent connections to different language model API services (like OpenAI, vLLM, etc.) and include all the necessary configuration to use them for text generation.
DUPLICATE DETECTION: Returns HTTP 409 Conflict (ALREADY_EXISTS) if another LLM exists with identical {owner_id, provider_type, endpoint_url, api_path, model_identifier, credentials_fingerprint} after URL canonicalization. Uniqueness is enforced per-owner. Credentials are hashed (SHA-256) for uniqueness while remaining encrypted. The api_path field defaults to '/chat/completions' if omitted. Requires CREATE_LLM_OWN permission (or CREATE_LLM_ANY for admin users).
- display_name (str) — User-facing name of the LLM
- model_identifier (str) — When a known model, auto-fills provider_type, endpoint_url, max_context_length, and supported_modalities.
- api_key (str, optional) — A convenience shorthand for credentials. Converts a plain API key string to the full EndpointAuthentication structure (i.e. {"kind": "CREDENTIAL_KIND_API_KEY", "api_key": {"inline_secret": "sk-..."}}). At most one of api_key and credentials may be provided.
- api_path (str, optional, server default='/chat/completions') — API path for the chat/completions request (defaults to /chat/completions if not provided).
- capabilities (LLMCapabilities, optional) — LLM capabilities defining supported features and modes. The server infers capabilities from the model identifier if not provided.
- client_config (dict[str, Any], optional) — Provider-specific client configuration as a flexible JSON structure
- credentials (EndpointAuthentication, optional) — Structured credential payload describing how to authenticate with the provider. Can also be set via the convenience shorthand api_key. At most one of api_key and credentials may be provided.
- default_sampling_params (LLMSamplingParams, optional) — Default sampling parameters for generation requests
- description (str, optional) — Description of the LLM
- endpoint_url (str, optional) — Base URL for the LLM endpoint (OpenAI-compatible base, typically ends with /v1). Auto-inferred from provider_type for known providers; required when model_identifier is not in the registry.
- labels (dict[str, str], optional) — User-defined labels for categorization
- llm_id (str, optional) — Optional client-provided UUID for idempotent creation. If not provided, the server generates a new UUID. Returns ALREADY_EXISTS if the ID is already in use.
- max_context_length (int, optional) — Maximum context window size in tokens. Auto-inferred from model_identifier for known models; recommended when model_identifier is not in the registry.
- monitoring_endpoint (str, optional) — Monitoring endpoint URL
- owner_id (str, optional) — Optional owner ID. If not provided, derived from the authentication context. Requires CREATE_LLM_ANY permission if specified.
- provider_type (LLMProviderType, optional) — Provider backend, one of "OPENAI", "LITELLM_PROXY", "OPEN_ROUTER", "VLLM", "OLLAMA", "LLAMA_CPP", "CUSTOM_OPENAI_COMPATIBLE". Use "CUSTOM_OPENAI_COMPATIBLE" for third-party OpenAI-compatible endpoints such as Anthropic, Google Gemini, or Mistral. Auto-inferred from model_identifier for known models; required when model_identifier is not in the registry.
- supported_modalities (list[Modality], optional, server default=['TEXT']) — Modalities supported by this LLM (e.g. ["TEXT"]). Auto-inferred from model_identifier for known models; defaults to ["TEXT"] on the server if omitted.
- version (str, optional) — Version information
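When model_identifier is not in the model registry, provider_type and endpoint_url cannot be inferred and must be supplied explicitly (max_context_length is recommended as well). A sketch of such a request for a hypothetical self-hosted vLLM server; every value below is a placeholder:

```python
# Placeholder request for a model the registry does not know about.
request = {
    "display_name": "Local vLLM",
    "model_identifier": "my-org/my-model",       # unknown to the registry
    "provider_type": "VLLM",                     # required: cannot be inferred
    "endpoint_url": "http://localhost:8000/v1",  # OpenAI-compatible base, ends with /v1
    "max_context_length": 32768,                 # recommended for unknown models
    "api_key": "token-...",                      # shorthand for credentials
}

# With a configured client (as initialized at the top of this page):
# resp = client.llms.create(**request)
# llm = resp.llm
```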
resp = client.llms.create(
display_name="Doc LLM",
model_identifier="gpt-4o-mini",
api_key="sk-...",
labels={"env": "docs"},
)
llm = resp.llm

Get an LLM by ID
Retrieves the details of a specific LLM configuration by its unique identifier. Requires READ_LLM_OWN permission for LLMs you own (or READ_LLM_ANY for admin users to view any user's LLMs). This is a read-only operation with no side effects.
- id (str) — The unique identifier of the LLM to retrieve
LLMResponse — Returns the LLM configuration.
llm = client.llms.get(id="your-llm-id")
print(llm.display_name)

List LLMs
Retrieves a list of LLM configurations accessible to the caller, with optional filtering.
LABEL FILTERS: Label filters accept either label.<key>=<value> or label[key]=value (for example, label.environment=production or label[environment]=production). PERMISSION-BASED FILTERING: With LIST_LLM_OWN permission, you can only see your own LLMs (owner_id filter is ignored if set to another user). With LIST_LLM_ANY permission, you can see all LLMs or filter by any owner_id. This is a read-only operation with no side effects.
- label (dict[str, str], optional) — Filter by label key-value pairs. Label filters accept either label.<key>=<value> or label[key]=value (for example, label.environment=production or label[environment]=production).
- owner_id (str, optional) — Filter LLMs by owner ID. With LIST_LLM_ANY permission, omitting this shows all accessible LLMs; providing it filters by that owner. With LIST_LLM_OWN permission, only your own LLMs are shown regardless of this parameter.
- provider_type (LLMProviderType, optional) — Filter LLMs by provider type. Allowed values match the LLMProviderType schema.
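The two query-string forms accepted by the REST API encode the same key/value pair; the Python SDK takes a plain dict for the label parameter. A small sketch of the correspondence:

```python
# The SDK's label parameter is a plain dict.
label_filter = {"environment": "production"}

# Equivalent REST query-string encodings of the same filter:
dot_form = [f"label.{k}={v}" for k, v in label_filter.items()]
bracket_form = [f"label[{k}]={v}" for k, v in label_filter.items()]

# With a configured client, the filters combine (logical AND):
# for llm in client.llms.list(label=label_filter, provider_type="VLLM"):
#     print(llm.llm_id, llm.display_name)
```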
list[LLMResponse]

for llm in client.llms.list():
    print(llm.llm_id, llm.display_name)

Update an LLM
Updates an existing LLM configuration including display information, endpoint configuration, model parameters, credentials, and labels. All fields are optional - only specified fields will be updated.
IMPORTANT: provider_type is IMMUTABLE after creation and cannot be changed. Requires UPDATE_LLM_OWN permission for LLMs you own (or UPDATE_LLM_ANY for admin users).
- id (str) — The unique identifier of the resource to update.
- request (LLMUpdateRequest | dict) — The update payload. Accepts a LLMUpdateRequest instance or a plain dict with the same fields. Only specified fields will be modified.
LLMResponse — Returns the LLM configuration.
from goodmem.types import LLMUpdateRequest
# Option 1: typed request object
updated = client.llms.update(id="your-llm-id", request=LLMUpdateRequest(
display_name="Doc LLM (updated)",
merge_labels={"version": "2"},
))
assert updated.llm_id == "your-llm-id"
# Option 2: plain dict (validated via pydantic)
updated = client.llms.update(id="your-llm-id", request={
"display_name": "Doc LLM (updated)",
"merge_labels": {"version": "2"},
})
assert updated.llm_id == "your-llm-id"

Delete an LLM
Permanently deletes an LLM configuration. This operation cannot be undone; it removes the LLM record and securely deletes stored credentials.
IMPORTANT: This does NOT invalidate or delete any previously generated content using this LLM - existing generations remain accessible. Requires DELETE_LLM_OWN permission for LLMs you own (or DELETE_LLM_ANY for admin users).
- id (str) — The unique identifier of the LLM to delete

None

client.llms.delete(id="your-llm-id")

Async usage: client.llms exposes the same methods on AsyncGoodmem; use await / async for as needed.
Data Models
All data models are pydantic v2 models. Fields are shown with their Python attribute names; JSON responses use camelCase aliases (e.g., owner_id → ownerId).
LLMProviderType
String enum: "OPENAI" · "LITELLM_PROXY" · "OPEN_ROUTER" · "VLLM" · "OLLAMA" · "LLAMA_CPP" · "CUSTOM_OPENAI_COMPATIBLE"
LLMCapabilities
Capabilities and features supported by an LLM service
- supports_chat (bool, optional) — Supports conversational/chat completion format with message roles
- supports_completion (bool, optional) — Supports raw text completion with prompt continuation
- supports_function_calling (bool, optional) — Supports function/tool calling with structured responses
- supports_system_messages (bool, optional) — Supports system prompts to define model behavior and context
- supports_streaming (bool, optional) — Supports real-time token streaming during generation
- supports_sampling_parameters (bool, optional) — Supports sampling parameters like temperature, top_p, and top_k for generation control
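The server can infer these flags from model_identifier at creation time, and on update the whole set is replaced (all flags should be sent, per LLMUpdateRequest). A sketch of an explicit capability payload with illustrative values:

```python
# All six flags spelled out; on update, omitted flags are not preserved.
capabilities = {
    "supports_chat": True,
    "supports_completion": False,
    "supports_function_calling": True,
    "supports_system_messages": True,
    "supports_streaming": True,
    "supports_sampling_parameters": True,
}

# With a configured client:
# client.llms.create(..., capabilities=capabilities)
```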
LLMSamplingParams
Sampling and generation parameters for controlling LLM text output
- max_tokens (int, optional) — Maximum tokens to generate (>0 if set; provider-dependent limits apply)
- temperature (float, optional) — Sampling temperature 0.0-2.0 (0.0=deterministic, 2.0=highly random)
- top_p (float, optional) — Nucleus sampling threshold 0.0-1.0 (smaller values focus on higher-probability tokens)
- top_k (int, optional) — Top-k sampling limit (>0 if set; primarily for local/open-source models)
- frequency_penalty (float, optional) — Frequency penalty -2.0 to 2.0 (positive values reduce repetition based on frequency)
- presence_penalty (float, optional) — Presence penalty -2.0 to 2.0 (positive values encourage topic diversity)
- stop_sequences (list[str], optional) — Generation stop sequences (≤10 sequences; each ≤100 chars; generation halts on exact match)
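A sketch of default sampling parameters staying inside the documented ranges; the values are illustrative, and the plain-dict form is assumed to validate against LLMSamplingParams:

```python
# Illustrative defaults within the documented bounds.
sampling = {
    "max_tokens": 512,           # must be > 0 if set
    "temperature": 0.2,          # 0.0 (deterministic) .. 2.0
    "top_p": 0.9,                # nucleus threshold, 0.0 .. 1.0
    "frequency_penalty": 0.3,    # -2.0 .. 2.0
    "stop_sequences": ["\n\n"],  # at most 10 entries, each <= 100 chars
}

# Attach at creation time, with a configured client:
# client.llms.create(..., default_sampling_params=sampling)
```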
CreateLLMResponse
Response containing the created LLM and any informational status messages
- llm (LLMResponse) — The created LLM configuration
- statuses (list[GoodMemStatus], optional) — Optional status messages providing information about server-side operations performed during creation, such as capability inference results
LLMResponse
LLM configuration information
- llm_id (str) — Unique identifier of the LLM
- display_name (str) — User-facing name of the LLM
- description (str, optional) — Description of the LLM
- provider_type (LLMProviderType) — Type of LLM provider
- endpoint_url (str) — API endpoint base URL
- api_path (str, optional) — API path for chat/completions request
- model_identifier (str) — Model identifier
- supported_modalities (list[Modality]) — Supported content modalities
- credentials (EndpointAuthentication, optional) — Structured credential payload used for upstream authentication
- labels (dict[str, str]) — User-defined labels for categorization
- version (str, optional) — Version information
- monitoring_endpoint (str, optional) — Monitoring endpoint URL
- capabilities (LLMCapabilities) — LLM capabilities defining supported features and modes
- default_sampling_params (LLMSamplingParams, optional) — Default sampling parameters for generation requests
- max_context_length (int, optional) — Maximum context window size in tokens
- client_config (dict[str, Any], optional) — Provider-specific client configuration
- owner_id (str) — Owner ID of the LLM
- created_at (int) — Creation timestamp (milliseconds since epoch)
- updated_at (int) — Last update timestamp (milliseconds since epoch)
- created_by_id (str) — ID of the user who created the LLM
- updated_by_id (str) — ID of the user who last updated the LLM
ListLLMsResponse
Response containing a list of LLM configurations
- llms (list[LLMResponse]) — List of LLM configurations matching the request filters
LLMUpdateRequest
Request body for updating an existing LLM. All fields are optional - only specified fields will be updated.
- display_name (str, optional) — Update display name
- description (str, optional) — Update description
- endpoint_url (str, optional) — Update endpoint base URL (OpenAI-compatible base, typically ends with /v1)
- api_path (str, optional) — Update API path
- model_identifier (str, optional) — Update model identifier (cannot be empty)
- supported_modalities (list[Modality], optional) — Update supported modalities (if the array contains ≥1 elements, it replaces the stored set; if empty or omitted, no change)
- credentials (EndpointAuthentication, optional) — Update credentials
- version (str, optional) — Update version information
- monitoring_endpoint (str, optional) — Update monitoring endpoint URL
- capabilities (LLMCapabilities, optional) — Update LLM capabilities (replaces the entire capability set; clients MUST send all flags)
- default_sampling_params (LLMSamplingParams, optional) — Update default sampling parameters
- max_context_length (int, optional) — Update maximum context window size in tokens
- client_config (dict[str, Any], optional) — Update provider-specific client configuration (replaces the entire config; no merging)
- replace_labels (dict[str, str], optional) — Replace all existing labels with this set. An empty map clears all labels. Cannot be used with merge_labels.
- merge_labels (dict[str, str], optional) — Merge with existing labels: upserts with overwrite. Labels not mentioned are preserved. Cannot be used with replace_labels.
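The difference between the two label fields can be sketched with plain dict operations (the stored labels below are illustrative):

```python
# Labels currently stored on the LLM (illustrative).
existing = {"env": "docs", "version": "1"}

# merge_labels={"version": "2"}: upsert with overwrite; other keys are kept.
after_merge = {**existing, "version": "2"}

# replace_labels={"tier": "gold"}: the stored set is replaced wholesale;
# replace_labels={} would clear every label.
after_replace = {"tier": "gold"}
```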