framework.models.get_chat_completion#

framework.models.get_chat_completion(message, max_tokens=1024, model_config=None, provider=None, model_id=None, budget_tokens=None, enable_thinking=False, output_model=None, base_url=None)[source]#

Execute direct chat completion requests across multiple AI providers.

This function provides immediate access to LLM inference with support for advanced features, including extended thinking, structured outputs, and automatic TypedDict conversion. It transparently handles provider-specific API differences, credential management, and HTTP proxy configuration.

The function supports multiple interaction patterns:

  • Simple text-to-text completion for basic use cases

  • Structured output generation with Pydantic models or TypedDict

  • Extended thinking workflows for complex reasoning tasks

  • Enterprise proxy and timeout configuration

Provider-specific features:

  • Anthropic: Extended thinking with budget_tokens, content block responses

  • Google: Thinking configuration for enhanced reasoning

  • OpenAI: Structured outputs with the beta chat completions API

  • Ollama: Local model inference with JSON schema validation

  • CBORG: OpenAI-compatible API with custom endpoints (LBNL-provided service)

Parameters:
  • message (str) – Input prompt or message for the LLM

  • max_tokens (int) – Maximum tokens to generate in the response

  • model_config (dict, optional) – Configuration dictionary with provider and model settings

  • provider (str, optional) – AI provider name ('anthropic', 'google', 'openai', 'ollama', 'cborg')

  • model_id (str, optional) – Specific model identifier recognized by the provider

  • budget_tokens (int, optional) – Thinking budget for Anthropic/Google extended reasoning

  • enable_thinking (bool) – Enable extended thinking capabilities where supported

  • output_model (Type[BaseModel], optional) – Pydantic model or TypedDict for structured output validation

  • base_url (str, optional) – Custom API endpoint, required for Ollama and CBORG providers

Raises:
  • ValueError – If a required provider, model_id, api_key, or base_url is missing

  • ValueError – If budget_tokens >= max_tokens, or for other invalid parameter combinations

  • pydantic.ValidationError – If output_model validation fails for structured outputs

  • anthropic.APIError – For Anthropic API-specific errors

  • openai.APIError – For OpenAI API-specific errors

  • ollama.ResponseError – For Ollama API-specific errors

Returns:

Model response in a format determined by the provider and output_model settings

Return type:

Union[str, BaseModel, list]

Note

Extended thinking is currently supported by Anthropic (with budget_tokens) and Google (with thinking_config). Other providers will log warnings if thinking parameters are provided.

Warning

When using structured outputs, ensure your prompt guides the model toward generating the expected structure. Not all models handle schema constraints equally well.

Examples

Simple text completion:

>>> from framework.models import get_chat_completion
>>> response = get_chat_completion(
...     message="Explain quantum computing in simple terms",
...     provider="anthropic",
...     model_id="claude-3-sonnet-20240229",
...     max_tokens=500
... )
>>> print(response)

Extended thinking with Anthropic:

>>> response = get_chat_completion(
...     message="Solve this complex reasoning problem...",
...     provider="anthropic",
...     model_id="claude-3-sonnet-20240229",
...     enable_thinking=True,
...     budget_tokens=1000,
...     max_tokens=2000
... )
>>> # Response includes thinking process and final answer
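
Extended thinking with Google follows the same pattern (the model_id below is illustrative; substitute a thinking-capable Gemini model available to your account):

>>> response = get_chat_completion(
...     message="Work through this multi-step reasoning problem...",
...     provider="google",
...     model_id="gemini-2.5-flash",
...     enable_thinking=True,
...     budget_tokens=1024,
...     max_tokens=2048
... )
>>> # budget_tokens must be less than max_tokens, or ValueError is raised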

Structured output with Pydantic model:

>>> from pydantic import BaseModel
>>> class AnalysisResult(BaseModel):
...     summary: str
...     confidence: float
...     recommendations: list[str]
>>>
>>> result = get_chat_completion(
...     message="Analyze this data and provide structured results",
...     provider="openai",
...     model_id="gpt-4",
...     output_model=AnalysisResult
... )
>>> print(f"Confidence: {result.confidence}")

Using configuration dictionary:

>>> config = {
...     "provider": "ollama",
...     "model_id": "llama3.1:8b",
...     "max_tokens": 1000
... }
>>> response = get_chat_completion(
...     message="Hello, how are you?",
...     model_config=config,
...     base_url="http://localhost:11434"
... )
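
Custom endpoint with CBORG (the base_url and model_id are illustrative; use the endpoint and model names issued with your LBNL CBORG credentials):

>>> response = get_chat_completion(
...     message="Summarize the latest run log",
...     provider="cborg",
...     model_id="lbl/cborg-chat:latest",
...     base_url="https://api.cborg.lbl.gov",
...     max_tokens=500
... )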

See also

get_model() : Create model instances for PydanticAI agents

configs.config.get_provider_config() : Provider configuration loading

pydantic.BaseModel : Base class for structured output models

Convention over Configuration: Configuration-Driven Registry Patterns : Complete model configuration and usage guide