vaklm was created to simplify interactions with OpenAI-compatible APIs, offering a cleaner alternative to the verbose client.chat.completions.create pattern. This Python client provides a more intuitive, developer-friendly way to work with OpenAI-compatible endpoints while adding reasoning capabilities to API responses.
Installation
```shell
pip install vaklm
```
Usage
Basic Usage
```python
from vaklm import vaklm, VAKLMException

try:
    content, reasoning = vaklm(
        endpoint="http://localhost:11434/v1/chat/completions",
        model_name="llama3.2:latest",
        user_prompt="Explain quantum computing in simple terms",
        system_prompt="You are a helpful AI assistant",
        api_key="YOUR_API_KEY",
        temperature=0.7
    )
    print("Content:", content)
    print("Reasoning:", reasoning)
except VAKLMException as e:
    print(f"Error: {str(e)}")
```
Streaming Usage
```python
from vaklm import vaklm, VAKLMException

print("\nStreaming example:")
try:
    for content, reasoning in vaklm(
        endpoint="http://localhost:11434/v1/chat/completions",
        model_name="llama3.2:latest",
        user_prompt="Write a short story about a cat.",
        system_prompt="You are a creative writer.",
        api_key="YOUR_API_KEY",
        stream=True,
        temperature=0.7
    ):
        print(content, end='', flush=True)
        if reasoning:
            print(f"\n[Reasoning: {reasoning}]")
except VAKLMException as e:
    print(f"Error: {str(e)}")
```
Features
- Supports both streaming and non-streaming responses
- Includes reasoning content in responses
- Automatic retry logic for failed requests
- Configurable temperature and max tokens
- System prompt support for context setting
- Comprehensive error handling
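The automatic retry behavior can be sketched as an exponential-backoff helper. This is an illustrative standalone sketch (with_retries is a hypothetical name; the library's actual internals may differ), modeled on the max_retries and retry_delay parameters documented below:

```python
import time

def with_retries(call, max_retries=3, retry_delay=1.0):
    """Invoke `call`, retrying with exponential backoff on failure."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # back off: retry_delay, 2*retry_delay, 4*retry_delay, ...
            time.sleep(retry_delay * (2 ** attempt))

# Example: a call that fails twice, then succeeds on the third attempt.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky, retry_delay=0.01)
```

Here result is "ok" and three attempts were made; a failure on the final attempt re-raises so the caller still sees the error.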
Configuration
The vaklm function accepts the following parameters:
- endpoint: API endpoint URL (required)
- model_name: Model identifier (required)
- user_prompt: User's input message (required)
- system_prompt: Optional system context message
- api_key: API key for authentication
- stream: Whether to stream the response (default: False)
- temperature: Sampling temperature (0-2, default: 1.0)
- max_tokens: Maximum tokens to generate
- timeout: Request timeout in seconds (default: 30)
- max_retries: Maximum retry attempts (default: 3)
- retry_delay: Base delay between retries in seconds (default: 1.0)
Error Handling
The client raises VAKLMException for general errors, with specific subclasses:
- APIError: For API-specific errors
- StreamingError: For streaming-specific errors
Always wrap calls in try/except blocks to handle potential errors.
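The hierarchy can be illustrated with a standalone sketch (the class bodies here are illustrative, not the library's actual definitions): because both subclasses inherit from VAKLMException, a single handler on the base class catches everything.

```python
class VAKLMException(Exception):
    """Base error (sketch of the hierarchy described above)."""

class APIError(VAKLMException):
    """Raised for API-specific errors."""

class StreamingError(VAKLMException):
    """Raised for streaming-specific errors."""

# A handler for the base class also catches both subclasses:
try:
    raise StreamingError("connection dropped mid-stream")
except VAKLMException as e:
    caught = type(e).__name__
```

Catch the subclasses individually when you want different recovery logic for API failures versus interrupted streams.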
vaklm provides a cleaner, more intuitive way to work with OpenAI-compatible APIs while adding valuable reasoning capabilities. By simplifying the API interaction pattern and adding features like automatic retries and streaming support, vaklm helps developers focus on building great AI-powered applications rather than wrestling with API boilerplate. Give it a try in your next project and experience the difference!
About Hemanth HM
Hemanth HM is a Sr. Staff Engineer at PayPal, Google Developer Expert, TC39 delegate, FOSS advocate, and community leader with a passion for programming, AI, and open-source contributions.