Local LLM inference powered by Ollama and gemma3:4b
Send a prompt and get a response (non-streaming):
curl -X POST https://llm.rxweb.ca/api/generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_KEY" \
  -d '{"prompt": "What is aspirin?", "stream": false}'
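The same non-streaming call can be made programmatically. This is a minimal sketch using only the Python standard library; the field names come from the request schema at the end of this doc, while the `build_body` and `generate` helpers (and the shape of the response JSON) are illustrative assumptions, not part of the service's own client:

```python
import json
import urllib.request

API_URL = "https://llm.rxweb.ca/api/generate"
API_KEY = "YOUR_KEY"  # placeholder: request a real key from an admin

def build_body(prompt, stream=False, system=None, temperature=None, max_tokens=None):
    """Assemble the request body; optional fields are sent only when set."""
    body = {"prompt": prompt, "stream": stream}
    if system is not None:
        body["system"] = system
    if temperature is not None:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body

def generate(prompt, **options):
    """POST a non-streaming request and return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_body(prompt, **options)).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": API_KEY},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Omitting unset optional fields keeps the payload identical to the curl example above when only a prompt is given.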
For streaming (Server-Sent Events):
curl -X POST https://llm.rxweb.ca/api/generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_KEY" \
  -d '{"prompt": "What is aspirin?", "stream": true}'
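With "stream": true the endpoint returns Server-Sent Events, so a client has to pull the JSON payload out of each `data:` line. A minimal parsing sketch, assuming each event carries one JSON object on a single `data:` line (the `response` key in the sample input is an illustrative assumption; this doc does not specify the payload's field names):

```python
import json

def parse_sse_events(lines):
    """Yield one parsed JSON object per SSE 'data:' line.

    Lines that are blank (event separators) or that start with ':'
    (keep-alive comments) carry no payload and are skipped.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # not a data field: comment, separator, or other SSE field
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)
```

In practice you would feed this the response's line iterator and concatenate the text fragments as they arrive.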
Request body fields: { prompt, stream?, system?, temperature?, max_tokens? }

All requests to /api/generate require an X-API-Key header. Contact an admin to get a key.