Instead of initializing the model at start time, we initialize it at run time so the model provider can be swapped more easily. Also introduce a third driver for OpenAI-compatible providers, which among other things allows running local models with Ollama.
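A minimal sketch of what run-time initialization could look like; all names here (`ClientConfig`, `make_client_config`, the provider strings) are illustrative assumptions, not the actual API of this codebase. The Ollama endpoint reflects its default OpenAI-compatible server address:

```python
# Hypothetical sketch: resolve the provider at call time instead of once at
# startup, so callers can switch providers without restarting.
from dataclasses import dataclass


@dataclass
class ClientConfig:
    base_url: str
    api_key: str


def make_client_config(provider: str) -> ClientConfig:
    """Build client settings on demand for the requested provider."""
    if provider == "openai":
        return ClientConfig("https://api.openai.com/v1", "OPENAI_API_KEY")
    if provider == "openai-compatible":
        # e.g. a local Ollama server, which exposes an OpenAI-compatible API;
        # Ollama ignores the API key, but clients usually require a non-empty one
        return ClientConfig("http://localhost:11434/v1", "ollama")
    raise ValueError(f"unknown provider: {provider}")
```

Because the config is built per call rather than cached at startup, swapping providers is just a matter of passing a different string.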