Instead of initializing the model at start time, we now initialize it at run time, which makes it easier to swap the model provider. This also introduces a third driver for OpenAI-compatible providers, which among other things allows running local models with Ollama.
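A minimal sketch of what runtime initialization with a swappable driver could look like. All names here (`getModel`, `createModel`, `ModelConfig`, the driver names) are illustrative assumptions, not the actual code in this PR; the `openai-compatible` branch simply targets any server that speaks the OpenAI chat-completions shape, such as Ollama's `/v1` endpoint:

```typescript
// Illustrative sketch: resolve the model driver on first use instead of at startup,
// so the provider can be swapped via configuration without restarting the server.

type ChatModel = {
  complete(prompt: string): Promise<string>;
};

type Driver = "openai" | "anthropic" | "openai-compatible";

interface ModelConfig {
  driver: Driver;
  baseUrl?: string; // e.g. http://localhost:11434/v1 for a local Ollama instance
  model: string;
}

function createModel(config: ModelConfig): ChatModel {
  switch (config.driver) {
    case "openai-compatible":
      // Works with any server exposing the OpenAI chat-completions API shape.
      return {
        async complete(prompt) {
          const res = await fetch(`${config.baseUrl}/chat/completions`, {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({
              model: config.model,
              messages: [{ role: "user", content: prompt }],
            }),
          });
          const data = await res.json();
          return data.choices[0].message.content;
        },
      };
    // ...other drivers elided for brevity
    default:
      throw new Error(`Unknown driver: ${config.driver}`);
  }
}

// Lazy, swappable accessor: the model is built on first call, not at startup,
// and rebuilt only when the configuration changes.
let cached: ChatModel | null = null;
let cachedKey = "";

function getModel(config: ModelConfig): ChatModel {
  const key = JSON.stringify(config);
  if (!cached || cachedKey !== key) {
    cached = createModel(config);
    cachedKey = key;
  }
  return cached;
}
```

With this shape, pointing the app at Ollama is just a config change (`baseUrl: "http://localhost:11434/v1"`), with no restart needed.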
`isLoaded`
Out of memory
Run `yarn dev` while the server is running on port `3000`.