Hugging Face has introduced a new integration that allows developers to connect Inference Providers directly with GitHub Copilot Chat in Visual Studio Code. The update means that open-source large language models — including Kimi K2, DeepSeek V3.1, GLM 4.5, and others — can now be accessed and tested from inside the VS Code editor, without the need to switch platforms or juggle multiple tools.
The workflow is designed to be simple. Developers install the Hugging Face Copilot Chat extension, open VS Code’s chat interface, select the Hugging Face provider, enter their Hugging Face token, and then add the models they want to use. Once connected, they can switch between providers and models through the familiar model picker interface.
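Since the only credential involved is a Hugging Face access token, it can be worth confirming the token is valid before pasting it into the editor. Below is a minimal sketch, assuming the huggingface_hub Python package is installed; `whoami` is part of that SDK, and the token value shown is a placeholder:

```python
# Optional sanity check: confirm a Hugging Face token is valid before
# entering it in VS Code. Requires `pip install huggingface_hub`.
from huggingface_hub import whoami

info = whoami(token="hf_xxx")  # placeholder; substitute your own token
print(info["name"])            # the account name the token authenticates as
```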
One practical note quickly surfaced in community discussions: the feature requires an up-to-date version of the editor. As AI researcher Aditya Wresniyandaka highlighted on LinkedIn:
> The document forgot to mention you need VS Code August 2025 version 1.104.0
GitHub Copilot Chat has traditionally relied on a closed set of proprietary models. By linking it with Hugging Face’s network of Inference Providers, developers gain access to a much broader set of models, including experimental and highly specialized open models.
Muhammad Arshad Iqbal praised the move, noting:
> Oh, this is so cool! Now we can use all those powerful open-source coding AIs right inside VS Code. No more switching tabs just to test a model like Qwen3-Coder.
This integration opens the door for developers to use Copilot Chat with models optimized for particular programming tasks, industries, or research domains, rather than being limited to the defaults. The update is powered by Hugging Face Inference Providers, a service that gives developers access to hundreds of machine learning models through a single API.
The key value proposition is unification: instead of juggling multiple APIs with different reliability guarantees, developers can query models across providers through one consistent interface. Hugging Face emphasizes several benefits:
- Instant access to cutting-edge models, beyond what a single vendor catalog could offer.
- Zero vendor lock-in, since developers can switch between providers with minimal code changes.
- Production-ready performance, with high availability and low-latency inference.
- Developer-friendly integration, including drop-in compatibility with the OpenAI chat completions API and client SDKs for Python and JavaScript.
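The last point is concrete enough to show in code. Below is a minimal sketch of calling Inference Providers through the OpenAI-compatible router, assuming the openai Python package and an `HF_TOKEN` environment variable holding a Hugging Face token; the model ID is one example, and the optional `:provider` suffix for pinning a specific backend follows Hugging Face’s Inference Providers documentation:

```python
# Minimal sketch: querying Hugging Face Inference Providers through the
# OpenAI-compatible router endpoint. Assumes `pip install openai` and an
# HF_TOKEN environment variable set to a valid Hugging Face token.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",  # Inference Providers router
    api_key=os.environ["HF_TOKEN"],               # your Hugging Face token
)

completion = client.chat.completions.create(
    # Appending ":<provider>" pins a specific provider; omitting it lets
    # the router pick one, which is what keeps switching low-friction.
    model="moonshotai/Kimi-K2-Instruct",
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
)
print(completion.choices[0].message.content)
```

Because the endpoint speaks the standard chat completions protocol, pointing existing OpenAI-based code at a different model or provider is typically a one-line change to `model` or `base_url`, which is the zero-lock-in claim in practice.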
Hugging Face has priced the integration to be accessible: a free tier of monthly inference credits is available for experimentation, while Pro, Team, and Enterprise plans provide additional capacity and pay-as-you-go pricing. According to Hugging Face, developers pay exactly what the providers charge, with no markup added.