API Reference
The GitHub Copilot API Gateway exposes OpenAI-compatible endpoints that proxy requests through your authenticated Copilot session. This reference documents all available endpoints.
Authentication
Authentication happens automatically through your VS Code GitHub Copilot session. For the API client, you have two options:
Option 1: No Authentication (Local)
When running locally (127.0.0.1), you can use any string as the API key:
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:3030/v1",
    api_key="anything"  # Ignored by the server for local requests
)
Option 2: Bearer Token (When API Key Configured)
If you've set server.apiKey in settings, include it in your requests:
Authorization: Bearer your-configured-api-key
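When using an SDK rather than raw HTTP, pass the configured key as the client's API key; the SDK sends it as the Bearer token for you. A minimal sketch with the OpenAI SDK (the key value is a placeholder for your own):

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:3030/v1",
    api_key="your-configured-api-key"  # Must match server.apiKey
)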
Base URL
http://127.0.0.1:3030
Or if running on LAN: http://YOUR-IP:3030
OpenAI API
Full compatibility with OpenAI's Chat Completions API.
The GET /v1/models endpoint returns a list of models available through your Copilot subscription.
Example Request
curl http://127.0.0.1:3030/v1/models
Example Response
{
"object": "list",
"data": [
{ "id": "gpt-4o", "object": "model" },
{ "id": "gpt-4", "object": "model" },
{ "id": "gpt-3.5-turbo", "object": "model" },
{ "id": "claude-3.5-sonnet", "object": "model" }
]
}
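Chat completions work with the standard OpenAI SDK unchanged: point base_url at the gateway and pick a model ID from the list above. A minimal sketch:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:3030/v1", api_key="anything")

response = client.chat.completions.create(
    model="gpt-4o",  # Any ID returned by /v1/models
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)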
Anthropic API
Compatible with the Anthropic Claude SDK.
Request Body
| Parameter | Type | Description |
|---|---|---|
| model (required) | string | e.g., claude-3-5-sonnet-20241022 |
| messages (required) | array | Array of message objects |
| max_tokens (required) | integer | Maximum tokens to generate |
| system | string | System prompt |
| stream | boolean | Enable streaming |
Example Request
import anthropic

client = anthropic.Anthropic(
    base_url="http://127.0.0.1:3030",
    api_key="copilot"
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

print(message.content[0].text)
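With stream enabled, the Anthropic SDK's streaming helper yields text deltas as they arrive. A minimal sketch, assuming the gateway supports streamed responses:

import anthropic

client = anthropic.Anthropic(base_url="http://127.0.0.1:3030", api_key="copilot")

# messages.stream() is the SDK's context-manager equivalent of stream=True
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)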
Google API
Compatible with the Google Generative AI SDK.
URL Parameters
| Parameter | Type | Description |
|---|---|---|
| :model (required) | string | Model name (e.g., gemini-pro) |
Example Request
import google.generativeai as genai

genai.configure(
    api_key="copilot",
    transport="rest",
    client_options={"api_endpoint": "http://127.0.0.1:3030"}
)

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Tell me a joke")
print(response.text)
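The same SDK streams by passing stream=True to generate_content. A minimal sketch, assuming the gateway supports Gemini-style streamed chunks:

import google.generativeai as genai

genai.configure(
    api_key="copilot",
    transport="rest",
    client_options={"api_endpoint": "http://127.0.0.1:3030"}
)

model = genai.GenerativeModel("gemini-pro")
# stream=True yields partial chunks instead of a single final response
for chunk in model.generate_content("Tell me a joke", stream=True):
    print(chunk.text, end="")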
Llama API
Compatible with Llama client libraries.
Same request/response format as OpenAI Chat Completions, but routed through the Llama endpoint.
Example Request
curl -X POST http://127.0.0.1:3030/llama/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.1-70b",
"messages": [{"role": "user", "content": "Hello!"}]
}'
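Because the format matches OpenAI's, the OpenAI SDK can target this route as well; just point base_url at the Llama prefix. A minimal sketch:

from openai import OpenAI

# Same OpenAI-compatible payloads, routed through /llama/v1
client = OpenAI(base_url="http://127.0.0.1:3030/llama/v1", api_key="anything")

response = client.chat.completions.create(
    model="llama-3.1-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)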
Utilities
The server reports usage statistics, including request counts, token usage, and uptime.
Example Response
{
"totalRequests": 1234,
"successfulRequests": 1200,
"failedRequests": 34,
"totalInputTokens": 50000,
"totalOutputTokens": 75000,
"uptime": 3600
}
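To poll these counters programmatically, fetch the JSON and derive whatever metrics you need. A minimal sketch; the /stats path below is a hypothetical guess, so confirm the actual route in the Swagger UI at /docs:

import requests

# NOTE: "/stats" is a hypothetical path; verify the real route at /docs
stats = requests.get("http://127.0.0.1:3030/stats").json()

success_rate = stats["successfulRequests"] / max(stats["totalRequests"], 1)
print(f"Success rate: {success_rate:.1%}")
print(f"Tokens in/out: {stats['totalInputTokens']}/{stats['totalOutputTokens']}")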
Access interactive API documentation using Swagger UI. Test endpoints directly from your browser.
Open in browser: http://127.0.0.1:3030/docs
- Try It Out: Send real requests and see responses
- Schema Explorer: View detailed request/response schemas
- Offline: Works locally without external dependencies
Need Integration Guides?
Check out the Wiki for step-by-step guides on integrating with LangChain, Cursor, Aider, and more.