API Reference

The GitHub Copilot API Gateway exposes OpenAI-compatible endpoints that translate requests to your authenticated Copilot session. This reference documents all available endpoints.

Authentication

Authentication happens automatically through your VS Code GitHub Copilot session. For the API client, you have two options:

Option 1: No Authentication (Local)

When running locally (127.0.0.1), you can use any string as the API key:

Python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:3030/v1",
    api_key="anything"  # Ignored by server
)

Option 2: Bearer Token (When API Key Configured)

If you've set server.apiKey in settings, include it in your requests:

HTTP
Authorization: Bearer your-configured-api-key
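The OpenAI and Anthropic SDKs add this header automatically from their api_key argument; with a raw HTTP client you build it yourself. A minimal sketch (the key value is a placeholder for whatever you set in server.apiKey):

```python
API_KEY = "your-configured-api-key"  # placeholder for your server.apiKey value

# Construct the headers a hand-rolled request would need:
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
print(headers["Authorization"])
# → Bearer your-configured-api-key
```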

Base URL

http://127.0.0.1:3030

Or if running on LAN: http://YOUR-IP:3030

OpenAI API

Full compatibility with OpenAI's Chat Completions API.

POST /v1/chat/completions Create a chat completion

Request Body

model (string, required): Model ID (e.g., gpt-4o, gpt-4, gpt-3.5-turbo)
messages (array, required): Array of message objects with role and content
stream (boolean, optional): Enable streaming responses (SSE)
temperature (number, optional): Sampling temperature (0-2)
max_tokens (integer, optional): Maximum tokens to generate
tools (array, optional): List of tool/function definitions
response_format (object, optional): Set to {"type": "json_object"} for JSON mode
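These parameters combine freely in a single request body. A sketch of a payload exercising JSON mode alongside a token cap (field names from the list above; the prompt and values are illustrative, and JSON mode expects the prompt itself to mention JSON):

```python
import json

# A Chat Completions body combining JSON mode, a token cap, and a
# deterministic temperature:
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "Reply in JSON."},
        {"role": "user", "content": "Name three primary colors."},
    ],
    "response_format": {"type": "json_object"},
    "max_tokens": 200,
    "temperature": 0,
}
body = json.dumps(payload)
print(payload["response_format"])
# → {'type': 'json_object'}
```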

Example Request

Python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:3030/v1",
    api_key="copilot"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    temperature=0.7
)

print(response.choices[0].message.content)

Example with Streaming

Python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

GET /v1/models List available models

Returns a list of models available through your Copilot subscription.

Example Request

curl
curl http://127.0.0.1:3030/v1/models

Example Response

JSON
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model" },
    { "id": "gpt-4", "object": "model" },
    { "id": "gpt-3.5-turbo", "object": "model" },
    { "id": "claude-3.5-sonnet", "object": "model" }
  ]
}
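The response is plain JSON, so extracting the model IDs to use in later requests needs only the standard library; a sketch that parses the example payload above:

```python
import json

response_body = '''
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model" },
    { "id": "gpt-4", "object": "model" },
    { "id": "gpt-3.5-turbo", "object": "model" },
    { "id": "claude-3.5-sonnet", "object": "model" }
  ]
}
'''

# Collect the IDs to pass as the `model` field in completion requests.
model_ids = [m["id"] for m in json.loads(response_body)["data"]]
print(model_ids)
# → ['gpt-4o', 'gpt-4', 'gpt-3.5-turbo', 'claude-3.5-sonnet']
```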

Anthropic API

Compatible with the Anthropic Claude SDK.

POST /v1/messages Create a message (Claude)

Request Body

model (string, required): e.g., claude-3-5-sonnet-20241022
messages (array, required): Array of message objects
max_tokens (integer, required): Maximum tokens to generate
system (string, optional): System prompt
stream (boolean, optional): Enable streaming

Example Request

Python
import anthropic

client = anthropic.Anthropic(
    base_url="http://127.0.0.1:3030",
    api_key="copilot"
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

print(message.content[0].text)
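The optional system and stream fields sit alongside the required ones in the raw request body; a sketch of that body (values illustrative, independent of the SDK):

```python
import json

# Raw /v1/messages body; with "stream": true the server replies with SSE.
# With the Anthropic SDK, the equivalent is client.messages.stream(...),
# iterating over stream.text_stream.
payload = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "system": "You are a concise assistant.",
    "stream": True,
    "messages": [{"role": "user", "content": "Explain quantum computing"}],
}
body = json.dumps(payload)
print(payload["stream"])
# → True
```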

Google API

Compatible with the Google Generative AI SDK.

POST /v1beta/models/:model:generateContent Generate content (Gemini)

URL Parameters

:model (string, required): Model name (e.g., gemini-pro)

Example Request

Python
import google.generativeai as genai

genai.configure(
    api_key="copilot",
    transport="rest",
    client_options={"api_endpoint": "http://127.0.0.1:3030"}
)

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Tell me a joke")
print(response.text)
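Under the hood the SDK issues a REST call to the endpoint above using Google's contents/parts body shape. A sketch of that raw payload (prompt text illustrative):

```python
import json

# Body for POST /v1beta/models/gemini-pro:generateContent; each content
# entry holds a list of parts, the simplest being a single text part.
payload = {
    "contents": [
        {"parts": [{"text": "Tell me a joke"}]}
    ]
}
body = json.dumps(payload)
print(payload["contents"][0]["parts"][0]["text"])
# → Tell me a joke
```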

Llama API

Compatible with Llama client libraries.

POST /llama/v1/chat/completions Chat completions (Llama)

Same request/response format as OpenAI Chat Completions, but routed through the Llama endpoint.

Example Request

curl
curl -X POST http://127.0.0.1:3030/llama/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-70b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
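Because the format matches OpenAI's, any OpenAI-compatible client can target this route by changing only the base URL; a sketch under that assumption (paths taken from the route above):

```python
# The OpenAI SDK accepts an arbitrary base_url, so pointing it at the
# Llama route reuses the Chat Completions client unchanged:
#
#   client = OpenAI(base_url="http://127.0.0.1:3030/llama/v1",
#                   api_key="anything")
#
# The request body matches the curl example exactly:
payload = {
    "model": "llama-3.1-70b",
    "messages": [{"role": "user", "content": "Hello!"}],
}
print(sorted(payload))
# → ['messages', 'model']
```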

Utilities

GET /v1/usage Get usage statistics

Returns server usage statistics including request counts and token usage.

Example Response

JSON
{
  "totalRequests": 1234,
  "successfulRequests": 1200,
  "failedRequests": 34,
  "totalInputTokens": 50000,
  "totalOutputTokens": 75000,
  "uptime": 3600
}
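A sketch that turns the sample response above into a success rate, using only the standard library (numbers taken from the example):

```python
import json

usage = json.loads('''
{
  "totalRequests": 1234,
  "successfulRequests": 1200,
  "failedRequests": 34,
  "totalInputTokens": 50000,
  "totalOutputTokens": 75000,
  "uptime": 3600
}
''')

# successful + failed should account for every request.
success_rate = usage["successfulRequests"] / usage["totalRequests"]
print(f"{success_rate:.1%}")
# → 97.2%
```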

GET /docs Swagger UI documentation

Access interactive API documentation using Swagger UI. Test endpoints directly from your browser.

Open in browser: http://127.0.0.1:3030/docs

  • Try It Out: Send real requests and see responses
  • Schema Explorer: View detailed request/response schemas
  • Offline: Works locally without external dependencies

Need Integration Guides?

Check out the Wiki for step-by-step guides on integrating with LangChain, Cursor, Aider, and more.

View Integration Guides