What you’ll learn
In this tutorial, you’ll learn how to:- Deploy a vLLM worker serving the
Qwen/qwen3-32b-awqmodel. - Configure your environment variables for n8n compatibility.
- Create a simple n8n workflow to test your integration.
- Connect your workflow to your Runpod endpoint.
Requirements
Before you begin, you’ll need:- A Runpod account (with available credits).
- A Runpod API key.
- An n8n account.
Step 1: Deploy a vLLM worker on Runpod
First, you’ll deploy a vLLM worker to serve theQwen/qwen3-32b-awq model.
1
Create a new vLLM endpoint
Open the Runpod console and navigate to the Serverless page.Click New Endpoint and select vLLM under Ready-to-Deploy Repos.
2
Configure your endpoint
In the deployment modal:
- In the Model field, enter
Qwen/qwen3-32b-awq. - Expand the Advanced section to configure your vLLM environment variables:
- Set Max Model Length to
8192. - Near the bottom of the page, check Enable Auto Tool Choice.
- Set Tool Call Parser to
Hermes. - Set Reasoning Parser to
Qwen3.
- Set Max Model Length to
- Click Next.
- Click Create Endpoint.
3
Copy your endpoint ID
Once deployed, you’ll be taken to the detail page for your endpoint in the Runpod console. You can find your endpoint ID in the Overview tab:
You can also find your endpoint ID in the URL of the endpoint detail page. For example, if the URL for your endpoint is

https://console.runpod.io/serverless/user/endpoint/isapbl1e254mbj, the endpoint ID is isapbl1e254mbj.Copy your endpoint ID to your clipboard. You’ll need it to configure your n8n workflow.Step 2: Create an n8n workflow
Next, you’ll create a simple n8n workflow to test your integration.1
Create a new workflow
Open n8n and navigate to your workspace, then click Create Workflow.
2
Add a chat message trigger
Click Add first step and select On chat message. Click Test chat to confirm.
3
Add AI Agent node
Click the + button and search for AI Agent and select it. Click Execute step to confirm.
4
Add OpenAI Chat Model node
Click the + button labeled Chat Model. Search for OpenAI Chat Model and select it.
5
Create a new credential
Click the dropdown under Credential to connect with and select Create new credential.
Step 3: Configure the OpenAI Chat Model node
Now you’ll configure the n8n OpenAI Chat Model node to use the model running on your Runpod endpoint.1
Add your Runpod API key
Under API Key, add your Runpod API Key. You can create an API key on the settings page of the Runpod console.
2
Configure the base URL
Under Base URL, replace the default OpenAI URL with your Runpod endpoint URL:Replace
ENDPOINT_ID with your vLLM endpoint ID from Step 1.3
Test the connection
Click Save, and n8n will automatically test your endpoint connection.It may take a few minutes for your endpoint to scale up a worker to process the request. You can monitor the request using the Workers and Requests tabs for your vLLM endpoint in the Runpod console.If you see the message “Connection tested successfully,” that means your endpoint is reachable, but it doesn’t gaurantee that it’s fully compatible with n8n—we’ll do that in the next step.
4
Select the Qwen3 model
Press escape to return to the OpenAI Chat Model configuration modal.Under Model, select
qwen/qwen3-32b-awq, then press escape to return to the workflow canvas.5
Type a test message
Type a test message into the chat box like “Hello, how are you?” and press enter.If everything is working correctly, you should see each of the nodes in your workflow go green to indicate successful execution, and a response from the model in the chat box.
Next steps
Congratulations! You’ve successfully used Runpod to power an AI agent on n8n. Now that you’ve integrated with n8n, you can:- Build complex AI-powered workflows using your Runpod endpoints.
- Explore other integration options with Runpod.
- Learn about OpenAI compatibility features in vLLM.