[Feature]: Enable custom init containers in vLLM helm chart #26927

@jgchn

🚀 The feature, motivation and pitch

I'm currently exploring the integration of the upstream vLLM Helm chart with llm-d, a Kubernetes-native distributed inferencing stack. llm-d utilizes a sidecar container as a routing proxy for prefill/decode scenarios, which forwards requests to prefill pods. This proxy is deployed as an init container on decode instances to ensure it is available before the main server starts.

However, the current upstream vLLM Helm chart has a limitation: when .extraInit is specified, the init container is hardcoded to perform model downloads. This prevents us from customizing init container behavior for use cases like llm-d. To benchmark llm-d, we need more flexible init container configuration.

I have two potential approaches to address this. Depending on community feedback, I'm happy to open a PR for the preferred solution.

Alternatives

Solution 1: Breaking, but cleaner. A refactor that introduces a more extensible init container specification, at the cost of a breaking change for existing users.

Move the model download settings into .extraInit.downloadModel. If .extraInit.downloadModel.enable is true, the wait-download-model container is the first init container, and the containers listed in .extraInit.custom are appended after it. By default, values.yaml will have downloadModel.enable set to true. A user's values.yaml might look like this when including the model download container:

extraInit:
  # When enable is true, create the model download container first in the list
  downloadModel:
    enable: true
    s3modelpath: "relative_s3_model_path/opt-125m"
    pvcStorage: "1Gi"
    awsEc2MetadataDisabled: true

  # Add custom init containers
  custom:
  - name: llm-d-routing-proxy
    image: ghcr.io/llm-d/llm-d-routing-sidecar:v0.2.0
    args: []
    command: []
  - name: another-init
    # ...

and without the download container:

extraInit:
  # When enable is false, the model download container is omitted
  downloadModel:
    enable: false

  # Add custom init containers
  custom:
  - name: llm-d-routing-proxy
    image: ghcr.io/llm-d/llm-d-routing-sidecar:v0.2.0
    args: []
    command: []
  - name: another-init
    # ...
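
A minimal sketch of how the Deployment template might implement this ordering under Solution 1. It is not the chart's current template: the field names follow the proposal above, the body of the download container is elided, and the indentation passed to nindent is an assumption about where initContainers sits in the pod spec.

```yaml
# templates/deployment.yaml (hypothetical excerpt for Solution 1)
{{- $init := default (dict) .Values.extraInit }}
{{- $download := default (dict) $init.downloadModel }}
{{- if or $download.enable $init.custom }}
      initContainers:
      {{- if $download.enable }}
      # Model download container always comes first when enabled
      - name: wait-download-model
        # existing image, command, and env for the model download go here
      {{- end }}
      {{- with $init.custom }}
      # User-supplied init containers are appended verbatim
      {{- toYaml . | nindent 6 }}
      {{- end }}
{{- end }}
```

Because the entries in .extraInit.custom would be passed through unchanged, any valid container field could be set on them, for example restartPolicy: Always on clusters that support native sidecar containers, so that the routing proxy keeps running after the main server starts.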

Solution 2: Non-breaking (but not programmatically elegant). A workaround that maintains backward compatibility but may not be ideal in terms of chart design.

Add a new field to values.yaml, .Values.extraCustomInit, where users can specify their own init containers. These are appended after the wait-download-model container if the .Values.extraInit fields are non-empty; otherwise they are the only init containers. This approach does not break existing users' deployments but adds cognitive overhead to the values interface. By default, values.yaml will have extraCustomInit: [].

To include the model download container:

extraInit:
  s3modelpath: "relative_s3_model_path/opt-125m"
  pvcStorage: "1Gi"
  awsEc2MetadataDisabled: true

# Add custom init containers
extraCustomInit: 
- name: llm-d-routing-proxy
  image: ghcr.io/llm-d/llm-d-routing-sidecar:v0.2.0
  args: []
  command: []
- name: another-init
  # ...

and to exclude the model download container:

extraInit: {}

# Add custom init containers
extraCustomInit: 
- name: llm-d-routing-proxy
  image: ghcr.io/llm-d/llm-d-routing-sidecar:v0.2.0
  args: []
  command: []
- name: another-init
  # ...
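
For comparison, a minimal sketch of the template logic Solution 2 would need. Again this is an assumption rather than the chart's actual template: the download container body is elided and the nindent value depends on the surrounding pod spec. It relies on Helm treating an empty map ({}) and an empty list ([]) as false.

```yaml
# templates/deployment.yaml (hypothetical excerpt for Solution 2)
{{- if or .Values.extraInit .Values.extraCustomInit }}
      initContainers:
      {{- if .Values.extraInit }}
      # Unchanged behavior: any non-empty extraInit creates the download container
      - name: wait-download-model
        # existing image, command, and env for the model download go here
      {{- end }}
      {{- with .Values.extraCustomInit }}
      # User-supplied init containers are appended verbatim
      {{- toYaml . | nindent 6 }}
      {{- end }}
{{- end }}
```

With extraInit: {} and a non-empty extraCustomInit, only the custom containers would be rendered, which matches the second example above.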
