vllm.compilation.piecewise_backend
ConcreteSizeEntry dataclass
Source code in vllm/compilation/piecewise_backend.py
PiecewiseBackend
compiled_graph_for_general_shape instance-attribute
is_last_graph instance-attribute
__call__
__call__(*args) -> Any
__init__
__init__(
graph: GraphModule,
vllm_config: VllmConfig,
piecewise_compile_index: int,
total_piecewise_compiles: int,
sym_shape_indices: list[int],
compiled_graph_for_general_shape: Callable,
vllm_backend: VllmBackend,
)
The backend for piecewise compilation. It handles compilation for static shapes and dispatches to the matching compiled graph based on the runtime shape.
self.graph is compiled once for the general shape, and then compiled again for each shape listed in compilation_config.compile_sizes.
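The dispatch pattern described above can be illustrated with a small, dependency-free sketch. This is not vLLM's implementation: ToyPiecewiseDispatcher, compile_for_size, and sym_shape_index are hypothetical names chosen for the example. It also assumes that the runtime shape is read from a single positional argument and that size-specific graphs are built lazily on first use.

```python
from typing import Any, Callable


class ToyPiecewiseDispatcher:
    """Toy illustration of shape-based dispatch; not vLLM's PiecewiseBackend."""

    def __init__(
        self,
        general_fn: Callable[..., Any],
        compile_for_size: Callable[[int], Callable[..., Any]],  # hypothetical
        compile_sizes: list[int],
        sym_shape_index: int,  # hypothetical: position of the size argument
    ) -> None:
        # The general-shape callable is built once, up front.
        self.general_fn = general_fn
        self.compile_for_size = compile_for_size
        self.compile_sizes = set(compile_sizes)
        self.sym_shape_index = sym_shape_index
        # Cache of size-specific callables, filled on first use (assumption).
        self.specialized: dict[int, Callable[..., Any]] = {}

    def __call__(self, *args: Any) -> Any:
        runtime_size = args[self.sym_shape_index]
        # Sizes that were not requested fall back to the general-shape callable.
        if runtime_size not in self.compile_sizes:
            return self.general_fn(*args)
        # Build the size-specific callable the first time this size is seen.
        if runtime_size not in self.specialized:
            self.specialized[runtime_size] = self.compile_for_size(runtime_size)
        return self.specialized[runtime_size](*args)


def general(n: int, xs: list[int]) -> tuple[str, int]:
    # Stand-in for the graph compiled for the general (dynamic) shape.
    return ("general", sum(xs[:n]))


def make_specialized(size: int) -> Callable[[int, list[int]], tuple[str, int]]:
    # Stand-in for compiling the graph with a fixed, known size.
    def run(n: int, xs: list[int]) -> tuple[str, int]:
        return (f"static-{size}", sum(xs[:size]))

    return run


dispatcher = ToyPiecewiseDispatcher(
    general, make_specialized, compile_sizes=[2, 4], sym_shape_index=0
)
print(dispatcher(2, [1, 2, 3, 4]))  # size 2 is in compile_sizes
print(dispatcher(3, [1, 2, 3, 4]))  # size 3 is not, so the general path runs
```

In this sketch, only sizes listed in compile_sizes ever get a specialized callable. Every other size reuses the general one, which keeps the number of compilations bounded by the length of that list.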