forked from cmp-nct/ggllm.cpp
Float #1
Open · 19h wants to merge 6,086 commits into 19h:master from ggml-org:master
+689,118 −53,734
Conversation
* contrib : update roles
* contrib : merge PR sections + add link to CI instructions

Updated pull request guidelines for contributors and collaborators, and clarified merging practices for maintainers.
…#16124)

* claim responsibility for ci, gguf-py and convert
* add myself to various src/llama- files
* Vulkan: add conv_transpose_2d operation
* Vulkan: fix typo in conv_transpose_2d shader (s0mp, s0L, s1mp, s1L)
* Vulkan: fix incorrect indentation in conv_transpose_2d shader
* Vulkan: add checking the push constants size limit and reuse conv2d_mm.comp for the conv_transpose_2d operation
* Vulkan: revert the order of the index calculation and bound check in conv_2d shader
* Vulkan: explicitly check push constants limit in supports_op() for the conv_transpose_2d operation
* Vulkan: remove unnecessary lower bound checks for H/W_idx in the conv_2d shader
* ggml : add ggml_op_is_empty
* ggml : move to ggml-impl.h
* ggml : extend ggml_can_fuse to work with non-sequential nodes in the graph
* cont : fix wrong bounds check condition
* cont : remove unnecessary overload
These two local variables `arg` and `arg_prefix` are shadowed by:

1. `for (const auto & arg : opt.args)`
2. `for (int i = 1; i < argc; i++) { const std::string arg_prefix = "--"; std::string arg = argv[i];`
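A minimal sketch of the pattern being flagged (the enclosing function and types are reconstructed for illustration, not taken from the actual source):

```cpp
#include <string>
#include <vector>

// 'arg' and 'arg_prefix' declared in the outer scope are never the
// variables actually used below; each loop declares its own copy
void parse_args(int argc, char ** argv, const std::vector<std::string> & opt_args) {
    std::string arg;
    std::string arg_prefix;

    for (const auto & arg : opt_args) {       // shadows the outer 'arg'
        (void) arg;
    }
    for (int i = 1; i < argc; i++) {
        const std::string arg_prefix = "--";  // shadows the outer 'arg_prefix'
        std::string arg = argv[i];            // shadows the outer 'arg'
        (void) arg_prefix;
        (void) arg;
    }
}
```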
* common : use the json parser

Signed-off-by: Adrien Gallouët <[email protected]>

* common : enable --offline mode without CURL support

This change refactors the download logic to properly support offline mode even when the project is built without CURL. Without this commit, using `--offline` would give the following error:

    error: built without CURL, cannot download model from the internet

even if all the files are already cached.

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
---------

Co-authored-by: slaren <[email protected]>
* implement set_rows with i32 index
* template fix
* test quantized path, warnings--
* Apply suggestions from code review
* forgotten name change
* deduplicate cuda/sycl and test-fix
* indent++
* vulkan: support set_rows with i32 index type (#16162)
* disable i32 index for webgpu for now

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: Jeff Bolz <[email protected]>
Disable 'performance-enum-size' checking:

    Enum 'llama_token_type' uses a larger base type ('unsigned int', size: 4 bytes) than necessary for its value set, consider using 'std::uint8_t' (1 byte) as the base type to reduce its size.
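For context, the change the check asks for would look like this sketch (hypothetical, since the PR disables the check rather than applying it; enumerator names follow the pattern in llama.h):

```cpp
#include <cstdint>

// what 'performance-enum-size' suggests: give the enum an explicit
// 1-byte base type so it no longer defaults to 'unsigned int'
enum llama_token_type : std::uint8_t {
    LLAMA_TOKEN_TYPE_UNDEFINED = 0,
    LLAMA_TOKEN_TYPE_NORMAL    = 1,
    LLAMA_TOKEN_TYPE_UNKNOWN   = 2,
    LLAMA_TOKEN_TYPE_CONTROL   = 3,
};
```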
…n) (#16177)

This is a configuration of the hparams in the GraniteHybrid architecture that devolves to the Granite (or GraniteMoe) architecture (i.e. Granite 3.x). It may be used for some models in the Granite 4 family, with the GraniteHybrid architecture acting as a superset arch. Rather than support it directly in the C++ graph, we simply coerce the architecture flag back to the correct "granite" or "granitemoe" architecture.

Branch: gabe-l-hart/GraniteNonHybridConversion

Signed-off-by: Gabe Goodhart <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
* devops: add s390x dockerfile
* devops: add missing ninja
* devops: move s390x docker into cpu docker
* devops: rework s390x docker
* devops: copy more tools
* devops: add server build step
* devops: remove apt clean steps as distroless misses it
* devops: remove apt commands from distroless
* devops: fix shared libs in distroless
* devops: use correct libs path
* devops: fix shared libs
* devops: add collector stage
* devops: fix missing stage ref
* devops: fix permission issue
* devops: fix unknown model loading failures
* devops: attempt at fixing model loading failure
* devops: fix missing ggml shared object failure to load model
* devops: remove move shared objects
* devops: move libggml-cpu and blas into bin
* devops: finalise hardened server stage
* devops: add cli target
* devops: fix typos
* devops: fix missing shared libraries in base
* devops: update debian target
* devops: formalise llama.cpp loc
* Revert "devops: formalise llama.cpp loc" (reverts commit 0a7664a)
* devops: formalise llama.cpp loc (cherry picked from commit 0a7664a)
* devops: attempt at fixing missing dir
* devops: attempt at making it cache the build
* devops: fix copying process
* devops: make build dir an argument
* Revert "devops: make build dir an argument" (reverts commit 4386989)
* devops: add build stage for gguf-py
* devops: move gguf-py installation into build stage
* devops: break system packages?
* devops: add rust compiler installer
* devops: fix rustc not found
* devops: remove cache mount to allow rustc to persist
* devops: move rustc installation to another layer
* devops: move gguf-py installation to full stage, fix copying
* devops: remove rustc installation in build
* devops: disable full target for now
* devops: attempting static build
* devops: merge s390x dockerfile into cpu for now
* devops: switch to gcc image for build step
* devops: remove build essentials
* devops: install openblas into base target
* devops: go back to s390x dockerfile
* devops: remove libggml and libblas
* devops: add full target
* devops: add break system packages
* devops: add libjpeg
* devops: add missing cmake dep
* devops: finalise docker images for s390x
* devops: add custom openblas patch
* devops: use libopenblas-dev instead of libopenblas-openmp-dev
* devops: add s390x docker build

---------

Signed-off-by: Aaron Teo <[email protected]>
This commit adds examples/model-conversion/ to the CODEOWNERS file and assigns myself (@danbev) as the code owner for this directory.
* zdnn: initial matmul refactor
* ggml-zdnn: rm static from funcs
* ggml-zdnn: update ggml-zdnn.h
* ggml-zdnn: change header files to hpp
* ggml-zdnn: switch to common.hpp
* ggml-zdnn: move mulmat forward around
* ggml-zdnn: rm inline from utils
* ggml-zdnn: code cleanup
* docs: add zDNN docs

---------

Signed-off-by: Aaron Teo <[email protected]>
* ci : disable AMD workflows + update NVIDIA workflows
* cont : fixes
* cont : update nvidia vulkan workflows
Fix two incorrect make targets in the readme. Signed-off-by: Jie Fu <[email protected]>
This commit adds a leading slash to the paths of root-level files in the CODEOWNERS file. The motivation is that, without the leading slash, these patterns might also match files in subdirectories and override their other/additional owners.

Refs: #16209 (comment)
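A hypothetical example of the difference (the path and handle below are illustrative, not the actual CODEOWNERS entries):

```
# without a leading slash this pattern also matches the same file name
# anywhere in the tree, e.g. examples/foo/CMakeLists.txt
CMakeLists.txt   @some-owner

# anchored to the repository root: only the top-level file matches
/CMakeLists.txt  @some-owner
```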
Signed-off-by: Jie Fu <[email protected]>
Signed-off-by: Uilian Ries <[email protected]>
…ontaining "." (#16215) Signed-off-by: Jie Fu <[email protected]>
* model : add label for LiquidAI LFM2-2.6B model

HF link: [LiquidAI/LFM2-2.6B](https://huggingface.co/LiquidAI/LFM2-2.6B). Support for GGUF conversion and inference was added in #14620. However, due to a similar `n_embd`, it identifies as a 1.2B model. Fix the label by using `n_ff` to identify the model instead.

Output of `llama-bench`:

```
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| lfm2 1.2B F16                  |   2.18 GiB |     1.17 B | CPU        |      10 |           pp512 |        223.97 ± 5.32 |
| lfm2 2.6B F16                  |   4.79 GiB |     2.57 B | CPU        |      10 |           pp512 |         92.53 ± 4.14 |
| lfm2 350M F16                  | 676.25 MiB |   354.48 M | CPU        |      10 |           pp512 |       725.52 ± 11.70 |
| lfm2 700M F16                  |   1.38 GiB |   742.49 M | CPU        |      10 |           pp512 |       336.22 ± 12.93 |
```

* Update src/llama-model.cpp

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>
…15815)

* ggml : make gallocr respect the backend's max buffer size
* if the graph requires more memory than can fit into a single allocation, split it into multiple backend buffers
* vulkan: report the actual max allocation size in buffer type interface
* fix missing newline, apple-clang warning
* track size of individual chunks in ggml_dyn_tallocr and raise max chunks; revert to use suballocation_block_size as max chunk size for vulkan
* track (chunk, offset) pairs instead of "global" offsets through gallocr: simpler, don't need loops to map between local/global offsets, but touches more code
* fix dyn_tallocr_max_size and initialization
* fix memory leak when buffers are reused due to same buffer type appearing multiple times
* make vbuffer allocation follow the same logic as backend_buffer did before
* continue to use leftover unallocated space of previous chunks after a new one has been created
* treat free blocks of each chunk as separate list: they're still allocated together, but start/end of each chunk is tracked, and allocate/free iterate over sub-ranges
* exhaust freed blocks of all chunks before considering their last blocks with unallocated space
* start with 0 chunks/blocks and create chunks as needed
* allow the last chunk to grow beyond max size
* refactor: move adding new free block and new chunk into separate functions
* allocate chunks individually with a separate free-blocks list for each one: needs a bit more memory/allocations/indirections, but code is simpler
* fix warnings (missing static) & debug checks
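A sketch of the resulting allocator layout (struct and field names are illustrative, not the actual ggml code):

```c
#include <stddef.h>

#define MAX_FREE_BLOCKS 256
#define MAX_CHUNKS       16

struct free_block { size_t offset, size; };

// one chunk corresponds to one backend buffer, capped by the backend's
// max allocation size (except the last chunk, which may grow beyond it)
struct tallocr_chunk {
    struct free_block free_blocks[MAX_FREE_BLOCKS]; // per-chunk free list
    int    n_free_blocks;
    size_t max_size;
};

// positions are tracked as (chunk, offset) pairs rather than "global" offsets
struct dyn_tallocr {
    struct tallocr_chunk * chunks[MAX_CHUNKS]; // created on demand, starts at 0
    int n_chunks;
};
```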
* CUDA: use fastdiv + ggml_cuda_mad for mmvf
* use bf16 directly + fix formatting
* Add exception for HIP code
Enable CMP0147 so custom build steps (invoking vulkan-shader-gen) are run in parallel. Enable /MP so source files are compiled in parallel.
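In CMake terms the two settings would look roughly like this (a sketch; where exactly they live in the build files may differ):

```cmake
# run custom build commands (e.g. vulkan-shader-gen invocations) in
# parallel under the Visual Studio generators (CMake >= 3.27)
if (POLICY CMP0147)
    cmake_policy(SET CMP0147 NEW)
endif()

# compile source files in parallel with MSVC
if (MSVC)
    add_compile_options(/MP)
endif()
```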
Signed-off-by: Stefan Savic <[email protected]>
Co-authored-by: Stefan Savic <[email protected]>
* metal : avoid using Metal's gpuAddress property
* metal : fix rope kernels buffer check
* CUDA set scheduling strategy to spinning for cc121
* Using prop.major and prop.minor, include HIP and MUSA
* Exclude HIP and MUSA
* Remove trailing whitespace
* Remove empty line

---------

Co-authored-by: Johannes Gäßler <[email protected]>
* llama-quant: add support for mmproj
* Update src/llama.cpp
* check prefix instead
* small fix

---------

Co-authored-by: Georgi Gerganov <[email protected]>
* optimise GGML_OP_SUM
* add non-contiguous tests by permuting the input
* change tests to require full contiguity of OP_SUM
* cuda : add check GGML_OP_SUM

---------

Co-authored-by: Georgi Gerganov <[email protected]>
* opencl: add mm_q8_0_f32
* opencl: fix data loading for incomplete tile
* opencl: use q8_0 mm for larger matrix
* opencl: add some tests to cover the path
* CPU: Add support for FLOOR, CEIL, ROUND and TRUNC unary operators
  - Added the operators to the unary op enum
  - Implemented API functions
  - Implemented forward and unary-op logic in the CPU backend
  - Updated ggml_get_n_tasks
  - Updated the operator names array and static_assert
  - Updated docs and enabled automatic tests
* docs: add documentation for ggml_trunc and ggml_trunc_inplace in ggml.h
* chore: remove trailing whitespace from ggml.h
* Remove unresolved merge markers
* Apply review suggestions: cleanup formatting, enum order and leftover artifacts
* Regenerate ops.md using create_ops_docs.py
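The new operators follow ggml's usual unary API; a minimal usage sketch (only ggml_trunc and ggml_trunc_inplace are named in the commit, the other signatures are inferred from that pattern):

```cpp
#include "ggml.h"

// build the four rounding ops on an existing F32 tensor
static void build_rounding_ops(struct ggml_context * ctx, struct ggml_tensor * x) {
    struct ggml_tensor * f = ggml_floor(ctx, x);
    struct ggml_tensor * c = ggml_ceil (ctx, x);
    struct ggml_tensor * r = ggml_round(ctx, x);
    struct ggml_tensor * t = ggml_trunc(ctx, x); // also: ggml_trunc_inplace(ctx, x)
    (void) f; (void) c; (void) r; (void) t;
}
```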
BF16 requires special handling in this script: it is 2-byte data, but the view is 1-byte by default. Switch to the correct view before attempting byteswapping. With this change, correctly byteswapping models like Meta-Llama-3-8B-Instruct-bf16-GGUF should be possible.
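The underlying issue, illustrated in C++ terms (the actual script is Python/numpy; this sketch only shows why a 2-byte view is required):

```cpp
#include <cstddef>
#include <cstdint>

// byteswap a BF16 tensor: each element is 2 bytes, so the data must be
// processed as 16-bit units; "swapping" individual bytes is a no-op
static void byteswap_bf16(uint16_t * v, size_t n_elems) {
    for (size_t i = 0; i < n_elems; i++) {
        v[i] = (uint16_t) ((v[i] >> 8) | (v[i] << 8));
    }
}
```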
* SYCL: Add GGML_OP_MEAN operator support
* SYCL: Fix formatting for GGML_OP_MEAN case
* Update ggml/src/ggml-sycl/ggml-sycl.cpp

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>
## Why it failed

When compiling with strict compiler flags (-Wwrite-strings -Werror=discarded-qualifiers), the build fails with the following error:

```
cmake \
    -S . \
    -B ../llama.cpp.build \
    --preset=x64-linux-gcc-debug \
    -DCMAKE_INSTALL_PREFIX=/tmp/local \
    -DCMAKE_C_FLAGS="-Wwrite-strings -Werror=discarded-qualifiers" && \
cmake --build ../llama.cpp.build/
...
/home/otegami/work/cpp/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: In function ‘ggml_cpu_init’:
/home/otegami/work/cpp/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3572:24: error: passing argument 1 of ‘putenv’ discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
 3572 |                 putenv("KMP_BLOCKTIME=200"); // 200ms
      |                        ^~~~~~~~~~~~~~~~~~~
In file included from /home/otegami/work/cpp/llama.cpp/ggml/src/./ggml-impl.h:10,
                 from /home/otegami/work/cpp/llama.cpp/ggml/src/ggml-cpu/ggml-cpu-impl.h:6,
                 from /home/otegami/work/cpp/llama.cpp/ggml/src/ggml-cpu/traits.h:3,
                 from /home/otegami/work/cpp/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:6:
/usr/include/stdlib.h:786:26: note: expected ‘char *’ but argument is of type ‘const char *’
  786 | extern int putenv (char *__string) __THROW __nonnull ((1));
      |            ~~~~~~^~~~~~~~
cc1: some warnings being treated as errors
ninja: build stopped: subcommand failed.
```

The issue is that putenv() expects a non-const char * but receives a string literal (const char *).

## How to fix

This PR replaces putenv("KMP_BLOCKTIME=200") with setenv("KMP_BLOCKTIME", "200", 0).

Benefits of setenv():
- Accepts const char * parameters (no qualifier warnings)
- Makes copies of the strings (safer memory handling)
- The third parameter (0) ensures we don't overwrite if already set
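The replacement itself is a one-line change, sketched here in a standalone helper (the wrapper function is hypothetical; in the tree the call sits inside ggml_cpu_init):

```c
#include <stdlib.h>

static void set_kmp_blocktime(void) {
    // putenv("KMP_BLOCKTIME=200") passed a string literal (const char *)
    // where putenv() expects char *; setenv() accepts const char * and
    // copies its arguments, and the trailing 0 means "do not overwrite
    // if the variable is already set"
    setenv("KMP_BLOCKTIME", "200", 0);
}
```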
* Update the docs on -t --threads
* Revert "Update the docs on -t --threads" (reverts commit eba9734)
* docs: clarify -t/--threads parameter uses CPU threads and defaults to all available cores
* Update arg.cpp
This commit applies .clang-format rules to all source files under the ggml-cann directory to ensure consistent coding style and readability. The .clang-format option `SortIncludes: false` has been set to disable automatic reordering of include directives. No functional changes are introduced.

Co-authored-by: hipudding <[email protected]>
* SYCL: update element-wise ops and presets
* clean arange
* Re-trigger CI

---------

Co-authored-by: Gitty Burstein <[email protected]>
…iters (#16599)

* fix: added a normalization step for MathJax-style \[\] and \(\) delimiters

  So inline and block equations are converted before KaTeX rendering, enabling proper display of model-generated LaTeX in the WebUI

* chore: update webui build output
* SYCL/SET: implement operator + wire-up; docs/ops updates; element_wise & ggml-sycl changes
* sycl(SET): re-apply post-rebase; revert manual docs/ops.md; style cleanups
* move SET op to standalone file, GPU-only implementation
* Update SYCL SET operator for F32
* ci: fix editorconfig issues (LF endings, trailing spaces, final newline)
* fixed ggml-sycl.cpp

---------

Co-authored-by: Gitty Burstein <[email protected]>
… conversion logic (#16626)
* initial: headers and metal-device.cpp updates
* adding conv_transpose_2d
* fix type
* fix type: int32->int64
* Update ggml/src/ggml-metal/ggml-metal.metal
* Update ggml/src/ggml-metal/ggml-metal.metal
* Update ggml/src/ggml-metal/ggml-metal.metal
* add checks for src[0] and src[1]; add type checks
* Update ggml-metal.metal
* add more tests, add optimization to threading
* add dynamic memory allocation in metal

---------

Co-authored-by: Georgi Gerganov <[email protected]>
* webui: reorganize settings layout
* chore: update webui build output
* fix: remove unused variable
* chore: update webui build output
Fix incorrect task-to-batch index calculation in the quantization phase.

The bug caused out-of-bounds access to the qnbitgemm_args array when compute_idx exceeded per_gemm_block_count_m, leading to invalid pointer dereferences and SIGBUS errors. Correctly map tasks to batches by dividing compute_idx by per_gemm_block_count_m instead of block_size_m.

Example: batch_feature=1, gemm_m=30, block_size_m=4, so per_gemm_block_count_m = 8 and task_count = 8.

Old: gemm_idx = 4/4 = 1 (out of bounds)
New: gemm_idx = 4/8 = 0 (correct)

Tested on SpaceMit K1 RISC-V64 with the qwen2.5:0.5b model.

Co-authored-by: muggle <[email protected]>
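A sketch of the corrected mapping (names taken from the description above; the surrounding qnbitgemm plumbing is omitted and the m_idx line is inferred, not quoted from the patch):

```cpp
#include <cstddef>

// with batch_feature = 1, gemm_m = 30, block_size_m = 4:
//   per_gemm_block_count_m = ceil(30 / 4) = 8, task_count = 8
// old: gemm_idx = compute_idx / block_size_m           -> 4 / 4 = 1 (out of bounds)
// new: gemm_idx = compute_idx / per_gemm_block_count_m -> 4 / 8 = 0 (correct)
static void map_task(size_t compute_idx,
                     size_t per_gemm_block_count_m,
                     size_t block_size_m,
                     size_t * gemm_idx, size_t * m_idx) {
    *gemm_idx = compute_idx / per_gemm_block_count_m;
    *m_idx    = (compute_idx % per_gemm_block_count_m) * block_size_m;
}
```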