(base) server@Server:~/llama.cpp$ ./quantize ./models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf ./models/Phi-3-mini-4k-instruct/ggml-model-Q4_K_M.gguf Q4_K_M
main: build = 913 (eb542d3)
main: quantizing './models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf' to './models/Phi-3-mini-4k-instruct/ggml-model-Q4_K_M.gguf' as Q4_K_M
llama.cpp: loading model from ./models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf
llama_model_quantize: failed to quantize: unknown (magic, version) combination: 46554747, 00000003; is this really a GGML file?
main: failed to quantize model from './models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf'


I'm following along with a quantization walkthrough, but this error comes up and the quantization fails.


Converting the base model as far as the f16 GGUF works fine (INFO:hf-to-gguf:Model successfully exported to 'models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf') with:


python3 convert.py ./models/Phi-3-mini-4k-instruct


but the error above appears at the quantization step.
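
In case it helps with the diagnosis, here is a minimal sketch to check what the converted file's header actually contains. It assumes the standard GGUF header layout (a 4-byte magic followed by a little-endian uint32 version); the path is the one from my command above.

import struct

# Path of the converted f16 model from the post above.
path = "./models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf"

with open(path, "rb") as f:
    magic = f.read(4)                             # first 4 bytes: file magic
    (version,) = struct.unpack("<I", f.read(4))   # next 4 bytes: LE uint32 version

print("magic bytes:", magic, "(as LE uint32: %08x)" % struct.unpack("<I", magic)[0])
print("version:", version)
# b"GGUF" read as a little-endian uint32 is exactly 0x46554747,
# so the values in the error message would mean the file is GGUF, version 3.

Reading the header this way, 46554747 looks like the magic bytes "GGUF" themselves and 00000003 like the GGUF version, so I wonder whether my quantize binary (build = 913) is simply too old to read GGUF v3 files.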