compress_model appears to quantize the model by iterating through every module and quantizing each one in turn. Maybe we could parallelize that. But also, our model is natively quantized: the weights are already stored in the quantized format, so we shouldn't need to quantize them again. Yet compress_model is called whenever the config indicates the model is quantized, with no check for whether the weights are already compressed. Let's try deleting the call to compress_model and see whether the problem goes away without anything else breaking.
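To make the hypothesis concrete, here's a minimal sketch of the guard that seems to be missing. Everything below is hypothetical: this `compress_model` is a naive stand-in for the real pass (simple symmetric int8 rounding), and `weights_already_compressed` / `maybe_compress_model` are names invented for illustration, not the library's actual API.

```python
import torch
import torch.nn as nn

def compress_model(model: nn.Module) -> nn.Module:
    # Stand-in for the real pass: walk every module and quantize its
    # weights one by one (naive symmetric int8, for illustration only).
    for module in model.modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            scale = w.abs().max() / 127.0
            q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
            module.weight = nn.Parameter(q, requires_grad=False)
            module.register_buffer("weight_scale", scale)
    return model

def weights_already_compressed(model: nn.Module) -> bool:
    # Heuristic: if any weight is stored in an integer dtype, the model
    # was compressed already and the pass must not run a second time.
    return any(not p.is_floating_point() for p in model.parameters())

def maybe_compress_model(model: nn.Module, config: dict) -> nn.Module:
    # The guard the original code appears to be missing: honor the config
    # flag, but skip the pass when the weights are already quantized.
    if config.get("quantized", False) and not weights_already_compressed(model):
        compress_model(model)
    return model

model = nn.Sequential(nn.Linear(4, 4))
maybe_compress_model(model, {"quantized": True})  # quantizes once
maybe_compress_model(model, {"quantized": True})  # no-op: weights already int8
```

Either fix would avoid the double quantization: deleting the call entirely (if the model always ships pre-quantized) or adding a dtype guard like the one above. The guard is the safer option if some checkpoints might still arrive with float weights.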