I switched to the current build (so NOT v1.1.0) because it supports max_memory. The run completed without any problems until I had to select which trial to save locally.
I selected a trial and gave it a name, but as soon as saving should have started I got a "CUDA out of memory" error. The app kept running, but the error occurred 3 more times. I then went back to the trial selection and selected the same trial again, and this time the app crashed completely with the same message.
I don't know what the cause is, maybe a memory handling problem.
I tried to use heretic on my local RTX 4090 with the Mistral Nemo (12B) model, which doesn't fully fit into VRAM. With 1.1.0 I ran into out-of-memory errors and could only use a batch size of 32 (64 failed). With VRAM limited to 18GB I could use a batch size of 128, which is still a lot faster even though more of the model sits on the CPU. But then I hit the problem described above at the end.
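For context, here is a rough back-of-the-envelope estimate of why the model doesn't fit. This is only a sketch under assumptions that are not confirmed above: the weights are held in a 16-bit format (2 bytes per parameter), "12B" is taken at face value as 12e9 parameters, and the 18GB limit is treated as GiB. The helper name `weight_footprint_gib` is made up for illustration and is not part of heretic.

```python
def weight_footprint_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights alone, in GiB.

    Ignores activations, KV cache, and allocator overhead, so real
    usage will be higher than this number.
    """
    return num_params * bytes_per_param / 1024**3


NOMINAL_PARAMS = 12e9  # assumption: "12B" taken at face value
VRAM_LIMIT_GIB = 18    # assumption: the 18GB limit treated as GiB

weights = weight_footprint_gib(NOMINAL_PARAMS)
offloaded = max(0.0, weights - VRAM_LIMIT_GIB)

print(f"weights alone:            ~{weights:.1f} GiB")
print(f"offloaded under the limit: ~{offloaded:.1f} GiB or more")
```

If those assumptions hold, the weights alone come close to the 24 GB on the card, so there is very little headroom left for anything that allocates extra GPU memory at save time.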
nvidia-smi (I'm on CachyOS) showed heretic using 22.5GB of VRAM when I had only selected a trial to save, so before saving had even started.
I added all my save attempts, with every error up to the crash, to log.txt.
log.txt