Description
I encountered the following error while training a binary classification task with LightGBM 4.5.0 on an H100 GPU with device="cuda":
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pywrapper_utils/run_thread/full_batch_run_thread.py", line 47, in _execute_user_function
    result = self.user_main_function(**kwargs)
  File "/opt/module/source/main.py", line 31, in main
    model.perform_all_calculations()
  File "/opt/module/source/model/feature_selector.py", line 61, in perform_all_calculations
    selected_features: List[Tuple] = self.select_features(base_model, kfold)
  File "/opt/module/source/model/feature_selector.py", line 84, in select_features
    model.fit(X_train, y_train)
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/sklearn.py", line 1284, in fit
    super().fit(
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/sklearn.py", line 955, in fit
    self._Booster = train(
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/engine.py", line 307, in train
    booster.update(fobj=fobj)
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/basic.py", line 4135, in update
    _safe_call(
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/basic.py", line 296, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode("utf-8"))
lightgbm.basic.LightGBMError: [CUDA] invalid argument /tmp/pip-install-9rgzugd6/lightgbm_37941d8e64514c0e844ef71f72ef6b9c/src/boosting/goss.hpp 63
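The traceback ends in src/boosting/goss.hpp, which suggests GOSS sampling was active, but the model parameters are not shown here. Below is a rough sketch of a standalone reproduction under that assumption. The parameter values, the synthetic data, and the helper names build_params and run_repro are all hypothetical and not taken from the failing code.

```python
def build_params(device="cuda"):
    """Hypothetical parameters; the report does not list the real ones."""
    return {
        "objective": "binary",
        "device": device,
        # Assumption: GOSS is enabled, inferred only from goss.hpp in the error.
        "data_sample_strategy": "goss",
        "n_estimators": 20,
        "verbosity": -1,
    }


def run_repro(device="cuda", n_samples=10_000, n_features=20, seed=0):
    """Fit a small binary classifier on synthetic data with the given device."""
    # Imported here so build_params stays usable without LightGBM installed.
    from lightgbm import LGBMClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(
        n_samples=n_samples, n_features=n_features, random_state=seed
    )
    model = LGBMClassifier(**build_params(device))
    model.fit(X, y)
    return model
```

Calling run_repro() on the affected machine, and then run_repro(device="cpu"), may help show whether the failure is specific to the CUDA build or also depends on the data.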
Environment info
Python 3.9
CUDA 12.4
scikit-learn==1.6.1
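For completeness, the exact installed versions can be collected with a short script like the one below. This is only a sketch using the standard library; the function name collect_versions and the default package list are my own choices.

```python
import platform
from importlib import metadata


def collect_versions(packages=("lightgbm", "scikit-learn", "numpy")):
    """Return the interpreter, OS, and installed package versions as a dict."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

The CUDA toolkit and driver versions are not visible to this script and still need to be reported separately.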
Command(s) you used to install LightGBM
pip install lightgbm --config-settings=cmake.define.USE_CUDA=ON