MFDNN-14690: Replace XE3P_35_10/11/UNKNOWN Core enum values with Xe3p #4981
@@ -1300,7 +1293,7 @@ class GRF : public Register
     static constexpr int maxRegs() { return 512; }
     static constexpr int maxRegs(HW hw) {
         return (hw < HW::XeHP) ? 128
-            : (hw == HW::XE3P_35_11) ? 512
+            : (hw >= HW::Xe3p) ? 512
Per my understanding of the spec, all Xe3p variants should support 512 GRFs.
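The register-count selection discussed here can be sketched in isolation. This is a minimal illustration, not the nGEN source: the `HW` enum below is a simplified stand-in, `max_regs` is a hypothetical free function, and the 256-register fallback for the middle range is an assumption, since the diff context ends before that branch.

```cpp
// Simplified stand-in for the HW enum; the real enum has more entries.
// Ordering matters because the function relies on < and >= comparisons.
enum class HW { Unknown, Gen9, Gen12LP, XeHP, XeHPG, XeHPC, Xe2, Xe3, Xe3p };

// Mirrors the shape of the revised maxRegs(HW) in the diff above.
// ASSUMPTION: 256 is used as the fallback for XeHP up to (not including) Xe3p;
// the visible diff does not show that branch.
constexpr int max_regs(HW hw) {
    return (hw < HW::XeHP)  ? 128
         : (hw >= HW::Xe3p) ? 512
                            : 256;
}

// With a single Xe3p core value, every Xe3p variant takes the 512 branch.
static_assert(max_regs(HW::Xe3p) == 512, "all Xe3p variants get 512 GRFs");
```

Switching from `==` on one variant to `>=` on the collapsed value also means any core value added after `Xe3p` in the enum would inherit the 512-register limit by default.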
    static int max_slm_size(gpu_arch_t gpu_arch, gpu_product_t product);
    static int max_slm_size_per_tg(gpu_arch_t gpu_arch, gpu_product_t product);
    static int max_slm_size_per_tg(gpu_arch_t gpu_arch, int tg_size,
            bool large_grf_mode, gpu_product_t product);
We should be able to remove gpu_arch as an input, since gpu_product() alone contains all the necessary information.
Force-pushed from da01903 to c2f22a0.
Please change these to NVLP and CRI as well.
These are the values we were instructed to use for unembargo.
Confirmed with @karturov: we can name these however we like. We should use NVLP and CRI to match upstream naming and the other names in the enum.
    static int max_slm_size_per_tg(gpu_arch_t gpu_arch);
    static int max_slm_size_per_tg(
            gpu_arch_t gpu_arch, int tg_size, bool large_grf_mode = false);
    static int max_slm_size(gpu_product_t product);
Since this is only called using a product coming from a device_info object, can we convert this to a non-static function with no arguments?
max_slm_size is used statically here; it doesn't look like the product is coming from device_info:
That's max_slm_size_per_tg(), not max_slm_size(). I'd be ok with a shortcut device info accessor in addition to the static version if you want to keep the static one around.
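The compromise suggested in this thread, keeping the static function and adding a no-argument accessor that forwards to it, might look roughly like the sketch below. Everything here is hypothetical: `gpu_product_family_t`, its `variant_a`/`variant_b` values, the layout of `gpu_product_t`, and the SLM sizes are placeholders chosen for illustration, not values from the library or the hardware spec.

```cpp
// Hypothetical product description; the real gpu_product_t differs.
enum class gpu_product_family_t { unknown, variant_a, variant_b };
struct gpu_product_t {
    gpu_product_family_t family;
};

class device_info_t {
public:
    constexpr explicit device_info_t(gpu_product_t product)
        : product_(product) {}

    constexpr gpu_product_t gpu_product() const { return product_; }

    // Static version: usable without a device_info_t instance.
    // PLACEHOLDER sizes: the byte counts are illustrative only.
    static constexpr int max_slm_size(gpu_product_t product) {
        return product.family == gpu_product_family_t::variant_b
                ? 256 * 1024
                : 128 * 1024;
    }

    // Shortcut accessor: forwards the stored product to the static version,
    // so callers holding a device_info_t do not pass the product themselves.
    constexpr int max_slm_size() const { return max_slm_size(product_); }

private:
    gpu_product_t product_;
};
```

Callers that already hold a device info object can then write `info.max_slm_size()`, while call sites without one keep using `device_info_t::max_slm_size(product)`.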
make test
Force-pushed from c2f22a0 to f62297a.
make test
Collapse ngen::Core::XE3P_35_10, XE3P_35_11, and XE3P_UNKNOWN into a single ngen::Core::Xe3p value. Use ngen::ProductFamily to distinguish hardware-specific features where needed (F4 DPAS support, SLM capacity).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Force-pushed from f62297a to 60fc6b4.
Collapse ngen::Core::XE3P_35_10, XE3P_35_11, and XE3P_UNKNOWN into a single ngen::Core::Xe3p value. Use ngen::ProductFamily to distinguish hardware-specific features where needed (512 GRFs, F4 DPAS support, SLM capacity).
addresses MFDNN-14960
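The approach in the commit message, one core value plus a product-family check for features that differ between variants, can be sketched as follows. This is an illustration under stated assumptions: `Core` and `ProductFamily` are simplified stand-ins for the nGEN enums, `VariantA` and `VariantB` are placeholder family names, and the rule for which family supports F4 DPAS is invented for the example.

```cpp
// Simplified stand-ins; the real ngen::Core and ngen::ProductFamily
// contain more values and different names.
enum class Core { Xe2, Xe3, Xe3p };
enum class ProductFamily { Unknown, VariantA, VariantB };

// Before the change, a feature check would compare against a per-variant
// Core value. After it, Core identifies the architecture and ProductFamily
// selects variant-specific behavior.
// ASSUMPTION: "only VariantB supports F4 DPAS" is a made-up rule.
constexpr bool has_f4_dpas(Core core, ProductFamily family) {
    return core == Core::Xe3p && family == ProductFamily::VariantB;
}

// Features shared by every variant need only the core value.
constexpr bool is_xe3p(Core core) { return core == Core::Xe3p; }
```

Keeping the family check inside a named predicate means call sites never compare against family values directly, so a later change to which families have the feature touches one function.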