Skip to content

UpdateGpu

Description

Use the "UpdateGpu" command with the CEC/FPGA/HGX_H100/H100_FPGA/Intel_Gaudi2/Intel PVC of GPU firmware image to update the GPU firmware of a managed system by SUM.

Syntax

Single System

OOB

sum -i <IP or host name> -u <username> -p <password> -c UpdateGpu --item <CEC|FPGA|HGX_H100|H100_FPGA|PVC_RETIMER|GAUDI_RETIMER|PVC_AMC|PVC_UBB_CPLD|MGX_GPU> --file <filename> [--reboot] [--post_complete] [--dev_id <device ID>]

In-Band

sum -I Redfish_HI -u <username> -p <password> -c UpdateGpu --item <CEC|FPGA|HGX_H100|H100_FPGA|PVC_RETIMER|GAUDI_RETIMER|PVC_AMC|GAUDI_UBB_CPLD> --file <filename> [--reboot] [--post_complete] [--dev_id <device ID>]
sum -c UpdateGpu --dev_id <device_id> --item <GAUDI_OAM_CPLD|GAUDI_SPI|PVC_IFWI|PVC_PSCBIN> --file <filename> [--reboot] [--dev_id <device ID>]

Remote In-Band

sum -I Remote_INB --oi <IP address> --ou <username> --op <password> -c UpdateGpu --dev_id <device_id> --item <GAUDI_OAM_CPLD|GAUDI_UBB_CPLD|GAUDI_SPI|PVC_IFWI|PVC_PSCBIN> --file <filename> [--reboot] --remote_sum <remote sum_location> [--dev_id <device ID>]

Multiple Systems

OOB

sum -l <system list file> [-u <username> -p <password>] -c UpdateGpu --item <CEC|FPGA|HGX_H100|H100_FPGA|PVC_RETIMER|GAUDI_RETIMER|PVC_AMC|PVC_UBB_CPLD|MGX_GPU> --file <filename> [--reboot] [--post_complete] [--dev_id <device ID>]

Remote In-Band

sum -I Remote_INB -l <system list file> -c UpdateGpu --dev_id <device_id> --item <GAUDI_OAM_CPLD|GAUDI_UBB_CPLD|PVC_IFWI|PVC_PSCBIN|GAUDI_SPI> --file <filename> [--reboot] --remote_sum <remote sum_location> [--dev_id <device ID>]

Options

  • --item <item_type>: Specifies the GPU component to update (e.g., CEC, FPGA, HGX_H100, etc.).
  • --file <filename>: Specifies the GPU firmware image file.
  • --reboot: Reboots the system after update.
  • --post_complete: Waits for POST to complete after reboot.
  • --dev_id <device ID>: Specifies the device ID.
  • --remote_sum <remote sum_location>: Specifies the path to remote SUM executable.

Examples

OOB

[SUM_HOME]# ./sum -i 192.168.34.56 -u ADMIN -p PASSWORD -c UpdateGpu --file GPU_CEC.bin --item CEC

The console output contains the following information.

Managed system................192.168.34.56
 HGX Model................HGX A100
 CEC version................4.0
 FPGA version................3.03
Local GPU CEC image file......GPU_CEC.bin
Status: Start updating CEC for 192.168.34.56
************************************WARNING****************************
 Do not remove AC power from the server.
************************************************************************
Uploading GPU CEC FW...Done
Updating GPU CEC FW ...>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>Done
Status: GPU CEC is updated for 192.168.34.56
Note: You have to reboot or power up the system for the changes to take effect
[SUM_HOME]# ./sum -i 192.168.34.56 -u ADMIN -p PASSWORD -c UpdateGpu --file NVDIA_HGX_H100.pkg --item HGX_H100 --reboot --post_complete

Notes

  • The UpdateGpu command only supports NVIDIA GPU.
  • Supported platforms vary by GPU type and chipset.