Spring Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code = simple70

Pass the NVIDIA-Certified Professional NCP-AII Questions and answers with Dumpstech

Exam NCP-AII Premium Access

View all detail and faqs for the NCP-AII exam

Practice at least 50% of the questions to maximize your chances of passing.
Viewing page 1 out of 3 pages
Viewing questions 1-10 out of questions
Questions # 1:

A user wants to restrict a Docker container to use only GPUs 0 and 2. Which command achieves this?

Options:

A.

docker run --gpus '"device=0,2"' nvidia/cuda:12.1-base nvidia-smi

B.

docker run -e NVIDIA_VISIBLE_DEVICES=0,2 nvidia/cuda:12.1-base nvidia-smi

C.

docker run --gpus all nvidia/cuda:12.1-base nvidia-smi -id=0,2

D.

docker run --device /dev/nvidia0,/dev/nvidia2 nvidia/cuda:12.1-base nvidia-smi

Questions # 2:

When updating the firmware on an NVLink switch transceiver, how can an engineer apply new firmware without interrupting the network?

Options:

A.

mlxfwreset -d -lid 27 reset --yes to reset the transceiver

B.

Physically disconnect and reconnect the transceiver.

C.

flint -d -lid 27 --linkx --linkx_auto_update --activate

D.

nv action reboot system to force immediate activation.

Questions # 3:

A user encounters "permission denied" errors when running GPU-accelerated containers on a Secure Boot-enabled system. What resolves this?

Options:

A.

Enroll the MOK and sign NVIDIA kernel modules.

B.

Reinstall Docker without the NVIDIA runtime.

C.

Disable SELinux to relax unnecessary security policies.

D.

Run Docker with sudo for elevated privileges.

Questions # 4:

What command sequence is used to identify the exact name of the server that runs as the master SM in a multi-node fabric?

Options:

A.

sminfo, then smpquery ND

B.

ibstat, then sminfo

C.

ibnetdiscover, then ibsim

D.

sminfo, then smpquery NI

Questions # 5:

After NCCL burn-in reports "transport retry count exceeded," which corrective action addresses the underlying fabric issue?

Options:

A.

Switch from Ring to Tree algorithms via NCCL_ALGO=TREE

B.

Reduce message size to decrease network utilization

C.

Increase NCCL_IB_TIMEOUT to tolerate longer latencies

D.

Inspect InfiniBand link quality metrics (BER, symbol errors) and replace faulty cables

Questions # 6:

You are leading a project to enhance the energy efficiency of a data center that heavily relies on AI workloads. NVIDIA suggests moving beyond traditional metrics like Power Usage Effectiveness (PUE) to better capture the efficiency of modern data centers. Which strategy should you prioritize?

Options:

A.

Use Power Usage Effectiveness as the primary metric while supplementing it with additional measures of useful work done per unit of energy.

B.

Use watts used as the primary measure of efficiency, as it accurately reflects the power input at any given time.

C.

Develop benchmarks tailored to specific workloads, such as MLPerf for AI applications, to better understand energy use in real-world scenarios.

D.

Focus on integrating kilowatt-hours into existing metrics to better reflect the actual energy used for productive work.

Questions # 7:

To validate bisectional bandwidth across two racks in a Spectrum-X Ethernet fabric, which NCCL test configuration isolates East-West traffic?

Options:

A.

NCCL_TESTS_SPLIT="OR 0x7" ./all_reduce_perf -g 8

B.

Run without splits and analyze per-rack averages.

C.

NCCL_TESTS_SPLIT="MOD 2" ./all_reduce_perf -g 8

D.

NCCL_TESTS_SPLIT="DIV 8" ./all_reduce_perf -g 1

Questions # 8:

An engineer needs to verify NVLink isolation on a single node with 8 GPUs. Which NCCL test configuration stresses switch bisection bandwidth?

Options:

A.

Use NCCL_TESTS_SPLIT="DIV 8" with point-to-point tests

B.

Use all_reduce_perf -b 8 -e 16G -f 2 -g 8 with NCCL_TESTS_SPLIT="AND 0x1"

C.

Use reduce_scatter_perf -b 8 -e 16G -f 2 -g 4

D.

Use all_reduce_perf -b 8 -e 16G -f 2 -g 8 without splits

Questions # 9:

For an NVIDIA Enterprise AI Factory with 256 GPUs, which storage solution characteristic is most critical to validate during scaling tests?

Options:

A.

Consistent per-node throughput >8 GiB/s.

B.

Single-node write performance during idle clusters.

C.

RAID rebuild times under disk failure.

D.

Maximum 4K random read IOPS exceeding 1 million.

Questions # 10:

After configuring NGC CLI with ngc config set, a user receives ”Authentication failed” errors when pulling containers. What step was most likely omitted?

Options:

A.

Installing the CLI with apt-get instead of manual extraction.

B.

Entering the API key during ngc config set or storing it in ~/.ngc/config.

C.

Setting --format_type=json to enable API interactions.

D.

Running sudo systemctl restart docker after configuration.

Viewing page 1 out of 3 pages
Viewing questions 1-10 out of questions