Medium1 markMultiple Choice

GCP PCA · Question 48 · Domain 5: Managing Implementation and Ensuring Solution and Operations Reliability

Your company is running a large Hadoop cluster on-premises and wants to migrate to Google Cloud. The workloads consist of ephemeral Spark jobs that run for 2 hours every night to process logs. Cost optimization is a primary concern. Which TWO GCP services/features should you use? (Select TWO)

Answer options:

A.

Cloud Dataproc

B.

Cloud Dataflow

C.

Spot VMs (Preemptible VMs)

D.

Committed Use Discounts (CUDs)

E.

Compute Engine with custom Hadoop installation

How to approach this question

Identify the managed Hadoop service, and the compute pricing model best suited for short, fault-tolerant batch jobs.

Full Answer

Cloud Dataproc (Option A) is Google's managed Hadoop and Spark service, allowing you to lift-and-shift existing Spark jobs easily. Because the jobs are batch processes running only 2 hours a night, they are highly fault-tolerant. Configuring the Dataproc cluster to use Spot VMs (Option C) for its worker nodes provides massive cost savings compared to on-demand pricing.

Common mistakes

Selecting CUDs (D). CUDs require a 1 or 3-year commitment for resources running 24/7. You would waste money on the 22 hours a day the cluster is off.

Practice the full GCP Professional Cloud Architect Practice Exam 4

50 questions · hints · full answers · grading

More questions from this exam