Medium · 1 mark · Multiple Choice

AZ-305 · Question 21 · Domain 2.2: Design data integration

Your company is designing an ELT (Extract, Load, Transform) pipeline.

The pipeline must extract data from 50 different on-premises and cloud data sources, load the raw data into Azure Data Lake Storage Gen2, and then perform complex transformations using Apache Spark. The data engineering team prefers writing transformation logic in Python and Scala notebooks.

Which TWO Azure services should you combine to build this solution? (Select TWO)

Answer options:

A. Azure Data Factory
B. Azure Databricks
C. Azure Stream Analytics
D. Azure Event Hubs
E. Azure SQL Database

How to approach this question

Identify the orchestrator/extractor (Data Factory) and the Spark-based transformation engine (Databricks).

Full Answer

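The correct answers are A (Azure Data Factory) and B (Azure Databricks). A common modern data warehouse pattern uses Azure Data Factory for data movement (Extract and Load) because of its large library of connectors and its Copy activity; on-premises sources are reached through a self-hosted integration runtime. Once the raw data has landed in Azure Data Lake Storage Gen2, the Data Factory pipeline runs an Azure Databricks notebook activity to perform the complex transformations (Transform) on Apache Spark, with the logic written in Python or Scala.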
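The hand-off from Data Factory to Databricks can be sketched as a pipeline definition. The Python snippet below only builds the JSON structure; it does not call Azure. It is a minimal sketch under several assumptions: the dataset names (GenericSourceDataset, RawZoneParquetDataset), the linked service name (AzureDatabricksLinkedService), the notebook path, and the pipeline name are all hypothetical. SqlServerSource stands in for whichever connector each source actually needs, and a real pipeline would typically need more than one source dataset to cover 50 different systems.

```python
import json


def build_elt_pipeline(notebook_path: str, batch_count: int = 10) -> dict:
    """Build an ADF-style pipeline: copy every source, then run a Spark notebook."""
    # Extract + Load: one parameterized Copy activity, run once per source.
    copy_one_source = {
        "name": "CopySourceToRawZone",
        "type": "Copy",
        # Hypothetical parameterized datasets (names are assumptions).
        "inputs": [{
            "referenceName": "GenericSourceDataset",
            "type": "DatasetReference",
            "parameters": {"sourceName": "@item().name"},
        }],
        "outputs": [{
            "referenceName": "RawZoneParquetDataset",
            "type": "DatasetReference",
            "parameters": {"folder": "@item().name"},
        }],
        "typeProperties": {
            # Connector-specific; SqlServerSource is just one example.
            "source": {"type": "SqlServerSource"},
            "sink": {"type": "ParquetSink"},
        },
    }
    extract_and_load = {
        "name": "CopyAllSources",
        "type": "ForEach",
        "typeProperties": {
            # The list of sources is passed in as a pipeline parameter.
            "items": {"value": "@pipeline().parameters.sources", "type": "Expression"},
            "isSequential": False,
            "batchCount": batch_count,
            "activities": [copy_one_source],
        },
    }
    # Transform: run the Databricks notebook only after all copies succeed.
    transform = {
        "name": "TransformWithSpark",
        "type": "DatabricksNotebook",
        "dependsOn": [
            {"activity": "CopyAllSources", "dependencyConditions": ["Succeeded"]}
        ],
        "linkedServiceName": {
            "referenceName": "AzureDatabricksLinkedService",  # hypothetical name
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            "baseParameters": {"rawRoot": "raw"},  # hypothetical notebook parameter
        },
    }
    return {
        "name": "EltRawToCurated",  # hypothetical pipeline name
        "properties": {
            "parameters": {"sources": {"type": "Array"}},
            "activities": [extract_and_load, transform],
        },
    }


pipeline = build_elt_pipeline("/Shared/transform_raw_to_curated")
print(json.dumps(pipeline, indent=2))
```

The dependsOn entry is what encodes the ELT ordering: the notebook activity waits for the ForEach to report Succeeded, so the Spark transformations only ever run against fully loaded raw data.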

Common mistakes

Choosing Stream Analytics for batch processing (it is a real-time stream processing service), or assuming that Data Factory can run complex Python or Scala transformations natively without a compute engine such as Databricks or Synapse Spark.
