Synapse Pipeline
  • 19 Nov 2024

Introduction

Azure Synapse Analytics is an integrated platform service that combines data warehousing, data integration, ETL pipelines, analytics tools and services, big data scale, visualization, and dashboards.

A Synapse Workspace can have one or more pipelines. A pipeline is a logical grouping of activities that work together to complete a goal. It is a cloud-based ETL (Extract, Transform, and Load) and data integration tool that allows you to design data-driven workflows for orchestrating data movement and transforming data at scale.

Pre-requisites

  • The Service Principal must have Synapse Administrator access before a Synapse Pipeline can be associated with a Business Application.

  • In the Azure portal, navigate to Synapse workspace -> Synapse Studio to grant the required permission for Synapse Pipeline resources.

For more information, refer to the documentation on Synapse RBAC roles.

Run Pipeline

  • The Run pipeline option allows users to start a Synapse Pipeline by entering values into the parameters configured in the Azure portal.

pipeline 1.JPG
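
Turbo360 starts the run from its own UI. For readers who want to see what a parameterized run looks like outside the UI, the following is a minimal sketch of the request that the Synapse REST API's Create Pipeline Run operation expects. The workspace name, pipeline name, and parameter names are hypothetical, the endpoint pattern and `api-version` value are assumptions to verify against the Azure documentation, and nothing is actually sent.

```python
import json
from urllib.parse import quote

# Assumption: the workspace development endpoint follows the usual
# https://<workspace>.dev.azuresynapse.net pattern, and this api-version
# is one accepted by the "Create Pipeline Run" operation.
API_VERSION = "2020-12-01"


def build_create_run_request(workspace: str, pipeline: str, parameters: dict) -> dict:
    """Describe the HTTP request that would start a pipeline run (nothing is sent)."""
    url = (
        f"https://{workspace}.dev.azuresynapse.net"
        f"/pipelines/{quote(pipeline, safe='')}/createRun"
        f"?api-version={API_VERSION}"
    )
    return {
        "method": "POST",
        "url": url,
        "headers": {"Content-Type": "application/json"},
        # Pipeline parameters travel as a JSON object of name/value pairs.
        "body": json.dumps(parameters),
    }


request = build_create_run_request(
    "contoso-ws",                                   # hypothetical workspace
    "CopySalesData",                                # hypothetical pipeline
    {"sourceFolder": "2024/11", "dryRun": False},   # hypothetical parameters
)
print(request["method"], request["url"])
print(request["body"])
```

A real call would also need an `Authorization: Bearer <token>` header for an identity that holds a suitable Synapse RBAC role.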

Pipeline Runs

  • The runs of a Synapse pipeline, along with their activity runs, can be viewed by clicking the corresponding pipeline identifier.

pipeline 2.JPG

Parameters

The parameters specified when triggering a Synapse pipeline are displayed in the Parameters column for the corresponding pipeline run.

pipeline 3.png

Filtering

The Synapse pipeline runs can be filtered based on any one of the following statuses:

  • Succeeded
  • In Progress
  • Queued
  • Failed
  • Cancelled
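
Conceptually, the status filter is a simple match on each run's status. The sketch below illustrates this with hypothetical run records; the field names `runId` and `status` are assumptions, and the raw status values returned by a given API may be spelled differently from the labels in the list above (for example `InProgress`).

```python
# Status labels as they appear in the filter list above.
STATUSES = {"Succeeded", "In Progress", "Queued", "Failed", "Cancelled"}


def filter_runs(runs: list, status: str) -> list:
    """Return only the runs whose status matches the selected filter."""
    if status not in STATUSES:
        raise ValueError(f"Unknown status: {status!r}")
    return [run for run in runs if run["status"] == status]


# Hypothetical run records, for illustration only.
runs = [
    {"runId": "a1", "status": "Succeeded"},
    {"runId": "b2", "status": "Failed"},
    {"runId": "c3", "status": "In Progress"},
    {"runId": "d4", "status": "Failed"},
]

failed = filter_runs(runs, "Failed")
print([run["runId"] for run in failed])
```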

pipeline 4.png

Rerun failed pipeline runs

  • Synapse Pipelines can be re-run using the previously configured parameters.

  • A Synapse Pipeline run may fail owing to a failure in any of its activities, and in some cases those failed activity runs need to be run again. Turbo360 supports such scenarios by enabling users to rerun a single pipeline run or multiple pipeline runs.

  • When the re-run operation is finished, a tag is appended to the child run that allows users to obtain the details of the original activity run. A tag on the parent pipeline run allows users to access the run history of all its child runs.

pipeline 5.png

Ignore pipeline run

  • The Ignore option allows users to disregard any pipeline run that no longer needs to be considered for further processing. As a result of this operation, the corresponding pipeline run receives an ignore tag, which is visible in the Tags column.

pipeline 6.png

The Ignore tag is visible only in Turbo360 and has no effect on the actual runs.

Stop pipeline run

Any ongoing pipeline run can be interrupted by using the Cancel option.

Action Required

The Action Required tab displays the list of all the failed pipeline runs that require an action to proceed further. A failed pipeline run can be ignored or rerun.

pipeline 7 action.JPG

Favorites

Any pipeline run that is frequently used can be marked as a favorite along with an optional description from the Pipeline runs and Action Required tabs.

pipeline 8 favorite.JPG

Inline task

From within the resource, Business Application allows users to quickly rerun the runs of the currently open pipeline that failed within a specified number of hours.

The configuration created to run immediately can also be saved for future use in the Automated Tasks section.

Inline task to rerun failed runs

Users can use this feature to quickly retry pipeline runs that failed within the allotted time frame (minimum 1 hour).

pipeline 9 - inline task.JPG
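
The selection behind such a task can be pictured as "failed runs that ended within the last N hours, where N is at least 1". The sketch below is an illustration under assumptions, not Turbo360's implementation: the record fields `status` and `runEnd` and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MIN_LOOKBACK_HOURS = 1  # minimum time frame stated above


def failed_runs_within(runs, hours, now=None):
    """Pick failed runs whose end time falls inside the last `hours` hours."""
    if hours < MIN_LOOKBACK_HOURS:
        raise ValueError("The time frame must be at least 1 hour")
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=hours)
    return [
        run for run in runs
        if run["status"] == "Failed" and cutoff <= run["runEnd"] <= now
    ]


# Hypothetical run records with a fixed "now" so the example is repeatable.
now = datetime(2024, 11, 19, 12, 0, tzinfo=timezone.utc)
runs = [
    {"runId": "a1", "status": "Failed", "runEnd": now - timedelta(minutes=30)},
    {"runId": "b2", "status": "Failed", "runEnd": now - timedelta(hours=5)},
    {"runId": "c3", "status": "Succeeded", "runEnd": now - timedelta(minutes=10)},
]

selected = failed_runs_within(runs, hours=2, now=now)
print([run["runId"] for run in selected])
```

With a two-hour window, only the run that failed 30 minutes ago qualifies; the run that failed five hours ago falls outside the window, and the succeeded run is never a candidate.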

Task status can be viewed by navigating to the Automated Tasks section and switching to the Task history tab.

  • Users can use this feature to quickly create a task that runs immediately.

  • Navigate to the Automated Tasks section in Turbo360 to create a task with a more detailed configuration, schedule tasks to run at a specific time, or automate the task to run on the specified hours, days, and more.

Automated Task

Failed pipeline runs can be scheduled to rerun at a specific time from the Automated Task section, with options to Include previous reruns and Rerun the ignored runs.

The following illustration shows failed pipeline runs being rerun using an Automated task:

pipeline 10 automated task.png

Resource Dashboard

A default Synapse Pipeline Dashboard is available to users within the Synapse Pipeline resource, enabling improved data visualization and tracking of real-time data.

pipeline 11 dashboard.JPG

Users are provided with the following pre-defined Dashboard widgets, which can be customized to meet their specific needs.

1. Cancelled Activity Runs
2. Failed Activity Runs vs Succeeded Activity Runs
3. Succeeded Activity Runs

Widget values for a Synapse Pipeline resource are available only when the time granularity is set to 1 minute or 1 hour.

Monitoring

  • Users can monitor their Synapse Pipeline resources by configuring the rules available for monitoring.

  • Navigate to the Monitoring section of the resource to configure the monitoring rules for Synapse Pipeline.

  • Users can specify monitoring threshold values based on their needs.

  • When the monitoring rule type is Metric, users can also select the metrics to evaluate against the rule.

pipeline 12 monitoring.JPG

Synapse Pipeline resources are monitored only when the Rules evaluation frequency and the Aggregation period are each set to either 1 minute or 1 hour, in any combination.
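
The constraint on the two settings can be expressed as a small check. This is only a sketch of the rule as stated above; the function name and the string representation of the intervals are assumptions made for illustration.

```python
# The two intervals named in the rule above.
ALLOWED_INTERVALS = {"1 minute", "1 hour"}


def is_monitorable(evaluation_frequency: str, aggregation_period: str) -> bool:
    """True when both settings use one of the supported intervals."""
    return (
        evaluation_frequency in ALLOWED_INTERVALS
        and aggregation_period in ALLOWED_INTERVALS
    )


print(is_monitorable("1 minute", "1 hour"))    # both supported, any combination
print(is_monitorable("5 minutes", "1 hour"))   # unsupported evaluation frequency
```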

Monitoring rules are not available for Synapse Pipelines when the System profile is used.

