This guide will show you how to securely connect Synq to your Airflow setup.

We need this connection so we can extract metadata from Airflow. By default, all tasks and all DAGs are reported to us.

To complete this guide, you’ll need the following:
→ Access to modify your Airflow configuration code

⏱️ Estimated time to finish: 10 minutes.

Setup

  1. Install the required dependency in your Airflow environment
pip install acryl-datahub-airflow-plugin
  2. Disable lazy plugin loading. Choose one of the options below, depending on your setup

Option 1: In the airflow.cfg file

[core]
lazy_load_plugins = False

Option 2: Set an environment variable

AIRFLOW__CORE__LAZY_LOAD_PLUGINS=False
  3. Set up the REST hook by adding an Airflow connection with these values
  • conn-host: https://datahubapi.synq.io/datahub/v1/
  • conn-password: The token Synq shows when you click ‘Create’ on this page
airflow connections add --conn-type 'datahub_rest' 'datahub_rest_default' --conn-host '<host from Synq>' --conn-password '<token from Synq>'
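
If you prefer to script the last step (for example in a deployment bootstrap script) rather than paste the token into a shell, the command can be assembled from environment variables. The following is a minimal sketch, not part of the Synq or DataHub tooling: the helper name build_connection_command and the variable names SYNQ_DATAHUB_HOST and SYNQ_DATAHUB_TOKEN are hypothetical and chosen only for illustration. The connection ID, connection type, and flags are the ones from the command above.

```python
import os
import shlex

# Connection ID and type taken from the `airflow connections add` command above.
CONN_ID = "datahub_rest_default"
CONN_TYPE = "datahub_rest"


def build_connection_command(host: str, token: str) -> list[str]:
    """Return the argument list for the `airflow connections add` call.

    Hypothetical helper for illustration only.
    """
    if not host or not token:
        raise ValueError("both host and token are required")
    return [
        "airflow", "connections", "add",
        "--conn-type", CONN_TYPE,
        CONN_ID,
        "--conn-host", host,
        "--conn-password", token,
    ]


if __name__ == "__main__":
    # SYNQ_DATAHUB_HOST and SYNQ_DATAHUB_TOKEN are assumed variable names;
    # the fallbacks mirror the values and placeholders used in this guide.
    host = os.environ.get(
        "SYNQ_DATAHUB_HOST", "https://datahubapi.synq.io/datahub/v1/"
    )
    token = os.environ.get("SYNQ_DATAHUB_TOKEN", "<token from Synq>")
    command = build_connection_command(host, token)
    # Print a shell-quoted command for review instead of executing it.
    print(shlex.join(command))
```

Building the command as an argument list avoids shell-quoting problems if the token contains special characters. To execute it rather than print it, the list can be passed to subprocess.run on a machine where the airflow CLI is installed.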

Read more

Synq uses DataHub’s open-source plugin to collect relevant task execution and lineage data from your Airflow setup. You can read more about it here.

For details on exactly what data the DataHub collector gathers, see its open-source code here:

datahub/airflow_generator.py at master · datahub-project/datahub