This guide will show you how to securely connect SYNQ to your Airflow setup. SYNQ uses this connection to extract metadata from Airflow; by default, all tasks and all DAGs are reported. To finish this guide, you’ll need the following:
→ Access to modify your Airflow configuration code
⏱️ Estimated time to finish: 10 minutes.

Setup

  1. Install the required dependencies in your Airflow environment
pip install 'acryl-datahub-airflow-plugin>=1.1.0.4' apache-airflow-providers-openlineage
For detailed installation instructions and compatibility information, see the DataHub Airflow documentation.
  2. Set up the REST hook
  • conn-host: https://datahubapi.synq.io/datahub/v1/
  • conn-password: the token you obtain from SYNQ when you click ‘Create’ on this page
airflow connections add 'datahub_rest_default' --conn-type 'datahub_rest' --conn-host '<host from SYNQ>' --conn-password '<token from SYNQ>'
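If you prefer to manage connections declaratively (for example in a managed deployment where running CLI commands is inconvenient), Airflow 2.3+ can also read a connection from an environment variable in JSON form. A minimal sketch, assuming the host shown above and a placeholder token:

```shell
# Alternative to the CLI command above: define the connection as an
# environment variable. Airflow 2.3+ accepts the JSON format shown here.
# Replace the password with the token from SYNQ.
export AIRFLOW_CONN_DATAHUB_REST_DEFAULT='{
    "conn_type": "datahub_rest",
    "host": "https://datahubapi.synq.io/datahub/v1/",
    "password": "<token from SYNQ>"
}'
```

The variable name follows Airflow’s `AIRFLOW_CONN_<CONN_ID>` convention, so it maps to the same `datahub_rest_default` connection the CLI command creates.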

Airflow 2.7+ with OpenLineage Provider

If you’re using Airflow 2.7+, the native Airflow OpenLineage provider will improve the quality of lineage and metadata information obtained from your Airflow setup. The OpenLineage provider package is already included in the installation above since the DataHub plugin requires it. For AWS MWAA, add both acryl-datahub-airflow-plugin and apache-airflow-providers-openlineage to your requirements.txt file. Note: Native OpenLineage support is planned for SYNQ. Currently, the DataHub plugin is required to collect Airflow metadata.
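For MWAA, the requirements.txt addition described above would look like the following; the version pin mirrors the pip command in the Setup section, so adjust it to match your Airflow version:

```
acryl-datahub-airflow-plugin>=1.1.0.4
apache-airflow-providers-openlineage
```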

Log Forwarding

Log forwarding is required to include task failure snippets in SYNQ alerts. SYNQ supports multiple methods for forwarding Airflow logs:

AWS MWAA CloudWatch Logs

For AWS Managed Workflows for Apache Airflow (MWAA), you can forward logs from CloudWatch using the synq-aws-cloudwatch Lambda function:
  1. Deploy the Lambda function: Use the synq-aws-cloudwatch repository to deploy a Lambda function that forwards CloudWatch logs to SYNQ.
  2. Configure log forwarding: The Lambda automatically forwards Airflow logs from CloudWatch to SYNQ when properly configured with your SYNQ credentials.
  3. Set up log group subscription: Configure CloudWatch to trigger the Lambda when new Airflow logs are available.
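Step 3 can be scripted with the AWS CLI. A sketch, assuming the Lambda from the synq-aws-cloudwatch repository is already deployed; the log group name, function name, and ARN below are placeholders for your environment:

```shell
# Subscribe the Airflow task log group to the forwarding Lambda so that
# new log events trigger it. All names/ARNs below are placeholders.
aws logs put-subscription-filter \
    --log-group-name 'airflow-<your-environment>-Task' \
    --filter-name 'synq-log-forwarding' \
    --filter-pattern '' \
    --destination-arn 'arn:aws:lambda:<region>:<account-id>:function:<synq-forwarder-function>'

# CloudWatch Logs also needs permission to invoke the Lambda:
aws lambda add-permission \
    --function-name '<synq-forwarder-function>' \
    --statement-id 'cloudwatch-logs-invoke' \
    --principal 'logs.amazonaws.com' \
    --action 'lambda:InvokeFunction' \
    --source-arn 'arn:aws:logs:<region>:<account-id>:log-group:airflow-<your-environment>-Task:*'
```

An empty filter pattern forwards all log events; MWAA creates separate log groups per component, so repeat the subscription for each group you want forwarded.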

API Endpoint

You can also send logs directly to SYNQ using our API endpoint. This endpoint accepts Airflow log data and forwards it to the SYNQ platform for processing and analysis.
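As a rough sketch, pushing a log file to such an endpoint might look like the following. Note that the endpoint URL and payload shape are assumptions here, not documented values; confirm the exact contract with SYNQ before relying on this:

```shell
# Hypothetical sketch — $SYNQ_LOG_ENDPOINT and the payload format are
# assumptions, not documented values. Check with SYNQ support.
curl -X POST "$SYNQ_LOG_ENDPOINT" \
    -H "Authorization: Bearer $SYNQ_TOKEN" \
    -H "Content-Type: text/plain" \
    --data-binary @<path-to-airflow-task-log>
```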

Remote Logging (S3/GCS)

SYNQ can also ingest logs that Airflow writes to S3 or GCS via its Remote Logging feature; log files are uploaded to SYNQ after they are created in the bucket. For assistance with this setup, please reach out to SYNQ support.
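Airflow’s Remote Logging itself is configured in airflow.cfg (or via the matching `AIRFLOW__LOGGING__*` environment variables). A sketch for S3, assuming an existing AWS connection named `aws_default` and a placeholder bucket:

```ini
[logging]
remote_logging = True
remote_base_log_folder = s3://<your-log-bucket>/airflow/logs
remote_log_conn_id = aws_default
```

For GCS, point `remote_base_log_folder` at a `gs://` path and use a GCP connection ID instead.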

Read more

SYNQ uses DataHub’s open-source plugin to collect relevant task execution and lineage data from your Airflow setup. You can read more about it here. For details about exactly what data the DataHub collector gathers, the source code is open and available here: datahub/airflow_generator.py at master · datahub-project/datahub