Amazon Managed Workflows for Apache Airflow (MWAA) is a managed Apache Airflow service used to extract business insights across an organization by combining, enriching, and transforming data through a series of tasks called a workflow. Managed Workflows free you from managing, configuring, and scaling the Airflow environment while you orchestrate data processing workflows and manage their execution through AWS-backed logging and monitoring capabilities. You can run your existing Airflow workflows on Amazon MWAA and interact with their environment programmatically using the AWS console, API, and command line interface (CLI).
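As a rough illustration of the programmatic access mentioned above, the sketch below reads the status of one environment through the `get_environment` operation of the MWAA API. The helper `environment_status`, the stub class `FakeMwaaClient`, and the environment name `my-airflow-env` are hypothetical names introduced here; the stub stands in for a real client only so the example runs without AWS credentials.

```python
def environment_status(client, name):
    """Return the status string reported for one MWAA environment.

    `client` is expected to behave like boto3.client("mwaa").
    """
    response = client.get_environment(Name=name)
    return response["Environment"]["Status"]


class FakeMwaaClient:
    """Hypothetical stand-in for a real MWAA client (no network calls)."""

    def get_environment(self, Name):
        # Mimics only the part of the response that the helper reads.
        return {"Environment": {"Name": Name, "Status": "AVAILABLE"}}


status = environment_status(FakeMwaaClient(), "my-airflow-env")
print(status)

# With real credentials, the same helper would be called like this
# (assumes boto3 is installed and an environment with this name exists):
#   import boto3
#   print(environment_status(boto3.client("mwaa"), "my-airflow-env"))
```

Passing the client in as an argument keeps the helper easy to exercise offline and leaves region and credential handling to the caller.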
You should use Amazon MWAA to spend more engineering and data science time building workflows and less time managing the infrastructure and Airflow environment, all while realizing consistent performance from the managed service. Data engineering and data science teams use Airflow as a leading open source orchestration environment for building and executing workflows that define extract-transform-load (ETL) jobs and machine learning data pipelines. Airflow lets you programmatically build, schedule, and monitor workflows authored in Python, a widely used language for data processing. Its task plugin model and open architecture allow you to build custom workflows, including support for on-premises data sources. However, a team that wants to reap the benefits of Airflow’s programmatic interface must first configure and maintain the servers and monitoring for it to function. Many customers dedicate data engineers to manage the worker fleet, install dependencies, scale the system up and down, and restart the scheduler. Managed Workflows eliminate the need for these hands-on operations by providing you with a managed Airflow environment that is highly available, monitored, and automatically scalable.
Amazon MWAA manages the work involved in setting up Airflow, from provisioning the infrastructure capacity (server instances and storage) to installing the software and providing simplified user management and authorization through AWS Identity and Access Management (IAM) and Single Sign-On (SSO).
Amazon MWAA is a workflow environment that allows data engineers and data scientists to build workflows that use AWS services, on-premises resources, and other cloud services. Amazon MWAA workflows can retrieve input from sources like S3 using Athena queries, perform transformations on EMR clusters, and use the resulting data to train machine learning (ML) models on SageMaker. Workflows in Amazon MWAA are authored as Directed Acyclic Graphs (DAGs) using Python. A key benefit of Airflow is its open extensibility through plugins, which allows you to create task plugins for any AWS or on-premises resources required for your workflows, including Athena, Batch, CloudWatch, DynamoDB, DataSync, EMR, ECS/Fargate, EKS, Firehose, Glue, Lambda, Redshift, SQS, SNS, SageMaker, and S3.
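To make the DAG idea concrete, here is a minimal sketch that uses only the Python standard library. It is not the Airflow API: the task functions, the `TASKS` table, and `run_workflow` are hypothetical names chosen for illustration. It shows tasks declared with their upstream dependencies and executed in an order that respects them, which is the core of what a DAG-based scheduler does.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def extract():
    # Stand-in for reading input rows from a source.
    return [3, 1, 2]


def transform(rows):
    # Stand-in for a transformation step.
    return sorted(rows)


def load(rows):
    # Stand-in for writing the result to a destination.
    return f"loaded {len(rows)} rows"


# Each task maps to (callable, list of upstream task names).
TASKS = {
    "extract": (extract, []),
    "transform": (transform, ["extract"]),
    "load": (load, ["transform"]),
}


def run_workflow(tasks):
    """Run every task after its upstream tasks; return order and results."""
    graph = {name: set(upstream) for name, (_, upstream) in tasks.items()}
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        func, upstream = tasks[name]
        # Pass each upstream result as a positional argument.
        results[name] = func(*(results[u] for u in upstream))
    return order, results


order, results = run_workflow(TASKS)
print(order)
print(results["load"])
```

In a real Airflow workflow the scheduler, retries, and logging are handled by the platform; this sketch covers only the dependency ordering that the word "acyclic" guarantees is always possible.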
Amazon MWAA supports all of the 100+ Airflow community plugins developed to date, plus any custom plugins you create, simply by placing them in an S3 bucket.
Amazon MWAA provides access to the Apache Airflow UI, where you can monitor your workflows through Graph and Grid views, examining task logs and execution details for each DAG run. The Grid view shows the complete history and status of each task instance in your workflows. Supporting this interface, CloudWatch shows key metrics about your environment's health and performance, while CloudWatch Logs help you pinpoint and troubleshoot any issues during workflow execution.
You should use Amazon MWAA if you prioritize open source and portability. Airflow has a large and active open source community that contributes new functionality and integrations regularly. Amazon MWAA supports existing Airflow workflows and integrations without code changes, so migration is easy and the environment is familiar.
You should use Step Functions if you prioritize cost and performance. For example, if you were processing streaming data and transforming it through multiple steps before putting it in a DynamoDB database or S3, you should use Step Functions because it has higher performance at a lower cost.