This is a memo to share what I have learnt about Apache Airflow, capturing the learning objectives as well as my personal notes. The course is taught by Mike Metzger on DataCamp.
A data engineer’s job includes writing scripts, adding complex cron tasks, and trying various ways to meet an ever-changing set of requirements to deliver data on schedule. Airflow can handle all of this while adding scheduling, error handling, and reporting.
I learnt about the following topics:
- Workflows / DAGs / Tasks
- Operators (BashOperator, PythonOperator, BranchPythonOperator, EmailOperator)
- Dependencies between tasks / Bitshift operators
- Sensors (to react to workflow conditions and state)
- Scheduling DAGs
- SLAs / Alerting to maintain visibility on workflows
- Templates for maximum flexibility when defining tasks
- Branching, to add conditional logic to DAGs
- Airflow interfaces: command line / UI
- Airflow executors
- Debugging / Troubleshooting
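To make the bitshift item in the list concrete, here is a toy model in plain Python of how `>>` and `<<` can declare that one task runs after another. This is only a sketch for intuition: the `Task` class and `run_order` helper are hypothetical names invented for this note, not Airflow's own classes, and a real DAG would use Airflow operators instead.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Task:
    """Toy stand-in for a task (hypothetical; not Airflow's class)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()  # ids of tasks that must finish first

    def __rshift__(self, other):
        # a >> b means "b runs after a"
        other.upstream.add(self.task_id)
        return other  # returning the right operand allows chaining

    def __lshift__(self, other):
        # a << b means "a runs after b"
        self.upstream.add(other.task_id)
        return other


def run_order(tasks):
    """Return task ids in an order that respects the dependencies."""
    graph = {t.task_id: t.upstream for t in tasks}
    return list(TopologicalSorter(graph).static_order())


extract = Task("extract")
transform = Task("transform")
load = Task("load")

# Chained bitshift: extract first, then transform, then load.
extract >> transform >> load

order = run_order([extract, transform, load])
print(order)
```

The chain reads left to right in execution order, which is the main reason the bitshift style is pleasant to use when wiring many tasks together.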
My next steps are:
- Set up my own environment for practice
- Explore other operators (e.g. Amazon S3, PostgreSQL) and sensors (e.g. HDFS)
- Experiment with dependencies across a large number of tasks
- Look into other parts of Airflow: XCom, Connections, etc.
- Refer to the Airflow documentation
- Keep building workflows
More notes and code can be found on my GitHub.
Overall, I enjoyed this course and would highly recommend it!
Translated from: https://medium.com/swlh/introduction-to-airflow-in-python-67b554f06f0b