  • Business background
  • Architecture and implementation of the features built on DolphinScheduler
  • Community contribution

Business background

Cisco Webex Data Islands — Before

Cisco Webex Data Islands — After

Architecture and implementation of the features built on DolphinScheduler

DolphinScheduler with Kubernetes Integration

  • Kubernetes makes our daily operations much smoother. Following DevOps principles, we are both the developers and the operational owners of every application and data processing job we build. Besides building data pipelines and data platform features, my team also maintains the CI/CD pipelines that deploy them, and we run a monitoring platform built on metrics and analysis. Building an on-premises metrics stack usually takes one or two days of infrastructure provisioning and service setup, even with automation scripts. With the Prometheus Operator on Kubernetes, it takes about two minutes.
  • The second reason for adopting Kubernetes is that it lets us deploy all kinds of containerized services. YARN supports JVM-based jobs, for example Flink and Spark jobs, both batch and real-time. Kubernetes supports many more kinds of workloads: anything that can be packaged as a container image. Prometheus and Redis can run in the same cluster too. This hybrid deployment capability saves us a lot of operational work. We used to deploy our data platform services on dedicated VMs. Now we have a separate monitoring cluster, with the Prometheus Operator installed, that covers all the data processing jobs, and all the monitoring services are consolidated within a single Kubernetes cluster. The CI/CD pipeline is also much easier to maintain because everything runs in Kubernetes.
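
To make the Prometheus Operator point concrete, here is a minimal sketch of how a scrape target can be declared as a ServiceMonitor resource instead of being provisioned by hand. The function name, the resource names, the namespace, and the `app` label convention are all assumptions for illustration; the talk does not show the team's actual manifests.

```python
import json


def build_service_monitor(name, namespace, app_label,
                          port_name="metrics", interval="30s"):
    """Return a ServiceMonitor manifest (a Prometheus Operator CRD) as a dict.

    All argument values are hypothetical; adapt them to the Service
    that exposes your job's metrics endpoint.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {"app": app_label},
        },
        "spec": {
            # Select the Services whose labels match.
            "selector": {"matchLabels": {"app": app_label}},
            # Scrape the named Service port at the given interval.
            "endpoints": [{"port": port_name, "interval": interval}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    manifest = build_service_monitor("etl-jobs", "data-platform", "etl-job")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped into `kubectl apply -f -` on a cluster where the Prometheus Operator is already installed.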

Multi-Cluster ETL Job Management

Kubernetes Multi-Cluster Management

Cisco Webex Data Residency

Simple ETL pipeline

UDF Management

Automatic Scaling

Flink Jobs on Kubernetes
  • Flink Jar jobs support
  • Flink SQL jobs support
  • Time ranged scaling
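
The talk does not describe how time ranged scaling is implemented. As a rough sketch of the idea, the snippet below picks a job parallelism from a schedule of time windows, including a window that wraps past midnight. The schedule values and the names `SCHEDULE`, `in_window`, and `parallelism_for` are hypothetical.

```python
from datetime import time

# Hypothetical schedule: (window start, window end, parallelism).
# A window whose end is earlier than its start wraps past midnight.
SCHEDULE = [
    (time(8, 0), time(20, 0), 8),   # busy hours
    (time(20, 0), time(8, 0), 2),   # overnight
]


def in_window(now, start, end):
    """True if `now` falls in the half-open window [start, end)."""
    if start <= end:
        return start <= now < end
    # Wrapping window, for example 20:00 to 08:00.
    return now >= start or now < end


def parallelism_for(now, schedule=SCHEDULE, default=1):
    """Return the parallelism of the first window containing `now`."""
    for start, end, parallelism in schedule:
        if in_window(now, start, end):
            return parallelism
    return default
```

A scheduler task could call `parallelism_for(datetime.now().time())` periodically and rescale the Flink job only when the returned value changes.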

Kubernetes batch job

SQL Task Customization
  • Snowflake Support in SQL Task
  • Upsert feature for Snowflake Spark connector
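
The source does not say how the upsert feature was built. One common pattern is to write the incoming rows to a staging table and then run a Snowflake `MERGE` from staging into the target. The sketch below only generates that statement; the function name and the table and column names are hypothetical, and it assumes simple, unqualified, trusted identifiers.

```python
def build_merge_sql(target, staging, key_cols, value_cols):
    """Build a Snowflake MERGE statement that upserts staging into target.

    Identifiers are interpolated as-is, so they must come from trusted
    configuration, never from user input.
    """
    if not key_cols:
        raise ValueError("at least one key column is required")

    # Rows match when every key column is equal.
    on_clause = " AND ".join(
        f"{target}.{c} = {staging}.{c}" for c in key_cols
    )
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"{staging}.{c}" for c in all_cols)

    parts = [
        f"MERGE INTO {target}",
        f"USING {staging}",
        f"ON {on_clause}",
    ]
    if value_cols:
        # Existing rows: overwrite the non-key columns.
        set_clause = ", ".join(
            f"{target}.{c} = {staging}.{c}" for c in value_cols
        )
        parts.append(f"WHEN MATCHED THEN UPDATE SET {set_clause}")
    # New rows: insert every column.
    parts.append(
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) "
        f"VALUES ({insert_vals})"
    )
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical tables and columns, for illustration only.
    print(build_merge_sql("users", "users_stage", ["id"], ["name", "email"]))
```

Keeping the statement builder separate from the Spark write step makes the generated SQL easy to inspect and test before it is run against Snowflake.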

Community Contribution

Join the Community
