Job Roles and Responsibilities:
- Minimum of 3 to 4 years of hands-on experience designing, building, and operationalizing large-scale enterprise data solutions and applications using GCP data and analytics services such as Cloud Dataproc, Cloud Dataflow, BigQuery, Cloud Pub/Sub, and Cloud Functions (a small Pub/Sub publishing sketch appears after this list).
- Hands-on experience analyzing, re-architecting, and re-platforming on-premises data warehouses to data platforms on GCP using GCP and third-party services.
- Experience designing and building data pipelines within a hybrid big data architecture using Java, Python, Scala, and GCP-native tools.
- Hands-on experience orchestrating and scheduling data pipelines using Cloud Composer and Apache Airflow (see the DAG sketch after this list).
- Experience performing detailed assessments of current-state data platforms and creating an appropriate transition path to GCP.
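
For illustration only, here is a minimal sketch of publishing an event to Cloud Pub/Sub with the Python client library; the project ID, topic name, and event payload are hypothetical placeholders, not part of this role's actual systems.

```python
# Minimal Cloud Pub/Sub publish sketch; project and topic IDs are hypothetical.
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("example-project", "sales-events")  # hypothetical IDs

# Publish a small JSON event; publish() returns a future that resolves to the message ID.
event = {"order_id": 123, "amount": 49.95}
future = publisher.publish(topic_path, data=json.dumps(event).encode("utf-8"))
print("Published message:", future.result())
```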
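
The DAG below is a minimal Cloud Composer/Apache Airflow sketch of the kind of scheduled pipeline described above, assuming hypothetical bucket, dataset, and table names; it stages files to Cloud Storage and runs a BigQuery transformation, and is illustrative rather than a prescribed design.

```python
# Minimal Airflow DAG sketch for a daily GCP data pipeline.
# All names (bucket, dataset, tables, commands) are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="daily_sales_pipeline",          # hypothetical pipeline name
    schedule_interval="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["gcp", "example"],
) as dag:
    # Stage raw files from an on-premises drop zone into Cloud Storage (placeholder command).
    ingest = BashOperator(
        task_id="ingest_raw_files",
        bash_command="gsutil -m cp /data/exports/*.csv gs://example-raw-bucket/sales/",
    )

    # Transform the staged data inside BigQuery (placeholder SQL).
    transform = BigQueryInsertJobOperator(
        task_id="transform_to_curated",
        configuration={
            "query": {
                "query": "CREATE OR REPLACE TABLE curated.sales AS "
                         "SELECT * FROM staging.sales_raw",
                "useLegacySql": False,
            }
        },
    )

    ingest >> transform
```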
Technical Skills Required:
- Strong experience with GCP data and analytics services.
- Working knowledge of the big data ecosystem: Hadoop, Spark, HBase, Hive, Scala, etc.
- Experience building and optimizing data pipelines in Spark (a minimal PySpark sketch follows this list).
- Strong skills in orchestrating workflows with Cloud Composer/Apache Airflow.
- Good knowledge of object-oriented programming languages: Python (must have) and Java or C++.
- Good to have: experience building CI/CD pipelines with Cloud Build and other native GCP services.
- Google Cloud Professional Data Engineer certification is a plus.
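
As a reference point for the Spark skills listed above, the following is a minimal PySpark read-transform-write sketch; the bucket paths and column names are hypothetical, and the repartition-by-write-key step is just one common optimization, not a required approach.

```python
# Minimal PySpark sketch of a read-transform-write pipeline.
# Bucket paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("sales-aggregation-example").getOrCreate()

# Read raw CSV files staged in Cloud Storage.
raw = (
    spark.read
    .option("header", True)
    .csv("gs://example-raw-bucket/sales/*.csv")
)

# Clean and aggregate; repartitioning by the write key is a common optimization
# to reduce small output files before a partitioned write.
daily_totals = (
    raw.withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("amount").isNotNull())
    .groupBy("sale_date", "region")
    .agg(F.sum("amount").alias("total_amount"))
    .repartition("sale_date")
)

# Write curated output back to Cloud Storage as partitioned Parquet.
daily_totals.write.mode("overwrite").partitionBy("sale_date").parquet(
    "gs://example-curated-bucket/sales_daily/"
)

spark.stop()
```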