What you'll do
Extract, transform, and load data from multiple sources and in multiple formats using big data technologies.
Develop, enhance, and support data ingestion jobs from various source systems using GCP services such as Dataproc, Dataflow, BigQuery, and Airflow.
Work closely with senior engineers to optimize query and data access techniques.
Apply modern software development practices (serverless computing, microservices architecture, CI/CD, infrastructure-as-code, etc.).
Required Skill Sets
4+ years of experience and a Bachelor's degree in Computer Science or Systems Engineering, or equivalent experience.
GCP services - Dataflow, Dataproc, BigQuery, Airflow, GCS, Pub/Sub
Programming languages - Java, Python
Big data frameworks - Apache Beam, Apache Spark, Airflow 2+
Hadoop ecosystem - HDFS, Hive, PySpark
Relational databases (e.g. SQL Server, Oracle, MySQL)
Source control management systems (e.g. SVN, Git, GitHub)
Build tools - Maven or Gradle
Agile environments (e.g. Scrum)
Atlassian tooling (e.g. JIRA, Confluence)
Good to have:
- Active cloud certification