Sr. Data Engineer: 6 Years - ETL, Spark, Python, PySpark
Company | Western Digital |
Address | Bangalore, Karnataka |
Category | Engineering |
Job Description
- A minimum of 6 years of experience developing ETL jobs using an industry-leading ETL tool.
- Ability to design, develop, and optimize Apache Spark applications for large-scale data processing.
- Ability to implement efficient data transformation and manipulation logic using Spark RDDs and DataFrames (a minimal PySpark sketch follows this list).
- Ability to design, implement, and maintain Apache Kafka pipelines for real-time data streaming and event-driven architectures (see the streaming sketch after this list).
- Strong development skills and deep technical proficiency in Python, Scala, NiFi, and SQL/stored procedures.
- Working knowledge of Unix/Linux operating systems and tools such as awk, ssh, and crontab.
- Ability to write Transact-SQL and to develop and debug stored procedures and user-defined functions in Python.
- Working experience with Postgres and/or Redshift databases is required.
- Exposure to CI/CD tools such as Bitbucket, Jenkins, Ansible, Docker, and Kubernetes is preferred.
- Ability to understand relational database systems and their concepts.
- Ability to handle large tables/datasets of 2+ TB in a columnar database environment.
- Ability to integrate data pipelines with Splunk/Grafana for real-time monitoring, analysis, and visualization.
- Ability to create and schedule Airflow jobs (see the DAG sketch after this list).
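
To illustrate the Spark requirement above, here is a minimal PySpark sketch of a batch DataFrame transformation. The input path, output path, and column names (user_id, event_type, event_ts, amount) are illustrative assumptions, not part of the role description.

```python
# Hypothetical batch transformation: daily purchase totals per user.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Assumed input dataset location and schema.
events = spark.read.parquet("s3://example-bucket/events/")

daily_totals = (
    events
    .filter(F.col("event_type") == "purchase")
    .groupBy("user_id", F.to_date("event_ts").alias("event_date"))
    .agg(F.sum("amount").alias("total_amount"))
)

# Partition the output by date for efficient downstream reads.
daily_totals.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/daily_totals/"
)
```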
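For the Kafka requirement, a hedged Spark Structured Streaming sketch follows. The broker address, topic name, message schema, and sink/checkpoint paths are assumptions, and the spark-sql-kafka connector package is assumed to be available on the cluster.

```python
# Hypothetical real-time pipeline: read JSON events from Kafka, write to Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

# Assumed message schema for the incoming JSON payload.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("temperature", DoubleType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
    .option("subscribe", "sensor-events")               # assumed topic
    .load()
)

# Kafka delivers the payload as bytes; cast to string and parse the JSON body.
parsed = (
    raw.select(F.from_json(F.col("value").cast("string"), schema).alias("payload"))
       .select("payload.*")
)

query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/sensor-events/")      # assumed sink
    .option("checkpointLocation", "s3://example-bucket/chk/")  # assumed checkpoint
    .start()
)
query.awaitTermination()
```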
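For the Airflow requirement, a minimal DAG sketch shows how a daily job might be defined and scheduled. The dag_id, schedule, and run_etl callable are illustrative assumptions (Airflow 2.x imports assumed).

```python
# Hypothetical daily DAG wrapping an ETL callable.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_etl():
    # Placeholder for the actual extract/transform/load logic.
    print("running daily ETL")


with DAG(
    dag_id="daily_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # run once per day
    catchup=False,               # do not backfill missed runs
) as dag:
    PythonOperator(task_id="run_etl", python_callable=run_etl)
```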
Refer code: 905836.