Required Educational Qualifications :
- Bachelor's degree in Computer Science, Technology, or a related field
- Master's degree in Computer Science, Technology, or Computer Applications is an added advantage

Scope :
- 3-6 years of software development experience with a proven track record of building, scaling, and supporting production data pipelines, data warehouses, and data models
- High proficiency in writing idiomatic code, preferably in PySpark
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, EMR/Snowflake, Redshift/BigQuery, and AWS/GCP "big data" technologies
- Maintain high-volume, compute-intensive data pipelines and ensure that processing is not interrupted
- Hands-on experience with Hadoop, Spark, and other relevant big data technologies
- Strong analytical skills related to working with unstructured datasets

Key Performance Indicators :
- Develop algorithms to transform data into useful, actionable information
- Build, test, and maintain database pipeline architectures
- Collaborate with management to understand company objectives
- Create new data validation methods and data analysis tools
- Ensure compliance with data governance and security policies
- Ability to write scripts to automate repetitive tasks
- Demonstrated ability to work effectively in teams, in both lead and support roles
- Experience with cloud infrastructure services such as Amazon Web Services and Google Cloud is preferred
- Effective time management skills, including the demonstrated ability to manage and prioritize multiple tasks and projects

Technical Competencies :
- 3-6 years of experience in data engineering, database management, data structures, and ETL development

Familiar with :
- Proven understanding of and experience with distributed computing frameworks, particularly Spark, Spark SQL, Hive, Azure Databricks, BigQuery, EMR, AWS Glue, etc.
- Hands-on experience with at least one cloud service and with ETL/ELT workflows
- Advanced understanding of ETL design, big data processing, optimization, and data models
- Proficient knowledge of relational databases, including SQL, and of large-scale distributed systems such as Hadoop and Spark
- Amazon Redshift experience is good to have

(ref:hirist.com)