Roles and responsibilities:
- Candidate with at least 6 years of relevant experience
- Tech stack: AWS, Snowflake, Airflow, Python, Spark, big data, and ETL pipelines
- Strong programming skills, well versed in object-oriented programming (OOP), data structures, and algorithms
- Should be comfortable executing ETL (Extract, Transform, Load) processes, including data ingestion, cleaning, and curation into a data warehouse, database, or data platform
- Should be comfortable with schema design
- Experience in distributed computing environments
- Experience with structured/unstructured data and batch/real-time processing (good to have)
- Comfortable with SQL (mandatory), Python (mandatory), and Scala (good to have) to manipulate and prepare data and conduct analyses as needed
- Reading/writing data to/from various sources: APIs, cloud storage, databases, and big data platforms
- Experience working with big data environments such as Hadoop and its ecosystem
- Performing data transformations and applying ML models
- Creating web services to support create, read, update, and delete (CRUD) operations
- Competent in project management frameworks such as Agile
- Excellent communication skills, both written and verbal