Beyondsoft International (Singapore) Pte. Ltd. was set up in 2007 and established as the regional headquarters for the Southeast Asia (SEA) and European markets in September 2015. Based on our vision of "Using technology to promote social progress, economic development and become a global customer preferred partner" and our concept of "Beyond your expectations", Beyondsoft is committed to providing our customers in countries along the "Belt and Road" with comprehensive solutions and products, and to creating commercial value that helps customers achieve continuous business development.
Our core business includes:
- IT development services: Providing customers with IT consulting, software research and development, software and hardware testing, system integration and operation and maintenance, data analysis and other services;
- New retail solutions and products: Through intelligent products, helping small and medium-sized enterprises (SMEs) realize the digital transformation of their daily operations;
- Internet of Things (IoT) platform and solutions: Comprehensive use of IoT, artificial intelligence, big data, cloud computing and other technologies to provide IoT solutions for intelligent upgrades in cities, parks, buildings and industries, to create a smart future.
For more information, please visit www.beyondsoft.com
DESIGNATION : Operation Engineer - PD
RESPONSIBILITIES
As an Operation Engineer, you'll be responsible for ensuring the stability and reliability of the systems you operate by taking ownership and following the operation process. You'll need to triage issues that are either proactively identified or reported to the team. With your insight into the systems and good collaboration with other teams, you will play a key role in resolving incidents and reducing the incident rate. Your day-to-day work will also include supporting production releases, drafting incident reports, shaping monitoring tools and resolving service requests.
Role Responsibilities:
- Identify both infrastructure and application issues proactively by appropriately using monitoring systems
- Support incident triage, define the impact scope and report incidents in a timely manner
- Coordinate with developers to perform root cause analysis and solve issues, and propose mitigations when issues cannot be resolved immediately
- Draft detailed incident reports, identify and analyze recurring issues, and suggest ways to prevent them
- Support production releases and post-release verification based on the release runbook
- Receive and resolve service requests from the ticketing system
- Help the operation manager enhance the operation process and prepare necessary operation documentation
- Actively participate in technical discussions and coordinate with other teams on other required tasks
QUALIFICATIONS
- Bachelor's degree in Information Technology, Computer Science or a related field
- Minimum 3 years of proven work experience in system monitoring and operation
- Experience with cloud services (Azure and GCP) is required
- Experience in operating services deployed on Kubernetes clusters
- Experience in operating Couchbase, MongoDB and PostgreSQL databases is preferred
- Experience in operating Flink, Kafka and ZooKeeper-based systems is preferred
- Good understanding of various monitoring systems such as Grafana, Prometheus, Azure Monitor, Firebase, etc.
- Proficient in scripting languages such as shell and Bash; the ability to automate script execution using pipelines is preferred.