Company: Caterpillar

Address: Chennai, Tamil Nadu
Form of work: Full time
Category: IT

Job description

Career Area:

Business Technologies, Digital and Data

Job Description:

Your Work Shapes the World at Caterpillar Inc.

When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other.  We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.

The Monitoring & AI Operations, Observability Manager is responsible for developing, deploying, and maintaining a suite of tools and capabilities that accelerate CAT Digital's ability to detect, triage, and correct technical issues for services and applications that are developed. This includes anomaly detection, intelligent alerting, event correlation, self-healing, automation, and other ML/GenAI-based solutions. The manager leads a team of technical professionals responsible for the delivery and ongoing operation of our next-generation observability & monitoring tools.

Basic Qualifications:

  • Bachelor’s degree, preferably in Computer Science, Software Engineering, or any other Engineering field.

Responsibilities:

  • Strategizes and improves the monitoring roadmap; implements systems, processes, and governance to meet SLA, SLO, and operational standards.
  • Reduces MTTR with automated incident management.
  • Leads technical prototypes & proof-of-concepts to prove new technologies, tools, and capabilities.
  • Converts prototypes into working, production capabilities that deliver value to internal and external users.
  • Identifies new opportunities for automation, alert noise reduction, and proactive detection of technical issues.
  • Manages an agile backlog & team; drives timely development and deployment of new alerts, monitors, and automation.
  • Manages multiple observability & monitoring tools (adoption, support, contract, costs).
  • Collaborates with peer technical teams and business partners to understand complex requirements and provide comprehensive solutions.

Skills/Requirements:

  • 8+ years of experience in building/deploying monitoring or observability solutions.
  • 7+ years of experience managing a development or operations team (including people management).
  • 12+ years of prior experience in DevOps and/or application development teams.
  • Strong experience with observability tools and platforms (e.g., Prometheus, Grafana, ELK Stack, Jaeger, OpenTelemetry, Splunk, Loki, Datadog, Zabbix, or similar tools).
  • Proficiency in scripting and automation (e.g., Python, Bash, PowerShell, Terraform, Ansible). 
  • Knowledge of cloud computing platforms (e.g., AWS, Azure, GCP) and container orchestration (e.g., Kubernetes). 
  • Experience with monitoring and logging tools like Elasticsearch, AppDynamics, Dynatrace and Splunk.
  • Knowledge of multiple data stores (MySQL, Postgres, DynamoDB).
  • Experience working with development teams building on microservices and cloud-native solutions.
  • Working familiarity with industry ML & generative AI capabilities.
  • Certification in AWS Cloud or SIEM tools will be an added advantage.
  • Excellent communication and collaboration skills.

Posting Dates:

March 1, 2024 - March 14, 2024

Caterpillar is an Equal Opportunity Employer (EEO).


Reference code: 945824. Caterpillar - 2024-03-05 07:44
