Company Description
Join the UAE’s largest bank and one of the world’s largest and safest financial institutions. Our focus is to create value for our employees, customers, shareholders and communities to grow through differentiation, agility and innovation. We are looking for top talent, and your success is our success. Accelerate your growth as you help us reach our goals and advance your career. Be ready to make your mark at a top company in an exciting and dynamic industry.
Job Description
Overall objectives
•To establish and maintain robust, scalable, and secure telemetry pipelines for infrastructure, applications, and user experience data.
•To ensure complete, real-time, and context-rich log and event collection across the enterprise IT estate.
•To support analytics, correlation, and anomaly detection by enabling high-quality observability data ingestion and structuring.
•To contribute to system reliability and rapid incident response through unified and centralised logging strategies.
Role specific responsibilities
•Architect and implement telemetry pipelines to collect, parse, transform, and route logs from servers, containers, applications, network devices, and cloud services.
•Maintain and optimise logging agents, ingestion layers, and storage backends for performance, reliability, and cost efficiency.
•Define and enforce log quality standards: consistent schemas, severity levels, and metadata tagging.
•Enable log correlation with metrics and traces for full-stack observability and root cause analysis.
•Partner with GSO teams to ensure log integrity, retention, and availability for compliance and forensic needs.
•Troubleshoot gaps in logging, identify missing telemetry sources, and maintain 100% coverage of mission-critical assets.
General functional responsibilities
•Maintain documentation of telemetry pipelines, logging schemas, retention policies, and access controls.
•Support observability use cases such as anomaly detection, trend reporting, and alert enrichment with log data.
•Participate in SRE/DevOps practices such as incident retrospectives, reliability reviews, and capacity planning.
•Ensure that logs are properly integrated with SIEM, APM, and incident management tools.
•Stay updated with developments in observability standards, particularly OpenTelemetry and related CNCF projects.
•Provide on-call support for critical telemetry pipeline issues as part of a rota.
Qualifications
Core competencies required
•Deep hands-on experience with centralised logging solutions such as ELK Stack (Elasticsearch, Logstash, Kibana), OpenSearch, Splunk, Fluentd, Fluent Bit, or Loki.
•Strong understanding of log formats (structured/unstructured), log parsers, field extraction, and enrichment techniques.
•Experience designing, deploying, and maintaining scalable telemetry pipelines across hybrid environments.
•Familiarity with OpenTelemetry, syslog, JSON, protobuf, OTLP, and cloud-native logging (e.g. AWS CloudWatch Logs, Azure Monitor Logs).
•Strong scripting skills (Bash, Python, PowerShell) for log agent deployment, parsing automation, and data hygiene.
•Ability to translate operational needs into actionable telemetry strategies.