- Location
- Bengaluru, India
- Last Published
- Dec. 4, 2024
- Function
- Operations
At Tekion, we're building the only cloud-native platform that is transforming the automotive retail industry, leapfrogging it into the future and providing an unparalleled customer experience. We're creating seamlessly integrated, elegant, and intuitive solutions built with cutting-edge technology and powered by Big Data, Machine Learning (ML)/AI, and the Internet of Things (connected vehicles to connected devices). We’re harnessing passion, entrepreneurial spirit, deep industry expertise, and the latest technologies to create something very special. We're inventing new technology along the way to overcome barriers and solve big problems, all while having a blast doing it!
We are seeking an experienced and proficient Engineering Manager, NOC to join our team. This role demands a minimum of 8 years of experience in NOC, Production Support, or Command Center roles within a 24 x 7 operational environment. The ideal candidate will possess advanced knowledge of observability platforms and ITIL processes, along with strong analytical and troubleshooting skills.
Responsibilities
- Ownership of NOC team operations: hiring great talent, building teams from the ground up, and the mentorship, growth, and retention of people.
- Lead the monitoring and maintenance of system health using observability platforms such as AppDynamics, Dynatrace, Datadog, or New Relic.
- Provide expert consultation, design, and implementation of APM, Real User Monitoring, Synthetic Monitoring, Infrastructure Monitoring, and Log Management modules.
- Oversee incident, problem, change, and release management processes as per ITIL standards.
- Manage and drive major incident bridge calls and post-incident reviews (PIRs).
- Conduct root cause analysis and troubleshooting using tools like New Relic and Kibana.
- Develop and maintain monitoring alerts and dashboards.
- Resolve production issues across various services and stack levels.
- Ensure compliance with Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
- Develop monitoring solutions to detect symptoms and prevent outages.
- Automate operational processes to enhance system efficiency and reduce manual tasks.
- Take responsibility for on-call rotations to immediately address potential issues or disruptions.
- Work rotational shifts (Day/Night/Weekends) as needed.
- 5+ years of experience in NOC/Production Support/Command Center roles.
- 3+ years of experience as an Engineering Manager, covering people and performance management.
- Extensive experience establishing organisation-wide processes such as Incident Management, and evangelising them.
- Advanced knowledge of observability platforms (AppDynamics, Dynatrace, Datadog, New Relic).
- Extensive hands-on experience across multiple platforms.
- In-depth understanding of ITIL processes and best practices.
- Superior problem-solving and troubleshooting skills.
- Proficiency in log monitoring solutions (Sumo Logic, Splunk) is highly desirable.
- Extensive experience with cloud service providers (Azure, AWS) is preferred.
- Expertise in cloud computing platforms (AWS, Azure).
- Significant experience in technical support or operations roles, ideally in a SaaS environment.
- Ability to create comprehensive technical documentation and knowledge bases.
- Proven experience in monitoring and maintaining large-scale systems.
- Advanced knowledge of monitoring and logging tools (Prometheus, Grafana, ELK Stack).
- Experience in Dev support is highly beneficial.
- Proficiency in command center technologies and software (e.g., SIEM, NMS, ITSM tools).
- Experience with crisis management and disaster recovery planning.
- Ability to work under pressure and make quick, informed decisions.
- Strong analytical skills to interpret data and trends.
- Ability to work collaboratively with cross-functional teams.
- Excellent organizational skills and attention to detail.
- Experience in developing and managing budgets.
- Familiarity with network and systems architecture.
- Knowledge of cybersecurity principles and practices.
- Availability to work flexible hours, including nights, weekends, and holidays, as needed.
Tekion is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.
California residents can review Tekion's California Privacy Policy here.