Job Description / Skills Required
The Metrics Team is responsible for building and maintaining Groupon internal infrastructure, user support, standards, procedures and policies for Splunk. We need people who have systems engineering, development and presentation skills, and the belief that anything worth doing twice is worth automating!
The successful candidate will be versatile, with an extensive Unix systems engineering background and deep knowledge of Linux internals (kernel/libraries, filesystems, package management, etc.). Experience working with many open source technologies (nginx, Splunk, Nagios, MySQL, Postgres, Git, memcache, Redis and LDAP are some examples) is a plus. Furthermore, the candidate must be able to solve complex technical questions and problems in a highly dynamic operations environment.
Additionally, the candidate will need to demonstrate ownership, such as ensuring the proper handling of the entire lifecycle of projects: from concept to implementation, to documentation and training of peers, to the development and implementation of system performance, availability, manageability, and security requirements.
Must work well with a globally distributed team spanning multiple time zones. An understanding of what it means to own a problem, the ability to focus on tasks and filter out distractions, and a process mindset are essential.
Self-starter and fast-learner
Linux administration experience (5+ years)
Shell scripting experience (Bash, C shell, Bourne shell)
Expert Splunk experience (3+ years)
Appropriate technical background (Bachelor's degree in computer science or equivalent)
Experience working in a large, complex production environment
Experience working with tiered environments (Sandbox, UAT, Production)
Experience in a fast-paced 24×7 environment with on-call rotation
Investigate, diagnose, and implement corrective action in development, staging and production environments
Ability to stop, collaborate and listen with technical and non-technical consumers from technical peers to executive level stakeholders
Well organized with a healthy sense of urgency; able to set, communicate, and meet aggressive deadlines with competing priorities
Experience working with metrics/monitoring tools (Ganglia, Graphite, Nagios, OpenTSDB, etc.)
Programming experience with Ruby, Python, Java – or all three!
Proven track record building solutions to solve enterprise problems
Experience presenting relevant information for cross functional teams
Strong interpersonal and communication skills
Knowledge and experience applying operational best practices
Splunk ES and/or ITSI experience/training
Work strategically to build out monitoring and logging infrastructure across Groupon datacenters
Automate system logging configurations
Maintain and upgrade monitoring, logging and alerting tools
Build relationships with key development teams, and solve their monitoring needs
Work tactically during a crisis to find the root cause of an issue, and fix, mitigate, or escalate to the right team in the shortest time possible
Any and all assigned duties; must be able to travel up to 20%
Groupon provides a global marketplace where people can buy just about anything, anywhere, anytime. We’re enabling real-time commerce across an expanding range of categories including local businesses, travel destinations, consumer products, and live or lively events. At the same time, we are providing advertising options and tools that merchants can use to grow and manage their businesses. Culturally, we believe that great people make great companies and that starting with the customer and working backward moves us forward. Community matters to us on an internal, local and global scale—it’s fundamental to our company’s growth and to the well-being of the world at large. We also value self-awareness, candor, lunch and WiFi. If we match with you, please apply to join us.