Job Information

UnitedHealth Group Site Reliability Engineer - National Remote in Washington, District Of Columbia

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health equity on a global scale. Join us to start Caring. Connecting. Growing together.

As a Site Reliability Engineer (SRE), you will employ software engineering to automate critical IT operations tasks, including production system management, change management, and incident response. You will be responsible for design review and control; prediction, estimation, and apportionment methodology; failure mode and effects analysis; and the planning, operation, and analysis of reliability testing and field failures. You will also develop and administer reliability information systems for failure analysis, design and performance improvement, and reliability program management over the entire product life cycle.

You will help ensure swift incident response and scalable emergency handling, fostering greater reliability and resilience in managing complex systems. You will support our efforts to optimize system performance and ensure the reliability of our technology ecosystem.

You’ll enjoy the flexibility to telecommute* from anywhere within the U.S. as you take on some tough challenges.

Primary Responsibilities:

  • System Reliability and Incident Management: Ensure the reliability, availability, and performance of services. Respond to, troubleshoot, and resolve service outages or degradation. Lead post-incident reviews and drive root cause analysis and mitigation

  • Monitoring and Performance Tuning: Develop and maintain advanced monitoring and alerting systems to detect and mitigate issues proactively. Continuously measure and optimize system performance, identifying bottlenecks and points of failure

  • Continuous Improvement: Advocate for and implement changes to improve system reliability and scalability. Innovate new ways to manage and automate operations tasks

  • Collaboration and Advocacy: Work closely with development teams to incorporate best practices and influence architecture, code health, and operational processes. Promote a culture of shared responsibility for production stability and performance. Integrate SRE principles into the engineering workflow

  • Capacity Planning and Scalability: Forecast and plan for the infrastructure needs. Implement scalable systems and resource allocation strategies to handle growth and peaks in demand

  • Documentation and Knowledge Sharing: Create and maintain detailed documentation of the systems, processes, and procedures. Facilitate knowledge sharing through regular technical presentations and training sessions

  • Configure, implement, manage, and optimize end-to-end APM solutions, with a focus on Dynatrace, AppDynamics, Splunk, or other relevant tools

  • Work closely with IT teams to seamlessly integrate APM solutions into the existing infrastructure and applications

  • Develop and maintain customized dashboards, reports, and alerts to offer real-time insights into the health and performance of the system

  • Collaborate with diverse teams to understand business requirements and configure APM solutions to meet performance monitoring needs

  • Conduct system analysis, troubleshooting, and optimization across various applications and infrastructure components

  • Provide support to internal stakeholders and support teams on configuration tuning, troubleshooting, and tool-specific nuances

  • Manage performance continuously, measuring it and working with stakeholders to improve it

  • Build quality frameworks that give stakeholders a feedback loop for easier, improved APM product management, system patching, and implementation of security controls

  • Document automation procedures to improve the velocity and quality of the effort

  • Own continuous performance management, software release management, configuration management, and transition to stakeholders

  • Request feedback from teams and perform tool implementation assessments, offering recommendations for improvements that enhance system reliability and responsiveness

You’ll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.

Required Qualifications:

  • Must possess one of the following industry-recognized reliability engineering certifications:

  1. Engineer In Training Certification (EIT)

  2. Certified Reliability Engineer (CRE)

  3. Certified Maintenance & Reliability Professional (CMRP)

  4. Certified Quality Engineer (CQE)

  5. Certified Six Sigma Green Belt (CSSGB)

  • 4+ years of hands-on experience with scripting languages (e.g., Python, PowerShell) for automation and customization across various APM tools

  • 4+ years of experience monitoring software performance in terms of service-level agreements (SLAs), service-level indicators (SLIs), and service-level objectives (SLOs)

  • 4+ years of experience with APM features such as real user monitoring, synthetic monitoring, and effective root cause analysis

  • 4+ years of experience with one or more of the following platforms: Salesforce, Pega, Appian, Microsoft Power Platform

Preferred Qualifications:

  • ITIL Foundation Certification

  • Must be able to obtain and maintain a government security clearance

  • Bachelor's Degree in computer science or equivalent technical degree

  • Understanding of application architecture, infrastructure, and cloud environments

  • Proficiency in configuring and customizing multiple APM tools like Dynatrace, Splunk, AppDynamics for optimal performance monitoring

  • Additional certifications, such as Salesforce Developer or Certified Quality Engineer (CQE), are highly desirable

Soft Skills:

  • Ability to communicate both verbally and in writing

  • Excellent communication skills to collaborate effectively with cross-functional teams and convey technical concepts to non-technical stakeholders

  • Strong problem-solving skills, including the ability to analyze complex systems and identify performance bottlenecks

Telecommuting Requirements:

  • Must have reliable internet service that allows for effective telecommuting

  • Must be able to obtain and maintain a government security clearance

  • All work must be conducted in the United States

  • Must be eligible to work in the United States

*All Telecommuters will be required to adhere to UnitedHealth Group’s Telecommuter Policy.

California, Colorado, Nevada, Connecticut, New York, New Jersey, Rhode Island, Hawaii, Washington, or Washington, D.C. Residents Only: The salary range for California, Colorado, Nevada, Connecticut, New York, New Jersey, Rhode Island, Hawaii, Washington, or Washington, D.C. residents is $70,200 to $137,800 per year. Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. UnitedHealth Group complies with all minimum wage laws as applicable. In addition to your salary, UnitedHealth Group offers benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase, and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with UnitedHealth Group, you’ll find a far-reaching choice of benefits and incentives.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone–of every race, gender, sexuality, age, location and income–deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes — an enterprise priority reflected in our mission.

Diversity creates a healthier atmosphere: UnitedHealth Group is an Equal Employment Opportunity / Affirmative Action employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, national origin, protected veteran status, disability status, sexual orientation, gender identity or expression, marital status, genetic information, or any other characteristic protected by law.

UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
