NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.” We're looking to grow our company and establish teams with the most thoughtful people in the world.
NVIDIA DGX, HGX, and MGX servers deliver the world's leading solutions for enterprise AI infrastructure at scale.
We are the Server Platform Software Tools team at NVIDIA. We deliver infrastructure and tools that ready servers for data center deployment, covering manageability; reliability, availability, and serviceability (RAS); and server firmware management. We are looking for a talented, experienced manager with a background in tools and infrastructure to lead teams for Server Platform Software Tools.
In this role, you will help release the world’s best resilient GPU and Grace servers, ensuring high uptime for these servers in data centers. This is a highly visible role at NVIDIA, responsible for high-quality RAS and firmware management features for NVIDIA scale-out solutions, including GPU and NVSwitch products. The role requires working across the server ecosystem to ensure that partners are building servers the right way, and working closely with internal cross-functional teams such as hardware engineers, system architects, and software developers.
Join us at the forefront of technological advancement.
What you’ll be doing:
Lead, mentor, and grow your Firmware Lifecycle Infrastructure, Server Validation and Tools engineering team and be responsible for the planning, execution, performance and quality of the projects.
This is a technical leadership role, so you will participate in feature design and implementation.
Interact with internal and external partners to understand their use cases and requirements. Collaborate with engineering teams, program and product management, and partners to define the product roadmap.
Continuously review and identify improvement opportunities in established processes, infrastructure, and practices to ensure the teams are executing in the most efficient and transparent manner.
What we need to see:
10+ years of overall experience in the software industry, with a specialization in system software and/or firmware development.
4+ years of management experience.
BS, MS, or Ph.D. in CS, CE, EE, or a related technical field, or equivalent experience.
Prior systems software or firmware development experience with a successful track record of taking several complex software features or products through the full product life cycle.
Strong understanding of server architecture, systems software fundamentals, HW-SW interactions and performance analysis/optimizations.
Excellent Python programming and debugging skills on Linux.
Experience balancing multiple projects with competing priorities.
Flexibility to work and communicate effectively across different teams and time zones.
Ways to stand out from the crowd:
Familiarity with the architecture of datacenter server software and experience with the in-band and out-of-band management of firmware and hardware components.
Understanding of the REST architectural style, especially JSON over HTTPS with OAuth.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, we want to hear from you!
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.