Company Name:
NVIDIA
Category:
Data/Analytics
Software & Web Development
Status:
Full-Time
Contact Email:
smcmillan@nvidia.com
Description:
NVIDIA is looking for a system software engineer to join the datacenter system software team. As a system software engineer, you will work with a team of highly talented software and hardware engineers across a wide variety of datacenter technologies. You will play a key role in system health, diagnostics, and support of the DGX appliance and NVIDIA datacenter products, from pre-launch through to systems deployed in the field.
What you’ll be doing:
- Define, design, and develop software components with a focus on overall system health and validation.
- Collaborate with multiple groups and subject matter experts to diagnose a wide range of potential software and hardware issues.
- Automate complex tasks to improve the efficiency of system bring-up, acceptance, and preventive maintenance.
What we need to see:
- Bachelor of Science or Master of Science degree (or equivalent) in Computer Science, Computer Engineering, or a related technical discipline, with 3+ years of work experience in the following:
- Strong programming and debugging skills in a scripting language such as Python, Perl, Ruby, or Unix shell
- Solid working knowledge of Linux-based operating systems (Ubuntu preferred)
- Experience with telemetry and analytics infrastructure, including Elasticsearch, Logstash, Splunk, Kibana, collectd, and similar tools
- Experience working with HPC and/or deep learning workloads and benchmarks
- Experience working with computer clusters, MPI, and InfiniBand
- Excellent data analysis skills and the demonstrated ability to troubleshoot complex issues involving multiple software or hardware components
- Excellent communication and organizational skills
Ways to stand out from the crowd:
- Experience applying deep learning frameworks
- GPU programming experience with CUDA