Location: Escondido, California, US
Job Summary:
Job Duties
- Develop optimized C++ CUDA compute kernels for deep learning operations (e.g., matrix multiplies, convolutions).
- Implement software engineering best practices, including regression testing and CI/CD.
- Collaborate with the CUDA compiler, deep learning performance, and hardware architecture teams on optimizations.
Required Skills (Keywords)
- C++ programming
- Software design
- Debugging
- Performance analysis
- Parallel programming
- Assembly programming
Required Experiences (Topics)
- Master's/PhD or equivalent in Computer Science, Computer Engineering, Applied Math, or a related field.
- Experience in performance-oriented parallel programming.
- Understanding of computer architecture.