Location: Seattle, WA, US
Job Summary:
Job Duties:
- Develop machine learning systems for large-scale models.
- Design architecture for high concurrency, reliability, and scalability.
- Work on resource scheduling, model training, inference, data management, and workflows.
- Iteratively develop systems based on customer-driven scenarios.
Required Skills:
- Knowledge of distributed and parallel computing principles.
- Familiarity with machine learning algorithms and platforms (e.g., TensorFlow, PyTorch).
- Proficiency in at least one programming language (C/C++, Go, or Python) in a Linux environment.
Required Experience:
- Graduate in Computer Science or a related field.
- Experience in large-model training and GPU-based high-performance computing.
- Participation in competitive programming, or relevant internship or work experience.
Job URLs: