Location: Palo Alto, CA 94306, USA
Job Summary:
Job Duties:
- Build and manage a comprehensive semiconductor dataset.
- Develop software solutions for data scraping and handling.
- Extract and clean data from various modalities (text, images, circuits).
- Prepare data for the Machine Learning team.
- Manage the transfer of customer data and feedback.
- Parse documents in multiple formats.
- Develop software pipelines for data labelers.
- Implement pre-processing systems for AI training.
Required Skills:
- Scalable software solutions
- PDF parsing expertise
- Strong software engineering skills
- Handling of diverse data modalities
- Custom data processing libraries
- AI training data preparation techniques
- Cloud data management proficiency
Required Experiences:
- Background in Electrical Engineering (bonus)
- Understanding of machine learning model behavior
- Fine-tuning large language models
- Experience at hyper-growth startups
- Building systems for training foundation models
Job URLs: