Location: Palo Alto, CA 94306, USA
Job Summary:
Job Duties:
- Build and manage a comprehensive semiconductor dataset.
- Develop software solutions for data scraping and handling at scale.
- Extract and clean information from diverse data types (text, images, circuits).
- Prepare data for the Machine Learning team.
- Manage customer data transfer and feedback systems.
- Parse documents in various formats.
- Develop software pipelines for data labelers.
- Implement preprocessing systems for AI training datasets.
Required Skills:
- Scalable software solutions
- PDF parsing
- Software engineering
- Diverse data handling
- Custom data processing libraries
- AI training data techniques
- Cloud data management
Required Experience:
- Background in Electrical Engineering (bonus).
- Understanding of how data quality affects machine learning model behavior.
- Fine-tuning large language models (bonus).
- Experience in hyper-growth startups (bonus).
- Building systems for training foundation models (bonus).
Job URLs: