Location: El Dorado Hills, CA, US
Job Summary:
Job Duties
- Own end-to-end data collection, cleaning, and preprocessing for HTML-based datasets.
- Use web analysis tools to extract data from DOM structures.
- Collaborate with ML Engineers on feature engineering experiments.
- Generate synthetic datasets using LLMs.
- Analyze data using dimensionality reduction techniques.
- Automate data workflows for processing and transformation.
- Document data workflows and methodologies.
- Create validation systems for data consistency and integrity.
Required Skills
- Python (Pandas, NumPy)
- Web analysis tools (Selenium, BeautifulSoup)
- HTML and DOM structures
- Natural language processing (NLP) techniques
- Data quality and governance
- Cloud platforms (AWS, GCP, Azure)
Required Experience
- 2+ years as a Data Analyst
- Work in a cybersecurity or ML-focused environment
- Collaboration with technical teams
- Data manipulation and automation