Location: El Dorado Hills, CA, US
Job Summary:
Job Duties:
- Manage end-to-end data collection, cleaning, and preprocessing for HTML datasets.
- Use web analysis tools to extract data from DOM environments.
- Collaborate with ML Engineers on feature engineering experiments.
- Generate synthetic datasets using LLMs.
- Analyze data with dimensionality reduction techniques.
- Automate data workflows for processing and transformation.
- Maintain documentation for workflows and processes.
- Create validation systems for data quality.
Required Skills:
- Python, Pandas, NumPy
- Web analysis tools (Selenium, BeautifulSoup)
- HTML, DOM structures
- NLP techniques
- Cloud platforms (AWS, GCP, Azure)
- Problem-solving
Required Experience:
- 2+ years as a Data Analyst
- Cybersecurity or ML-focused environment
- Data workflow automation
- Synthetic dataset generation