Location: N/A
Job Summary:
Job Duties
- Manage end-to-end data collection, cleaning, and preprocessing for HTML datasets.
- Use web analysis tools (e.g., Selenium, BeautifulSoup) to extract data from DOM structures.
- Collaborate with ML Engineers on feature engineering and training datasets.
- Generate synthetic datasets using LLMs.
- Analyze data with dimensionality reduction techniques.
- Automate data workflows and maintain documentation.
- Build validation and data quality systems to ensure dataset integrity.
Required Skills (Keywords)
- Python (Pandas, NumPy)
- Web analysis tools (Selenium, BeautifulSoup)
- HTML, DOM structures
- NLP techniques
- Synthetic datasets, LLMs
- Data quality governance
- Cloud platforms (AWS, GCP, Azure)
Required Experience (Topics)
- 2+ years as a Data Analyst
- Experience in a cybersecurity or ML-focused environment
- Collaboration with technical teams
- Bachelor's degree in a relevant field or equivalent experience
Job URLs: