GreenChoice analyzes products' health & sustainability data to power product transparency and SKU-level ESG for the grocery and food industries.
Candidate must have:
- 3-5 years hands-on Python scripting for large projects
- 3-5 years SQL (MySQL); familiar with functions, complex joins, and events
- At least one major scraping project that crawls multiple links within a site and extracts data into predefined structures
- Experience with optimization and parallelization in Python
- Experience scraping against defenses such as request thresholds, blocked IPs, and bot detectors
- Data cleaning experience: fixing data types, normalizing to a standard vocabulary, fuzzy matching / NLP, and scrubbing data for database insertion
- BONUS: Experience with life cycle analysis and/or carbon footprint modeling
- BONUS: Experience w/ high-volume ETL
- BONUS: Experience w/ large, complex Python pipelines and legacy code
- BONUS: Experience scraping major food and beverage retailers