We’re looking for a highly skilled and independent Senior Big Data Engineer to lead the design and development of scalable, distributed ML training pipelines and infrastructure. This is a hands-on role with real impact on the product - from architecture decisions to production performance in a real-time AdTech environment processing billions of data points daily.
The ideal candidate is proactive, thrives in ownership-driven environments, and is excited to shape the future of our ML systems.
What you’ll do:
Design and implement distributed ML training pipelines
Build infrastructure for data ingestion, preprocessing, and model evaluation
Collaborate with Data Scientists, Engineers, and Product teams
Own the full ML lifecycle: tooling, monitoring, automation, and optimization
Stay up to date with best practices in MLOps and distributed systems
What we’re looking for:
5+ years of experience in backend or ML engineering
Strong Python skills and experience with distributed computing frameworks such as Spark (PySpark/Scala) or Dask
Experience designing scalable ML infrastructure
Solid understanding of ML workflows and lifecycle
Familiarity with cloud platforms (AWS, GCP, OCI) and containerized deployments (Kubernetes)
Experience with SQL and NoSQL/in-memory databases (e.g., Redis, Bigtable, Aerospike)
Proactive mindset, clean code, and strong communication skills
We offer:
B2B contract with 20 paid days off
Flexible hybrid work model from Warsaw (negotiable!)
Opportunity to work with cutting-edge technologies in a high-scale environment
High level of autonomy and real influence on the product
Let’s build something big together - apply now!