iHeartRadio is looking for a talented Data Engineer to help us in our data-driven mission to reshape the world of music and podcasts. You will work in a highly collaborative team of engineers, alongside data scientists and analysts, to distill existing data processes, import new external data sources, and create complex data mashups. Your work will provide valuable insights and power important music data products. Expect to build high-throughput data pipelines and improve the existing big data infrastructure. You will also improve performance, squash bugs, and increase visibility across the data ecosystem. You will have end-to-end ownership of your code, though ideally you also relish reviewing a good pull request. If you enjoy working with large sets of data and the challenges associated with them, this is the role for you.
- Working within an Agile development methodology and owning data-driven solutions end-to-end
- Experimenting with various frameworks in the Big Data ecosystem to identify the optimal approach for extracting insights from our datasets
- Identifying performance bottlenecks in data pipelines and architecting faster, more efficient solutions when necessary
- Creating new data warehouse solutions, and defining and demonstrating best practices in schema and table design in varied databases like Hive, Redshift, Spectrum, etc.
- Developing end-to-end batch and real-time pipelines for large data sets to our Hadoop/Spark clusters, and bringing summarized results back into a data warehouse for downstream business analysis
- Increasing efficiency and automating processes by collaborating with our SRE team to update existing data infrastructure (data model, hardware, cloud services, etc.)
- Designing, building, launching and maintaining efficient and reliable data pipelines in production
- Designing, developing, and owning new systems and tools to enable our consumers to understand and analyze the data more quickly
- Experience ingesting, processing, storing, and querying large datasets
- Ability to write well-abstracted, reusable code components in Python, Scala or similar language(s)
- Experience working in a Hadoop/Spark ecosystem
- Experience with workflow managers and schedulers, such as Airflow
- Ability to investigate data issues across a large and complex system by working alongside multiple departments and systems
- A self-starter who thrives in owning the products and pipelines they develop
- Experience with AWS big data technologies (S3, Redshift, EC2, RDS, EMR, DynamoDB) is a plus
- Experience with configuration management tools (Ansible, Chef, Puppet, etc.) is a plus
- Experience with streaming frameworks like Spark or Kafka is a plus
iHeartRadio is now the number one commercial podcast publisher globally! We connect fans with their favorite music and radio by offering users thousands of live radio stations, personalized custom artist stations created by just one song or seed artist, and the top podcasts and personalities. We have continued to build on our position as the only major multi-platform audio company, with a reach extending across more than 250 platforms, 2,000 different connected devices, and 135 million registered users. Most recently we acquired Jelli (our technology platform for targeted advertisements) as well as Stuff Media (owner of How Stuff Works content). Outside of the U.S., we are in New Zealand, Australia, Canada, and Mexico!