iHeartRadio is looking for a talented Data Engineer to help us in our data-driven mission to reshape the world of music and the spoken word. You will work in a highly collaborative team of engineers, alongside data scientists and analysts, to distill existing data processes, import new external data sources, and create complex data mashups. Your work will provide valuable insights and power important music data products. Expect to build high-throughput data pipelines and improve the existing big data infrastructure. You will also improve performance, squash bugs, and increase visibility across the data ecosystem. You will have end-to-end ownership of your code, though ideally you also relish reviewing a good pull request. If you enjoy working with large data sets and the challenges that come with them, this is the role for you.
- Working in an Agile development methodology and owning data-driven solutions end-to-end
- Experimenting with various frameworks in the Big Data ecosystem to identify the optimal approach for extracting insights from our datasets
- Identifying performance bottlenecks in data pipelines and architecting faster, more efficient solutions when necessary
- Creating new data warehouse solutions, and defining and demonstrating best practices in schema and table design across varied databases such as Hive, Redshift, and Spectrum
- Developing end-to-end batch and real-time pipelines for large data sets on our Hadoop/Spark clusters, and bringing summarized results back into a data warehouse for downstream business analysis
- When needed, performing data housekeeping, data cleansing, normalization, and implementation of required data model changes
- Increasing efficiency and automating processes by collaborating with our SRE team to update existing data infrastructure (data model, hardware, cloud services, etc.)
- Designing, building, launching and maintaining efficient and reliable data pipelines in production
- Designing, developing, and owning new systems and tools to enable our consumers to understand and analyze the data more quickly
- 2+ years of experience ingesting, processing, storing, and querying large datasets
- 2+ years of experience working in the Hadoop/Spark ecosystem
- Ability to write well-abstracted, reusable code components in Python, Scala or similar language(s)
- Ability to investigate data issues across a large and complex system by working alongside multiple departments and systems
- A self-starter who thrives in owning the products and pipelines they develop
- Experience with configuration management tools (Ansible, Chef, Puppet, etc.) is a plus
- Experience with Spark, Kafka, or similar is a plus
- Experience with AWS technologies (S3, Redshift, EC2, RDS, EMR, Dynamo) is a plus
- Proven proficiency with Scala is a plus
iHeartRadio, iHeartMedia’s digital radio platform, is the fastest-growing digital audio service in the U.S. and offers users thousands of live radio stations, personalized custom artist stations created from just one song or seed artist, and the top podcasts and personalities. iHeartRadio is a great environment for people who like to innovate and have the power to influence decisions. We have 120+ million registered users across more than 200 different platforms, and outside the U.S., we are in New Zealand, Australia, Canada, and Mexico!