Spotlight – A Voice-Enabled Personalized Ad Generator

2025-01-15

Event Advert API – How We Built a Voice-Enabled Event Advertisement Generator

Measure of Music 2025 Hackathon | Team 13

In the whirlwind of the Measure of Music 2025 hackathon, our team (Team 13) set out to dynamically create personalized Spotify ads based on user listening habits and nearby events. My contribution was the voice-enabled Event Advert API, which converts an ad script to speech and mixes it with background music to produce professional-sounding audio ads. Other team members worked on complementary components such as personalized ad targeting and Spotify API integrations.

Inspiration and Concept

We wanted to blend the power of modern voice assistants with real-time data processing to create a tool that dynamically transforms event information into engaging, radio-style advertisements. Our vision was to enable users to experience personalized Spotify ads that are not only data-driven but also delivered in a compelling audio format.

Key aspects of our solution include:

  • Dynamic Text Generation: Using advanced language models to generate tailored advertisement scripts based on event details.
  • Asynchronous Processing: Leveraging FastAPI and Python’s asyncio to handle multiple I/O tasks concurrently, ensuring quick response times.
  • High-Quality Audio Output: Converting generated text to speech via LMNT’s TTS service and overlaying it with background music using pydub.
  • Serverless Deployment: Deploying the solution on AWS Lambda with API Gateway integration for scalable, on-demand performance.

How the Magic Happens (Component Breakdown)

Tech Stack:

  • FastAPI: For building a fast, asynchronous HTTP API.
  • Python's asyncio: To manage non-blocking I/O operations such as calls to the TTS service.
  • LMNT API: For high-quality text-to-speech (TTS) conversion.
  • pydub: For seamless audio processing and mixing.
  • AWS Lambda & API Gateway: To deploy a scalable, serverless solution.
  • Terraform: For automated infrastructure provisioning.

My Contribution: Event Advert API

My part of the project involved creating an API that:

  • Accepts event details (artist, venue, genre, etc.) and dynamically generates a radio-style advertisement script.
  • Converts the script to speech using LMNT’s TTS service asynchronously.
  • Mixes the synthesized speech with background music stored on S3, producing a polished MP3 file.
  • Deploys the solution as an AWS Lambda function, allowing it to scale seamlessly and be accessed via an HTTP endpoint through API Gateway.
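
As a rough illustration of the first step, the sketch below turns event details into a prompt for a language model. The EventDetails fields, the build_ad_prompt helper, and the prompt wording are all hypothetical, not the hackathon code, and the call to the model itself is left out.

```python
from dataclasses import dataclass


@dataclass
class EventDetails:
    """Hypothetical request payload; the field names are assumptions."""
    artist: str
    venue: str
    genre: str
    date: str


def build_ad_prompt(event: EventDetails, max_words: int = 60) -> str:
    """Build the instruction text sent to a language model for one event."""
    return (
        f"Write a radio-style advertisement of at most {max_words} words. "
        f"Artist: {event.artist}. Venue: {event.venue}. "
        f"Genre: {event.genre}. Date: {event.date}. "
        "Use an upbeat tone and end with a call to action."
    )


if __name__ == "__main__":
    event = EventDetails("Example Artist", "Example Hall", "indie rock", "12 June")
    print(build_ad_prompt(event))
```

Capping the word count in the prompt is one simple way to keep the spoken ad short; the right limit would depend on the target ad length.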

Other Team Contributions

While I focused on the voice-enabled advertisement generator, my teammates tackled other crucial aspects:

  • Personalized Ad Targeting: Integrating with Spotify’s API to fetch user listening habits and tailoring ad content accordingly.
  • Local Data Integration: Incorporating nearby event data and contextual information from various data sources.
  • Frontend and User Experience: Designing an intuitive interface that allows users to interact with the system and receive real-time audio ads.

Challenges and Solutions

Asynchronous Coordination

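Coordinating the steps of each request, such as TTS conversion and audio mixing, was essential. FastAPI and asyncio let the service wait on I/O-bound calls like TTS without blocking other requests, which kept the system responsive while several ads were being generated. pydub's mixing is synchronous, CPU-bound work, so it has to be kept off the event loop to preserve that benefit.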
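
One way to arrange this kind of coordination is sketched below, using only the standard library. It assumes the TTS request and the background-music download are independent of each other and that mixing is blocking work. The three helper functions are hypothetical stand-ins (they are not the LMNT client or pydub), so the structure is the point, not the stubs.

```python
import asyncio
import time


async def synthesize_speech(script: str) -> bytes:
    """Stand-in for an async TTS request (hypothetical, not the LMNT client)."""
    await asyncio.sleep(0.1)  # simulates network latency
    return b"VOICE:" + script.encode()


async def fetch_background_music(key: str) -> bytes:
    """Stand-in for downloading a music bed from object storage."""
    await asyncio.sleep(0.1)  # simulates network latency
    return b"MUSIC:" + key.encode()


def mix_audio(voice: bytes, music: bytes) -> bytes:
    """Stand-in for blocking, CPU-bound mixing."""
    time.sleep(0.05)
    return voice + b"|" + music


async def build_ad(script: str, music_key: str) -> bytes:
    # The two I/O-bound steps do not depend on each other, so run them together.
    voice, music = await asyncio.gather(
        synthesize_speech(script),
        fetch_background_music(music_key),
    )
    # Blocking work goes to a worker thread so the event loop stays free.
    return await asyncio.to_thread(mix_audio, voice, music)


if __name__ == "__main__":
    print(asyncio.run(build_ad("Live tonight!", "beds/upbeat.mp3")))
```

Mixing still has to wait for both inputs, so the concurrency gain here comes from overlapping the two network waits and from not stalling other requests during the mix.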

Smooth Audio Streaming

Streaming high-quality MP3 files in real time posed its own set of challenges. We used pydub to overlay the synthesized speech on the background music and export the mix as an MP3 for delivery to the end user.
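
pydub performs this kind of mix through AudioSegment.overlay, with gain changes expressed in decibels. To show the arithmetic behind such a mix without depending on pydub or ffmpeg, here is a plain-Python sketch on 16-bit PCM samples. The function names, the -12 dB default, and the choice to loop a short music bed are illustrative assumptions, not the project's settings.

```python
from array import array


def db_to_gain(db: float) -> float:
    """Convert a decibel change to a linear amplitude factor."""
    return 10 ** (db / 20)


def overlay(voice: array, music: array, music_db: float = -12.0) -> array:
    """Mix attenuated music under the voice; output has the voice's length."""
    gain = db_to_gain(music_db)
    mixed = array("h")  # signed 16-bit samples
    for i, sample in enumerate(voice):
        # Loop the bed if it is shorter than the voice track.
        bed = music[i % len(music)] if len(music) else 0
        total = sample + round(bed * gain)
        # Clip to the 16-bit range instead of letting the value overflow.
        mixed.append(max(-32768, min(32767, total)))
    return mixed
```

Lowering the music by a fixed number of decibels keeps the voice intelligible, and clipping guards against overflow when both tracks peak at the same time.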

Rate Limiting and External API Integration

Integrating external APIs, like those for personalized data retrieval, introduced rate-limiting constraints. We implemented caching mechanisms to mitigate repeated calls and improve overall performance.
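
A minimal sketch of one such caching mechanism follows: an in-memory cache with a time-to-live, so repeated lookups within the window never reach the rate-limited API. The TTLCache class and its get_or_fetch method are hypothetical names, and the write-up does not say which caching approach the team actually used.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh entry: no external call is made
        value = fetch()  # miss or expired entry: call the upstream API once
        self._store[key] = (now, value)
        return value
```

On AWS Lambda, an in-memory cache like this only survives while an execution environment stays warm, so a shared store would be needed if cached data must persist across instances.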

Serverless Deployment

Deploying on AWS Lambda with API Gateway allowed us to scale our solution effortlessly. Using Terraform for infrastructure as code made the deployment process reproducible and maintainable.
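
To make the Lambda side concrete, here is a sketch of a bare handler that returns MP3 bytes through an API Gateway proxy integration, where binary bodies are base64-encoded and flagged with isBase64Encoded. It is not the team's setup: the project used FastAPI, which needs an adapter layer to run on Lambda, and generate_ad_audio is a hypothetical placeholder for the real pipeline.

```python
import base64
import json
from typing import Any, Dict


def generate_ad_audio(event_details: Dict[str, Any]) -> bytes:
    """Placeholder for the real pipeline; returns fake MP3-like bytes."""
    return b"ID3" + json.dumps(event_details, sort_keys=True).encode()


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Lambda entry point for an API Gateway proxy integration."""
    try:
        details = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "invalid JSON"}),
        }
    audio = generate_ad_audio(details)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "audio/mpeg"},
        # Binary payloads are base64-encoded in the proxy response.
        "isBase64Encoded": True,
        "body": base64.b64encode(audio).decode("ascii"),
    }
```

Depending on the API Gateway type, binary media types may also need to be configured so the client receives decoded audio rather than base64 text.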

Future Plans

While our hackathon prototype demonstrates the exciting potential of voice-enabled ad generation, there are several avenues for future improvements:

  • Enhanced Audio Processing: Refining the audio mixing algorithms to further improve sound quality.
  • Auto-Scaling: Exploring Kubernetes or additional AWS services for even greater scalability.
  • CI/CD Pipelines: Establishing robust automated testing and deployment pipelines.
  • Expanded Personalization: Deepening integration with Spotify and other data sources for more targeted ad creation.

Conclusion

The Event Advert API shows how asynchronous programming and a serverless architecture can support voice-enabled applications. By combining FastAPI, asyncio, LMNT’s TTS, and pydub, we built a system that generates personalized audio ads on demand. I’m proud of what our team, Team 13 at Measure of Music 2025, accomplished over a single weekend, and I look forward to the continued evolution of this project.

Happy coding, and may your APIs always be fast and your audio crystal clear!