Prescreening Assignment for Engineer Role
Thank you for your interest in joining our team. As part of our prescreening process, we'd like you to complete a role-specific assignment based on a real project.
Please complete the section corresponding to the role you are applying for:
- AI Engineer (Job ID: ELV2025-ML)
- Embedded Systems Engineer (Job ID: ELV2025-EM)
You have 5 days to complete this assignment. Please submit your responses in a PDF document.
-------------------------------------------------------------------------------------------------------------
Prescreening Interview Assignment for AI Engineer Role (Job ID: ELV2025-ML)
Role Overview: As an AI Engineer on this project, you will focus on developing and fine-tuning machine learning models for wake word detection, voice cloning, and integrating search functionalities. This prescreening assignment evaluates your expertise in building custom ML models, preparing datasets, fine-tuning LLMs, and understanding search/indexing mechanisms. The assignment should take 4-6 hours to complete. Submit your responses as a PDF report including code snippets, explanations, and any diagrams.
Assignment Tasks:
- Wake Word Detection Model Design (40% weight):
Design a custom neural network architecture for wake word detection in a voice assistant. Assume the wake word is "Hey Assistant."
- Describe the model architecture (e.g., using CNNs, RNNs, or transformers) and explain why you chose it for low-latency, real-time audio processing.
- Outline how you would train this model, including data augmentation techniques to handle variations in accents, noise, and environments.
- Provide pseudocode or a high-level Python snippet (using libraries like PyTorch or TensorFlow) for the model's forward pass and loss function. Discuss potential metrics for evaluation (e.g., precision, recall, false positive rate).
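As a reference point for the expected level of detail, here is a minimal framework-agnostic sketch (in NumPy, standing in for PyTorch/TensorFlow) of a 1D-CNN forward pass over log-mel frames with a binary cross-entropy loss. The layer sizes and input shape are illustrative assumptions, not a prescribed architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: 40 mel bins x 100 frames (~1 s of audio at a 10 ms hop).
N_MELS, N_FRAMES, N_FILTERS, KERNEL = 40, 100, 8, 5

# Randomly initialized weights stand in for trained parameters.
conv_w = rng.normal(0, 0.1, size=(N_FILTERS, N_MELS, KERNEL))
conv_b = np.zeros(N_FILTERS)
fc_w = rng.normal(0, 0.1, size=N_FILTERS)
fc_b = 0.0

def forward(mel):
    """mel: (N_MELS, N_FRAMES) log-mel spectrogram -> wake-word probability."""
    T = mel.shape[1] - KERNEL + 1
    # 1D convolution across time ("valid" padding), then ReLU.
    conv = np.empty((N_FILTERS, T))
    for f in range(N_FILTERS):
        for t in range(T):
            conv[f, t] = np.sum(conv_w[f] * mel[:, t:t + KERNEL]) + conv_b[f]
    relu = np.maximum(conv, 0.0)
    pooled = relu.mean(axis=1)            # global average pool over time
    logit = pooled @ fc_w + fc_b          # single wake/not-wake logit
    return 1.0 / (1.0 + np.exp(-logit))   # sigmoid probability

def bce_loss(p, y, eps=1e-7):
    # Binary cross-entropy: the natural loss for a wake/not-wake decision.
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

p = forward(rng.normal(size=(N_MELS, N_FRAMES)))
loss = bce_loss(p, y=1.0)
```

A small convolution over time like this keeps the receptive field short, which is what makes streaming, low-latency inference practical on-device.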
- Fine-Tuning LLM for Voice Cloning (30% weight):
Explain how you would fine-tune a pre-trained speech model for voice cloning in the context of generating audio responses (e.g., a TTS model such as Tacotron; note that Whisper is an ASR model and would serve transcription rather than synthesis).
- Detail the steps to prepare a custom dataset: How would you collect, preprocess, and annotate audio samples for cloning a specific voice? Include considerations for ethical data sourcing and diversity (e.g., multiple speakers, languages).
- Describe the fine-tuning process, including hyperparameters (e.g., learning rate, batch size) and techniques to avoid overfitting.
- How would you integrate this with a search engine output to convert text results into cloned voice audio?
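To illustrate the dataset-preparation step above, here is a hedged pure-Python sketch that builds a training manifest from hypothetical (audio path, transcript, speaker) records with a speaker-stratified train/validation split; the record fields, split ratio, and hyperparameter values are illustrative assumptions only:

```python
import random

# Hypothetical records: (audio_path, transcript, speaker_id).
records = [
    (f"clips/{spk}_{i:03d}.wav", f"utterance {i}", spk)
    for spk in ("spk_a", "spk_b", "spk_c")
    for i in range(20)
]

def make_manifest(records, val_fraction=0.2, seed=42):
    """Split per speaker so every voice appears in both train and val;
    this makes overfitting to any single speaker visible on validation."""
    rng = random.Random(seed)
    by_speaker = {}
    for rec in records:
        by_speaker.setdefault(rec[2], []).append(rec)
    train, val = [], []
    for spk_recs in by_speaker.values():
        rng.shuffle(spk_recs)
        n_val = max(1, int(len(spk_recs) * val_fraction))
        val.extend(spk_recs[:n_val])
        train.extend(spk_recs[n_val:])
    return train, val

train, val = make_manifest(records)

# Illustrative fine-tuning hyperparameters (assumed, not prescribed):
hparams = {"learning_rate": 1e-5, "batch_size": 16,
           "max_epochs": 10, "early_stopping_patience": 2}
```

Early stopping on the held-out split, together with a conservative learning rate, is one common guard against overfitting a small cloning dataset.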
- Search and Indexing Knowledge (30% weight):
Describe how search and indexing work in a voice-assisted search engine.
- Explain the role of inverted indexes, vector embeddings (e.g., using FAISS or Pinecone), and relevance ranking (e.g., BM25 or semantic search with BERT-like models).
- Propose how to index audio/text data for fast retrieval in response to voice queries, including handling multimodal data (audio + text).
- Provide a simple example: Sketch a system diagram showing query processing from audio input to indexed search and audio output, highlighting potential bottlenecks.
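As a toy illustration of the inverted-index idea above, the following pure-Python sketch indexes a few hypothetical voice-query transcripts and ranks them with TF-IDF (a simplified stand-in for BM25; the documents and query are invented for illustration):

```python
import math
from collections import defaultdict

docs = {
    0: "weather forecast for tomorrow",
    1: "play jazz music playlist",
    2: "tomorrow calendar events and weather",
}

# Build the inverted index: term -> {doc_id: term frequency}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term in text.split():
        index[term][doc_id] = index[term].get(doc_id, 0) + 1

def search(query, k=2):
    """Score documents by TF-IDF, a simplified stand-in for BM25."""
    n_docs = len(docs)
    scores = defaultdict(float)
    for term in query.split():
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n_docs / len(postings)) + 1.0  # rarer terms weigh more
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=scores.get, reverse=True)[:k]

results = search("weather tomorrow")
```

A production system would typically combine such a lexical index with vector embeddings (e.g., via FAISS) and fuse the two rankings, trading recall against latency.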
Submission Guidelines:
- Include references to any papers, tools, or frameworks you mention (e.g., Kaldi for ASR).
- Emphasize trade-offs between accuracy, latency, and resource usage.
- We value clear reasoning over perfect code—focus on problem-solving.
Evaluation Criteria:
- Technical depth and relevance to voice AI.
- Creativity in handling real-world challenges like noisy inputs.
- Clarity and structure of your report.
---------------------------------------------------------------------------------------------------------------
Prescreening Interview Assignment for Embedded Systems Engineer Role (Job ID: ELV2025-EM)
Role Overview: As an Embedded Systems Engineer, you will optimize the end-to-end audio pipeline for low latency in a voice assistant search engine, integrating protocols like RTSP and audio codecs. This prescreening assignment assesses your skills in audio processing, optimization, and embedded hardware/software integration. The assignment should take 4-6 hours. Submit your responses as a PDF report with code snippets, diagrams, and explanations.
Assignment Tasks:
- Audio Pipeline Optimization for Latency (40% weight):
Design an end-to-end audio pipeline for a voice assistant that processes input audio, detects wake words, queries a search engine, and outputs audio responses with minimal latency (target: <500ms round-trip).
- Map out the pipeline stages: Audio capture, preprocessing, transmission, processing, and output.
- Explain optimization techniques (e.g., buffering strategies, multi-threading, or hardware acceleration like DSPs). Discuss how to measure and reduce latency at each stage.
- Provide a high-level C/C++ or Python snippet for a low-latency audio buffer implementation, assuming an embedded platform like a Raspberry Pi or an ARM-based MCU.
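As a reference for the buffer sub-task, here is a hedged Python sketch of a single-producer/single-consumer ring buffer (standing in for what would be a preallocated C array on the target MCU); the capacity and drop-on-full policy are illustrative design choices, not requirements:

```python
class RingBuffer:
    """Single-producer/single-consumer audio sample ring buffer.
    Capacity must be a power of two so index wrap is a cheap bitmask,
    a common trick in low-latency embedded audio code."""

    def __init__(self, capacity=1024):
        assert capacity & (capacity - 1) == 0, "capacity must be a power of two"
        self.buf = [0] * capacity
        self.mask = capacity - 1
        self.head = 0  # advanced only by the producer (capture ISR/thread)
        self.tail = 0  # advanced only by the consumer (processing thread)

    def available(self):
        return self.head - self.tail

    def push(self, sample):
        if self.available() > self.mask:   # buffer full: drop the new sample,
            return False                   # since dropping beats blocking for latency
        self.buf[self.head & self.mask] = sample
        self.head += 1
        return True

    def pop(self):
        if self.available() == 0:
            return None
        sample = self.buf[self.tail & self.mask]
        self.tail += 1
        return sample

rb = RingBuffer(8)
for s in range(10):        # push 10 samples into an 8-slot buffer
    rb.push(s)
drained = []
while (s := rb.pop()) is not None:
    drained.append(s)
```

Because each index is written by exactly one side, this structure needs no lock on platforms where the index writes are atomic, which is what keeps the capture path deterministic.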
- Integration of RTSP and Audio Codecs (30% weight):
Describe how to use RTSP (Real-Time Streaming Protocol) for streaming audio inputs/outputs in this system.
- Compare RTSP with alternatives like WebRTC or raw RTP (noting that RTSP itself typically delegates media transport to RTP), and justify its use for low-latency voice assistance.
- Select and explain two audio codecs (e.g., Opus, AAC) suitable for this project: Discuss compression ratios, bitrate impacts on latency, and compatibility with embedded devices.
- Outline a setup for encoding/decoding audio streams: Provide pseudocode for integrating a codec library (e.g., FFmpeg or libopus) with RTSP in an embedded environment.
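To give a feel for the codec-integration arithmetic, here is a hedged Python sketch that chops PCM into fixed-duration frames (20 ms, Opus's default frame duration) with a stubbed encoder, and works out the per-packet payload size at a given bitrate; the sample rate, bitrate, and zero-padding of the final frame are illustrative assumptions:

```python
SAMPLE_RATE = 16_000      # Hz, a typical rate for voice
FRAME_MS = 20             # Opus's default frame duration
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000   # 320 samples

def packetize(pcm):
    """Chop a PCM sample list into fixed-size codec frames,
    zero-padding the final partial frame (a common simplification).
    A real pipeline would hand each frame to libopus/FFmpeg here."""
    frames = []
    for start in range(0, len(pcm), SAMPLES_PER_FRAME):
        frame = pcm[start:start + SAMPLES_PER_FRAME]
        frame = frame + [0] * (SAMPLES_PER_FRAME - len(frame))
        frames.append(frame)
    return frames

# One second of (silent) audio -> 50 frames of 20 ms each.
frames = packetize([0] * SAMPLE_RATE)

# Latency bookkeeping: algorithmic delay is at least one frame duration, so
# smaller frames cut latency but raise per-packet header overhead.
# At an assumed 24 kbit/s, each 20 ms packet carries roughly:
payload_bytes = 24_000 * FRAME_MS // 1000 // 8   # 60 bytes of codec payload
```

This frame-duration-versus-overhead trade-off is exactly the lever a candidate should discuss when picking codec settings for the <500 ms round-trip target.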
- Embedded System Considerations and Dataset Preparation Support (30% weight):
Address embedded constraints in the voice assistant project.
- How would you optimize the pipeline for resource-limited hardware (e.g., memory footprint <100MB, power efficiency)? Include trade-offs for running ML models (e.g., wake word detection) on-device vs. edge/cloud.
- Briefly describe how you'd assist in preparing a custom dataset for audio testing: Focus on tools for recording, formatting (e.g., WAV to compressed formats), and simulating real-time streams on embedded setups.
- Draw a block diagram of the full system, showing hardware interfaces (e.g., microphones, speakers) and software layers.
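For the memory-footprint sub-task, a quick worked example of the kind of budget arithmetic expected, using an assumed parameter count for a small on-device wake-word model (the numbers are illustrative, not a specification):

```python
def model_footprint_mb(n_params, bytes_per_weight):
    """Weight storage only; activations and buffers come on top."""
    return n_params * bytes_per_weight / (1024 ** 2)

N_PARAMS = 2_000_000   # hypothetical small wake-word CNN

fp32_mb = model_footprint_mb(N_PARAMS, 4)   # 32-bit floats
int8_mb = model_footprint_mb(N_PARAMS, 1)   # 8-bit quantized weights

# Both fit the <100 MB budget, but int8 quantization leaves ~4x more
# headroom for audio buffers, the network stack, and the codec,
# at the cost of some detection accuracy.
```

Candidates would be expected to extend this with activation memory and to weigh the quantized on-device model against offloading detection to edge/cloud.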
Submission Guidelines:
- Reference embedded tools or libraries (e.g., GStreamer for pipelines, ALSA for audio I/O).
- Prioritize practical, implementable solutions over theoretical ones.
- Highlight safety and reliability, such as error handling in streams.
Evaluation Criteria:
- Proficiency in real-time systems and optimization.
- Understanding of protocols and codecs in embedded contexts.
- Practicality and clarity in your explanations.