Junior Data Analyst
Posted on March 2, 2025 by MEDIA METER
- Quezon City, Philippines
- Php23,000 - Php33,000 per month
- Full Time

As a Junior Data Analyst, you will handle the data preprocessing phase of our project to fine-tune the Whisper model for Taglish and other languages. Your responsibilities will include collecting, organizing, cleaning, and preparing high-quality multilingual data. You will work closely with the machine learning team to ensure that the data meets the standards required for effective model training.
Key Responsibilities:
Data Collection and Organization:
- Gather raw audio files in various formats (e.g., MP3, WAV, FLAC) from diverse sources such as interviews, podcasts, and YouTube videos.
- Organize files into a structured directory hierarchy, ensuring a clear and consistent file naming convention.
Audio Preprocessing:
- Convert audio files to the required format (16 kHz, mono, 16-bit signed integer WAV) using tools like FFmpeg.
- Transcribe audio files, either manually or through a transcription service, and store text files with corresponding filenames.
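
For candidates unfamiliar with the conversion step, here is a minimal sketch of how it might be scripted from Python. It assumes FFmpeg is installed and on the PATH; the function names and file paths are hypothetical and not part of any existing project tooling.

```python
import subprocess


def build_ffmpeg_command(src, dst):
    """Build an FFmpeg command for 16 kHz, mono, 16-bit signed PCM WAV output."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", src,            # input file (MP3, WAV, FLAC, ...)
        "-ac", "1",           # one audio channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # 16-bit signed little-endian PCM
        dst,
    ]


def convert(src, dst):
    """Run the conversion; raises CalledProcessError if FFmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Calling `convert("raw/ep01.mp3", "wav/ep01.wav")` would write the converted file, provided the output directory already exists.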
Data Cleaning and Normalization:
- Clean and normalize text data to address spelling variations, punctuation issues, and formatting inconsistencies.
- Standardize abbreviations and contractions, and remove special characters or unnecessary symbols.
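
As an illustration of the kind of cleaning involved, the sketch below normalizes a single transcript line. The abbreviation table and the choice to lowercase and strip punctuation are assumptions made for the example; the actual rules would be agreed with the machine learning team.

```python
import re
import unicodedata

# Hypothetical abbreviation table; a real project would maintain its own list.
ABBREVIATIONS = {"dr.": "doctor", "brgy.": "barangay"}


def normalize_transcript(text, abbreviations=ABBREVIATIONS):
    """Lowercase, expand abbreviations, and strip stray symbols from a line."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = text.replace("\u2019", "'")  # curly apostrophe to straight
    # Expand abbreviations token by token before punctuation is removed.
    words = [abbreviations.get(word, word) for word in text.split()]
    text = " ".join(words)
    # Keep letters, digits, whitespace, apostrophes, and hyphens only.
    text = re.sub(r"[^\w\s'-]", " ", text)
    # Collapse the runs of whitespace left behind by the substitution.
    return re.sub(r"\s+", " ", text).strip()
```

Apostrophes and hyphens are kept here because Taglish contractions and affixed loanwords often depend on them; that is a design choice for this sketch, not a stated project requirement.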
Data Segmentation and Labeling:
- Split lengthy audio recordings into smaller, manageable segments.
- Create and maintain a metadata file that maps audio files to their corresponding transcriptions and alignment details.
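
One possible shape for such a metadata file is a CSV with one row per audio segment. The sketch below is only an example: the column names (`audio_path`, `transcript`, `start_sec`, `end_sec`) are hypothetical, and the real schema may differ.

```python
import csv

# Hypothetical column layout: one row per audio segment.
FIELDS = ["audio_path", "transcript", "start_sec", "end_sec"]


def write_metadata(rows, handle):
    """Write segment rows as CSV, rejecting segments with invalid time ranges."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        if row["end_sec"] <= row["start_sec"]:
            raise ValueError(f"invalid time range for {row['audio_path']}")
        writer.writerow(row)
```

The `handle` argument can be any text file object, for example one opened with `open("metadata.csv", "w", newline="")`.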
Quality Assurance and Validation:
- Conduct thorough quality checks to validate the dataset for accuracy, consistency, and completeness.
- Identify and resolve issues in the audio and text data, such as misalignments or incorrect transcriptions.
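
A simple completeness check of this kind could look like the sketch below, which compares audio and transcript files by base filename. It assumes the pairing convention described above (matching filenames); the function name is hypothetical.

```python
from pathlib import PurePath


def find_unpaired(audio_files, transcript_files):
    """Return (audio without transcript, transcript without audio) by file stem."""
    audio_stems = {PurePath(path).stem for path in audio_files}
    text_stems = {PurePath(path).stem for path in transcript_files}
    return sorted(audio_stems - text_stems), sorted(text_stems - audio_stems)
```

Both returned lists being empty means every audio file has a matching transcript and vice versa.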
Data Analysis and Reporting:
- Use data analysis techniques to evaluate dataset health and completeness.
- Provide regular reports on data collection progress, challenges, and recommendations for improvements.
Collaboration and Communication:
- Work closely with the machine learning team to address any data-related issues.
- Provide regular updates on data collection and preprocessing progress.
Qualifications:
● Strong Proficiency in Python: Experience with data manipulation, cleaning, and preprocessing using Python libraries such as Pandas, NumPy, and TensorFlow.
● Data Cleaning and Preprocessing: Proven ability to clean, organize, and preprocess data for machine learning applications.
● NLP Knowledge: Familiarity with natural language processing techniques, including text normalization and handling multilingual or code-mixed data.
● SQL Skills: Experience with SQL for data querying and management.
● Problem-Solving Skills: Ability to identify and solve complex data-related problems with creativity and efficiency.
● Work Under Pressure: Capable of handling multiple tasks simultaneously and meeting deadlines in a fast-paced environment.
● Adaptability: Willingness to learn new tools and techniques as needed for the project.
● Attention to Detail: Meticulous attention to detail to ensure data accuracy and integrity.
● Communication Skills: Excellent communication skills to collaborate effectively with cross-functional teams.
Desired Skills:
● Familiarity with audio processing tools like FFmpeg.
● Familiarity with transcription tools and alignment software (e.g., Aeneas, Gentle).
● Knowledge of Taglish language nuances and variations.
● Experience with version control systems like Git.
● Familiarity with code-mixing or multilingual NLP techniques.
Job Types: Full-time, Permanent
Pay: Php23,000.00 - Php33,000.00 per month
Benefits:
- Health insurance
Schedule:
- 8 hour shift
Supplemental Pay:
- 13th month salary
Education:
- Bachelor's (Preferred)
Experience:
- Web Development: 1 year (Preferred)
Advertised until:
April 1, 2025