Nguyen Van Anh Tuan

Research Engineer

About Me

Hi, I'm Tuan Nguyen — a Research Engineer with a passion for bridging theory and practice in AI and speech technology. I currently work at a research agency in Singapore, focusing on multilingual, low-resource, and code-switching speech recognition for 11 Southeast Asian languages.

I earned my degree early from a 4-year Data Science program in Vietnam, and have since published multiple first-author papers at top-tier conferences like Interspeech. My research interests lie in building robust, multilingual models and deploying efficient AI systems for real-world use.

I’m aiming for long-term growth in both academia and industry. My goal is to pursue a Ph.D. at a top global university and then work as a Research Scientist on interesting, high-reward projects.

Work Experience

Research Engineer

I2R, A*STAR, Singapore

Aug 2024 - Present

  • Developing state-of-the-art multilingual & code-switching end-to-end ASR for 11 Southeast Asian languages.
  • Training large-scale bilingual & code-switching models on 42,000 hours of speech data.
  • Conducting applied research to improve ASR for multilingual & code-switching cases, especially low-resource languages.
  • Coordinating team members to deliver high-quality ASR models.
  • Guiding interns in data collection, processing, and experimenting with TTS/Speech-LLMs for code-switching ASR.

Research Intern

I2R, A*STAR, Singapore

July 2023 - April 2024

  • Worked on a project leveraging GANs to improve ASR robustness in noisy environments.
  • Researched and developed multilingual ASR models for Southeast Asian languages.
  • Researched and developed Speech Enhancement models for extremely noisy environments (e.g., telecommunication).

AI Engineer Freelance

Upwork

Feb 2023 - June 2023

  • Assessed translation outputs using TransRepair for myLanguage.
  • Fine-tuned Whisper for Indian English ASR and built APIs.
  • Developed Alpaca 7B prompts for task classification and summarization.
  • Deployed Vicuna models (7B/13B) for domain-specific tasks in sales and real estate.
  • Adapted GPT-J 6B for Vietnamese question answering with BERT-based PDF interaction.

ML Researcher

EduplaX, Vietnam

Aug 2021 - Feb 2023

  • Developed AI features, including adaptive testing and English speech recognition.
  • Built RESTful APIs and frontends for new feature integration and AI model deployment.

Publications

Tuan Nguyen⭐, Tran Huy Dat. "Can we train ASR systems on Code-switch without real code-switch data? Case study for Singapore's languages." Interspeech 2025.

Tuan Nguyen⭐, Long-Vu Hoang⭐, Tran Huy Dat. "Acoustic scattering AI for non-invasive object classifications: A case study on hair assessment." Interspeech 2025.

Tuan Nguyen⭐, Tran Huy Dat. "LingWav2Vec2: Linguistic-augmented wav2vec 2.0 for Vietnamese Mispronunciation Detection." Interspeech 2024.

Duc-Tuan Truong⭐, Ruijie Tao, Tuan Nguyen, Hieu-Thi Luong, Kong Aik Lee, Eng Siong Chng. "Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection." Interspeech 2024.

Tuan Nguyen⭐, Trinh, T. B. B., Nguyen, T. T. H., Phan, L. H. V., Tran, N. B., Dinh, H. H. D., ... Nguyen, C. K. "Using Bayesian and Neyman-Pearson hypothesis testing for Autoencoder to detect anomalies in network security." JST - IUH 2023, Ho Chi Minh City, Vietnam. doi:10.46242/jstiuh.v61i07.4724

Tuan Nguyen⭐, Nguyen, T. T. H., Nguyen, T. D., Pham, M. T., Dao, D. T., Dang, T. P. "A novel approach for Vietnamese Speech Recognition using Conformer." FDSE 2022 (pp. 723–730), Singapore. doi:10.1007/978-981-19-8069-5_53

Projects

Automatic Pronunciation Error Detection (APED)

Implementation - Presentation | Aug 2022 - Dec 2022

Developed an AI system for automatic pronunciation error detection for non-native speakers learning English, using semi-supervised learning and the LCS algorithm and building on state-of-the-art published research in the field.

Reduced the Phoneme Error Rate (PER) from 17% (teacher model) to 12% (student model) on the phoneme recognition task using the Libri-Light and LibriSpeech datasets.
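For readers unfamiliar with the metric, PER is conventionally computed as the phoneme-level edit distance between the reference and the predicted sequence, divided by the reference length. The sketch below assumes that standard definition; it is not the project's evaluation code, and the function names and example phonemes are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference phonemes to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """PER = edit distance / number of reference phonemes."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Illustrative example only: one substitution (AH -> EH) and one insertion (W).
ref = ["HH", "AH", "L", "OW"]
hyp = ["HH", "EH", "L", "OW", "W"]
per = phoneme_error_rate(ref, hyp)
print(f"PER = {per:.2%}")
```

Under this definition, a drop from 17% to 12% means roughly five fewer phoneme errors per hundred reference phonemes.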

Key Learnings: ASR, Noisy Student Training, wav2vec2.0, Conformer, Self-Supervised Learning (SSL).

Skills

Tools & Technologies:

Scikit-learn, NLTK, SpaCy, PyTorch, PyTorch Lightning, HuggingFace, Langchain, OpenAI APIs, JavaScript, ReactJS, Flask, FastAPI, Django, Numpy, Pandas, Matplotlib, Plotly, Bokeh, PostgreSQL, MongoDB, C/C++

Languages:

Vietnamese (Native), English (Working professional)

Miscellaneous:

Academic research, Teaching, Teaching assistant, Training, Consultation

Awards & Achievements

First Prize - Vietnamese Mispronunciation Detection shared task at VLSP - AI Competition

2023

Third Prize - Euréka at IUH - Student Research at University level

2022

Top 20 - startup contest Inno Greenlife IUH - Data Analyst dashboard for teachers

2021

Second Prize - TDMU - Entropy Data Analytics contest - Hackathon in Data Science

2021

Consolation Prize - Vietnam Olympiad in Informatics (VOI) - programming contest

2020

Consolation Prize - ACM-ICPC Vietnam Asia Round - programming contest

2019

Education

B.Eng. in Data Science

Industrial University of Ho Chi Minh City (IUH), Vietnam

Aug 2019 - July 2024

GPA: 3.39/4.0

Vlogs

🎥 Dive into my journey as a Research Intern in Singapore!
Experience a day in my life, from lab work to city adventures.

Vlog: A Day in My Life as a Research Intern in Singapore 🇸🇬

Want more? Subscribe to my channel for future vlogs and insights!