Add Japanese and trilingual text normalization for numbers and symbols #18

Merged
5 commits merged into m5stack:dev on May 16, 2025

Conversation

yuyun2000
Contributor

Changes

Implemented Japanese text normalization module to handle numbers, symbols and special characters
Added trilingual (Chinese/Japanese/English) text normalization support
Created regex patterns for converting numbers and symbols into pronounceable text
Integrated the new normalization modules into the existing text processing pipeline
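
The regex-based conversion listed above can be illustrated with a minimal Python sketch. It is not the code from this PR: the function names (`int_to_kanji`, `normalize_ja`), the symbol table, and the choice of 零 for zero are all assumptions made for illustration. It spells ASCII digit runs as kanji numerals and replaces a few symbols with katakana readings.

```python
import re

# Hypothetical symbol table; the mappings actually used in the PR are not shown.
SYMBOLS = {"%": "パーセント", "+": "プラス", "&": "アンド"}

DIGITS = "零一二三四五六七八九"
SMALL_UNITS = ["", "十", "百", "千"]
BIG_UNITS = ["", "万", "億", "兆"]

NUMBER_RE = re.compile(r"[0-9]+")
SYMBOL_RE = re.compile("|".join(re.escape(s) for s in SYMBOLS))


def _four_digits(n: int) -> str:
    """Spell a value in 0..9999 with kanji numerals (empty string for 0)."""
    out = []
    for power in (3, 2, 1, 0):
        d = (n // 10 ** power) % 10
        if d == 0:
            continue
        # 一 is dropped before 十, 百, 千 (100 -> 百, not 一百).
        if d == 1 and power > 0:
            out.append(SMALL_UNITS[power])
        else:
            out.append(DIGITS[d] + SMALL_UNITS[power])
    return "".join(out)


def int_to_kanji(n: int) -> str:
    """Spell a non-negative integer below 10**16 in kanji numerals."""
    if n == 0:
        return DIGITS[0]
    groups = []
    idx = 0
    while n > 0:
        # Japanese groups digits in blocks of four (万, 億, 兆).
        n, chunk = divmod(n, 10000)
        if chunk:
            groups.append(_four_digits(chunk) + BIG_UNITS[idx])
        idx += 1
    return "".join(reversed(groups))


def normalize_ja(text: str) -> str:
    """Replace digit runs and known symbols with pronounceable text."""
    def repl_num(m: re.Match) -> str:
        s = m.group()
        # Very long runs and runs with a leading zero are read digit by digit.
        if len(s) > 16 or (len(s) > 1 and s[0] == "0"):
            return "".join(DIGITS[int(c)] for c in s)
        return int_to_kanji(int(s))

    text = NUMBER_RE.sub(repl_num, text)
    return SYMBOL_RE.sub(lambda m: SYMBOLS[m.group()], text)


print(normalize_ja("価格は1200円、割引は15%"))
# 価格は千二百円、割引は十五パーセント
```

A production normalizer would also need decimals, dates, counters, and currency symbols, none of which this sketch attempts.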

Why

This enhancement improves pronunciation accuracy when synthesizing Japanese content and multilingual text containing numbers and symbols, ensuring more natural-sounding speech output across all supported languages.

Testing

Verified correct normalization of various Japanese numerical expressions
Tested with mixed language text containing numbers and special symbols
Confirmed proper pronunciation of normalized text through synthesized audio output
Compared results against expected pronunciations in each language
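
Mixed-language checks like those described above could be organized as a small table of cases. The sketch below is an assumption about how such a test might look, not the test code used for this PR: `guess_language` is a hypothetical helper that picks which language's number reading to apply by looking at the scripts present in a segment.

```python
import re

KANA_RE = re.compile(r"[\u3040-\u30ff]")  # hiragana and katakana
HAN_RE = re.compile(r"[\u4e00-\u9fff]")   # CJK unified ideographs


def guess_language(segment: str) -> str:
    """Rough script-based guess: kana -> ja, Han only -> zh, otherwise en."""
    if KANA_RE.search(segment):
        return "ja"
    if HAN_RE.search(segment):
        return "zh"
    return "en"


# Table-driven cases: (input text, expected language tag).
CASES = [
    ("価格は1200円です", "ja"),
    ("价格是1200元", "zh"),
    ("The price is $1200", "en"),
]

for text, expected in CASES:
    assert guess_language(text) == expected, (text, expected)
```

This heuristic misclassifies Japanese segments written only in kanji as Chinese, so a real pipeline would need surrounding context or an explicit language tag.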

yuyun2000 and others added 5 commits May 15, 2025 15:17
Streamline and simplify code in the SOLA module for improved readability and maintenance
Refactor SOLA component code
Implement regex-based text normalization functionality to support trilingual (CJE) content processing
Add text normalization for Chinese, Japanese, and English
@Abandon-ht Abandon-ht merged commit e479b19 into m5stack:dev May 16, 2025