研究人員｜ Research Faculty

王新民 Hsin-Min Wang
副研究員 Associate Research Fellow
Ph.D., Electrical Engineering, National Taiwan University
Tel: +886-2-2788-3799 ext. 1714
Fax: +886-2-2782-4814
Email: whm@iis.sinica.edu.tw
http://www.iis.sinica.edu.tw/pages/whm

● Associate Research Fellow, IIS, Academia Sinica (2002 - )
● Assistant Research Fellow, IIS, Academia Sinica (1996 - 2002)
● Postdoctoral Fellow, IIS, Academia Sinica (1995 - 1996)
● Ph.D., EE, National Taiwan University (1995)
● B.S., EE, National Taiwan University (1989)
● Technical Paper Award, The Chinese Institute of Engineers (1995)
● Editorial board member, International Journal of Computational Linguistics and Chinese Language Processing (2004 - )

研究簡介 Research Description

Our research interests include speech processing, natural language processing, multimedia information retrieval, and pattern recognition.

Communicating with computers using speech has been a dream of many people since the invention of computers. Progress towards realizing this dream has been slow but steady through the development of systems supporting voice commands, dictation, text-to-speech synthesis, and human-computer spoken dialogue. Technologies such as speech recognition, speech synthesis, language understanding, and dialogue management are crucial to the development of human-computer speech interfaces. Our research has focused mainly on speech recognition, speech synthesis, and speaker recognition. Recent achievements include a minimum-boundary-error-based discriminative acoustic model training and decoding framework for automatic phoneme segmentation, and a novel characterization of the alternative hypothesis using kernel discriminant analysis for LLR-based speaker verification. Our speaker verification system was ranked 2nd out of 6 participants in the ISCSLP2006 speaker recognition evaluation.

With the rapid advance of multimedia and internet technology, many digital library projects worldwide are investigating how multimedia digital libraries can be established and used. We have been studying audio segmentation, clustering, automatic speech recognition, summarization, indexing, and retrieval of Mandarin broadcast news for several years, and have developed several basic technologies as well as prototype retrieval systems. Our recent achievements include a new divide-and-conquer BIC-based method for audio segmentation, a new method based on a voice characteristic reference space and maximum purity estimation for speaker clustering, and a probabilistic generative framework for extractive spoken document summarization. More recently, we have extended our studies to music content analysis and information retrieval, focusing mainly on query by singing/humming and solo vocal modeling. Our future plans include further improvement of speech and music information retrieval technology.

代表著作 Publications

1. H. M. Wang, T. H. Ho, R. C. Yang, J. L. Shen, B. R. Bai, J. C. Hong, W. P. Chen, T. L. Yu, and L. S. Lee, “Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data,” IEEE Trans. on Speech and Audio Processing, 5(2), pp. 195-200, March 1997.
2. H. M. Wang, “Statistical analysis of Mandarin acoustic units and automatic extraction of phonetically rich sentences based upon a very large Chinese text corpus,” International Journal of Computational Linguistics and Chinese Language Processing, 3(2), pp. 93-114, August 1998.
3. J. L. Shen, H. M. Wang, R. Y. Lyu, and L. S. Lee, “Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition,” Computer Speech and Language, 13(1), pp. 79-97, January 1999.
4. L. F. Chien, H. M. Wang, B. R. Bai and S. C. Lin, “A spoken access approach for Chinese text and speech information retrieval,” Journal of the American Society for Information Science, 51(4), pp. 313-323, February 2000.
5. B. R. Bai, B. Chen, and H. M. Wang, “Syllable-based Chinese text/spoken document retrieval using text/speech queries,” International Journal of Pattern Recognition and Artificial Intelligence, 14(5), pp. 603-616, August 2000.
6. H. M. Wang, “Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese,” Speech Communication, 32(1-2), pp. 49-60, September 2000.
7. H. M. Wang and B. Chen, “Content-based language models for spoken document retrieval,” International Journal of Computer Processing of Oriental Languages, 14(2), pp. 193-209, June 2001.
8. B. S. Lin, H. M. Wang, and L. S. Lee, “A distributed agent architecture for intelligent multi-domain spoken dialogue systems,” IEICE Trans. on Information and Systems, E84-D(9), pp. 1217-1230, September 2001.
9. B. S. Lin, B. Chen, H. M. Wang, and L. S. Lee, “A hierarchical tag-graph search scheme with layered grammar rules for spontaneous speech understanding,” Pattern Recognition Letters, 23(7), pp. 819-831, May 2002.
10. B. Chen, H. M. Wang, and L. S. Lee, “Discriminating capabilities of syllable-based features and approaches of utilizing them for voice retrieval of speech information in Mandarin Chinese,” IEEE Trans. on Speech and Audio Processing, 10(5), pp. 303-314, July 2002.
11. H. Meng, B. Chen, S. Khudanpur, G. A. Levow, W. K. Lo, D. Oard, P. Schone, K. Tang, H. M. Wang, and J. Q. Wang, “Mandarin English Information (MEI): Investigating translingual speech retrieval,” Computer Speech and Language, 18(2), pp. 163-179, April 2004.
12. H. M. Wang, S. S. Cheng, Y. C. Chen, “The SoVideo Mandarin Chinese broadcast news retrieval system,” International Journal of Speech Technology, 7(2-3), pp. 189-202, April-July 2004.
13. B. Chen, H. M. Wang, and L. S. Lee, “A discriminative HMM/n-gram-based retrieval approach for Mandarin spoken documents,” ACM Trans. on Asian Language Information Processing, 3(2), pp. 128-145, June 2004.
14. W. H. Tsai, D. Rodgers, and H. M. Wang, “Blind clustering of popular music recordings based on singer voice characteristics,” Computer Music Journal, 28(3), pp. 68-78, Fall 2004.
15. S. S. Cheng, H. M. Wang, and H. C. Fu, “A model-selection-based self-splitting Gaussian mixture learning with application to speaker identification,” EURASIP Journal on Applied Signal Processing, 2004(17), pp. 2626-2639, December 2004.
16. H. M. Wang, B. Chen, J. W. Kuo, and S. S. Cheng, “MATBN: A Mandarin Chinese broadcast news corpus,” International Journal of Computational Linguistics and Chinese Language Processing, 10(2), pp. 219-236, June 2005.
17. W. H. Tsai and H. M. Wang, “On the extraction of vocal-related information to facilitate the management of popular music collections,” IEEE/ACM Joint Conference on Digital Libraries (JCDL2005), Denver, USA, June 2005.
18. C. Y. Tseng, S. H. Pin, Y. Lee, H. M. Wang, Y. C. Chen, “Fluent speech prosody: framework and modeling,” Speech Communication, 46(3-4), pp. 284-309, July 2005.
19. W. H. Tsai and H. M. Wang, “Automatic singer recognition of popular music recordings via estimation and modeling of singer vocal signal,” IEEE Trans. on Audio, Speech, and Language Processing, 14(1), pp. 330-341, January 2006.
20. W. H. Tsai and H. M. Wang, “Speech utterance clustering based on the maximization of within-cluster homogeneity of speaker voice characteristics,” The Journal of the Acoustical Society of America, 120(3), pp. 1631-1645, September 2006.
21. W. H. Tsai, S. S. Cheng, and H. M. Wang, “Automatic speaker clustering using a voice characteristic reference space and maximum purity estimation,” IEEE Trans. on Audio, Speech and Language Processing, 15(4), pp. 1461-1474, May 2007.
22. W. H. Tsai and H. M. Wang, “Automatic identification of the sung language in popular music recordings,” Journal of New Music Research, 36(2), pp. 105-114, June 2007.
23. Y. H. Chao, W. H. Tsai, H. M. Wang, and R. C. Chang, “Using kernel discriminant analysis to improve the characterization of the alternative hypothesis for speaker verification,” IEEE Trans. on Audio, Speech and Language Processing, 16(8), pp. 1675-1684, November 2008.
24. H. M. Yu, W. H. Tsai, and H. M. Wang, “A query-by-singing system for retrieving karaoke music,” IEEE Trans. on Multimedia, 10(8), pp. 1626-1637, December 2008.
25. Y. T. Chen, B. Chen, and H. M. Wang, “A probabilistic generative framework for extractive broadcast news speech summarization,” IEEE Trans. on Audio, Speech and Language Processing, 17(1), pp. 95-106, January 2009.
26. W. H. Tsai and H. M. Wang, “Evolutionary minimization of the rand index for speaker clustering,” Computer Speech and Language, 23(2), pp. 165-175, April 2009.
27. S. S. Cheng, H. C. Fu, and H. M. Wang, “Model-based clustering by probabilistic self-organizing maps,” IEEE Trans. on Neural Networks, 20(5), pp. 805-826, May 2009.
28. Y. H. Chao, W. H. Tsai, H. M. Wang, and R. C. Chang, “Improving the characterization of the alternative hypothesis via minimum verification error training with applications to speaker verification,” Pattern Recognition, 42(7), pp. 1351-1360, July 2009.
29. Y. H. Chao, W. H. Tsai, and H. M. Wang, “Improving GMM-UBM speaker verification using discriminative feedback adaptation,” Computer Speech and Language, 23(3), pp. 376-388, July 2009.
30. S. S. Cheng, H. M. Wang, and H. C. Fu, “BIC-based speaker segmentation using divide-and-conquer strategies with application to speaker diarization,” IEEE Trans. on Audio, Speech, and Language Processing, 18(1), pp. 141-157, January 2010.