Prosodic structure generation method based on prosodic phrases

Abstract

The invention provides a novel method for generating hierarchical prosodic structure boundaries based on prosodic phrases. The method combines machine learning with rules and substantially improves the accuracy of prosodic structure boundary prediction for Chinese text. Given input that has already undergone word segmentation and part-of-speech tagging, the method first identifies prosodic phrase boundaries, then uses that prosodic phrase boundary information to generate prosodic word boundaries, and finally applies a set of manually written rules to correct the output of the system as a whole. For prosodic phrase and prosodic word boundary decisions, features are designed and selected separately for each level, feature templates are built, and the maximum entropy algorithm is used to train a prosodic phrase model and a prosodic word model, which serve the two stages of boundary identification respectively. In addition, to address the errors the maximum entropy models make during identification, an error-driven rule learning method selects the optimal rules, further improving accuracy. On this basis, the invention proposes a hierarchical prosodic structure generation method based on prosodic phrases that effectively improves the accuracy of prosodic structure prediction and the naturalness of synthesized speech.
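The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the feature template, the PKU-style part-of-speech tags, the toy corpus and its boundary labels, and the final consistency rule are all invented for the example. The binary maximum entropy model is trained here with plain stochastic gradient ascent (equivalent to logistic regression) rather than whatever training procedure the invention uses, and the error-driven rule learning step is left out.

```python
import math

def features(tokens, i, phrase_breaks=None):
    """Hypothetical feature template for the boundary after tokens[i].

    tokens is a list of (word, pos) pairs. When phrase_breaks is given,
    the stage-1 decision at this position is added as a feature.
    """
    word, pos = tokens[i]
    next_word, next_pos = tokens[i + 1]
    feats = [
        "bias",
        f"pos={pos}",
        f"next_pos={next_pos}",
        f"pos_bigram={pos}+{next_pos}",
        f"len={len(word)}",
        f"next_len={len(next_word)}",
    ]
    if phrase_breaks is not None:
        feats.append(f"phrase_break={phrase_breaks[i]}")
    return feats

class BinaryMaxEnt:
    """Binary maximum entropy classifier over string-valued features."""

    def __init__(self):
        self.w = {}

    def prob(self, feats):
        z = sum(self.w.get(f, 0.0) for f in feats)
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, samples, epochs=300, lr=0.2):
        # Stochastic gradient ascent on the conditional log-likelihood.
        for _ in range(epochs):
            for feats, label in samples:
                grad = label - self.prob(feats)
                for f in feats:
                    self.w[f] = self.w.get(f, 0.0) + lr * grad
        return self

    def predict(self, feats):
        return int(self.prob(feats) >= 0.5)

def train(corpus):
    """corpus: list of (tokens, phrase_breaks, word_breaks) triples."""
    pp_samples, pw_samples = [], []
    for tokens, pp, pw in corpus:
        for i in range(len(tokens) - 1):
            pp_samples.append((features(tokens, i), pp[i]))
            # Stage 2 sees the prosodic phrase boundary as extra evidence.
            pw_samples.append((features(tokens, i, pp), pw[i]))
    return BinaryMaxEnt().fit(pp_samples), BinaryMaxEnt().fit(pw_samples)

def predict(tokens, pp_model, pw_model):
    n = len(tokens) - 1
    # Stage 1: prosodic phrase boundaries.
    pp = [pp_model.predict(features(tokens, i)) for i in range(n)]
    # Stage 2: prosodic word boundaries, conditioned on stage 1.
    pw = [pw_model.predict(features(tokens, i, pp)) for i in range(n)]
    # Illustrative hand-written rule (an assumption): every prosodic
    # phrase boundary is also a prosodic word boundary.
    pw = [1 if p else w for p, w in zip(pp, pw)]
    return pp, pw

# Toy corpus with invented labels; entry i marks a boundary after token i.
TOY_CORPUS = [
    ([("我们", "r"), ("喜欢", "v"), ("美丽", "a"), ("的", "u"), ("花园", "n")],
     [0, 1, 0, 0], [1, 1, 0, 1]),
    ([("他", "r"), ("买", "v"), ("了", "u"), ("新", "a"), ("的", "u"), ("书", "n")],
     [0, 0, 1, 0, 0], [1, 0, 1, 0, 1]),
]

if __name__ == "__main__":
    pp_model, pw_model = train(TOY_CORPUS)
    print(predict(TOY_CORPUS[0][0], pp_model, pw_model))
```

Feeding the stage-1 output into the stage-2 feature set is what makes the structure hierarchical in this sketch: the prosodic word model can learn that a phrase boundary almost always implies a word boundary, and the closing rule enforces that consistency outright.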

Cited By (10)

    Publication number | Publication date | Assignee | Title
    CN-101950284-A | January 19, 2011 | 北京新媒传信科技有限公司 | Chinese word segmentation method and system
    CN-101950284-B | May 08, 2013 | 北京新媒传信科技有限公司 | Chinese word segmentation method and system
    CN-102063898-A | May 18, 2011 | 北京捷通华声语音技术有限公司 | Method for predicting prosodic phrases
    CN-102063898-B | September 26, 2012 | 北京捷通华声语音技术有限公司 | Method for predicting prosodic phrases
    CN-103279766-A | September 04, 2013 | 北京捷通华声语音技术有限公司 | Method and device for word segmentation, prosodic phrases, and multi-character handwriting recognition
    CN-104464751-A | March 25, 2015 | 科大讯飞股份有限公司 | Method and device for detecting pronunciation rhythm problem
    CN-104464751-B | January 16, 2018 | 科大讯飞股份有限公司 | Method and device for detecting pronunciation rhythm problem
    CN-104867490-A | August 26, 2015 | 百度在线网络技术(北京)有限公司 | Prosodic structure prediction method and device
    CN-104867490-B | March 22, 2017 | 百度在线网络技术(北京)有限公司 | Prosodic structure prediction method and device
    CN-105185373-A | December 23, 2015 | 百度在线网络技术(北京)有限公司 | Rhythm-level prediction model generation method and apparatus, and rhythm-level prediction method and apparatus