Still No Reply

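I came to programming late and by a side door, but there is no avoiding it. The C++ function pointer problem from the day before yesterday finally compiles, but I am still hazy about what was actually going on. Truly, "for her I pine away and grow haggard; my belt hangs ever looser, and I have no regrets." The first-year master's students in the office and the senior undergraduate all think programming is a job for the software and computer science people. The speech synthesis task I handed to B long ago has gone completely silent, and I don't feel like nagging: I have no official standing, I'm not their teacher, and even the teacher doesn't bother.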

// The lines above were written yesterday. I stayed up late last night, so today I feel a bit sluggish.

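I don't have the energy right now to keep systematic notes on programming or reading, and even blogging is more than I can really manage. One look at my exercise record for the past year shows that this stretch has been the most unsuccessful year of my life, and it will go on like this for another two or three months before things can turn around.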

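I'm busy, so I can only jot down scattered bits; I don't have the drive to write complete notes. None of the variables in the espeak source are commented, which makes it painful to read. One concept is espeak_data: what data is this? There is a data_path that supplies espeak_data; data_path is actually a char*, so in practice it is presumably the location of some file.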

/*******************Quoted from the source*************************/
static const char *help_text =
"\nespeak [options] [\"<words>\"]\n\n"
"-f <text file>   Text file to speak\n"
"--stdin    Read text input from stdin instead of a file\n\n"
"If neither -f nor --stdin, then <words> are spoken, or if none then text\n"
"is spoken from stdin, each line separately.\n\n"
"-a <integer>\n"
"\t   Amplitude, 0 to 200, default is 100\n"
"-g <integer>\n"
"\t   Word gap. Pause between words, units of 10mS at the default speed\n"
"-k <integer>\n"
"\t   Indicate capital letters with: 1=sound, 2=the word \"capitals\",\n"
"\t   higher values indicate a pitch increase (try -k20).\n"
"-l <integer>\n"
"\t   Line length. If not zero (which is the default), consider\n"
"\t   lines less than this length as end-of-clause\n"
"-p <integer>\n"
"\t   Pitch adjustment, 0 to 99, default is 50\n"
"-s <integer>\n"
"\t   Speed in words per minute, 80 to 450, default is 175\n"
"-v <voice name>\n"
"\t   Use voice file of this name from espeak-data/voices\n"
"-w <wave file name>\n"
"\t   Write speech to this WAV file, rather than speaking it directly\n"
"-b\t   Input text encoding, 1=UTF8, 2=8 bit, 4=16 bit \n"
"-m\t   Interpret SSML markup, and ignore other < > tags\n"
"-q\t   Quiet, don't produce any speech (may be useful with -x)\n"
"-x\t   Write phoneme mnemonics to stdout\n"
"-X\t   Write phonemes mnemonics and translation trace to stdout\n"
"-z\t   No final sentence pause at the end of the text\n"
"--compile=<voice name>\n"
"\t   Compile pronunciation rules and dictionary from the current\n"
"\t   directory. <voice name> specifies the language\n"
"--ipa      Write phonemes to stdout using International Phonetic Alphabet\n"
"--path=\"<path>\"\n"
"\t   Specifies the directory containing the espeak-data directory\n"
"--pho      Write mbrola phoneme data (.pho) to stdout or to the file in --phonout\n"
"--phonout=\"<filename>\"\n"
"\t   Write phoneme output from -x -X --ipa and --pho to this file\n"
"--punct=\"<characters>\"\n"
"\t   Speak the names of punctuation characters during speaking.  If\n"
"\t   =<characters> is omitted, all punctuation is spoken.\n"
"--split=\"<minutes>\"\n"
"\t   Starts a new WAV file every <minutes>.  Used with -w\n"
"--stdout   Write speech output to stdout\n"
"--voices=<language>\n"
"\t   List the available voices for the specified language.\n"
"\t   If <language> is omitted, then list all voices.\n";

/***********************End of quote****************************/
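On the espeak_data question: going only by the help text quoted above, --path names the directory that contains the espeak-data directory, and -v picks a voice file under espeak-data/voices. That suggests data_path is a directory rather than a single file. A rough sketch of how such a path could be put together; VoiceFilePath is a made-up helper, not something from the espeak source, and the real layout may nest voices in subdirectories:

```cpp
#include <string>

// Hypothetical helper (NOT part of eSpeak): builds the location of a voice
// file from the directory that contains espeak-data, following only the
// wording of the help text ("espeak-data/voices").
std::string VoiceFilePath(const std::string& data_path, const std::string& voice) {
    std::string base = data_path;
    // Add a separator only when the caller did not already supply one.
    if (!base.empty() && base.back() != '/') {
        base += '/';
    }
    return base + "espeak-data/voices/" + voice;
}
```

For example, VoiceFilePath("/usr/share", "en") yields "/usr/share/espeak-data/voices/en"; the directory name here is only an illustration.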

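The -v <voice name> option (not --voices=<language>, which only lists the available voices) really is the language selector. I tried -vzh, -vde and -ven in turn, and the results were reasonably satisfying.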

-a adjusts the volume.

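I have used both of these options in actual runs, but I still haven't got to the bottom of the source, and I was getting impatient, so I thought I would step through it with GDB. The moment I typed run, this came up:
Starting program: *****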
[Thread debugging using libthread_db enabled]
[New Thread 0x…… (LWP 5727)]

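And then, nothing at all. It accepted no GDB command, and I couldn't see the program respond to anything either; only Ctrl+C would stop it. Luckily Zlike happened to catch my SOS and said that command-line arguments have their own way in when debugging. I looked it up online, and it turns out all you need is: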

run argv1 argv2 …

and that's all there is to it. I nearly lost it......
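A likely reason the bare run looked frozen: the help text says that when no <words> are given, text is spoken from stdin, so the program was probably just sitting there waiting for input. Here is a minimal sketch of that decision, assuming a simplified program that ignores options; DescribeInput is a hypothetical helper, not espeak's actual code:

```cpp
#include <string>

// Hypothetical helper (NOT from the espeak source): reports where the text
// to speak would come from, given the arguments the program was started with.
// Simplification: every argument after argv[0] is treated as a word to speak.
std::string DescribeInput(int argc, const char* const* argv) {
    if (argc < 2) {
        // No arguments: the program would block reading standard input,
        // which under gdb looks like a hang.
        return "stdin";
    }
    std::string words;
    for (int i = 1; i < argc; ++i) {
        if (i > 1) {
            words += ' ';
        }
        words += argv[i];
    }
    return "words: " + words;
}
```

Inside gdb, `run hello world` starts the program with argc == 3, so the second branch is taken. Starting the debugger as `gdb --args ./program hello world` sets the same arguments up front (./program is a placeholder name).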

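After two days of chewing on the espeak source (I hate uncommented code), I found through gdb that the speech output should come from espeak_Synth(). As I currently understand it, speech synthesis needs only two basic steps: Initialize, then Synthesize. The Initialize step already works; it is only when I add the Synth() call that I get a segmentation fault, so the arguments I'm passing are probably wrong.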
 espeak_Synth(text,size,0,POS_CHARACTER,0,synth_flags,NULL,NULL);

Here is what the documentation says about this synthesis function:

/****************espeak_Synth: quoted from the source****************/
ESPEAK_API espeak_ERROR espeak_Synth(const void *text,
 size_t size,
 unsigned int position,
 espeak_POSITION_TYPE position_type,
 unsigned int end_position,
 unsigned int flags,
 unsigned int* unique_identifier,
 void* user_data);

/* Synthesize speech for the specified text.  The speech sound data is passed to the calling program in buffers by means of the callback function specified by espeak_SetSynthCallback(). The command is asynchronous: it is internally buffered and returns as soon as possible. If espeak_Initialize was previously called with AUDIO_OUTPUT_PLAYBACK as argument, the sound data are played by eSpeak. */
/****************espeak_Synth: end of quote****************/
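The comment above says the sound data is handed back in buffers through a callback registered with espeak_SetSynthCallback(). That is the function pointer pattern again. The sketch below is a toy stand-in to show the mechanism only: SampleCallback, SetCallback, FakeSynth and CountSamples are all made-up names, and the callback signature is an assumption, not the real eSpeak typedef.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical callback type (NOT the eSpeak typedef): receives one buffer of
// samples plus an opaque user pointer; returning non-zero asks to stop.
typedef int (*SampleCallback)(const short* samples, int count, void* user_data);

// The registered function pointer, null until SetCallback is called.
static SampleCallback g_callback = nullptr;

void SetCallback(SampleCallback cb) { g_callback = cb; }

// Toy "synthesizer": delivers total_samples silent samples to the callback in
// buffers of at most chunk samples. Returns the number delivered, or -1 when
// no callback is registered or chunk is not positive.
int FakeSynth(int total_samples, int chunk, void* user_data) {
    if (g_callback == nullptr || chunk <= 0) {
        return -1;
    }
    std::vector<short> buffer(static_cast<std::size_t>(chunk), 0);
    int delivered = 0;
    while (delivered < total_samples) {
        int remaining = total_samples - delivered;
        int n = remaining < chunk ? remaining : chunk;
        if (g_callback(buffer.data(), n, user_data) != 0) {
            break;  // the callback asked to stop early
        }
        delivered += n;
    }
    return delivered;
}

// Example callback: adds the size of each buffer to an int counter that the
// caller passes in through user_data.
int CountSamples(const short* samples, int count, void* user_data) {
    (void)samples;  // contents are not inspected in this sketch
    *static_cast<int*>(user_data) += count;
    return 0;
}
```

Calling SetCallback(CountSamples) and then FakeSynth(1000, 256, &total) invokes the callback four times and leaves 1000 in total. Forgetting to register a callback is caught here by the null check instead of crashing.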

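What I fear most in programming is out-of-bounds access, especially as someone who came to it halfway and has no wish to write code full time; the longer I fail to debug something, the sleepier I get. At first I thought the problem was the length of the input text: I understood this function to require the string to end with \n, and I was afraid I had miscounted the length. Fortunately the author of speak_lib states that if espeak_Initialize is called with AUDIO_OUTPUT_PLAYBACK, the Synth() function outputs the speech right away. The code I had copied earlier used AUDIO_OUTPUT_SYNCHRONOUS in that argument position, so I changed it back, and, haha, it made a sound. That counts as passing the feasibility check for the speech feature; the detailed implementation I leave to whoever comes after me. I'm done with it. The most urgent thing right now is to get the framework out: time to draw UML and write classes.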
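On the length worry: a small sketch of one way to compute the size argument, assuming (as the speak_lib.h comments appear to describe it) that the text is a zero-terminated string and that size only needs to be at least the byte size of the text data. SynthTextSize is a hypothetical helper, not part of eSpeak.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (NOT part of eSpeak): byte count of a C string including
// its terminating '\0', or 0 for a null pointer.
std::size_t SynthTextSize(const char* text) {
    return text != nullptr ? std::strlen(text) + 1 : 0;
}

// Intended use, mirroring the call shown earlier in this post:
//   espeak_Synth(text, SynthTextSize(text), 0, POS_CHARACTER, 0,
//                synth_flags, NULL, NULL);
```

Counting the terminator errs on the side of a slightly larger size, which is harmless if the assumption above holds.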

I sent Cognitec yet another email, and there is still no reply. No reply. Anxious, anxious~~

 

 DES: Saw three coins on the sports field today. Back in primary school, that would have been enough to keep me happy all day.

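Damn it. The notes I typed in vim on Linux at the office arrived on Windows with none of the line endings converted, and even when opened in win-gvim the Chinese characters came out garbled. I'm in no mood to work out the character encoding problem, and I can't be bothered to sort out replacing the line endings either. I'm never using vim for blog notes again~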
