Once I decided to integrate MRCP with Asterisk, I considered the following approaches:
1. Use the default unimrcpclient. Save the WAV file from Asterisk, then use unimrcpclient to send it to the MRCP server for recognition.
This had various problems: scalability, poor integration with Asterisk, and so on. Most importantly, barge-in cannot be supported, because recognition only starts after the recording is complete. So I decided to integrate directly with Asterisk.
2. Implement the Speech API for Asterisk. This would have been the best solution, but it would apply only to speech recognition. I wanted Asterisk integrated with MRCP in general.
3. Implement an Asterisk MRCP module. This was the approach I settled on, as it would give me access to both the Asterisk and UniMRCP libraries.
Building a module turned out to be harder than I thought :). I had to understand the code bases of both Asterisk and UniMRCP. Luckily, both are very well written and I could follow their structure. Here I would like to mention the support given by Arsen of UniMRCP. Without his help it would not have been possible.
Finally I built an Asterisk module which takes the audio data directly from Asterisk and sends it to the ASR server, and I was really surprised by the speed of the recognition. The recognizer returned a result as soon as the user finished speaking. I was able to achieve barge-in functionality too.
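To illustrate why streaming the audio makes barge-in possible, here is a minimal Python sketch. It is not the module's code and it does not use any Asterisk or UniMRCP API: `ToyRecognizer`, `play_prompt_with_barge_in`, the 20 ms frame size and the energy threshold are all made up for the example. The idea is simply that the caller's audio is fed to the recognizer frame by frame while the prompt is still playing, so the prompt can be cut off the moment speech is detected.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

FRAME_MS = 20  # assumed frame duration; a common choice for telephony audio


@dataclass
class ToyRecognizer:
    """Hypothetical stand-in for a recognizer session (not a UniMRCP API)."""
    energy_threshold: int = 500          # assumed value, for illustration only
    frames_received: int = 0
    speech_started_at_ms: Optional[int] = None

    def feed(self, frame: List[int]) -> bool:
        """Accept one frame of samples; return True once speech has been detected."""
        if self.speech_started_at_ms is None:
            # Crude energy check standing in for real start-of-speech detection.
            energy = sum(abs(s) for s in frame) / max(len(frame), 1)
            if energy >= self.energy_threshold:
                self.speech_started_at_ms = self.frames_received * FRAME_MS
        self.frames_received += 1
        return self.speech_started_at_ms is not None


def play_prompt_with_barge_in(prompt_frames: int,
                              caller_audio: Iterable[List[int]],
                              recognizer: ToyRecognizer) -> int:
    """Play a prompt while streaming caller audio to the recognizer.

    Returns the number of prompt frames played before the prompt ended
    or the caller barged in.
    """
    played = 0
    for frame in caller_audio:
        if played >= prompt_frames:
            break                      # prompt finished without interruption
        played += 1                    # one prompt frame goes out per incoming frame
        if recognizer.feed(frame):
            break                      # barge-in: caller is speaking, stop the prompt
    return played


if __name__ == "__main__":
    silence = [0] * 160
    speech = [1000] * 160
    rec = ToyRecognizer()
    played = play_prompt_with_barge_in(50, [silence] * 5 + [speech] * 3, rec)
    print(played, rec.speech_started_at_ms)
```

With the record-then-send approach there is no point in this loop at which the recognizer can see the audio early, which is why the prompt could never be interrupted.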
I tested the module with the Telisma speech server and it worked very well. There were one or two configuration issues. In unimrcpclient.xml I had to use "ASR" instead of "media" for the Telisma server to work. In general, the following process works for any new MRCP server:
1. Try to set up the demo unimrcpclient. If that works with the MRCP server, then the Asterisk module should work.
2. Set up a trial dial plan in Asterisk, load the module and make a trial call.
You can find details of the implementation at http://www.voip-info.org/wiki/view/Asterisk+cmd+MRCPSpeech
Wednesday, May 20, 2009
Sunday, April 12, 2009
Need for MRCP
Well, I work for a company called Ozonetel (www.ozonetel.com) and we specialize in voice products. We work a lot with Asterisk, and in almost all projects I have realized that Asterisk is great as a telephony platform. It allows you to make and receive calls. But there are still a lot of rough edges and a lot of functionality that could be added. Almost every project nowadays deals with speech. Everyone wants speech recognition and text-to-speech, and that is where Asterisk is lagging a little. There are hacks to deal with TTS (through Festival, Cepstral, etc.) and speech recognition (Sphinx, etc.), but there is no standard solution.
Enter MRCP. If a plug-in or module for MRCP existed for Asterisk, we would immediately be able to connect to many speech servers such as Nuance and Telisma. So I started searching around for a solution. After all, I am a very good programmer and I embody the most important trait of a programmer, laziness :) So if someone has already built it, then why bother :).
But surprise of surprises, I could not find any solutions. Then I started to search for at least an MRCP implementation that I could integrate. Even there, I could not find any stable solutions. Just as I was thinking of creating my own MRCP solution, I luckily found a project which was showing some good activity. I had considered speechforge and zanzibar, as I am mainly a Java guy, but their development was stagnant. On the other hand, unimrcp (www.unimrcp.com) was being developed actively, so I checked out the code and played around with it. It installed without trouble, though I had to modify the configure parameters for Ubuntu (Sofia-SIP and Apache APR were installed in separate locations), and it worked perfectly. The code was also very well written and the design was top notch. And most importantly, the main developer, Arsen, responded to queries almost in real time. I knew we had a winner.
In the next post, we will look at how I implemented the solution.