Greetings! I am working on a research project that aims to develop a basic Automatic Speech Recognition (ASR) system for Urdu, the national language of Pakistan, on Sphinx-4 (a pure Java speech recognition API).
I can provide the developer with the following resources:
- A medium-sized, pre-built Urdu speech corpus of isolated words.
- Help with the Urdu language, if required.
The requirements for the desired system are as follows:
- A basic ASR prototype built on Sphinx, running on MS Windows, that first trains on the Urdu corpus, then tests itself, and produces a test report giving the accuracy figures, i.e. WER (word error rate) and any other metrics Sphinx offers.
- A training session or report covering all the detailed steps involved in building this project.
- The desired system is meant to be a basic prototype to demonstrate my research work, so training and testing on even 3-5 isolated Urdu words would be sufficient. However, it must allow more words to be added to its language model in the future, and it must be built by following the proper Sphinx steps for building a speech recognition system.
- The project must be completed within 17 days, i.e. before 05-12-2015.
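
For reference, the WER figure requested in the test report is conventionally the word-level edit distance (substitutions + deletions + insertions) between the reference transcript and the recognizer output, divided by the number of reference words. Below is a minimal standalone sketch of that calculation in plain Java. It does not use any Sphinx API; the class name `WerSketch`, the method names, and the romanized Urdu words in `main` are illustrative assumptions only, and the report produced by Sphinx itself may be formatted differently.

```java
public class WerSketch {

    // Hypothetical helper: WER = word-level Levenshtein distance / reference length.
    static double wordErrorRate(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }

        // prev[j] holds the edit distance between the first i-1 reference words
        // and the first j hypothesis words; curr[j] is the row being filled in.
        int[] prev = new int[hyp.length + 1];
        int[] curr = new int[hyp.length + 1];
        for (int j = 0; j <= hyp.length; j++) {
            prev[j] = j; // empty reference: j insertions
        }

        for (int i = 1; i <= ref.length; i++) {
            curr[0] = i; // empty hypothesis: i deletions
            for (int j = 1; j <= hyp.length; j++) {
                int substitution = prev[j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            // Swap rows so the row just computed becomes "prev" for the next word.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return (double) prev[hyp.length] / ref.length;
    }

    // Splits on whitespace; an empty or blank string yields zero words.
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Placeholder transcripts: one substituted word out of three reference words.
        System.out.println(wordErrorRate("ek do teen", "ek do char"));
    }
}
```

For an isolated-word task like the one described here, each test utterance is a single word, so the overall WER reduces to the fraction of test utterances that were recognized incorrectly.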