Crowd-sourcing dialogue data collection, system building, and system evaluation is an interesting prospect. We use crowd-sourcing paradigms to collect spoken dialogue data, transcribe it, build spoken dialogue systems, and evaluate them. Building dialogue systems is hard! Engaging large numbers of users in building such systems is one of my main interests.
We have built an online HTML5-based framework to collect, synchronize, and reconstruct spoken interactions online. This is invaluable when building a spoken dialogue system for a new domain: by deploying these systems online, we can collect large amounts of spoken interaction. The collected dialogue data then needs to be transcribed, annotated, and bootstrapped before it can be used to build dialogue systems.
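As a rough sketch of this collect-transcribe-annotate-bootstrap loop (all function names here are illustrative placeholders, not part of our actual framework), the flow might look like:

```python
# Hypothetical sketch of turning raw crowd-sourced recordings into
# training data for a new domain. The crowd_* stubs stand in for
# tasks that would really be posted to crowd workers.

def crowd_transcribe(recording):
    # In practice: post the audio clip as a transcription task.
    return f"transcript of {recording}"

def crowd_annotate(transcript):
    # In practice: workers label dialogue acts, slots, etc.
    return {"dialogue_act": "inform", "text": transcript}

def bootstrap_corpus(raw_recordings):
    """Build an annotated corpus from raw crowd-sourced recordings."""
    corpus = []
    for recording in raw_recordings:
        transcript = crowd_transcribe(recording)
        annotation = crowd_annotate(transcript)
        corpus.append({"audio": recording,
                       "transcript": transcript,
                       "annotation": annotation})
    return corpus

corpus = bootstrap_corpus(["clip_001.wav", "clip_002.wav"])
print(len(corpus))  # 2
```

The resulting corpus can then seed the language understanding and dialogue management models for the new domain.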
Once a spoken dialogue system is built, it needs to be evaluated. We use crowd-sourcing paradigms for this as well, asking crowds of users to interact with and rate the system.
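One simple way such crowd evaluations can be aggregated (a minimal sketch, assuming each dialogue receives ratings from several workers; the function name is hypothetical) is:

```python
# Hypothetical sketch: averaging per-dialogue crowd ratings
# into a single system-level score.

def aggregate_ratings(ratings_per_dialogue):
    """Mean of per-dialogue mean ratings across all evaluated dialogues."""
    dialogue_means = [sum(r) / len(r) for r in ratings_per_dialogue]
    return sum(dialogue_means) / len(dialogue_means)

# Two dialogues, each rated by three workers on a 1-5 scale.
score = aggregate_ratings([[4, 5, 3], [2, 4, 3]])
print(round(score, 2))  # 3.5
```

Averaging per dialogue first keeps heavily-rated dialogues from dominating the overall score.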
So, how do we build the agent? Policies, dialogue management (DM), automatic speech recognition (ASR), natural language understanding (NLU), natural language generation (NLG), text-to-speech (TTS), and the user interface are all parts of the same problem.
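These components can be chained into a single turn of interaction. The sketch below shows only how they fit together; every function is an illustrative placeholder, not a real system:

```python
# Minimal sketch of one turn through the spoken dialogue pipeline:
# ASR -> NLU -> DM (policy) -> NLG -> TTS. All stubs are hypothetical.

def asr(audio):
    return "book a table for two"                  # speech -> text

def nlu(text):
    return {"intent": "book_restaurant",           # text -> meaning
            "party_size": 2}

def dialogue_manager(semantics, state):
    state.update(semantics)                        # track dialogue state
    return {"act": "request", "slot": "time"}      # policy picks next act

def nlg(system_act):
    return "What time would you like to book?"     # act -> text

def tts(text):
    return b"<synthesized audio>"                  # text -> speech

def one_turn(audio, state):
    """Run one full user turn and return the system's audio reply."""
    text = asr(audio)
    semantics = nlu(text)
    act = dialogue_manager(semantics, state)
    reply = nlg(act)
    return tts(reply), state
```

In a deployed system each stub would be a trained model or a hand-built module, but the turn-level control flow stays the same.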