Siri first translates your speech to text. Once the speech-to-text step is complete, Siri performs lexical analysis of the text to work out what you want to do. Speech recognition is most likely not done on the iPhone itself: Apple’s servers are involved in translating speech to text, and Siri has been shown to have problems when those servers were under heavy load.
Acting on recognized speech is extremely challenging. Even if you can break down the structure of the sentence (quite easy to do with a good dictionary), there are so many actions, objects, and permutations that the result is very difficult to act upon. Siri has one advantage: other than people asking crazy things, the set of actions it could be asked to do is fairly limited. Some cleverness is required to link oddly phrased sentences to actions, and some knowledge of relationships between people is required for requests such as “call Mom”; allegedly Siri gets this data from the “Relationship” field of the contacts list on the phone. Even so, there are actually not that many ways of asking for a reminder to be set or for a calendar entry to be added.
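As a rough illustration of why a limited action set helps, the sketch below maps a few phrasings onto a small set of intents with regular expressions and resolves “Mom” through a relationship table. Everything in it (the intent names, the patterns, the `RELATIONSHIPS` dictionary, the contact names) is hypothetical; it is not a description of how Siri is actually implemented.

```python
import re

# Hypothetical stand-in for the contacts "Relationship" field.
RELATIONSHIPS = {"mom": "Jane Appleseed", "brother": "John Appleseed"}

# Hypothetical intents: a handful of patterns covers many phrasings.
INTENT_PATTERNS = [
    ("call", re.compile(r"^(?:please )?(?:call|phone|ring)(?: my)? (?P<who>.+)$")),
    ("reminder", re.compile(r"^remind me to (?P<what>.+)$")),
    ("reminder", re.compile(r"^set a reminder (?:to|for) (?P<what>.+)$")),
]


def parse(text):
    """Return (intent, slots) for recognized text, or (None, {}) if nothing matches."""
    # Normalize case and drop trailing punctuation before matching.
    cleaned = text.strip().lower().rstrip(".!?")
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.match(cleaned)
        if match:
            slots = match.groupdict()
            # Resolve a relationship word such as "mom" to a contact name.
            if "who" in slots:
                slots["who"] = RELATIONSHIPS.get(slots["who"], slots["who"])
            return intent, slots
    return None, {}
```

Calling `parse("Please ring my brother!")` and `parse("Call Mom")` both land on the same `call` intent, which is the point: differently worded requests collapse onto a short list of actions.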
After Siri has determined what you want to do (or come up with a funny reply), it performs the action through one of the APIs that it has access to. What sets Siri apart, and makes it seem much cleverer to the end user than it really is, is the number of on-device APIs it has access to (that nobody else does) and the number of clever web services it can connect to for answers (see Wolfram Alpha).
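The dispatch step can be pictured as a lookup from a recognized intent to a handler, with a canned reply as the fallback. The following is a minimal sketch under that assumption; the handler names, the `HANDLERS` table, and the reply strings are made up for illustration, and no real Apple or Wolfram Alpha API is called.

```python
def place_call(slots):
    # Stand-in for an on-device telephony API.
    return f"Calling {slots['who']}"


def add_reminder(slots):
    # Stand-in for an on-device reminders API.
    return f"Reminder set: {slots['what']}"


def ask_web_service(slots):
    # Stand-in for a remote answer service; no network request is made.
    return f"Looking up: {slots['query']}"


# Hypothetical routing table from intent name to handler.
HANDLERS = {
    "call": place_call,
    "reminder": add_reminder,
    "question": ask_web_service,
}


def dispatch(intent, slots):
    """Run the handler for an intent, or fall back to a canned reply."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return "Sorry, I can't help with that."
    return handler(slots)
```

Under this picture, adding a capability means adding one entry to the table, so the assistant's apparent cleverness grows with the number of APIs and services it can reach.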