At VoiceCon, Microsoft made much noise about how its softphone delivered better audio quality than a leading IP phone. (See my posts: Microsoft – where do your get this stuff? and Psytechnics Gets a Microsoft Moment.)

At first it sounds impressive – especially considering the history of Windows. I remember in 1998 getting all excited about the prospects of Nortel doing a deal with Vocaltec, the Israeli company that was the first with a Windows softphone, but then having experienced it, I acknowledged that the OS had several flaws.

Windows 95, 98, and probably Millennium were all single-threaded. That means any real-time process underway occupied the machine completely. Any intermittent process, like a clock update or an email server update, simply seized control of the processor, did its thing, and then dropped the user back into the real-time process. This might work if I were editing an email, since the interruption was only about 500 milliseconds long. However, in an audio conversation, a random 0.5 seconds of zero speech processing can destroy the context of the conversation, and with it the user's experience.

It could literally eliminate a whole answer like 'no', which takes only about 400 milliseconds to speak.
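To put rough numbers on that, here is a minimal sketch of the arithmetic. The 500 ms stall and the 400 ms word come from the figures above; the 20 ms audio frame size is my own assumption (a common packetization interval), and the function names are purely illustrative.

```python
# Assumption: audio is processed in 20 ms frames (not stated in the post).
FRAME_MS = 20


def frames_lost(stall_ms, frame_ms=FRAME_MS):
    """Number of whole audio frames that go unprocessed during a stall."""
    return stall_ms // frame_ms


def word_can_vanish(word_ms, stall_ms):
    """True if a stall is long enough to swallow the entire word."""
    return word_ms <= stall_ms


stall_ms = 500  # interruption length quoted above
word_ms = 400   # approximate duration of a spoken 'no'

print("frames lost:", frames_lost(stall_ms))
print("'no' can vanish entirely:", word_can_vanish(word_ms, stall_ms))
```

Under those assumptions a single stall drops 25 consecutive frames, which is more than the 20 frames a 400 ms word occupies.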

However, Windows today, with today's fast processors and massive memory, has no trouble with this challenge (thank God), and the Psytechnics test proves it. In the test, results available at, a Microsoft Windows and Office client with a USB phone is contrasted with the Cisco 7961 IP phone. The comparisons show that the MSFT device supports Wideband and Narrowband operation, while the Cisco phone supports G.711 and G.729.

Does this show that MSFT doesn't support the ITU's codec and won't be able to interoperate with other gateways and applications?

Also, the confidence intervals of the samples contrasting narrowband and G.711 suggest that the gap is pretty small. For example, at IP Condition 1 (no other apps running on the LAN), the MSFT setup generated a MOS (Mean Opinion Score) of 3.51 with a confidence interval of 0.11. In contrast, the Cisco phone, at IP Condition 1 and using the G.711 codec, generated a mean MOS of 3.41 with a confidence interval of 0.11. So, although the MSFT setup does have a higher score here, it isn't much higher.
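A quick way to check how small that gap is: the sketch below uses the MOS figures quoted above and assumes the reported 0.11 is the half-width of each confidence interval (the usual "plus or minus" reading, though the report may define it differently). The helper names are mine. Overlapping intervals are only a rough indicator, not a formal significance test.

```python
def interval(mean, half_width):
    """Return (low, high) bounds, treating half_width as a +/- value."""
    return (mean - half_width, mean + half_width)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Figures quoted above for IP Condition 1.
msft = interval(3.51, 0.11)   # MSFT setup, narrowband
cisco = interval(3.41, 0.11)  # Cisco 7961, G.711

gap = 3.51 - 3.41
overlap = intervals_overlap(msft, cisco)

print("MSFT interval:  %.2f to %.2f" % msft)
print("Cisco interval: %.2f to %.2f" % cisco)
print("difference in means: %.2f" % gap)
print("intervals overlap:", overlap)
```

On that reading the MSFT interval runs from about 3.40 to 3.62 and the Cisco interval from about 3.30 to 3.52, so the two overlap and the 0.10 difference in means is smaller than either interval's half-width.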

So, this is a test of the performance of two codecs – one using ITU codec standards and the other a proprietary MSFT implementation. Was that the real goal of the test?

Maybe MSFT should present its IPR to the ITU to embed it into the next codec standard? How about it, Jeff?