Friday, October 02, 2009

Semantic Communication, Madhu Sudan

Madhu Sudan gave a colloquium at Harvard yesterday on his work on Universal Semantic Communication and Goal-Oriented Communication (both with Brendan Juba, the latter also with Oded Goldreich). The papers are available here, and here are the slides (pdf). One of the motivating examples for the work is the following : you're visiting another department for the day, and need to print something out. You have access to a printer, but your machine doesn't speak the same language as it does. So you have to get a driver, install it, and so on, and all of a sudden it takes 1/2 an hour to do a simple print job. Why can't the machines figure out how to get the simple job done -- printing -- without this additional work?

More abstractly, one can pose the high-level idea in the following way: Shannon's theory was about the reliable communication of bits, and we've solved a great deal about those types of communication problems. But even if we assume that bits are transmitting correctly over a channel, how can we ensure the meaning of those bits is interpreted properly, particularly in the sense of if those bits represent a task of the form, "Please do this computation for me," how do we ensure the other side performs the computation we want done if we don't have a prior agreed-upon semantic framework?

I've seen in various settings criticism of this line of work, which is quite abstract and certainly a bit unusual for CS theory. The original paper is often referred to as "the aliens paper" because it set the question in terms of communicating with aliens (where there may naturally be no shared semantic framework), and my impression is that several people felt it is too far removed from, well, everything, to be of interest. It was, I understand, rejected multiple times before being accepted.

I have to say the impression that this paper is "too far removed" is incorrect based on the reaction at the talk Madhu gave. Part of it may be a change in message -- no talk of aliens, and more talk of novel devices connecting to the Internet makes the problem more tangible. But it was remarkable how many people were interested -- our computational linguist and programming languages faculty seemed really intrigued, and there was also great interest from some of our other systems people. (It helps to be in a department where faculty outside of theory are generally perfectly comfortable seeing PSPACE-completeness and reductions show up on slides -- is that usual, or are we spoiled here? Multiple questions on the details of the definitions were asked by non-theorists...) Many people shared the impression that this was a new way to think about some very challenging problems, and while the work so far is too theoretical to be of practical use -- indeed, it's arguably still at the stage where perhaps the right frameworks or questions aren't entirely clear -- they seemed to view it as a start worth pursuing further.

I think this sort of paper is a rare beast, but perhaps it does serve as an example that a new conference like ICS is needed as an outlet for this kind of work. In particular, it's not clear to me that FOCS/STOC always has a good handle on where theory could be of large interests and have a significant impact on other communities. My complaint generally takes the form that FOCS/STOC (and even SODA) weighs mathematical "difficulty" far, far greater than practical utility when judging work in algorithms and data structures, but this seems to be a related issue.

Anyhow, thanks to Madhu for an excellent talk.

8 comments:

Anonymous said...

Sounds like a case of a theorist poorly choosing which setting his paper pretends to apply to in the intro. "Suppose an alien comes down to earth"? No! "The Internet has seen a remarkable explosion of growth over the past two decades"? Yes!

....at least it wasn't "VLSI design is an important problem in computer engineering".

Anonymous said...

My sense is also that algorithms and data structures work is judged much more harshly by the theory community than lower bounds or complexity work. For algorithms work, the question often asked by the typical reviewer (who is a pure theoretician) is : (a) does it have hard/interesting math and (b) is it practical. Unfortunately, an idea that satisfies (a) often takes a long time to make its way to (b) (and often the reviewer is a poor judge of (b) anyway), so the paper often gets rejected.

However for complexity/lower bounds work, there is absolutely no question of (b), and people are happy to accept a paper so long as it passes (a).

Anonymous said...

and often the reviewer is a poor judge of (b) (is it practical) anyway

A while back I wrote a couple of papers describing algorithms in use in practice. The usage was carefully described in the introduction but the names of the companies using them could not be cited for confidential reasons. One paper was rejected because it had "no applications", the other because "the author does not know how this is done in industry".

Anonymous said...

(It helps to be in a department where faculty outside of theory are generally perfectly comfortable seeing PSPACE-completeness and reductions show up on slides -- is that usual, or are we spoiled here? [...])

You're spoiled, and I'm jealous.

Heard from a faculty colleague in a talk: "NP means non-polynomial, right?"

Anonymous said...

Anonymous 3: If you can't cite your sources, it wouldn't even be allowed in a Wikipedia article. I am not sure why you think scientific articles should have even lower standards.

Anonymous said...

If you can't cite your sources, it wouldn't even be allowed in a Wikipedia article. I am not sure why you think scientific articles should have even lower standards.

What were we saying about people not knowing how to evaluate applicability?

By that measure Google's page ranking is not applicable since at the time of discovery it hadn't been implemented by a company (google didn't even exist).

And yes, page ranking was rejected from SIGIR, though admittedly the paper wasn't very well written.

rgrig said...

Perhaps this theory can be applied to teaching students (math?) who seem to speak a completely different language sometimes. :) (I just read a huge thread where a colleague failed, so far, to convince a student that "x is at most y" means x≤y.)

Durga said...

Sounds similar to the UniPlug work we did at the Media Lab:
http://dspace.mit.edu/handle/1721.1/41756