Monday, August 31, 2009

503 Service Unavailable

I got 3 requests to serve on PCs last week. I might say yes to 1 (if the load seems light enough), but I'm already on 3 for 2010, and I've got a "top-secret" visit to DC planned for the near future, as well as other duties (SIGACT Executive Committee, STOC special issue, other editorial duties) to deal with.

So for a while, at least, I'm going to take a break from taking on new service duties.

Friday, August 28, 2009

Taschenbuch der Algorithmen

A long time ago I had visions of helping construct a book about theoretical CS for high school students, as I wrote about on the blog multiple times, such as here and here. I even wrote a draft sample chapter of my own on coding theory and practice. Strangely (at least to my mind), there never seemed to be much interest in it. A few people sounded vaguely interested, but not enough to coalesce into a group that would really push for the book. Disappointed, I set it aside.

At some point, some people (Martin Dietzfelbinger, in particular) pointed out to me that the German TCS community was way ahead of me. They had apparently started a Web site to promote algorithms, with a Web page written by a scientist roughly every week about some algorithmic problem. They then managed to convince people to turn those Web pages into chapters, and produced this book. Which, sadly (for me), is in German, so I can't really tell how it turned out.

I recently saw Martin at STOC, and he mentioned the book again; they were translating the chapters into English to produce an English version. After talking with him I said I'd be happy to "donate" my proposed chapter, and I sent it in for them to look at. Berthold Vocking gave me some corrections and told me they would include it; I sent in the final version yesterday. We'll see what happens from here, and I hope to have more to say about the book as it progresses.

I'm happy that a book of this form will eventually come out, though I wish there could have been more involvement from my peers in America. Perhaps I'll try again someday for another theory book of this form. Or maybe I'll start seeing if there's interest from the networking community -- a book on the past, present, and future of networking and communication technologies, meant for high school students, to inspire them to become involved in computing and communication.

Tuesday, August 25, 2009

NSF proposals

Since I recently got "official word" that one grant is "in", I feel I can finally talk about my grant proposals from last year on the blog. I often feel there's not much discussion of grants or the grant process in our community, so perhaps this (and the comments from others) will give some insight to graduate students or young faculty.

Despite, or because of, the stimulus-based funding bump at NSF, one of my 3 submissions last year got funded. Interestingly, and unlike several other times in the past, I was not dismayed or upset at the rejections; I found the reviews quite reasonable, and believe that the rejected proposals are likely to be funded after a revision. (In the past, I've had revised proposals accepted on the 2nd -- or 3rd -- try.)

Staying positive, the proposal that did get funded was my individual (small) proposal to Algorithmic Foundations. Perhaps surprising no one, it's all about hashing. I was shocked to find that the proposal seemed to have been reviewed by SIX people. I don't think I've ever seen a proposal with so many reviews. (I thought 3-4 was the norm.) I think the proposal also had the best overall scores I've received on a proposal.

That's not to say the reviews were universally positive. The pluses the reviews cited were the connections being made between theory and practice, the appealing questions, and my history of work in this area. The surprise there for me was that a few of the reviewers explicitly mentioned my "history" in the area; generally, I haven't found this explicitly taken into account in reviews before, although I certainly suspect it is often implicitly considered. The main negative in the reviews was the concern that the work would be too incremental. As one reviewer put it (paraphrasing slightly): "... I feel the proposal is somewhat incremental in its intellectual merit... The proposed research problems largely fall into the category of extending results in a mature field... I would have liked to see something more out of the box..." The reviewer still gave the proposal a good score, so I can't complain that they didn't see the merits in tackling the many interesting open problems in the hashing area, or the importance of pushing new results in the area out into the real world. And I can certainly accept the criticism as a reasonable one -- indeed, my concern about this proposal going in was that it didn't have the "out of the box", high-risk flavor that seems in vogue in some areas of the NSF. Perhaps it's easier to accept the criticism since the proposal did get funded, but overall I thought the reviews showed understanding and were well thought out, balancing the pluses and minuses reasonably.

(OK, there was also one of the most annoying but standard review comments: "He doesn't explain how he's going to solve the problems he's proposed," which I always find remarkable. If I knew, I'd already be writing or have written the paper... On the other hand, even in this case, the reviewer(s) did not use this as an excuse to disregard the proposal -- as they sometimes do -- so again, I can't really complain.)

Not surprisingly, the work that people mentioned as being most interesting/out-of-the-box was the paper on Why Simple Hash Functions Work, which used a not-worst-case model to show that one gets near-uniform hashing with weak hash functions as long as there is "enough" entropy in the data stream. There should be more applications of these ideas, and thinking them through is clearly something I'll have to get back to.
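
The flavor of that result is easy to see in a quick toy experiment. Here's a minimal sketch in Python (my own illustration, not an experiment from the paper; the prime, bucket count, and stream below are all arbitrary choices): hash a stream that carries plenty of entropy with a weak, 2-universal-style hash, and the bucket loads come out close to what an idealized fully random hash would give.

    import random
    from collections import Counter

    # Toy illustration of the "Why Simple Hash Functions Work" intuition
    # (my sketch, not the paper's experiments): apply a weak hash to an
    # entropic stream and compare the maximum bucket load to a fully
    # random assignment of keys to buckets.

    P = 2**61 - 1   # a Mersenne prime; all parameters here are illustrative
    M = 1024        # number of buckets
    N = 100000      # stream length

    def weak_hash():
        # h(x) = ((a*x + b) mod P) mod M, with a, b chosen at random
        a, b = random.randrange(1, P), random.randrange(P)
        return lambda x: ((a * x + b) % P) % M

    def max_load(bucket_ids):
        return max(Counter(bucket_ids).values())

    stream = [random.getrandbits(48) for _ in range(N)]  # input with entropy
    h = weak_hash()
    print("weak hash, max load:  ", max_load(h(x) for x in stream))
    print("random hash, max load:", max_load(random.randrange(M) for _ in range(N)))

Of course, a little simulation like this proves nothing on its own; the paper's contribution is explaining why this behavior should be expected whenever the data has sufficient entropy.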

Friday, August 21, 2009

Lossy Difference Aggregators

This post is about 2 papers and a student.

The first paper was presented at SIGCOMM Thursday: Every Microsecond Counts: Tracking Fine-Grain Latencies with a Lossy Difference Aggregator, by Ramana Rao Kompella (Purdue University), Kirill Levchenko (UC San Diego), Alex C. Snoeren (UC San Diego), and George Varghese (UC San Diego). The paper introduces a new hash-based data structure for the following situation: packets are being sent from a source to a destination over a lossy channel (like the Internet), and you want to estimate the average latency over a time window. (They can handle jitter -- variance in latency -- as well.) If each packet were timestamped at the source, this would be easy; at the destination you could add up individual latencies on the fly. But adding sufficiently accurate timestamps to every packet is an expensive proposition. If there weren't any losses, you could avoid per-packet timestamps entirely: add up the send times over the time window at the source, send that total over separately, subtract it from the sum of the reception times at the receiver, and divide to get the average. If there is packet loss, however, this won't work. The authors show that by sampling and bucketing packets at the right rate, you can get buckets that contain a random sample of the packets with no loss, from which you can estimate the latency.
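
To make the bucketing trick concrete, here's a minimal toy simulation in Python (my own sketch of the scheme as described above, not the authors' code; the bucket count, sampling rate, loss rate, and synthetic timestamps are all illustrative). Each side keeps a (timestamp sum, packet count) pair per bucket; a bucket whose counts agree at the sender and receiver lost no packets, so the difference of its sums is the total latency of its packets.

    import random

    # Toy Lossy Difference Aggregator: sampled packets are hashed into
    # buckets, and each side accumulates (timestamp sum, packet count).
    # Buckets untouched by loss (matching counts) yield exact latency sums.

    NUM_BUCKETS = 256    # illustrative parameters; the sampling rate must
    SAMPLE_PROB = 0.5    # be tuned to the loss rate so that enough buckets
    LOSS_PROB = 0.01     # survive with no lost packets

    def run_lda(num_packets=10000):
        send = [[0.0, 0] for _ in range(NUM_BUCKETS)]
        recv = [[0.0, 0] for _ in range(NUM_BUCKETS)]
        for pkt_id in range(num_packets):
            if random.random() > SAMPLE_PROB:      # sample a subset of packets
                continue
            b = hash(pkt_id) % NUM_BUCKETS         # stand-in for a shared hash
            t_send = pkt_id * 0.001                # synthetic send time
            send[b][0] += t_send
            send[b][1] += 1
            if random.random() < LOSS_PROB:        # packet lost in transit
                continue
            recv[b][0] += t_send + random.uniform(0.01, 0.02)  # true latency
            recv[b][1] += 1
        # Average over loss-free buckets only (counts agree on both sides).
        total, count = 0.0, 0
        for (s_sum, s_cnt), (r_sum, r_cnt) in zip(send, recv):
            if s_cnt == r_cnt and s_cnt > 0:
                total += r_sum - s_sum
                count += s_cnt
        return total / count if count else None

    print("estimated average latency:", run_lda())  # true mean is 0.015

The key tuning issue, which the paper analyzes, is choosing the sampling rate relative to the loss rate so that enough buckets survive untouched by loss.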

It's a clever idea and a great paper. (Obviously, I'm biased -- novel hash-based data structures with real-world applications appeal greatly to me!) So when I got a preprint from George Varghese, I asked if I could please give it to my seminar class on network algorithms, CS 222, for an in-class discussion. (The students wouldn't know it had been accepted to SIGCOMM, but would be asked to review the paper; I wanted to see their reaction.) In return, I'd send George feedback before the final version was due. George and his co-authors kindly agreed.

Now, about the student. Hilary Finucane was a standout student in my undergraduate algorithms class, and has continued to impress me since. She did her senior thesis, on "floating codes," with me as her advisor; the thesis is available here as a technical report, and some of this work is part of this Allerton paper (the "journal version" of which was just accepted to Transactions on Information Theory). So when Hilary, who was taking CS 222 as a senior, asked during the discussion why the authors were doing X when it seemed clear that Y would work better, I listened, and we sent a note to the authors; they very nicely acknowledged Hilary in the final version for her contribution.

Hilary and I discussed the paper further and felt we could improve on the analysis. We're making the results available in a preprint here; we'll be submitting a version somewhere soon. (This is the 2nd paper of today's post.) Besides giving an improved analysis, we also showed how competitive analysis can be used in this setting to help decide suitable parameters for implementations. (Despite my previous rants about competitive analysis, this seemed like an appropriate place for it!)

It's been a great pleasure having the chance to work with Hilary, who will be starting graduate school at Weizmann next year. While I've been influencing her to work on the more "practical" problems I'm interested in, I imagine she'll be moving back to problems she claims to like more -- more complexity theory and mathematically oriented problems -- as a graduate student. Given what she's been able to contribute and accomplish as an undergraduate, I'm really looking forward to seeing what she'll do in the future!

Wednesday, August 19, 2009

SIGCOMM 2009 Best Paper

Congratulations to Rohan Murty and Matt Welsh of Harvard, as well as their co-authors, for winning the Best Paper Award at SIGCOMM 2009. The announcement is here, and the paper info is here.

White Space Networking with Wi-Fi like Connectivity, by Paramvir Bahl (Microsoft Research), Ranveer Chandra (Microsoft Research), Thomas Moscibroda (Microsoft Research), Rohan Murty (Harvard University), and Matt Welsh (Harvard University).

Tuesday, August 18, 2009

SIGCOMM Test of Time Award 2009

I was informed that I and my co-authors (John Byers, Michael Luby, and Ashutosh Rege) will be receiving a SIGCOMM Test of Time Award for our paper "A digital fountain approach to reliable distribution of bulk data," from back in 1998. The official announcement seems to be up now.

As I was told in the official e-mail: "The Sigcomm Test-of-time award recognizes papers published 10 to 12 years in the past in Computer Communication Review or any SIGCOMM sponsored or co-sponsored conference that is deemed to be an outstanding paper whose contents are still a vibrant and useful contribution today."

I'd like to thank the SIGCOMM award committee for this recognition; it's an honor to receive this award. The only downside, as far as I can see, is that it hit me that I'm now old enough to receive a test-of-time award.

Monday, August 17, 2009

Program Committee Duties

Since I'm thinking about SIGCOMM (going on this week), even though I'm not sure the official CFP is out, I assume it's OK to announce that I'll be on the SIGCOMM PC for 2010. This means I'll be repeating both the NSDI and SIGCOMM PCs from last year.

My understanding is that this year they're not going to do the heavy-light split they've used in the recent past for SIGCOMM, where only a subset of the TPC attends the final PC meeting. Instead, there will just be the one committee. With about 50 people.

I admit to having some concerns about how a PC meeting with around 50 people is going to run -- but that's not my job. I'd also like to point out the approach here: around 300 submissions, 50 people on the PC, giving a review load of roughly 24 papers per person (assuming an average of 4 reviews per paper), and no sub-refereeing (although experts outside the PC will be asked for reviews as needed). That's very different from the process at many other conferences I've been involved with, particularly the FOCS/STOC/SODA model -- about the same number of submissions, a PC about half the size, and lots of subreferees.

I'm not saying one approach is better or worse than another -- perhaps they're both good for their respective domains -- but the differences are worth commenting on, if anyone wants to comment.

Finally, it's at least somewhat unlikely I'll be motivated to "3-peat" on the NSDI or SIGCOMM PC for 2011. I imagine I'm filling a "network algorithmist" slot on these PCs. If anyone has any good ideas for other names I can pass on to the PC chairs, feel free to comment (or send me mail directly).

Friday, August 14, 2009

Report from PODC 2009 (Guest Post, Aaron Sterling)

Aaron Sterling reports from PODC:

I'm in Calgary, where PODC 2009, the Twenty-Eighth Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, has just ended. Lorenzo Alvisi was the Program Chair and Srikanta Tirthapura was the General Chair. However, the clear consensus was that the heroes of the conference were Local Arrangements Chairs Lisa Higham and Philipp Woelfel, who ensured the event ran smoothly and raised a large amount of money from Canadian sources. This, for example, allowed the organizers to reduce the student registration fee to $120 and to provide significant student support on top of that. Impressive.

My favorite talk of the program was by invited speaker Bruce Hendrickson, who argued for the position that we are in a brand new situation with regard to parallelism in computing -- and not because of "multicore." Rather, he foresaw a growing demand for scientific computation by disciplines (like economics and sociology) that study graphs with fundamentally different structures from the physics-type problems that have previously been the bread and butter of scientific computing. This will require the creation of architectures that relate to memory significantly differently, because, to paraphrase Hendrickson, "Computation is limited by memory access, not by processor speed." At the same time, he saw a brand new situation in terms of desktop architecture and programming, because, for the first time ever, "Silicon is essentially free" -- meaning, a chip designer could encode circuitry used by everyone and still have enough space left over to lay down circuitry that would only be used by a small fraction of users, like customers interested in scientific computing. Hendrickson claimed adding this additional functionality would not significantly increase production cost, meaning we are in a brand-new period of architecture design.

Both Hendrickson and invited speaker Robbert van Renesse advocated teaching undergraduates much more about parallelism than is currently in the curriculum. Van Renesse presented a new method for explaining consensus algorithms step-by-step. The third invited speaker, Sarita Adve -- who also gave a superb presentation -- called more explicitly for the creation of new tools. In particular, she stated the need for new parallel programming languages that would, for example, prevent programmers from constructing data races.

Moving to the refereed program, Christoph Lenzen presented "Tight Bounds for Clock Synchronization," which won the Best Paper Award. This is joint work with Thomas Locher and Roger Wattenhofer, and is a tour de force, providing tight upper and lower bounds for (both global and local) clock synchronization in a network, finally solving a problem many authors have been working on, in some sense, since the 1980s. Keren Censor won the Best Student Paper Award for "Max Registers, Counters, and Monotone Circuits." This is joint work with Jim Aspnes and Hagit Attiya. I thought Censor gave the best presentation of the conference, out of all the refereed submissions. Her takeaway message was, "Lower bounds do not always have the last word," because she was able to "beat" a lower bound by investigating the proof, and then modifying the shared objects she used so they no longer had the problem the proof technique took advantage of.

Given the blog I'm guest-posting on, I cannot resist telling this story. Christoph and I met at DISC 2008, where we both presented papers. We hit it off, and learned that we both intended to submit to STOC 2009, so we laid down a friendly challenge that we would next see each other at STOC. As I posted here previously, my STOC submission was rejected; I upgraded it, submitted it elsewhere, and was quite happy with the result. Christoph's STOC submission was rejected too, and, sure enough, he resubmitted to PODC and won Best Paper for it. So, once again, rejection from a conference is far from the end of the world.

The business meeting was animated. The least controversial agenda item was the elections. Andrzej Pelc is the new Steering Committee chair (replacing Faith Ellen), and Jennifer Welch is the new Steering Committee Member-at-Large (replacing Michel Raynal). There was extensive discussion on nonbinding votes to give direction to the Steering Committee. Perhaps of most interest to readers of this blog, the business meeting voted overwhelmingly that full proofs of results should be included in future PODC submissions, in an appendix if appropriate. There was (unresolved) debate over whether to publish abstracts along with the list of accepted papers. Some people from industry were opposed to this, because they thought it might start the "patent clock" a few months earlier. Lastly, I will mention that there was discussion about whether to separate PODC from the ACM, because of concerns that ACM was not providing value for the amount of money it was charging the conference. No consensus was reached here, but there were definitely strong opinions.

PODC 2010 will be organized by Roger Wattenhofer in Zurich. I'm sure it will be great, but he has a tough act to follow.

Thursday, August 13, 2009

Skype as a Research Tool

Yesterday, I used Skype on three different research projects to sync up with my collaborators. I hadn't realized Skype was becoming a basic research tool for me, but I suppose that's a sign that it is.

Generally I wasn't even using the video features. Most of what I do with Skype I could do just as well on the phone. The conference call feature is very nice and easy to use, more so than on the standard phone system, but I still experience more drops on Skype than I would on a phone.

The real value of Skype, somehow, is that it's running on my computer, where I'm doing the research work as well. I can look through relevant old e-mails or read relevant documents while Skype keeps the conversation going. It all works more conveniently than juggling the phone and the computer. I expect that "long-distance" research will continue to become easier and to constitute more of my research time.

Wednesday, August 12, 2009

Harvard Group Wins Expeditions Grant

It's now public that Harvard has won one of the Expeditions grants in the latest round, with an engineering/computer science team that will build RoboBee. It's an exciting project that highlights the diversity and collaborative spirit of the faculty in the School of Engineering and Applied Sciences. While I'm not involved in the project, several of my colleagues are, and I congratulate them all. Here's the high-level description:

"A multidisciplinary team of computer scientists, engineers, and biologists at Harvard received a $10 million National Science Foundation (NSF) Expeditions in Computing grant to fund the development of small-scale mobile robotic devices. Inspired by the biology of a bee and the insect's hive behavior, the researchers aim to push advances in miniature robotics and the design of compact high-energy power sources; spur innovations in ultra-low-power computing and electronic "smart" sensors; and refine coordination algorithms to manage multiple, independent machines."

When talking to two of my colleagues about RoboBee, I quietly expressed that my only concern was that this is how Skynet started. Strangely, neither of them got the reference. Which only makes me more concerned.

Monday, August 10, 2009

SIGCOMM Papers Are Up

The SIGCOMM conference takes place next week, and if you want to look at the papers, here they are, nicely available online.

Since I was on the PC, it might be impolitic to provide detailed comments, but here are a few thoughts. First, the key words for this year's conference seem to be "Data Center". There are a lot -- a LOT -- of data center papers. I leave the implications for next year's submission strategy to you. Second, I feel scooped by the Univ. of Wisconsin-Madison group and their PLUG paper, on how to implement flexible high-speed lookup modules in hardware. It's an idea I've been batting around for a while, but the devil is in the details of the implementation. Congrats on the paper! Third, congrats to Matt Welsh (of Harvard) for his paper on "White Space Networking".

Sadly, I won't be going... while I love the city of Barcelona, it seemed like a long way to go given that I've already read most of the papers. Perhaps I should have made the trip and brought the family along, especially since next year's SIGCOMM is in New Delhi, which makes traveling to Barcelona seem easy by comparison.

Sunday, August 09, 2009

Publishing Goals - Diversity

For the first time, I'm a co-author on a graphics paper; the paper Real-Time Parallel Hashing on the GPU, with co-authors Dan Alcantara, Andrei Sharf, Fatemeh Abbasinejad, Nina Amenta, Shubhabrata Sengupta, and John D. Owens of the University of California, Davis, was (conditionally) accepted to SIGGRAPH Asia 2009. More on the paper in a later post.

This post is about why I'm excited about having a graphics paper, when I've never had one before. Early in your career, when you are just starting to write papers, the typical goal is to get papers into the top conferences in your area: theorists aim for FOCS/STOC/SODA, networking people aim for SIGCOMM, systems people aim for SOSP/OSDI, and so on. It's the way to establish your professional reputation.

The downside of this is that as a student you become focused on both the style and problem spaces of your "home" community. Some (most?) researchers become entrenched in their home community, and never really leave it. My biased view, though, is that when I look at the great computer scientists and mathematicians that I think of as role models, they've had a broad impact across multiple areas, and have crossed boundaries to publish outside what might be considered their comfort zone.

For quite some time, I've tried to follow that lead. So one of the goals I have is to publish in a variety of different conference areas -- really, as many as I can. (I realize it's taking the logic the wrong way -- great scientists didn't necessarily set out to do things in many fields, but their work was so great it naturally had impact in lots of places.) It's an apparently strange goal, in that I haven't heard others voice it, but it's one I enjoy. I optimistically like to think that I can make contributions in multiple areas, and it gives me the chance to get some insight into things going on in different areas. It also leads to fun collaborations.

Of course, just publishing a paper in a conference does not make you part of that conference community. While I've found it rewarding to write a paper for a new conference I've never been to, I've also found that when I have a stronger interest, I can get more out of a longer relationship. My personal rule of thumb is that once I've had three papers at a conference, I can consider myself part of that conference community; at that point, generally I'll know people, and they'll know me, and there's a feeling that my work is really part of a larger scientific picture going forward. (I've also found that's about the point where people start asking you to serve on the program committee...) So my goal is not just to publish single papers in as many conferences as I can, but also to explore and find interesting communities to work with.

Overall, I feel this path has served me well. I feel like I have several research home communities (in theory, networking, and information theory), and when I visit someplace to give a talk or interview a candidate, there is often an unusual connection to be found. I think I've created a positive feedback loop -- because I'm known to have a wide array of interests, people feel free to approach me with new problems, which I'm happy to encourage! [This graphics paper arose entirely because Nina Amenta wrote to ask me what's known about doing "modern" hashing on "modern" parallel architectures such as GPUs...] But there are certainly potential downsides. I know that, as I was coming up for tenure, this "lack of focus" (if one wants to label it negatively) was, at least for some, a problem to be overcome rather than a positive virtue. Graduate students and junior faculty may need to keep in mind how such forays will be viewed -- usually it seems to me a very small number is seen as a positive, but a large number isn't necessarily seen as a good thing.

I understand that for most people the goal is to publish FOCS/STOC papers, or SIGCOMM papers, or SOSP/OSDI papers -- that's what it means for them to do great work, and that works for them. My approach is different, so I'm excited by the novelty of having a SIGGRAPH Asia paper. While I wouldn't prescribe how people should do research, I'd encourage people to look for opportunities to work a bit outside their home community. And I think there's value there that tenure and hiring committees should recognize and acknowledge.

And by the way, I've never had a SOSP or OSDI paper -- if anyone has any good ideas of how that might be rectified, feel free to let me know.

Wednesday, August 05, 2009

STOC/FOCS complete versions [Guest Post by Mikkel Thorup]

I would like to congratulate the PCs that have requested that authors submit appendices with complete proofs. I would like to suggest going one step further: for accepted papers, I think these complete proofs should be posted as reports in a place like arXiv.

My basic interest is the case where STOC/FOCS deals with problems that are so important that we actually want them solved (contrasting with cases that are just about exchanging conceptual ideas). In that case, the most damaging conference abstract is one that claims a solution but gives an incomplete sketch of a proof that can neither be verified nor falsified, and where the authors never themselves provide the details. Since the credit is already given out, there is no incentive left to really solve the problem. I have seen good research areas killed this way. The situation is much better when a paper has a clear bug, for then it is well justified to look for a true solution. This is why I would like the complete proofs posted on arXiv. It is OK that they are not so well written. The important thing is that we have a place where people can go and check the details if in doubt.

I think this would have some positive side effects. First of all, I do not think people should claim solutions to important problems if they do not have complete proofs. The PC may not have time to read the complete versions, but knowing that your complete version will be public will most likely make people think twice before submitting (which I think is positive).

Having the first complete versions along with comments and questions should also pave the way for more and better journal versions.

Sunday, August 02, 2009

Blogging for August

A commenter noted I haven't posted much to the blog for July. Happily, there's no dark secret underlying the slow pace of posting -- things are simply quieter (no students, faculty disappearing), and yet at the same time busier (research!), over the summer. There's plenty of stuff in the pipeline to eventually write about, but many things aren't quite ripe enough to post about yet. (Those of you reviewers sitting on my journal submissions and revisions, which it seems like you've had for a loooong time, please try to get through them, so I have one more thing to blog about...)

Indeed, this summer has, if anything, seemed busier with research than I remember from summers past. Like all faculty, I dread going through the explanations to neighbors/friends/relatives/people you meet at a barbeque who ask what you're doing during the off time in the summer, since summer is actually the time when, as I like to tell the undergraduates, the "real work gets done." (I leave it to the students to decide how seriously to interpret the statement.) Summer is always filled with the opportunity to start new projects and learn new things, finish off journal versions that have languished too long, and tie up things in time for summer and early fall deadlines. This summer has been filled with all of the above; we'll see what interesting things come out of it all, hopefully soon.

So I'll start trying to work on producing more posts, but in the meantime, have a look at this week's Chronicle of Higher Education, which has yet more on the Gates affair, a commentary on the University of California cuts (well, focused on Chinese history, but still), and news about the BU student who just had a judgment of $675,000 handed down against him for illegal music downloads. Or check out those NSF deadlines: medium grants for CISE cross-cutting programs, CNS, and CCF are all due at the end of August! (Yikes! Grant proposals -- the other job of summer...)