Wednesday, February 11, 2015

New Heapable Subsequence Paper

In the "only a dozen people could care about this category"...

About 4 1/2 years ago, I posted about a paper we had put up on the arxiv about Heapable Sequences and Subsequences.  The basic combinatorial structure we were looking at is a seemingly natural generalization of the idea of Longest Increasing Subsequences.  Say that a sequence is heapable if you can sequentially place the items into a (binary, increasing) heap, so each new item is placed as the child of some item already in the heap (the new item must be at least as large as its parent, and items are never moved once placed).  So, for example, 1 4 2 3 5 is heapable, but 1 5 3 4 2 is not.  Once you have this idea, you can ask about things like the Longest Heapable Subsequence of a sequence (algorithms for it, expected length with a random permutation, etc.).  Our paper had some results and lots of open questions.
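
To make the definition concrete, here is a small sketch of a greedy heapability check: keep track of the values of the free child slots, and attach each new element to the largest free slot whose value is at most the element.  (The function name is mine, and "use the largest eligible slot" is my reading of a safe greedy rule; treat this as an illustration of the definition rather than as the paper's algorithm.)

```python
import bisect

def is_heapable(seq):
    """Greedy heapability check: attach each new element to the largest
    free child slot whose value is at most the element.  `slots` is a
    sorted list of the values of all currently free child slots."""
    if not seq:
        return True
    slots = [seq[0], seq[0]]        # the root offers two child slots at its value
    for x in seq[1:]:
        i = bisect.bisect_right(slots, x)   # slots[:i] can all accept x
        if i == 0:
            return False            # no placed element can be x's parent
        slots.pop(i - 1)            # use the largest eligible slot
        bisect.insort(slots, x)     # x now offers two child slots of its own
        bisect.insort(slots, x)
    return True

print(is_heapable([1, 4, 2, 3, 5]))  # True
print(is_heapable([1, 5, 3, 4, 2]))  # False
```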

I admit, when we did this paper I was hoping that some combinatorialist(s) would find the notion compelling, take up the questions, and find some cool connections.  Longest Increasing Subsequences are somehow related to Young tableaux, interacting particle systems, and all sorts of other cool things.  So what about Longest Heapable Subsequences?

I had to wait a few years, but Gabriel Istrate and Cosmin Bonchis recently put a paper up on the arxiv that makes these connections.  Here's the abstract:
We investigate partitioning of integer sequences into heapable subsequences (previously defined and established by Mitzenmacher et al). We show that an extension of patience sorting computes the decomposition into a minimal number of heapable subsequences (MHS). We connect this parameter to an interactive particle system, a multiset extension of Hammersley's process, and investigate its expected value on a random permutation. In contrast with the (well studied) case of the longest increasing subsequence, we bring experimental evidence that the correct asymptotic scaling is ((1+sqrt{5})/2) ln(n). Finally we give a heap-based extension of Young tableaux, prove a hook inequality and an extension of the Robinson-Schensted correspondence.
(Note: that should really be "Byers et al.")

I love the new conjecture that the expected minimal number of heapable subsequences a random sequence decomposes into is ((1+sqrt{5})/2) ln n.  (It's clearly at least ln n, the expected number of left-to-right minima in the sequence, since each left-to-right minimum is smaller than everything before it and so must become the root of a new heap.)
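
For anyone who wants to poke at the conjecture numerically, here is a small simulation sketch.  It uses a natural greedy decomposition: pool the free child slots across all heaps, attach each element to the largest free slot that can take it, and start a new heap only when forced.  I believe this greedy is in the spirit of the patience-sorting extension in the paper, but treat the code (and the function name) as my own illustration rather than their algorithm.

```python
import bisect
import math
import random

def greedy_heapable_decomposition(seq):
    """Partition seq into heapable subsequences greedily and return how many
    are used: pool the free child slots of all heaps, attach each element to
    the largest free slot whose value is at most the element, and start a
    new heap only when no slot can take it."""
    slots = []      # sorted values of all free child slots, across all heaps
    heaps = 0
    for x in seq:
        i = bisect.bisect_right(slots, x)
        if i == 0:
            heaps += 1              # nothing can parent x: it roots a new heap
        else:
            slots.pop(i - 1)        # x fills the largest eligible slot
        bisect.insort(slots, x)     # either way, x offers two child slots
        bisect.insort(slots, x)
    return heaps

# Rough check against the conjectured ((1+sqrt{5})/2) ln n scaling.
n, trials = 10_000, 20
avg = sum(greedy_heapable_decomposition(random.sample(range(n), n))
          for _ in range(trials)) / trials
print(avg, (1 + math.sqrt(5)) / 2 * math.log(n))
```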

There are still all sorts of open questions that seem surprisingly difficult, and I certainly can't claim to know of any important practical applications.  But Longest Heapable Subsequences just appeal to me as a simple, straightforward mathematical object that I wish I understood better.

As for simple-sounding but apparently difficult open questions: as far as I know, even the basic question of "What is the formula for how many sequences of length n are heapable?" remains unanswered.  Similarly, I think the question of finding an efficient algorithm for determining the Longest Heapable Subsequence (or showing it is hard for some class) is open as well.
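
If you want to get a feel for the counting question, brute force goes a reasonable way for small n: enumerate permutations and test each one with the same greedy check sketched earlier (repeated compactly here so the snippet runs on its own).

```python
import bisect
from itertools import permutations

def is_heapable(seq):
    """Same greedy sketch as above: attach each element to the largest free
    child slot whose value is at most the element."""
    slots = []
    for x in seq:
        i = bisect.bisect_right(slots, x)
        if slots and i == 0:
            return False            # some element precedes x, but none can parent it
        if i > 0:
            slots.pop(i - 1)        # use the largest eligible slot
        bisect.insort(slots, x)     # x offers two child slots of its own
        bisect.insort(slots, x)
    return True

# Count heapable permutations of {1, ..., n} for small n by exhaustive search.
for n in range(1, 8):
    count = sum(is_heapable(p) for p in permutations(range(1, n + 1)))
    print(n, count)
```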