Sunday, April 19, 2009

Christos a înviat!

Happy Easter!  Christos a înviat! (Christ is risen!)

Tuesday, April 14, 2009

Theory and Practice

Via David Eppstein, I find out about the Workshop on Theory and Multicores. A memorable quote from the call:

[...] Indeed, the programs of many mainstream computer science conferences, such as ASPLOS, DAC, ISCA, PLDI and POPL are heavily populated with papers on parallel computing and in particular on many-core computing. In contrast, the recent programs of flagship theory conferences, such as FOCS, SODA and STOC, hardly have any such paper. This low level of activity should be a concern to the theory community, for it is not clear, for example, what validity the theory of algorithms will have if the main model of computation supported by the vendors is allowed to evolve away from any studied by the theory. The low level of current activity in the theory community is not compatible with past involvement of theorists in parallel computing, and to their representation in the technical discourse.
Which theory community? The one that thinks our machine models are not robust enough to tell the difference between O(n lg n) and O(n^5)? The one that thinks n^O(1/ε^2) is polynomial and even efficient? The one that thinks Amazon should be happy with an O(lg^3 n) approximation for its profit? The one that thinks the coolest conclusions we ever came to are the philosophical implications of interactive proofs and zero knowledge?

The theory community will do just fine, thank you very much.

As for "the low level of current activity" being incompatible "with past involvement of theorists in parallel computing" --- it is exactly the past involvement that led to the current attitude towards parallel computing! Parallel computing is the fabled field where we got burned badly, and the first example people use to argue that theory is still very young. (It usually goes like this: "Imagine an equivalent of Perelman disappearing for 20 years and coming back with an answer to the most burning question about PRAMs from 1985. Who would care now?").

It's going to be hard to regain enthusiasm about parallel computing, as timely as the moment may be.

But at least we learned our lessons. Never again will a hyped-up mob of theorists rush head-on to study a machine model too early. Never will we study a machine model before such computers are even built, and while the practicality of such computers is still debated. We understand that producing a theoretical model too early will have us chasing the wrong rabbits, and that our results might be as relevant as the 5-state 17-symbol universal Turing Machine was in the design of the Pentium.


Alright, enough of this. I should go back to reading about 3-round quantum interactive proofs with entanglement.

Wednesday, April 8, 2009

Puzzle: Short-hop permutations

Count the number of permutations π of size n, such that |π(i) - π(i+1)| ≤ k, for all i. That is, the hop between the values at any two consecutive positions must be short.


I am asking for an algorithm, not a closed formula. (Combinatorialists: is one known?)

Your algorithm should work fast for values like n = 10^6 and k = 8. I have not thought hard about supporting larger k (say, k = 20), but I would certainly be interested to hear such a solution if you have one.
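If you want to sanity-check a candidate algorithm, here is a minimal brute-force counter (my own helper, not part of the original problem); it enumerates all n! permutations, so it is usable only for tiny n:

```python
from itertools import permutations

def count_short_hop_brute(n, k):
    """Brute force: count permutations p of {1..n} with |p[i]-p[i+1]| <= k
    at every consecutive position.  Exponential time; sanity checks only."""
    return sum(
        all(abs(p[i] - p[i + 1]) <= k for i in range(n - 1))
        for p in permutations(range(1, n + 1))
    )

# n=4, k=1: only 1,2,3,4 and 4,3,2,1 qualify.
print(count_short_hop_brute(4, 1))  # 2
```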

Credits. This problem was used in a selection contest for the Romanian olympic team. The original author was Marius Andrei. My subjective judgment of its difficulty is "average+" (it was tricky, but I solved it within 10 minutes). YMMV.

CC4: One-Way Communication and a Puzzle

As we will discuss in the next post, for some applications it suffices to get lower bounds on the one-way communication complexity -- that is, the case in which a single message is sent, either from Alice to Bob, or from Bob to Alice. The party receiving the message must immediately announce the answer, based on the message and its own input.


Referring to our INDEXING lower bound in CC3, we can immediately obtain a one-way lower bound by setting a=0 (one message Bob→Alice) or b=0 (one message Alice→Bob). We conclude that Alice must send Ω(n) bits in the A→B model, and that Bob must send Ω(lg n) bits in the B→A model. The former case is much more interesting in applications, so when people say "the one-way communication complexity of INDEXING," they always mean the Alice→Bob model.

One-way communication is often simple to analyze, and we can obtain lower bounds by direct appeal to information theory. Here is a short and simple proof of the INDEXING lower bound:

Lemma. The one-way communication complexity of INDEXING is Ω(n).

Proof. Assume there exists a protocol in which Alice sends n/10 bits, which works with probability (say) 0.99 over the uniform distribution. We use this to construct an encoder for a random vector of n bits. We show that the average length of the encoding is strictly less than n, which is a contradiction because the entropy of the random bit string is n.

Say we want to encode A[1..n]. We give Alice the input A, and include her message M in the encoding. Then, we specify the set of indices i, such that, if Bob has input i and receives message M from Alice, he will give an incorrect answer to INDEXING. This is the entire encoding.

Claim 1: we can decode A from the encoding. To accomplish this, simulate the action of Bob for every possible i. For every index in the bad set, A[i] is the opposite of what Bob says.

Claim 2: the average size of the encoding is at most 0.85 n. The first component (Alice's message) is n/10 in size by assumption. The second component has size lg(n choose k), where k is the number of bad indices for a particular A. How do we analyze the expected size of this component?
  • E_A[k] = 0.01n. Indeed, for uniformly random A and i, the algorithm works with probability 0.99. So in expectation over A, the bad set of indices has size 0.01n.
  • By the Markov bound, Pr_A[k ≤ 0.02n] ≥ 1/2.
  • When k ≤ 0.02n,  lg(n choose k) ≤ n/2.
  • When k ≥ 0.02n,  lg(n choose k) ≤ n.
  • Therefore, E[lg(n choose k)] ≤ (1/2)·(n/2) + (1/2)·n = 0.75n.
Therefore, the total size of the encoding is at most 0.85n in expectation, a contradiction.
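In one line, Claim 2 plus the entropy bound give the contradiction:

\[
\mathrm{E}[\text{encoding length}] \;\le\; \frac{n}{10} + \mathrm{E}\!\left[\lg \binom{n}{k}\right] \;\le\; \frac{n}{10} + \frac{1}{2}\cdot\frac{n}{2} + \frac{1}{2}\cdot n \;=\; 0.85\,n \;<\; n \;=\; H(A).
\]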

A puzzle. Since this blog has featured algorithmic puzzles recently, it's time for a lower bound puzzle:
Say Alice and Bob each receive a set of n elements from the universe [n^2]. Show that the one-way randomized communication complexity of set disjointness is Ω(n lg n) bits.
This is the best result possible, since Alice can specify her entire set with O(n lg n) bits, after which Bob knows the answer with no error. 

In fact, the O(n lg n) upper bound holds even for large universes: use a universal hash function from the universe to a range of size 100n^2. The hash function introduces zero collisions with probability 99%, so it does not introduce false positives into the intersection.
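As a concrete illustration, here is a sketch of this hashing protocol in Python; the Carter-Wegman hash with the prime p = 2^61 - 1 is my choice (any prime exceeding the universe works), and shared randomness lets Bob evaluate the same hash function:

```python
import random

def one_way_disjoint(alice_set, bob_set):
    """Sketch of the O(n lg n) one-way protocol for set disjointness.
    Alice hashes her n elements into a range of size 100*n^2 and sends the
    hashed set; each hash takes lg(100 n^2) = O(lg n) bits.  With probability
    99% the hash has no collisions, and then the answer is exact; a collision
    can only make disjoint sets look intersecting."""
    n = len(alice_set)
    p = 2 ** 61 - 1                         # a prime above the universe (my choice)
    a = random.randrange(1, p)
    b = random.randrange(p)                 # shared randomness: Bob knows a, b too
    h = lambda x: ((a * x + b) % p) % (100 * n * n)   # Carter-Wegman hash
    message = {h(x) for x in alice_set}     # the only message sent
    return all(h(y) not in message for y in bob_set)  # True = "disjoint"

# Two disjoint 5-element sets from the universe [25]:
print(one_way_disjoint({1, 7, 12, 20, 24}, {2, 8, 13, 21, 23}))  # True (whp)
```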

Acknowledgements. Thanks to Mert Sağlam for telling me about this question. As far as I know, the proof is folklore. David Woodruff wrote an email describing this result many years ago.


Monday, April 6, 2009

Puzzle: Cycle Containing Two Nodes

The following simple puzzle was circulating among Romanian olympiad participants around 1998. It was supposed to be a quick way to tell apart algorithmists from mere programmers.

Given an undirected graph G, and two vertices u and v, find a cycle containing both u and v, or report that none exists. Running time O(m).
Update: Simple ideas based on a DFS (like: find one path, then find another) do not work. Think of the following graph: vertices s, a, b, t, with edges s–a, a–b, b–t, s–b, and a–t (here s and t play the roles of u and v).
If you first find the path s → a → b → t, you will not find a second path, even though the cycle s → a → t → b → s exists.

The one-line answer is: try to route two units of flow from s to t in the unit-capacity graph (with unit node capacities if you want simple cycles). This is not the same as two DFS searches, because the second DFS runs in the residual graph (it may traverse edges of the first path in reverse).
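Here is a minimal sketch of this flow-based solution (the vertex encoding and helper names are mine): node splitting enforces the unit node capacities, and two BFS augmentations in the residual graph, each O(m), decide the question.

```python
from collections import defaultdict, deque

def cycle_through(n, edges, s, t):
    """Sketch: is there a simple cycle through both s and t?  Equivalently
    (by Menger), are there two internally node-disjoint s-t paths?  Route two
    units of flow with unit node capacities, via node splitting: vertex w
    becomes the arc w_in = 2w -> w_out = 2w+1 of capacity 1."""
    cap = defaultdict(int)
    adj = defaultdict(set)

    def arc(x, y, c):
        adj[x].add(y)
        adj[y].add(x)              # register the reverse residual arc (cap 0)
        cap[(x, y)] += c

    for w in range(n):
        arc(2 * w, 2 * w + 1, 2 if w in (s, t) else 1)  # s, t carry both paths
    for x, y in edges:             # each undirected edge, in both directions
        arc(2 * x + 1, 2 * y, 1)
        arc(2 * y + 1, 2 * x, 1)

    source, sink = 2 * s + 1, 2 * t    # from s_out to t_in

    def augment():                 # push one unit: BFS on residual capacities
        parent = {source: None}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            if x == sink:
                while parent[x] is not None:   # update the residual graph
                    px = parent[x]
                    cap[(px, x)] -= 1
                    cap[(x, px)] += 1
                    x = px
                return True
            for y in adj[x]:
                if y not in parent and cap[(x, y)] > 0:
                    parent[y] = x
                    queue.append(y)
        return False

    return augment() and augment()     # a cycle exists iff the flow value is 2

# The 5-edge example above, with s, a, b, t encoded as 0, 1, 2, 3:
print(cycle_through(4, [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)], 0, 3))  # True
```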


About the previous puzzle: As many people noticed, Alice can guarantee a win or a draw. She computes the sum of the elements at odd positions and the sum at even positions. Depending on which is higher, she plays only odd positions or only even positions. (Bob has no choice, since the subarrays he's left with always have ends of the same parity.)

But how do you compute the optimal value for Alice? If the sum of even and odd is equal, how can Alice determine whether she can win, or only draw? A simple dynamic program running in O(n^2) time works. Can you solve it faster?
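For reference, a sketch of that dynamic program, assuming the puzzle is the usual game where the players alternately take an element from either end of the array and maximize their own totals:

```python
def alice_advantage(a):
    """Sketch of the O(n^2) interval DP.  best[i][j] is the maximum of
    (current player's total - opponent's total) on the subarray a[i..j],
    with both players playing optimally."""
    n = len(a)
    best = [[0] * n for _ in range(n)]
    for i in range(n):
        best[i][i] = a[i]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best[i][j] = max(a[i] - best[i + 1][j],   # take the left end
                             a[j] - best[i][j - 1])   # take the right end
    return best[0][n - 1]   # > 0: Alice wins;  = 0: draw;  < 0: Alice loses

# Alice's optimal total is (sum(a) + alice_advantage(a)) // 2 for integers.
print(alice_advantage([1, 2, 3]))  # 2: take 3, then take 1
```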

Friday, April 3, 2009

CC3: the good, the bad, the ugly

This is the 3rd post in the thread on communication complexity, following Episodes 1 and 2.


This post is also available in PDF.

In today's episode, we will prove our first randomized/distributional lower bound. We will consider a problem that appears quite trivial at first sight -- INDEXING, defined as follows:
  • Alice receives a bit vector,  x ∈ {0,1}^n
  • Bob receives an index,  y ∈ [n]        (note standard notation: [n] = {1,...,n})
  • their goal is to output x_y, i.e. the y-th bit of the vector x.
One way to view INDEXING is as the simplest case of lopsided set disjointness, where Bob's set has size one: Alice receives the set S = { i | x_i = 1 }, and Bob receives T = { y }.

Intuition. What trade-off should we expect between Alice's communication a, and Bob's communication b? Clearly, the problem can be solved by [a=1, b=lg n] and by [a=n, b=1]. 

In between these two extremes, the best use of b seems to be for Bob to send b bits of his index. Alice now knows the index to be in a set of n/2^b possibilities. She can simply send all her bits at these positions, using communication a = n/2^b. Finally, Bob announces the answer with one more bit.
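As a toy illustration of this trade-off protocol (the 0-based indexing is my convention):

```python
def indexing_protocol(x, y, b):
    """Toy simulation of the trade-off protocol for INDEXING.  Bob sends the
    b-bit block number of his index; Alice replies with her ~n/2^b bits in
    that block; Bob reads off x_y."""
    n = len(x)
    size = -(-n // 2 ** b)                 # block length: ceil(n / 2^b)
    block = y // size                      # Bob -> Alice: b bits
    reply = x[block * size : (block + 1) * size]   # Alice -> Bob: <= size bits
    return reply[y % size]                 # Bob announces the answer

x = [0, 1, 1, 0, 1, 0, 0, 1]
print(indexing_protocol(x, 6, 2))  # 0, i.e. x_6, after only 2 + 2 bits
```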

We thus showed the upper bound a ≤ n/2^(b-1). It should be fairly intuitive that this is also a tight lower bound (up to constants). Indeed, no matter what Bob communicates, his index will be uncertain within a set of n/2^b possibilities (on average). If Alice sends less than n/2^(b+1) bits of information, at least half of the possible index positions will not have a specified answer based on Alice's message. In other words, the protocol fails to determine the answer with constant probability (i.e. makes constant error).


Distributions. Before proving a distributional lower bound, we must find the distribution that makes the problem hard. From the intuition above, it should be clear that the right distributions are uniform, both for Alice's vector and for Bob's index.


Rectangles. We are in the situation of product distributions: the inputs of Alice and Bob are independent. This is a very good situation to be in, if you remember the main take-home lesson from Episode 1: rectangles. Recall that any fixed communication transcript is realized on a combinatorial rectangle A×B, where A is a subset of Alice's possible inputs, and B is a subset of Bob's possible inputs. The main ideas of a deterministic proof were:
  • little communication from Alice means A is large;
  • little communication from Bob means that B is large;
  • but the rectangle must be monochromatic (the answers must be fixed, since the communication is over)
  • if A and B are large, you must have both yes-instances and no-instances.
For a product distribution, we will perform essentially the same analysis, given that the densities μ(A) and μ(B) can be measured independently. The only difference will be that the rectangle is not monochromatic, but almost monochromatic. Indeed, if the protocol may make some errors, the answer need not be fixed in the rectangle. But it must happen that one answer (yes or no) is much more frequent -- otherwise the error is large.

The first three steps of the analysis are formalized in the following randomized richness lemma, from the classic paper of [Miltersen, Nisan, Safra, Wigderson STOC'95]:

Lemma. (randomized richness)  Say Alice and Bob compute a function f : X×Y → {0,1} on a product distribution over X×Y. Assume that:
  • f is dense, in the sense that E[f(x,y)] ≥ α ≥ 1/5;
  • the distributional error is at most ε;
  • Alice communicates a bits, and Bob b bits.
Then, there exists a rectangle A×B (A ⊆ X, B ⊆ Y) satisfying:
  • Alice's side is large: μ(A) ≥ 2^(-O(a));
  • Bob's side is large: μ(B) ≥ 2^(-O(a+b));
  • the rectangle is almost monochromatic: E[f(x,y) | x ∈ A, y ∈ B] ≥ 1 - O(ε).
Proof: Though the statement is very intuitive, the proof is actually non-obvious. The obvious approach, which fails, would be some induction on the bits of communication: fix more bits of the communication, making sure the rectangle doesn't decrease too much, and the error doesn't increase too much in the remaining rectangle.

The elegant idea is to use the deterministic richness lemma. Let F be the output of the protocol (what Alice and Bob answer). We know that f and F coincide on a 1-ε fraction of the inputs. By definition, the protocol computes F deterministically with no error (duh!). It is also clear that F is rich, because it is dense: E[F] ≥ α - ε.

So apply the deterministic richness lemma from Episode 1. We get a large rectangle of F in which the answer is one. But how do we know that f is mostly one in the rectangle? It is true that F and f differ on only an ε fraction of the inputs, but that ε might include this entire rectangle that we chose! (Note: the rectangle size is ~ 2^(-a) × 2^(-b), so much, much smaller than some constant ε. It could be filled entirely with errors.)

We were too quick to apply richness on F. Now define G, a cleaned-up version of F. Consider some transcript of the protocol, leading to a rectangle A×B. If the answer is zero, let G=0 on A×B. If the answer is one, but the error inside this rectangle is more than 10ε, also let G=0. Otherwise, let G=1 on the rectangle.

How much of the truth table gets zeroed out because of excessive error (above 10 ε)? Well, the overall average error is ε, so we can apply a Markov bound to it: the mass of all rectangles in which the error exceeds 10 ε is at most 1/10.

Thus, G is also fairly dense: E[G] ≥ E[F] - 1/10 ≥ α - ε - 1/10 ≥ 1/10 - ε. Thus, G is rich, and we can find a big rectangle in which it is identically one. But on that rectangle, E[f] ≥ 1 - 10ε by construction of G.


The good, the bad, the ugly. It remains to prove that in any large rectangle, the fraction of zeros must be non-negligible (the rectangle cannot be almost all ones). This part is, of course, problem-specific, and we shall do it here for INDEXING.

Unfortunately, these proofs are somewhat technical. They typically apply a number of "pruning" steps on the rectangle. An example of pruning on the space of rectangles was seen above: we zeroed out all rectangles that had more than 10 ε error. In these proofs, we throw out rows and columns we don't like for various reasons. One usually makes many funny definitions, talking about "good rows", "bad rows", "ugly columns", etc.

While looking at such proofs, it is important to remember the information-theoretical intuition behind them. After you understand why the statement is true (handwaving about the information of this and the entropy of that), you can deal with these technicalities on the night before the STOC/FOCS deadline.

Here is how the proof proceeds for INDEXING:

Lemma: Consider any A ⊆ {0,1}^n and B ⊆ [n] such that |A| ≥ 2^(n+1) / 2^(|B|/2). Then E_{x∈A, y∈B}[f(x,y)] < 0.95.

Proof: Assume for contradiction that Pr_{x∈A, y∈B}[f(x,y) = 0] ≤ 1/20. Call a row x ∈ A ugly if Pr_{y∈B}[f(x,y) = 0] > 1/10. At most half of the rows are ugly, by the Markov bound. Discard all ugly rows, obtaining A' ⊆ A with |A'| ≥ |A|/2.

For every remaining x ∈ A', we have x_y = 0 for at least 0.9|B| of the indices y ∈ B (recall that f(x,y) = 1, i.e. "disjoint," exactly when x_y = 0). Call these the good indices.

There are (|B| choose 0.9|B|) = (|B| choose 0.1|B|) choices for the set of good indices. For every such set, there are at most 2^(n - 0.9|B|) vectors x which are zero on the good indices. Thus:
|A'|  ≤  (|B| choose 0.1|B|) · 2^(n - 0.9|B|)  ≤  2^n · 2^(O(0.1·|B|·lg 10)) / 2^(0.9|B|)  =  2^n / 2^(0.9|B| - O(0.1·|B|·lg 10))
This is at most 2^n / 2^(0.5|B|), at least for a sufficiently small value of 1/10 (I never care to remember the constants in the binomial inequalities). We have reached a contradiction with the lemma's guarantee that A is large.

Combine this with the richness lemma, applied with error ε = 0.05. We have |A| ≈ 2^(n-a) and |B| ≈ n/2^b, so it must be the case that a = Ω(|B|) ≥ n/2^(O(b)). We have obtained a tight lower bound.
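To spell out the final step: for small enough ε, the rectangle produced by the richness lemma satisfies E[f] ≥ 0.95, so by the lemma above it must violate the size condition, i.e.

\[
2^{\,n-O(a)} \;\le\; |A| \;<\; \frac{2^{\,n+1}}{2^{|B|/2}}
\quad\Longrightarrow\quad
a \;=\; \Omega(|B|) \;=\; \Omega\!\left(\frac{n}{2^{O(b)}}\right).
\]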

Wednesday, April 1, 2009

Matching in Regular Bipartite Graphs

It is well known that d-regular bipartite graphs have a perfect matching. Applying this iteratively, it even follows that such graphs can be decomposed into the union of d perfect matchings. (The existence of a matching follows from Hall's theorem, which is probably beaten into students during any combinatorics course.)


But how easy is it to find a perfect matching in a d-regular bipartite graph? The following classic gem finds a matching in O(m) time --- but, oddly enough, it assumes d is a power of two!
Find an Eulerian circuit of the graph in O(m) time, which exists for any even d. Now throw out the 2nd, 4th, 6th, etc edges of the circuit. You are left with a graph that is (d/2)-regular. Recurse to find a matching.
Why do you get a regular graph when you throw out all edges at even positions? Think about it: the edges of the circuit alternate between the two sides of the graph. For any vertex v, its d incident edges are grouped into d/2 pairs of consecutive edges (each visit of the circuit enters and leaves v). One edge of each pair appears at an odd position in the circuit, and one at an even position.
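Here is a compact sketch of the full algorithm (my reconstruction, not the authors' code), using Hierholzer's algorithm to trace the Euler circuit of each component:

```python
def euler_edges(adj, ptr, used, start):
    """Iterative Hierholzer: returns the circuit through `start` as a list of
    edge indices, in reverse traversal order (which preserves the odd/even
    alternation, since the circuit of a bipartite graph has even length)."""
    stack, order = [(start, -1)], []
    while stack:
        v, e = stack[-1]
        while ptr[v] < len(adj[v]) and used[adj[v][ptr[v]][1]]:
            ptr[v] += 1                    # skip edges already traversed
        if ptr[v] == len(adj[v]):
            stack.pop()
            if e >= 0:
                order.append(e)
        else:
            w, idx = adj[v][ptr[v]]
            used[idx] = True
            stack.append((w, idx))
    return order

def regular_matching(edges, num_vertices, d):
    """Sketch of the halving algorithm for d a power of two.  `edges` lists
    the (u, v) pairs of a d-regular bipartite multigraph on vertices
    0..num_vertices-1.  Returns a perfect matching as a list of edges."""
    while d > 1:
        adj = [[] for _ in range(num_vertices)]
        for idx, (u, v) in enumerate(edges):
            adj[u].append((v, idx))
            adj[v].append((u, idx))
        used = [False] * len(edges)
        ptr = [0] * num_vertices
        kept = []
        for start in range(num_vertices):  # one circuit per component
            circuit = euler_edges(adj, ptr, used, start)
            # keep alternate edges: every vertex keeps exactly d/2 of them
            kept += [edges[i] for pos, i in enumerate(circuit) if pos % 2 == 0]
        edges, d = kept, d // 2
    return edges

# A 2-regular example: a 4-cycle between left side {0,1} and right side {2,3}.
print(regular_matching([(0, 2), (2, 1), (1, 3), (3, 0)], 4, 2))  # [(3,0), (2,1)]
```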

This algorithm is due to Gabow and Kariv [STOC'78]. By contrast, obtaining a linear-time algorithm for d not a power of two took until 2001 [Cole-Ost-Schirra]. In SODA'09, [Goel-Kapralov-Khanna] achieved sublinear algorithms for d greater than sqrt(n). The idea is that matchings are so frequent that you can still find one (with care) in a random sample of the graph. 

My interest is not so much with the sublinear complexity, but more with matching in general bipartite graphs (not regular). After you see this beautiful algorithm for matching in regular graphs, it is hard not to feel a bit optimistic that general bipartite matching may also be solvable in O(m) time.