I’ve been reading Mermin’s account of Deutsch’s Problem, and it’s given me a new insight into where the power of quantum computing comes from.
If you look at the original problem, you have a function f(x) from {0,1} to {0,1}. There are 4 such functions.
Function | f(0) | f(1) |
f_1 | 0 | 0 |
f_2 | 0 | 1 |
f_3 | 1 | 0 |
f_4 | 1 | 1 |
What can you learn by calling the function once? If you call it with 0 as input, you separate 1,2 from 3,4. By calling f(1) you separate 1,3 from 2,4.
Can you separate 1,4 from 2,3? In other words, can you answer the question “Does f(0) equal f(1)?” with a single function call? It’s impossible: Classically! (Try it and see)
In classical computing you have some bits and the only thing you can do to them is flip em from 0 to 1 and back and permute them around.
Quantum bits have additional properties. They can be rotated! More precisely, if you consider 0 and 1 to be two orthonormal vectors in a 2-dimensional complex vector space, quantum bits can be any unit vector in that space. The idea of a complex linear combination (of the computational basis vectors 0 and 1) is something that makes no sense in classical terms. Yet we can do operations with those combinations, rotate back into one of the classical bits, and get out a classical answer. Deutsch’s algorithm (explained on Wikipedia) uses Hadamard gates, which let you apply the function f (once!) to the qubit represented by . After some more hadamard gates you get to measure a bit which is 1 if f(0) equals f(1) and 0 otherwise. Note that you do not get to know either f(0) or f(1), so you haven’t magically gained more information from your single function call, just different information which you could think of as “perpendicular” to the classical way of seeing things.
I just realised that there is something which forms a useful analogy for the power of quantum computing: the kernel trick in SVMs!
Support vector machines work by taking dot products of data vectors with special vectors called support vectors. We might have some data which may not be linearly separable in the space we’re working in, but might be separable in a higher dimensional space. In other words, there might be some fancy function such that working with lets us linearly separate things even though doesn’t work. The kernel trick is that we can actually compute from without having to compute or even work in the high-dimensional -space at all. So what we do is we take the dot product (linear), and compute some nonlinear function of it (say we take the square). Magically, this function turns out to be the dot product of for some function !
Of course, the caveat is that you can’t get just any function this way, but in practice you’ll easily find some function which lets you separate the data.
To work through an example of what I described above: Let . Then . Suppose our kernel function was . Then we get ,which can be written as . But this is equal to , where is the non-linear function !
So using kernel functions, by sacrificing other information to only learn about dot products, we’ve managed to leverage the power of computing in the -space, without actually needing to explicitly compute .
Analogously in quantum computing, interference among complex linear combinations of classical bits let us compute (classical) answers to questions that cannot be answered using only classical operations. Note that we do not (and cannot) get “other” information like the complex coefficients of the qubit.
Kernel Trick | Quantum Computing |
Classical bit ( 0 or 1) | |
Qubit in quantum superposition () | |
Classical indicator function | |
Quantum circuit |