A Quantum distance-based classifier (part 3)

Robert Wezeman, TNO

Table of Contents

\[\newcommand{\ket}[1]{\left|{#1}\right\rangle}\]
[1]:
## Import external python file
from data_plotter import DataPlotter  # for easier plotting
DataPlotter = DataPlotter()

# Import math functions
import numpy as np
from math import acos, atan, sqrt
from numpy import sign

Introduction

This notebook is the third in the series on the quantum distance-based classifier. In the first notebook, we looked at how to build a distance-based classifier with two data points, each having two features. In the second notebook, we looked at how to increase the number of data points. In this notebook, we look at how to increase the number of features.

Back to Table of Contents

Problem

We repeat the problem description from the previous notebooks. We define the following binary classification problem: Given the data set

\[\mathcal{D} = \Big\{ ({\bf x}_1, y_1), \ldots ({\bf x}_M , y_M) \Big\},\]

consisting of \(M\) data points \({\bf x}_i\in\mathbb{R}^n\) and corresponding labels \(y_i\in \{-1, 1\}\), give a prediction for the label \(\tilde{y}\) corresponding to an unlabeled data point \(\bf\tilde{x}\). The classifier we shall implement with our quantum circuit is a distance-based classifier and is given by

\begin{equation}\newcommand{\sgn}{{\rm sgn}}\newcommand{\abs}[1]{\left\lvert#1\right\rvert}\label{eq:classifier} \tilde{y} = \sgn\left(\sum_{m=0}^{M-1} y_m \left[1-\frac{1}{4M}\abs{{\bf\tilde{x}}-{\bf x}_m}^2\right]\right). \hspace{3cm} (1)\end{equation}

This is a typical \(M\)-nearest-neighbor model in which each data point is given a weight depending on its distance to \(\bf\tilde{x}\). To implement this classifier on a quantum computer we use amplitude encoding. Details can be found in the previous notebooks.
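
For reference, Eq. (1) can also be evaluated classically. The sketch below is our own addition (the notebook's own `DataPlotter.true_classifier` helper, used later on, plays the same role); it assumes normalised feature vectors and labels in \(\{-1, 1\}\).

[ ]:
# A minimal classical sketch of Eq. (1); not part of the original notebook.
import numpy as np

def classical_distance_classifier(x_tilde, data_points, labels):
    # data_points: list of normalised feature vectors, labels: list of -1/+1
    M = len(data_points)
    weights = [1 - np.linalg.norm(np.array(x_tilde) - np.array(x_m))**2 / (4 * M)
               for x_m in data_points]
    return int(np.sign(sum(y_m * w for y_m, w in zip(labels, weights))))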

Back to Table of Contents

Theory

In the previous notebooks we encoded the data points, each having two features, with a single \(R_y(\theta)\) rotation. The angle of this rotation was chosen such that it rotates the state \(\ket{0}\) to the state \(\ket{\bf{x}}\) corresponding to the data point. If we want to include more features for our data points we need to generalize this rotation. Instead of a single rotation, we now need a combination of gates that maps \(\ket{0\ldots 0} \mapsto \ket{\bf x} = \sum_i a_i \ket{i}\), where \(\ket{i}\) is the \(i^{th}\) state of the computational basis \(\left\{\ket{0\ldots0},\ldots,\ket{1\ldots1}\right\}\). Again we only work with normalised data, meaning \(\sum_i \lvert a_i \rvert^2=1\). The general procedure for initializing a register in an arbitrary superposed state can be found in the article by Long and Sun. In this notebook we consider how to implement their scheme for 2 qubits, that is, for up to 4 features:

\begin{equation} \ket{00} \mapsto a_{00} \ket{00} + a_{01} \ket{01} + a_{10} \ket{10} + a_{11} \ket{11} \end{equation}
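
Throughout, the amplitudes \(a_{ij}\) are assumed normalised. As a minimal illustration (our own addition, with made-up numbers; the notebook itself loads pre-normalised data via DataPlotter), a raw 4-feature sample can be turned into valid amplitudes as follows:

[ ]:
# Illustration only: normalise a hypothetical 4-feature sample so it can
# serve as the amplitudes a_00 ... a_11.
import numpy as np

raw_features = np.array([5.1, 3.5, 1.4, 0.2])       # example Iris-like sample (made up)
amplitudes = raw_features / np.linalg.norm(raw_features)
print(np.sum(amplitudes**2))                         # 1.0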

For the implementation we closely follow the reference and use single-qubit rotation gates \(U(\theta)\), defined by

\begin{equation} U(\theta) = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ \sin(\theta) & -\cos(\theta) \end{pmatrix} = R_y(2\theta) \cdot Z \end{equation}

and controlled versions of it. Because we will only act with these gates on \(\ket{0}\) we can even drop the \(Z\) gate.
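
As a quick sanity check (a sketch, not part of the original notebook), the identity \(U(\theta) = R_y(2\theta)\cdot Z\) and the fact that the \(Z\) can be dropped when acting on \(\ket{0}\) are easily verified with numpy:

[ ]:
# Verify U(theta) = Ry(2*theta) . Z and that Z|0> = |0> makes the Z redundant here.
import numpy as np

theta = 0.7                                          # arbitrary test angle
U = np.array([[np.cos(theta),  np.sin(theta)],
              [np.sin(theta), -np.cos(theta)]])
Ry = np.array([[np.cos(theta), -np.sin(theta)],      # Ry(2*theta)
               [np.sin(theta),  np.cos(theta)]])
Z = np.diag([1, -1])
ket0 = np.array([1, 0])

print(np.allclose(U, Ry @ Z))                        # True
print(np.allclose(U @ ket0, Ry @ ket0))              # True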

For two qubits the scheme consists of three steps:

  1. Apply a single-qubit rotation \(U(\alpha_1)\) on the first qubit:

\begin{equation} U(\alpha_1)\ket{0}\otimes\ket{0}= \sqrt{\abs{a_{00}}^2 + \abs{a_{01}}^2} \ket{00} + \sqrt{\abs{a_{10}}^2 + \abs{a_{11}}^2} \ket{10}, \end{equation}

where \(\alpha_1\) is given by

\begin{equation} \alpha_1 = \arctan\left(\sqrt{\frac{\abs{a_{10}}^2 + \abs{a_{11}}^2}{\abs{a_{00}}^2 + \abs{a_{01}}^2}}\right) \end{equation}
  2. Next, apply a controlled-rotation \(U(\alpha_2)\) on the second qubit, with the first qubit being 0 as control. Choose \(\alpha_2\) such that:

    \begin{equation} \cos(\alpha_2) = \frac{a_{00}}{\sqrt{\abs{a_{00}}^2+\abs{a_{01}}^2}}, \hspace{1cm} \sin(\alpha_2) = \frac{a_{01}}{\sqrt{\abs{a_{00}}^2+\abs{a_{01}}^2}} \end{equation}
  3. Lastly, apply a controlled-rotation \(U(\alpha_3)\) on the second qubit, with the first qubit being 1 as control. Choose \(\alpha_3\) such that:

    \begin{equation} \cos(\alpha_3) = \frac{a_{10}}{\sqrt{\abs{a_{10}}^2+\abs{a_{11}}^2}}, \hspace{1cm} \sin(\alpha_3) = \frac{a_{11}}{\sqrt{\abs{a_{10}}^2+\abs{a_{11}}^2}} \end{equation}

The angles \(\alpha_2\) and \(\alpha_3\) are chosen such that the root terms cancel out, leaving us with the desired result:

\begin{equation} \begin{split} \sqrt{\abs{a_{00}}^2 + \abs{a_{01}}^2}& \ket{0}\otimes U(\alpha_2)\ket{0} + \sqrt{\abs{a_{10}}^2 + \abs{a_{11}}^2} \ket{1}\otimes U(\alpha_3)\ket{0}\\ &=a_{00} \ket{00} + a_{01} \ket{01} + a_{10} \ket{10} + a_{11} \ket{11} \end{split} \end{equation}

Note: this circuit also makes it possible to encode 3 features, by simply setting \(a_{11}=0\).

The circuit looks something like this:

[Figure: two-qubit state-preparation circuit, consisting of the rotation \(U(\alpha_1)\) on the first qubit, followed by the controlled rotations \(U(\alpha_2)\) (control on 0) and \(U(\alpha_3)\) (control on 1) acting on the second qubit.]

Back to Table of Contents

Algorithm for the arbitrary state preparation

In this notebook we work with the Qiskit backend for Quantum Inspire. Let us first take a closer look at the state-preparation part of the circuit, used to prepare an arbitrary state. The following code loads the scaled and normalised data of the Iris set containing all 4 features. Let us consider the first point in this set:

[2]:
# Load scaled and normalised data
iris_setosa_normalised, iris_versicolor_normalised = DataPlotter.load_data(max_features=4)
first_data_point = [feature[0] for feature in iris_setosa_normalised]
first_data_point
[2]:
[-0.32695550305984783,
 0.4736868636179572,
 -0.5699845991812187,
 -0.5863773622428564]

We want to build the circuit such that:

\begin{equation}\ket{00} \mapsto -0.3270 \ket{00} + 0.4737 \ket{01} - 0.5700 \ket{10} - 0.5864 \ket{11}\end{equation}
[3]:
# Rotation angles needed to encode the first data point (see the three-step scheme above):
a00, a01, a10, a11 = first_data_point
alpha1 = atan(sqrt((a10**2 + a11**2) / (a00**2 + a01**2)))
alpha2 = np.arctan2(a01, a00)
alpha3 = np.arctan2(a11, a10)
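
As a sanity check (our own addition), these three angles indeed reproduce the amplitudes of the data point when plugged into the scheme above:

[ ]:
# Rebuild the target state from alpha1, alpha2, alpha3 and compare with the data point.
reconstructed = [np.cos(alpha1) * np.cos(alpha2),    # a00
                 np.cos(alpha1) * np.sin(alpha2),    # a01
                 np.sin(alpha1) * np.cos(alpha3),    # a10
                 np.sin(alpha1) * np.sin(alpha3)]    # a11
print(np.allclose(reconstructed, first_data_point))  # True for normalised data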

As always, the first step is to set up a connection with Quantum Inspire:

[4]:
import os

from qiskit.circuit import QuantumRegister, ClassicalRegister, QuantumCircuit
from qiskit.tools.visualization import plot_histogram
from qiskit import execute

from quantuminspire.api import QuantumInspireAPI
from quantuminspire.credentials import get_authentication
from quantuminspire.qiskit import QI

QI_URL = os.getenv('API_URL', 'https://api.quantum-inspire.com/')


authentication = get_authentication()
# Temporary alternative so that we can give our project a name:
QI._api = QuantumInspireAPI(QI_URL, authentication, project_name="Distance-based Classifier more features (NEW)")

# Alternative:
# authentication = [email, password]
# QI.set_authentication_details(*authentication)

qi_backend = QI.get_backend('QX single-node simulator')

We can now construct a function that builds the quantum circuit discussed above. Note that Qiskit contains the general unitary gate \(u3(\theta, \phi, \lambda)\), which is related to our definition of \(U(\alpha)\) by:

\[U(\alpha) = u3(2\alpha, 0, \pi).\]

Unfortunately, these \(u3\) gates are not yet implemented on the Quantum Inspire backend, so we make use of regular \(R_y\) rotations and CNOT gates instead.

[5]:
def cRy(q, angle, ctrlidx, idx):
    """Controlled-Ry(angle) on qubit idx, with qubit ctrlidx as control,
    decomposed into CNOTs and single-qubit Ry rotations."""
    if len(q) < 2:
        raise ValueError("Error, len quantum register must at least be 2.")
    circuit = QuantumCircuit(q)
    half_angle = angle / 2

    # The CNOTs flip the sign of the first half-rotation when the control is |1>,
    # so the two rotations add up to Ry(angle); for control |0> they cancel.
    circuit.cx(q[ctrlidx], q[idx])
    circuit.ry(-half_angle, q[idx])
    circuit.cx(q[ctrlidx], q[idx])
    circuit.ry(half_angle, q[idx])
    return circuit


def features_encoding(q, alpha, idx1, idx2):
    """Encode a normalised 4-dimensional data point in the amplitudes of
    qubits idx1 and idx2, given the angles alpha = [alpha1, alpha2, alpha3]."""
    if len(q) < 2:
        raise ValueError("Error, len quantum register must at least be 2.")
    # Alternative: use u3(2 * alpha, 0, pi) and cu3(2 * alpha, 0, pi) gates,
    # but these are not yet implemented on Quantum Inspire
    alpha1, alpha2, alpha3 = alpha

    circuit = QuantumCircuit(q)

    # step 1.
    circuit.ry(2 * alpha1, q[idx1])

    # step 2. (controlled on the first qubit being 0, hence the X gates)
    circuit.x(q[idx1])
    circuit = circuit.compose(cRy(q, 2 * alpha2, idx1, idx2))
    circuit.x(q[idx1])

    # step 3. (controlled on the first qubit being 1)
    circuit = circuit.compose(cRy(q, 2 * alpha3, idx1, idx2))
    return circuit


def measurement(q):
    # TO DO: Rewrite as loop over b to avoid measurement of ancillary qubits
    circuit_size = len(q)
    b = ClassicalRegister(circuit_size)
    meas = QuantumCircuit(q, b)
    meas.measure(q, b)
    return meas
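
Optionally, we can check that the cRy decomposition indeed acts as a controlled \(R_y\) rotation. The snippet below is a sketch using qiskit.quantum_info, which may not be available in every Qiskit version used with Quantum Inspire:

[ ]:
# Compare the cRy decomposition with a directly controlled Ry gate (up to global phase).
from qiskit.quantum_info import Operator
from qiskit.circuit.library import RYGate

q_test = QuantumRegister(2)
decomposed = cRy(q_test, 1.23, 0, 1)

reference = QuantumCircuit(q_test)
reference.append(RYGate(1.23).control(1), [q_test[0], q_test[1]])

print(Operator(decomposed).equiv(Operator(reference)))  # True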
[6]:
q = QuantumRegister(2)
circuit = features_encoding(q, [alpha1, alpha2, alpha3], 0, 1)
circuit.draw(output='mpl')
[6]:
../../_images/notebooks_classifier_example_classification_example3_4_features_15_0.png
[7]:
#Initialize quantum register:
q = QuantumRegister(2)
meas = measurement(q)

# Build circuit:
circuit = features_encoding(q, [alpha1, alpha2, alpha3], 0, 1)
qc = meas.compose(circuit, front=True)


# Execute the circuit:
def execute_circuit(circuit, shots=1):
    qi_job = execute(circuit, backend=qi_backend, shots=shots)

    # Temporarily needed to fix naming of the project:
    project = next((project for project in QI._api.get_projects()
                    if project['name'] == QI._api.project_name), None)
    if project is not None:
        qi_job._job_id = str(project['id'])

    # Results of the job:
    qi_result = qi_job.result()

    # Select the results of the circuit and return them as a dict of probabilities
    probabilities = qi_result.get_probabilities(circuit)
    return probabilities


def create_histogram_results(probabilities, number_of_bits=2):
    histogram_results = {}
    for k, v in probabilities.items():
        # Reversed order of the bin strings to be consistent
        histogram_results[k[::-1]] = v
    return histogram_results


# Display results:
probabilities = execute_circuit(qc)
create_histogram_results(probabilities)
[7]:
{'00': 0.1068999, '10': 0.3248824, '01': 0.2243791, '11': 0.3438387}

We can compare these results with the probabilities we expect when measuring the state \(\ket{\psi} = a_{00} \ket{00} + a_{01} \ket{01} + a_{10} \ket{10} + a_{11} \ket{11}\) in the computational basis, with measurement outcome \(X\):

| Outcome \(x\) | \(Prob(X=x)\)      |
|---------------|--------------------|
| \(\ket{00}\)  | \(\abs{a_{00}}^2\) |
| \(\ket{01}\)  | \(\abs{a_{01}}^2\) |
| \(\ket{10}\)  | \(\abs{a_{10}}^2\) |
| \(\ket{11}\)  | \(\abs{a_{11}}^2\) |

[8]:
# Desired outcomes for the test point:
desired_outcomes = [coefficient**2 for coefficient in [a00, a01, a10, a11]]
print('Desired values: ', desired_outcomes)
plot_histogram(create_histogram_results(probabilities))
Desired values:  [0.10689990098111816, 0.2243792447642172, 0.3248824433037745, 0.34383841095089007]
[8]:
../../_images/notebooks_classifier_example_classification_example3_4_features_18_1.png

In the next section we use this feature-encoding scheme in the distance-based classifier for 2 data points, each having 4 features. In the algorithm, however, feature encoding is done in a controlled fashion, with the index qubit as control. Therefore we also need a controlled version of the circuit above, which is obtained by replacing every gate with its controlled equivalent.

[9]:
def ccRy(q, angle, ctrlidx1, ctrlidx2, idx):
    """Doubly-controlled Ry(angle) on qubit idx, with qubits ctrlidx1 and
    ctrlidx2 as controls, built from Toffoli, CNOT and Ry gates."""
    if len(q) < 3:
        raise ValueError("Error, len quantum register must at least be 3.")
    circuit = QuantumCircuit(q)
    quarter_angle = angle / 4

    # The quarter-angle rotations only add up to Ry(angle) when both controls
    # are |1>; for all other control states they cancel.
    circuit.ccx(q[ctrlidx1], q[ctrlidx2], q[idx])
    circuit.cx(q[ctrlidx2], q[idx])
    circuit.ry(quarter_angle, q[idx])
    circuit.cx(q[ctrlidx2], q[idx])
    circuit.ry(-quarter_angle, q[idx])

    circuit.ccx(q[ctrlidx1], q[ctrlidx2], q[idx])
    circuit.cx(q[ctrlidx2], q[idx])
    circuit.ry(-quarter_angle, q[idx])
    circuit.cx(q[ctrlidx2], q[idx])
    circuit.ry(quarter_angle, q[idx])
    return circuit


def c_features_encoding(q, alpha, ctrlidx, idx1, idx2):
    if len(q) < 3:
        raise ValueError("Error, len quantum register must at least be 3.")
    alpha1, alpha2, alpha3 = alpha
    circuit = QuantumCircuit(q)

    # step 1.
    circuit = circuit.compose(cRy(q, 2 * alpha1, ctrlidx, idx1))  # old: ry(2 * alpha1, q[idx1])

    # step 2.
    circuit.cx(q[ctrlidx], q[idx1])  # old: x(q[idx1])
    circuit = circuit.compose(ccRy(q, 2 * alpha2, idx1, ctrlidx, idx2))  # old: cRy gates
    circuit.cx(q[ctrlidx], q[idx1])  # old: x(q[idx1])

    # step 3.
    circuit = circuit.compose(ccRy(q, 2 * alpha3, idx1, ctrlidx, idx2))  # old: cRy gates
    return circuit
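
The same kind of optional check (again a sketch, assuming qiskit.quantum_info is available) confirms that ccRy acts as a doubly-controlled \(R_y\):

[ ]:
# Compare the ccRy decomposition with a directly doubly-controlled Ry gate.
from qiskit.quantum_info import Operator
from qiskit.circuit.library import RYGate

q_test = QuantumRegister(3)
decomposed = ccRy(q_test, 0.77, 0, 1, 2)

reference = QuantumCircuit(q_test)
reference.append(RYGate(0.77).control(2), [q_test[0], q_test[1], q_test[2]])

print(Operator(decomposed).equiv(Operator(reference)))  # True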

Back to Table of Contents

Implementation of the distance-based classifier

We first consider the case with 2 data points and just 2 features, randomly chosen from the Iris data set, so that we can later compare the results of the distance-based classifier with 2 features to the implementation with 4 features.

[10]:
def two_features_classifier(q, x_tilde, x0, x1):
    if len(q) != 4:
        raise ValueError("Error, len quantum register must be 4.")
    circuit = QuantumCircuit(q)

    # Angles data points:
    angle_x_tilde = 2 * acos(x_tilde[0]) * sign(x_tilde[1]) # Label ?
    angle_x0 = 2 * acos(x0[0]) * sign(x0[1])                # Label 0
    angle_x1 = 2 * acos(x1[0]) * sign(x1[1])                # Label 1

    # part_a:
    for i in range(2):
        circuit.h(q[i])

    # part_b:
    circuit = circuit.compose(cRy(q, angle_x_tilde, 1, 2))
    circuit.x(q[1])

    # part_c:
    circuit = circuit.compose(ccRy(q, angle_x0, 1, 0, 2))
    circuit.x(q[0])

    # part_d:
    circuit = circuit.compose(ccRy(q, angle_x1, 1, 0, 2))

    # part_e:
    circuit.cx(q[0], q[3])

    # part_f:
    circuit.h(q[1])
    return circuit

The above function is the same implementation of the algorithm as in the first notebook. The following code runs the algorithm on randomly selected data from the Iris set.

[11]:
# Get 2 random data points with 2 features.
data_label0, data_label1, x_tilde, random_label = DataPlotter.grab_random_data(size=2, features=2)

#Initialize quantum register:
q = QuantumRegister(4)
meas = measurement(q)

# Initialize circuit:
circuit = two_features_classifier(q, x_tilde, data_label0[0], data_label1[0])
qc = meas.compose(circuit, front=True)

# Execute the circuit:
probabilities = execute_circuit(qc)

# Display the results:
histogram_results = create_histogram_results(probabilities, number_of_bits=4)
DataPlotter.plot_data_points(x_tilde, data_label0, data_label1, histogram_results);  # Function to plot the data
../../_images/notebooks_classifier_example_classification_example3_4_features_26_0.png

The implementation for four features is given below. Note that the structure of the algorithm is similar to the two-feature case; we use one ancillary qubit for the controlled encoding of the features.

[12]:
def four_features_classifier(q, x_tilde, x0, x1):
    if len(q) != 6:
        raise ValueError("Error, len quantum register must be 5 + 1 ancillary qubit.")
    circuit = QuantumCircuit(q)

    def get_alpha(data_point):
        a00, a01, a10, a11 = data_point
        alpha1 = atan(sqrt((a10**2  + a11**2) / (a00**2  + a01**2)))
        alpha2 = np.arctan2(a01, a00)
        alpha3 = np.arctan2(a11, a10)
        return [alpha1, alpha2, alpha3]

    # part_a:
    for i in range(2):
        circuit.h(q[i])

    # part_b:
    alpha = get_alpha(x_tilde)
    circuit = circuit.compose(c_features_encoding(q, alpha, 1, 2, 3))
    circuit.x(q[1])

    # part_c:
    # Use ancillary qubit + c_features_encoding for cc_features_encoding
    circuit.ccx(q[0], q[1], q[5])
    alpha = get_alpha(x0)
    circuit = circuit.compose(c_features_encoding(q, alpha, 5, 2, 3))
    circuit.ccx(q[0], q[1], q[5])
    circuit.x(q[0])

    # part_d:
    # Use ancillary qubit + c_features_encoding for cc_features_encoding
    circuit.ccx(q[0], q[1], q[5])
    alpha = get_alpha(x1)
    circuit = circuit.compose(c_features_encoding(q, alpha, 5, 2, 3))
    circuit.ccx(q[0], q[1], q[5])

    # part_e:
    circuit.cx(q[0], q[4])

    # part_f:
    circuit.h(q[1])

    return circuit

The following code runs the algorithm for 2 random data points plus 1 random test point, each with 4 features.

[13]:
# Get 2 random data points with 4 features.
data_label0, data_label1, x_tilde, random_label = DataPlotter.grab_random_data(size=2, features=4)

#Initialize quantum register:
q = QuantumRegister(6)
meas = measurement(q)

# Initialize circuit:
circuit = four_features_classifier(q, x_tilde, data_label0[0], data_label1[0])
qc = meas.compose(circuit, front=True)

# Execute the circuit:
probabilities = execute_circuit(qc)

# Display the results:
histogram_results = create_histogram_results(probabilities, number_of_bits=6)


def strip_ancillary_qubit(histogram_results):
    new_histogram_results = {}
    for k, v in histogram_results.items():
        new_histogram_results[k[:-1]] = v  # Strip ancillary qubit
    return new_histogram_results


histogram_results = strip_ancillary_qubit(histogram_results)


def summarize_results_4_features(histogram_results):
    sum_label0 = 0
    sum_label1 = 0
    for key, value in histogram_results.items():
        if key[1] == "0":
            if key[-1] == "0":
                sum_label0 += value
            else:
                sum_label1 += value
    print("Sum Iris Setosa (red, label 0): ", sum_label0)
    print("Sum Iris Versicolor (blue, label 1): ", sum_label1)


summarize_results_4_features(histogram_results)
print("Random label is: ", random_label)
print("Prediction by true classifier: ", DataPlotter.true_classifier(data_label0, data_label1, x_tilde))

# Plot results:
DataPlotter.plot_data_points_multiple_features(
    data_label0, data_label1, x_tilde, random_label, histogram_results) # Plot features + bar plot results
Sum Iris Setosa (red, label 0):  0.0704467
Sum Iris Versicolor (blue, label 1):  0.4095059
Random label is:  1
Prediction by true classifier:  1
../../_images/notebooks_classifier_example_classification_example3_4_features_30_1.png

In the limit of an infinite number of shots, Quantum Inspire returns the true probability distribution, which coincides with the classical solution of the distance-based classifier. The following table shows the quality of the distance-based classifier as a function of the number of data points and the number of included features. The table contains the fraction of correct predictions for randomly selected data from the Iris set; the results are based on a sample of 10,000 runs and can be reproduced with the quality_classifier method of the DataPlotter class (a sketch is given after the table).

| Fraction of correct predictions | 2 features | 3 features | 4 features |
|---------------------------------|------------|------------|------------|
| 2 data points                   | 0.9426     | 0.9870     | 0.9940     |
| 4 data points                   | 0.9735     | 0.9933     | 0.9986     |
| 8 data points                   | 0.9803     | 0.9975     | 0.9998     |

These results show why one is interested not only in increasing the number of data points but also in including more features per data point.
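
The table can in principle be regenerated with the quality_classifier method mentioned above. The exact signature of that helper is not shown in this notebook, so the keyword arguments below (size, features and the number of runs) are assumptions by analogy with grab_random_data and may need adjusting:

[ ]:
# Hypothetical reproduction of the table above; keyword arguments are assumed.
for size in [2, 4, 8]:
    for features in [2, 3, 4]:
        quality = DataPlotter.quality_classifier(size=size, features=features, runs=10000)
        print(f"{size} data points, {features} features: {quality}")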

Conclusion and further work

In this notebook we demonstrated how to extend the distance-based classifier to handle data containing up to four features. We saw that the quality of the distance-based classifier improves both by including more data points and by including data points with more features.

So far in this notebook series we have only looked at binary classification: the results belong either to the class with label 0 or to the class with label 1. For some problems one is interested in distinguishing between more than two classes, for example in digit recognition. A possible next extension of the current classifier is to let it distinguish between more than two labels. This can be done by encoding the label in multiple qubits instead of one.

We have only tested the distance-based classifier on rescaled data from the Iris data set, which is classified well by this classifier. For other data sets this is not necessarily the case. Consider, for example, a data set for which, after scaling, the classes lie in concentric circles; at first glance we do not expect the distance-based classifier to yield good predictions there. Such problems can possibly be solved by alternative data pre-processing or by a totally different type of classifier. Selecting the right pre-processing methods and the corresponding classifier is not a task for the quantum computer but for the data analyst. It will be interesting to see different classifiers implemented on quantum computers in the near future.

Back to Table of Contents

References
