Machine learning contract example
In this section, we present a ZK smart contract that enables private inference on a binary decision tree classifier. We briefly describe a potential use case and the processes that precede the creation of the smart contract. We walk through the code, written in the ZK Rust language, which specifies how to securely run inference on the tree to generate an output. Lastly, we explain how you can adapt the contract to your own binary decision tree classifier. The full code is available in our repository of open source contract examples. If you want to write a ZK smart contract for a different type of machine learning algorithm, the next section describes what to keep in mind regarding the capabilities of PBC and the ZK Rust language.
Use case and model preparation
The example contract addresses a scenario where a bank makes available to its clients a model that predicts whether a client's annual income exceeds $50K. The output is meant to serve as an indicator of the type of bank loan the client is eligible for. As the sample owner, the client can specify who will receive the output, and no one but the receiver will learn any new information.
The contract assumes a pre-trained model. This means that before uploading the model to the smart contract, the model owner has trained the model on a dataset off-chain, tuned it, and evaluated its performance to make it ready for deployment on PBC. The dataset used in the example contract was preprocessed to make it suitable for training and to reduce the number of features. You can see how we preprocessed the data and trained the model in the contract example README.
The contract is intended to illustrate the feasibility of running privacy-preserving machine learning inference on PBC. For other applications, different configurations or preprocessing choices may be more appropriate.
Code walk-through
A decision tree comprises internal vertices and leaf vertices. Each internal vertex consists of a splitting feature and a threshold value that dictates which path to follow: if the feature value in the input sample is less than or equal to the threshold, the left branch is taken, and if it is larger than the threshold, the right branch is taken. Each leaf vertex stores a class prediction, which is assigned to any input sample that takes the path ending in that leaf. Our binary classes are encoded as 0 and 1, which represent the prediction that the annual income does not exceed $50K and the annual income exceeds $50K, respectively.
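Before turning to the secret-shared version, it can help to see the same tree logic in the clear. The sketch below is plain Rust, not ZK Rust; the names and the hardcoded child layout for the depth-three tree are illustrative and not taken from the contract. It classifies a sample by simply walking the tree:

```rust
// Plaintext sketch of the tree shape used in this example: 7 internal
// vertices and 8 leaves, stored in pre-order. Illustrative only; the
// contract operates on secret shares instead of plain values.
#[derive(Clone, Copy)]
pub struct PlainInternalVertex {
    pub feature: usize, // index of the splitting feature in the sample
    pub threshold: i16, // left branch iff sample[feature] <= threshold
}

// Children of each internal vertex in the pre-order layout: values 0..=6
// are internal vertices, values 7..=14 stand for leaves 0..=7.
const LEFT: [usize; 7] = [1, 2, 7, 9, 5, 11, 13];
const RIGHT: [usize; 7] = [4, 3, 8, 10, 6, 12, 14];

// Classify a sample by walking the tree in the clear; the secure
// computation must produce the same class without revealing the path.
pub fn classify_plain(
    vertices: &[PlainInternalVertex; 7],
    leaf_class: &[u8; 8],
    sample: &[i16; 10],
) -> u8 {
    let mut v = 0usize;
    loop {
        let go_left = sample[vertices[v].feature] <= vertices[v].threshold;
        let next = if go_left { LEFT[v] } else { RIGHT[v] };
        if next >= 7 {
            return leaf_class[next - 7]; // reached a leaf
        }
        v = next;
    }
}
```

Note that this plaintext walk visits only the vertices on one path; the secure version described next must instead touch every vertex so that the path stays hidden.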
Running the decision tree classifier on an input sample proceeds in three sequential steps. All internal vertices, leaf vertices, and sample values are secret-shared, so all computations are performed on shares. In the first step, each internal vertex is evaluated by comparing the threshold to the feature value in the input sample. Then, each path through the tree is evaluated to secretly determine whether it is consistent with the input sample. In the last step, we obtain the final output by extracting the class prediction of the leaf vertex that is consistent with the secret path taken by the input sample.
Evaluating the internal vertices
The internal vertices are represented as an array of (splitting feature, threshold value) pairs ordered according to a pre-order traversal of the decision tree. Every internal vertex in the tree is evaluated to ensure privacy, even if the input sample would never reach the vertex. We perform oblivious lookup in the input sample to obtain the feature value corresponding to the splitting feature in each vertex. Then, we perform secure comparison between the feature value and the threshold value. The output of this step is an array of secret-shared bits, where 1 represents that the comparison held true, and 0 represents that it did not.
fn evaluate_internal_vertices(internal_vertices: [InternalVertex; 7], sample: [Sbi16; 10]) -> [Sbu1; 7] {
    let mut result: [Sbu1; 7] = [Sbu1::from(false); 7];
    for i in 0usize..7usize {
        let value: Sbi16 = lookup_in_sbi16_array(sample, internal_vertices[i].feature);
        if value <= internal_vertices[i].threshold {
            result[i] = Sbu1::from(true);
        }
    }
    result
}
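The oblivious lookup can be pictured in the clear as a scan that touches every entry of the sample and keeps only the one whose index matches, so the access pattern is independent of the looked-up index. A minimal plain-Rust sketch of this idea follows; the helper name is hypothetical, and in the contract the index, the entries, and the result are all secret-shared:

```rust
// Plaintext sketch of an oblivious (data-independent) array lookup:
// every entry is visited and multiplied by an equality bit, so which
// entry is wanted cannot be inferred from the memory access pattern.
pub fn oblivious_lookup(sample: &[i16; 10], feature: usize) -> i16 {
    let mut value = 0i16;
    for j in 0..10 {
        let is_match = (j == feature) as i16; // 1 for the wanted index, 0 otherwise
        value += is_match * sample[j];
    }
    value
}
```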
Evaluating the paths
In this step, we use the result of evaluating the internal vertices to secretly compute the consistency of each path with respect to the input sample. Like the internal vertices, the paths are ordered according to a pre-order traversal of the tree. A path is evaluated through a series of bitwise AND operations between the secret-shared comparison results of every internal vertex that appears in the path. Each internal vertex splits into two branches, left and right. Whenever the path follows a right branch, we negate the result of the evaluation of the given internal vertex. The output of this step is an array of secret-shared bits, where 1 represents that the path was consistent with the input sample, and 0 represents that it was not. Only one entry will contain a 1, corresponding to the actual path taken by the input sample, but since the bits are secret-shared, no one will know which entry that is.
fn evaluate_paths(vertex_evaluation: [Sbu1; 7]) -> [Sbu1; 8] {
    let result: [Sbu1; 8] = [
        vertex_evaluation[0] & vertex_evaluation[1] & vertex_evaluation[2],
        vertex_evaluation[0] & vertex_evaluation[1] & !vertex_evaluation[2],
        vertex_evaluation[0] & !vertex_evaluation[1] & vertex_evaluation[3],
        vertex_evaluation[0] & !vertex_evaluation[1] & !vertex_evaluation[3],
        !vertex_evaluation[0] & vertex_evaluation[4] & vertex_evaluation[5],
        !vertex_evaluation[0] & vertex_evaluation[4] & !vertex_evaluation[5],
        !vertex_evaluation[0] & !vertex_evaluation[4] & vertex_evaluation[6],
        !vertex_evaluation[0] & !vertex_evaluation[4] & !vertex_evaluation[6],
    ];
    result
}
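A useful sanity check on this table is that the eight conjunctions partition all possible comparison results: whatever the seven bits are, exactly one path indicator is true. The plain-Rust analogue below mirrors the table one-to-one, using plaintext bools in place of secret shares (illustrative only):

```rust
// Plaintext analogue of the path evaluation above. The eight
// conjunctions follow a full binary decision over the comparison bits,
// so exactly one entry is true for any input; in the contract this
// one-hot pattern stays hidden because the bits are secret-shared.
pub fn evaluate_paths_plain(v: [bool; 7]) -> [bool; 8] {
    [
        v[0] & v[1] & v[2],
        v[0] & v[1] & !v[2],
        v[0] & !v[1] & v[3],
        v[0] & !v[1] & !v[3],
        !v[0] & v[4] & v[5],
        !v[0] & v[4] & !v[5],
        !v[0] & !v[4] & v[6],
        !v[0] & !v[4] & !v[6],
    ]
}
```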
Obtaining the output
Finally, we use the result of evaluating the paths and the secret-shared class predictions in the leaf vertices. The leaf vertices are represented as an array ordered according to when they are visited during a pre-order traversal of the tree. First, we perform elementwise AND operations between the path evaluations and the class predictions to isolate the class prediction in the leaf vertex where the actual path ends. Then, we perform bitwise OR operations on the resulting array of secret-shared bits to obtain the final, secret-shared output.
fn predict_class(path_evaluation: [Sbu1; 8], leaf_vertices: [LeafVertex; 8]) -> Sbu1 {
    let mut product: [Sbu1; 8] = [Sbu1::from(false); 8];
    for i in 0usize..8 {
        let eval: Sbu1 = path_evaluation[i];
        let class: Sbu1 = leaf_vertices[i].classification;
        product[i] = eval & class;
    }
    let mut result: Sbu1 = Sbu1::from(false);
    for i in 0usize..8 {
        result = result | product[i];
    }
    result
}
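Putting the three steps together, the whole pipeline can be mirrored in the clear. The sketch below is plain Rust with illustrative names and plaintext values instead of secret shares; it runs comparison, path evaluation, and class selection end to end:

```rust
// End-to-end plaintext mirror of the three steps: compare thresholds,
// evaluate the hardcoded paths, then AND/OR out the predicted class.
// Illustrative only; the contract performs this on secret shares.
pub struct Vertex { pub feature: usize, pub threshold: i16 }

pub fn predict_plain(
    vertices: &[Vertex; 7],
    leaf_class: &[bool; 8],
    sample: &[i16; 10],
) -> bool {
    // Step 1: compare every threshold against its feature value.
    let mut cmp = [false; 7];
    for i in 0..7 {
        cmp[i] = sample[vertices[i].feature] <= vertices[i].threshold;
    }
    // Step 2: one hardcoded path indicator per leaf (pre-order).
    let paths = [
        cmp[0] & cmp[1] & cmp[2],
        cmp[0] & cmp[1] & !cmp[2],
        cmp[0] & !cmp[1] & cmp[3],
        cmp[0] & !cmp[1] & !cmp[3],
        !cmp[0] & cmp[4] & cmp[5],
        !cmp[0] & cmp[4] & !cmp[5],
        !cmp[0] & !cmp[4] & cmp[6],
        !cmp[0] & !cmp[4] & !cmp[6],
    ];
    // Step 3: AND each indicator with its leaf's class, then OR them all.
    let mut class = false;
    for i in 0..8 {
        class |= paths[i] & leaf_class[i];
    }
    class
}
```

Because only one path indicator is true, the final OR simply recovers the class stored in the leaf that the sample actually reaches.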
Adapt to your own binary decision tree classifier
While this contract example is tailored to a particular scenario, the ZK Rust code necessary to perform inference on the model depends only to a small extent on the characteristics of the input sample and the structure of the decision tree. If you want to adapt the code to your own binary decision tree classifier, you only need to change the number of feature values in the input sample, the number of internal vertices and leaf vertices, and the hardcoded paths in the path evaluation function. Keep in mind that our inference approach is best suited for small decision trees, since the computational complexity grows quickly with the size of the tree.
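As a concrete illustration of such an adaptation, consider a smaller tree with three internal vertices and four leaves: the path evaluation shrinks to four conjunctions. The sketch below uses plaintext bools and the same pre-order convention as the contract, and is illustrative only:

```rust
// Sketch: path evaluation for a depth-two tree with 3 internal vertices
// (pre-order: root = 0, its left child = 1, its right child = 2) and
// 4 leaves. Adapting the contract means rewriting this kind of table.
pub fn evaluate_paths_depth2(v: [bool; 3]) -> [bool; 4] {
    [
        v[0] & v[1],   // left, left
        v[0] & !v[1],  // left, right
        !v[0] & v[2],  // right, left
        !v[0] & !v[2], // right, right
    ]
}
```

The number of comparison bits, the number of paths, and the leaf array all shrink or grow together with the tree, which is why the cost of the hardcoded approach rises quickly for deeper trees.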