Concepts & Architecture

Introduction

With the XOR Secret Computing© Platform, you can securely evaluate functions across multiple private data sources, with a cryptographic guarantee that no information is shared between the data sources and that the analyst conducting the computation sees only the output of the function, never the private inputs. This proprietary platform is built on our patent-pending Secure Multiparty Computation (MPC) protocol, which enables fast, high-precision evaluation of real-valued functions.

With XOR, you can perform privacy-preserving analytics and machine learning to meet compliance requirements, maintain customer confidentiality, work with other departments’ and organizations’ sensitive data, and even create a platform for third parties to train their own models on your data without seeing it (or vice versa). Meet the new world of Private A.I.

Privacy-preserving computing allows multiple parties to evaluate a function over their inputs while keeping the inputs private and revealing only the output of the function and nothing else.

The XOR Secret Computing© Platform is based on a scalable Secure Multiparty Computation (MPC) protocol for evaluating real-valued functions. The protocol is based on secret sharing and approximation of real-valued functions by Fourier series. For full technical details on the protocol, we refer to our paper.
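The idea of approximating a real-valued function by a Fourier series can be illustrated in plaintext. The sketch below is only an illustration and not Inpher's implementation: the choice of the sigmoid function, the interval half-width `HALF_PERIOD`, the number of terms, and all function names are assumptions made for the example. It computes the sine-series coefficients of the odd function sigmoid(x) - 1/2 by numerical integration and compares the truncated series with the exact value.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sine_coefficients(half_period: float, n_terms: int, n_samples: int = 4000) -> list[float]:
    """Fourier sine coefficients of the odd function sigmoid(x) - 1/2 on [-L, L].

    b_n = (2 / L) * integral_0^L (sigmoid(x) - 1/2) * sin(n * pi * x / L) dx,
    approximated with the midpoint rule.
    """
    dx = half_period / n_samples
    coeffs = []
    for n in range(1, n_terms + 1):
        total = 0.0
        for j in range(n_samples):
            x = (j + 0.5) * dx
            total += (sigmoid(x) - 0.5) * math.sin(n * math.pi * x / half_period)
        coeffs.append(2.0 / half_period * total * dx)
    return coeffs


def fourier_sigmoid(x: float, coeffs: list[float], half_period: float) -> float:
    """Evaluate the truncated series 1/2 + sum_n b_n * sin(n * pi * x / L)."""
    odd_part = sum(
        b * math.sin(n * math.pi * x / half_period)
        for n, b in enumerate(coeffs, start=1)
    )
    return 0.5 + odd_part


HALF_PERIOD = 10.0  # assumed interval [-10, 10]
COEFFS = sine_coefficients(HALF_PERIOD, n_terms=50)

# Compare the approximation with the exact sigmoid on the inner interval [-5, 5].
grid = [i / 2 for i in range(-10, 11)]
max_error = max(abs(fourier_sigmoid(x, COEFFS, HALF_PERIOD) - sigmoid(x)) for x in grid)
print(f"max error on [-5, 5]: {max_error:.4f}")
```

The approximation is best away from the ends of the interval: the periodic extension of the function jumps at ±`HALF_PERIOD`, so the error grows as x approaches those points.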

The MPC security model is full threshold with an honest-but-curious dealer and uniformly distributed masking, which provides information-theoretic security.

Components & Requirements

Product Architecture

Sensitive data never leaves each data source; only random numbers are exchanged between the XOR Machines. For more on the security assumptions, see Security Considerations and Best Practices.

XOR Service

Hosted by Inpher, the XOR Service is responsible for compiling the circuits submitted by the data analyst as well as generating random auxiliary data used for the computation. The XOR Service never sees any customer data!

XOR Machine

Each data owner runs a virtual machine (VM) known as an XOR Machine (Docker image or software package) within their privacy zone on any available infrastructure, such as a private cloud, on-prem server, desktop computer, or laptop. For simplicity, the data owner can either manually export their data (e.g., from Excel or an RDBMS) as a csv into a folder mounted on the XOR Machine, or connect the XOR Machine to an RDBMS using an ODBC connector and query the database directly. The XOR Machines execute plaintext functions within the privacy zone and switch to the XOR MPC protocol when privacy-preserving aggregate computations have to be performed. In the latter case, only secret-shared data is exchanged between the computing parties.

Analyst Platform

Data analysts interact with XOR through the Analyst Platform, which exposes the XOR APIs. The XOR APIs can be used through a web application providing a graphical user interface, through a Python library (XOR-py), or directly through a REST API. Using the Analyst Platform, the analyst can browse and select available private datasets, compose distributed datasets, and run private computations on these distributed datasets, either retrieving the results or keeping them stored as encrypted secret shares on each XOR Machine.

Anatomy of a Secret Computation

Datasets

An MPC computation operates on data that is available either as plaintext or as secret shares, to all or just a subset of the computing parties. These input and output datasets are stored in a distributed privacy-preserving store that we call the PDDStore. A dataset represents a table or a matrix of numbers, booleans, or strings; it has a visibility and a set of owners. It is identified by a globally unique source name and stored in a binary format convertible to and from other formats, e.g. standard csv files. Currently, the two supported visibilities are plaintext and full threshold secret-shares.

plaintext visibility

A plaintext dataset is physically stored as plaintext, and each of its owners has a copy of it. In particular, a plaintext dataset that belongs to a single owner is called a "private" dataset, whereas a plaintext dataset owned by all the computing parties is called a "public" dataset. At least one owner of a plaintext dataset must be present for it to be used in an MPC computation without revealing its content; however, for security reasons, it is used in its plaintext form only if all the computing parties are owners.

In general, plaintext datasets should be converted to binary format, uniquely identified, and stored in the PDDStore. Each participant also has a private-data folder in which they can put their own private csv files: these files are automatically parsed, converted to binary, and treated as single-owner plaintext private datasets.

secret-shares visibility

A secret-shared dataset is represented as $n$ shares, one per owner, such that any $k$ share owners can reconstruct the data by recombining their shares. Unless otherwise specified, the secret sharing is full threshold: $k = n$, so all owners need to be present to use the dataset.
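As an illustration of full-threshold sharing ($k = n$), the following minimal sketch uses additive secret sharing modulo $2^{64}$. It is a generic textbook construction, not the platform's storage format; the function names `share` and `reconstruct` are hypothetical.

```python
import secrets

MODULUS = 2 ** 64


def share(secret: int, n_parties: int) -> list[int]:
    """Split `secret` into n additive shares that sum to it modulo MODULUS."""
    # The first n - 1 shares are uniformly random.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares sum to the secret.
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recombine ALL shares; any missing share leaves the secret undetermined."""
    return sum(shares) % MODULUS


shares = share(42, n_parties=3)
print(reconstruct(shares))  # 42
```

Because every share but one is drawn uniformly at random and the last one is offset by those random values, the secret can only be recovered when all $n$ shares are combined.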

Secret-shared datasets are always stored in binary format, uniquely identified, and stored in the PDDStore. A secret-shared dataset can be used in an MPC computation without revealing its content as soon as at least $k$ of its $n$ owners are among the computing parties. The analyst can upload csv files to the PDDStore by using the REST API: each csv is automatically full-threshold secret-shared across the $n$ players of the computation, and one share is uploaded to each player's local PDDStore prior to the computation.

Phases and steps

The end-user (Data Analyst or Client) interacts with the XOR Secret Computing© Platform via a web application (Analyst Platform). The main processing steps performed by the analyst are described below:

Anatomy

#	Phase	Description
1	Initialization and Handshakes	The XOR Machines establish SSL connections with the XOR Analyst Platform and Service. The XOR Machines establish TCP connections with each other.
2.1	Data Selection	The analyst chooses the data sources from the available XOR Machines and is only able to see the headers and additional metadata provided by the data owner. The analyst has the option to upload their own data, which is received by the XOR Analyst Platform and immediately secret-shared and distributed to the XOR Machines.
2.2	Operation	The analyst selects the operation to be computed.
2.3	Computation	The analyst triggers the computation.
3	Compile and Binary Distribution	The XOR Service compiles the operation into a Secret Compute Circuit (binary) which is distributed to the XOR Machines.
4	Offline Compute	The XOR Service generates the Triplets (random data) and distributes them to the XOR Machines.
5	Online (Secret) Compute	The XOR Machines execute the Secret Compute Circuit (binary) and communicate with each other, exchanging only Secret Shares (random data).
6	Results	The resulting shares are encrypted with the analyst key, sent from the XOR Machines to the XOR Analyst Platform, where the analyst can combine them into the final decrypted output.

PDDStore and PDDStoreVault

Each XOR Machine has a PDDStore, a local storage where plaintext and shares of datasets owned by the XOR Machine are stored.

The PDDStoreVault is used to change the ownership of plaintext and secret-shared datasets. It is a secure, persistent, and centralized storage (currently backed by AWS S3 or Google Cloud Storage). A change of ownership makes it possible to share the result of a computation (i.e., a dataset) with the analyst or with a new set of owners, granting them the right to use it in a subsequent computation.

The ownership change of a plaintext or secret-shared dataset is done in 2 steps:

The original owners encrypt the data to be shared with the public-key of each destination owner and put it in the PDDStoreVault
Destination owners pull the encrypted data from the PDDStoreVault and decrypt it

Even though the dataset identifiers present in the PDDStoreVault are visible to all, only the owners of a dataset can decrypt its content. The visibility change of a dataset (e.g. from secret-shared to plaintext) is performed in a similar manner.

Security of XOR Service

The role of the XOR Service is to compile the operations submitted by the user and generate random numbers (triplets) to support the secret computation; it never sees any data.

No private data is accessible to the XOR Service.
The communication between the XOR Machines during the compute (online) phase of the protocol is not visible to the XOR Service.

Security of the MPC Protocol

The MPC protocol used in the Inpher XOR Secret Computing© Platform is based on secret sharing and Fourier approximation of real-valued functions. For more details, see Inpher's peer-reviewed paper from Financial Cryptography 2018 and the Manticore paper.

Full Threshold Security

The XOR protocols provide full threshold security: no proper subset of the computing parties can collude to obtain any information about the data of the other parties. For example, in a three-party setting, two of the parties cannot collude to retrieve any data from the third party. The security model assumes that the computing parties are semi-honest and do not collude with the XOR Service, which is why the XOR Service does not participate in the online compute phase of the process.
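The three-party example can be checked exhaustively on a toy scale. The sketch below assumes plain additive sharing over a deliberately tiny modulus (8 instead of $2^{64}$) so that all dealer randomness can be enumerated; the names `view_distribution`, `MODULUS`, and `N_PARTIES` are hypothetical. It shows that the joint distribution of the shares seen by any two colluding parties is the same for every possible secret, so their view carries no information about it.

```python
from collections import Counter
from itertools import combinations, product

MODULUS = 8      # toy modulus, small enough to enumerate everything
N_PARTIES = 3


def view_distribution(secret: int, colluders: tuple[int, ...]) -> Counter:
    """Count how often each joint view of the colluders occurs over all randomness."""
    views = Counter()
    for randomness in product(range(MODULUS), repeat=N_PARTIES - 1):
        last = (secret - sum(randomness)) % MODULUS
        shares = list(randomness) + [last]
        views[tuple(shares[i] for i in colluders)] += 1
    return views


# Any two colluding parties see the same distribution whatever the secret is.
for colluders in combinations(range(N_PARTIES), N_PARTIES - 1):
    reference = view_distribution(0, colluders)
    assert all(view_distribution(s, colluders) == reference for s in range(MODULUS))

# All three parties together, however, see views that depend on the secret.
everyone = tuple(range(N_PARTIES))
assert view_distribution(0, everyone) != view_distribution(1, everyone)
print("two colluding parties learn nothing; all three can reconstruct")
```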

Information Theoretic Security

Real numbers are represented as integers modulo $2^{64}$ or $2^{128}$. The masking is uniform and provides provable information-theoretic security.
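One common way to represent real numbers as integers modulo $2^{64}$ is a fixed-point encoding. The sketch below is an assumption made for illustration: the number of fractional bits (`FRACTIONAL_BITS = 20`), the two's-complement convention for negative values, and the function names are not taken from the platform. It also shows uniform additive masking, which turns any encoded value into a uniformly distributed one.

```python
import secrets

MODULUS = 2 ** 64
FRACTIONAL_BITS = 20  # assumed precision: values are multiples of 2**-20


def encode(x: float) -> int:
    """Map a real number to a fixed-point integer modulo MODULUS."""
    return round(x * (1 << FRACTIONAL_BITS)) % MODULUS


def decode(v: int) -> float:
    """Inverse of encode; the upper half of the range represents negatives."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / (1 << FRACTIONAL_BITS)


def mask(v: int) -> tuple[int, int]:
    """Add a uniformly random mask; the masked value alone is uniform."""
    m = secrets.randbelow(MODULUS)
    return (v + m) % MODULUS, m


def unmask(masked: int, m: int) -> int:
    return (masked - m) % MODULUS


# Addition of encoded values matches addition of the underlying reals.
total = (encode(1.5) + encode(-3.25)) % MODULUS
print(decode(total))  # -1.75

masked, m = mask(encode(-3.25))
print(decode(unmask(masked, m)))  # -3.25
```

Multiplying two encoded values would double the number of fractional bits and requires a rescaling step that this sketch does not cover.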

XOR participants

The setting assumes a set of input parties (IP) who provide the input data. These can be financial institutions, healthcare providers, Internet companies, manufacturers, etc. A data analyst (DA) acting as a client is interested in learning the result of a function over data coming from the different input parties without exposing the inputs. This is achieved with Secure Multiparty Computation (MPC). In MPC, a set of computation parties (CP) receive secret shares of the input values (in most scenarios the IPs and CPs coincide, i.e. the input parties are also the computing parties). The trusted dealer (TD), operating as the XOR Service, helps the CPs compute by generating independent and correlated randomness (Beaver triplets) and secret-sharing it among the parties. It is trusted because it does not participate in the online phase of the computation (the phase during which the CPs operate on the input data) and thus sees neither the local private data nor the communicated masked data. The TD therefore participates only in an offline phase, which is independent of the input data. In summary, the setting involves the following participants:

IP: Input parties providing input data.
DA: Data analyst(s) interested in learning $f$ over input data.
CP: Computation parties who compute the result without learning the input data.
TD: A trusted dealer helping with the computation of $f$ by providing randomness to the CPs in an offline phase, before the parties access the data.
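The division of labor between the TD (offline) and the CPs (online) can be sketched with the standard Beaver triplet multiplication over additive shares. This is a generic textbook construction on integers modulo $2^{64}$, not the platform's actual protocol; the helper names `share`, `reconstruct`, `dealer_triplet`, and `multiply` are hypothetical.

```python
import secrets

MODULUS = 2 ** 64


def share(value: int, n: int) -> list[int]:
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    return shares + [(value - sum(shares)) % MODULUS]


def reconstruct(shares: list[int]) -> int:
    return sum(shares) % MODULUS


def dealer_triplet(n: int):
    """Offline phase (TD): share random a, b and c = a * b; no input data involved."""
    a = secrets.randbelow(MODULUS)
    b = secrets.randbelow(MODULUS)
    return share(a, n), share(b, n), share(a * b % MODULUS, n)


def multiply(x_shares, y_shares, triplet):
    """Online phase (CPs): compute shares of x * y from shares of x and y."""
    a_shares, b_shares, c_shares = triplet
    # The CPs open only the masked values d = x - a and e = y - b.
    d = reconstruct([(x - a) % MODULUS for x, a in zip(x_shares, a_shares)])
    e = reconstruct([(y - b) % MODULUS for y, b in zip(y_shares, b_shares)])
    z_shares = []
    for i, (a, b, c) in enumerate(zip(a_shares, b_shares, c_shares)):
        z = (c + d * b + e * a) % MODULUS
        if i == 0:  # the public term d * e is added by a single party
            z = (z + d * e) % MODULUS
        z_shares.append(z)
    return z_shares


product_shares = multiply(share(6, 3), share(7, 3), dealer_triplet(3))
print(reconstruct(product_shares))  # 42
```

The identity behind the online step is $xy = (d + a)(e + b) = de + db + ea + ab$. The opened values $d$ and $e$ are masked by the uniformly random $a$ and $b$, and the dealer never sees them.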

Threat Model

Because all the communication channels are private and authenticated, external adversaries are harmless to the protocol and therefore not modeled. Adversaries are instead modeled as CPs. Any information revealed by the output of the function is considered intentional and therefore beyond the security scope.

Collusion model

There are two formalized types of collusions: horizontal collusions ( $H C$ ) and vertical collusions ( $V C$ ).

Horizontal collusions are categorized in two different types:

(a.) Collusions between two or more IPs,
(b.) Collusions between two or more CPs.

If the DAs have provided inputs to the MPC protocol, then collusions involving those DAs are classified as $H C$ collusions of type a. and type b.

Vertical collusions are further categorized in three types:

(a.) an IP colluding with a CP,
(b.) a CP colluding with DA,
(c.) a DA colluding with IP

The notation $\{H \| V\} C_{a \| b \| c}^{i, j}$ indicates a horizontal or vertical collusion of type $a$, $b$ or $c$ between the corresponding parties $i$ and $j$.

A collusion of a specific category and type is full-threshold if any number of parties up to $k - 1$ is allowed to collude. The TD never participates in the actual (online) computation phase and is completely disconnected from the machines.

Collusion scenarios and the models formally addressed in the security proofs

Horizontal collusions:

$H C_{a}$ : A collusion between the IPs is not a realistic assumption because parties do not want to share their private data with other participants.
$H C_{b}$ : This is the most common collusion scenario and the one used in the analysis. Here, colluding parties are allowed to exchange information about their input shares in order to completely unmask the shared value.

Vertical collusions:

$V C_{a}$ : As the CPs are deployed inside the perimeter of the corresponding IPs, a collusion of type $V C_{a}^{i, i}$ is assumed. Collusions of type $V C_{a}^{i, j}$ for different indices $i \neq j$ are not a realistic assumption, as all the IPs want to protect their individual inputs.
$V C_{b}$ : This type of collusion does not improve the information available to the colluding CPs, as the information the DA has is publicly available.
$V C_{c}$ : It is assumed that the IPs, being rational players, will not reveal their inputs to the DA.

The security analysis assumes $H C_{b}$ collusions in a full threshold model: $k$ players (CPs) with a maximum of $k - 1$ colluding. That is, any coalition that pools its information contains at most $k - 1$ players. Finally, the result is exposed to the DA in plaintext, following the MPC-as-a-service model.

Security Proof Assumptions:

TD going offline during online computations.
At most $k - 1$ parties colluding.
Appropriate uniformly random secret shares of each value.

The main theorem of the security proof states that the only information learned from the CPs (IPs) is that which is disclosed during the protocol in agreement with all CPs (IPs).

The formal cryptographic security proof can be provided on demand.
