FAQ

XOR Trial Beta FAQ

How long does a user have access to XOR Trial Beta?

Users have 14 days of access from the time they accept the invitation. A user’s operation history will be saved for 30 days.

What if I signed up for XOR Trial Beta but did not have a chance to access it during the evaluation period?

Please email support@inpher.io and ask to have your sessions restarted.

Do I need a credit card to use XOR Trial Beta?

No. XOR Trial Beta is completely free to use during your evaluation period. Every user of XOR Trial Beta needs to accept the Terms & Conditions before starting.

Do I have to cancel my XOR Trial Beta account at the end of the evaluation period?

No action is required. Your access will be disabled and your operation history deleted automatically.

If I have questions about my XOR Trial Beta account, who should I reach out to?

Users of XOR Trial Beta can either click the green support button in the lower left-hand corner of the user interface or email support@inpher.io.

If I want to use my own datasets or other algorithms in XOR Trial Beta, who should I reach out to?

Reach out to sales@inpher.io, or visit the AWS or GCP Marketplace and select a Sandbox License to begin using XOR Secret Computing on your own datasets right away.

I have a colleague who wants to use XOR Trial Beta. How can I give them their own unique access?

Direct them to https://trial.xor.inpher.io and have them accept the Terms & Conditions.

I work at a university or college and would like my students to explore MPC using XOR Trial Beta. Who should I contact for access?

Users of XOR Trial Beta in academia should reach out to support@inpher.io and indicate their university affiliation and what they are interested in.

How do I reset my XOR Trial Beta password?

Click on Setting in the upper left-hand corner of your browser, then enter your new password in the New Password field.

What web browsers can I use XOR Trial Beta in?

XOR Trial Beta is optimized for, but not guaranteed to work in, the following desktop (non-mobile) browsers: Chrome, Safari, Firefox, and Edge.

Does XOR Trial Beta work for mobile devices?

XOR Trial Beta is not optimized for mobile devices.

General FAQ

MPC Protocol: Trusted Dealer vs. Active Security

The Trusted Dealer (TD) approach used in XOR contributes to the scalability, functionality, and communication efficiency required for real-world applications. This model works in practice because Inpher is an independent software vendor; collusion between the TD and the computing parties is precluded because the TD does not participate in the online phase of the computation. The main advantages are:

Efficient triplet generation
Less communication
Efficient high-precision computation through a generalized Beaver approach (Fourier): efficient evaluation of Fourier series is required to improve the precision of nonlinear functions, which has a strong impact when modeling rare-event applications used by our customers, such as fraud, AML, and anomaly detection
Efficient native backend representation (mod $2^{k}$)
Full threshold and information theoretic security

The above points are particularly hard to achieve in a setup without a trusted dealer because of the prohibitively large multiplicative depths involved. The TD model provides full threshold security, which guarantees that no collusion between the computing parties can reveal the data of a single player. The methods that allow full threshold in an active security model are based on Oblivious Transfer or FHE. The first is practical only for a small number of players, whereas the second is practical only for small datasets.
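To make the offline/online split concrete, here is a minimal plaintext simulation of Beaver triplet multiplication over additive shares mod $2^{k}$. It is a sketch for illustration only, not Inpher's implementation: the function names, the choice of 64 bits, and the single-process simulation of all parties are assumptions.

```python
import secrets

K = 64           # assumed ring size; shares live in the integers mod 2**K
MOD = 1 << K

def share(value, n_parties):
    """Split value into n additive shares that sum to value mod 2**K."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recombine additive shares into the plaintext value."""
    return sum(shares) % MOD

def dealer_triplet(n_parties):
    """Offline phase: the dealer samples a and b and shares a, b and c = a*b."""
    a = secrets.randbelow(MOD)
    b = secrets.randbelow(MOD)
    return share(a, n_parties), share(b, n_parties), share(a * b % MOD, n_parties)

def beaver_multiply(x_shares, y_shares, triplet):
    """Online phase: the parties derive shares of x*y without the dealer."""
    a_sh, b_sh, c_sh = triplet
    n = len(x_shares)
    # The parties open the masked values d = x - a and e = y - b.
    # Because a and b are uniformly random, d and e reveal nothing about x, y.
    d = reconstruct([(x_shares[i] - a_sh[i]) % MOD for i in range(n)])
    e = reconstruct([(y_shares[i] - b_sh[i]) % MOD for i in range(n)])
    # x*y = c + d*b + e*a + d*e; the public term d*e is added by one party only.
    z_shares = []
    for i in range(n):
        z_i = (c_sh[i] + d * b_sh[i] + e * a_sh[i]) % MOD
        if i == 0:
            z_i = (z_i + d * e) % MOD
        z_shares.append(z_i)
    return z_shares

x_shares = share(6, 3)
y_shares = share(7, 3)
z_shares = beaver_multiply(x_shares, y_shares, dealer_triplet(3))
print(reconstruct(z_shares))  # 42
```

The dealer only appears in `dealer_triplet`, which needs no knowledge of the inputs; the online step exchanges just the two masked values per multiplication.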

Who hosts the trusted third party in your framework?

The Trusted Dealer (XOR Service) is hosted and managed by Inpher in a secure cloud environment.

Who hosts the XOR Machines (computing parties)?

Each of the $k$ parties holding private distributed data hosts its own XOR Machine. Our customers typically host the XOR Machines on premises or in their own cloud environment. In general, there needs to be one XOR Machine per privacy zone.

Is there a limitation on the number of XOR Machines (computing parties)?

XOR is designed to support arbitrary values of $k$ without theoretical upper limit.

How many data owners can participate in Inpher’s MPC protocol?

Theoretically, the number of data owners is unlimited in our MPC protocol, assuming they are associated with a specific network configuration. XOR is designed to support arbitrary values of $k$ without theoretical upper limit. Our XOR Secret Computing© Platform has a framework, called secure aggregation, that combines federated learning and multi-party computation and would allow analysis of data generated by hundreds of millions of devices. Practically speaking, the guidance given to clients is that in a production scenario with advanced machine learning algorithms, XOR reliably supports ten players (data sources) in a single network configuration.

Once a protocol has been created and saved, can it be updated without having to reset all initial parameters?

Yes. Data analysts can interact with XOR through XOR.py (Python), which can be used in a notebook and made persistent, through our APIs, and through our user interface. With Python and the APIs, any initial parameter, such as the data sources, network configuration, operations, or data stacking, can be updated without having to reset the others.

Can protocols be scheduled to run automatically at predetermined frequencies (e.g., daily, weekly, quarterly)?

Yes.

Describe the tool’s query execution performance time.

This varies and depends largely on the number of data sources, the dimensions of the data being analyzed, and the type of operation used. With a simpler algorithm, like private set intersection, run time might be 1.5x that of plaintext computing. With a more complicated operation, like XGBoost, run time might be 10x that of plaintext computing. One of Inpher’s strongest value propositions is that we consistently iterate on improving the performance of XOR relative to plaintext computing, in terms of both run time and computational overhead. More benchmarking data is available upon request.

What encryption controls does the tool provide to prevent unauthorized disclosure of a data owner’s private data?

Inpher uses additive secret sharing and garbled circuits. The XOR protocols provide full threshold security, such that no subset of the computing parties can collude to obtain any information about the data of the other parties. For example, in a three-party setting, two of the parties would be unable to collude and retrieve any data from the third party. The security model assumes that the computing parties are semi-honest and do not collude with the XOR Service, which is why the XOR Service does not participate in the online compute phase of the process.
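The full threshold property of additive secret sharing can be illustrated with a small plaintext simulation of a three-party sum. This is a hedged sketch, not XOR's protocol code: the function names and the 64-bit modulus are assumptions, and all parties are simulated in one process.

```python
import secrets

MOD = 1 << 64  # assumed share space: integers mod 2**64

def share(value, n_parties):
    """Split value into n additive shares that sum to value mod 2**64."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def secure_sum(private_inputs):
    """Sum the parties' inputs while each party only ever sees shares."""
    n = len(private_inputs)
    # Row i holds the shares that party i sends out; column j is what party j receives.
    sent = [share(v, n) for v in private_inputs]
    # Each party adds up the shares it received, locally.
    partial = [sum(sent[i][j] for i in range(n)) % MOD for j in range(n)]
    # Recombining the partial sums reveals only the total.
    return sum(partial) % MOD

def consistent_last_share(known_shares, candidate_secret):
    """The missing share under which the known shares would encode candidate_secret."""
    return (candidate_secret - sum(known_shares)) % MOD

print(secure_sum([10, 20, 12]))  # 42

# Two colluding parties hold two of the three shares of a secret. Every
# candidate secret is matched by exactly one possible third share, so the
# two known shares alone rule nothing out.
shares = share(5, 3)
for candidate in (0, 5, 999):
    missing = consistent_last_share(shares[:2], candidate)
    assert (sum(shares[:2]) + missing) % MOD == candidate
```

Because the missing share is uniformly random, every candidate secret remains equally likely from the colluders' point of view, which is the information-theoretic guarantee the answer refers to.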

What privacy controls or standards are used to prevent unauthorized user access to the tool and data?

XOR Machines are managed and operated by the respective parties for the “online phase” of communication, whereas the XOR Service is only responsible for triplet generation in the “offline phase.” This means that Inpher cannot access the underlying data in any XOR Machine, nor can any party participating in the MPC network see another party’s data.

What type of data sources can the tool ingest or connect to (e.g., Oracle, SQL, MongoDB)?

Any data source that supports ODBC can be connected out-of-the-box. Any additional connectors can be developed on demand.

What type of data format can the tool ingest (e.g., CSV, JSON, Avro, Parquet, TXT, Excel)?

Data needs to be in .csv format for the sandbox. Other formats, such as JSON and Avro, are supported in production via connectors.

Does the tool support different data structures (e.g., structured, semi-structured, unstructured)?

XOR currently supports structured data. Unstructured data requires pre-processing (data engineering) before it can be used in XOR.

Does XOR store data within the application?

No. No data is stored in the XOR Platform.

Does the tool have any machine learning capabilities?

Yes, in the sense that machine learning algorithms are among the analytics and operations available to a user performing MPC with the XOR Service. XOR does not have ML embedded in the sense of making process recommendations or improvements based on user interaction.

What type of skills, knowledge, or training is needed to become proficient with the tool?

Inpher makes it easy for almost anyone, from a business unit owner to a data engineer, a cryptographer, or a data scientist, to get up to speed with XOR. Inpher provides a significant amount of training, supported by documentation, during POC and production licenses. Practically speaking, most day-to-day users are machine learning data scientists who are comfortable using Python and performing basic to complex machine learning operations. Anyone interested in XOR and Secret Computing can use our XOR Trial solution, which is freely available without download or installation requirements at www.inpher.io. XOR Trial has a number of pre-designed use cases with walkthroughs in a Sandbox environment hosted by Inpher.

Can XOR be hosted in a cloud environment?

The XOR Service is currently hosted in GCP; however, any data source can be located in a cloud, hybrid, or on-prem environment.

If so, which ones (e.g., Amazon AWS, Microsoft Azure, Google Cloud)?

Inpher supports interaction with all of the major cloud services.

Can XOR scale to support increases in workload and/or users?

Yes, XOR is designed to provide an MPC solution that scales in both the number of users and the number of data sources. Users of the XOR Platform can deploy XOR Machines to data sources and enable additional users directly from the Platform.

How can XOR address the challenges of data cleaning, feature extraction and hyperparameter tuning on data sources before an operation is executed?

XOR addresses those challenges in the following ways:

Data owners enabling privacy-preserving analytics with XOR can choose to expose arbitrary structured and unstructured metadata, allowing for rich feature discovery and documentation:

Metadata is typically pulled from a data catalogue, where XOR exposes only the schema of the features without exposing individual samples
Metadata can be filtered by applying differential privacy, yielding differentially private summary statistics about numeric features
True samples and synthetic samples can be exposed by data owners to support a more visual understanding
Functional evaluations (e.g. histograms, mean, max) can be run privately with XOR, or computed locally and stored as metadata, either directly or after applying differential privacy

When raw features require cleaning and preprocessing, data analysts run predefined local or distributed pre-processing and cleanup functions (depending on how the data is virtually stacked; hstack vs. vstack). See a selection of algorithms here: https://dev.inpher.io/xor/algorithms/#preprocessing

XOR’s most commonly used operations are:

Fill missing values and NaN (svd impute, knn impute, soft impute)
Engineer synthetic features (moving averages, correlation across multiple features owned by different parties)
Local / distributed filtering of data (local processing)

Hyperparameter tuning starts with building a baseline model and running a typical sequence of pre-processing, training, inference, and model metric evaluation. After this initial process, iterative pipelines are run that test different hyperparameters in the training process, with metrics compared against the baseline model until a satisfactory parameter space has been found. This is often expanded to a “grid search” of parameters and codified with xor-py.
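The baseline-then-grid-search loop described above can be sketched in plain Python. This example deliberately uses no xor-py calls, since that API is not shown here: the toy one-feature ridge model, the data, and every function and parameter name are assumptions chosen only to show the control flow.

```python
from itertools import product

def train(xs, ys, l2, fit_intercept):
    """Closed-form one-feature ridge regression; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n if fit_intercept else 0.0
    mean_y = sum(ys) / n if fit_intercept else 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + l2)
    return slope, mean_y - slope * mean_x

def mse(model, xs, ys):
    """Mean squared error of the model on a validation set."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(train_set, valid_set, grid, baseline):
    """Score the baseline first, then every grid point; keep the best."""
    best_params = baseline
    best_score = mse(train(*train_set, **baseline), *valid_set)
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = mse(train(*train_set, **params), *valid_set)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Made-up data, roughly y = 2x + 1.
train_set = ([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8])
valid_set = ([5.0, 6.0], [11.0, 13.1])
baseline = {"l2": 10.0, "fit_intercept": False}
grid = {"l2": [0.0, 0.1, 1.0, 10.0], "fit_intercept": [False, True]}

baseline_score = mse(train(*train_set, **baseline), *valid_set)
best_params, best_score = grid_search(train_set, valid_set, grid, baseline)
print(baseline_score, best_params, best_score)
```

In a privacy-preserving setting, `train` and `mse` would be replaced by the corresponding secret computations, with only the metric revealed to the analyst at each iteration.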

How does XOR provide explainability of the ML algorithms if only the output is shared and trained model parameters are not?

XOR provides the ability to share trained, fitted model parameters. Additionally, given the flexible nature of XOR, explainability and evidence packages can be enabled and configured to share only permitted outputs. For example, depending on privacy requirements, evidence packages can be configured for parity with plaintext evidence packages (e.g. score, features, model id, explanation codes).

How does XOR handle functions that cannot be approximated well enough by Fourier theory (e.g. non-continuous functions)?

For non-continuous functions, Inpher’s XOR supports piecewise Fourier approximations where different Fourier approximations in the different regions of continuity are utilized.

Inpher’s growing IP portfolio speaks to applications of approximations with Fourier splines for such purposes.

Example: USPTO High-precision privacy-preserving real-valued function evaluation

Paper: Manticore: Efficient Framework for Scalable Secure Multiparty Computation Protocols
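As a plaintext illustration of why separate approximations per region of continuity help, the sketch below fits truncated half-range cosine series to a function with a jump, once per region and once across the whole interval. It is not XOR's method or the patented construction: the example function, the number of terms, and the midpoint-rule integration are all assumptions, and no secret computation is involved.

```python
import math

def cosine_coeffs(g, length, n_terms, samples=2000):
    """Midpoint-rule estimate of the half-range cosine coefficients of g on [0, length]."""
    dt = length / samples
    ts = [(k + 0.5) * dt for k in range(samples)]
    return [
        (2.0 / length) * dt * sum(g(t) * math.cos(n * math.pi * t / length) for t in ts)
        for n in range(n_terms + 1)
    ]

def cosine_eval(coeffs, length, t):
    """Evaluate the truncated cosine series at t in [0, length]."""
    total = coeffs[0] / 2.0
    for n in range(1, len(coeffs)):
        total += coeffs[n] * math.cos(n * math.pi * t / length)
    return total

def f(x):
    """Example target on [-1, 1] with a jump at x = 0 (chosen for illustration)."""
    return x * x if x < 0 else 1.0 + x

N_TERMS = 30
# One series per region of continuity, each on its own interval shifted to [0, 1].
left = cosine_coeffs(lambda t: (t - 1.0) ** 2, 1.0, N_TERMS)   # x in [-1, 0)
right = cosine_coeffs(lambda t: 1.0 + t, 1.0, N_TERMS)         # x in [0, 1]
# One series across the jump, on [-1, 1] shifted to [0, 2].
whole = cosine_coeffs(lambda t: f(t - 1.0), 2.0, N_TERMS)

def piecewise(x):
    return cosine_eval(left, 1.0, x + 1.0) if x < 0 else cosine_eval(right, 1.0, x)

def single(x):
    return cosine_eval(whole, 2.0, x + 1.0)

grid = [k / 100 for k in range(-100, 101)]
err_piecewise = max(abs(piecewise(x) - f(x)) for x in grid)
err_single = max(abs(single(x) - f(x)) for x in grid)
print(err_piecewise, err_single)
```

The single series converges to the midpoint of the jump at x = 0, so its worst-case error stays near half the jump height no matter how many terms are used, while the per-region series shrink their error as terms are added.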

How does XOR handle non-numerical data, like categorical data, strings and images?

XOR is primarily built for private, structured data analysis. Typical preprocessing is utilized either locally in plaintext, or via XOR for privacy-preserving preprocessing and feature engineering.

Categorical data is typically processed with techniques such as one-hot encoding, boolean variables, etc.

Strings are often utilized in the context of private set intersections (privacy-preserving inner joins) and fuzzy matching for entity resolution and linkage. Strings and audio can also be preprocessed locally into word embeddings and vectors for further structured, privacy-preserving analysis.

Images can be secret shared pixel by pixel, but most machine learning models rely on local processing of images, extracting numerical features into common schemas across a federated setting.
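The one-hot encoding mentioned above for categorical data can be sketched as a local plaintext preprocessing step. The function name and the sample categories are hypothetical; the one design point worth noting is that in a multi-party setting the category list would need to be agreed on beforehand, so that every party produces columns in the same order.

```python
def one_hot_encode(values, categories=None):
    """Return (categories, rows), where each row has a single 1 marking its category."""
    if categories is None:
        # Without an agreed list, fall back to the sorted local categories.
        categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows

# Hypothetical shared schema agreed on by all parties ahead of time.
shared_categories = ["card", "cash", "wire"]
categories, rows = one_hot_encode(["card", "wire", "card"], shared_categories)
print(categories)  # ['card', 'cash', 'wire']
print(rows)        # [[1, 0, 0], [0, 0, 1], [1, 0, 0]]
```

The resulting numeric columns can then be treated like any other structured feature.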

How will unsupervised learning be supported, as it requires clustering with a dataset? The original data can’t be altered; otherwise clustering will not work, since all clustering is based on distances between the original observations.

The following algorithms for clustering / unsupervised learning are currently in XOR Alpha and will soon be released as a Beta. https://dev.inpher.io/xor/algorithms/#recommender-system

Unlike some other cryptographic approaches, MPC does not “alter” or “garble” the data, which makes these types of algorithms possible while retaining high precision.

Is there any encryption standard that is being applied to the computation? Is it applied before or after the computation, before the partial results are delivered? Please provide details about any encryption standards being applied.

Transport Layer Security (TLS) is applied to the secret shares of the MPC computation chain.

Secret shares are end-to-end encrypted on each xor-machine with the public key of the analyst requesting the computation. These TLS-encrypted secret shares are then downloaded by the analyst, who decrypts each individual secret share and then recombines all the secret share results into a plaintext result locally in the analyst environment. This guarantees that neither the data owners nor Inpher can see the plaintext result of a computation unless they have access to the private key of the analyst.

Anatomy of a Secret Compute & XOR Phases: https://dev.inpher.io/xor/concepts/#anatomy-of-a-secret-computation

What happens if we have only one data set that we are interested in performing analysis on? How does XOR divide and encrypt the data in that case?

Single privacy zone data analysis is not typically a use case for XOR given its primary benefit in securing and privatising multiparty computation scenarios.

If required, XOR can secret share a single data set for secure hosting across a Private Distributed Data Store (PDDStore) consisting of two private domains. A Secret Compute can then be run across the data held in the PDDStore.

How does XOR handle a situation where the function being computed is specified in a way that it can give away information about specific silos/data ponds?

XOR enables secure evaluation of functions across multiple private data sources, cryptographically guaranteeing that no information is shared between the data sources and that the analyst conducting the computation only sees the output of the function without revealing the private inputs.

XOR’s configurable and integrable Access Controls determine which data, analysts, and functions are permitted to take part in a computation, and where and when.

MPC guarantees the privacy of the inputs, so that nothing is learnt other than the output of the function.

In 2022, XOR plans to add further granularity to its Access Controls to account for information leakage via functional outputs: relative leakage and epsilon levels will be configurable and will interact with more classic RBAC/ABAC.
