Skip to content
/ fss Public

Function secret sharing (FSS) primitives including distributed point/comparison function (DPF/DCF)

License

Notifications You must be signed in to change notification settings

myl7/fss

Repository files navigation

myl7/fss

Function secret sharing (FSS) primitives including:

  • 2-party distributed point function (DPF)
  • 2-party distributed comparison function (DCF)

Documentation

Features:

  • First-class support for GPU (based on CUDA)
  • Well-commented and documented
  • Header-only library, easy for integration

Introduction

Multi-party computation (MPC) is a subfield of cryptography that aims to enable a group of parties (e.g., servers) to jointly compute a function over their inputs while keeping the inputs private.

Secret sharing is a method that distributes a secret among a group of parties, such that no individual party holds any information about the secret. For example, a number $x$ can be secret-shared into $x_0, x_1$ via $x = x_0 + x_1$.

FSS is a scheme to secret-share a function into a group of function shares. Each function share, called as a key, can be individually evaluated on a party. The outputs of the keys are the shares of the original function output. FSS consists of 2 methods: Gen for generating function shares as keys and Eval for evaluating a key to get an output share. FSS's workflow is shown below:

DPF/DCF are FSS for point/comparison functions. They are called out because 2-party DPF/DCF can have $O(\log N)$ key size, where $N$ is the input domain size. Meanwhile, 3-or-more-party DPF/DCF and general FSS have $O(\sqrt{N})$ key size. More details, including the definitions and the implementation details that users must care about, can be found in the documentation of dpf.cuh and dcf.cuh files.

Get Started

Prerequisites

  • CMake >= 3.22
  • CUDA toolkit >= 12.0 (for C++20 support). Tested on the latest CUDA toolkit.
  • OpenSSL 3 (only required for CPU with AES-128 MMO PRG)

Build

Clone the repository:

git clone https://github.com/myl7/fss.git
cd fss

Option A: Install via CMake and use find_package

cmake -B build -DBUILD_TESTING=OFF -DCMAKE_INSTALL_PREFIX=/path/to/install
cmake --build build
cmake --install build

Then in your project's CMakeLists.txt:

find_package(fss REQUIRED)
target_link_libraries(your_target fss::fss)

When configuring your project, point CMake to the install prefix:

cmake -B build -DCMAKE_PREFIX_PATH=/path/to/install

Option B: Use as a subdirectory (header-only)

Without installing, you can define the target directly in your CMakeLists.txt, like the samples do:

add_library(fss INTERFACE)
target_include_directories(fss INTERFACE "/path/to/fss/include")
target_compile_features(fss INTERFACE cxx_std_20 cuda_std_20)

Then link it in your project:

target_link_libraries(your_target fss)

CPU

This walks through using DPF and DCF on the CPU with AES-128 MMO PRG. This PRG requires OpenSSL.

  1. Include the headers and set up type aliases:

    #include <fss/dpf.cuh>
    #include <fss/dcf.cuh>
    #include <fss/group/bytes.cuh>
    #include <fss/prg/aes128_mmo.cuh>
    
    constexpr int kInBits = 8;  // Input domain: 2^8 = 256 values
    using In = uint8_t;
    using Group = fss::group::Bytes;
    
    // DPF uses mul=2, DCF uses mul=4
    using DpfPrg = fss::prg::Aes128Mmo<2>;
    using DcfPrg = fss::prg::Aes128Mmo<4>;
    using Dpf = fss::Dpf<kInBits, Group, DpfPrg, In>;
    using Dcf = fss::Dcf<kInBits, Group, DcfPrg, In>;
  2. Create the PRG with AES keys and instantiate DPF/DCF:

    // DPF PRG needs 2 AES keys
    unsigned char key0[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
    unsigned char key1[16] = {16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
    const unsigned char *keys[2] = {key0, key1};
    auto ctxs = DpfPrg::CreateCtxs(keys);
    
    DpfPrg prg(ctxs);
    Dpf dpf{prg};
  3. Run Gen to generate correction words (keys) from secret inputs:

    In alpha = 42;                  // Secret point / threshold
    int4 beta = {7, 0, 0, 0};      // Secret payload (LSB of .w must be 0)
    
    // Random seeds for the two parties (LSB of .w must be 0)
    int4 seeds[2] = {
        {0x11111111, 0x22222222, 0x33333333, 0x44444440},
        {0x55555555, 0x66666666, 0x77777777, static_cast<int>(0x88888880u)},
    };
    
    Dpf::Cw cws[kInBits + 1];
    dpf.Gen(cws, seeds, alpha, beta);
  4. Run Eval on each party and reconstruct using the group:

    // Each party evaluates independently
    int4 y0 = dpf.Eval(false, seeds[0], cws, alpha);
    int4 y1 = dpf.Eval(true, seeds[1], cws, alpha);
    
    // Reconstruct via the group: convert to group elements, add, convert back
    // For Bytes group this is XOR; for Uint group this is arithmetic addition
    int4 sum = (Group::From(y0) + Group::From(y1)).Into();
    // sum == beta at x == alpha, 0 otherwise
  5. Free the AES contexts when done:

    DpfPrg::FreeCtxs(ctxs);

DCF follows the same pattern — use DcfPrg (mul=4, needs 4 AES keys), Dcf, and Dcf::Cw. The reconstructed output equals beta when x < alpha and 0 otherwise.

Link with OpenSSL in your CMakeLists.txt:

find_package(OpenSSL REQUIRED)
target_link_libraries(your_target fss OpenSSL::Crypto)

See samples/dpf_dcf_cpu.cu for the complete working example.

GPU

This walks through using DPF and DCF on the GPU with ChaCha PRG.

  1. Include the headers and set up type aliases:

    #include <fss/dpf.cuh>
    #include <fss/dcf.cuh>
    #include <fss/group/bytes.cuh>
    #include <fss/prg/chacha.cuh>
    
    constexpr int kInBits = 8;
    using In = uint8_t;
    using Group = fss::group::Bytes;
    
    // DPF uses mul=2, DCF uses mul=4
    using DpfPrg = fss::prg::ChaCha<2>;
    using DcfPrg = fss::prg::ChaCha<4>;
    using Dpf = fss::Dpf<kInBits, Group, DpfPrg, In>;
    using Dcf = fss::Dcf<kInBits, Group, DcfPrg, In>;
  2. Set up a nonce in constant memory and create the PRG in a kernel:

    __constant__ int kNonce[2] = {0x12345678, 0x9abcdef0};
    
    __global__ void GenKernel(Dpf::Cw *cws, const int4 *seeds, const In *alphas, const int4 *betas) {
        int tid = blockIdx.x * blockDim.x + threadIdx.x;
    
        DpfPrg prg(kNonce);
        Dpf dpf{prg};
    
        int4 s[2] = {seeds[tid * 2], seeds[tid * 2 + 1]};
        dpf.Gen(cws + tid * (kInBits + 1), s, alphas[tid], betas[tid]);
    }
  3. Prepare host data, copy to device, and launch the Gen kernel:

    int4 *d_seeds = /* cudaMalloc + cudaMemcpy seeds to device */;
    In *d_alphas = /* cudaMalloc + cudaMemcpy alphas to device */;
    int4 *d_betas = /* cudaMalloc + cudaMemcpy betas to device */;
    
    Dpf::Cw *d_cws;
    cudaMalloc(&d_cws, sizeof(Dpf::Cw) * (kInBits + 1) * N);
    
    GenKernel<<<blocks, threads>>>(d_cws, d_seeds, d_alphas, d_betas);
  4. Write and launch an Eval kernel for each party, then copy results back:

    __global__ void EvalKernel(int4 *ys, bool party, const int4 *seeds, const Dpf::Cw *cws, const In *xs) {
        int tid = blockIdx.x * blockDim.x + threadIdx.x;
    
        DpfPrg prg(kNonce);
        Dpf dpf{prg};
    
        ys[tid] = dpf.Eval(party, seeds[tid], cws + tid * (kInBits + 1), xs[tid]);
    }
    
    // Launch for party 0 and party 1, then copy d_ys back to host
    EvalKernel<<<blocks, threads>>>(d_ys, false, d_seeds0, d_cws, d_xs);
    EvalKernel<<<blocks, threads>>>(d_ys, true, d_seeds1, d_cws, d_xs);
  5. Reconstruct on the host using the group, same as the CPU case:

    int4 sum = (Group::From(h_y0s[i]) + Group::From(h_y1s[i])).Into();

DCF follows the same pattern — use DcfPrg (mul=4), Dcf, and Dcf::Cw.

See samples/dpf_dcf_gpu.cu for the complete working example.

License

Apache License, Version 2.0

Copyright (C) 2026 Yulong Ming i@myl.moe

About

Function secret sharing (FSS) primitives including distributed point/comparison function (DPF/DCF)

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors