FedKNN
Secure Federated k-Nearest Neighbor Search (SIGMOD '24)
FedKNN: Secure Federated k-Nearest Neighbor Search Published at ACM SIGMOD 2024 · Xinyi Zhang, Qichen Wang, Cheng Xu, Yun Peng, Jianliang Xu
Problem
Nearest neighbor search is a fundamental primitive across data mining, information retrieval, and biomedicine. When data is distributed across organizations with strict privacy regulations, running kNN queries — especially with expensive distance functions such as graph or sequence similarity — over a private data federation is prohibitively costly with existing techniques.
Contributions
- FedKNN, a system that supports secure federated kNN search over a wide range of similarity measurements.
- DANN (Distribution-Aware kNN) — a new algorithm that minimizes unnecessary local computations while protecting data privacy.
- DANN* — a secure variant of DANN satisfying differential obliviousness, providing strong privacy guarantees with minimal overhead.
Results
- Up to 4.8× improvement on federated graph kNN search.
- Up to 2.7× improvement on federated sequence kNN search.
- Clear privacy / efficiency trade-off supported by the differentially-oblivious variant.