FedKNN: Secure Federated k-Nearest Neighbor Search.

Published in ACM SIGMOD International Conference on Management of Data (SIGMOD), 2024

Abstract

Nearest neighbor search is a fundamental task in various domains, such as federated learning, data mining, information retrieval, and biomedicine. With the increasing need to utilize data from different organizations while respecting privacy regulations, private data federation has emerged as a promising solution. However, it is costly to directly apply existing approaches to federated k-nearest neighbor (kNN) search with difficult-to-compute distance functions, like graph or sequence similarity. To address this challenge, we propose FedKNN, a system that supports secure federated kNN search queries with a wide range of similarity measurements. Our system is equipped with a new Distribution-Aware kNN (DANN) algorithm to minimize unnecessary local computations while protecting data privacy. We further develop DANN*, a secure version of DANN that satisfies differential obliviousness. Extensive evaluations show that FedKNN outperforms state-of-the-art solutions, achieving up to 4.8× improvement on federated graph kNN search and up to 2.7× improvement on federated sequence kNN search. Additionally, our approach offers a trade-off between privacy and efficiency, providing strong privacy guarantees with minimal overhead.

Citation

Xinyi Zhang, Qichen Wang, Cheng Xu, Yun Peng, and Jianliang Xu. “FedKNN: Secure Federated k-Nearest Neighbor Search.” ACM SIGMOD International Conference on Management of Data (SIGMOD), June 2024.

Supplemental Material

Technical Report

Experiment Scripts