publications | Xinyi Zhang

2026

SIGMOD’26
HIRE: A Hybrid Learned Index for Robust and Efficient Performance under Mixed Workloads

Xinyi Zhang, Liang Liang, Anastasia Ailamaki, and Jianliang Xu

Proceedings of the ACM on Management of Data (SIGMOD), Feb 2026

Indexes are critical for efficient data retrieval and updates in modern databases. Recent advances in machine learning have led to the development of learned indexes, which model the cumulative distribution function of data to predict search positions and accelerate query processing. While learned indexes substantially outperform traditional structures for point lookups, they often suffer from high tail latency, suboptimal range query performance, and inconsistent effectiveness across diverse workloads. To address these challenges, this paper proposes HIRE, a hybrid in-memory index structure designed to deliver efficient performance consistently. HIRE combines the structural and performance robustness of traditional indexes with the predictive power of model-based prediction to reduce search overhead while maintaining worst-case stability. Experimental results on multiple real-world datasets demonstrate that HIRE outperforms both state-of-the-art learned indexes and traditional structures in range-query throughput, tail latency, and overall stability, achieving up to 41.7x higher throughput under mixed workloads and reducing tail latency by up to 98%.
@article{SIGMOD26:hire,
  title     = {HIRE: A Hybrid Learned Index for Robust and Efficient Performance under Mixed Workloads},
  author    = {Zhang, Xinyi and Liang, Liang and Ailamaki, Anastasia and Xu, Jianliang},
  journal   = {Proceedings of the ACM on Management of Data (SIGMOD)},
  volume    = {4},
  number    = {1},
  articleno = {43},
  numpages  = {25},
  month     = feb,
  year      = {2026},
  doi       = {10.1145/3786657},
}

2025

DAC’25
EDGE: DBMS-Empowered Boolean Decomposition for GIG Synthesis

Ruofei Tang, Xuliang Zhu, Xinyi Zhang, Lei Chen, Xing Li, Mingxuan Yuan, and Jianliang Xu

In Proceedings of the 62nd Annual ACM/IEEE Design Automation Conference (DAC), San Francisco, California, United States, 2025

Boolean decomposition is a powerful technique in logic synthesis that breaks down Boolean functions into simpler components. Decomposition-based logic synthesis yields high-quality results and is particularly effective when combined with small-window optimization methods in Gate-Inverter Graphs (GIG). However, the efficiency limitations of current methods have constrained their applicability in handling large and complex logic. To address this challenge, we propose a novel framework, called EDGE, which leverages modern database techniques to accelerate Boolean decomposition, thereby achieving improved synthesis results while maintaining high efficiency. Experimental results demonstrate a runtime speedup of up to 21x and an overall reduction in node count of at least 15% compared to state-of-the-art synthesis methods.
@inproceedings{DAC25:edge,
  title     = {EDGE: DBMS-Empowered Boolean Decomposition for GIG Synthesis},
  author    = {Tang, Ruofei and Zhu, Xuliang and Zhang, Xinyi and Chen, Lei and Li, Xing and Yuan, Mingxuan and Xu, Jianliang},
  booktitle = {Proceedings of the 62nd Annual ACM/IEEE Design Automation Conference (DAC)},
  series    = {DAC '25},
  articleno = {408},
  numpages  = {7},
  location  = {San Francisco, California, United States},
  publisher = {IEEE Press},
  isbn      = {9798331503048},
  year      = {2025},
  doi       = {10.1109/DAC63849.2025.11133306},
}

2024

SIGMOD’24
FedKNN: Secure Federated k-Nearest Neighbor Search

Xinyi Zhang, Qichen Wang, Cheng Xu, Yun Peng, and Jianliang Xu

Proceedings of the ACM on Management of Data (SIGMOD), Mar 2024

Nearest neighbor search is a fundamental task in various domains, such as federated learning, data mining, information retrieval, and biomedicine. With the increasing need to utilize data from different organizations while respecting privacy regulations, private data federation has emerged as a promising solution. However, it is costly to directly apply existing approaches to federated k-nearest neighbor (kNN) search with difficult-to-compute distance functions, like graph or sequence similarity. To address this challenge, we propose FedKNN, a system that supports secure federated kNN search queries with a wide range of similarity measurements. Our system is equipped with a new Distribution-Aware kNN (DANN) algorithm to minimize unnecessary local computations while protecting data privacy. We further develop DANN*, a secure version of DANN that satisfies differential obliviousness. Extensive evaluations show that FedKNN outperforms state-of-the-art solutions, achieving up to 4.8x improvement on federated graph kNN search and up to 2.7x improvement on federated sequence kNN search.
@article{SIGMOD24:fedknn,
  title     = {FedKNN: Secure Federated k-Nearest Neighbor Search},
  author    = {Zhang, Xinyi and Wang, Qichen and Xu, Cheng and Peng, Yun and Xu, Jianliang},
  journal   = {Proceedings of the ACM on Management of Data (SIGMOD)},
  volume    = {2},
  number    = {1},
  articleno = {11},
  numpages  = {26},
  month     = mar,
  year      = {2024},
  doi       = {10.1145/3639266},
}