Faculty Duo Wins the VLDB Test of Time Award
Fusheng Wang, professor in the Department of Biomedical Informatics and the Department of Computer Science, and Joel Saltz, Cherith Professor and Founding Chair of the Department of Biomedical Informatics, have been awarded the 2024 Test of Time Award from the Very Large Data Bases (VLDB) Endowment.

The duo was honored for a 2013 paper that helped create an ecosystem for big spatial analytics. That ecosystem is now widely adopted for its large-scale capacity, scalability, compatibility with low-cost commodity processors, and open-source accessibility, which have made it valuable across a wide range of applications.
Wang will receive the award at the VLDB conference in Guangzhou, China, on August 29.
The award recognizes a paper presented at the VLDB conference 10 to 12 years earlier that best meets the “test of time.” In picking a winner, the committee evaluates the paper's impact, especially in practice.
The paper, “Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce,” introduced a project that pioneered the move of spatial data analytics from traditional commercial parallel database systems to large-scale clusters of commodity processors running open-source software. It has been a continuous source of inspiration for subsequent work on scalable and parallel spatial data processing in both academia and industry.
The project was originally developed by a group of researchers from Stony Brook University, Emory University, and the Ohio State University. Their work started the development of a new, low-cost spatial data warehousing and GIS ecosystem based on open-source software, one that achieves high performance, high throughput, and high scalability through massively parallel processing across a cluster of commodity processors.
At the time of publication, Saltz and Wang were faculty at Emory University; both later joined the Stony Brook faculty. Ablimit Aji, Hoang Vo, Rubao Lee, Qiaoling Liu, and Xiaodong Zhang served as co-authors on the paper. Vo later attended a doctoral program at Stony Brook.

Since the release of the Hadoop-GIS open-source software and the paper's publication in VLDB in 2013, the project's technical breakthroughs have inspired numerous academic research projects. Approximately 833 citations (817 papers and 19 patents) as of May 28, 2024, indicate that many subsequent projects have produced revised designs, algorithms, and implementations of Hadoop-GIS. Notably, some of these follow-up open-source projects, such as Apache Sedona, have matured enough for wide public use.
The Hadoop-GIS project pioneered a systematic approach to the design and implementation of data processing platforms in a scalable and sustainable way. Beyond its academic impact, Hadoop-GIS has profoundly influenced several GIS production systems currently in use.
“The VLDB Test of Time Award underscores the lasting significance of this research, which continues to shape how we approach big spatial data challenges in an increasingly data-driven world,” said Saltz.
From its inception, the Hadoop-GIS project has been designed to efficiently support high-performance queries on large volumes of spatial data, operating on a shared-nothing architecture. One groundbreaking innovation of Hadoop-GIS is its on-demand approach to query processing: it can run spatial query engines on as many data partitions as needed, a paradigm shift from traditional spatial data management systems.
Through dynamic on-demand indexing and elegant handling of objects that cross partition boundaries, Hadoop-GIS can run spatial queries at extreme scale. In addition, Hadoop-GIS proposed a declarative spatial query language on top of MapReduce through integration with Apache Hive, which has inspired various implementations in the years that followed.
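For readers curious about the mechanics, the partition-and-join idea described above can be illustrated with a minimal Python sketch. This is not Hadoop-GIS code. The box format, function names, and fixed grid are assumptions made for illustration, and a brute-force comparison inside each tile stands in for the on-demand index. Objects that cross tile boundaries are copied into every tile they touch, and duplicate matches are removed at the end.

```python
from collections import defaultdict
from itertools import product

# Hypothetical box format for this sketch: (id, xmin, ymin, xmax, ymax)

def tiles_for(box, tile_size):
    """Return every grid tile (col, row) that the box overlaps."""
    _, xmin, ymin, xmax, ymax = box
    cols = range(int(xmin // tile_size), int(xmax // tile_size) + 1)
    rows = range(int(ymin // tile_size), int(ymax // tile_size) + 1)
    return product(cols, rows)

def intersects(a, b):
    """True if the two bounding boxes overlap or touch."""
    return a[1] <= b[3] and b[1] <= a[3] and a[2] <= b[4] and b[2] <= a[4]

def partition(boxes, tile_size):
    """Assign each box to all tiles it overlaps (boundary objects are replicated)."""
    tiles = defaultdict(list)
    for box in boxes:
        for tile in tiles_for(box, tile_size):
            tiles[tile].append(box)
    return tiles

def spatial_join(left, right, tile_size):
    """Join two datasets tile by tile; each tile could run on a separate worker."""
    left_tiles = partition(left, tile_size)
    right_tiles = partition(right, tile_size)
    pairs = set()  # a set drops duplicates produced by replicated boundary objects
    for tile in left_tiles.keys() & right_tiles.keys():
        # Brute-force local join; a real system would build an index per tile here.
        for a in left_tiles[tile]:
            for b in right_tiles[tile]:
                if intersects(a, b):
                    pairs.add((a[0], b[0]))
    return sorted(pairs)

left = [("a1", 0, 0, 4, 4), ("a2", 9, 9, 12, 12)]
right = [("b1", 3, 3, 11, 11), ("b2", 20, 20, 21, 21)]
result = spatial_join(left, right, tile_size=10)
print(result)
```

In this toy input, boxes a2 and b1 both span four tiles, so the same matching pair is found several times and reported once. Because each tile is processed independently, the per-tile work maps naturally onto a shared-nothing cluster.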
Hadoop-GIS stood as a testament to the potential of combining spatial data analysis with scalable distributed computing frameworks. It demonstrated that high performance and scalability in managing and querying massive spatial data can be achieved without resorting to specialized hardware, marking a significant milestone in the evolution of spatial data warehousing systems. This pioneering work set a new direction for systems development in the GIS field.
The project was initially motivated by spatial big data derived from digital pathology, a field in which Saltz is a pioneer. It led to two NSF awards the year after the paper was published: an NSF CAREER award for Wang and a $5 million NSF DIBBS award, for which Wang served as co-PI.
— Beth Squire