VerdictDB builds on the theory of Approximate Query Processing (AQP) and on our novel AQP-as-a-middleware architecture. Its large speedups are possible because many important statistics of an entire dataset can be reliably estimated from a small fraction of that data. VerdictDB exploits the fact that many aggregate functions common in analytic queries can be expressed in terms of those statistics, which in turn can be estimated from samples.
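To make the idea concrete, the following is a minimal sketch of sample-based estimation, not VerdictDB's actual implementation. It assumes uniform (Bernoulli) sampling and a normal-approximation 95% confidence interval; the function name `estimate_from_sample` and its parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def estimate_from_sample(values, sampling_ratio, seed=0):
    """Estimate AVG, SUM and COUNT of `values` from a uniform random sample.

    Hypothetical helper for illustration; not part of any VerdictDB API.
    """
    rng = random.Random(seed)
    # Bernoulli sampling: keep each row independently with probability sampling_ratio.
    sample = [v for v in values if rng.random() < sampling_ratio]
    n = len(sample)
    if n < 2:
        raise ValueError("sample too small to estimate an error bound")

    sample_sum = sum(sample)
    mean = sample_sum / n
    # Unbiased sample variance, used to bound the error of the mean.
    variance = sum((v - mean) ** 2 for v in sample) / (n - 1)
    # Half-width of an approximate 95% confidence interval (central limit theorem).
    avg_error = 1.96 * math.sqrt(variance / n)

    return {
        "avg": mean,
        "avg_error": avg_error,
        # Scale sample totals by 1 / sampling_ratio to estimate full-data totals.
        "count": n / sampling_ratio,
        "sum": sample_sum / sampling_ratio,
        "rows_read": n,
    }


# Synthetic data standing in for a large table column (an assumption of this sketch).
data_rng = random.Random(42)
data = [data_rng.uniform(0, 100) for _ in range(200_000)]

result = estimate_from_sample(data, sampling_ratio=0.01)
exact_avg = sum(data) / len(data)
print(f"approx avg = {result['avg']:.2f} +/- {result['avg_error']:.2f}")
print(f"exact avg  = {exact_avg:.2f}")
print(f"rows read  = {result['rows_read']} of {len(data)}")
```

The sketch reads roughly 1% of the rows, yet the estimated average comes with an error bound that shrinks in proportion to the inverse square root of the sample size. This is why a small sample can stand in for the full data for aggregates such as AVG, SUM, and COUNT.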

Our research on Approximate Query Processing has produced many papers at premier database conferences.

  1. Yongjoo Park, Jingyi Qing, Xiaoyang Shen, Barzan Mozafari. BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees. SIGMOD 2019.
  2. Yongjoo Park, Barzan Mozafari, Joseph Sorenson, Junhao Wang. VerdictDB: Universalizing Approximate Query Processing. SIGMOD 2018.
  3. Barzan Mozafari. Approximate Query Engines: Commercial Challenges and Research Opportunities. SIGMOD 2017 Keynote.
  4. Yongjoo Park, Ahmad Shahab Tajik, Michael Cafarella, Barzan Mozafari. Database Learning: Toward a Database that Becomes Smarter Every Time. SIGMOD 2017.
  5. Yongjoo Park. Active Database Learning. CIDR 2017.
  6. Yongjoo Park, Michael Cafarella, Barzan Mozafari. Visualization-Aware Sampling for Very Large Databases. ICDE 2016.
  7. Yongjoo Park, Michael Cafarella, Barzan Mozafari. Neighbor-Sensitive Hashing. PVLDB 2015.
  8. Barzan Mozafari and Ning Niu. A Handbook for Building an Approximate Query Engine. IEEE Data Engineering Bulletin, 2015.
  9. Barzan Mozafari. Verdict: A System for Stochastic Query Planning. CIDR 2015.
  10. Kai Zeng, Shi Gao, Barzan Mozafari and Carlo Zaniolo. The Analytical Bootstrap: a New Method for Fast Error Estimation in Approximate Query Processing. SIGMOD 2014.
  11. Sameer Agarwal, Henry Milner, Ariel Kleiner, Ameet Talwalkar, Michael Jordan, Samuel Madden, Barzan Mozafari and Ion Stoica. Knowing When You’re Wrong: Building Fast and Reliable Approximate Query Processing Systems. SIGMOD 2014.
  12. Kai Zeng, Shi Gao, Jiaqi Gu, Barzan Mozafari and Carlo Zaniolo. ABS: a System for Scalable Approximate Queries with Accuracy Guarantees. SIGMOD 2014.
  13. Sameer Agarwal, Barzan Mozafari, Aurojit Panda, Henry Milner, Samuel Madden, and Ion Stoica. BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data. EuroSys 2013.
  14. Sameer Agarwal, Aurojit Panda, Barzan Mozafari, Anand P. Iyer, Samuel Madden, and Ion Stoica. Blink and It’s Done: Interactive Queries on Very Large Data. PVLDB 2012.