publications

Below is a selection of my publications. A complete and up-to-date overview is available on my Google Scholar page.

2025

  1. Rethinking Dataset Discovery with DataScout
    Rachel Lin, Bhavya Chopra, Wenjing Lin, and 3 more authors
    In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology, 2025
  2. How well do LLMs reason over tabular data, really?
    Cornelius Wolff and Madelon Hulsebos
    In Proceedings of the 4th Table Representation Learning Workshop, 2025
  3. Metadata Matters in Dense Table Retrieval
    Daniel Gomm and Madelon Hulsebos
    In ELLIS workshop on Representation Learning and Generative Models for Structured Data, 2025
  4. Querying Templatized Document Collections with Large Language Models
    Yiming Lin, Madelon Hulsebos, Ruiying Ma, and 4 more authors
    In 2025 IEEE 41st International Conference on Data Engineering (ICDE), 2025
  5. Detecting Contextually Sensitive Data with AI
    Liang Telkamp, Melanie Rabier, Javier Teran, and 1 more author
    2025

2024

  1. TARGET: Benchmarking Table Retrieval for Generative Tasks
    Xingyu Ji, Aditya Parameswaran, and Madelon Hulsebos
    In NeurIPS 2024 Third Table Representation Learning Workshop, 2024
  2. It Took Longer than I was Expecting: Why is Dataset Search Still so Hard?
    Madelon Hulsebos, Wenjing Lin, Shreya Shankar, and 1 more author
    In Proceedings of the 2024 Workshop on Human-In-the-Loop Data Analytics, Santiago, Chile, 2024
  3. SchemaPile: A Large Collection of Relational Database Schemas
    Till Döhmen, Radu Geacu, Madelon Hulsebos, and 1 more author
    Proc. ACM Manag. Data, May 2024
  4. SPADE: Synthesizing Data Quality Assertions for Large Language Model Pipelines
    Shreya Shankar, Haotian Li, Parth Asawa, and 7 more authors
    Proc. VLDB Endow., Aug 2024

2023

  1. Observatory: Characterizing Embeddings of Relational Tables
    Tianji Cong, Madelon Hulsebos, Zhenjie Sun, and 2 more authors
    Proc. VLDB Endow., Dec 2023
  2. GitTables: A Large-Scale Corpus of Relational Tables
    Madelon Hulsebos, Çağatay Demiralp, and Paul Groth
    Proc. ACM Manag. Data, May 2023
  3. Models and Practice of Neural Table Representations
    Madelon Hulsebos, Xiang Deng, Huan Sun, and 1 more author
    In Companion of the 2023 International Conference on Management of Data, Seattle, WA, USA, May 2023
  4. Introducing the Observatory Library for End-to-End Table Embedding Inference
    Tianji Cong, Zhenjie Sun, Paul Groth, and 2 more authors
    In NeurIPS 2023 Second Table Representation Learning Workshop, May 2023

2021

  1. Augmenting Decision Making via Interactive What-If Analysis
    Sneha Gathani, Madelon Hulsebos, James Gale, and 2 more authors
    arXiv e-prints, Sep 2021

2020

  1. Sato: Contextual Semantic Type Detection in Tables
    Dan Zhang, Madelon Hulsebos, Yoshihiko Suhara, and 3 more authors
    Proc. VLDB Endow., Jul 2020

2019

  1. VizNet: Towards A Large-Scale Visualization Learning and Benchmarking Repository
    Kevin Hu, Snehalkumar ‘Neil’ S. Gaikwad, Madelon Hulsebos, and 7 more authors
    In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, Scotland, UK, Jul 2019
  2. Sherlock: A Deep Learning Approach to Semantic Data Type Detection
    Madelon Hulsebos, Kevin Hu, Michiel Bakker, and 5 more authors
    In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, Jul 2019