This page lists all my research publications.
Those marked by (*) have not been refereed. This list is also available as a
PDF file. Cette page est également disponible en
français.
For a friendlier presentation of my research, here is an
introduction.
The definitive version of my papers is always the one listed on this page,
which includes all reviewer feedback and all corrections and errata (if any).
All my publications are freely
available. Never buy my papers from
scientific publishers: you would be wasting your money, and you risk ending up with an
older or otherwise inferior version.
-
Antoine Amarilli,
Michael Benedikt.
Tighter Bounds for Query Answering with Guarded TGDs.
Under review. (*)
-
Antoine Amarilli.
Survey of Results on the ModPath and ModCycle Problems.
arXiv:2409.00770v1, 2024.
[code] (*)
-
Antoine Amarilli,
Florent Capelli.
Tractable Circuits in Database Theory.
SIGMOD Record Database Principles Column. (*)
-
Antoine Amarilli,
Marcelo Arenas,
YooJung Choi,
Mikaël Monet,
Guy Van den Broeck,
Benjie Wang.
A Circus of Circuits: Connections Between Decision Diagrams, Circuits, and Automata.
arXiv:2404.09674v1, 2024. (*)
-
Antoine Amarilli,
Mikaël Monet,
Dan Suciu.
The Non-Cancelling Intersections Conjecture.
arXiv:2401.16210v1, 2024. (*)
-
Antoine Amarilli,
Timothy van Bremen,
Kuldeep S. Meel.
Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability.
ICDT 2024.
[journal version, slides by Timothy van Bremen, minor errata]
-
Antoine Amarilli,
Pierre Bourhis,
Florent Capelli,
Mikaël Monet.
Ranked Enumeration for MSO on Trees via Knowledge Compilation.
ICDT 2024.
[slides by Pierre Bourhis]
-
Antoine Amarilli,
Benny Kimelfeld,
Sébastien Labbé,
Stefan Mengel.
Skyline Operators for Document Spanners.
ICDT 2024.
[slides]
-
Antoine Amarilli,
Charles Paperman.
Locality and Centrality: The Variety ZG.
LMCS.
-
Osnat Drien,
Matanya Freiman,
Antoine Amarilli,
Yael Amsterdamer.
Query-Driven Resolution in Uncertain Databases.
SIGMOD 2023.
-
Antoine Amarilli.
Degree-3 Planar Graphs as Topological Minors of Wall Graphs in Polynomial Time.
arXiv:2302.03461v3, 2023. (*)
-
Antoine Amarilli.
Query Evaluation: Enumeration, Maintenance, Reliability.
Habilitation thesis. (*)
-
Antoine Amarilli,
Mikaël Monet.
Enumerating Regular Languages with Bounded Delay.
STACS 2023.
[slides by Mikaël Monet, minor errata]
-
Antoine Amarilli.
Uniform Reliability for Unbounded Homomorphism-Closed Graph Queries.
ICDT 2023.
[slides, video]
-
Antoine Amarilli,
Benny Kimelfeld.
Uniform Reliability of Self-Join-Free Conjunctive Queries.
LMCS, 2022.
[conference version]
-
Antoine Amarilli,
Mikaël Monet.
Weighted Counting of Matchings in Unbounded-Treewidth Graph Families.
MFCS 2022.
[slides by Mikaël Monet, code]
-
Antoine Amarilli,
Yael Amsterdamer.
Worst-Case Analysis for Interactive Evaluation of Boolean Provenance.
TAPP 2022.
[slides by Yael Amsterdamer]
-
Antoine Amarilli,
Louis Jachiet,
Martín Muñoz,
Cristian Riveros.
Efficient Enumeration Algorithms for Annotated Grammars.
PODS 2022.
[slides by Martín Muñoz, video by Martín Muñoz on Youtube]
-
Antoine Amarilli,
Michael Benedikt.
When Can We Answer Queries Using Result-Bounded Data Interfaces?
LMCS, 2022.
[conference version]
-
Antoine Amarilli,
İsmail İlkan Ceylan.
The Dichotomy of Evaluating Homomorphism-Closed Queries on Probabilistic Graphs.
LMCS, 2022.
[conference version]
-
Antoine Amarilli,
Louis Jachiet,
Charles Paperman.
Dynamic Membership for Regular Languages.
ICALP 2021.
[slides, video on Youtube or in direct download]. Best paper award of ICALP'21 track B.
-
Osnat Drien,
Antoine Amarilli,
Yael Amsterdamer.
Managing Consent for Data Access in Shared Databases.
ICDE 2021. Short paper.
[errata]
-
Antoine Amarilli,
Benny Kimelfeld.
Uniform Reliability of Self-Join-Free Conjunctive Queries.
ICDT 2021.
[journal version, slides, video on PeerTube INFRES or in direct download]
-
Antoine Amarilli,
Pierre Bourhis,
Stefan Mengel,
Matthias Niewerth.
Constant-Delay Enumeration for Nondeterministic Document Spanners.
TODS, 2021.
[conference version, code by Rémi Dupré and Matthias Niewerth]
-
Julien Romero,
Nicoleta Preda,
Antoine Amarilli,
Fabian M. Suchanek.
Computing and Illustrating Query Rewritings on Path Views with Binding Patterns.
CIKM 2020. Demo paper.
[video by Julien Romero on ACM Digital Library or in direct download, code by Julien Romero]
-
Antoine Amarilli,
İsmail İlkan Ceylan.
A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs.
ICDT 2020.
[journal version, slides, poster, video on TIB AV-Portal or in direct download]. Best paper award of ICDT'20.
-
Julien Romero,
Nicoleta Preda,
Antoine Amarilli,
Fabian M. Suchanek.
Equivalent Rewritings on Path Views with Binding Patterns.
ESWC 2020.
[slides, video by Julien Romero on Videolectures.net or on Youtube or in direct download]
-
Antoine Amarilli,
Michael Benedikt.
Finite Open-World Query Answering with Number Restrictions.
ToCL, 2020.
[conference version]
-
Antoine Amarilli,
Florent Capelli,
Mikaël Monet,
Pierre Senellart.
Connecting Knowledge Compilation Classes and Width Parameters.
ToCS, 2020.
[conference version]
-
Andy Shih,
Guy Van den Broeck,
Paul Beame,
Antoine Amarilli.
Smoothing Structured Decomposable Circuits.
NeurIPS 2019. Spotlight presentation.
[slides by Andy Shih, poster by Andy Shih]
-
Antoine Amarilli,
M. Lamine Ba,
Daniel Deutch,
Pierre Senellart.
Computing Possible and Certain Answers over Order-Incomplete Data.
TCS, 2019.
[conference version]
-
Antoine Amarilli,
Pierre Bourhis,
Stefan Mengel,
Matthias Niewerth.
Enumeration on Trees with Tractable Combined Complexity and Efficient Updates.
PODS 2019.
[slides by Matthias Niewerth, poster by Matthias Niewerth, video by Matthias Niewerth on TIB AV-Portal or in direct download, errata]
-
Antoine Amarilli,
Pierre Bourhis,
Stefan Mengel,
Matthias Niewerth.
Constant-Delay Enumeration for Nondeterministic Document Spanners.
ICDT 2019.
[journal version, slides by Matthias Niewerth, code by Rémi Dupré and Matthias Niewerth]. Featured in ACM SIGMOD Research Highlights.
-
Antoine Amarilli,
Pierre Bourhis,
Mikaël Monet,
Pierre Senellart.
Evaluating Datalog via Tree Automata and Cycluits.
ToCS, 2019.
[conference version]
-
Antoine Amarilli,
Michael Benedikt,
Pierre Bourhis,
Michael Vanden Boom.
Query Answering with Transitive and Linear-Ordered Data.
JAIR, 2018.
[conference version, errata]
-
Antoine Amarilli,
Charles Paperman.
Topological Sorting under Regular Constraints.
ICALP 2018.
[slides]
-
Antoine Amarilli,
Michael Benedikt.
When Can We Answer Queries Using Result-Bounded Data Interfaces?
PODS 2018.
[journal version, slides, poster, video on Youtube or in direct download, errata]
-
Antoine Amarilli,
Mikaël Monet,
Pierre Senellart.
Connecting Width and Structure in Knowledge Compilation.
ICDT 2018.
[journal version, slides by Mikaël Monet]
-
Antoine Amarilli,
Pierre Bourhis,
Stefan Mengel.
Enumeration on Trees under Relabelings.
ICDT 2018.
[slides, poster]
-
Antoine Amarilli,
M. Lamine Ba,
Daniel Deutch,
Pierre Senellart.
Possible and Certain Answers for Queries over Order-Incomplete Data.
TIME 2017.
[journal version, slides, errata]
-
Antoine Amarilli,
Pierre Bourhis,
Louis Jachiet,
Stefan Mengel.
A Circuit-Based Approach to Efficient Enumeration.
ICALP 2017.
[slides]
-
Antoine Amarilli,
Mikaël Monet,
Pierre Senellart.
Conjunctive Queries on Probabilistic Graphs: Combined Complexity.
PODS 2017.
[slides by Mikaël Monet, poster by Mikaël Monet]
-
Antoine Amarilli,
Pierre Bourhis,
Mikaël Monet,
Pierre Senellart.
Combined Tractability of Query Evaluation via Tree Automata and Cycluits.
ICDT 2017.
[journal version, slides by Mikaël Monet, poster by Mikaël Monet]
-
Antoine Amarilli,
Yael Amsterdamer,
Tova Milo,
Pierre Senellart.
Top-k Queries on Unknown Values under Order Constraints.
ICDT 2017.
[slides by Yael Amsterdamer, poster by Yael Amsterdamer]
-
Luis Galárraga,
Simon Razniewski,
Antoine Amarilli,
Fabian M. Suchanek.
Predicting Completeness in Knowledge Bases.
WSDM 2017.
[slides by Luis Galárraga, poster by Luis Galárraga]
-
Antoine Amarilli,
Silviu Maniu,
Mikaël Monet.
Challenges for Efficient Query Evaluation on Structured Probabilistic Data.
SUM 2016.
[slides by Mikaël Monet]
-
Antoine Amarilli,
Marc Beunardeau,
Rémi Géraud,
David Naccache.
Failure is Also an Option.
The New Codebreakers: Essays Dedicated to David Kahn on the Occasion of His 85th Birthday. (*)
-
Antoine Amarilli,
Michael Benedikt,
Pierre Bourhis,
Michael Vanden Boom.
Query Answering with Transitive and Linear-Ordered Data.
IJCAI 2016.
[journal version, slides by Michael Vanden Boom, longer slides, poster by Michael Vanden Boom, errata]
-
Antoine Amarilli.
Leveraging the Structure of Uncertain Data.
PhD thesis.
[slides, errata] (*)
-
Antoine Amarilli,
Pierre Bourhis,
Pierre Senellart.
Tractable Lineages on Treelike Instances: Limits and Extensions.
PODS 2016.
[slides, poster, errata]
-
Ruiming Tang,
Antoine Amarilli,
Pierre Senellart,
Stéphane Bressan.
A Framework for Sampling-Based XML Data Pricing.
TLDKS, 2016.
[conference version]
-
Antoine Amarilli,
Mehryar Mohri,
Cyril Allauzen.
Minimum Bayesian Risk Methods for Automatic Speech Recognition.
US Patent 9123333.
[not an endorsement of software patents]
-
Antoine Amarilli.
Possibility in Probabilistic XML.
ISI, 2015.
[conference version, the publisher version is no longer available]
-
Antoine Amarilli,
Silviu Maniu,
Pierre Senellart.
Intensional Data on the Web.
ACM SIGWEB Newsletter, Summer 2015. (*)
-
Antoine Amarilli,
Michael Benedikt.
Combining Existential Rules and Description Logics.
IJCAI 2015.
[slides, longer slides, poster]
-
Aliaksandr Talaika,
Joanna Biega,
Antoine Amarilli,
Fabian M. Suchanek.
IBEX: Harvesting Entities from the Web Using Unique Identifiers.
WebDB 2015.
[slides]
-
Antoine Amarilli,
Michael Benedikt.
Finite Open-World Query Answering with Number Restrictions.
LICS 2015.
[journal version, slides]
-
Antoine Amarilli,
Pierre Bourhis,
Pierre Senellart.
Provenance Circuits for Trees and Treelike Instances.
ICALP 2015.
[slides]
-
Antoine Amarilli.
Structurally Tractable Uncertain Data.
SIGMOD/PODS Ph.D. Symposium 2015.
[slides]
-
Antoine Amarilli,
Luis Galárraga,
Nicoleta Preda,
Fabian M. Suchanek.
Recent Topics of Research around the YAGO Knowledge Base.
APWEB 2014. (*)
-
Ruiming Tang,
Antoine Amarilli,
Pierre Senellart,
Stéphane Bressan.
Get a Sample for a Discount: Sampling-Based XML Data Pricing.
DEXA 2014.
[journal version, slides by Ruiming Tang]
-
Antoine Amarilli,
Yael Amsterdamer,
Tova Milo.
Uncertainty in Crowd Data Sourcing under Structural Constraints.
UnCrowd 2014.
[slides]
-
Antoine Amarilli.
The Possibility Problem for Probabilistic XML.
AMW 2014.
[journal version, slides]
-
Antoine Amarilli,
Yael Amsterdamer,
Tova Milo.
On the Complexity of Mining Itemsets from the Crowd Using Taxonomies.
ICDT 2014.
[slides]
I try hard to avoid making mistakes in my work, but I'm sorry that they
sometimes seem to happen despite my best efforts. When a mistake is found in one
of my papers, I usually update it with a corrected version, and also update the
links on this page to point to the latest version. However, when the error is
major, there is always a risk that older versions of the paper will still
confuse or mislead readers, especially in cases where the publisher version (in
conference proceedings and journals) cannot be updated.
For this reason, in an attempt to minimize the impact of mistakes, here is a
list of the major errors that I know about. Here are my criteria for when an
error must be included in this list (some more minor errors may also be covered):
- The error concerns the technical claims of the paper (not the surrounding
discussion or other material).
- The error is major, e.g., a result is not true, or a proof is not correct, and
cannot easily be fixed by an expert reader and requires substantial new
arguments.
- The error occurs in a version of the work that had been published at a
peer-reviewed conference or journal. In other words, I may not cover errors which we were
able to correct before the paper is accepted for publication, even when
preliminary drafts of the work had been made public (e.g., on arXiv). If
these happen, I may only update the drafts without listing the error here.
Here is the list of errata:
- Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability:
- Enumerating Regular Languages with Bounded Delay (conference proceedings version, arXiv version 3 and earlier).
On Figure 1e page 8, state 0 of automaton A5 should be non-final instead of
final. The rest of the discussion in Example 3.4 is unchanged.
- Tractable Lineages on Treelike Instances (conference proceedings version (closed access), arXiv version 1).
The result stated as Theorem 5.5 in this work is proved in my PhD thesis, but the proof given there is incorrect. See this erratum for details about this issue. A corrected proof is given in the next version of the article (arXiv version 2).
- Leveraging the Structure of Uncertain Data (on Thèses en ligne).
I have found an error in the proof of Theorem 6.2.5 of my PhD thesis, on page numbered 108 of the manuscript. The result in question also appears (without proof) as Theorem 5.5 of Tractable Lineages on Treelike Instances: Limits and Extensions. The error is that the definition of the advice function in this proof does not only depend on the size of the instance but also on the instance itself. A corrected proof is given in the appendix of arXiv version 2 of Tractable Lineages on Treelike Instances: Limits and Extensions
- Query Answering with Transitive and Linear-Ordered Data
(arXiv version 1, conference
proceedings version).
An error was reported to us by Andreas
Pieris and Andrew Bailey in the
proof of Theorem 9 (Appendix H of the arXiv version). The same error exists in
the journal version of the work: please refer to this
erratum where we discuss the error in more detail
and explain how the error was fixed.
In the conference version, the error is in the step "From UCQ to CQ" in
appendix H of the arXiv version, which increases the arity.
- Possible and Certain Answers for Queries over Order-Incomplete Data (arXiv version 1, conference proceedings
version).
I have found a issue in the proof of Theorem 22 (both versions), which also affects Theorem 19
and Theorem 30 (both versions). We no longer claim that these results hold.
The specific error is in the proof of Proposition 66 (in arXiv version 1) for
the case of the product operator. The latest version of the article (arXiv
version 2) removes the affected results
and discusses these issues in Appendix G. The journal version of the
article (arXiv) contains neither the
affected results nor the discussion in this appendix.
- Query Answering with Transitive and Linear-Ordered Data
(journal version).
An error was reported to us by Andreas
Pieris and Andrew Bailey in the
proof of Theorem 6.2. The same error exists in the conference version of the
work: see this erratum.
The given proof works
if the query Q is allowed be a UCQ, or if the signature is not required to
be arity-two, but not both. The error is that the last step of the proof,
given in Appendix E.5.1, used to cover the case of a CQ rather than a UCQ,
increases the signature arity (with Lemma E.6), which we did not realize. The
arXiv version of this work, posted in
2022, gives a corrected proof of the claimed result.
- When Can We Answer Queries Using Result-Bounded Data Interfaces?
(arXiv version 1,
arXiv version 2,
conference proceedings version (closed
access)).
- An error was reported by reviewers of the upcoming journal version of this
work concerning the case of finite answerability. The error is in Proposition
2.2, which does not apply to schemas with result bounds. We do not know if
Proposition 2.2 is correct in the setting with result bounds studied in the
paper, or how it could be proved. This error affects all results concerning
finite answerability, so the general claims at the beginning of Sections 5 and
7 that all results also extend to that case, and the specific Corollary 7.3.
We do not know if our specific results in the finite case are also true (no
matter the status of Proposition 2.2). We also do not have knowledge about any
reasons why they would false. All results related to finite answerability
claimed in the paper should instead be considered as open. This is clarified
in the journal version: see
this erratum.
- There are several imprecisions in the proof of Theorem G.3 in appendix G.2.3,
which is used to prove the EXPTIME bound of Theorem 7.2 in the main text of
the paper. A revised and corrected proof is given in this
work and should be read instead of Appendix G.2.3.
- Enumeration on Trees with Tractable Combined Complexity and Efficient Updates (arXiv versions 1 and 2,
conference proceedings version (closed access).
An issue was found in the LICS'18 paper MSO Queries on Trees: Enumerating
Answers under Updates Using Forest
Algebras,
on which some of our own results crucially depend. It is currently
open whether the results of the LICS'18 paper can indeed be proved as
stated. In our own paper, this affects Lemma 7.4 (for the part
asserting that the encoding is linear-time computable, and for the
part asserting that updates can be efficiently performed), and thus
affects Theorem 8.1, Corollary 8.2, Corollary 8.3, Corollary 8.4, and
Theorem 8.5 (in what concerns the preprocessing time and the update
time). All these results from our paper are thus only proven
conditionally to the fact that one can indeed prove the results
claimed in Section 3 of the LICS'18 paper. This is currently under
investigation by the author of the paper (as of February 2022), and we will update the arXiv version of
our paper depending on how this develops.
- Managing Consent for Data Access in
Shared Databases (conference proceedings
version, full
version).
I have found an issue in Proposition III.4 (conference
proceedings version), also present in the corresponding results of the
full version. The complexity and approximation ratios claimed in the
proposition are incorrect. The correct complexity should be the following:
O(|¯D||Q|×|Q|2+|Q(¯D)|×|Q|2|Q|+1×|¯D|2).
Specifically, the first term is the complexity of computing the result of
the query along with its provenance as a DNF formula (the number of possible
assignments is the number of facts of the database to the power of
the number of query atoms, and the time to check an assignment is in
time proportional to the query size), and then minimizing the DNF formula of
each tuple in time proportional to the maximal number of tuples
(|¯D||Q|) times the maximal size of a provenance formula (|Q|2).
The second term is the complexity of computing the
provenance as a CNF formula (expanding to a conjunction of at most
|Q||Q| clauses of size |Q| and then minimizing them by considering
each pair of clauses and testing subsumption), then multiplied by the
complexity of performing the Q-value algorithm (which is in the number of
variables squared). The correct approximation ratio should be log(|Q||Q|+1×|Q(¯D)|). The first factor in the logarithm is
because, for each tuple, the number of terms in the DNF is |Q| and the
number of clauses in the CNF is |Q||Q|. The second factor in the logarithm is the number of tuples.
Please note that, while these are the only errors currently known in the
conference proceedings version, the full version of the paper is still a
"work in progress" and contains (as of November 2021) some other errors not
affecting the conference proceedings version.