This page lists all my research publications. Those marked by (*) have not been refereed. This list is also available as a PDF file. Cette page est également disponible en français.

For a friendlier presentation of my research, here is an introduction.

The definitive version of my papers is *always* the one listed on this page,
which includes all reviewer feedback and all corrections and errata (if any).
All my publications are freely
available. Never buy my papers from
scientific publishers: you would be wasting your money, and you risk ending up with an
older or otherwise inferior version.

I try hard to avoid making mistakes in my work, but I'm sorry that they sometimes seem to happen despite my best efforts. When a mistake is found in one of my papers, I usually update it with a corrected version, and also update the links on this page to point to the latest version. However, when the error is major, there is always a risk that older versions of the paper will still confuse or mislead readers, especially in cases where the publisher version (in conference proceedings and journals) cannot be updated.

For this reason, in an attempt to minimize the impact of mistakes, here is a list of the major errors that I know about. Here are my criteria for when an error must be included in this list (some more minor errors may also be covered):

- The error concerns the technical claims of the paper (not the surrounding discussion or other material).
- The error is major, e.g., a result is not true, or a proof is not correct, and cannot easily be fixed by an expert reader and requires substantial new arguments.
- The error occurs in a version of the work that had been published at a peer-reviewed conference or journal. In other words, I may not cover errors which we were able to correct before the paper is accepted for publication, even when preliminary drafts of the work had been made public (e.g., on arXiv). If these happen, I may only update the drafts without listing the error here.

Here is the list of errata:

*Leveraging the Structure of Uncertain Data*(on Thèses en ligne). I have found an error in the proof of Theorem 6.2.5 of my PhD thesis, on page numbered 108 of the manuscript. The result in question also appears (without proof) as Theorem 5.5 of*Tractable Lineages on Treelike Instances: Limits and Extensions*. The error is that the definition of the advice function in this proof does not only depend on the size of the instance but also on the instance itself. We believe that the result holds as stated and could be fixed by changing the advice to give an embedding of a sufficiently large wall graph, in which the input graph can be embedded in deterministic polynomial time. We are investigating this towards posting a corrected proof of that result (as of May 2022).*Query Answering with Transitive and Linear-Ordered Data*(arXiv version 1, conference proceedings version). An error was reported to us by Andreas Pieris and Andrew Bailey in the proof of Theorem 9 (Appendix H of the arXiv version). The same error exists in the journal version of the work: please refer to this erratum where we discuss the error in more detail and explain how the error was fixed. In the conference version, the error is in the step "From UCQ to CQ" in appendix H of the arXiv version, which increases the arity.*Possible and Certain Answers for Queries over Order-Incomplete Data*(arXiv version 1, conference proceedings version). I have found a issue in the proof of Theorem 22 (both versions), which also affects Theorem 19 and Theorem 30 (both versions). We no longer claim that these results hold. The specific error is in the proof of Proposition 66 (in arXiv version 1) for the case of the product operator. The latest version of the article (arXiv version 2) removes the affected results and discusses these issues in Appendix G. The journal version of the article (arXiv) contains neither the affected results nor the discussion in this appendix.*Query Answering with Transitive and Linear-Ordered Data*(journal version). An error was reported to us by Andreas Pieris and Andrew Bailey in the proof of Theorem 6.2. The same error exists in the conference version of the work: see this erratum. The given proof works if the query*Q*is allowed be a UCQ, or if the signature is not required to be arity-two, but not both. The error is that the last step of the proof, given in Appendix E.5.1, used to cover the case of a CQ rather than a UCQ, increases the signature arity (with Lemma E.6), which we did not realize. The arXiv version of this work, posted in 2022, gives a corrected proof of the claimed result.*When Can We Answer Queries Using Result-Bounded Data Interfaces?*(arXiv version 1, arXiv version 2, conference proceedings version (closed access)).- An error was reported by reviewers of the upcoming journal version of this work concerning the case of finite answerability. The error is in Proposition 2.2, which does not apply to schemas with result bounds. We do not know if Proposition 2.2 is correct in the setting with result bounds studied in the paper, or how it could be proved. This error affects all results concerning finite answerability, so the general claims at the beginning of Sections 5 and 7 that all results also extend to that case, and the specific Corollary 7.3. We do not know if our specific results in the finite case are also true (no matter the status of Proposition 2.2). We also do not have knowledge about any reasons why they would false. All results related to finite answerability claimed in the paper should instead be considered as open. This is clarified in the journal version: see this erratum.
- There are several imprecisions in the proof of Theorem G.3 in appendix G.2.3, which is used to prove the EXPTIME bound of Theorem 7.2 in the main text of the paper. A revised and corrected proof is given in this work and should be read instead of Appendix G.2.3.

*When Can We Answer Queries Using Result-Bounded Data Interfaces?*(arXiv version 1). An error was reported during the review process, concerning the case of finite answerability. This error already existed in the conference version of this work: see this erratum. The affected results were removed in arXiv version 2 and subsequent versions.*Enumeration on Trees with Tractable Combined Complexity and Efficient Updates*(arXiv versions 1 and 2, conference proceedings version (closed access). An issue was found in the LICS'18 paper*MSO Queries on Trees: Enumerating Answers under Updates Using Forest Algebras*, on which some of our own results crucially depend. It is currently open whether the results of the LICS'18 paper can indeed be proved as stated. In our own paper, this affects Lemma 7.4 (for the part asserting that the encoding is linear-time computable, and for the part asserting that updates can be efficiently performed), and thus affects Theorem 8.1, Corollary 8.2, Corollary 8.3, Corollary 8.4, and Theorem 8.5 (in what concerns the preprocessing time and the update time). All these results from our paper are thus only proven conditionally to the fact that one can indeed prove the results claimed in Section 3 of the LICS'18 paper. This is currently under investigation by the author of the paper (as of February 2022), and we will update the arXiv version of our paper depending on how this develops.*Managing Consent for Data Access in Shared Databases*(conference proceedings version, full version). I have found an issue in Proposition III.4 (conference proceedings version), also present in the corresponding results of the full version. The complexity and approximation ratios claimed in the proposition are incorrect. The correct complexity should be the following: O(|¯D||Q|×|Q|2+|Q(¯D)|×|Q|2|Q|+1×|¯D|2). Specifically, the first term is the complexity of computing the result of the query along with its provenance as a DNF formula (the number of possible assignments is the number of facts of the database to the power of the number of query atoms, and the time to check an assignment is in time proportional to the query size), and then minimizing the DNF formula of each tuple in time proportional to the maximal number of tuples (|¯D||Q|) times the maximal size of a provenance formula (|Q|2). The second term is the complexity of computing the provenance as a CNF formula (expanding to a conjunction of at most |Q||Q| clauses of size |Q| and then minimizing them by considering each pair of clauses and testing subsumption), then multiplied by the complexity of performing the Q-value algorithm (which is in the number of variables squared). The correct approximation ratio should be log(|Q||Q|+1×|Q(¯D)|). The first factor in the logarithm is because, for each tuple, the number of terms in the DNF is |Q| and the number of clauses in the CNF is |Q||Q|. The second factor in the logarithm is the number of tuples. Please note that, while these are the only errors currently known in the conference proceedings version, the full version of the paper is still a "work in progress" and contains (as of November 2021) some other errors not affecting the conference proceedings version.