Publications:

Baste, Ø., Cyndecka, M. A., Esayas, S., Langford, M., Lison, P., & Weitzenboeck, E. (2025). Open Justice Data in Europe: A Patchwork. Available at SSRN 5207840.

Charpentier, L. G. G., & Lison, P. (2025). Re-identification of De-identified Documents with Autoregressive Infilling. In W. Che, J. Nabende, E. Shutova, & M. T. Pilehvar (Eds.), Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1192–1209). Association for Computational Linguistics. https://doi.org/10.18653/v1/2025.acl-long.60

Pilán, I., Manzanares-Salor, B., Sánchez, D., & Lison, P. (2024). Truthful Text Sanitization Guided by Inference Attacks. https://arxiv.org/abs/2412.12928

Manzanares-Salor, B., Sánchez, D., & Lison, P. (2024). Evaluating the disclosure risk of anonymized documents via a machine learning-based re-identification attack. Data Mining and Knowledge Discovery, 1–36.

Papadopoulou, A., Lison, P., Anderson, M., Øvrelid, L., & Pilán, I. (2023). Neural Text Sanitization with Privacy Risk Indicators: An Empirical Analysis. arXiv preprint arXiv:2310.14312.

Olstad, A. W., Papadopoulou, A., & Lison, P. (2023). Generation of Replacement Options in Text Sanitization. In T. Alumäe & M. Fishel (Eds.), Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa) (pp. 292–300). University of Tartu Library.

Papadopoulou, A., Yu, Y., Lison, P., & Øvrelid, L. (2022). Neural Text Sanitization with Explicit Measures of Privacy Risk. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).

Manzanares-Salor, B., Sánchez, D., & Lison, P. (2022). Automatic Evaluation of Disclosure Risks of Text Anonymization Methods. In Privacy in Statistical Databases (PSD 2022). Paris, France.

Weitzenboeck, E., Lison, P., Cyndecka, M., & Langford, M. (2022). GDPR and unstructured data: is anonymization possible? International Data Privacy Law, 12(3).

Pilán, I., Lison, P., Øvrelid, L., Papadopoulou, A., Sánchez, D., & Batet, M. (2022). The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization. Computational Linguistics, 48(4), 1053–1101.

Papadopoulou, A., Lison, P., Øvrelid, L., & Pilán, I. (2022). Bootstrapping Text Anonymization Models with Distant Supervision. In Proceedings of the Language Resources and Evaluation Conference. ELRA, Marseille, France.

Lison, P., Pilán, I., Sánchez, D., Batet, M., & Øvrelid, L. (2021). Anonymisation Models for Text Data: State of the Art, Challenges and Future Directions. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (pp. 4188–4203). Online. Association for Computational Linguistics.

Lison, P., Barnes, J., & Hubin, A. (2021). skweak: Weak supervision made easy for NLP. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations (pp. 337–346). Online. Association for Computational Linguistics.