Machine Extraction of Tax Laws from Legislative Texts

Ash, Elliott; Guillot, Malka; Han, Luyang

doi:10.18653/v1/2021.nllp-1.7

Paper published in a book (Scientific congresses and symposiums)

2021 • In Androutsopoulos, Ion; Aletras, Nikolaos; Barrett, Leslie et al. (Eds.) Proceedings of the Natural Legal Language Processing Workshop 2021

Peer reviewed

Permalink
https://hdl.handle.net/2268/264509

DOI
10.18653/v1/2021.nllp-1.7

Files

Full Text

Machine_Extraction_of_Tax_Laws_from_Legislative_Texts.pdf

Author preprint (255.78 kB)

Details

Abstract :

Using a corpus of compiled codes from U.S. states containing labeled tax law sections, we train text classifiers to automatically tag tax-law documents and, further, to identify the associated revenue source (e.g., income, property, or sales). After evaluating classifier performance on held-out test data, we apply the classifiers to a historical corpus of U.S. state legislation to extract the flow of relevant laws over the years 1910 through 2010. We document that the classifiers are effective in the historical corpus, for example by automatically detecting establishments of state personal income taxes. The trained models with replication code are published at https://github.com/luyang521/tax-classification.
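
The two-step tagging described in the abstract (first flag a section as tax law, then assign a revenue source) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny Naive Bayes model stands in for whatever classifiers the authors actually trained, the example sections are invented and not drawn from the paper's corpus, and the names `NaiveBayes`, `train`, and `tag` are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (add-one smoothing)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy sections (assumption): text plus revenue source, None = not tax law.
SECTIONS = [
    ("a tax is imposed on the net income of every resident", "income"),
    ("taxable income of individuals shall be computed annually", "income"),
    ("real property shall be assessed at its fair market value", "property"),
    ("the assessor shall value all taxable property in the county", "property"),
    ("a tax is levied on retail sales of tangible goods", "sales"),
    ("every retailer shall collect the sales tax from the purchaser", "sales"),
    ("the board of education shall appoint a superintendent", None),
    ("no person shall operate a motor vehicle without a license", None),
    ("the court may appoint a guardian for a minor", None),
]


def train(sections):
    """Fit stage 1 (tax vs. other) on all sections, stage 2 (source) on tax sections only."""
    texts = [text for text, _ in sections]
    is_tax = ["tax" if source is not None else "other" for _, source in sections]
    tax_model = NaiveBayes().fit(texts, is_tax)
    tax_rows = [(text, source) for text, source in sections if source is not None]
    source_model = NaiveBayes().fit(
        [text for text, _ in tax_rows], [source for _, source in tax_rows]
    )
    return tax_model, source_model


def tag(text, tax_model, source_model):
    """Return the predicted revenue source, or None if the text is not tagged as tax law."""
    if tax_model.predict(text) != "tax":
        return None
    return source_model.predict(text)


tax_model, source_model = train(SECTIONS)
print(tag("a tax is imposed on the income of residents", tax_model, source_model))
print(tag("the court shall appoint a guardian", tax_model, source_model))
```

Training the second stage only on sections already labeled as tax law mirrors the "tag, then identify the revenue source" order in the abstract; a single multi-class model with a "not tax" class would be an equally plausible design.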

Disciplines :

Computer science

Author, co-author :

Ash, Elliott

Guillot, Malka ; Université de Liège - ULiège > HEC Liège : UER > Microéconomie appliquée

Han, Luyang

Language :

English

Title :

Machine Extraction of Tax Laws from Legislative Texts

Publication date :

November 2021

Event name :

Natural Legal Language Processing Workshop 2021

Event date :

10 November 2021

Audience :

International

Main work title :

Proceedings of the Natural Legal Language Processing Workshop 2021

Editor :

Androutsopoulos, Ion

Aletras, Nikolaos

Barrett, Leslie

Goanta, Catalina

Preotiuc-Pietro, Daniel

Publisher :

Association for Computational Linguistics, Punta Cana, Dominican Republic

Peer reviewed :

Peer reviewed

Available on ORBi :

since 26 October 2021
