Publications

Efficiently Digitizing World Knowledge

OCR open-source software

Tom Bryan, Jacob Carlson, Abhishek Arora, and Melissa Dell. “EfficientOCR: An Extensible, Open-Source Package for Efficiently Digitizing World Knowledge.” Empirical Methods in Natural Language Processing (Systems Demonstrations Track), forthcoming. Paper

Quantifying Character Similarity with Vision Transformers

vision transformers record linkage contrastive learning

Xinmei Yang, Abhishek Arora, Shao-Yu Jheng, and Melissa Dell. “Quantifying Character Similarity with Vision Transformers.” Empirical Methods in Natural Language Processing, forthcoming. Paper

Headlines

semantic similarity contrastive learning

Emily Silcock and Melissa Dell. “A Massive Scale Semantic Similarity Dataset of Historical English.” NeurIPS (Datasets and Benchmarks Track), forthcoming. Paper; Dataset

American Stories

historical newspapers layout detection OCR

Melissa Dell, Jacob Carlson, Tom Bryan, Emily Silcock, Abhishek Arora, Zejiang Shen, Luca D’Amico-Wong, Quan Le, Pablo Querubin, and Leander Heldring. “American Stories: A Large-Scale Structured Text Dataset of Historical U.S. Newspapers.” NeurIPS (Datasets and Benchmarks Track), forthcoming. Paper; Dataset; Github

NEWS-COPY

text analysis contrastive learning massive corpora

Silcock, Emily, Luca D’Amico-Wong, Jinglin Yang, and Melissa Dell. “Noise-Robust De-Duplication at Scale.” International Conference on Learning Representations, vol. 332 (2023). Paper

OLALA

document image analysis annotation active learning

Shen, Zejiang, Jian Zhao, Yaoliang Yu, Weining Li, and Melissa Dell. “OLALA: Object-Level Active Learning Based Layout Annotation.” EMNLP Computational Social Science Workshop (2023). Paper; Code

Layout Parser

object detection active learning OCR document image analysis

Shen, Zejiang, Ruochen Zhang, Melissa Dell, Benjamin Lee, Jacob Carlson, and Weining Li. “LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis.” International Conference on Document Analysis and Recognition (2021), 131-146. Article; Github; Website

HJDataset

object detection document image analysis

Shen, Zejiang, Kaixuan Zhang, and Melissa Dell. “A Large Dataset of Historical Japanese Documents with Complex Layouts.” IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2020): 548-559. Article; Website

The Dutch Cultivation System

long-run development colonialism economic organization

Dell, Melissa, and Benjamin Olken. “The Development Effects of the Extractive Colonial Economy: The Dutch Cultivation System in Java.” Review of Economic Studies 87, no. 1 (2020): 164-203. Paper; Appendix; Data Files

Information Extraction with Complex Layouts

layout analysis text extraction historical documents

Zhang, Kaixuan, Zejiang Shen, Jie Zhou, and Melissa Dell. “Information Extraction from Text Regions with Complex Tabular Structure.” Conference on Neural Information Processing Systems Document Intelligence Workshop (2019). Paper

Trade-Induced Worker Displacement

violent crime drug trafficking unemployment trade

Dell, Melissa, Benjamin Feigenberg, and Kensuke Teshima. “The Violent Consequences of Trade-Induced Worker Displacement in Mexico.” American Economic Review: Insights 1, no. 1 (2019): 43-58. Paper; Appendix; Data files

Nation Building through Foreign Intervention

military strategies foreign intervention nation building

Dell, Melissa, and Pablo Querubin. “Nation Building Through Foreign Intervention: Evidence from Discontinuities in Military Strategies.” Quarterly Journal of Economics 133, no. 2 (2018): 701-764. Paper; Appendix; Data files

The Historical State in Vietnam

long-run economic development historical states state capacity

Dell, Melissa, Nathan Lane, and Pablo Querubin. “The Historical State, Local Collective Action, and Economic Development in Vietnam.” Econometrica 86, no. 6 (2018): 2083-2121. Paper; Published Appendix; Online Only Appendix; Data files

Trafficking Networks and the Mexican Drug War

violence policing networks

Dell, Melissa. “Trafficking Networks and the Mexican Drug War.” American Economic Review 105, no. 6 (2015): 1738-1779. Paper; Appendix; Data files

What Do We Learn from the Weather?

growth climate weather

Dell, Melissa, Benjamin Jones, and Benjamin Olken. “What Do We Learn from the Weather? The New Climate-Economy Literature.” Journal of Economic Literature (2014). Paper; Appendix

Temperature Shocks and Economic Growth

temperature growth

Dell, Melissa, Benjamin Jones, and Benjamin Olken. “Temperature Shocks and Economic Growth: Evidence from the Last Half Century.” American Economic Journal: Macroeconomics 4, no. 3 (2012): 66-95. Paper; Appendix; Data files

Productivity Differences

productivity human capital subnational income differences

Dell, Melissa, and Daron Acemoglu. “Productivity Differences Between and Within Countries.” American Economic Journal: Macroeconomics 2, no. 1 (2010): 169–188. Paper; Appendix; Data files

Peru's Mining Mita

extractive institutions land tenure public goods

Dell, Melissa. “The Persistent Effects of Peru’s Mining Mita.” Econometrica 78, no. 6 (2010): 1863-1903. Article; Appendix; Spanish translation; Data files 1; Data files 2; Data files 3

Temperature and Income

growth temperature Latin America

Dell, Melissa, Benjamin Jones, and Benjamin Olken. “Temperature and Income: Reconciling New Cross-Sectional and Panel Estimates.” American Economic Review Papers and Proceedings 99, no. 2 (2009): 198-204. Article; Appendix

Working Papers

Linking Representations with Multimodal Contrastive Learning

multimodal learning contrastive learning record linkage

Arora, Abhishek, Xinmei Yang, Shao-Yu Jheng, and Melissa Dell. “Linking Representations with Multimodal Contrastive Learning,” Paper

LinkTransformer

transformers record linkage data wrangling

Abhishek Arora and Melissa Dell. “LinkTransformer: A Unified Package for Record Linkage with Transformer Language Models.” Webpage, Paper, Github

Efficient OCR for Building a Diverse Digital History

ocr contrastive learning image retrieval

Carlson, Jacob, Tom Bryan, and Melissa Dell. “Efficient OCR for Building a Diverse Digital History.” Paper

Path Dependence in Development

path dependence long-run development agrarian reform

Dell, Melissa. “Path Dependence in Development: Evidence from the Mexican Revolution.” Paper; Appendix