Recent years have seen a sharp increase in the availability of micro data at the firm and firm-to-firm level in international trade. Data platform providers often use proprietary algorithms that match reported firm names to assign identifiers to companies engaged in trade. We show that identifiers in one such platform suffer from substantial mismeasurement—for example, the same firm may be assigned different identifiers across transactions. We propose an algorithm to clean firm names and generate more accurate firm identifiers. Using these, we compare key exporter and importer indicators, as well as firm-to-firm trade indicators, with those based on the platform’s identifiers. The resulting biases are stark: using platform IDs shrinks the measured population of exporters and importers, inflates their average size, and overstates the concentration of trade among a few firms. It also artificially inflates firm entry into and exit from international markets. These distortions extend to firm-to-firm trade networks, which appear spuriously denser, less concentrated among top sellers or buyers, and far more volatile. Our findings caution against the growing reliance on readily available proprietary firm identifiers in studies of firms’ trade responses to global shocks, particularly through changes in their buyer–supplier networks.
| Repository name | URI |
|---|---|
| Reproducible Research Repository (World Bank) | https://reproducibility.worldbank.org |
Paper exhibits were reproduced on a computer with the following specifications:
• OS: Windows 11 Enterprise
• Processor: Intel(R) Core(TM) i5-1145G7 CPU @ 2.60GHz
• Memory available: 15.7 GB
Runtime: 10 hours.
To reproduce the findings in this paper, a new user should:
• Open the R project (.Rproj).
• Run renv::restore() or manually install the required packages.
• Run !master.R.
Because some of the data is restricted and not included in the reproducibility package, the results produced by the replicators are provided in the Outputs folder. Interested users can compare these results against those included in the reproducibility package.
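The setup and the output comparison described above can be sketched in R as follows. This is a minimal illustration, not part of the package: the helper names `restore_and_run` and `compare_outputs` are hypothetical, the location of `!master.R` in the project root is an assumption, and the folder arguments are placeholders for wherever the two sets of results are stored. Because some inputs are restricted, a full run may not complete without them.

```r
# Hypothetical helper: restore the package library and run the master script.
# Assumption: "!master.R" sits in the project root (not confirmed by the source).
restore_and_run <- function(project_dir = ".") {
  if (!requireNamespace("renv", quietly = TRUE)) install.packages("renv")
  # Reinstall the package versions recorded in the project's lockfile.
  renv::restore(project = project_dir, prompt = FALSE)
  old_wd <- setwd(project_dir)
  on.exit(setwd(old_wd), add = TRUE)
  source("!master.R")  # the documented runtime is about 10 hours
}

# Hypothetical helper: compare files that exist under the same relative path
# in two folders, using MD5 checksums (base R, tools::md5sum).
compare_outputs <- function(reference_dir, new_dir) {
  ref_files <- list.files(reference_dir, recursive = TRUE)
  new_files <- list.files(new_dir, recursive = TRUE)
  common <- intersect(ref_files, new_files)
  ref_md5 <- unname(tools::md5sum(file.path(reference_dir, common)))
  new_md5 <- unname(tools::md5sum(file.path(new_dir, common)))
  data.frame(
    file = common,
    identical = ref_md5 == new_md5,
    stringsAsFactors = FALSE
  )
}
```

A call such as `compare_outputs("Outputs", "path/to/your/results")` would return one row per shared file, where the second path is a placeholder. A checksum mismatch does not by itself indicate a substantive difference, since figures and some table formats can embed timestamps or other run-specific metadata; such files are better compared by inspection.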
The reproducibility package begins from intermediate datasets directly provided by the authors.
The full process used to generate these intermediate datasets from the raw data is not included in the replication workflow and was not executed by the replicators.
For transparency purposes, the authors have included sample code demonstrating how to construct the intermediate dataset for one country (India). This sample code is available in the folder: DataCreation/. The sample code is provided for documentation and reference only and is not required to run the reproducibility package.
For additional information about the full data creation process, interested users may contact the corresponding author at: afernandes@worldbank.org
Some data is restricted and has not been included in the reproducibility package. For more details, please refer to the README file.
| Author | Affiliation | Email |
|---|---|---|
| Ana Fernandes | World Bank | afernandes@worldbank.org |
| Devaki Ghose | World Bank | dghose@worldbank.org |
| Alejandro Forero | World Bank | aforero@worldbank.org |
| Piyush Panigrahi | International Finance Corporation | ppanigrahi@ifc.org |
2025-11-11
| Location | Code |
|---|---|
| World | WLD |
The materials in the reproducibility packages are distributed as they were prepared by the staff of the International Bank for Reconstruction and Development/The World Bank. The findings, interpretations, and conclusions expressed in this event do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank, or the governments they represent. The World Bank does not guarantee the accuracy of the materials included in the reproducibility package.
| Name | URI |
|---|---|
| Modified BSD3 | https://opensource.org/license/bsd-3-clause/ |
| Name | Affiliation | Email |
|---|---|---|
| Ana Fernandes | World Bank | afernandes@worldbank.org |
| Reproducibility WBG | World Bank | reproducibility@worldbank.org |
| Name | Abbreviation | Affiliation | Role |
|---|---|---|---|
| Reproducibility WBG | DECDI | World Bank - Development Impact Department | Verification and preparation of metadata |
2025-11-11
1