{"type":"script","doc_desc":{"producers":[{"name":"Reproducibility WBG","abbr":"DECDI","affiliation":"World Bank - Development Impact Department","role":"Verification and preparation of metadata"}],"prod_date":"2025-11-11","version":"1"},"project_desc":{"authoring_entity":[{"name":"Ana Fernandes","affiliation":"World Bank","email":"afernandes@worldbank.org"},{"name":"Devaki Ghose","affiliation":"World Bank","email":"dghose@worldbank.org"},{"name":"Alejandro Forero","affiliation":"World Bank","email":"aforero@worldbank.org"},{"name":"Piyush Panigrahi","affiliation":"International Finance Corporation","email":"ppanigrahi@ifc.org"}],"title_statement":{"title":"Reproducibility package for What's In A Name? Implications Of Extensive Margin Measurement In International Trade","idno":"RR_WLD_2025_452"},"data_statement":"Some data is restricted and has not been included in the reproducibility package. For more details, please refer to the README file. ","software":[{"name":"R","version":"4.4.1"}],"scripts":[{"title":"Reproducibility package for What's In A Name? Implications Of Extensive Margin Measurement In International Trade","date":"2025-11","notes":"Computational reproducibility verified by Development Impact (DECDI) Analytics team, World Bank.","instructions":"See README in reproducibility package.","file_name":"RR_WLD_2025_452","zip_package":"RR_WLD_2025_452.zip","dependencies":"R dependencies are listed in the file renv.lock."}],"repository_uri":[{"name":"Reproducible Research Repository (World Bank)","uri":"https:\/\/reproducibility.worldbank.org"}],"production_date":"2025-11-11","abstract":"Recent years have seen a sharp increase in the availability of micro data at the firm and firm-to-firm level in international trade. Data platform providers often use proprietary algorithms that match reported firm names to assign identifiers to companies engaged in trade. 
We show that identifiers in one such platform suffer from substantial mismeasurement\u2014for example, the same firm may be assigned different identifiers across transactions. We propose an algorithm to clean firm names and generate more accurate firm identifiers. Using these, we compare key exporter and importer indicators, as well as firm-to-firm trade indicators, with those based on the platform\u2019s identifiers. The resulting biases are stark: using platform IDs shrinks the measured population of exporters and importers, inflates their average size, and overstates the concentration of trade among a few firms. It also artificially inflates firm entry into and exit from international markets. These distortions extend to firm-to-firm trade networks, which appear spuriously denser, less concentrated among top sellers or buyers, and far more volatile. Our findings caution against the growing reliance on readily available proprietary firm identifiers in studies of firms\u2019 trade responses to global shocks, particularly through changes in their buyer\u2013supplier networks.","geographic_units":[{"name":"World","code":"WLD"}],"keywords":[{"name":"Firm-Level Trade"},{"name":"Exporter Dynamics"},{"name":"Importer Dynamics"},{"name":"Extensive Margin"},{"name":"Intensive Margin"},{"name":"Firm-To-Firm Trade"},{"name":"Data Mismeasurement"},{"name":"Name Cleaning Or Entity Resolution Algorithm"}],"topics":[{"id":"F10","uri":"https:\/\/www.aeaweb.org\/econlit\/jelCodes.php?view=jel","vocabulary":"Journal of Economic Literature (JEL)","name":"General","parent_id":"F1"},{"id":"F14","uri":"https:\/\/www.aeaweb.org\/econlit\/jelCodes.php?view=jel","vocabulary":"Journal of Economic Literature (JEL)","name":"Empirical Studies of Trade","parent_id":"F1"}],"output":[{"type":"Working Paper","description":"Policy Research Working Papers (PRWP)","title":"What's In A Name? 
Implications Of Extensive Margin Measurement In International Trade"}],"language":[{"name":"English","code":"EN"}],"technology_requirements":"Runtime: 10 hours. ","disclaimer":"The materials in the reproducibility packages are distributed as they were prepared by the staff of the International Bank for Reconstruction and Development\/The World Bank. The findings, interpretations, and conclusions expressed in this event do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank, or the governments they represent. The World Bank does not guarantee the accuracy of the materials included in the reproducibility package.","license":[{"name":"Modified BSD3","uri":"https:\/\/opensource.org\/license\/bsd-3-clause\/"}],"contacts":[{"name":"Ana Fernandes","affiliation":"World Bank","email":"afernandes@worldbank.org"},{"name":"Reproducibility WBG","affiliation":"World Bank","email":"reproducibility@worldbank.org"}],"reproduction_instructions":"To reproduce the findings in this paper, a new user should:\n\n1. Obtain access to the restricted data and place it in the appropriate folder as indicated in the README.\n2. Open the project file (`.Rproj`).\n3. Restore the environment using `renv::restore()` or manually install the required packages.\n4. Open `!master.R`.\n5. Run the code.\n\nBecause some of the data is restricted and not included in the reproducibility package, the results produced by the replicators are provided in the `Outputs` folder. 
Interested users can compare these results against those included in the reproducibility package.\n\nThe reproducibility package begins from intermediate datasets directly provided by the authors.\nThe full process used to generate these intermediate datasets from the raw data is not included in the replication workflow and was not executed by the replicators.\nFor transparency purposes, the authors have included sample code demonstrating how to construct the intermediate dataset for one country (India). This sample code is available in the folder: `DataCreation\/`. The sample code is provided for documentation and reference only and is not required to run the reproducibility package.\nFor additional information about the full data creation process, interested users may contact the corresponding author at: afernandes@worldbank.org","technology_environment":"Paper exhibits were reproduced on a computer with the following specifications:\n\u2022 OS: Windows 11 Enterprise\n\u2022 Processor: Intel(R) Core(TM) i5-1145G7 CPU @ 2.60GHz\n\u2022 Memory available: 15.7 GB","datasets":[{"name":"S&P Panjiva Trade Data Platform","note":"Files location: data\/raw\/trade and data\/raw\/allidscorrespondence. Proprietary shipment-level export and import data were obtained from the S&P Panjiva trade data platform for seven countries: India (March 2024), Colombia (August 2024), Mexico (July and August 2024), Peru (January 2024), Sri Lanka (August 2024), Uruguay (December 2023), and Vietnam (November 2023). Data are proprietary and not available for public sharing. The authors obtained permission and legitimate access from S&P Panjiva. A list of all the datasets and countries is included in the reproducibility package file: data_hash_report.csv.  For more information on this data, please get in touch with the author Ana M. 
Fernandes (afernandes@worldbank.org).","access_type":"Data is restricted and not included in the reproducibility package","uri":"https:\/\/panjiva.com\/","citation":"S&P Global. S&P Panjiva Trade Data Platform [dataset]. Shipment-level export and import data for India, Colombia, Mexico, Peru, Sri Lanka, Uruguay, and Vietnam. Available under proprietary license at https:\/\/panjiva.com\/."},{"name":"Exporter Dynamics Database (EDD) 3.0","note":"Files location: data\/raw\/support\/consolidation_1996_2022.dta and data\/raw\/support\/statistics_exporterimporters_2019.xlsx. A data update is forthcoming at the Development Data Hub. Contact: Ana M. Fernandes (afernandes@worldbank.org).","uri":"https:\/\/datacatalog.worldbank.org\/search\/dataset\/0042326\/Exporter-Dynamics-Database","license":"Creative Commons Attribution 4.0 International (CC BY 4.0)","license_uri":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/","citation":"World Bank. Exporter Dynamics Database (EDD) 3.0 [dataset]. Washington, D.C.: World Bank.","access_type":"Data will be publicly available and is included in the reproducibility package"}]},"tags":[{"tag":"DOI"},{"tag":"Open Code"},{"tag":"Restricted Data"}],"schematype":"script"}