1 Structure of the dataset

BACI provides yearly data on bilateral trade flows at the product level. Products are identified using the Harmonized System (HS), the standard nomenclature for international trade, used by most customs administrations. The Harmonized System was revised in 1992, 1996, 2002, 2007 and 2012, and we provide BACI in each of these five revisions:

HS revision   Years available   Name of the files
92            1995-2017         BACI_HS92_year.csv
96            1998-2017         BACI_HS96_year.csv
02            2003-2017         BACI_HS02_year.csv
07            2008-2017         BACI_HS07_year.csv
12            2012-2017         BACI_HS12_year.csv
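
As an illustration of the naming scheme, the following minimal sketch builds a file name from a revision and a year. It assumes that "year" in the file names stands for the four-digit year (for instance BACI_HS92_1995.csv); the function name `baci_filename` and the dictionary `AVAILABLE_YEARS` are hypothetical helpers, not part of BACI.

```python
# Years covered by each HS revision, copied from the table above.
AVAILABLE_YEARS = {
    "92": range(1995, 2018),
    "96": range(1998, 2018),
    "02": range(2003, 2018),
    "07": range(2008, 2018),
    "12": range(2012, 2018),
}


def baci_filename(revision: str, year: int) -> str:
    """Return the expected file name for an HS revision and a year.

    Assumption: the placeholder "year" is replaced by the 4-digit year.
    """
    if revision not in AVAILABLE_YEARS:
        raise ValueError(f"unknown HS revision: {revision!r}")
    if year not in AVAILABLE_YEARS[revision]:
        raise ValueError(f"HS{revision} has no data for {year}")
    return f"BACI_HS{revision}_{year}.csv"


print(baci_filename("07", 2010))  # BACI_HS07_2010.csv
```

Rejecting years outside the published range early avoids confusing a file that does not exist with a year that has no trade.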

Each record within BACI is identified by an exporter-importer-product-year combination. There are 6 variables:

Variable   Description
t          Year
k          Product category (HS 6-digit code)
i          Exporter (ISO 3-digit country code)
j          Importer (ISO 3-digit country code)
v          Value of the trade flow (in thousands of current USD)
q          Quantity (in tons)

All files are in CSV format, using semicolons as delimiters and commas as decimal separators. When reading the data, we advise you not to treat the product code variable (k) as numeric, because doing so would remove the leading zeros of the HS codes.
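
The reading advice above can be sketched with the Python standard library as follows. This is only an illustration under stated assumptions: that each file starts with a header row naming the six variables, and that a missing quantity appears as an empty field. The function name `read_baci` and the sample row are made up for the example.

```python
import csv
import io


def to_float(text):
    """Convert a number written with a decimal comma; empty text gives None."""
    text = text.strip()
    return float(text.replace(",", ".")) if text else None


def read_baci(handle):
    """Yield one dict per record from an open BACI text file."""
    # Assumption: the first row is a header with the names t, i, j, k, v, q.
    reader = csv.DictReader(handle, delimiter=";")
    for row in reader:
        yield {
            "t": int(row["t"]),
            "i": row["i"],  # country codes are identifiers, kept as text
            "j": row["j"],
            "k": row["k"],  # kept as text so that leading zeros survive
            "v": to_float(row["v"]),
            "q": to_float(row["q"]),
        }


# Made-up sample row, used only to exercise the parser.
sample = io.StringIO("t;i;j;k;v;q\n2017;4;8;010121;12,5;3,25\n")
records = list(read_baci(sample))
print(records[0]["k"], records[0]["v"])  # 010121 12.5
```

To read an actual file, pass a handle obtained with `open(path, newline="")` instead of the in-memory sample. Users of pandas can obtain the same behaviour with the `sep`, `decimal` and `dtype` arguments of `read_csv`.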

To save space, only the strictly positive trade flows are recorded in BACI. Additional files are provided to help users decide whether a flow not appearing in BACI corresponds to a zero trade flow or to a flow for which no information is available.

Trade flows whose value does not exceed 1000 USD do not appear in BACI.

2 Additional files

In addition to the core BACI files, we provide four additional sets of files that may be useful to BACI users:

Name                   Function
country_codes          Associates the ISO 3-digit country codes to country names
product_codes          Associates the HS 6-digit product codes to product names
zeros                  Helps determine whether a flow absent from BACI is a zero trade flow or a missing value
reporter_reliability   Documents the reliability of trade declarations of each country

2.1 Country codes

These files associate the ISO 3-digit numeric codes used in BACI with country full names and with other versions of the ISO codes (3-letter and 2-letter). They were constructed based on the metadata provided by Comtrade.

2.2 Product codes

These files contain lists of the product codes used in each revision of the Harmonized System, along with a description of each product. They were constructed based on the metadata provided by Comtrade.

2.3 Zeros

These files indicate whether observations that are not in BACI should be treated as zero trade flows or as missing values. To save space, BACI records only strictly positive trade flows. Therefore, if a trade flow \(ijkt\) does not appear in BACI, it can mean either that there is no information on this trade flow (missing value) or that the trade flow is zero. The files contain 5 variables:

Variable   Description
t          Year
i          Exporter ISO 3-digit numeric code
j          Importer ISO 3-digit numeric code
zero1      1 if \(i\) or \(j\) has at least one non-zero trade flow during the year (imports or exports), 0 otherwise
zero2      1 if the dyad \(ij\) has at least one non-zero bilateral trade flow during the year, 0 otherwise
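
One possible way of using these two indicators is sketched below. The decision rule is an assumption made for the example, not a rule prescribed by BACI: under the strict option, an absent flow is treated as a zero only when the dyad reports at least one bilateral flow in that year (zero2 = 1), while the lenient option relies on zero1 instead. The function name `classify_absent_flow` and the `strict` parameter are hypothetical.

```python
def classify_absent_flow(zero1: int, zero2: int, strict: bool = True) -> str:
    """Classify a flow (i, j, k, t) that has no record in BACI.

    Assumed rule (not prescribed by BACI):
      strict=True  -> "zero" only if the dyad trades at all that year (zero2)
      strict=False -> "zero" as soon as zero1 equals 1
    Anything else is reported as "missing".
    """
    flag = zero2 if strict else zero1
    return "zero" if flag == 1 else "missing"


# A dyad with zero1 = 1 but no bilateral trade recorded in the year.
print(classify_absent_flow(zero1=1, zero2=0))                # missing
print(classify_absent_flow(zero1=1, zero2=0, strict=False))  # zero
```

Which option is appropriate depends on the application; the strict rule yields fewer zeros and more missing values.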

2.4 Reporter reliability

These files provide data on the reliability of each country when reporting exports and imports, in terms of value, quantity, and unit value. These figures correspond to \(\sigma\) in the companion working paper (p. 18).

3 Descriptive statistics

Aggregate trade flow: This is the total trade flow recorded in the 2019 version of BACI.

Number of dyads (\(ij\)): This is the number of distinct importer-exporter combinations with at least one non-zero trade flow in BACI. It increases strongly up to the mid-2000s, reflecting improvements in the coverage of our primary sources over this period.

Number of products (\(k\)): A product is defined as a 6-digit item of the HS nomenclature, 1992 revision.

Number of dyad-products (\(ijk\)): This is the number of distinct importer-exporter-product combinations with at least one non-zero trade flow in BACI.
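
The four statistics defined above can be computed from the records of one year as in the following minimal sketch. It assumes that the records are already loaded as dictionaries with the keys i, j, k and v; the function name `describe` and the sample records are made up for illustration.

```python
def describe(records):
    """Compute the four descriptive statistics for an iterable of records."""
    total_value = 0.0
    dyads, products, dyad_products = set(), set(), set()
    for r in records:
        total_value += r["v"]                        # thousands of current USD
        dyads.add((r["i"], r["j"]))                  # exporter-importer pair
        products.add(r["k"])                         # HS 6-digit code
        dyad_products.add((r["i"], r["j"], r["k"]))  # pair and product
    return {
        "aggregate_value": total_value,
        "dyads": len(dyads),
        "products": len(products),
        "dyad_products": len(dyad_products),
    }


# Made-up records: two flows from "4" to "8" and one flow from "8" to "4".
sample = [
    {"i": "4", "j": "8", "k": "010121", "v": 10.0},
    {"i": "4", "j": "8", "k": "020110", "v": 5.5},
    {"i": "8", "j": "4", "k": "010121", "v": 2.0},
]
print(describe(sample))
# {'aggregate_value': 17.5, 'dyads': 2, 'products': 2, 'dyad_products': 3}
```

Because BACI contains only strictly positive flows, every record counts towards these totals without further filtering. Dyads are counted as ordered pairs here, so the flows from "4" to "8" and from "8" to "4" form two dyads.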

4 Countries included in BACI

The countries used in BACI are inherited from the official UN trade data (Comtrade) upon which we build BACI. There are nevertheless some slight differences, because the construction of BACI rests on a reconciliation procedure that compares the declarations of exporters and importers, and therefore requires that the countries used in both declarations are the same. This is done by choosing the largest existing entity: for instance, because some countries do not report trade with Belgium and Luxembourg separately, we gather them under the ISO code corresponding to the single entity, “Belgium - Luxembourg”. The table below gathers all the countries for which Comtrade has data on trade flows after 1995 (first year of BACI), that are not residual areas (“Not Elsewhere Specified”) but are nevertheless absent from BACI.

5 Differences across HS revisions

More recent revisions of the HS nomenclature are available only for more recent years. The choice of a revision has little effect on the aggregate trade flow, suggesting that large trade flows are recorded whatever the revision.

Nevertheless, for a given revision, the number of products with non-zero trade flows tends to decrease over time. This is probably because some items of the nomenclature become obsolete, corresponding to products that are no longer sold (conversely, some new products may have appeared without having a specific item of their own).

Also, it is worth noting that using a recent HS revision leads to fewer dyads being available:

In line with the previous graphs, we observe that older classifications tend to have fewer observations for recent years, probably because they contain obsolete items and do not distinguish between some new products: