Evaluating the effects of language on international trade in MENA countries: A gravity-model approach

Prior studies have investigated the role of economic and noneconomic variables in international trade. A major factor that has been studied less is the language used in transactions and negotiations. We explore the effects of language connectedness and the Arabic language on international trade in thirteen countries in the Middle East and North Africa (MENA) region. We use a panel of bilateral data and a gravity model for the countries of the region over the 2000 to 2018 period, estimated with the Poisson pseudo-maximum-likelihood (PPML) method. The empirical outcomes indicate that speaking Arabic leads to an increase in exports; that is, Arab nations prefer to export to countries whose people speak their language. In addition, the language connectedness index, which depends on the extent to which a country's languages are spoken outside the country, is positively associated with the levels of exports and imports. Results further show that GDP, the population of the destination country, and political co-stability have significant positive impacts on bilateral exports. Additionally, GDP, the population of the source country, political co-stability, and a common border have significant positive influences on bilateral imports. The major contribution of this research is the finding that the Arabic language has a significant and positive impact on trade among MENA countries.


Introduction
International trade patterns depend heavily on unobservable trade costs (Deardorff, 2014). Trade barriers, high trade costs, and the low capacity of some countries to supply goods demanded by other countries contribute to unexploited trade potential. Many factors affect international trade, and the languages used in cross-border transactions stand out among them. The linguistic factor has been overlooked in prior research and may help solve the Unobserved Trade Cost Puzzle. Empirical findings from prior research, usually based on gravity models, reveal that bilateral trade declines with language distance more rapidly than can be accounted for by the trade costs that are implicit in price differences across countries and locations. The main purpose of this research is to investigate the role of linguistic variables, namely the Arabic language and the language connectedness index, in international trade in selected MENA countries by implementing the PPML method in gravity equations with panel data. This is the impetus and contribution of this research.
Adam Smith was the first to present the relationship between language and the economic behavior of individuals in business. In his view, merchants should rely on truthful communication to become successful in business and trade (Alonso-Cortés & Cabrillo, 2012). The earliest international languages, such as pidgins, emerged through trade and were typically based on European languages. These languages were restricted to what trade required and then expanded in the 17th to 19th centuries around trade barriers (Al-Jasser, 2012; Botha, 2006; Mufwene, 2015). An analysis of estimates collected from 81 academic articles found that a common language increases the flow of trade by 44% (Egger & Lassmann, 2012). Moreover, Baltakys et al. (2019) showed that investor pairs in Finland who speak the same language have more similar trade timing than pairs who speak different languages. In a study of 115 countries, Oh et al. (2011) found that speaking a common official language increases imports by 43%. Apart from a common language, languages that are more closely connected to the languages of the rest of the world can also facilitate international trade. To investigate this matter, Konara (2020) introduced an index named language connectedness (LC) and found that its effect is stronger for trade in services than for merchandise trade.
If the two countries in a pair speak different languages, they face language barriers. In this regard, Lohmann (2011) demonstrated that language barriers can exert a significant negative impact on international trade: an increase of 10% in the language barrier index reduced trade flows by 7-10%. Anderson and Van Wincoop (2004) found that a language barrier is equivalent to a tariff of about 5-12 percent.
English and other European languages are the most effective languages in promoting international trade (Melitz, 2008). Hejazi and Ma (2011) used the gravity model to investigate the effects of the English language on international business. According to their study, there are intra-language and inter-language influences. The intra-language influence means that countries with better proficiency in English are better able to do business with countries whose people speak English. The inter-language influence means that non-English-speaking countries with better English proficiency communicate better with other non-English-speaking countries that are also proficient in English. They also found that countries whose people speak English gain a clear advantage in international trade compared to non-English-speaking nations. Finally, Selmier II and Oh (2012) argued that English is the least expensive language among the main trade languages.
On the other hand, some studies have found that speaking English is not always an advantage. According to Grin (2002), English is no longer regarded as a sufficient condition for socio-economic achievement, while speaking other languages may contribute to higher profits. Selmier (2016), examining the linkage between international trade and sociolinguistics, referred to as lingual economics, suggested that native English speakers in the twenty-first century will face a disadvantage because they know just one language.
Languages differ sharply in their effect on international trade. In a study of 115 countries, Oh et al. (2011) found that transaction costs in English and French are lower than those in Spanish and Arabic. In another study, Selmier and Oh (2013) demonstrated a notable hierarchy in the transaction costs of the main languages: English has the lowest transaction cost, followed by French, Spanish, and Arabic.
We take the effects of language on export and import volumes into consideration in two steps. First, we assess the impact of the Arabic language, entered as a dummy variable, on bilateral export and import flows in MENA. Second, we apply the language connectedness index derived from Konara (2020) to investigate the impact of language connectedness on both export and import volumes.

Model
The main purpose of this research is to investigate the effects of language connectedness and the Arabic language on bilateral trade in the MENA region, employing the Poisson pseudo-maximum-likelihood (PPML) estimation procedure within the framework of an augmented gravity model. MENA is an economically diverse region in which countries have different levels of per capita income, natural resources, and economic development. Exports and imports both play a critical role in stimulating growth in this region and integrating it into the global economy (Ekanayake & Ledgerwood, 2009).

Data
According to the World Atlas, MENA is a region spanning from Iran in the east to Morocco in the west and includes the following countries: Algeria, Bahrain, Egypt, Iran, Iraq, Israel, Jordan, Kuwait, Lebanon, Libya, Morocco, Oman, Palestine, Qatar, Saudi Arabia, Syria, Tunisia, Turkey, United Arab Emirates, and Yemen. Because bilateral export and import data are missing for some years, leaving large gaps, we had to exclude Israel, Iraq, Lebanon, Libya, Palestine, Syria, and Yemen from the regression. The dominant spoken language in this region is Arabic, which is spoken in all countries except Israel, Turkey, and Iran (Kabasakal et al., 2012).
To have a perspective on the MENA countries' exports and imports background, we collected the data for our study from United Nations Comtrade (http://comtrade.un.org/data/, 2020). The average bilateral exports and imports flow of thirteen countries in MENA are demonstrated in Table 1. This table is divided into two parts, the upper part shows the average exports flow from country i to country j, and the lower part demonstrates the average imports inflow of country i from country j.
We arranged the export data in a matrix in which the thirteen countries are sorted alphabetically along the vertical axis, from bottom to top, as shown in Figure 1 and Figure 2. In this formalism, the index of a destination country is obtained by omitting the exporting country from the list of thirteen countries, because no country trades with itself. As a result, the country index (horizontal axis) of each destination country is the position of that country in the re-sorted list of the twelve remaining country names. This makes it possible to examine the relations between every two countries from the perspective of both imports and exports. The formalism creates a database with thirteen source countries and twelve destination countries in the MENA region, which results in 156 (13 x 12) elements each for exports and imports.
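The indexing scheme described here can be sketched in a few lines of Python. This is only an illustration: the country list is inferred from the exclusions named in the Data section, and the function names `destination_index` and `destinations_for` are hypothetical helpers, not part of the original analysis.

```python
# Thirteen sample countries in alphabetical order (inferred from the
# exclusions listed in the Data section; treat the list as an assumption).
COUNTRIES = [
    "Algeria", "Bahrain", "Egypt", "Iran", "Jordan", "Kuwait", "Morocco",
    "Oman", "Qatar", "Saudi Arabia", "Tunisia", "Turkey",
    "United Arab Emirates",
]


def destination_index(source: int, destination: int) -> int:
    """Position of `destination` once `source` is dropped from the list."""
    if source == destination:
        raise ValueError("a country does not trade with itself")
    # Countries sorted after the source move up by one slot.
    return destination if destination < source else destination - 1


def destinations_for(source: int) -> list:
    """The twelve re-sorted destination names for a given source country."""
    return [name for k, name in enumerate(COUNTRIES) if k != source]


# Every ordered (source, destination) pair with source != destination.
pairs = [
    (i, j)
    for i in range(len(COUNTRIES))
    for j in range(len(COUNTRIES))
    if i != j
]
print(len(pairs))  # 13 * 12 ordered pairs
```

Each source country therefore has its own list of twelve destinations, and the same ordered pair appears once in the export matrix and once in the import matrix.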
The contour plot in Figure 1 depicts the average export volume, shown by colors ranging from blue to brown. Saudi Arabia, Turkey, and the UAE, shown in orange, red, and brown, were the top exporters in this region; Saudi Arabia and the UAE are also among the world's top oil exporters. In contrast, Algeria, Iran, Morocco, and Tunisia appear mostly in blue shades, indicating low export values. Figure 2 displays a contour plot of the logarithm of the average bilateral import flows in the MENA region. The horizontal axis, vertical axis, color bar, and scales are the same as in Figure 1. According to this graph, Turkey and the UAE were the top importers, with more orange, red, and brown colors (except for their imports from Oman). Additionally, Saudi Arabia had sizable import flows with most of the countries except Algeria. Morocco's imports from Algeria, Egypt, and Iran, shown in orange and red, are also considerable. In contrast, most countries had their lowest import volumes with Algeria and Tunisia. In the center of the contour plot, Morocco and Oman had low levels of bilateral imports, depicted in blue.

Methodology
We present the dependent and independent variables, as well as the sources of our data, in Table 3. In the context of world trade flows, Tinbergen (1962) and Pöyhönen (1964) were the first to use the gravity model to study international trade. The gravity model applies Newton's law of universal gravitation to the study of trade between countries and assumes that the countries' incomes increase bilateral trade and that the distance between them decreases it (Goh & Tham, 2013). The basic equation of the gravity model is as follows:
$$X_{ij} = \alpha \, \frac{GDP_i \, GDP_j}{D_{ij}}$$
where $X_{ij}$ is the volume of country $i$'s import flows from (or export flows to) country $j$, $GDP_i$ and $GDP_j$ are the economic sizes of countries $i$ and $j$, respectively, $D_{ij}$ is the geographical distance between the two countries, and $\alpha$ is a proportionality constant. In international trade research, the gravity model is mostly used to examine the effects of influencing factors on bilateral trade, such as economic size, bilateral distance, and cultural factors (Niu, 2017). According to the basic assumption, bilateral trade is positively linked to GDP and negatively related to geographical distance and other trade barriers (Kuik et al., 2019).
Most MENA countries have a common heritage, culture, language, and religion (Ekanayake & Ledgerwood, 2009). Iran and Turkey are exceptions in terms of language: Persian is the official language of Iran, and Turkish is spoken in Turkey. In terms of religion, Iran and Iraq are the only Shia-majority countries; all other MENA nations have a Sunni Muslim majority. One of the most prominent factors that should be added to the gravity model is the transaction cost of languages. Languages are both a tool in global economic transactions and a vehicle for transferring cultural content (Selmier II & Oh, 2012). To investigate the effects of languages on bilateral exports and imports, we employ an index of language connectedness in the gravity models.
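As a small worked example of the basic gravity relation, in which trade is proportional to the product of the two GDPs and inversely proportional to distance, the sketch below evaluates the formula for made-up inputs. The function name `gravity_flow`, the default constant of 1.0, and all numbers are illustrative assumptions, not values from the study.

```python
def gravity_flow(gdp_i: float, gdp_j: float, distance: float,
                 alpha: float = 1.0) -> float:
    """Basic gravity relation: alpha * GDP_i * GDP_j / D_ij."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return alpha * gdp_i * gdp_j / distance


# Illustrative inputs only (arbitrary units).
base = gravity_flow(200.0, 50.0, 1000.0)
farther = gravity_flow(200.0, 50.0, 2000.0)   # distance doubled
richer = gravity_flow(400.0, 50.0, 1000.0)    # source GDP doubled

print(base, farther, richer)
```

In this stylized form, doubling the distance halves the predicted flow and doubling either GDP doubles it; the augmented model relaxes these unit elasticities by estimating a separate coefficient for each term.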
In addition, we define a dummy variable for Arabic, which is the major language in this region and is spoken by more than 100 million people (Oh et al., 2011). The ability of people to speak the languages of other countries encourages them to engage in cross-border activities. Konara (2020) used the Ethnologue Global Dataset, which includes statistics on the speakers of 7479 languages in 234 countries, to construct an index of language connectedness (LC). LC is an index that reflects the extent to which the languages of a country are spoken outside that country.
To operationalize LC, Konara (2020) defines a country-level index built from $P_i$, the population of country $i$, and $P_{ij}/P_i$, the proportion of the population in country $i$ who speak language $j$. Multilingualism raises people's potential for communicating with the outside world. In Konara's (2020) study, the index ranges from 0.02 to 1.96, with Japan having the lowest LC value and Luxembourg the highest. Table 2 lists the languages and the language connectedness index for the thirteen MENA countries included in this study. The MENA countries have different levels of language connectedness, ranging from 0.08 to 0.66. This study uses the sum of the two countries' LC indexes in each country pair, dubbed the LC-plus index, in the gravity equations. Figure 3 illustrates language connectedness among MENA country pairs in a contour plot of LC-plus index values.
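The LC-plus construction, the sum of the two countries' LC values in a pair, can be sketched as follows. The country labels "A", "B", "C" and their LC values are placeholders chosen inside the 0.08 to 0.66 range mentioned in the text; the actual values are those reported in Table 2, and the helper name `lc_plus` is hypothetical.

```python
from itertools import permutations

# Placeholder LC values for three unnamed countries (illustration only;
# the real values are the ones reported in Table 2).
lc = {"A": 0.08, "B": 0.30, "C": 0.66}


def lc_plus(lc_by_country: dict) -> dict:
    """LC-plus for every ordered (source, destination) pair: LC_i + LC_j."""
    return {
        (i, j): lc_by_country[i] + lc_by_country[j]
        for i, j in permutations(lc_by_country, 2)
    }


pair_index = lc_plus(lc)
for pair, value in sorted(pair_index.items()):
    print(pair, round(value, 2))
```

Because the index is a plain sum, it is symmetric in the two countries: the pair (i, j) and the pair (j, i) receive the same LC-plus value, so the variable differs across pairs but not by direction of trade.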
In Figure 3, as in Figure 1 and Figure 2, the vertical axis depicts country i (the source country) and the horizontal axis depicts country j (the destination country). The LC-plus index values of country pairs, on a scale from 0.2 to 1.2, are shown by colors ranging from dark blue to brown. The highest LC-plus index values, shown in red and yellow, belong to Tunisia and Jordan. Country pairs involving Kuwait, Morocco, Oman, and Qatar, located in the middle of the plot and colored green and sea glass, have more language connectedness than the country pairs in shades of dark blue.
Other important factors considered are economic level and geographic distance. The baseline estimation equations for bilateral exports and imports (Equations 5 and 6) therefore include the following terms. $Legal_{ij}$ is a dummy for country pairs with the same legal system (Muslim law, civil law, or mixed systems of law); $CB_{ij}$ is a dummy that takes the value one when two countries have a common border and zero otherwise.
(Source for Table 2: Konara (2020) study, calculated by the authors.)
Additionally, $L_{ij}$ is a dummy variable that takes the value one when the two countries in a pair share a common official language and zero otherwise. In this study, we define the Arabic language as a dummy variable for pairs of countries in which only one country speaks Arabic (Oh et al., 2011). LC-plus is a linguistic variable equal to the sum of the two countries' language connectedness indexes. Finally, $\varepsilon_{ijt}$ is a residual term.
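As a rough sketch of how a PPML gravity specification containing the variables named above could be written, consider the form below. The coefficient labels $\beta_k$, the variable names $Pop$, $PS$ (political co-stability), $Arabic$, and $LCplus$, and the choice to enter GDP, population, and distance in logarithms are assumptions made for illustration and should not be read as the authors' exact notation.

```latex
X_{ijt} = \exp\Big(
    \beta_0
  + \beta_1 \ln GDP_{it} + \beta_2 \ln GDP_{jt}
  + \beta_3 \ln Pop_{it} + \beta_4 \ln Pop_{jt}
  + \beta_5 \ln D_{ij}
  + \beta_6 PS_{ijt}
  + \beta_7 Legal_{ij} + \beta_8 CB_{ij}
  + \beta_9 L_{ij} + \beta_{10} Arabic_{ij} + \beta_{11} LCplus_{ij}
\Big) + \varepsilon_{ijt}
```

Here $X_{ijt}$ would stand for the exports of country $i$ to country $j$ in year $t$ in the export equation, and for the imports of country $i$ from country $j$ in the import equation. The dependent variable enters in levels, which is what allows zero trade flows to remain in the sample.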
The tests are done on annual data. Bilateral exports, bilateral imports, and GDP are indicated in US dollars. The bilateral exports and imports data are collected from the United Nations Comtrade database (https://comtrade.un.org). Data on GDP were obtained from the International Monetary Fund (https://www.imf.org/external/index.htm). We have also gathered data on political stability from the World Bank (https://www.worldbank.org). Information on the common legal system is compiled from the JuriGlobe research group at the University of Ottawa (http://www.juriglobe.ca). The CIA World Factbook website (https://www.cia.gov) provided language information and the common border dataset. Population statistics are collected from the World Development Indicators database (https://datacatalog.worldbank.org/dataset/world-development-indicators).
Moreover, physical distance in kilometers is calculated using the Time and Date website (https://www.timeanddate.com/worldclock/distance.html). Finally, the LC-plus index for these countries is based on the LC values calculated by Konara (2020), who used the Ethnologue Global Dataset (Lewis et al., 2014). Table 3 summarizes the data sources used in Equations 5 and 6 and lists the dependent and independent variables used in this research.

Analytic Methods
This study aims to investigate the effect of the Arabic language and language connectedness on bilateral trade between MENA countries. To achieve this aim, Equations 5 and 6 are estimated on the panel data using the PPML method. The gravity model with a logarithmic transformation and ordinary least squares (OLS) estimation is widely used in empirical studies, under the prevalent assumption of homoscedasticity among country pairs (Yazdani & Pirpour, 2020). Recent studies have recognized some shortcomings of log-linearization and OLS estimation (Arvis & Shepherd, 2013). Silva and Tenreyro (2006) argued that OLS estimation of the log-linearized model is inconsistent under heteroscedasticity and cannot handle zero trade volumes. They suggested Poisson pseudo-maximum-likelihood (PPML) estimation instead of log-linearization and OLS.
One of the most important features of the PPML estimator is that the dependent variable in the gravity model is in levels, which eliminates the bias resulting from the log-linear transformation of the data. Moreover, this estimator gives the same weight to all observations, and the interpretation of its results is clear (Eichler et al., 2017).
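To make the idea concrete, the sketch below fits a one-regressor exponential-mean model by Poisson pseudo-maximum likelihood using Newton's method in plain Python. It is a teaching illustration only, not the Stata routine used in the study: the function name `fit_ppml`, the toy data, and the single regressor are all assumptions. Note that the zero trade flow stays in the sample, which a log-linear OLS regression could not accommodate.

```python
import math


def fit_ppml(x, y, max_iter=100, tol=1e-10):
    """Fit E[y | x] = exp(b0 + b1 * x) by Poisson pseudo-maximum likelihood.

    Uses Newton's method on the Poisson log-likelihood; y is in levels and
    may contain zeros.
    """
    b0 = math.log(sum(y) / len(y))  # start from the log of the sample mean
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score (gradient) of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Entries of the 2x2 information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by Cramer's rule.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data: x plays the role of log distance, y of a trade flow in levels.
log_distance = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
trade = [12.0, 9.0, 5.0, 4.0, 0.0, 1.0]  # includes a zero flow

b0, b1 = fit_ppml(log_distance, trade)
print(b0, b1)
```

At the solution the fitted values sum to the observed total, and the estimated slope is read directly as an elasticity when the regressor is in logs. In applied work a tested library or package would normally be used together with robust standard errors, which this sketch omits.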
We study $n$ countries with $n(n-1)$ cross-sections. Taking the export variable as an example, we have thirteen exporting countries that each export goods to twelve other countries (no country exports to itself); adding the time dimension of 19 years (2000 to 2018), the total number of observations for exports is equal to $13 \times 12 \times 19 = 2964$.
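The size of the panel can be checked by enumerating the (source, destination, year) index directly. This is a minimal sketch that assumes one observation per ordered country pair and year over 2000 to 2018; the variable names are illustrative.

```python
from itertools import product

N_COUNTRIES = 13
YEARS = range(2000, 2019)  # 2000 through 2018 inclusive

# One observation per ordered country pair (i != j) and year.
observations = [
    (i, j, t)
    for i, j in product(range(N_COUNTRIES), repeat=2)
    if i != j
    for t in YEARS
]

print(len(YEARS), len(observations))
```

The same count applies to the import equation, since it uses the same ordered pairs and years.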

Results and Discussion
In general, empirical studies investigate the time-series properties of the regression variables. Hence, to examine the stationarity of each series, the Levin-Lin-Chu panel unit root test is carried out in Stata 13. If the p-value of the test is less than 0.05, the variable is stationary; if it is greater than 0.05, the variable is nonstationary. The results of the Levin-Lin-Chu panel unit root test are reported in Table 4.
Results from this test show that imports are stationary in first differences, while the other variables are stationary in levels. Since all the variables in Equation 5 are stationary in levels and have a long-run relationship, there is no need to conduct a co-integration test. On the other hand, because Equation 6 contains a variable with a different order of integration, we apply the Kao method to examine whether a long-run co-integration relationship exists (Yazdani & Pirpour, 2020). The estimated result of this test is reported in Table 4: the test statistic is -14.9623, with a probability value of 0.0000. For all these variables, we estimated a Kao test with automatic lag length selection based on the Schwarz information criterion and a maximum lag length of 1. Based on the results of the Kao test, there is co-integration and a strong long-run relationship across the variables. After this robustness check, Equations 5 and 6 are estimated using the PPML method in Stata 13, and the results are shown in Table 5.
In Table 5, the first column lists the explanatory variables in Equations 5 and 6, and the second and third columns report the estimation results for bilateral exports and imports, respectively. The estimated coefficients of the explanatory variables in Equation 5 are significant except for the population of country i and the common border. The estimated coefficients in Equation 6 are significant except for the population of country j and the Arabic-speaking dummy.
The estimated coefficients for the GDP of country i and country j are 0.90 and 0.52 in the exports column and 0.47 and 0.83 in the imports column. These positive and significant coefficients are consistent with the theory that the economic size of the countries (GDP) contributes to higher exports and imports. In addition, the exports of country i to country j are positively affected by the population of country j, while import flows are influenced by the population of country i.
Note: The numbers in parentheses are z-statistics. *, ** and *** denote coefficient significance at 1%, 5% and 10%, respectively. Those with no star are not significant.
On the other hand, geographical distance has a negative sign, which means that a longer geographical distance reduces bilateral exports and imports among MENA countries. Additionally, the regression results show that the coefficients for political co-stability have the expected signs, as a stable political setting helps make these countries a safe place for trade. Furthermore, common legal systems have negative effects on bilateral exports and imports. This may be because of the enhanced role of international arbitration in cross-border trade and the standardization of international contracts (Powell & Rickard, 2010). In addition, the common border coefficient (0.36) is positive and significant for imports, but it has no significant effect on exports.
Regarding the linguistic variables, speaking a common official language increases bilateral exports by 68% and imports by 30%, indicating that a shared language between two MENA countries reduces the cost of communication and thereby promotes trade. Moreover, the results show that speaking Arabic promotes bilateral exports by 28%, while the relationship between imports and the Arabic language is not significant. This linguistic coefficient suggests that people who speak Arabic are more willing to trade with their linguistic counterparts. Arab nations in this region that share a common language and other relevant cultural features, customs, and values are likely to understand each other's business practices better than firms operating in less similar environments and thus engage in higher levels of trade.
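For readers relating percentage effects of this kind to regression coefficients, the usual conversion for a dummy variable in an exponential-mean model such as PPML is (exp(b) - 1) x 100. The sketch below applies that conversion in both directions; it assumes this standard formula, which the text does not state explicitly, and the function names are hypothetical.

```python
import math


def percent_effect(coefficient: float) -> float:
    """Percentage change in trade when a dummy switches from 0 to 1."""
    return (math.exp(coefficient) - 1.0) * 100.0


def implied_coefficient(percent: float) -> float:
    """Dummy coefficient that corresponds to a given percentage effect."""
    return math.log(1.0 + percent / 100.0)


# Percentage effects quoted in the text, mapped back to the coefficients
# they would imply under the conversion assumed here.
for label, pct in [
    ("common language, exports", 68.0),
    ("common language, imports", 30.0),
    ("Arabic, exports", 28.0),
]:
    print(label, round(implied_coefficient(pct), 3))
```

Under this conversion a coefficient is only approximately equal to the percentage effect divided by 100 when it is small, which is why the two are worth distinguishing when comparing results across studies.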
Additionally, the estimated results provide evidence of the effect of LC-plus on exports and imports, with coefficients of 0.47 and 0.69, respectively. These results reflect the important role of language in international trade; that is, linguistic proximity facilitates a country's cross-border activities in the MENA region. To put it another way, exports and imports require direct communication (face-to-face or written contact) and a similar business environment. A country whose people speak a language that is widely spoken outside the country is more capable of communicating with the outside world, which demonstrates the importance of the language profile as a key determinant of global trade.

Conclusions
In today's competitive business climate and with the rise of globalization, we have witnessed further increases in international trade. Given the scarcity of research on the important role of linguistic variables in international trade, the purpose of this study was to examine the impact of the Arabic language and language connectedness on exports and imports in the MENA region using a gravity model and the PPML method, covering thirteen countries between 2000 and 2018. To understand the impacts of linguistic variables on bilateral trade, we included in the gravity equations an Arabic-language dummy and the LC-plus index, the sum of the two countries' language connectedness indexes, where language connectedness reflects the share of total potential communicative partners in the focal country and outside it.
The empirical results indicate that the GDP, the population of the destination country, and political co-stability have significant positive impacts on the bilateral exports. Additionally, GDP, the population of the source country, political co-stability, and a common border have had significant positive influences on bilateral imports. However, geographical distance and common legal systems have had negative effects on bilateral exports and imports. Regarding linguistic variables, results show a positive linkage between common language and international trade.
More importantly, the Arabic language has a significant and positive impact on exports, which may be because most of the countries in this region are Arabic-speaking nations and are more likely to export to people who share a similar language, culture, and customs. Based on the estimated results, the LC-plus index is positively associated with bilateral trade, showing that a country's language connectedness to the rest of the world can facilitate cross-border activities in the MENA region.
Although this paper has pursued its stated objectives extensively, it has limitations. Other MENA countries could also have been included in the analysis, but because of data constraints this work was restricted to thirteen countries and 19 years. Moreover, in the MENA region, the only language that is spoken by more than 100 million people and shared by many countries is Arabic; more languages could be added if other regions were considered.
We suggest further use of the LC-plus index in the study of international trade and linguistic factors in MENA or other similar regions. Besides, other scholars can choose another region with more linguistic diversity to select more languages as explanatory variables. More importantly, we suggest investigating the effect of other cultural factors on the volume or flow of exports and imports.
Author Contributions: Fatemeh Rahimzadeh: conceptualization; methodology; data curation; software; formal analysis. Bahman P. Ebrahimi: supervision; validation; writing; editing. All authors have read and agreed to the published version of the manuscript.
Funding: This research has received no external funding.