Sampling strategy and responses

The survey was distributed to researchers identified through the 2023 ORCID database, and both distribution and analysis followed a multi-step process. First, a comprehensive data extraction retrieved researcher IDs, given names, family names, countries of residence, and email addresses. Records lacking publicly available information were excluded from the dataset to ensure adherence to privacy norms and regulations. This filtering yielded 333,105 email addresses corresponding to 213,511 unique researcher IDs.

Because a globally representative sample of researchers' opinions was important, the country data were used to count the number of researchers per region, as shown in Table 1.

Table 1. Original distribution of ORCID IDs by region.

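| Continent | Region | ORCID IDs |
|---|---|---|
| Africa | Northern Africa | 4,753 |
| Africa | Sub-Saharan Africa | 4,666 |
| Americas | Latin America and the Caribbean | 41,979 |
| Americas | Northern America | 22,162 |
| Asia | Central Asia | 1,030 |
| Asia | South-eastern Asia | 10,325 |
| Asia | Southern Asia | 22,318 |
| Asia | Eastern Asia | 26,942 |
| Asia | Western Asia | 13,307 |
| Europe | Northern Europe | 15,935 |
| Europe | Southern Europe | 32,115 |
| Europe | Western Europe | 21,346 |
| Europe | Eastern Europe | 18,505 |
| Oceania | Australia and New Zealand | 5,932 |
| Oceania | Micronesia | 20 |
| Oceania | Melanesia | 80 |
| Oceania | Polynesia | 36 |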

Under the regional classification used in this study, the distribution of researchers across continents was clearly imbalanced. Three adjustments were proposed to address this disparity. First, all regions within Oceania were consolidated into a single entity, because some of them held too few records to warrant separate categorisation. Second, Central Asia was merged with Eastern Asia for the same reason. The third adjustment concerned Latin America and the Caribbean, where a single group with a large number of researcher IDs did not reflect the diversity of the region's geography and scientific systems. The region was therefore divided along intermediate regional lines, separating Central America and the Caribbean from South America. The resulting distribution of records is detailed in Table 2.
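The consolidation step described above can be sketched as a simple relabelling of regions followed by a sum. This is a minimal illustration and not the procedure actually used in the study: the names `MERGE_MAP` and `merge_regions` are hypothetical, and only the Asian counts from Table 1 are used as input.

```python
from collections import defaultdict

# Hypothetical mapping from original region names to adjusted region names.
# Regions not listed here keep their original name.
MERGE_MAP = {
    "Central Asia": "Eastern and Central Asia",
    "Eastern Asia": "Eastern and Central Asia",
}


def merge_regions(counts, merge_map):
    """Sum ORCID ID counts after relabelling regions according to merge_map."""
    merged = defaultdict(int)
    for region, n_ids in counts.items():
        merged[merge_map.get(region, region)] += n_ids
    return dict(merged)


# Asian counts as listed in Table 1.
asia_counts = {
    "Central Asia": 1030,
    "South-eastern Asia": 10325,
    "Southern Asia": 22318,
    "Eastern Asia": 26942,
    "Western Asia": 13307,
}

adjusted = merge_regions(asia_counts, MERGE_MAP)
print(adjusted["Eastern and Central Asia"])  # 27972
```

Regions absent from the mapping pass through unchanged, so the same function would cover the other adjustments given a suitable mapping and country-level input.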

Table 2. Adjusted distribution of ORCID IDs by region.

| Continent | Region | ORCID IDs |
|---|---|---|
| Africa | Northern Africa | 4,753 |
| Africa | Sub-Saharan Africa | 4,666 |
| Americas | Central America and the Caribbean | 7,847 |
| Americas | Northern America | 22,162 |
| Americas | South America | 34,791 |
| Asia | South-eastern Asia | 10,325 |
| Asia | Southern Asia | 22,318 |
| Asia | Eastern and Central Asia | 27,972 |
| Asia | Western Asia | 13,307 |
| Europe | Northern Europe | 15,935 |
| Europe | Southern Europe | 32,115 |
| Europe | Western Europe | 21,346 |
| Europe | Eastern Europe | 18,505 |
| Oceania | Oceania | 6,013 |

Despite these adjustments, regional disparities in the number of researchers persisted. To mitigate this, the sample size needed from each region to represent its research community was estimated. The estimate drew on data from the UNESCO Institute for Statistics (researchers in R&D per million people, in full-time equivalents, FTE) and the United Nations Statistics Division (Standard Country or Area Codes for Statistical Use, M49). Using the most recent data available for each country, the sample size for each group was calculated with the Cochran formula, which is commonly used to determine sample sizes in surveys and experiments. The calculation assumed a 95% confidence level, a 5% margin of error, and a population proportion of 0.5, the value that yields the largest (most conservative) required sample size.
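As a rough illustration of the calculation described above, the sketch below implements the Cochran formula, n0 = z² · p · (1 − p) / e², with the stated parameters (z = 1.96 for a 95% confidence level, p = 0.5, e = 0.05). The function names are hypothetical. The finite population correction is shown only as an optional step, since the text does not say whether it was applied, and the population figure in the example is a made-up value rather than data from the study.

```python
import math


def cochran_sample_size(z=1.96, p=0.5, e=0.05):
    """Cochran's formula for a large population: n0 = z^2 * p * (1 - p) / e^2."""
    return (z ** 2) * p * (1 - p) / (e ** 2)


def finite_population_correction(n0, population):
    """Optional adjustment for a finite population: n = n0 / (1 + (n0 - 1) / N).

    Assumption: the source text does not state that this correction was used.
    """
    return n0 / (1 + (n0 - 1) / population)


# Parameters stated in the text: 95% confidence, p = 0.5, 5% margin of error.
n0 = cochran_sample_size()
print(math.ceil(n0))  # 385

# Hypothetical regional researcher population, for illustration only.
hypothetical_population = 50_000
n_adjusted = finite_population_correction(n0, hypothetical_population)
print(math.ceil(n_adjusted))
```

With p = 0.5 the product p · (1 − p) reaches its maximum of 0.25, which is why this choice gives the most conservative sample size for a given margin of error.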

Table 3 shows the number of ORCID IDs available in the database and the number of researchers per region, together with three results of the sampling process: the number of respondents needed to represent each research community, the number of e-mails sent, and the response rate needed to reach the desired number of responses.
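The required response rate can be read as the ratio of respondents needed to e-mails sent. The sketch below assumes that definition; the function name and the example figures are hypothetical and are not taken from Table 3.

```python
def required_response_rate(needed_responses, emails_sent):
    """Share of e-mailed researchers who must respond to reach the target sample."""
    if emails_sent <= 0:
        raise ValueError("emails_sent must be positive")
    return needed_responses / emails_sent


# Hypothetical figures for illustration only (not values from the study).
rate = required_response_rate(needed_responses=385, emails_sent=5000)
print(f"{rate:.1%}")  # 7.7%
```

A rate above 100% would indicate that the region has too few available addresses to reach the target sample at all.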

Table 3. Sample size and response rate calculations.
