Posted by: Ian | December 8, 2009

Languages and Countries

One of the facets of the Open Access Repository Junction is that it will know about the languages used in repositories, and the countries that repositories [claim to] live in. The benefit is that the Junction could return a list of all repositories in a particular country (useful for places like Africa), or all those that have a particular language in the interface (list all the French-speaking repositories in Canada, for example).

What is needed, however, is a definitive list of countries and a definitive list of languages.

Languages are fairly easy: the Library of Congress is the authoritative source for the “ISO 639-2” list of languages, and provides a downloadable text file giving, for each language, the three-letter codes, the two-letter code, the English name, and the French name.
It would be nice to harvest the Wikipedia page on language codes, as that includes the local name for each language, but that’s a “later version” thing.
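
As a rough idea of how that file could be loaded, here is a minimal Python sketch. It assumes the download is pipe-delimited with five fields per line (bibliographic three-letter code, terminologic three-letter code, two-letter code, English name, French name); check that against the file you actually fetch. The names `Language`, `parse_language_file` and `SAMPLE` are made up for illustration and are not part of the Junction.

```python
from typing import NamedTuple


class Language(NamedTuple):
    bibliographic: str  # three-letter code (bibliographic form)
    terminologic: str   # three-letter code (terminologic form), may be empty
    alpha2: str         # two-letter code, may be empty
    english: str
    french: str


def parse_language_file(text: str, delimiter: str = "|") -> list[Language]:
    """Parse language records, one per line; skip anything malformed."""
    languages = []
    for line in text.splitlines():
        # Drop a possible byte-order mark and surrounding whitespace.
        line = line.lstrip("\ufeff").strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(delimiter)]
        if len(fields) != 5:
            continue  # assumed layout has exactly five fields
        languages.append(Language(*fields))
    return languages


# Hypothetical sample in the assumed layout, not the real download.
SAMPLE = "fre|fra|fr|French|français\naar||aa|Afar|afar\nace|||Achinese|aceh\n"

languages = parse_language_file(SAMPLE)
# Not every language has a two-letter code, so index only those that do.
by_alpha2 = {lang.alpha2: lang for lang in languages if lang.alpha2}
```

Keeping the three-letter code as the primary key seems the safer choice, since under this assumed layout every record has one while the two-letter code can be blank.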

Country codes are, again, seemingly simple: the ISO 3166 Maintenance Agency maintains the ISO 3166 list of codes, and provides a downloadable file of name:code pairs for free.
The interesting data source, and again a “later version” enhancement, is Appendix D of the CIA World Factbook, which links each ISO 3166 code with the corresponding top-level domain (TLD).
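
A similar sketch for the country list, again in Python. It assumes one name:code record per line with a two-letter code; the separator is a parameter because the real download may well use something other than a colon. `parse_country_file` and `COUNTRY_SAMPLE` are hypothetical names used only for this example.

```python
def parse_country_file(text: str, separator: str = ":") -> dict[str, str]:
    """Return a mapping of two-letter country code to country name."""
    codes = {}
    for line in text.splitlines():
        # Split on the LAST separator, in case a name contains one itself.
        name, found, code = line.strip().rpartition(separator)
        code = code.strip().upper()
        # Skip headers, blank lines and anything without a two-letter code.
        if not found or len(code) != 2 or not code.isalpha():
            continue
        codes[code] = name.strip()
    return codes


# Hypothetical sample in the assumed name:code layout.
COUNTRY_SAMPLE = "Country Name:Code Element\nCANADA:CA\nFRANCE:FR\nKOREA, REPUBLIC OF:KR\n"

countries = parse_country_file(COUNTRY_SAMPLE)
```

With both lookups loaded, a query such as "French-speaking repositories in Canada" comes down to filtering repository records on a language code and a country code.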

