I have tried to load this file in a variety of ways (ArcGIS, Excel, DBF Viewer and through code) and cannot seem to get the correct encoding.
For example, I get a row with ‘name’ = Pjrvi when the expected result is Pääjärvi.
Does anyone else see this or do you see the correct value? Any ideas about what I could do to correctly load the file?
Thanks,
Brett
The trouble is, it seems (based on a bug report I made a long time ago) that the Natural Earth input scripts which are used to prepare the data ignore any character outside the ASCII range.
I pointed out this bug but was told (I think) that there was no easy way to fix it, which must surely be wrong. I now can’t find my bug report, but I found a note here (https://www.naturalearthdata.com/blog/natural-earth-version-1-4-release-notes/) that one of my suggestions had been implemented – but not, it seems, the general solution.
So somehow we have to persuade Natural Earth to accept the existence of non-ASCII characters in placenames.
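As a rough illustration of why no choice of encoding on the reader's side can fix this: dropping every non-ASCII character from the expected name reproduces the reported value exactly, whereas a plain decoding mix-up would leave visible garbage instead of a shorter string. This is only a sketch of the suspected behaviour, not the actual Natural Earth script, and strip_non_ascii is a hypothetical helper name.

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character outside the ASCII range (hypothetical helper)."""
    return text.encode("ascii", errors="ignore").decode("ascii")


expected = "Pääjärvi"

# Suspected behaviour: non-ASCII characters are silently discarded.
stripped = strip_non_ascii(expected)

# For contrast: UTF-8 bytes wrongly read as Latin-1 keep every byte,
# so the result is longer mojibake rather than a shortened name.
misdecoded = expected.encode("utf-8").decode("latin-1")

print(stripped)
print(misdecoded)
```

If the value in the file looks like the first line printed, the characters were lost before the file was written and have to be restored at the source.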
The ne_10m_populated_places file has similar issues in version 3.0.0 as of 6/5/2014.
ne_10m_admin_1_provinces_and_states 3.0.0 DOES HAVE correct UTF-8 encoding for non-ASCII names, but ne_10m_populated_places DOES NOT. Seemingly the code to do the right thing exists, but care was not taken when some of the files, such as lakes and populated places, were produced.
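One way to compare files like these is to look at the raw bytes of a name field and check whether they are valid UTF-8 that actually contains non-ASCII characters. The sketch below assumes the raw bytes have already been pulled out of the DBF by whatever reader is in use; classify_name_bytes and its return labels are hypothetical, and an "ASCII only" result is a hint rather than proof that characters were stripped.

```python
def classify_name_bytes(raw: bytes) -> str:
    """Roughly classify the bytes of one name value (hypothetical helper)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes from some other encoding, e.g. Latin-1 or a Windows code page.
        return "not valid UTF-8"
    # str.isascii() requires Python 3.7 or newer.
    return "ASCII only" if text.isascii() else "UTF-8 with non-ASCII"


# Sample values built by hand, not read from the Natural Earth files.
samples = {
    "utf8": "Pääjärvi".encode("utf-8"),
    "stripped": b"Pjrvi",
    "latin1": "Pääjärvi".encode("latin-1"),
}

for label, raw in samples.items():
    print(label, "->", classify_name_bytes(raw))
```

Running this over every name in a file and counting the labels gives a quick picture of which files kept their non-ASCII characters.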