Cities and cities.py

A colleague pointed me to the GNU miscfiles cities database after I posted geolocation and path cross, suggesting that it would be a useful database to support. Since it includes nearly five hundred places around the globe, and I already have it installed, I have to agree.

GNU miscfiles is a package of, well, miscellaneous files. Amongst other things it contains lists of world currencies and languages, and the file we’re looking at today: cities.dat.

In v1.4.2, the version I have installed, cities.dat contains 497 entries. The file is a simple flat Unicode database with records separated by // lines, a format as well suited to processing with awk as with Python.

ID          : 315
Type        : City
Population  :
Size        :
Name        : Cambridge
Country    : UK
Region     : England
Location    : Earth
Longitude  : 0.1
Latitude   : 52.25
Elevation  :
Date        : 19961207
Entered-By  : Rob.Hooft@EMBL-Heidelberg.DE
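
To show how little is needed to read this format, here is a minimal sketch of a parser in plain Python. It is not the code upoints uses; parse_records() and SAMPLE are hypothetical names chosen for illustration, and the second record is an abbreviated entry built from values that appear later in this post.

```python
# Illustrative sample: the record shown above plus an abbreviated second one.
SAMPLE = """\
ID          : 315
Type        : City
Population  :
Size        :
Name        : Cambridge
Country    : UK
Region     : England
Location    : Earth
Longitude  : 0.1
Latitude   : 52.25
Elevation  :
Date        : 19961207
Entered-By  : Rob.Hooft@EMBL-Heidelberg.DE
//
ID          : 1
Type        : City
Name        : Aberdeen
Region     : Scotland
Longitude  : -2.083
Latitude   : 57.15
"""


def parse_records(text):
    """Return one dict per record, splitting records on ``//`` lines."""
    records = []
    current = {}
    for line in text.splitlines():
        if line.strip() == "//":
            # Separator line: close the current record, if it has any fields
            if current:
                records.append(current)
            current = {}
            continue
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(":", 1)
        # Empty fields such as "Population  :" are stored as None
        current[key.strip()] = value.strip() or None
    if current:
        records.append(current)
    return records


records = parse_records(SAMPLE)
for record in records:
    print(record["Name"], record["Latitude"], record["Longitude"])
```

Every value is kept as a string here; converting coordinates or populations to numbers is left to the caller.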

You don’t need to process the data by hand though: I’ve added a cities module to the upoints tarball that takes care of importing it. When you import the entries with import_locations(), it returns a collection of City objects that are children of the Trigpoint objects defined for Trigpointing and point.py.

On my Gentoo desktop the cities database is installed as /usr/share/misc/cities.dat, and can be imported as simply as:

>>> from upoints import cities
>>> Cities = cities.Cities(open("/usr/share/misc/cities.dat"))

And the imported database can be used in a variety of ways:

>>> print("%i cities" % len(Cities))
497 cities
>>> print("Cities with more than 8 million people")
Cities with more than 8 million people
>>> for city in Cities:
...     # Some entries have no population recorded, so guard the comparison
...     if city.population and city.population > 8000000:
...         print("  %s - %s" % (city.name, city.population))
  Bombay - 8243405
  Jakarta - 9200000
  Moskwa - 8769000
  Sao Paolo - 10063110
  Tokyo - 8354615
  Mexico - 8831079
>>> print("Mountains")
Mountains
>>> for city in Cities:
...     if city.ptype == "Mountain":
...         print("  %s" % city.name)
  Aconcagua
  Popocatepetl

You can recreate the database as a smoke test using the following:

>>> f = open("cities.dat", "w")
>>> f.write("\n//\n".join(map(str, Cities)))
>>> f.close()

Unfortunately the files aren’t directly comparable using diff because of some unusual formatting in the original file, but scanning the diff -w output, which ignores whitespace changes, shows that we have a correct export.
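
If you would rather make that comparison from Python, the following sketch roughly approximates it with the standard library's difflib. It collapses runs of whitespace and drops blank lines before comparing, which is close to, but not identical to, what diff -w does. The helper names and the sample strings are hypothetical and only serve to illustrate the idea.

```python
import difflib


def normalise(text):
    """Collapse whitespace runs on each line and drop lines left empty."""
    lines = (" ".join(line.split()) for line in text.splitlines())
    return [line for line in lines if line]


def whitespace_insensitive_diff(original, exported):
    """Return unified diff lines that remain once whitespace is normalised."""
    return list(difflib.unified_diff(
        normalise(original), normalise(exported),
        fromfile="original", tofile="exported", lineterm=""))


# Illustrative strings standing in for the contents of the two files
original = "ID          : 315\nName        : Cambridge\n Country    : UK\n"
exported = "ID : 315\nName : Cambridge\nCountry : UK\n"
changed = "ID : 315\nName : Oxford\nCountry : UK\n"

# No lines are printed when the files differ only in whitespace
for line in whitespace_insensitive_diff(original, exported):
    print(line)

# A real difference still shows up
for line in whitespace_insensitive_diff(original, changed):
    print(line)
```

An empty result means the export matches the original once whitespace is ignored.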

The City class inherits from Trigpoint, which in turn inherits from Point, and therefore has all the same methods they do. This allows you to calculate distances and bearings between City objects or any other objects derived from the parent classes. For example, you could use the dump_xearth_markers() function:

>>> from upoints.utils import dump_xearth_markers
>>> scottish_markers = dict((x.identifier, x) for x in Cities
...                         if x.region == "Scotland")
>>> print("\n".join(dump_xearth_markers(scottish_markers, "name")))
57.150000 -2.083000 "Aberdeen" # 1
55.950000 -3.183000 "Edinburgh" # 83
55.867000 -4.267000 "Glasgow" # 92
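
For a feel of the distance and bearing calculations mentioned above, here is a self-contained sketch using the haversine formula and the standard initial bearing formula on the Aberdeen and Edinburgh coordinates from that output. It does not use upoints and is not necessarily how upoints computes these values; the function names and the mean Earth radius of 6371 km are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial bearing from the first point to the second, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlambda = math.radians(lon2 - lon1)
    y = math.sin(dlambda) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlambda))
    return math.degrees(math.atan2(y, x)) % 360


# Coordinates taken from the xearth marker output above
aberdeen = (57.150, -2.083)
edinburgh = (55.950, -3.183)

distance = haversine_km(*aberdeen, *edinburgh)
bearing = initial_bearing(*aberdeen, *edinburgh)
print("%.1f km at a bearing of %.0f degrees" % (distance, bearing))
```

Edinburgh lies south-west of Aberdeen, so the bearing falls between 180 and 270 degrees.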

Take a look at the Sphinx-generated documentation that is included in the tarball to see what else can be done.