Geograph British Isles :: Data Dumps


Creative Commons Licence [Some Rights Reserved]
All datasets on this page are © Copyright Geograph Project and licensed for reuse under this Creative Commons Licence.
You are free:
 to Share - to copy, distribute and transmit the work
 to Remix - to adapt the work
under the following conditions:
 Attribution - You must attribute the work in the manner specified by the author or licensor
 Share Alike - If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.

All the following files were created with the mysqldump command, so they should import easily into a MySQL 5.0+ database.
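As a rough illustration of the import step, the sketch below decompresses a dump and hands it to the mysql command-line client. It assumes the `mysql` client is on PATH and reads credentials from an option file; the function names and the database name passed to `import_dump` are hypothetical, not part of the dumps.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def decompress_dump(gz_path: Path, sql_path: Path) -> int:
    """Decompress a .mysql.gz dump to a plain SQL file; return bytes written."""
    with gzip.open(gz_path, "rb") as src, open(sql_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, so large dumps are fine
    return sql_path.stat().st_size


def import_dump(sql_path: Path, database: str) -> None:
    """Feed the SQL file to the mysql client on standard input.

    Assumption: `mysql` is installed and credentials come from an option
    file such as ~/.my.cnf. The database must already exist.
    """
    with open(sql_path, "rb") as sql:
        subprocess.run(["mysql", database], stdin=sql, check=True)
```

Something like `decompress_dump(Path("gridimage_base.mysql.gz"), Path("gridimage_base.mysql"))` followed by `import_dump(Path("gridimage_base.mysql"), "geograph")` would then load one table.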

Links to actual downloads are at the bottom of the page. Don't forget to tell us what you create with the data!

Base tables

Our 'gridimage' table is split into several tables in this dump, as not everybody needs all the fields. They can always be combined back into one big table if required!
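Recombining the split tables could look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in for MySQL and assumes the tables share a gridimage_id key; the title and comment columns are hypothetical, so check the schema files for the real names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down columns; the real ones are in the schema files.
    CREATE TABLE gridimage_base (gridimage_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE gridimage_text (gridimage_id INTEGER PRIMARY KEY, comment TEXT);
    INSERT INTO gridimage_base VALUES (1, 'Example bridge'), (2, 'Example hill');
    INSERT INTO gridimage_text VALUES (1, 'A longer description of the bridge.');
""")

# LEFT JOIN keeps every base row even when a split table has no matching row.
combined = conn.execute("""
    SELECT b.gridimage_id, b.title, t.comment
    FROM gridimage_base AS b
    LEFT JOIN gridimage_text AS t USING (gridimage_id)
    ORDER BY b.gridimage_id
""").fetchall()

for row in combined:
    print(row)
```

The same join pattern extends to gridimage_extra, gridimage_geo and the other split tables, one LEFT JOIN per table.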

Explanation of the columns in the gridimage tables

gridimage_base.mysql.gz
Main table with all active images (includes photographer credit, 4-figure grid reference, and internal and WGS84 coordinates) - this table is probably enough for many uses!
A sample extract for the SH myriad is available (1.5MB) - you will also need the schema before importing.
gridsquare.mysql.gz
The Geograph 'Land Map', including a breakdown of statistics by square.
user_dev.mysql.gz
Table of contributors - including full name and nickname (does NOT contain email or password etc!)

gridimage_extra.mysql.gz
Adds extra Geograph-specific columns, such as date submitted and sequence in grid square
gridimage_geo.mysql.gz
Geographic coordinates in easting/northing for photographer/subject location
gridimage_text.mysql.gz
The long description and category for each image

gridimage_tag.mysql.gz
Tags that users add to images (gridimage_tag is the relation table, tag contains the actual textual tags)

gridimage_snippet.mysql.gz
'Shared Descriptions' attached to many images (gridimage_snippet is the relation table, snippet contains the actual textual data)
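The relation-table layout used for tags and shared descriptions is resolved with a two-table join. The sketch below shows this for tags, again with SQLite standing in for MySQL; the tag_id and tag column names are assumptions, and the snippet tables would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical column names; the real ones are in the schema files.
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE gridimage_tag (gridimage_id INTEGER, tag_id INTEGER);
    INSERT INTO tag VALUES (10, 'bridge'), (11, 'river');
    INSERT INTO gridimage_tag VALUES (1, 10), (1, 11), (2, 11);
""")


def tags_for_image(gridimage_id):
    """Return the textual tags attached to one image, sorted alphabetically."""
    rows = conn.execute(
        "SELECT t.tag FROM gridimage_tag AS gt "
        "JOIN tag AS t ON t.tag_id = gt.tag_id "
        "WHERE gt.gridimage_id = ? ORDER BY t.tag",
        (gridimage_id,),
    )
    return [tag for (tag,) in rows]


print(tags_for_image(1))
```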

gridimage_term.mysql.gz
Textual terms extracted from the description via the Yahoo Term Extraction API. Note that "ORDER BY gridimage_term_id" gives the order in the original description.
gridimage_group.mysql.gz
Automated 'cluster' labels assigned to each image - powered by the Carrot2 clustering engine.
gridimage_size.mysql.gz
Pixel dimensions of the full size image

gridimage_log.mysql.gz
Number of views received to the main photo page for each image.

indexes.mysql
Adds baseline indexes to many of the above tables (they are actually derived tables, so indexes aren't automatically created). NOTE: you will probably still need to create indexes to suit the types of queries you will be running against the data.
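To show why a query-specific index matters, the sketch below compares a query plan before and after adding an index. It uses SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the user_id column on gridimage_base and the index name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gridimage_base "
    "(gridimage_id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO gridimage_base VALUES (?, ?, ?)",
    [(i, i % 50, f"image {i}") for i in range(1, 1001)],
)

query = "SELECT title FROM gridimage_base WHERE user_id = ?"


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (7,))
    return " ".join(row[3] for row in rows)  # column 3 holds the detail text


plan_before = plan(query)  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_base_user ON gridimage_base (user_id)")
plan_after = plan(query)   # the planner can now search via the new index

print(plan_before)
print(plan_after)
```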

also available on request

Derived tables

Included mostly as examples of the types of things that can be calculated from the above data

gridprefix.mysql.gz
The list of the 100x100km myriad squares
user_stat.mysql.gz
Aggregated statistics for users (useful for leaderboards)
category_stat.mysql.gz
Aggregated number of images by category
hectad_stat.mysql.gz
Aggregated statistics by 10x10km hectad square
hectad_complete.mysql.gz
List of completed hectad squares
gridimage_kml_dev.mysql.gz
Breakdown used for creating the KML Superlayer - included only because it is an example hierarchy for images
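The myriad and hectad groupings used by these derived tables can also be computed directly from a 4-figure grid reference. The sketch below assumes the usual convention: the letter prefix is the 100x100km myriad, and the hectad is the prefix plus the first easting digit and the first northing digit. The function name is hypothetical.

```python
import re

# One or two letters (Irish or British grid), then two easting and two northing digits.
_GRIDREF = re.compile(r"^([A-Z]{1,2})(\d{2})(\d{2})$")


def split_gridref(gridref):
    """Return (myriad, hectad) for a 4-figure grid reference such as 'SH6543'."""
    match = _GRIDREF.match(gridref.replace(" ", "").upper())
    if match is None:
        raise ValueError(f"not a 4-figure grid reference: {gridref!r}")
    prefix, east, north = match.groups()
    return prefix, f"{prefix}{east[0]}{north[0]}"


print(split_gridref("SH6543"))
```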

URL formats - so you can link to the relevant page on Geograph

http://www.geograph.org.uk/photo/{gridimage_id}

http://www.geograph.org.uk/profile/{user_id}

http://www.geograph.org.uk/gridref/{grid_reference}
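The three URL patterns above can be filled in from values in the dumps. A minimal sketch follows; the function names are hypothetical, and the values are assumed to come from the gridimage_id, user_id and grid_reference columns.

```python
from urllib.parse import quote

BASE_URL = "http://www.geograph.org.uk"


def photo_url(gridimage_id):
    """Link to the photo page for one image."""
    return f"{BASE_URL}/photo/{int(gridimage_id)}"


def profile_url(user_id):
    """Link to a contributor's profile page."""
    return f"{BASE_URL}/profile/{int(user_id)}"


def gridref_url(grid_reference):
    """Link to the page for a grid square, percent-encoding unsafe characters."""
    return f"{BASE_URL}/gridref/{quote(grid_reference)}"


print(photo_url(12345))
print(gridref_url("SH6543"))
```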

Mirrors

If you would like to host a mirror of this data - please let us know!

Available files, by table (formats: mysql.gz, schema, tsv, tsv.gz)

ALL
For reference only
 schema - 7,065 bytes

category_canonical
 mysql.gz - 10,115 bytes - 2010-11-18 13:52:24
 schema - 318 bytes

category_stat
Aggregated number of images by category
 mysql.gz - 20 bytes - 2015-06-03 04:41:49
 schema - 306 bytes
 tsv.gz - 20 bytes - 2015-06-03 04:41:52

geotrips
The GeoTrips database
 mysql.gz - 20 bytes - 2015-06-03 04:42:07
 schema - 675 bytes

gridimage_base
Base table of all geograph images
 mysql.gz - 20 bytes - 2015-06-03 04:40:25
 schema - 679 bytes
 tsv.gz - 20 bytes - 2015-06-03 04:40:28

gridimage_base_sample
Base table for 10000 latest images
 mysql.gz - 20 bytes - 2015-06-03 04:41:13
 schema - 679 bytes
 tsv - 1,012 bytes - 2012-04-25 16:42:57
 tsv.gz - 20 bytes - 2015-06-03 04:41:16

gridimage_extra
Geograph website specific columns
 mysql.gz - 20 bytes - 2015-06-03 04:40:37
 schema - 381 bytes
 tsv.gz - 20 bytes - 2015-06-03 04:40:40

gridimage_geo
Easting/Northings data for each image
 mysql.gz - 20 bytes - 2015-06-03 04:40:49
 schema - 616 bytes
 tsv.gz - 20 bytes - 2015-06-03 04:40:52

gridimage_group
Images grouped by Cluster label
 mysql.gz - 89,936,133 bytes - 2014-02-22 17:21:34
 schema - 693 bytes

gridimage_kml
Breakdown used for creating the KML Superlayer
 mysql.gz - 15,366,304 bytes - 2012-08-28 14:30:47
 schema - 203 bytes

gridimage_log
Hit numbers on photo page
 mysql.gz - 20 bytes - 2015-06-03 04:42:04
 schema - 318 bytes

gridimage_post
Uses of images in the Forum
 mysql.gz - 20 bytes - 2015-06-03 04:41:25
 schema - 3,915 bytes

gridimage_size
Pixel size for full size images
 mysql.gz - 20 bytes - 2015-06-03 04:42:01
 schema - 230 bytes

gridimage_snippet
Shared Descriptions
 mysql.gz - 20 bytes - 2015-06-03 04:41:22
 schema - 2,627 bytes

gridimage_tag
Tags
 mysql.gz - 40 bytes - 2015-06-03 04:42:07
 schema - 2,354 bytes

gridimage_term
Extracted Terms from each image description
 mysql.gz - 60,999,781 bytes - 2012-12-02 00:17:03
 schema - 208 bytes

gridimage_text
Description/Category columns
 mysql.gz - 20 bytes - 2015-06-03 04:41:01
 schema - 184 bytes
 tsv.gz - 20 bytes - 2015-06-03 04:41:04

gridprefix
A list of the 100x100km myriad squares
 mysql.gz - 20 bytes - 2015-06-03 04:41:46
 schema - 863 bytes

gridsquare
The Geograph landmap
 mysql.gz - 20 bytes - 2015-06-03 04:41:43
 schema - 794 bytes

hectad_complete
List of completed hectads
 mysql.gz - 20 bytes - 2015-06-03 04:41:58
 schema - 403 bytes

hectad_stat
Aggregated statistics for hectads
 mysql.gz - 20 bytes - 2015-06-03 04:41:55
 schema - 878 bytes

indexes
Index definitions - HIGHLY recommended
 schema - 605 bytes

user_dev
Table of Contributors
 mysql.gz - 20 bytes - 2015-06-03 04:40:07
 schema - 187 bytes

user_stat
Aggregated statistics for users
 mysql.gz - 20 bytes - 2015-06-03 04:41:37
 schema - 996 bytes
 tsv.gz - 20 bytes - 2015-06-03 04:41:40

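The tsv.gz files can be read without a database. The sketch below streams rows from a gzip-compressed, tab-separated file; it assumes UTF-8 text and, by default, a header line naming the columns. Neither is confirmed by this page, so pass has_header=False if the files turn out to start with data. The function name is hypothetical.

```python
import csv
import gzip


def read_tsv_gz(path, has_header=True):
    """Yield rows from a gzip-compressed TSV file.

    With has_header=True the first line names the columns and each row is
    a dict; otherwise each row is a list of strings.
    Assumption: the file is UTF-8 and fields are not quoted.
    """
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        if has_header:
            yield from csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        else:
            yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
```

Because the function is a generator, a loop such as `for row in read_tsv_gz("gridimage_base.tsv.gz"):` processes one row at a time rather than loading the whole table into memory.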
Data available from http://data.geograph.org.uk/dumps/ 
Copyright 2012 Geograph Project. 

and released under this Creative Commons Licence: 
http://creativecommons.org/licenses/by-sa/2.0/ 

The individual photos that are used to build this dataset are Copyright the respective Licensors, see the full list of contributors here: 
http://www.geograph.org.uk/credits/

====================

If reproducing this work, you must acknowledge the original author.