I am looking to do an interactive data visualisation of train times and ticket price changes over the last ten years, specifically for the National Express East Anglia Cambridge to Liverpool Street line.

If people have access to, or know where I can get access to, old timetables, ticket price information etc. for 2000 - 2010, then I would love to hear from them.

Big +1 - I'd also love to get either train times or prices (even just for the present). However, I am rather pessimistic :( Is it worth starting to scrape the current data now, so that in 2 years we will have a historical dataset covering the previous 2 years?

(19 Jan '11, 09:57) rgrp ♦♦
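The scraping idea in the comment above boils down to appending a timestamped snapshot on every run, so that a history accumulates over time. A minimal sketch in Python, assuming some scraper (not shown, and not something anyone in this thread has built) already yields fares as dictionaries; the field names in `FIELDS` are hypothetical, not an official schema:

```python
import csv
from datetime import datetime, timezone

# Hypothetical column layout; adjust to whatever the scraper actually returns.
FIELDS = ["captured_at", "origin", "destination", "ticket_type", "price_pence"]


def append_snapshot(out, fares, captured_at=None):
    """Append one timestamped CSV row per fare to an open text file.

    `fares` is a list of dicts with the keys in FIELDS except `captured_at`.
    Returns the number of rows written.
    """
    captured_at = captured_at or datetime.now(timezone.utc)
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    for fare in fares:
        # Every row carries the capture time, so later runs never overwrite
        # earlier ones and the file becomes the historical record.
        writer.writerow({"captured_at": captured_at.isoformat(), **fare})
    return len(fares)
```

Run on a schedule (say, daily) with the output file opened in append mode with `newline=""`, this would build up the kind of dataset that apparently nobody kept for 2000 - 2010.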

Do we need a protocol for 'not what you wanted but maybe related' datasets? For example, whilst idly looking for an answer to this question, I found a dump of London bus route timetables [ http://www.londonbusroutes.net/timetables.htm ], as well as recent historical London bus route timetables [ http://mjcarchive.www.idnet.com/ ]

(20 Jan '11, 10:08) psychemedia ♦♦

@psychemedia: that's definitely an answer :) -- can you repost as one?

(20 Jan '11, 10:15) rgrp ♦♦

Hi rgrp

I have spoken to several people/organisations and the answer is always that this information is not kept.

"Fares data are in the National Fares Manual (published by Rail Settlement Plan, which is a subsidiary of ATOC), which until recently took the form of a set of half a dozen extremely bulky volumes (the size of old-style London telephone directories), which were reissued three times a year. These were not kept for more than a year or two."

It seems that timetables are not kept either. I am currently trying the National Railway Museum Library at the University of York.

Apols - first time I've used this site and I've provided an 'answer' when what I intended to do was provide a comment. GUI wasn't obvious.

I don't have a complete answer to this question, but I can add something to what others have said. Unfortunately getting access to the information is somewhat easier than getting access to a machine-readable form of it. Timetables, for instance, have been generated from databases for a number of years, and many different printed documents are produced from a single underlying data source. (Working timetables, for instance, contain information different from passenger timetables, including the movement of empty trains from place to place.) But collecting efforts have historically concentrated on the printed documents.

The National Railway Museum (NRM) in York has a complete collection going back many years. The British Library will also have them and so will many county and perhaps some local libraries, but the latter two won't necessarily retain older timetables. I'm fairly sure that the NRM has been gearing itself up to preserve databases but I'm not sure whether they're actually doing so yet. So that's the story for print.

ATOC, and individual companies like NXEA, have been providing PDF timetables online for some time. Although they don't provide access to historical versions, web archives do have copies. OK, a PDF is still a long way from a usable datasource, but it's slightly easier to scrape than paper.
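For anyone wanting to look for archived copies of those PDFs, here is a rough sketch of building Internet Archive (Wayback Machine) URLs. The `web.archive.org/web/<timestamp>/<url>` pattern and the CDX search endpoint are the archive's public interfaces; the timetable URL you pass in is your own guess at where the PDF used to live, and the function names are mine:

```python
from urllib.parse import urlencode


def wayback_url(original_url, timestamp):
    """URL that resolves to the capture closest to `timestamp`.

    `timestamp` is a string of the form YYYYMMDD or YYYYMMDDhhmmss.
    """
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


def cdx_query_url(original_url, year_from, year_to):
    """URL of a CDX query listing captures of `original_url` in a year range."""
    params = {
        "url": original_url,
        "from": str(year_from),
        "to": str(year_to),
        "output": "json",  # ask for JSON rows rather than plain text
    }
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params)


# Placeholder address only; substitute the real historical timetable URL.
print(wayback_url("example.com/timetables/table.pdf", "20050101"))
print(cdx_query_url("example.com/timetables/table.pdf", 2001, 2010))
```

Whether a given PDF was actually captured is another question; the CDX listing just tells you which snapshots, if any, exist.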

Fares data is another matter. As with timetables, some bulky printed publications are produced, but they emerge from a database. You can buy the data, but only as part of an application called Avantix Traveller, which is available from TSO. Some earlier versions of it were widely leaked online, but ATOC was fairly aggressive in trying to protect its copyright. I imagine that reverse-engineering the data out of the program wouldn't be hugely difficult, but it probably violates a licence or two. I can't believe that this data wasn't also made available to the Department for Transport, which would bring it within the scope of an FoI request, but they are unlikely to have retained data from 10 years ago.

I've been urging those involved (national archives and others) to collect this data, as opposed to the publications, for over 10 years. I know that the soon-to-be-abolished Railway Heritage Committee took an interest in it when I presented to them in 1998, but it didn't translate into action for some time. The hands-off nature of privatisation meant that much of the information (such as fares data) fell outside the scope of FoI and any effective government action.


See above. This was intended to be a comment, not an answer.

Update: the Internet Archive has snapshots of the National Rail site going back to 2001. These do include PDF timetables, but there's a big BUT. In those days, only timetables corresponding to the old InterCity routes were available as PDF. That includes London<->Norwich, but not London<->Cambridge. The rest were only accessible through online query forms.

(21 Jan '11, 17:19) cziwkga

I think your answer-that-should-have-been-a-comment is actually too long to be a comment. Comments are capped at 600 characters?

(21 Jan '11, 19:20) psychemedia ♦♦

Great answer cziwkga -- definitely an answer not a comment (comments are more clarificatory).

(22 Jan '11, 00:13) rgrp ♦♦

Thanks @rgrp & @psychemedia - wasn't sure of protocol here and whether it's OK to claim to have answered something when you haven't really given someone what they're looking for.

(22 Jan '11, 13:57) cziwkga

I contacted the National Railway Museum (http://www.nrm.org.uk/ResearchAndArchive.aspx) and this was their reply:

*"Thank you for your inquiry regarding Timetables and National Fares Manuals. We do hold copies of both timetables and fares manuals. I am attaching the two smaller databases for you to browse and check whether they are suitable for your requirements.

We also hold a post 1947 collection of timetables (however please note there are some gaps within this collection)

You are welcome to visit Search Engine to carry out your research. You may book in advance of your visit so that items can be ready upon your arrival. Please remember to bring one form of I.D. with you on the day i.e. bank card."*

The attachments appeared to be extracts from a directory of the timetables. I would expect that the database they mention is a catalogue of the printed copies they hold and not the actual data, but at least it exists somewhere in print format.


The Network Rail website appears to have current timetable information, at least in PDF form. There is a zip file on the site that claims to contain the complete current timetable (3,613 pages), but I'm not sure what form it is in.



