This article provides a basic non-technical introduction to the United Kingdom’s electronic local public transport data: The data sources primarily used to produce passenger travel information. It does not cover purely operational data, such as financial, patronage or staff rostering data.
The article is intended to provide a background for anyone wishing to understand how these data sources might be used. It was written to support my commentary on the Implications of Google Transit in the UK. The article first introduces the local public transport sector (primarily bus and rail), then explores the development of different data formats, before summarising data availability.
On this page:
- UK Local Public Transport – An Introduction
- Local Bus Service Information
- Traveline: ATCO-CIF and Locations
- Real-Time Information and Related Standards
- Railway Information
- Data Availability
- Further Reading
UK Local Public Transport – An Introduction
“Local” refers to transport within towns, cities and regions. For buses, local has a statutory definition – essentially, passengers must be able to get off within 15 miles of where they boarded. I will not discuss long-distance “national” services – most coach networks, Inter-City rail and aviation. I have also excluded discussion of unscheduled public transport (taxis and other demand-responsive modes), and services not available to the general public (most health and social services transport, some education transport).
That leaves buses, trains, and modes that are important in specific locations, such as tram, metro, and ferry.
The UK’s bus operations were deregulated in 1986, with most control moving from the public to the private sector. Services in London and Northern Ireland are still controlled by public authorities. There are also a few examples of operators primarily owned (at “arm’s length”) by local authorities, such as in Nottingham. A plethora of smaller operators (those holding fewer than 50 licences to operate vehicles) exist, although these tend to be very limited in their operations (a few specific routes, rather than a whole city). Overall, UK bus services are now characterised by a series of local monopolies or oligopolies (one or a few operators dominating), mostly owned by one of five large (stock market-listed) companies.
Bus operators retain considerable commercial freedom to alter their services and fares, but in practice most local networks change little from year to year. While local authorities contract additional services at the margins (typically evening, rural or “socially necessary” services), the vast majority of passengers travel on buses operated entirely commercially.
Britain’s heavy rail network was “privatised” in the 1990s. Private sector expertise was introduced to operations through the franchising of passenger services, while assets (track, rolling stock) also moved to the private sector. In practice, national government continued to control services and fares (with regional involvement in some conurbations). Since the collapse of Railtrack in 2002, government effectively controls the track too.
Bus and rail may appear to provide similar functions, but passengers differentiate between the two: In crude terms – the wealthier you are, the more likely you are to travel by train; the poorer, the more likely to use the bus. But some large cities have no significant local rail network at all (such as Edinburgh and Bristol), while others have extensive urban rail networks (notably London and Glasgow).
Metro and tram systems can be found in some large cities (the most obvious being London’s underground). Island communities rely on ferries. These modes are important to the places they serve, but represent a small proportion of all public transport journeys.
Local Bus Service Information
The overall provision of information about bus services is commonly a function that is uncomfortably shared between the public and private sectors.
Government (local and national) typically perceives it has a role in providing impartial information about the services of all operators, because logically commercial operators will promote only their own services. In practice only a tiny proportion of journeys will offer a choice of operators. Often where there is choice, competition has matured such that operators advertise the existence of services provided by others.
Data interchange formats tend to have little value to individual operators, who normally only need to know about their own services. Most electronic data formats are consequently created to meet public sector objectives.
Outside of London and Northern Ireland (where bus services are more directly controlled by government), local bus services must (by law) be registered with regional Traffic Commissioners. Traffic Commissioners are appointed by central government. Normally, the only limitation on bus operators changing their services is time – they must give 56 days’ notice. Their registration should be confirmed 28 days before the date of the change. Unlike North America, where timetables generally change on a few fixed days each year, the UK’s bus network is constantly evolving day to day. Except it isn’t – most operators only change their timetables a few times each year. The problem for those intending to use the data is that operators can change services daily.
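The notice arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes both periods are counted back in plain calendar days from the date of change, which is a simplification rather than a reading of the regulations, and the function name and example date are hypothetical.

```python
from datetime import date, timedelta

NOTICE_DAYS = 56   # notice period quoted in the text
CONFIRM_DAYS = 28  # confirmation lead time quoted in the text


def registration_deadlines(change_date: date) -> dict[str, date]:
    """Count back from the change date to the two deadlines (assumed calendar days)."""
    return {
        "latest_notice": change_date - timedelta(days=NOTICE_DAYS),
        "latest_confirmation": change_date - timedelta(days=CONFIRM_DAYS),
    }


deadlines = registration_deadlines(date(2007, 9, 3))
print(deadlines["latest_notice"])        # 2007-07-09
print(deadlines["latest_confirmation"])  # 2007-08-06
```

Anyone consuming the data would therefore need to check for changes far more often than a few times a year, even if most checks find nothing new.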
The information registered about most services is straightforward: The geographical route taken, key points served, the time of arrival/departure at those places, and the dates/days operated. Of course, there are plenty of quirks to fool the unprepared: The figure-of-eight route (that serves the same point three times), operating over a 27-hour day, with a route variation on school-days in February.
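To illustrate the “27-hour day” quirk, here is a minimal Python sketch. It assumes that times after midnight are written with hours of 24 or more, counted from the start of the service day (so 26:30 means 02:30 on the following calendar day). That convention, and the function name, are assumptions made for the example rather than something defined by any format described here.

```python
from datetime import date, datetime, timedelta


def service_time_to_datetime(service_date: date, hhmm: str) -> datetime:
    """Convert a service-day time such as '26:30' to a calendar datetime.

    Hours of 24 or more roll over into the following calendar day.
    """
    hours_text, minutes_text = hhmm.split(":")
    offset = timedelta(hours=int(hours_text), minutes=int(minutes_text))
    midnight = datetime.combine(service_date, datetime.min.time())
    return midnight + offset


print(service_time_to_datetime(date(2007, 2, 5), "26:30"))  # 2007-02-06 02:30:00
```

Treating times as offsets from the service date keeps a late-night journey attached to the day it belongs to, which matters when the days of operation differ (for example, the school-day variation).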
Bus operators do not need to register information about the fares they charge passengers to travel, although drivers were obliged to carry “fare tables” (a matrix of fares between points) on journeys. The result is that data on fares is very sparse, particularly outside of urban areas with simple (flat fare or zone-based) fare structures.
Information about the specifications of vehicles used is also very limited. This poses a problem for those trying to use the information to show services that are accessible to people with mobility (locomotion) disabilities.
Traveline: ATCO-CIF and Locations
Local authorities process the data from registrations. They often create public timetables, journey planners, and other sources of public passenger information. The most important difference between the data held by the Traffic Commissioner and the data held by the local authority is structure: The first is structured for the benefit of the operator, the second for the passenger.
By the mid-1990s, many local authorities operated (or contracted) telephone-based enquiry services, supported by local computer databases of services. However, provision of information was not consistent between areas.
Regional travel information consortia appeared in the late-1990s, under the umbrella of Traveline – a process loosely managed by the Confederation of Passenger Transport (CPT), the bus operators’ industry association, but with significant involvement from local authority officers, notably via the Association of Transport Co-ordinating Officers (ATCO). Traveline was intended to provide a single point of contact for impartial local public transport information, with a single telephone number, which directed callers to the appropriate regional travel centre. The concept of Traveline pre-dates the widespread adoption of the internet in the UK, but each consortium now operates its own regional online journey planner.
From memory, some within the bus operating industry were reluctant to contribute to funding Traveline (funding models vary by region, but operators typically subsidise each call about one of their services), but were perhaps unwilling to antagonise government on the issue. One result was that many operators stopped funding Peter White/Southern Vectis’ production of the Great Britain paper bus timetable. That data now forms the oldest commercial national electronic public transport timetable/journey planner, Xephos. Xephos operates on a paid basis to users, drawing data mainly from published public timetables. Other services similar to Xephos have since been developed commercially, notably PlanaJourney.
Traveline was faced with the problem of gathering data together from many local databases, most of which had not been designed to communicate with one another. The result was an interchange format called ATCO-CIF. This took data structured for the passenger, and placed it in a fixed-width text format. The format is partly extensible (one can add bespoke lines), which has allowed some software vendors to include additional information, such as the geographical points (nodes) along the service’s route. The main benefit of ATCO-CIF is that every commercial journey planner database system can output it. An XML-based specification called JourneyWeb eventually emerged (although I don’t know how widely it has been adopted).
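For readers unfamiliar with fixed-width text formats, the idea can be sketched in Python as below. The record layout here is entirely hypothetical: the field names, record type and column positions are invented for illustration and are not taken from the ATCO-CIF specification.

```python
from typing import NamedTuple


class StopTime(NamedTuple):
    record_type: str
    stop_code: str
    arrival: str
    departure: str


# HYPOTHETICAL layout for illustration only: (field name, start, end).
# These offsets are NOT the real ATCO-CIF column positions.
LAYOUT = (
    ("record_type", 0, 2),
    ("stop_code", 2, 14),
    ("arrival", 14, 18),
    ("departure", 18, 22),
)


def parse_fixed_width(line: str) -> StopTime:
    """Slice each field out of the line by position and trim the padding."""
    fields = {name: line[start:end].strip() for name, start, end in LAYOUT}
    return StopTime(**fields)


# Build a made-up 22-character record: type, padded stop code, two times.
sample = "ST" + "STOP0001".ljust(12) + "0815" + "0817"
record = parse_fixed_width(sample)
print(record.stop_code, record.arrival, record.departure)  # STOP0001 0815 0817
```

In a format of this kind, the first characters of each line usually identify the record type, so a reader can skip bespoke lines it does not recognise; that is one plausible way the extensibility described above can work in practice.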
Location is easy to relate to services when all the information is presented on maps, but the dataset needs to support telephone enquiry systems. Location turned out to be a problem. Some authorities recorded only “timing points” – places where buses were scheduled – while others recorded times at every bus stop along the route. Some assigned the same code to pairs of stops (the same code for each direction), while others gave stops on different sides of the street different identities. And you’d be amazed how many different variations emerge on the words “Bus Station”. Areas where there were no formal stops (many rural areas and suburbs in smaller towns) posed a problem for everyone. The NaPTAN (National Public Transport Access Node) dataset is the result.
But people rarely relate to bus stops when planning journeys. Localities are often referred to using many different names, some of which are only recognised by local people. Concepts such as clustering emerged to group stops together. Data about place eventually developed into the National Public Transport Gazetteer.
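The clustering idea can be illustrated with a short Python sketch that groups stops lying close together. This is a deliberately naive, greedy approach under assumed inputs (hypothetical stop names and coordinates, and an arbitrary 100 metre threshold); it is not a description of how NaPTAN or the Gazetteer actually group stops.

```python
import math

Stop = tuple[str, float, float]  # (name, latitude, longitude)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def cluster_stops(stops: list[Stop], max_distance_m: float = 100.0) -> list[list[Stop]]:
    """Greedy pass: a stop joins the first cluster whose first member is near enough."""
    clusters: list[list[Stop]] = []
    for stop in stops:
        for cluster in clusters:
            anchor = cluster[0]
            if haversine_m(stop[1], stop[2], anchor[1], anchor[2]) <= max_distance_m:
                cluster.append(stop)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([stop])
    return clusters


# Hypothetical stops: two on opposite sides of a street, one about a kilometre away.
stops = [
    ("High Street (northbound)", 52.9500, -1.1500),
    ("High Street (southbound)", 52.9502, -1.1501),
    ("Hospital", 52.9600, -1.1500),
]
clusters = cluster_stops(stops)
print([[name for name, _, _ in cluster] for cluster in clusters])
```

Distance alone is a crude test. As the text notes, what people call a place often matters more than where the stop pole stands, which is why place names needed a dataset of their own.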
Development of TransportDirect started around 2000. It is a central-government project to provide a single portal for passenger travel planning information (both public and private), anywhere in the UK. The overlap between the roles of TransportDirect and regional Traveline online journey planners (which all ostensibly do the same thing, although they may provide slightly different results) was never clearly resolved. TransportDirect drew heavily on Traveline for data about local bus services. Each regional Traveline consortium shared a common format, but problems emerged gathering all the information together. For example, different regions applied slightly different principles to naming and locating places. TransportDirect has played an important role in structuring data so that it can be reliably used for nation-wide systems.
Central government (Department for Transport) sponsored an electronic interchange format for registering local bus services – TransXChange. This format is structured to suit the operator and the Commissioner, that is, for registering bus services. Unlike ATCO-CIF, the format supports precise geographic route plans. It is also far more extensible, so unforeseen quirks in the way operators describe their services can be easily accommodated.
It is worth noting that the pace of adoption of electronic service registration has been rather slow. Most larger operators have been using computer software to schedule (and specifically roster) their services for the last 5-10 years. However, the use of Geographical Information Systems (GIS) lags behind North America – there is still a tendency to describe services by photocopying a map and drawing on it. Many smaller operators are likely to continue to use manual systems for the foreseeable future, due to the low volume of registrations they make.
Real-Time Information and Related Standards
The previous data and standards all relate to scheduled information. However, increasingly buses (vehicles) are being tracked in real time. There are three common sources for this information:
- Vehicle maintenance/tracking or ticketing systems.
- At-stop real-time information displays (“live departure boards”) – and related tracking equipment.
- Urban Traffic Control (UTC/UTMC) and Intelligent Transport Systems (ITS) – buses commonly carry transponders to activate priority measures.
Describing these is outside the scope of this article. Although interchange standards do exist (for example SIRI and RTIG-XML), coverage is currently far from universal, and not all sources are reliable.
For most urban networks, real-time information offers minimal benefits over scheduled information: The aim of most “frequent service” urban networks is for the customer to have the confidence that they can just “turn up”, without having to worry about precise times. Real-time information offers some reassurance about waiting time, but isn’t as essential as knowing a service exists.
Railway Information
Britain’s railways change their timetables twice a year, on fixed dates in December and May. This follows a Europe-wide standard. The process of timetable planning is heavily centralised – access to the track, a key restriction on train scheduling, is governed by a single body, Network Rail. Timetable changes are typically published three months in advance of the fixed date, to allow passengers to make advance bookings. The whole process typically takes over a year. (These relatively long lead times are endemic to the rail industry – everything from driver training to leasing/buying rolling stock/vehicles to opening new terminals (stations/stops) takes longer for rail than for buses.)
Rail timetable information is held centrally in Network Rail’s Train Services Database. This is exported in an interchange format, RJIS-CIF (Rail Journey Information Service Common Interface Format) – sometimes called “ATOC-CIF”, since the Association of Train Operating Companies (ATOC) distribute the data. The format is similar to ATCO-CIF, although the precise specification is different.
ATOC also distribute fares data. Rail fares data is derived as part of a complex centralised process, the Rail Settlement Plan.
ATCO-CIF data can include schedules for local rail services, just as it will often include information about metro/tram systems, ferries, and other less common modes. Local authorities commonly included rail data within their local journey planners.
Data Availability
Rail data is licensed by ATOC, and sold on a commercial basis. At the time of writing a full set of data, updated daily, will cost around £27,000 (about $50,000) a year. Monthly updates are sold for £1,500 ($3,000) a year. In addition, the use of the data has to be approved.
Until Summer 2007, it was not possible to obtain electronic bus data (i.e. ATCO-CIF) unless you were in the public sector, or working for them on a specific local project. The data has, nonetheless, been gathered into one place (the National Public Transport Data Repository, NPTDR) to support the national accessibility planning project. Coverage of NPTDR was historically patchy, particularly for London and Scotland – I presume this has improved over the last year. This data, excluding train services, can now be licensed from Traveline – either at local or national level. Rates and terms have not been quoted to me, but the data is not free.
Update (2010): In September 2010 the National Public Transport Data Repository data became available under a UK Open Government Licence. Note that NPTDR is an annual snapshot for analysis, not suitable for specific journey planning.
Update (2020): The data sources described in this article are now largely historic. Current readers may wish to investigate these open data sources (this is not a comprehensive list, but will get you started):
- Traveline – domestic bus (and tram and ferry) services in Great Britain.
- Department for Transport – bus services in England (the future, under development as of mid-2020).
- Rail Delivery Group – passenger rail in Great Britain.
- Translink – public transport in Northern Ireland.
Further Reading
- Implications of Google Transit in the UK – describes how the UK’s public transport data is being integrated into Google, questions why data is being made available based solely on the business model adopted, explores the real value of this information, and presents a case for the liberalisation of data.
- Public Transport XML Standards – list of most data interchange formats and standards.
- DfT’s TransportDirect documentation – includes recently published (June 2007) research and background documents.
- Traveline Data – background on Traveline data preparation and use.
- Association of Train Operating Companies’ Data Feeds – list of available data feeds and terms.
Notes:
- A “Local Bus Service” within the Transport Act, 1985. Alternative law applies to London and Northern Ireland.
- National Statistics’ “National Travel Survey” data can be used to demonstrate this market differentiation across Britain as a whole.
- Specifically, county councils, unitary authorities, or Passenger Transport Executives (in conurbations).
- For example, Scotrail, the name of the franchise for passenger rail services within Scotland, takes 15 months, and Scotrail is one of the most self-contained (and therefore relatively straightforward to schedule) franchises.
- Journalist Christian Wolmar has some alternative commentary on the relationship between Xephos, Traveline and TransportDirect. He makes some interesting cost comparisons.
- Information from Roger Slevin (Traveline South East).
- Commonly ATCO-CIF files include codes for a range of unusual public transport modes, including Mag-Lev, Horse Tram, and VTOL (Vertical Take-Off and Landing aircraft). I admit, I’m struggling to think of a VTOL local passenger transport operation in the UK.
I (Tim Howgego) have had no formal role in developing any of these standards. However, I was probably the first person outside of the “journey planner industry” to attempt to use much of this data for wider analysis work, for example, relating students to the places of education they can reach by bus. I have also worked extensively with other forms of operational data, and am aware of some of the common data interpretation and structure problems. While the information given here is correct to the best of my knowledge, much is drawn from my over-loaded memory. Feel free to make corrections in the comments below.