2.3 Existing Data Sources
There are several sources of existing traffic operational and safety data available online. Consider existing data for traffic operational and safety analysis to the extent practicable. Prior to beginning any analysis, determine the types of existing data that are already available that can be used for the project versus additional data that needs to be collected. Outlined below are several resources that TxDOT provides, along with references to other websites or tools the user can research for information on data collection.
2.3.1 Statewide Traffic Analysis and Reporting System (STARS II)
TxDOT on average collects 82,000 short-term counts and 1,000 vehicle classification counts statewide annually. In addition, TxDOT has many permanent count locations that collect data 24 hours a day, 365 days a year. This traffic data can be viewed on the STARS II database. A link to the database is found in Appendix C, Section 6 – External References (Reference 1).
TxDOT’s STARS II is a data analytics and reporting platform with detailed traffic data and statistics. It provides an online web interface for viewing annually reported traffic data as well as additional traffic data that is not part of the annually validated data set. Annually reported traffic is submitted to the Highway Performance Monitoring System (HPMS), which has specific conditions set by FHWA regarding the collection and reporting method. HPMS data may be calculated differently than some of the other traffic data on the overall STARS II database. Detailed information on how to use the STARS II database is found in the TxDOT Transportation Planning and Programming Traffic Forecasting Analysis Standard Operating Procedures (SOP) document. Coordinate with the TxDOT project manager to gain access to the latest SOP document.
2.3.2 Other Sources of Traffic Count Data
Agencies may also have their own sets of traffic data. Cities and tolling authorities may have traffic management centers which often collect real time data about the transportation network.
Some MPOs and cities have their own traffic count databases. Cities may have an annual count program, which can be a good resource for obtaining historical counts. Typically, these online databases only include ADT counts and not TMCs. Some MPOs combine counts from multiple sources, such as cities and TxDOT STARS II, and display them in one database. These counts may be used for planning or sketch level (macroscopic) analysis.
Some major cities, TxDOT urban areas, and cities that have traffic management centers may also have ADT or turning movement count data for their roadway facilities. These cities may conduct an annual traffic count program and collect traffic count data on their major roadways. The traffic management centers may also have data related to speed and travel time as part of their Intelligent Transportation System (ITS) network. As of 2024, the suitability and availability of speed and count data are under review by TxDOT.
Tolling agencies may also be able to provide ADT counts for locations adjacent to the toll gantries. This information can be helpful for projects involving toll lanes or managed lanes.
Pedestrian and bicycle data can be obtained through Texas A&M Transportation Institute’s (TTI’s) Texas Bicycle and Pedestrian Count Exchange (BPCX). A link to BPCX is provided in Appendix C, Section 6 – External References (Reference 3). Cities and MPOs may also collect pedestrian and bicycle data. Coordinate with local jurisdictions to see what data is available.
2.3.3 Big Data
“Big data” in general refers to a large volume of information, the different types of data, the real-time nature of the data, and the tools and techniques to manage and analyze it. For transportation and traffic engineering, FHWA defines big data as having at least seven dimensions:
- Volume – the amount of data available;
- Velocity – how quickly the data is generated or gathered;
- Variety – how the data is structured;
- Veracity – how trustworthy the data is;
- Value – how meaningful the data is;
- People – refers to those who process and analyze the data; and
- Governance – how the data is gathered and processed.
Big data has many advantages over traditional methods of collection. The data is readily available through third-party providers and does not involve infrastructure investment. Because the data is collected and stored by the provider, there is no need for local storage. Collecting the data does not involve manual and in-field methods. Data can be analyzed for an entire day, month, or year, not just during peak periods. The flexibility of data collection may enable a deeper understanding of travel patterns and other traffic operations-related issues.
There are some challenges and limitations associated with big data. Due to the automated nature of the data collection, there is a lack of information on field conditions, such as construction, detours, weather, or malfunctioning traffic signals. Typically, the data provider does not share its process for how the data is aggregated, processed, and analyzed, so the user has to rely on the provider’s methods without being able to review them. Other challenges may arise when the defined study area does not correspond one-to-one with the limits defined in the provider’s platform.
Additional detail about providers of big data (e.g., INRIX, Replica, and StreetLight) related to transportation planning and traffic engineering is provided in the sections below. As of 2024, TxDOT has subscriptions to INRIX and Replica, and the subscriptions are available to TxDOT employees and contractors working on TxDOT projects or initiatives. To access this data, consultants and external entities need a TxDOT sponsor and to fill out forms agreeing to terms of use before access is granted. Sample forms are provided in Appendix C, Sections 2-5. Information regarding Replica data access can be found in Appendix C, Section 6 (References 5 and 6). Information regarding INRIX data access can be found in Appendix C, Section 6 (Reference 7).
2.3.3.1 INRIX
INRIX aggregates probe-based data from numerous sources, including crowd-sourced, public, and proprietary data. The data sources include consumer devices (e.g., connected cars, mobile phones), local fleets (e.g., service, delivery), and long-haul trucks. All data collected is GPS based. Historical data availability goes back three years. MOEs reported in the platform include travel times, speeds, congestion, and bottlenecks. INRIX data does not capture stops; therefore, that information is not available. The following data analysis tools are available to users:
- Real-Time Traffic Flow – provides speed and travel time data by roadway segment and is updated every minute for most roadways.
- Roadway Analytics – provides average speed and other data by roadway segment. The data can be organized into 15-minute bins and is available for dates from January 2018 onward.
- Historical Speed Profile – analytical tool that helps the user visualize, monitor, measure, and manage the performance of roadway networks.
- Trip Analytics – a tool that helps provide detailed data on trips such as origin-destination.
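As an illustration of the 15-minute bins mentioned above, the sketch below averages speed observations into 15-minute intervals. It is only a minimal example under stated assumptions: the records are hypothetical (timestamp, speed in mph) pairs from an exported data set, and the function name `bin_speeds_15min` is made up for this example. It does not represent the INRIX platform, its export format, or its internal aggregation method.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def bin_speeds_15min(records):
    """Average speed per 15-minute bin.

    records: iterable of (ISO 8601 timestamp string, speed in mph).
    Returns a dict keyed by the bin start time, in chronological order.
    """
    bins = defaultdict(list)
    for timestamp, speed in records:
        t = datetime.fromisoformat(timestamp)
        # Round the timestamp down to the start of its 15-minute bin.
        bin_start = t.replace(minute=(t.minute // 15) * 15, second=0, microsecond=0)
        bins[bin_start].append(speed)
    return {start: mean(speeds) for start, speeds in sorted(bins.items())}


# Hypothetical observations for one roadway segment (not real data).
sample = [
    ("2023-05-01T07:02:00", 58.0),
    ("2023-05-01T07:09:00", 54.0),
    ("2023-05-01T07:16:00", 41.0),
    ("2023-05-01T07:29:00", 39.0),
]

for start, avg_speed in bin_speeds_15min(sample).items():
    print(start.strftime("%H:%M"), avg_speed)
```

With the sample records, the 07:00 bin averages 56.0 mph and the 07:15 bin averages 40.0 mph. A simple arithmetic mean is used here for clarity; a project may call for a different averaging method.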
2.3.3.2 StreetLight Data
StreetLight data processes anonymized location records from various sources such as smart phones, navigation devices, and trucks. Additional context is added to this data by layering information from other sources such as parcel data and the roadway network. This data is then analyzed and aggregated into normalized travel patterns. Users can access this data via the StreetLight InSight platform.
StreetLight data primarily helps in analyzing origin-destination, annual average daily traffic (AADT) counts, segments, trip lengths, and route choice. StreetLight AADT counts are typically only used if no other data source is available. It is recommended that any traffic counts collected using the StreetLight platform be validated with permanent count stations or ADT where available. If there are no permanent count locations available, refer to temporary count locations or historical counts. Note any major differences between StreetLight counts and those obtained from historical data, STARS II, or other resources.
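One simple way to carry out the validation described above is to compute the percent difference between each platform estimate and the corresponding reference count. The sketch below is a minimal example, assuming hypothetical station names and volumes. The 15 percent threshold is only a placeholder for illustration and is not a TxDOT criterion; the acceptable difference should be agreed on with the TxDOT project manager.

```python
def percent_difference(estimate, reference):
    """Signed percent difference of an estimate relative to a reference count."""
    return (estimate - reference) / reference * 100.0


def flag_major_differences(pairs, threshold_pct):
    """Return (location, percent difference) for pairs that exceed the threshold.

    pairs: iterable of (location, platform estimate, reference count).
    threshold_pct: absolute percent difference treated as "major" (an assumption).
    """
    flagged = []
    for location, estimate, reference in pairs:
        diff = percent_difference(estimate, reference)
        if abs(diff) > threshold_pct:
            flagged.append((location, round(diff, 1)))
    return flagged


# Hypothetical comparisons: (location, platform AADT, permanent station AADT).
comparisons = [
    ("Station A", 21500, 20000),
    ("Station B", 9000, 12000),
]

print(flag_major_differences(comparisons, threshold_pct=15.0))
```

In this example only Station B is flagged, because its estimate is 25 percent below the reference count.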
2.3.3.3 Replica Data
Replica combines anonymized and aggregated data from a variety of sources, including census data, surveys, and location-based services, to create a synthetic vehicular dataset. From this synthetic vehicular dataset, users can derive OD flows, study freight patterns, analyze transit ridership, and perform link analyses, among other features.
2.3.4 TxDOT Open Data Portal
TxDOT’s Open Data Portal is an online GIS database. The portal allows users to explore and download GIS datasets in various data formats. Users utilize queries and filters to view the information on a map on the portal. The database includes information regarding TxDOT’s roadway inventory, mile markers, most congested roadway segments, District boundaries, legislative boundaries, bridges, and more. This data may help in a planning level analysis to get information regarding a corridor or study area. The data is also downloadable and may be used to create graphics and maps to convey information in a customizable way. Some of the information on this portal can also be found on other TxDOT online data sources such as the TxDOT Statewide Planning Map and TxDOT Project Tracker.
2.3.4.1 TxDOT Statewide Planning Map
TxDOT’s Statewide Planning Map contains information on control sections, traffic counts, future traffic estimates, trunk systems, MPOs, and highways. The Statewide Planning Map displays a more selective portion of the information available on the Open Data Portal. It may be used to scan for AADTs on TxDOT roadways, determine congested areas, and find other planning level information.
2.3.4.2 TxDOT Project Tracker
TxDOT’s Project Tracker includes information on planned projects in map format. These projects are differentiated by four phases: construction underway or beginning soon, construction beginning within the next 4 years, construction beginning within the next 5 to 10 years, or projects that are in the corridor study phase or 10+ years out. Additional information is included on the map such as estimated construction cost, funding sources, and type of project.
2.3.5 TxDOT Pavement Management Information System (PMIS)
TxDOT’s PMIS is used for storing, retrieving, analyzing, and reporting pavement conditions for all of TxDOT’s roadways. The PMIS is an internal database and information from the PMIS needs to be requested on a project basis. The request needs to go through the TxDOT Maintenance Division. A data request form for PMIS and a confidentiality agreement are provided in Appendix C, Section 3 – PMIS Data Request Form and Appendix C, Section 4 – PMIS Confidentiality Agreement, respectively.
2.3.6 TxDOT Traffic Management Centers
There may be datasets available through Traffic Management Centers in metro/urban Districts such as Austin, Dallas, El Paso, Fort Worth, and San Antonio. This data typically includes speed by lane, volume per lane, and occupancy per lane. This dataset can be used as a resource for traffic operations and safety analysis, especially for microscopic model development. See Appendix C, Section 7 – Traffic Management Center Data for more information about TxDOT’s Traffic Management Centers. See Appendix C, Section 8 – Lonestar RVSD Detector Locations for the locations of the detectors. See Appendix C, Section 9 – Lonestar RVSD Detector Example Data Output for an example of the Traffic Management Centers data outputs.
2.3.7 Travel Demand Model Outputs (TDM outputs)
Travel Demand Model outputs (TDM outputs) are used for determining future traffic volumes based on changes in land use, population, employment, and other demographic factors. TDMs also incorporate planned and approved future projects. MPOs have a TDM for their specific jurisdiction. TDM outputs may help with analyzing future growth of traffic, providing insight into travel patterns, and identifying areas for traffic improvements.
TxDOT also has a Statewide Analysis Model (SAM), which includes areas outside of the MPO TDM boundaries. The SAM encompasses all the State’s major highways, ports of entry at the border with Mexico, and some freight corridors that tie in with the rest of the United States.
2.3.8 CRIS
Historical crash records are used to perform a safety analysis. Crash data can be obtained from TxDOT’s CRIS database. Crash data available on CRIS are collected from Texas Peace Officer’s Crash Reports (CR-3) and processed by TxDOT. The CRIS database has information related to the location of crashes, type of crash, conditions, date, and other pertinent information. This data can be queried and used in conducting various types of crash analysis. A CRIS database query allows the user to find crashes based on time and location. Once the user determines the location of interest and the range of years, they select attributes from three different fields: crash, unit, and person.
- The crash field has attributes related to roadway type and conditions, crash-related factors, crash severity, pedestrian and bicyclist-related crashes, contributing factors, traffic count, and location.
- The unit field has attributes related to vehicles (e.g., make, model), commercial motor vehicle-specific attributes, and damage associated with the vehicles.
- The person field has attributes related to the people involved in the crash, such as gender, ethnicity, and drug usage (related to the crash).
Users select the different attributes of interest, download the raw data, and analyze it in many ways, such as creating crash heat maps, generating graphs or infographics, identifying crash patterns based on certain criteria, and much more.
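The kind of analysis described above can be sketched with a few lines of code once the raw data has been downloaded. The example below is a minimal sketch only: the records and field names (`crash_id`, `year`, `severity`, `pedestrian`) are hypothetical and do not reflect the actual CRIS export schema, and the helper functions `count_by` and `select` are made up for this illustration.

```python
from collections import Counter

# Hypothetical crash records; field names are illustrative only and are
# NOT the actual CRIS export schema.
crashes = [
    {"crash_id": 1, "year": 2021, "severity": "Fatal", "pedestrian": False},
    {"crash_id": 2, "year": 2021, "severity": "Possible Injury", "pedestrian": True},
    {"crash_id": 3, "year": 2022, "severity": "Not Injured", "pedestrian": False},
    {"crash_id": 4, "year": 2022, "severity": "Possible Injury", "pedestrian": False},
]


def count_by(records, field):
    """Count records by the value of one attribute."""
    return Counter(record[field] for record in records)


def select(records, **criteria):
    """Keep the records that match every attribute=value pair in criteria."""
    return [
        record for record in records
        if all(record[key] == value for key, value in criteria.items())
    ]


print(count_by(crashes, "severity"))
print(count_by(crashes, "year"))
print([record["crash_id"] for record in select(crashes, pedestrian=True)])
```

Counts by attribute support graphs and infographics, while the filtered subsets support identifying crash patterns based on specific criteria.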
The user can also review queried results with preset data processing. These include a query results map, which has some predefined filters and map modes in place, such as:
- Standard view, which shows individual crashes denoted by different crash types;
- Cluster view, which combines crashes by area and shows an overall number of crashes by location; and
- Heat Map view, which shows color-coded crash areas based on density of crashes.
These views are helpful for understanding where there is a higher frequency of crashes and the severities of these crashes, without having to download data or generate maps. The CRIS database also has a category of popular queries, which may be selected for screening and analysis purposes. For projects that need FHWA approval, it is recommended that crash data be obtained through the CRIS helpdesk or TxDOT’s TRF Division. For projects that do not need FHWA approval, crash data may be downloaded through the CRIS website. A link to the CRIS website is shown in Appendix C, Section 6 – External References (Reference 4).
MicroStrategy (MSTR) is business intelligence software connected to the CRIS database. MSTR is used to pull crash data, generate reports, and create dashboards. Access to MSTR can be requested through TxDOT’s TRF Division.
2.3.9 Freight
TxDOT has access to Transearch, which is a tool that aggregates historical truck traffic and projects freight flows up to 30 years in the future. Freight information can be filtered by origin, destination, commodity, and transportation mode. It also has county-level freight movement data.
Transearch data is produced by IHS Markit, a company that oversees the software. To get access to the data, an IHS Markit data release form needs to be filled out and signed (see Appendix C, Section 5 – IHS Markit Data Release Form) and an email with the following information needs to be submitted to TPP:
- Project title;
- Project description;
- TxDOT project manager;
- Recipient company name and address; and
- Name, title, and email address of company representative with signing authority.
2.3.10 Bike and Pedestrian
Bicycle and pedestrian traffic counts are often considered in traffic and safety analysis. If this data is unavailable from state resources, it may be available from the local MPO or city. Additionally, agencies can use the Texas Bicycle and Pedestrian Count Exchange. The Texas Bicycle and Pedestrian Count Exchange is a free statewide database that agencies can use to access bicycle and pedestrian counts. If counts for the location are unavailable, agencies can also submit a request for bike and pedestrian counts to be completed or request count equipment to conduct their own counts through the Bicycle and Pedestrian Counter Loan program. For more information about the loan program, use the online resource shown in Appendix C, Section 6 – External References (Reference 3).
2.3.11 Regional Integrated Transportation Information System (RITIS)
RITIS is a large online database of the latest transportation-related data available from the public and private sectors. It was developed by the University of Maryland’s Center for Advanced Transportation Technology (CATT Lab) in 2006 with the intent to use data integration for improving traffic operations, incident management, and traveler information for the greater Washington, D.C. area. Coordinate with TxDOT TPP Division for access to this data.
There are three main RITIS features that help with analysis: real-time data feeds, real-time situational awareness tools, and archived data analysis tools. RITIS data types are listed below and detailed information regarding each data type can be found on the RITIS website.
- Traffic volume, class, speed, and occupancy;
- Event, work zone, and incident;
- Crowdsourced Waze data;
- Weather;
- Signal status and signal timing;
- Freight (OD for shipments, types of shipments, value, and quantity of goods, etc.);
- Travel time;
- OD;
- Routing (e.g., fastest route, shortest path, turning restrictions); and
- Parking (e.g., location of facilities, space utilization, restrictions).