National Association: Industry Data Warehouse Modernization and Optimization

Company:  

The National Labor Exchange (NLx) is a public-private partnership and electronic labor-exchange network created in 2007 by the National Association of State Workforce Agencies (NASWA) and DirectEmployers Association (DirectEmployers). The NLx provides workforce development professionals, academic researchers, and other organizations that rely on labor market information with high-quality, transparent, current, and historical data that represents the diversity of jobs available in the United States labor market.


Challenge:

NLx received grant funding to modernize its data warehouse in order to improve the reliability, efficiency, and extensibility of the data solutions the organization provides to its stakeholders.

The existing jobs data feed lacked the reliability and scalability required for consistent data capture. Data was locked inside the legacy data warehouse, with no method to share it securely with external users or to add capacity over time. NLx leadership recognized the value of the data to members and labor market researchers but encountered consistent roadblocks to delivering that value. In addition, NLx required a solution that could be maintained efficiently over the long term without allocating internal technical resources or making substantial ongoing investment.


Solution:

The CorrDyn Team:

  • Conducted a needs assessment that included a review of existing code and documentation, interviews with internal and external stakeholders, and presentation of final recommendations for the data pipeline, data warehouse, and external-facing API.
  • Developed the data pipeline to process XML files from S3 using an event-driven approach.
  • Built the serverless data warehouse in Amazon Aurora PostgreSQL to scale data warehouse investment to usage, minimize upfront cost, and maximize scalability to new external users and data sources.
  • Developed the external API interface using a containerized Django application that enabled NLx staff to efficiently manage credential provisioning and de-provisioning and to allocate API user permissions for redacted and unredacted data.
  • Enabled “live” and research-oriented API use cases by developing a synchronous API for fetching small batches of results (for example, to a website providing a data visualization) and an asynchronous API to download large quantities of data in CSV or ndjson format.
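
As an illustration only, the event-driven step and the ndjson output format described above might look something like the following Python sketch. It is not CorrDyn's implementation. The function names and the `<jobs><job>…</job></jobs>` feed layout are hypothetical, chosen for the example; the only detail taken from AWS is the shape of an S3 event notification, where each record carries a bucket name and a URL-encoded object key.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import unquote_plus


def objects_from_s3_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications, so decode them.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def jobs_xml_to_ndjson(xml_text):
    """Convert a hypothetical <jobs><job>...</job></jobs> feed to ndjson.

    Each <job> element becomes one JSON object on its own line, with one
    field per child element. The real NLx feed schema is not shown in this
    case study, so this layout is an assumption.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for job in root.iter("job"):
        row = {child.tag: (child.text or "").strip() for child in job}
        lines.append(json.dumps(row, sort_keys=True))
    return "\n".join(lines)
```

In a setup like this, a new XML file landing in the bucket triggers the handler, which identifies the object and converts its records. Newline-delimited JSON suits large asynchronous downloads because a consumer can process one record per line without loading the whole file.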


Results:

CorrDyn met all of NLx's goals for its data infrastructure, including:

  1. Enabling hundreds of researchers and workforce development professionals to access previously unavailable data for research on questions of interest, such as “What are the largest and fastest-growing occupations?” at the national, state, and local levels.
  2. Providing an interface for internal stakeholders to explore data on an ad hoc basis to assist with strategic conversations with government agencies and stakeholders.
  3. Creating a platform for AI and ML-driven applications built on NLx data in collaboration with external partners.

CorrDyn has continued engaging with NLx stakeholders to deliver value from the data using dashboards, data visualizations, and data enhancements. Questions that were once unanswerable by NASWA staff can now be answered immediately. Labor market researchers and professionals around the country can gain real-time access to NLx’s extensive dataset and query it immediately via the NLx API.


Testimonial:

"CorrDyn has been an extremely valuable partner in realizing the potential of our data and have exceeded all expectations on understanding our needs and delivering on time and on budget. They are always available for strategic conversation and present multiple options to address future needs." – Charlie Terrell, NLx Director, NASWA