The O*NET database, created by the U.S. Department of Labor, is a treasure trove of information about jobs in the U.S. It has many details about different occupations, including the skills you need, what the work involves, and the environment you'll be working in.
This database is essential to the Occupational Outlook Handbook (OOH), published by the Bureau of Labor Statistics. The OOH uses data from O*NET to provide a clear picture of various careers. It tells you what you'll be doing in a job, what education or training you'll need, how much you can expect to earn, and what the job market looks like. Thanks to O*NET, the OOH provides reliable and up-to-date information that helps job seekers, students, and career counselors make informed decisions about career choices and development.
I will use the O*NET database to create MNavigator since it is freely available and challenging to reproduce independently. As of this writing, the current version is 28.3; the download link is listed under Data Sources at the end of this post. I will use free tools to build the database to make it easier to follow, even though I usually use DataGrip to work with databases.
Although the database is available as text files and Excel files, its real power comes when you load it into a relational database. I am using PostgreSQL because it is my favorite, but you could use MySQL, SQL Server, or Oracle. I will provide a step-by-step guide to loading the database and figuring out the queries, but I don’t want to bore non-coders, so I will skip over the particulars in this article.
If you click on the All Files tab on the download page, you can download a single file that includes all the tables in the database. There is also a data dictionary that describes each table, and each field in each table, in depth.
I don’t want to bore you with all the poking and prodding I had to do to understand the database structure. It has a dizzying amount of detail, but after a couple of hours of noodling around with it, it made a lot of sense.
If you like databases but have never touched a database with FORTY tables, I invite you to dig into the structure of the O*NET database. Studying it was a master class in modeling incredibly intricate data. However, even as detailed as this database is, it is not enough on its own. We will also need the Occupational Employment and Wage Statistics data and the Projections Central database; links to both are listed under Data Sources at the end of this post.
Once I had the database loaded and understood it, it was time to review the existing job profiles to determine what data I could get by querying the database and what I would need to generate using the OpenAI API.
Boiling down a random job profile, we end up with 8 sections: Summary, What They Do, Work Environment, How to Become One, Pay, Job Outlook, State & Area Data, and Similar Occupations. For each of these sections, I had to look at what they showed to figure out how and where I could get the data.
The Summary section has several high-level statistics for the job profile, including Median Pay, Entry-Level Education, Work Experience, On-the-Job Training, Number of Jobs, Job Outlook, and Employment Change. These are available in the O*NET database and can be easily retrieved with a few SQL queries.
The What They Do section has a one- or two-sentence description of the position, followed by several bullet points highlighting its duties, followed by a longer multi-paragraph description. The bullet points can be pulled from the O*NET database, but the two narrative sections must be generated with the OpenAI API.
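For those who want a peek at what pulling the duty bullet points might look like, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL so it runs anywhere, and the table and column names (task_statements, onetsoc_code, task_id, task, task_type) are assumptions modeled on the O*NET download; check them against the data dictionary for your version. The rows are invented placeholders, not real O*NET data.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL so the sketch is self-contained.
# Table and column names are assumptions; verify them in the data dictionary.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task_statements (
        onetsoc_code TEXT NOT NULL,
        task_id      INTEGER NOT NULL,
        task         TEXT NOT NULL,
        task_type    TEXT
    )
    """
)

# Placeholder rows invented for illustration only.
conn.executemany(
    "INSERT INTO task_statements VALUES (?, ?, ?, ?)",
    [
        ("00-0000.00", 1, "Example core duty A", "Core"),
        ("00-0000.00", 2, "Example supplemental duty", "Supplemental"),
        ("00-0000.00", 3, "Example core duty B", "Core"),
    ],
)


def duty_bullets(conn, onetsoc_code):
    """Return the core task statements for one occupation, ordered by task ID."""
    rows = conn.execute(
        """
        SELECT task
        FROM task_statements
        WHERE onetsoc_code = ? AND task_type = 'Core'
        ORDER BY task_id
        """,
        (onetsoc_code,),
    ).fetchall()
    return [row[0] for row in rows]


bullets = duty_bullets(conn, "00-0000.00")
for bullet in bullets:
    print("- " + bullet)
```

Filtering on the task type is a design choice for this sketch; whether to show only core tasks or all of them depends on how long you want the bullet list to be.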
The Work Environment section is limited to two classes of statistics: the number of jobs and the percentage of those jobs in different industries. Both are available in the O*NET database.
The How to Become One section has short narrative segments covering the education needed to get the job, the work experience needed, and other important qualities. All of the data are available in the O*NET database; however, the OpenAI API will probably be needed to generate the narrative text that ties it all together.
The Pay section has various stats, including the median pay with the pay range, the median income in multiple industries, and, for comparison, the median salary for all occupations. This data is available in the wagedata table.
The Job Outlook section includes the number of current positions, the rate of change of the number of positions, and the projections for 5 and 10 years. The number of positions and the rate of change are available from the wagedata table, but the projections must be pulled from the Projections Central database.
The State and Area Data section includes the number of jobs, pay range, and projections for states and the U.S. as a whole. This information can be found in a combination of the wagedata tables and the Projections Central database. Additionally, it is possible to pull the number of jobs and pay range by metropolitan area.
The Similar Occupations section includes jobs closely related to the requested job profile. This information is included in a table in the O*NET database.
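As a rough illustration of that lookup, the sketch below joins a related-occupations table back to the occupation titles. Again, SQLite stands in for PostgreSQL, and the names used (occupation_data, related_occupations, related_onetsoc_code, related_index) are assumptions to verify against the data dictionary. The occupations shown are made-up placeholders.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; schema names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE occupation_data (
        onetsoc_code TEXT PRIMARY KEY,
        title        TEXT NOT NULL
    );
    CREATE TABLE related_occupations (
        onetsoc_code         TEXT NOT NULL,
        related_onetsoc_code TEXT NOT NULL,
        related_index        INTEGER NOT NULL
    );
    """
)

# Placeholder rows invented for illustration only.
conn.executemany(
    "INSERT INTO occupation_data VALUES (?, ?)",
    [
        ("00-0000.01", "Example Occupation A"),
        ("00-0000.02", "Example Occupation B"),
        ("00-0000.03", "Example Occupation C"),
    ],
)
conn.executemany(
    "INSERT INTO related_occupations VALUES (?, ?, ?)",
    [
        ("00-0000.01", "00-0000.03", 1),
        ("00-0000.01", "00-0000.02", 2),
    ],
)


def similar_occupations(conn, onetsoc_code, limit=5):
    """Return titles of related occupations, closest match first."""
    rows = conn.execute(
        """
        SELECT o.title
        FROM related_occupations AS r
        JOIN occupation_data AS o
          ON o.onetsoc_code = r.related_onetsoc_code
        WHERE r.onetsoc_code = ?
        ORDER BY r.related_index
        LIMIT ?
        """,
        (onetsoc_code, limit),
    ).fetchall()
    return [row[0] for row in rows]


similar = similar_occupations(conn, "00-0000.01")
print(similar)
```

The same query shape should carry over to PostgreSQL with only the parameter placeholder style changing, depending on the driver you use.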
There are four additional datasets I will be using to build this application: Career Videos, Certifications, Professional Associations, and Occupational Licenses. All of them are available from CareerOneStop’s developer site, linked under Data Sources at the end of this post. Using these with the OpenAI API will allow me to include text in each profile about the certifications, licenses, and association memberships that can improve a candidate's chances of landing the job.
For those who want to get their hands dirty, I will be posting a YouTube video that provides a walkthrough of combining these sources into a single database and retrieving the individual data points from the database.
In the next post, I will discuss using the OpenAI API to complete our project. AI-enabling your application is far simpler and slightly more complicated than most people think. Stay tuned; it’s going to be fun.
Data Sources
----------------------------------------------------
O*NET 28.3 Database - https://www.onetcenter.org/database.html#individual-files
Occupational Employment and Wage Statistics - https://www.bls.gov/oes/
Projections Central - https://www.projectionscentral.org/
CareerOneStop - https://www.careeronestop.org/
CareerOneStop Data Downloads - https://www.careeronestop.org/Developers/Data/data-downloads.aspx
Special thanks to Hamrawit Tesfa for the idea about the career videos.