LinkedIn Job Postings Analysis

An analysis of more than 15,000 LinkedIn job postings using Apache Spark, Plotly, and Streamlit. It focuses mainly on open job counts and monthly salaries, which can be filtered per state or viewed for the whole country (US). Some insights focus on companies rather than states or areas.
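As a rough illustration of the per-state view described above, here is a minimal sketch in plain Python. It is not the project's actual Spark code: the sample rows, the `state` and `yearly_salary` field names, the yearly-to-monthly conversion, and the use of the median are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample rows; the real dataset's column names may differ.
postings = [
    {"state": "CA", "yearly_salary": 120000},
    {"state": "CA", "yearly_salary": 96000},
    {"state": "NY", "yearly_salary": 108000},
    {"state": "NY", "yearly_salary": None},  # salary not disclosed
    {"state": "TX", "yearly_salary": 84000},
]


def summarize_by_state(rows, state=None):
    """Return {state: (open_job_count, median_monthly_salary_or_None)}.

    Pass `state` to restrict the summary to one state, or leave it as
    None to summarize every state in the data.
    """
    counts = defaultdict(int)
    monthly_salaries = defaultdict(list)
    for row in rows:
        if state is not None and row["state"] != state:
            continue
        counts[row["state"]] += 1
        # Postings without salary data still count as open jobs.
        if row["yearly_salary"] is not None:
            # Assumed conversion: yearly salary divided by 12.
            monthly_salaries[row["state"]].append(row["yearly_salary"] / 12)
    return {
        s: (counts[s], median(monthly_salaries[s]) if monthly_salaries[s] else None)
        for s in counts
    }
```

Calling `summarize_by_state(postings)` summarizes every state, while `summarize_by_state(postings, state="TX")` restricts the result to a single state.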

Why the US? Why not Indonesia?

I tried scraping Indonesian job data for a few weeks, but there still wasn't much to analyze. Most of my graphs rely on salary data, and most Indonesian companies are not that open about it yet. Out of the 5,000 job postings I scraped, only 1 provided salary data (I wish I was joking). Therefore, I decided to use US data instead.

Setup

For setting up Apache Spark and other requirements (without Docker), please see SETUP.md.

Note that if you don't follow the setup above, Apache Spark can still be installed with Python pip and run locally. However, you won't be able to connect to Metabase, and only a single local Spark instance can be used (which has no advantage over Pandas).
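For the pip-only route, a local session might be created as sketched below. This assumes `pip install pyspark` and a Java runtime on the machine; the helper names `local_spark_config` and `create_local_session` and the app name are hypothetical and not part of this repository.

```python
def local_spark_config(app_name="linkedin-job-postings", cores="*"):
    """Build Spark settings for local mode (no standalone cluster).

    `cores` is the number of local worker threads; "*" means one per
    available CPU core.
    """
    return {
        "spark.master": f"local[{cores}]",
        "spark.app.name": app_name,
    }


def create_local_session(config):
    """Create (or reuse) a SparkSession from a dict of settings."""
    # Imported here so the config helper works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in config.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Example usage (requires pyspark and Java):
# spark = create_local_session(local_spark_config(cores=2))
```

In local mode everything runs inside one process on one machine, which is why this route offers little over Pandas for a dataset of this size.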

Plotly Visualization

To get the most out of the graphs, it's recommended to run the notebook file directly, since the graphs are interactive. GitHub doesn't render interactive graphs, so I export them as images as a workaround.

Here are a few example graphs. Alternatively, you can view all of them on my Streamlit Cloud.

Salary Distribution Based on Experience Level

Monthly Salary (USD) Based on Industry

Job Opportunity Based on Job Title

Average Applicants Based on Job Title

Companies That Pay the Most

The Most Common Types of Job Benefits

Salary Distribution of Data Related Jobs

Most Desired Skills for Applying Jobs

Job Opportunity Based on Employment Type

Dataset

Arsh Kon's LinkedIn Job Postings dataset (v8, 2023).

