How to scrape LinkedIn jobs in Python (and the API that replaces it)
A real Python walkthrough of LinkedIn's guest jobs endpoint with requests and BeautifulSoup - parsing cards, pulling job detail, and why it breaks in production. Then the API where LinkedIn is the main source.
Eng team
Engineering
LinkedIn has the deepest job graph on the internet and the most hostile surface to scrape it from. This post is the code-level version: a real Python walkthrough of the one semi-stable endpoint, how to parse it, and where it falls over in production. For the higher-level overview of methods, tools, and legal posture, see our guide to scraping LinkedIn jobs.
The guest jobs endpoint
You do not need to touch the logged-in app. LinkedIn exposes a logged-out “guest” jobs API that returns server-rendered HTML fragments of job cards - no auth, no JSON, just HTML you parse:
https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search
?keywords=python+developer
&location=United+States
&start=0

The start parameter pages in increments of 25. Each response is a list of <li> elements, one per job, carrying the title, company, location, and a link to the full posting.
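As a minimal sketch of the paging described above, the URL for each page can be built with the standard library. The helper name search_url and the PAGE_SIZE constant are illustrative, not part of any LinkedIn client; the page size of 25 is the increment stated above.

```python
from urllib.parse import urlencode

SEARCH = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
PAGE_SIZE = 25  # increment of the start parameter, as described above


def search_url(keywords, location, page):
    """Build the guest search URL for a zero-based page number (hypothetical helper)."""
    params = {"keywords": keywords, "location": location, "start": page * PAGE_SIZE}
    # urlencode encodes spaces as "+", matching the example URL.
    return f"{SEARCH}?{urlencode(params)}"


for page in range(3):
    print(search_url("python developer", "United States", page))
```

The three printed URLs differ only in start, which takes the values 0, 25, and 50.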
A working scraper
import time

import requests
from bs4 import BeautifulSoup

SEARCH = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"}


def scrape_linkedin_jobs(keywords, location, pages=3):
    jobs = []
    for page in range(pages):
        # The start parameter pages in steps of 25.
        params = {"keywords": keywords, "location": location, "start": page * 25}
        resp = requests.get(SEARCH, params=params, headers=HEADERS, timeout=20)
        if resp.status_code != 200:
            print(f"stopped at page {page}: HTTP {resp.status_code}")
            break
        soup = BeautifulSoup(resp.text, "html.parser")
        cards = soup.select("li")
        if not cards:
            break  # an empty page means there are no more results
        for card in cards:
            title = card.select_one("h3.base-search-card__title")
            company = card.select_one("h4.base-search-card__subtitle")
            link = card.select_one("a.base-card__full-link")
            base = card.select_one("div.base-card")
            if not title:
                continue  # skip <li> elements that are not job cards
            # Use .get() so a missing attribute yields None instead of a KeyError.
            href = link.get("href") if link else None
            urn = base.get("data-entity-urn") if base else None
            jobs.append({
                "title": title.get_text(strip=True),
                "company": company.get_text(strip=True) if company else None,
                "url": href.split("?")[0] if href else None,
                "job_id": urn.split(":")[-1] if urn else None,
            })
        time.sleep(2)  # pause between pages to keep the request rate low
    return jobs


for job in scrape_linkedin_jobs("python developer", "United States"):
    print(job)

The data-entity-urn attribute holds a value like urn:li:jobPosting:3741290021 - the trailing number is the job ID you use to pull full detail.
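A slightly stricter way to pull the ID out of that URN is sketched below. It assumes every job card uses the urn:li:jobPosting:<digits> shape shown above, which this post does not guarantee; the helper name job_id_from_urn is illustrative.

```python
import re

# Assumption: job URNs always look like urn:li:jobPosting:<digits>.
URN_PATTERN = re.compile(r"^urn:li:jobPosting:(\d+)$")


def job_id_from_urn(urn):
    """Return the numeric job ID as a string, or None for any other URN shape."""
    match = URN_PATTERN.match(urn or "")
    return match.group(1) if match else None


print(job_id_from_urn("urn:li:jobPosting:3741290021"))  # 3741290021
print(job_id_from_urn("urn:li:company:1234"))           # None
```

Returning None for an unexpected shape, rather than whatever follows the last colon, keeps a non-job URN from being passed to the detail endpoint as if it were a job ID.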
Pulling the full job detail
The search endpoint gives you the card; a second guest endpoint gives you the description, seniority, and employment type:
def get_job_detail(job_id):
    url = f"https://www.linkedin.com/jobs-guest/jobs/api/jobPosting/{job_id}"
    resp = requests.get(url, headers=HEADERS, timeout=20)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    desc = soup.select_one("div.show-more-less-html__markup")
    # Criteria values (seniority, employment type, and so on) in page order.
    criteria = [
        c.get_text(strip=True)
        for c in soup.select("span.description__job-criteria-text")
    ]
    return {
        "description": desc.get_text(" ", strip=True) if desc else None,
        "criteria": criteria,
    }


print(get_job_detail("3741290021"))

Why this breaks in production
- Rate limiting. The guest endpoint tolerates a trickle of traffic. Push it and you get 429 responses, then IP blocks. Real coverage needs a rotating residential proxy pool.
- It is a subset. The guest API exposes a fraction of what the logged-in search shows, with coarser filters. Scraping the authenticated app means real accounts, and LinkedIn bans accounts used for automation - this is the most aggressively defended target in the space.
- Markup drift. The class names above (base-search-card__title and friends) change, and the parser breaks silently when they do.
- Terms of Service. Automated access is against LinkedIn’s User Agreement. The hiQ v. LinkedIn line of cases is about the Computer Fraud and Abuse Act and public data, not LinkedIn’s contract - so a terms breach is its own risk for a commercial product. Not legal advice.
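Two of the failure modes above, 429 responses and silent markup drift, can at least be made visible in code. The sketch below is one possible approach, not a fix: it retries on 429 with exponential backoff and jitter, and raises when most cards on a page fail to parse. The names fetch_with_backoff, check_parse_health, and MarkupDriftError, and every threshold, are assumptions chosen for illustration.

```python
import random
import time


def fetch_with_backoff(fetch, retries=4, base=2.0, cap=60.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 429 or retries run out.

    fetch is any zero-argument callable returning an object with a
    status_code attribute, for example a lambda wrapping requests.get.
    """
    for attempt in range(retries + 1):
        resp = fetch()
        if resp.status_code != 429 or attempt == retries:
            return resp
        # Exponential backoff, capped, with jitter so retries do not align.
        delay = min(cap, base * 2 ** attempt)
        sleep(delay * random.uniform(0.5, 1.0))


class MarkupDriftError(RuntimeError):
    """Raised when cards were returned but too few of them parsed."""


def check_parse_health(cards_seen, jobs, min_ratio=0.5):
    """Fail loudly if fewer than min_ratio of the cards yielded a title and job_id."""
    if cards_seen == 0:
        return  # an empty page is the normal end of results, not drift
    parsed = sum(1 for job in jobs if job.get("title") and job.get("job_id"))
    if parsed / cards_seen < min_ratio:
        raise MarkupDriftError(
            f"parsed {parsed} of {cards_seen} cards; selectors may be stale"
        )
```

In the scraper above, the request could be wrapped as fetch_with_backoff(lambda: requests.get(SEARCH, params=params, headers=HEADERS, timeout=20)), and check_parse_health could run once per page with the number of <li> elements and the jobs parsed from them. Backoff does not lift an IP block, and the health check only detects drift; neither removes the underlying limits.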
The API that replaces it
LinkedIn is JobsPipe’s primary source. We run the collection, proxying, and parsing once, for everyone, and serve LinkedIn postings through the same normalized endpoint as every other source - so you write zero scraping code:
curl https://api.jobspipe.dev/v1/jobs/search \
  -H "Authorization: Bearer jp_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "job_title_or": ["python developer"],
    "job_country_code_or": ["US"],
    "posted_at_max_age_days": 7,
    "limit": 25
  }'

Same record shape from every source - title, company, normalized location, parsed compensation, seniority, posted_at, and an apply_url - de-duplicated across sources, with no proxies and no banned accounts. The free tier is 5,000 requests per month.
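For readers staying in Python, the curl call above can be sketched with the standard library. The URL, headers, and body fields are copied from the curl example; the function names are illustrative, and because the response envelope is not shown in this post, search_jobs simply returns the decoded JSON without assuming its structure.

```python
import json
import urllib.request

API_URL = "https://api.jobspipe.dev/v1/jobs/search"


def build_search_request(api_key, titles, countries, max_age_days=7, limit=25):
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "job_title_or": list(titles),
        "job_country_code_or": list(countries),
        "posted_at_max_age_days": max_age_days,
        "limit": limit,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search_jobs(api_key, titles, countries, **kwargs):
    """Send the request and return the decoded JSON body as-is."""
    req = build_search_request(api_key, titles, countries, **kwargs)
    with urllib.request.urlopen(req, timeout=20) as resp:
        return json.load(resp)


# Building the request does not touch the network; search_jobs does.
req = build_search_request("jp_live_your_key_here", ["python developer"], ["US"])
print(req.get_method(), req.full_url)
```

Calling search_jobs("jp_live_your_key_here", ["python developer"], ["US"]) with a real key would send the request.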
Related research
- Job scraper: the build-vs-buy guide for 2026
- Where to get job posting data in 2026: 7 sources compared
- The 5 best jobs APIs in 2026 - comparison, pricing, and coverage