# Enrichments

In addition to the raw job data, the Fantastic.jobs API includes several enrichments. Our goal is to provide you with as many useful job and company enrichments as possible.

## `locations_derived`

Some ATS platforms are more diligent than others in keeping their location data accurate and structured. To create consistency between sources, we run every job's location through a normalization step that returns values in a consistent `city/county, region, country` format.

The normalization is powered by [Geoapify](https://apidocs.geoapify.com/playground/geocoding/#autocomplete), which is built on OpenStreetMap data.

If you already maintain your own location-normalization pipeline, you can use the raw `locations` and `locations_alt` fields instead of `locations_derived`.
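The choice between derived and raw fields can be sketched as a small helper. This is a minimal sketch, assuming each of the three fields is a list (or missing / `null`); the function name `pick_locations` and the sample values are illustrative and not part of the API.

```python
from typing import Any


def pick_locations(job: dict[str, Any], prefer_derived: bool = True) -> list[Any]:
    """Return the first non-empty location field, in preference order."""
    derived_first = ("locations_derived", "locations", "locations_alt")
    raw_first = ("locations", "locations_alt", "locations_derived")
    for field in (derived_first if prefer_derived else raw_first):
        value = job.get(field)
        if value:  # skips missing keys, None, and empty lists
            return list(value)
    return []


# Illustrative job row; the values are made up for this example.
sample_job = {
    "locations": ["NYC"],
    "locations_alt": None,
    "locations_derived": ["New York, New York, United States"],
}
```

With `prefer_derived=False`, the helper falls back to `locations_derived` only when both raw fields are empty, which suits a pipeline that does its own normalization.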

## `include_ai`

We run an LLM over each job's description, title, location, and several other fields (including some job page elements that we don't return in the API) to extract structured insights: salary, benefits, experience level, detailed remote/work-arrangement signals, and more. We cover roughly 99.9% of all jobs and have a failover system between several LLMs to ensure that these fields are (almost) always set.

Good to know:

- LLM extractions are probabilistic, so mistakes can occur. We've fine-tuned our process to keep them to a minimum.
- No outside information is included in this process; the LLM relies exclusively on the job's own data.

See the [API Reference](/api/new-jobs) for the full list of `ai_*` fields.
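Because every LLM-derived field shares the `ai_` prefix, it is straightforward to keep them apart from the raw job data on the client side. A minimal sketch, assuming a job arrives as a flat dictionary; the field names in the sample are placeholders, not actual API fields.

```python
from typing import Any


def split_ai_fields(job: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split a job row into (raw fields, LLM-derived `ai_*` fields)."""
    raw = {k: v for k, v in job.items() if not k.startswith("ai_")}
    ai = {k: v for k, v in job.items() if k.startswith("ai_")}
    return raw, ai


# Placeholder row: `ai_example_a` / `ai_example_b` are hypothetical names.
sample_job = {"title": "Data Engineer", "ai_example_a": "value", "ai_example_b": None}
raw_fields, ai_fields = split_ai_fields(sample_job)

# Since coverage is high but not total, count unset AI fields before relying on them.
unset_ai_fields = [name for name, value in ai_fields.items() if value is None]
```

Keeping the two groups separate makes it easier to treat the `ai_*` values as probabilistic signals rather than source-of-truth data.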

## Organization enrichment

We enrich ATS jobs with the matching LinkedIn company profile, plus additional company context from Crunchbase, Glassdoor, and our own derivations. The LinkedIn mapping process covers over 99% of jobs at roughly 99.4% accuracy - if you spot an incorrect mapping, please report it and we'll correct it in our database.

The enrichment is available in two tiers:

### Basic

Opt in inline on `active-ats` / `modified-ats` via `include_basic_organization_details=true`, or fetch the dataset separately from the [`organizations`](/api/organizations#organization-list) endpoint. Covers the most common LinkedIn fields - name, logo permalink (Crustdata-hosted), industry, headcount, size, followers, HQ, locations, type, founded date, slogan, description, specialties, and the recruitment-agency flag - plus two Crunchbase fields: `org_crunchbase_categories` and `org_crunchbase_total_investment`.

`active-jb` already returns the basic LinkedIn fields inline on every job - no flag needed, since the org is read directly from the job board.
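The opt-in rule above can be expressed as a small URL builder. This is only a sketch: the base URL is a placeholder, the path layout is an assumption, and authentication is omitted entirely. Only the endpoint names and the `include_basic_organization_details` flag come from this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # placeholder, not the real host

# Endpoints where basic organization details must be requested explicitly.
OPT_IN_ENDPOINTS = ("active-ats", "modified-ats")


def jobs_url(endpoint: str, want_org_details: bool) -> str:
    """Build a request URL, adding the org flag only where it is needed."""
    params = {}
    if want_org_details and endpoint in OPT_IN_ENDPOINTS:
        params["include_basic_organization_details"] = "true"
    query = urlencode(params)
    return f"{BASE_URL}/{endpoint}" + (f"?{query}" if query else "")
```

For `active-jb` the helper adds nothing, since the basic fields are already returned inline.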

### Advanced (beta)

Available only via the [`organizations-advanced`](/api/organizations#advanced-organization-details) endpoint. Join back to ATS jobs by `org_linkedin_slug`. On top of the basic set, advanced adds:

- **Extra LinkedIn fields:** LinkedIn URL / ID, short description, full industry list, estimated revenue bounds, largest headcount country
- **Industry classification:** NAICS, SIC, market segments
- **Crunchbase fiscal data:** fiscal year end, acquisition status, IPO date
- **Glassdoor:** scalar ratings (work-life balance, compensation, culture, etc.) and review / salary / interview / benefit counts
- **Other profile links:** Twitter, Crunchbase profile, Glassdoor profile

Heavy nested fields (funding rounds, headcount / follower / revenue timeseries, news articles, Glassdoor reviews) are excluded by default and must be opted into individually via dedicated `include_*` flags. See [Organizations](/documentation/endpoints/organizations) for the full field list and flag names.
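Joining advanced organization records back to ATS jobs might look like the sketch below. It assumes both the job rows and the organization records expose `org_linkedin_slug` as a top-level key; the `org_advanced` key and the sample values are hypothetical.

```python
from typing import Any


def attach_advanced(
    jobs: list[dict[str, Any]], orgs: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Attach the matching advanced org record to each job (None if no match)."""
    # Index organizations by slug once, so each job lookup is O(1).
    by_slug = {
        org["org_linkedin_slug"]: org for org in orgs if org.get("org_linkedin_slug")
    }
    return [
        {**job, "org_advanced": by_slug.get(job.get("org_linkedin_slug"))}
        for job in jobs
    ]


# Made-up rows for illustration only.
sample_jobs = [
    {"title": "Backend Engineer", "org_linkedin_slug": "acme"},
    {"title": "Designer", "org_linkedin_slug": None},
]
sample_orgs = [{"org_linkedin_slug": "acme", "example_field": 1}]
joined = attach_advanced(sample_jobs, sample_orgs)
```

Jobs without a slug, or with a slug that has no advanced record, keep `org_advanced` set to `None` instead of being dropped, so the join never loses job rows.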

## `domain_derived`

ATS platforms don't always expose the employer's domain or homepage, so we derive it ourselves. This is useful for joining against external datasets or running further enrichments.

Coverage and accuracy:

- ~96% of jobs have a populated `domain_derived` value.
- Accuracy sits around 98% on the populated subset.
- Accuracy is highest for medium-to-large US companies and lowest for non-US companies and companies with generic names.
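As a back-of-the-envelope reading of these figures, the two percentages multiply: about 96% populated times about 98% accurate gives roughly 94% of all jobs carrying a correct domain. The sketch below works this through for a batch of 10,000 jobs, treating the approximate rates above as exact for the sake of the arithmetic.

```python
def expected_domain_counts(total_jobs: int) -> tuple[int, int]:
    """Estimate (populated, correct) `domain_derived` counts for a batch."""
    populated = total_jobs * 96 // 100  # ~96% of jobs have a value
    correct = populated * 98 // 100     # ~98% of populated values are accurate
    return populated, correct


populated, correct = expected_domain_counts(10_000)
print(populated, correct)  # 9600 9408
```

Actual results will vary with the mix of companies in a given batch, since accuracy differs by company size, region, and name.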

## `exclude_ats_duplicate`

We've released a beta cross-feed deduplication system for users who consume both the ATS and job board datasets. Every LinkedIn job is checked against the ATS dataset using two signals:

- Job title + organization name
- Job title + LinkedIn company profile mapping

The result is exposed as `ats_duplicate` on every `active-jb` row:

| Value | Meaning |
| --- | --- |
| `true` | At least one of the two checks matched an ATS job - this LinkedIn listing is a likely duplicate. |
| `false` | Neither check matched - this listing is treated as unique to the JB feed. |
| `null` | Job was not eligible for the dedup check (see below). |

Setting `exclude_ats_duplicate=true` on `active-jb` drops the rows flagged `true`.
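The same filter can be reproduced client-side when the flag is not set on the request. A minimal sketch, assuming each `active-jb` row is a dictionary with `ats_duplicate` set to `True`, `False`, or `None`; the sample rows are made up.

```python
from typing import Any, Optional


def drop_ats_duplicates(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop rows flagged as duplicates; keep unique (False) and unchecked (None) rows."""
    return [row for row in rows if row.get("ats_duplicate") is not True]


def flag_of(row: dict[str, Any]) -> Optional[bool]:
    """Small accessor used in the example below."""
    return row.get("ats_duplicate")


sample_rows = [
    {"id": 1, "ats_duplicate": True},
    {"id": 2, "ats_duplicate": False},
    {"id": 3, "ats_duplicate": None},
]
kept = drop_ats_duplicates(sample_rows)  # rows 2 and 3 remain
```

Note that rows with `null` are kept: they were never checked, so they are not known duplicates.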

### Known false-negative sources

The dedup logic looks for exact matches only, so it prioritizes precision over recall: rows flagged `true` are likely duplicates, but some duplicates will still slip through. During internal testing, these missed duplicates (false negatives) clustered around the following patterns:

- The LinkedIn listing is indexed before the ATS listing.
- The job is older than 6 months on the ATS platform but newly indexed on LinkedIn. Jobs older than 6 months are considered expired and excluded from the dedup checks. This can also be a positive signal - the employer is re-promoting an older listing.
- The LinkedIn listing was posted via a programmatic platform (e.g. Appcast, Adzuna) with minor edits to the job title or organization name.
- The LinkedIn listing uses a different title or organization name than the ATS listing.

To fully eliminate cross-feed duplicates, we recommend layering a fuzzy deduplication pass on top of this flag.
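One possible shape for such a fuzzy pass is sketched below using Python's standard-library `difflib`. This is not the method used by the API. The `title` and `organization` keys and the `0.85` threshold are assumptions to adapt to your own data.

```python
import re
from difflib import SequenceMatcher
from typing import Any


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so minor edits don't matter."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_fuzzy_duplicate(
    jb_job: dict[str, Any], ats_job: dict[str, Any], threshold: float = 0.85
) -> bool:
    """Treat two listings as duplicates when both title and organization are close."""
    return (
        similarity(jb_job["title"], ats_job["title"]) >= threshold
        and similarity(jb_job["organization"], ats_job["organization"]) >= threshold
    )


# Made-up listings that differ only in punctuation.
jb_listing = {"title": "Software Engineer - Backend", "organization": "Acme, Inc"}
ats_listing = {"title": "Software Engineer, Backend", "organization": "Acme Inc."}
```

Comparing every job board row against every ATS row is quadratic, so in practice you would likely group candidates first (for example by organization or by `domain_derived`) and only compare within each group.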
