FREE SEO TOOL
Page Indexable Checker
Enter a URL to check whether it can be crawled and indexed by search engines. Checks for noindex directives and the robots.txt file.
Indexability Check
This tool checks a single URL for the most common signals that determine whether a page is in shape to be indexed by search engines. It fetches the page, looks for indexing directives, identifies the canonical target, and checks for a basic robots.txt disallow.
What the tool checks
- Indexable or not: A simple verdict based on the signals below. If any blocking signal is found, the page is treated as not indexable.
- Canonical target: If a canonical URL is declared, the tool reports it so you can confirm which URL is intended to be the primary version.
- Noindex detection: Looks for noindex in meta robots and the X-Robots-Tag header.
- Robots.txt block detection: Checks the site’s robots.txt rules to see if crawling is disallowed for the URL’s path (basic match).
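Taken together, these checks fit in a few dozen lines of code. The Python sketch below (standard library only) is illustrative, not the tool's actual implementation: the function name check_indexability and the User-Agent string are made up, and the regexes are deliberately simplified (they assume common attribute ordering, such as name= before content=).

```python
import re
import urllib.error
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse

def check_indexability(url: str) -> dict:
    """Minimal sketch of the checks above; not the tool's real code."""
    result = {"indexable": True, "reasons": [], "canonical": None}
    req = urllib.request.Request(url, headers={"User-Agent": "indexability-check/0.1"})

    # Fetch the page; an error response blocks indexing by itself.
    try:
        with urllib.request.urlopen(req) as resp:
            headers = resp.headers
            html = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        result["indexable"] = False
        result["reasons"].append(f"HTTP {err.code}")
        return result

    # Noindex via the X-Robots-Tag response header.
    if "noindex" in headers.get("X-Robots-Tag", "").lower():
        result["indexable"] = False
        result["reasons"].append("X-Robots-Tag: noindex")

    # Noindex via a robots meta tag (simplified regex match).
    if re.search(r'<meta[^>]+name=["\']robots["\'][^>]*noindex', html, re.I):
        result["indexable"] = False
        result["reasons"].append("meta robots noindex")

    # Canonical target, if one is declared in the HTML.
    canonical = re.search(
        r'<link[^>]+rel=["\']canonical["\'][^>]*href=["\']([^"\']+)', html, re.I)
    if canonical:
        result["canonical"] = canonical.group(1)

    # Basic robots.txt disallow check for the URL's path.
    parts = urlparse(url)
    rp = urllib.robotparser.RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()
    if not rp.can_fetch("*", url):
        result["indexable"] = False
        result["reasons"].append("robots.txt disallow")

    return result
```

A production checker would also follow redirects, handle network timeouts, and parse the HTML with a real parser instead of regexes; the sketch only mirrors the signals listed above.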
Key SEO terms (plain-English definitions)
Indexable
A page is “indexable” when search engines are allowed to include it in their index and potentially show it in search results.
This tool focuses on common technical blockers (like noindex and robots rules), but pages can still fail to index for other reasons, such as quality, duplication, or soft 404 behavior.
Canonical (canonical URL)
A canonical URL is the preferred version of a page when multiple URLs contain the same or very similar content.
You typically declare it with a <link rel="canonical" href="..."> tag in the page’s HTML.
Search engines use canonicals to consolidate duplicates and choose which URL to treat as the main one.
Important nuance: a canonical is not the same thing as “noindex.” A page can be crawlable and indexable, but still not be the URL that gets shown if the canonical points somewhere else.
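For illustration, here is one way to pull a declared canonical out of a page’s HTML using only Python’s standard-library parser; the class name CanonicalFinder is hypothetical.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag (sketch)."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "link" and self.canonical is None:
            attr = dict(attrs)
            if (attr.get("rel") or "").lower() == "canonical":
                self.canonical = attr.get("href")

finder = CanonicalFinder()
finder.feed('<head><link rel="canonical" href="https://example.com/main-page/"></head>')
print(finder.canonical)  # https://example.com/main-page/
```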
Noindex
noindex is an instruction that tells search engines not to include a page in search results.
It’s commonly implemented via a robots meta tag in the HTML (for example <meta name="robots" content="noindex">)
or via an HTTP response header like X-Robots-Tag: noindex.
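The content attribute of a robots meta tag can carry several comma-separated directives (for example noindex, nofollow), and none is documented shorthand for noindex, nofollow. A minimal, hypothetical helper for interpreting that value might look like this:

```python
def meta_robots_blocks_indexing(content: str) -> bool:
    # Directives are comma-separated and case-insensitive;
    # "none" is shorthand for "noindex, nofollow".
    directives = {d.strip().lower() for d in content.split(",")}
    return "noindex" in directives or "none" in directives

assert meta_robots_blocks_indexing("noindex, nofollow")
assert meta_robots_blocks_indexing("NONE")
assert not meta_robots_blocks_indexing("index, follow")
```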
X-Robots-Tag
X-Robots-Tag is an HTTP response header that can communicate indexing rules (like noindex) without requiring changes to the page HTML.
Google documents support for using X-Robots-Tag as part of robots controls, and it’s widely used in practice.
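Because the directive travels with the HTTP response rather than the HTML, it also works for non-HTML files (like PDFs) that cannot carry a robots meta tag. Below is a sketch of inspecting the header with a HEAD request; it is illustrative only, and note that some servers return different headers for HEAD than for GET.

```python
import urllib.request

def x_robots_tag(url: str):
    """Return the X-Robots-Tag header value, or None if absent (sketch)."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "header-check/0.1"})
    with urllib.request.urlopen(req) as resp:
        return resp.headers.get("X-Robots-Tag")
```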
Robots.txt
robots.txt is a file at the root of a site (example: https://example.com/robots.txt) that provides crawling rules for automated agents (crawlers).
It’s part of the Robots Exclusion Protocol and is primarily about controlling crawling access, not guaranteeing indexing outcomes.
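Python’s standard library ships a parser for this format. The sketch below feeds it an inline ruleset (the rules and paths are made up). One caveat: Python’s parser applies the first matching rule, while Google uses the most specific (longest) match, which is why the Allow line is listed first here.

```python
import urllib.robotparser

rules = """\
User-agent: *
Allow: /private/public-page
Disallow: /private/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://example.com/private/secret"))       # False
print(rp.can_fetch("*", "https://example.com/private/public-page"))  # True
print(rp.can_fetch("*", "https://example.com/blog/post"))            # True (no rule matches)
```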
How to interpret results
- In shape for indexing: The tool did not detect blocking signals (no noindex, no robots.txt disallow, and the HTTP response is healthy).
- Not in shape for indexing: One or more blocking signals were detected. Fix those issues, then re-run the check.
- Canonical points elsewhere: The page may be technically indexable, but you are telling search engines a different URL is the preferred version.
Common reasons a page still might not index
Even if a page is “in shape for indexing” by these checks, search engines can still choose not to index it due to factors outside the scope of this tool (duplicate content, thin content, soft 404s, internal linking issues, crawl budget, or quality signals).