If you are having a hard time finding out why your website pages are not being indexed or ranked, and if none of the usual reasons you found on the Internet explain your case, here is a set of little-known issues you can investigate:
2. Your page now has an invalid canonical link – Using canonical links is a common way to avoid duplicate or near-duplicate content issues and penalties. However, when migrating a site from non-www to www, or when implementing secure connections (http to https), one often forgets to update canonical links or to implement proper redirects. Although the page is still accessible from its non-canonical URL in a browser, the corresponding canonical link value is obsolete. Hence, search engines do not know which URL to index.
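A stale canonical link of this kind can be spotted with a small script. The following is only a minimal sketch, assuming Python's standard library and a site that has moved to https and www; the function name `canonical_problems` and the sample page are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_problems(html, expected_scheme, expected_host):
    """Return a list of problems found with the page's canonical link.

    Relative hrefs have no scheme or host, so they are reported too.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return ["no canonical link found"]
    problems = []
    if len(finder.canonicals) > 1:
        problems.append("more than one canonical link found")
    for href in finder.canonicals:
        parts = urlsplit(href)
        if parts.scheme != expected_scheme:
            problems.append(f"{href}: scheme is not {expected_scheme}")
        if parts.hostname != expected_host:
            problems.append(f"{href}: host is not {expected_host}")
    return problems


# Hypothetical page that still carries its pre-migration canonical link.
old_page = '<head><link rel="canonical" href="http://mysite.com/page.html"></head>'
for problem in canonical_problems(old_page, "https", "www.mysite.com"):
    print(problem)
```

Running such a check over every page after a migration gives a quick list of canonical links that still point to the old scheme or host.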
3. Your page is in a competitive niche and has few/no backlinks – Although your page may have perfect, useful and original content for your users, if it is in a competitive niche, if there are few or no backlinks to this page, and if your website does not have much authority and traffic, your page may be crushed by the competition and may never get enough exposure to rank. Your page is not the issue. Google tends to give a chance to all pages with original content, but you need to be patient (sometimes for weeks or months). For other search engines, you will need more backlinks, authority and traffic before they will give a real chance to your page.
4. Your content is updated too often – This is not to be confused with frequently creating new content on your website. If you keep modifying, updating or recreating existing content on your website, search engines may have a hard time finding out how to index it and serve it as a result for user queries. It is believed that Google may slow down or suspend its crawling of such pages until the content becomes more stable. Excessively updating content can postpone indexation and proper ranking.
5. Multiple sitemaps are not registered in your robots.txt – When one creates a new website, one often creates a sitemap.xml file at the root and submits it to one’s favorite search engines. Other search engines can usually find this sitemap by themselves, since it bears the default name and is available from the root. Later, as the site grows bigger, one can create more sitemap files bearing different names. Although one rarely forgets to submit these to one’s favorite search engines, other search engines are left behind with no way to find out about such new sitemaps. Hence, they don’t crawl and index the corresponding pages. The solution is simple: declare all your sitemaps in your robots.txt file. You won’t need to submit them anymore, and since all search engines know where to find your robots.txt file, they will know about all your sitemaps too.
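To check that every sitemap is actually declared, one can parse robots.txt and compare its Sitemap lines with the list of sitemap files the site publishes. This is a minimal sketch, assuming Python 3.8 or later (for `RobotFileParser.site_maps()`); the helper names and the sitemap file names other than sitemap.xml are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def missing_sitemaps(robots_txt, sitemap_urls):
    """Return the sitemap URLs that robots.txt does not declare."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    declared = set(parser.site_maps() or [])  # site_maps() is None when empty
    return [url for url in sitemap_urls if url not in declared]


def sitemap_lines(sitemap_urls):
    """Build the 'Sitemap:' lines to append to robots.txt."""
    return "\n".join(f"Sitemap: {url}" for url in sitemap_urls)


# Hypothetical robots.txt that only declares the original sitemap.
robots_txt = """\
User-agent: *
Disallow:

Sitemap: http://mysite.com/sitemap.xml
"""

# Hypothetical list of all sitemap files the site publishes.
all_sitemaps = [
    "http://mysite.com/sitemap.xml",
    "http://mysite.com/sitemap-articles.xml",
    "http://mysite.com/sitemap-products.xml",
]

# Print the lines that still need to be added to robots.txt.
print(sitemap_lines(missing_sitemaps(robots_txt, all_sitemaps)))
```

Each Sitemap line takes a full URL and is independent of the User-agent groups, so the missing lines can simply be appended to the end of the file.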
6. Premature indexing – Many content creators are eager to have their content indexed and ranking quickly. When creating a new website, they often create sitemap.xml files with dozens if not thousands of pages that are still work-in-progress or near-duplicate content, hoping this will accelerate and maximize their exposure. In fact, this achieves the opposite. For new websites, Google initially tries to find the intent of the website, which may take a couple of months. The crawling and indexing rate of a website is driven by the content quality, the number of natural backlinks and the amount of traffic. If one submits a lot of poor content early, the crawling and indexing rates will be significantly slowed down. The solution is to store all pages in the sitemap.xml, but to mark those that are not production-ready as NOINDEX. When these pages become production-ready, remove the NOINDEX tag and let search engines process them progressively.
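To keep track of which pages are still held back, one can scan the pages for a robots meta tag containing NOINDEX. The sketch below assumes the tag takes the common form `<meta name="robots" content="noindex">` and that the page HTML is already available as strings; the function names and sample pages are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def is_noindex(html):
    """True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


def pages_still_hidden(pages):
    """Given a {url: html} mapping, return the URLs still marked NOINDEX."""
    return sorted(url for url, html in pages.items() if is_noindex(html))


# Hypothetical pages: one finished, one still a work in progress.
pages = {
    "http://mysite.com/ready.html": "<head><title>Ready</title></head>",
    "http://mysite.com/draft.html": '<head><meta name="robots" content="NOINDEX, follow"></head>',
}
print(pages_still_hidden(pages))
```

The resulting list makes it easy to see which pages are waiting for the tag to be removed once they are production-ready.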
7. Performing online sales without secure communications – If one does not implement secure connections (HTTPS) for online transactions, anyone positioned between the server and the user’s browser can intercept and modify their communication. Search engines do not want to expose low-quality or risky websites to their users, and this gives them a good reason to discard such websites.
8. Improper URL parameter configuration and usage – Websites often use URL parameters to customize the content requested by users. The differences are usually minimal and cosmetic, or the returned content is a subset of what is already available elsewhere on the website. It could be considered near-duplicate content. Unfortunately, search engines treat a given URL with different parameter values as separate URLs to index, resulting in the indexation of near-duplicate content. Bing and Google webmaster tools have offered URL parameter settings to avoid such issues.
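To see how many indexable variants a single page produces, one can strip the cosmetic parameters from a list of crawled URLs and group what remains. This is a minimal sketch using Python's standard library; the parameter names in `COSMETIC_PARAMS` and the sample URLs are assumptions and must be adapted to the parameters your own site uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that do not change the content meaningfully.
COSMETIC_PARAMS = {"sort", "view", "sessionid", "utm_source", "utm_medium"}


def canonical_url(url, ignored=COSMETIC_PARAMS):
    """Drop ignored parameters, sort the rest, and remove any fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each canonical URL to its variants, keeping only real duplicates."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


# Hypothetical crawl output.
crawled = [
    "http://mysite.com/list.html?cat=3&sort=price",
    "http://mysite.com/list.html?view=grid&cat=3",
    "http://mysite.com/about.html",
]
for canonical, variants in group_duplicates(crawled).items():
    print(canonical, "<-", variants)
```

Every group printed this way is a set of URLs that a search engine may index separately even though they return nearly the same content.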
9. Improper redirection to a Home page – Say your home page is located at http://mysite.com/home.html. It is not uncommon to get backlinks to http://mysite.com or http://mysite.com/ instead. Search engines treat these as different URLs from http://mysite.com/home.html. If you don’t redirect those backlink URLs to your main http://mysite.com/home.html, you will lose some link juice (i.e. PageRank) and this can hamper your rankings.
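The redirect rule itself is small. The sketch below only shows the decision logic, assuming a permanent (301) redirect is wanted and that the home page lives at the URL used in the example above; the function name `home_redirect` is hypothetical, and in practice the rule would normally be configured in the web server or framework.

```python
from urllib.parse import urlsplit

HOME_URL = "http://mysite.com/home.html"


def home_redirect(requested_url, home_url=HOME_URL):
    """Return (301, home_url) for bare-root requests, otherwise None."""
    requested = urlsplit(requested_url)
    home = urlsplit(home_url)
    same_site = (requested.scheme, requested.hostname) == (home.scheme, home.hostname)
    # Only the root of the site, with or without the trailing slash, qualifies.
    if same_site and requested.path in ("", "/") and not requested.query:
        return (301, home_url)
    return None


for url in ("http://mysite.com", "http://mysite.com/", "http://mysite.com/home.html"):
    print(url, "->", home_redirect(url))
```

A permanent redirect is assumed here because it tells search engines that the backlinks should be credited to the home page URL rather than to the bare root.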
More checklists to find reasons why your site and pages are not indexed or ranked are available here.
Estaban Marifiuzo likes to write about SEO for Ligatures.net