Boosting your SEO by helping Googlebot

What’s the secret to getting Google to love your website and index it? Steer crawlers towards the most important and useful pages of your site and away from pages with little value to searchers. You may think you want Google or Bing to index every URL on your website, but you could be preventing them from crawling your most important pages by using up your ‘crawl budget’ on poor or irrelevant ones.
What is ‘crawl budget’?
When Googlebot (or Bingbot, for that matter) comes to your website, it has a finite amount of resources to spend crawling your pages; after all, Google has millions of other websites to crawl that day too.
Your crawl budget is influenced by the existing value and quality of your website, including but not limited to the quality of backlinks pointing to it. I won’t speculate on further factors here, but there are some good research pieces on the web where people have tried to identify them.
Why control access?
Give important pages priority
By controlling where Googlebot is allowed to crawl within your website, you increase the likelihood that important and valuable pages will be crawled every time Google visits.
Examples include your product or service pages, blog posts or even your contact details page. All of these are pages you want ranked highly in the search results so users can find this information more quickly.
Ignore pages that don’t need to be ranked
There will be pages on your website that have no need to be indexed in the search results. These are pages a user wouldn’t typically look for in the search results but may browse to whilst on your website, such as your privacy policy page, terms & conditions page or your blog tag and category pages.
How to help Googlebot access the right pages of your website
There are a number of ways you can help Googlebot access the right pages of your website. The more of the following you can adjust or implement, the more control you should have over Googlebot or Bingbot.
Robots.txt file
The first thing to look at is setting disallow rules in your robots.txt file for any pages, folders or file types on your site that do not need to be crawled. Upon visiting a site, the first place a crawler looks is your robots.txt file (provided it is located at http://www.mydomain.com/robots.txt). This indicates to crawlers which parts of your website they should not attempt to crawl, and you can set separate rules for each crawler bot you want to control.
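As a rough illustration (the paths here are hypothetical placeholders rather than recommendations for your site), a robots.txt along these lines blocks low-value sections while leaving everything else crawlable, and shows how rules can target a specific bot:

```
# Hypothetical example - replace the paths with low-value sections of your own site
User-agent: *
Disallow: /search/
Disallow: /tag/

# Rules can also be aimed at one crawler in particular
User-agent: Bingbot
Disallow: /print/

Sitemap: http://www.mydomain.com/sitemap.xml
```

Bear in mind that Disallow only stops crawling; a blocked URL can still appear in the index if other sites link to it, which is why the noindex tag below is often used alongside it.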
NoIndex tags
To help prevent certain pages from being indexed, it is also recommended that you add a noindex tag to the head section of those pages. Once added to a page, you should test these tags by doing a ‘Fetch as Google’ request on the URLs in Google Search Console.
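For reference, noindex is applied with a standard robots meta tag; this snippet is generic rather than specific to any platform:

```html
<!-- Tells crawlers not to index this page, but still to follow its links -->
<meta name="robots" content="noindex, follow">
```

Note that the page must not also be disallowed in robots.txt, otherwise the crawler never fetches it and never sees the tag.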
Up-to-date XML sitemaps
Although Google won’t treat your XML sitemap as a strict rule of which pages to crawl, it does take it as a hint, so make sure it’s up to date to help reinforce which pages of your site it should be indexing.
Remove any old pages from your site and add any new pages.
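A minimal sitemap is just a list of the canonical URLs you want crawled. The sketch below uses placeholder URLs and dates:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.mydomain.com/</loc>
    <lastmod>2017-01-15</lastmod>
  </url>
  <url>
    <loc>http://www.mydomain.com/services/</loc>
    <lastmod>2017-01-10</lastmod>
  </url>
</urlset>
```

Most content management systems can generate and update a file like this automatically, so it rarely needs to be maintained by hand.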
Fix internal links
Googlebot will follow the links it finds in your page content, so make sure you aren’t wasting its time by letting it crawl links to missing pages. Use a crawling tool such as Screaming Frog’s SEO Spider to find these broken internal links and fix them at the source.
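If you prefer a do-it-yourself spot check, a minimal Python sketch along these lines (standard library only; the start URL is a placeholder, and it only inspects the links on a single page rather than crawling the whole site) can flag internal links that return errors:

```python
import urllib.request
import urllib.error
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

START_URL = "http://www.mydomain.com/"  # placeholder - use a page on your own site


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def check_internal_links(start_url):
    html = urllib.request.urlopen(start_url).read().decode("utf-8", "ignore")
    parser = LinkCollector()
    parser.feed(html)
    domain = urlparse(start_url).netloc

    for href in parser.links:
        url = urljoin(start_url, href)
        if urlparse(url).netloc != domain:
            continue  # skip external links
        try:
            status = urllib.request.urlopen(url).getcode()
        except urllib.error.HTTPError as err:
            status = err.code
        if status >= 400:
            print(f"Broken internal link: {url} ({status})")


if __name__ == "__main__":
    check_internal_links(START_URL)
```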
Page load times
Googlebot will need to load each of your pages when it visits them, so by reducing the load time of each page you allow it to crawl and index more pages within the same overall time. There are a number of free tools available to help you analyse and improve site speed.
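As a very rough spot check (the URL is a placeholder, and server response time is only one component of full page load), you can time how long a page takes to be fetched:

```python
import time
import urllib.request

URL = "http://www.mydomain.com/"  # placeholder - use one of your own pages

start = time.perf_counter()
urllib.request.urlopen(URL).read()  # fetch the full HTML response
elapsed = time.perf_counter() - start
print(f"{URL} fetched in {elapsed:.2f}s")
```

Dedicated speed tools will give a far more complete picture, but a simple timing like this is enough to spot pages that are unusually slow to respond.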
Site structure
A good site structure is an underrated way of helping Googlebot crawl your website much more easily. Clearly categorising page content and not hiding pages away too deep in your site structure increases the likelihood that they’ll be found by the crawler.
The SEO benefits
If you’ve managed to implement some or all of the above recommendations and tested them using the tools mentioned, you should begin to see some changes in crawl stats shown within Google Search Console.
Here you’re looking for the number of pages crawled per day to be similar to, or just over, the number of actual pages on your site. If you previously had lots of excess pages being crawled, the reduction in kilobytes downloaded per day should mirror the reduction in pages crawled.
As an example, one site with a significant number of URL parameter issues saw Googlebot crawl up to 12,000 URLs when in fact there were just a few hundred actual pages on the site. Through the application of URL parameter rules and the other fixes mentioned above, the number of pages crawled became much more consistent and realistic.
With Google crawling your useful pages on every visit, the rankings of your pages will be more likely to change frequently, and most likely for the better. Fresh content will get indexed and ranked a lot quicker, and none of your ‘crawl budget’ will be wasted.
Source: www.koozai.com