The biggest hurdle to the growth of a business is sometimes a set of simple, unnoticed errors. Identifying and fixing them in time improves website performance and helps the site grow consistently.
Are you finding it difficult to understand crawl errors? No worries! Here is a detailed article about Google webmaster crawl errors and ways of fixing them.
Webmasters and SEO professionals often have difficulty judging the severity of crawl errors, so the errors get ignored easily, and ignored errors hamper the progress of keyword rankings on Google.
This article covers the following:
- Crawl Errors
- Crawl Errors Effect on SEO
- Understanding webmaster crawl errors
- Types of webmaster crawl errors
- Site level Errors
- URL Errors
- User experience with crawl errors
- The severity of Crawl Errors
- Avoiding Crawl errors
- Tracking Crawl Errors
- Fixing the crawl error
What are Crawl errors on Search Console?
Bots that crawl content on the internet sometimes run into problems that keep them from crawling it effectively. These problems can arise on the servers hosting the website, or within the system itself, where an erroneous implementation makes the content impossible for bots to access or causes bots to interpret it wrongly.
The problems that bots face while crawling content are called crawl errors.
How do these crawl errors affect SEO?
The organic performance of keywords on Google depends on crawling along with CTR and user metrics, but crawling is the core of everything. Without bots crawling the website, search engines never get a copy of it on their servers, and so content from your website cannot be served in search results. To be one of the players in the organic search results, your website needs to get crawled, and getting crawled often helps it reach the first page of Google.
Crawl errors affect the crawl rate. Bots visit a website with a certain bandwidth and under certain conditions, and whenever those are not met the bot does not crawl the website effectively. This limitation, where only a few web pages get crawled, is a drop in the crawl rate.
A dropped crawl rate means that a certain number of web pages are not crawled often, or are never crawled at all, which prevents Google from showing your website in search results for searches related to that content.
Here is what happens when there are too many crawl errors:
- The Crawl frequency would drop
- The crawl rate comes down
- The Avg position of keywords will fall
- The keyword coverage comes down
- The URL may completely get removed from the Google index
- The updated content on the webpage may not appear timely on search results
- Overall, organic growth may fall short of expectations and may sometimes even decline.
How do I understand Webmaster Crawl Errors?
Brainstorm the basics before you attempt to understand the crawl errors
Webmaster crawl errors are simple to understand, but before you go deeper you need to know a few basics about HTTP status codes.
HTTP Status Codes:
Servers communicate using the Hypertext Transfer Protocol (HTTP), and their responses to requests are as follows:
Status Code 200:
The server is responding to the request, the requested URL/content is available on the server, and it has been served successfully to the client (a browser used by a real user, or a bot trying to access the content).
Status Code 301:
The server is responding to the request and indicating that the requested content/URL has been permanently redirected to another URL/content.
Status Code 404:
The server is responding to the request and looking for the URL on the system, but no such URL/content exists, so the status code of the response is 404.
Soft 404:
In this scenario, the file doesn't exist, but instead of returning a status code of 404 the server returns some other status code (typically 200) for the nonexistent file.
A slightly different case is when the server does have the URL/content but the page appears to be unavailable. This is common when web pages are too thin or when custom 404 pages are not handled properly.
Status Code 500:
In this case, the server has run into an internal error and is not able to serve the current request, which is when status code 500 is served. A server that is simply overloaded or busy with other requests usually answers with 503 instead.
Access Denied (Status Codes 401 and 403):
The server refuses to serve the content to the requester, usually because the URL requires a login or the requester is not authorized, and that is when the request is considered access denied.
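The status codes above can be grouped into rough buckets when you review a list of crawled URLs. The following is a minimal sketch, not an official mapping: the function name and the bucket labels are our own, and treating 401 and 403 as "access denied" is an assumption of this sketch.

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code to a rough crawl-error bucket.

    The bucket names are illustrative labels, not terms returned by any tool.
    """
    if 200 <= code < 300:
        return "ok"
    if code in (301, 302, 307, 308):
        return "redirect"
    if code in (401, 403):
        # Assumption: access-denied responses arrive as 401 or 403.
        return "access denied"
    if code in (404, 410):
        return "not found"
    if 500 <= code < 600:
        return "server error"
    return "other"
```

For example, classify_status(404) falls into the "not found" bucket and classify_status(500) into "server error".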
DNS stands for Domain Name System. Every website's domain name is mapped to the IP address of the server that hosts it, and DNS servers look up that mapping. IP conflicts or other connectivity issues during this lookup are reported as DNS errors.
Understand the basics of servers: what they are, how they work, and what needs to be in place to get the maximum output for a website.
Classification of the webmaster errors:
Site errors affect the complete site, or at least a set of web pages, and are not specific to any single URL/content.
- DNS Errors
- Server Connectivity Errors
- Robots.txt Fetch
URL errors are specific to URLs. They may or may not be caused by site-level issues, but they are always associated with a specific set of URLs.
- Server Error
- Soft 404
- Not Found 404
- Access denied
- Others – Usually the undetermined errors
What are DNS Errors?
DNS errors are very rare and occur only when there is a severe issue in pointing the domain name to the appropriate server. This can happen due to errors in the configuration files or the .htaccess files, or due to issues at the hosting provider.
How To Fix DNS Errors?
Get in touch with your hosting provider, or review your settings by logging in to cPanel and checking the configuration files or .htaccess files. This is much easier for a developer or someone who is an expert in it, so we suggest you get in touch with one.
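Before contacting the hosting provider, it can help to confirm whether the domain name resolves at all from your own machine. The snippet below is a minimal sketch using Python's standard socket module; the function name resolves is our own, and a successful lookup from your machine does not guarantee that search engine bots see the same result.

```python
import socket


def resolves(hostname: str) -> bool:
    """Return True if the hostname resolves to at least one IP address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the lookup failed; UnicodeError: the name is not a valid hostname.
        return False
```

Calling resolves with your own domain name (for example, the hypothetical "www.example.com") tells you quickly whether the problem is in the DNS lookup or further down the line.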
What are Server Connectivity Issues?
Server connectivity issues are mainly due to server performance and related problems. They can also be caused by the hosting plan you have opted for: on shared hosting with very limited RAM, disk space, and other feature limitations, you may experience these problems.
The server connectivity issues will be as follows:
- Connect Timeout: The request cannot reach the server at all, as the server may be busy or misconfigured
- Connection Refused: The server refused the connection
- No Response: The connection with the server was closed before any response was sent
- Timeout: The wait time was too long, and the server timed out waiting for the request.
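Some of the connectivity issues listed above can be reproduced from your own machine with a simple TCP connection attempt. This is a rough sketch only: the function name and the returned labels are our own, the mapping from Python exceptions to the categories above is approximate, and it does not cover the last case (a timeout after the connection is already open).

```python
import socket


def probe_connection(host: str, port: int, timeout: float = 5.0) -> str:
    """Try to open a TCP connection and describe what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # The server did not answer the connection attempt in time.
        return "connect timeout"
    except ConnectionRefusedError:
        # The server actively rejected the connection.
        return "connection refused"
    except OSError:
        # Any other network failure (unreachable host, reset, and so on).
        return "no response"
```

A typical call would be probe_connection("www.example.com", 443), where the hostname is a placeholder for your own server.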
How to fix server connectivity issues?
Speak to your hosting provider and mention all the problems listed in the server connectivity section of your webmaster tools. They should be able to fix them for you.
Robots.txt fetch Errors:
Robots.txt is a text file, residing in the root directory of the server, with a set of instructions for bots. Sometimes these instructions tell bots not to crawl specific sets of content/URLs on the system, which can block valid web pages from being crawled and indexed by search engines.
How to fix robots.txt crawl/fetch errors:
- Get a thorough understanding of the commands/instructions used in the robots.txt file
- Understand which types of URLs should be blocked or allowed for crawl and index
- Use the webmaster robots.txt tool to check which URLs are allowed/blocked by the commands.
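The same allowed/blocked check can also be run locally. Here is a minimal sketch using Python's standard urllib.robotparser module; the helper name is_allowed and the sample rules are our own, and the standard-library parser may not match every detail of how Google interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether the given robots.txt content lets a bot fetch a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: block everything under /private/ for all bots.
SAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"
```

With these sample rules, a URL under /private/ is reported as blocked for Googlebot while a URL such as /blog/post is allowed, which is a quick way to confirm that a valid page is not being blocked by mistake.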
What are URL Errors?
URL errors are errors that bots find associated with a specific set of URLs, and they can sometimes be a site-level error too.
Webmaster tools provides sample data for 1,000 URLs that are prone to or associated with errors.
How to fix webmaster URL Errors?
Once you have understood the severity of the URL errors and fixed them, you can select all the URLs on the same interface and mark them as fixed. If they really are fixed, you will not see them coming back.
Before you mark them as fixed, here are more details on how to fix each type:
Fixing Server Errors- Status code 500:
Server errors return status code 500, which implies that the server is failing to handle the request or is not in a position to handle any request. In this case, you have to figure out whether the problem is specific to certain URLs or is an issue across the system or site.
If it is specific to certain URLs, debug a bit more and seek the help of a developer.
If it is some other problem, as discussed under the site-level issues, seek the help of your hosting provider or upgrade your hosting plan.
Fixing the soft 404 errors on GWT crawl errors:
Understand how those web pages got removed from the system. If the removal was not intentional, try to get them back.
If the removal was intentional, serve a proper 404 response code. If you would rather bots did not keep bringing the 404 to your notice for pages you removed intentionally, change the status code to 410 so that bots understand that the website owner removed the page on purpose.
If you have a custom 404 page, make sure it is handled properly, and either remove the thin-content pages or enrich the content on those pages.
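To spot likely soft 404 pages in bulk, a simple heuristic can flag pages that return 200 but look like an error page or carry very little content. This is only a sketch under stated assumptions: the phrase list and the 200-character threshold are hypothetical values of our own, not rules used by Google.

```python
# Hypothetical phrases that often appear on "not found" pages.
NOT_FOUND_PHRASES = ("page not found", "no longer available", "does not exist")


def looks_like_soft_404(status: int, body: str, min_length: int = 200) -> bool:
    """Flag a page that returns 200 but reads like an error or thin page."""
    if status != 200:
        # A real error status is not a *soft* 404.
        return False
    text = body.lower()
    if any(phrase in text for phrase in NOT_FOUND_PHRASES):
        return True
    # Assumption: very short pages count as thin content.
    return len(text.strip()) < min_length
```

Pages flagged by this check still need a manual look, since a short page can be perfectly valid.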
Fixing not found 404 crawl errors:
404 errors occur when there is no such file on the system but bots are somehow trying to access it.
In this case, go through all the URLs and understand why they were removed, their credibility, the traffic flowing to them, and their necessity:
- If they were accidentally removed, get them back
- If they were intentionally removed, then check for the following and act accordingly
- Has lots of traffic but page authority (PA)/backlinks/internal links are negligible or zero: do a 302 redirect
- Has a lot of traffic and has PA/backlinks/internal links: do a 301 redirect
- Has no traffic or credibility: return a status code of 410
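The decision list above can be written down as a small function. This is a minimal sketch of the rules as stated in this article; the function name and the boolean inputs are our own simplification, and the case of a URL with links but no traffic is not covered by the list, so the sketch returns "review" for it.

```python
def action_for_removed_url(accidental: bool, has_traffic: bool, has_links: bool) -> str:
    """Suggest an action for a URL that returns 404.

    has_links stands for meaningful PA/backlinks/internal links.
    """
    if accidental:
        return "restore"
    if has_traffic and has_links:
        return "301"
    if has_traffic:
        return "302"
    if has_links:
        # Not covered by the list above; left for manual review.
        return "review"
    return "410"
```

For instance, an intentionally removed page with traffic and backlinks gets "301", while one with neither gets "410".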
How do I understand the credibility of the URLs/Content?
Credible URLs are the ones that drive quality traffic to the website or have a good number of external links, internal links, or both.
How do I understand the traffic of the error URL’s?
To understand the traffic of the URLs found in the crawl errors section, first download all the crawl errors and recheck the status codes of the URLs. Many of the URLs may have returned to their original status by now, because the errors reflect only the particular moment of the bot's crawl and may or may not exist now.
Use a bulk status check tool or an SEO plugin for Excel and you will be able to find the status of all the URLs in no time.
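If you prefer a script to a bulk status tool, the recheck can be sketched with Python's standard library. The function names below are our own. Note that urllib follows redirects, so a redirected URL reports the status of its final destination rather than 301.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for a URL, or None if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is the status.
        return error.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def recheck_statuses(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[int]] = fetch_status,
) -> Dict[str, Optional[int]]:
    """Recheck every URL and return a mapping of URL to current status."""
    return {url: fetch(url) for url in urls}
```

Feed it the URLs downloaded from the crawl errors report. Some servers answer HEAD requests differently from GET, so treat surprising results with care.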
Now download the landing pages and their traffic from analytics, and use VLOOKUP to match them against the data for which you have already found the status codes.
Then filter out the URLs that have errors and check whether there is any traffic to these URLs.
Note: If these URLs have a lot of organic traffic, it implies they are doing well and are credible, and hence you should permanently redirect such URLs.
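The lookup-and-filter step can also be done outside a spreadsheet. The sketch below assumes you already have two mappings, one from URL to status code and one from URL to traffic (for example, sessions from analytics); the function name and the sample data shapes are our own.

```python
from typing import Dict, List, Tuple


def traffic_for_error_urls(
    status_by_url: Dict[str, int],
    sessions_by_url: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (url, status, sessions) for error URLs, busiest first."""
    rows = [
        (url, status, sessions_by_url.get(url, 0))  # 0 when analytics has no row
        for url, status in status_by_url.items()
        if status >= 400
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

The URLs at the top of the result are the error URLs with the most traffic, which are the first candidates for a permanent redirect.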
There is also a method to figure out URL credibility in terms of external and internal links.
Find the internal and external links for the URL:
To understand whether there are any external links to the error URLs, you can rely on the Links to Your Site data in webmaster tools; the Internal Links data is found under the same Search Traffic section. Download the external and internal links from webmaster tools and figure out which URLs have them. If a URL has many external and internal links, redirect it to the new URL or to any relevant URL.
Alternatively, you can use the Majestic SEO tool to find the external links. Majestic SEO provides a detailed report on the number of external links a URL has.
Do users too experience the same kind of issues?
Yes. If you have found issues with valid URLs in the webmaster crawl errors, real users are surely running into the same issues. Their experience on the website becomes worse, and this in turn affects the search rankings of the keywords.
How to avoid crawl errors?
- To avoid crawl errors, your servers and their settings should be free of technical problems
- While editing content on the website, be careful with permalinks, as even a small tweak to a URL replaces the old URL with a new one
- Prefer the best hosting plans if your website already has good traffic
- Make sure the sitemaps are clean and hold only valid URLs (URLs with status code 200)
- Monitor the tasks running on the server and taking up processor time, and kill the faulty processes
- Remove all redundant/junk code or processes running on the server
- Make sure the URLs in external and internal links are correct, as any mistakes here will give you lots of 404 errors
- Find out how bots found these errors (through internal links, external links, or sitemaps) and fix the source
- Keep monitoring the webmaster crawl errors on a regular basis and fix them
- Disavow the links built from spam websites to the wrong URLs of your website
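The sitemap point in the list above lends itself to a quick script. This is a minimal sketch that reads the URLs from a standard XML sitemap and lists those whose known status is not 200; the function names are our own, and the status mapping is assumed to come from a separate status check.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> List[str]:
    """Extract every <loc> value from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def invalid_sitemap_urls(urls: List[str], status_by_url: Dict[str, int]) -> List[str]:
    """Return sitemap URLs whose status is missing or not 200."""
    return [url for url in urls if status_by_url.get(url) != 200]
```

Any URL returned by invalid_sitemap_urls should either be fixed or dropped from the sitemap.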
How severe are the crawl errors?
- The type of error: site-level or URL errors
- The type of URLs affected: traffic- and revenue-driving or not
- The credibility of the URLs
Based on the above three points you can decide whether an error is severe. If any of these errors significantly affects the crawl frequency or crawl rate, the severity is very high.
How to track webmaster crawl errors?
- Sign up for Google webmaster tools and submit your website for crawling and indexing. Webmaster tools shows all kinds of error details on its interface, and you can keep an eye on it to act whenever there is an issue with the URLs or the website.
- You can also enable tracking of bot activity on the server and process the logs on a regular basis. This helps you fix issues early, because the logs hold a lot of data, whereas webmaster tools provides only sample data.
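Processing the server logs for bot activity can be sketched as follows. The example assumes the common "combined" access log format with the user agent as the last quoted field; the regular expression, the function name, and the "Googlebot" token match are our own assumptions, and a user agent string alone does not prove that a request really came from Google.

```python
import re
from collections import Counter
from typing import Iterable

# Assumes the combined log format: request line, status, size, referrer, user agent.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def bot_status_counts(lines: Iterable[str], bot_token: str = "Googlebot") -> Counter:
    """Count response status codes served to requests from the given bot."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if match and bot_token in match.group("agent"):
            counts[int(match.group("status"))] += 1
    return counts
```

Running this over each day's log and watching the 404 and 5xx counts gives an early warning before the errors show up in webmaster tools.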
Webmaster errors should not be ignored until they have been processed and understood from all perspectives, including the reason for their existence. Once they are understood, finding the appropriate fix is crucial.
If you are seeing too many webmaster errors, let us know. We have worked on websites with a few million URLs and have retained their traffic and credibility, so that they continue to shine despite all external odds.