This document outlines my findings whilst carrying out a website audit for the Jacamo website (jacamo.co.uk). It lists the issues I have uncovered, along with my recommendations to fix them.
Jacamo Technical Audit – Table of Contents
Please use the links below to jump to the appropriate section as desired.
- The Current Landscape
- Technical Analysis
- Crawling & Indexation
- Site Performance
- Site Health
- Backlink Analysis
Please bear in mind that in this instance I am not familiar with the CMS and the difficulties/limitations that it may or may not have. I also do not have access to Google Search Console or Google Analytics.
It’s also worth mentioning that I will only be using a sample of crawl data due to the size of the website and the time it took to crawl.
I was initially able to crawl 790,099 URLs on the Jacamo.co.uk website before my crawl failed. After this, I had to use a trial version of DeepCrawl and use a much smaller sample size of 1,000 URLs.
If I had more time and access to more crawlers such as DeepCrawl, I would be able to crawl and analyse many more, if not all, of the URLs on the website.
Finally, this audit was conducted in early March 2020 so it is entirely possible that changes have been made since then.
The Current Landscape
This area simply looks at some of the elements that make up the website or are contained/used on the site as it is currently:
- Java Programming Language
- Criteo targeting
- Google Analytics
- Google Tag Manager
- Incapsula CDN
- HTML 5
- Mobile responsive website
Observations: From an initial look over the website, it seems to be returning the correct status codes for live (200) and redirected (301) URLs.
The biggest issues I can see are that the URLs contain a lot of parameters and aren’t as “clean” as they could be. For example this is the current accessories page:
Ideally this URL would be something like:
|Recommendations: I would recommend mapping out a new URL structure for the site and rewriting the URLs to a more friendly version, then making sure that the older versions are redirected to the new destinations to preserve link equity.|
Why is this important? When a crawler requests a page on your website, your web server returns an HTTP status code along with the response. It is very important to make sure that your server returns a status code with a value of 200, which is the equivalent of saying that everything is OK.
URL structure is also very important for any website that needs categorisation of some sort.
URL rewrites are important as they make pages much easier for search engine bots to crawl and allow them to easily determine the content or topic of a given page.
Rewriting the URLs and placing the relevant keywords within them will also further assist with keyword and topic relevance which will strengthen a given page and assist with improving rankings (it may not directly improve rankings as other factors will be at play, but will certainly help).
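To make the recommendation above concrete, rewrites of this kind are often handled with Apache mod_rewrite rules (assuming an Apache server, which would need confirming with the dev team). The pattern and destination below are hypothetical examples based on the URL style seen on the site:

```apache
# Illustrative sketch only: 301 an old parameterised URL to its clean equivalent.
RewriteEngine On
RewriteRule ^shop/accessories/_/N-[a-z0-9]+/products/show\.action$ /accessories [R=301,L]
```

Rules like this preserve link equity because the 301 tells search engines the clean URL is the permanent destination.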
Non 301 Redirects
Observations: There are instances of non-301 redirects on the website.
Redirections usually come in the form of a 301 status code, which tells Google that the URL has moved. In this case, after a period of time the old URL is generally removed from the index.
Those that come with other status codes therefore send other messages to Google. The most common is the 302, which tells Google that the URL has temporarily moved. In this case, Google will tend to keep the original URL in its index, as it understands that it will be reinstated at some point. However, Google has been known to treat 302s as if they are 301s if they are left in place for longer periods of time.
You can see the non-301 Redirects we’ve encountered on Jacamo here: https://drive.google.com/file/d/1AVVqVI1o3c8dcVw4kqdcq9gzmNS4ScbD/view?usp=sharing
|Recommendations: Assess these URLs and consider updating the redirects to 301s where possible.|
Why is this important? 302 redirects do not pose a problem in themselves, but it is recommended to update these to 301s where necessary, especially if they have a number of incoming links (as is the case with Jacamo). Link equity is also passed in this way from URL to URL, and although 302s still pass this, updating to 301 ensures that this will remain the case.
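In Apache, the difference between the two is a single directive argument. The paths below are hypothetical, for illustration only:

```apache
# Temporary move: Google tends to keep the old URL in its index
Redirect 302 /summer-sale /sale

# Permanent move: link equity consolidates on the new URL
Redirect 301 /old-category /category
```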
HREFLANG and Internationalisation
Observations: Jacamo does have an international audience, but it is not all on one domain. The jacamo.co.uk domain is set up specifically for the UK, and it appears that the jacamo.com domain is used for international audiences.
For example, we checked to see if there was an international URL structure in place, which there is but it is all on the .com domain and .co.uk is almost treated as a separate website (despite the same branding etc.) specifically for the UK site.
On the .com domain, this is just some of the hreflang setup:
These URLs are set up to display English content in each country, for example:
- Display English content for users in United Arab Emirates (AE)
- Display English content for users in Antigua and Barbuda (AG)
- Display English content for users in Anguilla (AI)
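For illustration, hreflang annotations of this kind usually look like the following. The URLs here are hypothetical, based on the country-targeting pattern described above:

```html
<link rel="alternate" hreflang="en-ae" href="https://www.jacamo.com/ae/" />
<link rel="alternate" hreflang="en-ag" href="https://www.jacamo.com/ag/" />
<link rel="alternate" hreflang="en-ai" href="https://www.jacamo.com/ai/" />
<link rel="alternate" hreflang="x-default" href="https://www.jacamo.com/" />
```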
Why is this important? There is no one set approach for internationalisation and we would not recommend changing Jacamo’s approach at this stage. From what we can see, HREFLANG is currently set up correctly and is working well on the .com domain.
Observations: It looks as though the homepage currently returns two possible URLs
After inspecting this page and viewing the Page Source, I can see that there is a canonical tag in place, which is good and shows Google that the correct version of the page is http://www.jacamo.co.uk/.
There are also instances whereby the default link to a sub-page from the navigation is a longer, less friendly URL, but the canonical tag present on the page itself points to a slightly cleaner version of the URL. Take the accessories page for example:
The default linked URL is:
However the canonical URL in this instance is:
|Recommendations: I would recommend changing the internal links to match their canonical counterparts.|
A potential issue here could be an external website linking to the non-canonical version of the URL; there is a risk that this non-canonical URL could get added to the Google index (this isn't the case currently, but it is a risk). It is always recommended, where possible, to add prevention measures before these types of issues arise.
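As a sketch, the canonical tag on the longer, parameterised version of the accessories page would look something like this (the clean path is an illustrative example):

```html
<link rel="canonical" href="https://www.jacamo.co.uk/accessories" />
```

Whichever URL variant a user or external site reaches, this tag tells Google which version should be indexed.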
Many URLs have mixed case; it is not recommended to use uppercase characters in URLs.
If external sites regularly link to pages on Jacamo using uppercase characters in the links, then Google may ignore the canonical tag and index the uppercase version of the page alongside the lowercase one, which will cause page duplication.
|Recommendations: I would recommend if possible requesting that the dev team add a rule in the .htaccess file that automatically resolves any uppercase characters into the default and more user / Google-friendly lowercase characters.|
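As a sketch, the usual mod_rewrite approach looks like this. Note that `RewriteMap` must be defined in the server or virtual host configuration (it is not permitted inside .htaccess itself), so this would need dev input:

```apache
# In the server / vhost config:
RewriteMap lc int:tolower

# Then in the .htaccess file (or vhost):
RewriteEngine On
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule (.*) ${lc:$1} [R=301,L]
```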
Observations: The website on the whole seems to have a fairly good internal linking structure – product pages are two clicks away from the home page, so it is really easy for users to find and purchase items.
Usually I would also run checks for orphaned pages but unfortunately my access to some tools has been revoked so I was unable to accurately check for this issue.
|Recommendations: N/A at this time.|
Why is this important? Internal linking is very important to the SEO strategy of your site.
Your website needs to have a good internal link structure and all your pages must be linked together for the search engines to be able to crawl and reach every single page. If some of your pages remain isolated from the rest of the website, crawlers will struggle to find out about their existence and index their content.
Observations: There is correct 404 handling in place; this essentially ensures that if a bot stumbles across a page that doesn't exist, the server returns the correct status code (404).
In some cases websites are incorrectly configured, which can cause non-existent pages to return a 200 code. This is problematic as those pages can potentially get indexed. If this happens, it means crawl budget is being wasted on useless pages and the index is bloated with useless pages that users can potentially find.
If I had access to Google Search Console, I could also check for soft 404 pages. These can often be actual pages (or in some cases old product pages) that are deemed thin or lacking usable content. In this instance, I would make a call on whether redirecting them to the closest relevant page or improving the quality of the content would be the appropriate action.
Crawling & Indexation
Observations: When looking in the robots.txt file I couldn't see any immediate issues; the only thing I would really advise in this instance is to specify the location of the XML sitemaps.
As a general rule, consider the following for your robots.txt file:
- Does it exist?
- Does it disallow all appropriate folders?
- Does it disallow certain folders you don’t want it to?
- Are folders you want excluded from indexation in this list?
- Have you referenced all XML sitemaps?
|Recommendations: Make sure there’s a valid robots.txt and it has all of the correct references.|
Why is this important? A well optimised robots.txt file can hugely benefit how search engines navigate your website. You can direct search engines to files such as your XML sitemap and also prevent them from crawling the less important pages such as the “terms and conditions” page. Each website is given a limited amount of crawl budget by search engines; a well optimised robots.txt file can reduce the amount of wasted bandwidth and in turn ensure that search engines spend more time crawling the important pages on the website, which can further improve organic visibility.
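A minimal, illustrative robots.txt covering the points above might look like this. The disallowed paths are hypothetical and would need checking against the actual site structure:

```
User-agent: *
Disallow: /checkout/
Disallow: /account/

Sitemap: https://www.jacamo.co.uk/sitemap.xml
```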
Observations: I was unable to find a sitemap for Jacamo, despite employing several methods to locate one.
This could be because there is no sitemap in place.
Ideally the sitemap would be straight after the root of the site, for example:
Sitemaps are important because they make it easier for search engines to find and index your website's pages. XML sitemaps are highly recommended and widely used; it is usually smaller websites that have less use for them, not ones of this scale.
This is because, even if your website is well linked, Google can easily overlook some of your pages, especially if they are newer, so it is a good way to make sure your content can all be discovered and indexed.
Contrastingly, if a website does not have sufficient internal linking then this is all the more reason to use a sitemap.
When using the site:jacamo.co.uk search operator, I can see that there are 84,200 results in the Google index. This is fine, but at the time I had to stop my crawl (which had been running for several days) due to computer processing power, there were over 1.6 million URLs in the queue.
Depending on what this content was, this is an extremely large amount of content to be missing from the index, and a sitemap would be a good way to make this content more visible. It would also help manage crawl budget: my initial crawl had been running for 4 days (plus another 2 to process the data before it failed completely) and had only crawled 790,099 URLs, which suggests vast crawl inefficiencies. If my crawlers (first Screaming Frog and then Sitebulb) were struggling, then it is likely Google could struggle too. Crawl inefficiency is a large issue and can be indicative of wider problems, as well as holding the site back from better rankings and visibility. I ended up using a much smaller sample of only 1,000 URLs from DeepCrawl.
At this stage, if I had access, I would refer to Google Search Console to check indexation and compare this with the sitemap URLs and search indexation numbers. This would also allow me to see if there are any other sitemaps that I may not have picked up when searching, as well as any other wider issues.
|Recommendations: I would recommend firstly creating an XML sitemap and including a link to this in the robots.txt for ease.
I would also compare the above figures to the indexation figures in Google Search Console to get a full picture. I would then, if needed, put in a request to update the XML sitemaps to reflect the true number of URLs, taking into account the 50,000 URL limit per sitemap, beyond which multiple sitemaps are needed. As the Jacamo website would exceed 50,000 URLs, I would segment them into sub-sitemaps grouped by category for efficiency.|
Why is this important? XML sitemaps are essential to all websites; they are like a library of pages which search engines use to effectively crawl and index your website. A well optimised sitemap can tell a search engine exactly where a page is, how important that page is to the website (through priorities) and when it was last modified. A search engine can then use this information to determine how often the content on a given page is likely to change and amend its crawl rate to be in line with the estimated page modification frequency, ensuring that any new changes to a given page are reindexed in a timely manner.
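For a site of this size, the usual approach is a sitemap index that points to category-level sub-sitemaps, each kept under the 50,000 URL limit. The filenames below are hypothetical examples:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.jacamo.co.uk/sitemap-jeans.xml</loc>
    <lastmod>2020-03-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.jacamo.co.uk/sitemap-accessories.xml</loc>
    <lastmod>2020-03-01</lastmod>
  </sitemap>
</sitemapindex>
```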
META Robots tag
Observations: The most obvious candidates for the meta robots noindex directive are the faceted navigation filters in the main categories; however, this is mainly handled through canonicalisation as opposed to noindex tags.
|Recommendations: There were no major issues that I could see in this area.|
Indexation and Accessibility conflicts
Observations: From what I can see, there don't appear to be any issues with the responsive design from a search engine perspective.
|Recommendations: No recommendations or concerns in this area.|
Why is this important? One of the most common indexation issues occurs when the entire website is blocked in the robots.txt file by mistake. Generally the only time you would block a website from being indexed is when it is still in development prior to the launch.
Some common errors that can appear in this report include: content wider than screen, text too small to read, and clickable elements too close together.
Observations: Overall the site speed for both mobile and desktop is below where it would ideally be:
When using speed measuring tools such as GTmetrix, it is easy to get caught up in the scores they use. For example, GTmetrix gives you an ‘A’ if things are good and an ‘F’ if they are bad. However, this is not always accurate. I find it much more accurate and rewarding to use the above metrics as a measure of speed and performance.
Ideally the desktop fully loaded time would be less than 3 seconds, but the average time on GTmetrix is 7.2s. Obviously this isn't realistic in all instances, especially on sites of this nature due to imagery, but there are definite improvements that can be made. I have listed below some key issues that, once addressed, should help improve page speed.
Leverage browser caching –
Page load times can be significantly improved by asking visitors to save and reuse the files included in your website. This is particularly effective on websites where users regularly revisit the same areas of the site, for example, they may visit the product page several times before making their purchase.
Dev access is usually needed to implement this but the rewards should be high.
However, upon further inspection these resources are all external, meaning they are not part of the main domain. Because of this, it is unlikely that Jacamo would have direct control over them to allow browser caching to be leveraged.
In light of this, we would not recommend that Jacamo leverage browser caching for these resources. Instead, we would suggest that they assess the use of these resources and whether the weight and load time they are adding to the site are justified.
Caching is beneficial as it stores some static information on the user’s hard drive in order to save those resources being called repeatedly from the server. This makes site loading faster. Compressing resources with gzip or deflate can reduce the number of bytes sent over the network.
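For any static assets that are served from the main domain, caching lifetimes are typically set with mod_expires. This is a sketch only, assuming an Apache server (which would need confirming with the dev team):

```apache
<IfModule mod_expires.c>
  ExpiresActive On
  # Long lifetimes for static assets that rarely change
  ExpiresByType image/jpeg "access plus 1 year"
  ExpiresByType image/png "access plus 1 year"
  ExpiresByType text/css "access plus 1 month"
  ExpiresByType application/javascript "access plus 1 month"
  # HTML should stay fresh
  ExpiresByType text/html "access plus 0 seconds"
</IfModule>
```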
Minimise Redirects –
There are a number of redirect chains on the site, which means there are multiple ‘jumps’ that search engines and users have to go through to reach the final URL. It is generally known that Googlebot will not follow more than 5 redirects in a chain; the typical guideline is to avoid chains longer than 3 URLs.
However, similar to browser caching, these resources are all external, meaning they are not part of the main domain.
Because of this, it is unlikely that Jacamo would have direct control over them to update the redirects in the typical way.
In light of this, we would suggest that they assess the use of these resources and whether the additional ‘jumps’ and load time they are adding to the site are justified.
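To illustrate how redirect chains are identified in practice, here is a minimal Python sketch that follows URLs through a redirect map built from crawl data, without making any network requests. The redirect map and URLs below are hypothetical examples:

```python
def follow_redirects(redirects, start_url, max_hops=5):
    """Follow a URL through a redirect map and return the full chain.

    Stops after max_hops redirects (Googlebot is generally understood
    to abandon chains longer than about 5 hops) or on a loop.
    """
    chain = [start_url]
    current = start_url
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        if current in chain:
            break  # redirect loop detected
        chain.append(current)
    return chain


# Hypothetical redirect map, as might be exported from a crawler
redirects = {
    "/shop/accessories/_/N-abc123/products/show.action": "/accessories-old",
    "/accessories-old": "/accessories",
}

chain = follow_redirects(redirects, "/shop/accessories/_/N-abc123/products/show.action")
print(" -> ".join(chain))
```

Any chain with more than two entries is a candidate for collapsing into a single 301 straight to the final destination.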
Image optimisation –
Some images could be optimised to reduce page weight. We’d suggest optimising these through a tool such as ImageOptim and then replacing the originals on the server with the newer optimised versions.
Reducing page weight means pages load faster, and page speed is a known ranking factor.
|Recommendations: I recommend working through the above areas to improve page loading speed.|
Why is this important? Generally a website should load as fast as possible for optimal performance. Website load speed is a confirmed ranking factor and as we can influence this we should prioritise performance fixes as much as possible.
Observations: When checking the code against W3C standards, a fair few issues were flagged; please see the link below for further information.
|Recommendations: I would recommend putting a request in with the development team to work through the code discrepancies and amend to meet the current standards. Search engines prefer when the code on a website meets the current web standards.|
Why is this important? It is essential that any code written for a website meets a given standard. The more “validated” code is then generally the better the quality of the web page and the easier it is for search engine bots to read it. This in turn will assist in the overall performance of the website as it will be much easier for search bots to read and understand the lines of code.
Tracking Code Implementation
Observations: The Jacamo site does currently have Google Tag Manager implemented throughout the site.
At this stage I would usually dig around in Google Analytics or other analytics tools to check tracking and, more importantly, what you are tracking. I would also check for macro goals (sales) and micro goals (events and assisted conversions).
Why is this important? It is vital for any website to ensure that they are tracking site visits properly through a form of analytics tool, the most common being Google Analytics. With Google Analytics we can not only track traffic sources but we can also break these down by channel e.g. direct, organic etc and by location & even country if needed. This is all essential data that can only be monitored once proper tracking is set up.
Internal Site Search
Observations: In this area I would analyse the current performance and usage of the internal site search function. This area gives great insights into the keywords that users have actually searched for internally on your website; we can also attribute conversions against internally searched keywords, which can help with future content campaigns.
Why is this important? Internal search queries can provide good insight into what visitors search for once they are in your site. If URL parameters are used (they are in most cases) then this can be integrated into Google Analytics where you can find some useful insights.
Observations: In this area I would analyse how the site is currently being crawled, average pages crawled per day and the time spent downloading a page.
These stats can give great insights. For example, if your website had seen an unusually high spike in pages crawled but no major changes had been made to the site, this could indicate an imminent Google penalty.
|Recommendations: I would recommend checking this area in Google Search Console daily so that any unusual spikes or dips, especially in pages crawled per day, can be acted on promptly.|
Why is this important? Crawl stats provide information on how long it takes Googlebot to download a page, kilobytes downloaded per day and the average number of pages downloaded per day. This is key information that can be used to improve your website. For example, if you noticed a huge drop in pages being crawled per day over a period of time, this likely indicates that there are issues with the website.
Observations: This area would look at the number of soft 404s and 500, 404 and 410 codes currently present on your website. I would also look at any invalid pages in the sitemap. This area can help find broken pages on your website, which you can then redirect.
|Recommendations: I would recommend using both Google search console and Google Analytics to search for broken pages and more importantly, broken pages that still receive traffic and pageviews.|
Why is this important? Crawl errors indicate the number of resources that Google has failed to crawl. This can be for a number of reasons, with the most common being a 404 (not found) error. This can occur when you move pages or products on a website without redirecting them properly. Google has stated that 404 errors are fine to have on a website; however, larger numbers can have a negative effect, as they use up valuable crawl budget on pages that no longer exist.
Penalty & Disavow Information
Observations: I would check to see if there is a current disavow file; if present, I would download it to double-check the URLs listed within to ensure that they are in fact bad links. I would also check the manual actions area in Google Search Console and the general messages to see if there have been any notifications of penalties.
|Recommendations: I would recommend regularly checking the backlink profile and actively attempting to remove any links that could be deemed harmful, either by adding to the disavow file or preferably, manually contacting the webmasters.|
Why is this important? Checking your site messages is an important task and should be carried out daily. The inbox in your Google Search Console account is the preferred method for Google to contact you if there are any issues with your website. If you get a manual spam action placed against your website you will be notified here; similarly, if you successfully get a manual action revoked you will also get a message here.
Content Checks and Optimisations
Typography & Content
Observations: The website makes use of the various H headings and presents the text in a clear manner on the main website. However, from my sample on DeepCrawl, I did find a small number of URLs that were missing H1 tags – these made up 1.7% of this crawl, and given that my initial crawl had audited 790,099 URLs with 1.6 million in the queue, extrapolating this 1.7% across the whole site would suggest around 40,632 URLs.
I also found examples of pages that had instances of duplicate page titles or descriptions or those that were not making the full use of available space in titles and descriptions on SERPs.
|Recommendations: Given the time and the right tools, I would look into which pages these H1 headings were missing from and assess whether it would make sense to add them in / adjust existing text.
I would also do the same with the meta descriptions and titles that I encountered here, and assess the value of these pages and whether optimising them to be more unique, or shorter / longer as appropriate, would be worthwhile.|
Why is this important? Good typography on a website can vastly improve conversion rates and overall user experience. Well laid out content that is easy to read will also help reduce bounce rates and encourage user interaction on the website and headings are a great way to do this. Although they are not essential, H headings are a good way to help Google figure out the context of your page.
Page titles and meta descriptions can also help here, and making sure they accurately convey your content can increase your CTR in SERPs, which leads to more traffic on site.
Click to Call Buttons
Observations: There does appear to be a click to call button on the contact page, and this is mobile responsive which is good.
Why is this important? People who browse on mobile devices are generally quite specific with their searches, this means that they are much closer to converting into a customer. For this reason it is always advised to provide a click to call button on the mobile responsive version of your website. This way if a user wishes to contact you they can do so with the tap of the screen. This is good for mobile user experience and will result in a higher conversion rate overall.
Observations: There are instances of keyword cannibalisation on the Jacamo website, but given the size of the website this is not unexpected.
This can cause significant problems as it can damage rankings. The thing is, it can be an easy problem to run into, especially for large e-commerce websites. It happens when a website targets a single keyword or phrase in multiple areas of a website.
This spreadsheet shows the keyword cannibalisation that is currently present on Jacamo: https://drive.google.com/file/d/1LTmxym4U8eo4zqqOgqJime5cKj6usk4-/view?usp=sharing
If we take the third row on the spreadsheet as an example, we can see that the keyword “jacamo jeans” currently ranks in first position (though cannibalisation can affect this), but there are 10 URLs which Google thinks are relevant for this one term, and Google is unsure which is the correct one to use.
|Recommendations: Cannibalisation can be resolved using canonicals. Typically (although this is much more difficult for ecommerce), instead of using the same keyword on every page, variations or long-tail versions should be used, linking back to the canonical source for the main term.
Redirects also help. If there are any pages that are no longer used or relevant, then these should be 301 redirected back to the one main page.
I realise that this is not always the best course of action for ecommerce websites, nor always possible. In this case, I would suggest monitoring performance to see how much ranking fluctuation does happen. If it is significant, then I would suggest looking at using varied keywords and considering some of the above fixes.|
Why is this important? Cannibalisation is a problem as it can affect rankings, as I mentioned. This is because Google does not know which is the single most relevant page for that query / keyword. When this happens, rankings tend to fluctuate as Google tries to determine which is the most relevant URL. So just because you rank in position 1 at the moment for “jacamo jeans”, if Google starts to rank the URL www.jacamo.co.uk/shop/jeans/_/N-1ytjt4g/products/show.action instead, you could slip to position 3.
Using redirects and canonicals as mentioned takes some of the leg-work away from Google by telling them which URL you would like to rank. When this is done, rankings usually start to stabilise.
Observations: The Jacamo website does have breadcrumb implementation.
This makes it really easy for users to retrace their steps to higher-level pages, as the breadcrumbs are anchor text links.
|Recommendations: There are no concerns with the breadcrumb implementations.|
Why is this important? Breadcrumb navigation can provide value to both users and search engines alike. In regards to the user, they can see the hierarchy of the website which can help them navigate up or down a level.
From a search engine perspective they can also assist bots in determining the hierarchy of a website and assist them with navigating to the deeper pages on a given website.
Observations: When running a sample URL through the structured data testing tool, it did not flag any errors – but only because no structured data was found.
|Recommendations: I would recommend discussing the initial implementation with the development team / Tag Manager and adding structured data, e.g. for the Organisation as well as more localised information for stores in certain cities etc.|
Why is this important? Properly implemented rich snippets on a website can help search engines further semantically understand the content of a given page, they can also show up in the search engine results pages when users are searching which can further encourage them to click through to your website.
When a business has multiple physical locations it is important to geo target your businesses to ensure that when a user in the local area is performing a search for a service or product that your business offers you would show up in the given area. The local listings usually show the business name along with the physical address and contact details. For mobile users this is key information as they can easily navigate to your business or give you a call directly from the SERPs.
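As a starting point, Organization markup added via JSON-LD (whether hard-coded or deployed through Tag Manager) would look something like the sketch below. The logo and social profile URLs are placeholders that would need replacing with the real ones:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Jacamo",
  "url": "https://www.jacamo.co.uk/",
  "logo": "https://www.jacamo.co.uk/logo.png",
  "sameAs": [
    "https://www.facebook.com/jacamo",
    "https://twitter.com/jacamo"
  ]
}
</script>
```

LocalBusiness markup could then be added in a similar way for individual store pages.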
Due to tool restrictions I am unable to see the whole picture when looking at the backlink profile; I have listed below some findings from the data that I can access.
Observations: From an initial scan over the anchor text (from the data that I can see), it looks to be a good mixture of variations of branded anchor text. I have put an image below of the top 20 anchor text for reference:
As you can see, there are no immediate threats in this list above, although there are some that we would like to take a deeper look at, e.g. “299” and “63282.jacamo.co.uk”, to ensure nothing is too spammy.
On the whole, the anchor text that is used is largely brand / product specific, although there are a couple more that refer to wallpaper downloads e.g. “get free high quality hd wallpapers hairstyle cuts for men” that we would usually disavow.
The threats are more from the lower quality directories that are currently linking to the site.
Observations: A large portion of the historic links are from the website thefashionexaminer.wordpress.com – these all appear ok and this website does appear to have topical relevance. The link text that is used here is “jacamo mensware online”.
There also appear to be links from coupon websites; these link to the homepage for the most part. I would also assume, because these are coupon codes, that the deals will expire at some point, so very little value is passed on, meaning these links should not pose much of a threat.
It looks like, from the data I can see, the recently acquired links are more varied in link type and depth.
As you can see from the below image, it appears as though the majority of the domains linking to the website are from reputable domains.
Observations: There are some backlinks that have become broken; this means the link either points to a destination that redirects to a broken page (404), or the target page returns a 404 without being redirected first.
When this happens, the link equity and authority is lost.
Here you can see the links that are broken: https://drive.google.com/file/d/1gkzyiBjiACaaNTxuTCofp1NJAQoFDHsq/view?usp=sharing
This spreadsheet also allows us to see the number of links and domains that are linking to these URLs and therefore the amount of potential equity loss.
|Recommendations: At this stage, I would usually conduct a deeper analysis if time and tools allow. I would do this to check the overall health of the backlink profile and take the necessary action to redirect / update links and disavow those that could pose problems.|
Why is this important? If the link profile isn't cleaned up, there is a risk of the website being hit by algorithm updates or even having a manual action placed against it, even while good quality links are being acquired. In this instance I would recommend a full link audit and clean up.
Having completed this audit, I've identified some areas that Jacamo could look to improve, and if I were working on their website this would give us a starting point. However, as mentioned at the start of this audit, there will likely be factors at play that we are unaware of that have led to the current website setup.