Scheming the Schema: An Introduction to Structured Data

One of the major traffic sources for personal websites and blogs is organic traffic, i.e., traffic originating from search engines. People working on the web quickly realize that search traffic can also be bought through Pay Per Click (PPC) programs such as Google Adwords and Bing Search Advertising. However, those of us who do not have Venture Capital funded blogs to run still do things the old fashioned way … we write good content.

 

Content is still King

The good thing with this idea is that search engines such as Google do give more weight to things such as meta descriptions, meta keywords, and other Search Engine Optimization (SEO) wizardry that was once known only to a few. Web applications like Google Webmaster Tools (GWMT) and Bing Webmaster Tools (BWMT) are making it easier for website owners to recognize these things and highlight important data to search engines.

This move gives the power back to the people. Instead of focusing on complex topics such as Information Architecture, Content Taxonomy, Tagging and Keyword Optimization, you just have to worry about one thing, and one thing only: good quality, original content!

How do these search engines make sense of this content? Well, they look for patterns within the page. In fact, if memory serves me correctly, one of my classmates (who went on to become an ethical hacker) wrote a similar piece of code for his final year engineering project. It has been a decade since he wrote that code, so we can safely assume that search engine algorithms are now a lot smarter at extracting the right content. Smart enough to discern between the signal and the noise.

So how do site owners help search engines find important data?

One method is to use the tools that they provide us – the WMTs. However, each search engine has its own nuances, and hence a need was felt for a common standard … a standard which we, the content creators, can use to signal to the search engines … that hey … this piece right here … yes this bold sentence … is meant to be special. It might mean something special to the person searching for it … it’s a beacon … a beacon of data.

 

Defining Schemas for better results

This is where structured data comes into the picture. In June 2011, the major search engines came together and helped define the format of this structured data via schemas. You can see these formats on schema.org. An excerpt from their home page –

Search engines including Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages.

Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web.

What this means for us is that if we use the mark-up defined for each of the individual schemas in our content, then the search engines can display this data in interesting ways to the people searching for that very piece of information. Want to see an example at work? Here is one

Schema Display

Last week, I instructed my developers at EduPristine to start making use of certain schemas in our mark-up. It has been less than a week, and we can already see them popping up all over our search engine results.
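To make the idea concrete, here is a minimal sketch in Python of what such mark-up can look like. It builds a schema.org Event description and prints it as JSON-LD, one of the syntaxes schema.org supports (microdata and RDFa are the others). The function name, the event name, the date, the venue and the address are all made-up placeholders, and this is not necessarily the mark-up we used at EduPristine.

```python
import json


def event_jsonld(name, start_date, venue, address):
    """Return a schema.org Event description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,  # ISO 8601 date or date-time
        "location": {
            "@type": "Place",
            "name": venue,
            "address": address,
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values, purely for illustration.
markup = event_jsonld(
    name="Example Certification Workshop",
    start_date="2013-03-02T10:00",
    venue="Example Training Centre",
    address="Mumbai, India",
)

# JSON-LD is embedded in the page inside a script tag of this type.
print('<script type="application/ld+json">')
print(markup)
print("</script>")
```

The printed block would go into the HTML of the page that describes the event. Whether and how a search engine uses it in its listings is entirely up to the search engine.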

 

Why Should You Bother

It does not take a rocket scientist to understand that in the game of organic traffic acquisition, the more interesting and relevant your search engine listing, the higher the chances of getting a click on it. Thus, marking up your content is bound to increase your organic Click Through Rates (CTRs). You can get this data in Google Webmaster Tools. It is best to note your current set of keywords (along with their CTRs and average ranks) first, and then compare the change once you start including schemas in your content.
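The before and after comparison is simple arithmetic: CTR is clicks divided by impressions. A small Python sketch of it follows. The function name and the click and impression figures are hypothetical, not numbers from my account.

```python
def ctr(clicks, impressions):
    """Click Through Rate as a percentage; 0.0 when there are no impressions."""
    if impressions == 0:
        return 0.0
    return 100.0 * clicks / impressions


# Hypothetical figures for one keyword, before and after adding schemas.
before = ctr(clicks=40, impressions=2000)
after = ctr(clicks=66, impressions=2200)

print(f"CTR before: {before:.1f}%")
print(f"CTR after:  {after:.1f}%")
print(f"Change:     {after - before:+.1f} percentage points")
```

Compare keywords at a similar average rank, since a change in rank moves the CTR on its own.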

If you have not heard about structured data or implemented schema.org before then I can guarantee you an increase in your CTRs and in your organic traffic.

Understanding the __utmz Google Analytics Cookie

Google Analytics does its tracking using the Urchin tracking cookies. If you do not know what HTTP cookies are, then you need to start reading some of the articles that I am linking to!! They can add crunch to your digital marketing initiatives.

Do bear with me through this post, as I will get a bit technical. If you understand a bit of PHP, you will enjoy this post. For now, just keep this information with you; it will come to your aid later when you are talking to your developers!

Some Cookie Basics first

So if you have any kind of server side script running on your server, then you can create your own cookies! You should know that in a PHP server environment, the entire set of cookies sent with a request is available in the $_COOKIE superglobal variable.

Isn’t this awesome? What this means is that with one line of PHP code, you can refer to all the cookies that a visitor’s browser holds for your site.

If you take a look under the hood, then you will see that there are many cookies used by Google Analytics.
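For readers who do not use PHP, the same idea can be sketched in Python. The snippet below takes the raw Cookie header that a browser sends and turns it into a dictionary, roughly what $_COOKIE gives you in PHP. The function name and the header string with its cookie values are made up for illustration.

```python
def parse_cookie_header(header):
    """Split a raw Cookie header into a {name: value} dictionary."""
    cookies = {}
    for pair in header.split(";"):
        # Split on the first "=" only, because values may contain "=" too.
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments without an "=" sign
            cookies[name] = value
    return cookies


# A made-up header; real values will differ.
header = (
    "__utma=1.2.3.4.5.6; "
    "__utmz=1.1356000000.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic; "
    "theme=dark"
)
cookies = parse_cookie_header(header)
print(sorted(cookies))
print(cookies["__utmz"])
```

In a real application your web framework usually does this parsing for you, so treat this as a look under the hood and not as something to deploy.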

kidakaka utmz cookie

Each of these cookies has a certain purpose. The way GA utilizes these cookies is well documented, and you can find the developer note here.

So why the __utmz Cookie?

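The __utmz cookie contains the traffic source information in Urchin tracker format. It contains information about how and when an individual visitor reached your site, and it stays on the visitor’s browser between visits (for six months by default). That means when the visitor comes back to your site directly, this value remains unchanged; note, however, that a later visit through a different campaign or referrer overwrites it.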

Imagine a scenario wherein a visitor first sees a post of yours on Facebook, or on Google … or other traffic sources. The visitor comes to the site, checks out a few pages and goes away (… such a shame!!). After a few days, the same visitor comes back to the site via an ad. This time the visitor leaves his information on one of your contact forms (I do hope you are using these!).

How would you know that the user is a repeat visitor? The form will only contain data which is about the visitor’s current visit.

So, to whom would you attribute this event? Your advertisement (the latter) or your social media sharing (the former)? Do you believe in first impressions or do you believe in the recency effect?

Without information about the user’s previous visit to your site, you will always choose the latter, thus attributing your visitor acquisition to the wrong medium! And that’s why the __utmz cookie is so important! In effect it represents a unique visitor for your website.

I treat the __utmz cookie as equal to a real person!!

Associated with it is a story, a story which unfolds over a set of visits. Google Analytics can tell you this story if you are willing to sift through the data that is available to you.

What can we do with this Cookie?

How do we bite into this Cookie? Well, there is more than one way of cutting this cookie!!

  • If you have conversions on your site (Downloads, Sales, Contact Forms, etc), then you can always use the value in this cookie to track the origin of the visitor. This will give you a definite number on which traffic source (and which marketing campaign) is more effective for you when it comes down to conversions.
  • Each __utmz cookie has a unique set of values. This value can be stored in a Custom Variable, and you can then track a unique visitor across your site using this Custom Variable (the free version of Google Analytics allows 5 Custom Variables, so use them sparingly).
  • For the more technically sound platforms, you can use this and the other GA cookies to track the visitor ACROSS multiple visits. However, there are other tools which are far easier to set up for this, such as Mixpanel.
  • If you are betting a lot on acquisition of traffic via search engines, then the __utmz cookie can be mined for keyword data … yes, the terms which people are searching for when they come to your site. However, as of 2013, GA only provides the keyword data of users who have not signed in to their Google Accounts (around 30% of the data is not available).
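To use the cookie for any of the above, you first have to take its value apart. Here is a minimal Python sketch of that. The field layout (four dot-separated numbers followed by pipe-separated utmcsr, utmccn, utmcmd and utmctr pairs) is the commonly described format of the classic Google Analytics cookie, so treat it as an assumption and check it against a real cookie from your own site. The function name, the output keys and the sample value are made up.

```python
def parse_utmz(value):
    """Parse a __utmz cookie value into its numeric prefix and campaign fields."""
    # The first four dot-separated parts are numbers; the rest is the campaign
    # string, which may itself contain dots (e.g. utmcsr=facebook.com).
    domain_hash, timestamp, sessions, campaigns, campaign = value.split(".", 4)

    fields = {}
    for item in campaign.split("|"):
        key, _, val = item.partition("=")
        fields[key] = val

    return {
        "timestamp": int(timestamp),        # Unix time the value was set
        "session_number": int(sessions),
        "campaign_number": int(campaigns),
        "source": fields.get("utmcsr"),
        "campaign": fields.get("utmccn"),
        "medium": fields.get("utmcmd"),
        "keyword": fields.get("utmctr"),    # None when no keyword is present
    }


# A made-up cookie value for illustration.
sample = "1.1356000000.3.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=structured data"
info = parse_utmz(sample)
print(info["source"], info["medium"], info["keyword"])
```

The resulting dictionary is what you would store alongside a form submission or pass on to your CRM. A malformed value raises a ValueError here, so wrap the call accordingly in real code.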

Hmmm … crunchy delight!! And nutritious as well!!

I would love to hear from you how else we can use this and the other Cookies. In the meantime, do start integrating your Cookies with your CRM for more visitor intelligence!

The Cinderella Visitor

It’s surprising: every time I open Analytics, I learn something new about this product and something new about my site … like the fact that I have to deal with Cinderella Visitors!

Here is a maths problem for you. In fact, if you have pored through your Google Analytics account, I am sure you too have been stumped by the same problem.

Below is a screenshot of my site’s monthly traffic stats –

Monthly Stats

Now, yes … these are humble beginnings! That is why some of you should visit this site more often!! So if you look at the unique visitors (that’s the total number of visitors coming to the website), then it is 530. This is good, that means through the blog I am getting in touch with at least 530 people a month!! Holy cow!! That’s roughly 18 conversations in the day.

Out of those 530, a huge number are new visitors (511). So does that mean only 19 visitors are returning to the blog for those 126 visits? Hmmm … that should not be hard to find out. One look at the Returning Visitors Advanced Segment should shed more light on this …

Returning Visitors

Well, this seems to be the problem, doesn’t it? Things just don’t add up here! 45 visitors are responsible for those 126 visits, which means that the other 485 visitors are responsible for those 511 New Visits. How can this be possible? These 485 people have come to the site ONCE ( … sigh!! as the old song goes, “O departing one, come back if you can” …), but the number of visits tells me otherwise.
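The mismatch is easier to see when the arithmetic is written out. The Python lines below only restate the figures quoted above; the variable names are mine.

```python
# Figures quoted above from the monthly stats and the Returning Visitors segment.
unique_visitors = 530
new_visits = 511
returning_visits = 126
returning_visitors = 45

# Visitors who are not in the returning segment, i.e. who came only once.
one_time_visitors = unique_visitors - returning_visitors

# New Visits that have no one-time visitor to account for them.
unexplained_new_visits = new_visits - one_time_visitors

print(one_time_visitors)       # 485
print(unexplained_new_visits)  # 26
```

So 26 of the New Visits cannot be explained by counting people alone, which is the puzzle the rest of this post works through.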

Which brings us to the real question: what does Google Analytics define as a visit? Here is an insightful article from Google about How Visits are Calculated in Analytics. There we have our answer –

A single visitor can open multiple visits. Those visits can occur on the same day, or over several days, weeks, or months. As soon as one visit ends, there is then an opportunity to start a new visit. There are two methods by which a visit ends:

  • Time-based expiry (including end of day):
    • After 30 minutes of inactivity by the visitor
    • At midnight

Ahhh! So if the visitor comes to the site, sticks around for more than 30 minutes without doing anything and then clicks on another link, then it’s considered ANOTHER visit. This would have been a sufficient reason IF the average visit duration was on the higher side, but since that’s not the case (and I write reasonably short posts too!), the other trigger, midnight, seems to be in action here!

At least 26 visitors (511 New Visits minus the 485 one-time visitors) must have been on the site around midnight!! A quick check on my GA for hourly visits confirms my suspicions! I had 26 visitors on the website around midnight … and as the clock struck twelve, these Cinderella visitors fled (or at least Google Analytics made them fade away) and came back as New Visits!! By the way, if you are wondering why you have not heard the term Cinderella Visitor before, do not worry … I just coined it!
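The two time-based rules can be sketched in a few lines of Python. This is a simplified model of the rules quoted above, not Google’s actual implementation: it ignores time zones and any non-time-based ways a visit can end, and the function name and hit times are made up.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def count_visits(hit_times):
    """Count visits for one visitor from a list of page-hit datetimes."""
    visits = 0
    previous = None
    for hit in sorted(hit_times):
        new_visit = (
            previous is None
            or hit - previous > SESSION_TIMEOUT   # 30 minutes of inactivity
            or hit.date() != previous.date()      # the clock passed midnight
        )
        if new_visit:
            visits += 1
        previous = hit
    return visits


# A made-up visitor reading across midnight: two hits, ten minutes apart.
hits = [datetime(2013, 1, 14, 23, 55), datetime(2013, 1, 15, 0, 5)]
print(count_visits(hits))  # 2
```

One person, ten minutes of reading, and yet two visits, purely because the date changed in between.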

So where are the glass slippers? The glass slippers are the ubiquitous utmz cookies!!

Data Highlighter by Google Webmaster

If you are a webmaster or own a site, then it is likely that you know about Google Webmaster Tools (WMT). If not, then the first thing you need to do is bookmark my blog! Then head on over to Google Webmasters and register your site NOW! I cannot stress this enough. Google Webmasters allows the webmaster to slightly influence the way in which the Google bot crawls your website. Why is this important? Well, you can actually tell Google to crawl certain pages of your website, and ask it not to crawl other, less important pages. Typically, you would do this using Sitemaps (do remind me to write about this sometime later!).

The latest feature rolled out by Google WMT is the Data Highlighter. You will find this in the Optimization section of Google WMT. At present this nifty feature lets you highlight event based data on your website. So if you are a training institute for financial certifications like us, then you can easily benefit from this feature. All you have to do is create a page set, i.e. a set of pages which have the data that you need to highlight.

Page Sets in WMT

I created a page set called Batches (which is a set of pages for all our classroom centers in India). Then, the WMT wizard started and asked me to point out specific event meta information on those pages: the name of the event (for us it was events such as CFA Level I, Fin Mod, FRM Part I, etc.), event date and time, venue and address.

Data Highlighter by Google Webmaster

Now, the next time the Google bot crawls this site, it will simply read and store this additional information and display it in the Search Engine Results Pages (SERPs).

Here is a really helpful video by the good folks at Google WMT explaining this feature –

Custom Reporting in Google Analytics

Google Analytics rolled out more than 5 years back, and it has been rocking ever since. The first free enterprise class analytics product has kept on adding more and more awesome features every year. Especially after the acquisition of the Urchin tracker system, GA has been the de facto analytics system for most websites.

Of course for the more seasoned people out there who cannot get their exact set of data from GA, there are other niche analytics products. In fact Avinash Kaushik has an entire chapter dedicated to this in his kickass of a book Web Analytics: An Hour a Day (I recommend that you do read this!!)

I have been seriously working on GA for about a year or so now, and the more I use this tool, the more I learn about how little I know! That’s the thing with knowledge: by the time you know a lot, you think you know very little. I wish the reverse were true as well :-D

The thing with GA, and what turns off most people, is the sheer volume of data it can show in those pretty little orange pages. A cursory glance gives you loads of data; however, to get an insight, you have to sift through this data: create segments, look at those segments and search through various reports to find that one insight which will help you drive more traffic, more leads, more sales to your site.

This is where Google has shown its brilliance. They have allowed web developers, analysts, webmasters and business intelligence guys to actually work together and create custom reports and custom segments which can be shared. Yes, so I can burn the midnight oil trying to find which content works best for my site … however, if I had to do the same for another website, I would have to re-create all those steps all over again. What a colossal waste of time! But now, I can simply share that report/segment and voila!!

In the next few days, I will be sharing more custom segments as well as custom dashboards which you can simply import into your Google Analytics and start right off!

Entrepreneurs in 2012

With the year 2012 coming to an end, I have been thinking about entrepreneurship for quite some time now. In fact, I am currently thinking about bootstrapping my own setup with a friend. Having said that, it is interesting to note how the face of entrepreneurship has changed in society.

There used to be a time when having a government job would automatically include you in the elite of society. Every family would aspire to raise their son to have a government job. Running your own business would mean years of hardship and minimal chances of making it big.

However, as the License Raj has come to an end and the government has slowly liberalized over the past decades, entrepreneurship has turned over a new leaf. It has started becoming synonymous with capitalism, and everyone who goes to the IITs or IIMs dreams of starting their own firm and making it big one day. I share that dream – a dream which I hope to make real some day.

The infographic below shows how entrepreneurship was in the year 2012. Almost half of the entrepreneurs who had managed to survive and scale into becoming SMEs are optimistic about the year ahead. Yes, 2013 is going to be a good year.

DNA-infographic

But being optimistic does not mean taking unnecessary risks. It means being prepared … being prepared for the good, and being prepared for the bad as well. Yes, it means insuring against the bad, and thinking along the lines of business insurance.

The thing about start-ups is that many firms do not realize the sheer amount of business risk they face day in, day out. These are called operational risks: the risks we face in day to day operations. It could be a simple thing such as forgetting to file the taxes, or neglecting to buy the license for some critical software that you require. In India, start-ups do tend to cut corners … not to make a quick buck, but to avoid the load of paperwork that comes with it. I am sure that would be the case in other countries as well.

Take a look at this infographic I found through Hiscox Business Insurance. It was here that I learnt about the myriad business risks. Now the question is, why is entrepreneurship the road to success? Well, take a look at the top internet billionaires list. Almost all of them have touched their first million in the past ten years, and their first billion in the past 5. That clearly shows how the power of a good idea can launch a start-up to stardom.

These things cannot happen without proper preparation. Take a look at the top 4 billionaires on that list: 3 of them work for Google. One is the CEO (the man with the plan), and the other two are the founders. A good idea will attract funds, yes … but a good idea requires excellent execution skills and proper risk mitigation tactics to avert calamities. So entrepreneurs, dream big and prepare for the worst!!

Now find exactly how many people are bidding for your keyword

Search Engine Marketing (SEM) is fast becoming the de facto traffic generating mechanism for people who have deep pockets. This, in addition to the fact that Google is really aggressive about its growth in the Indian markets, puts Google Adwords right up there in the strategy of any digital marketing executive.

I have been working on the Adwords interface for the past 18 months or so now and get excited whenever I discover a new and useful feature in the web app. One such awesome feature is the single keyword bidding tool.

Where do you find this?

You find this feature in the Keywords interface of any campaign. Be sure that you have enough data points (a one-week period works for most of my keywords).

The tool is only available when you have selected exactly one keyword.

Google keyword bidding

There is a small bar graph icon for the keyword bidding tool. Simply select the keyword, click the icon, and view the bidding and the competitors.

Google Keyword Competitors

What can you do with this?

This data is good, but what can one do with this data? Well, this gives you a wealth of information –

  • How many people you are competing against (e.g., I did not expect ask.com to be running an ad on this keyword)
  • How are you doing vis-a-vis your competitors in SEM (yes, we are beating the crap out of the rest ;-))
  • The approximate budget spends of your competitors
  • A bit more research on the landing pages can help you identify what your competitors are doing right and you are doing wrong

This is one fun tool that the folks at Google Adwords have churned out.

Economics teaches us that when information is provided to everyone, then the wealth extraction within the market is at a maximum. The Adwords bidding tool is one such great example, wherein I (and all my competitors) have the same access to data. The two players who will end up benefiting the most from the correct usage of this tool are the bidders who make fast decisions, and Google.