Machine learning at Google, and its applications to search rankings and other product innovations, isn’t talked about much in the marketing blogosphere. It’s critical, however, to understand that Google is pushing this type of technology throughout its enterprise.
One strong sign that Google is years into machine learning is TensorFlow, an open source software library for machine intelligence that they created and gave to the world to iterate on top of.
They have openly admitted they use TensorFlow in assisting with search results. Jeff Dean calls attention to it in this video at the 27-second mark.
Aki Balogh, the Founder of MarketMuse, helps define it further.
“Machine learning and artificial intelligence are technologies that rely on computers to notice patterns in data. When you combine this with large volumes of data (‘Big Data’), you get systems that are really good at specific tasks, because they can pick up on patterns millions of times faster than a human. Often purpose-built systems like MarketMuse or what Brian calls attention to below that Google might use, are called Artificial Narrow Intelligence (ANI). Combining the pattern recognition of an ANI system with the creativity of a human being is a winning combination.”
Machine learning could be applied at Google in the following ways. Let’s take a look further.
Enter Machine Learning Use Case #1
Take the link disavow database
When you need to build a data set in machine learning, you need examples, examples, and more examples.
As you collect more and more data samples that share a consistent, clean contextual reason for being submitted, you can teach the system further. Over time, the system becomes smarter: it can take in more and more examples of commercialized links and discredit their ability to manipulate or help the pages they point at.
Devaluing commercialized backlinks
When Google launched the disavow file, they launched the largest-ever mass submission of examples of commercial, shoddy, dodgy or malicious backlinks. Their database of bad backlinks must be massive, and those examples can feed into a platform like TensorFlow quite well.
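To make the idea concrete, here is a minimal sketch of how labeled link examples could train a classifier. It is not Google’s method and it does not use TensorFlow; it is a tiny logistic regression in plain Python. The three features, the training rows and the idea of using "appeared in a disavow file" as the label are all assumptions invented for illustration.

```python
import math

# Hypothetical features per link (all invented for illustration):
# (exact-match commercial anchor text, linking page is off-topic, link is sitewide)
# Label 1 = the link was submitted in a disavow file, 0 = it was not.
TRAINING_LINKS = [
    ((1.0, 1.0, 1.0), 1),
    ((1.0, 1.0, 0.0), 1),
    ((1.0, 0.0, 1.0), 1),
    ((0.0, 1.0, 1.0), 1),
    ((0.0, 0.0, 0.0), 0),
    ((0.0, 0.0, 1.0), 0),
    ((0.0, 1.0, 0.0), 0),
    ((1.0, 0.0, 0.0), 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the estimated probability that a link is a 'bad' link."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(examples, epochs=2000, learning_rate=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_LINKS)

spammy = predict(weights, bias, (1.0, 1.0, 1.0))  # shows every warning sign
clean = predict(weights, bias, (0.0, 0.0, 0.0))   # shows none of them

# One possible way to "devalue" a link: scale its value by how clean it looks.
link_value = 1.0 - spammy
```

The more labeled examples a system like this sees, the better its weights become, which is why a large pool of disavowed links would be valuable training data.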
Discrediting negative SEO attacks
The Google spam team has to deal with legit sites getting attacked by outside sources that aim to de-rank websites with link spam. One can see why this could be a major problem as competition heats up in high-dollar verticals.
Hacked websites and underground network link purchases can be used by competitors to damage a site. It’s critical for Google to ascertain the common characteristics of those types of links and simply discredit them. This could be accomplished with machine learning. In 2006, Google could not do this very effectively. Enter machine learning use case #1, which is, I think, one of the more obvious ways Google could use this technology.
Hummingbird, Topic Relationships, Entities and Natural Language Processing
Over the last few years, you started ranking for terms you never optimized for, and in some cases you probably haven’t noticed. That could be on the back of machine learning, as Google uses something like SyntaxNet and other NLU techniques to quickly draw associations and commonalities.
Further down this rabbit hole is something known as entity search. It’s broken up into named entities and search entities. These classifications further optimize Google’s ability to show the correct information for the different types of queries users commonly or not-so-commonly search for (i.e. head terms and long-tail terms). This is why you don’t have to include exact-match keywords in your copy anymore: Google will ‘help you out’ by tying together associations automatically.
As Gianluca Fiorelli puts it, optimizing for semantic search is optimizing for meaning. If it’s not clear, semantic content optimization is now a thing, and this should be an area of focus for webmasters looking to capture meaningful traffic from Google. Oh, and by the way, users tend to like things written by real experts, and having expertise behind your content can naturally create these semantic associations as a side effect.
As an aside, the people who develop this innovation at Google do things like trying to build the most audacious RV ever imagined. The people who work on and have contributed to all these changes and new technology are some of the smartest and boldest minds in the world.
Google Now, swiping to remove content
Training classifiers and delivering personalized context in the form of relevant news and offbeat topics is the hole I see Google Now filling. Google Now works like a Facebook feed that delivers relevant, timely links drawn from Google’s treasure trove of geographic usage signals. This data is crunched very quickly to determine a list of articles, nationally and locally, that might suit your fancy.
The premise is simple: if you like a news item you can opt to click on it, sending signals to Google.
If you don’t like an article then you can swipe right and send it to the trash bin.
Sometimes Google even tries to determine whether the timing of the item was off.
As you feed the machine, it gets better over time and starts to learn what types of information to show you and when to show it to you. Undoubtedly, Google Now was built with machine learning at the forefront to create long-term product usage.
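The click-or-swipe loop above can be sketched in a few lines. This is a toy model under stated assumptions, not how Google Now actually works: the `FeedPreferences` class, the topic names and the learning rate are all hypothetical. Each topic starts with a neutral score, a click nudges it toward 1, a swipe nudges it toward 0, and the feed is ordered by score.

```python
from collections import defaultdict


class FeedPreferences:
    """Toy per-user preference model (hypothetical, for illustration only)."""

    def __init__(self, learning_rate=0.2):
        self.learning_rate = learning_rate
        # Every topic starts at a neutral 0.5 until the user gives feedback.
        self.scores = defaultdict(lambda: 0.5)

    def record(self, topic, clicked):
        """Move the topic score toward 1.0 on a click, toward 0.0 on a swipe."""
        target = 1.0 if clicked else 0.0
        self.scores[topic] += self.learning_rate * (target - self.scores[topic])

    def rank(self, topics):
        """Order candidate topics from most to least preferred."""
        return sorted(topics, key=lambda t: self.scores[t], reverse=True)


prefs = FeedPreferences()
for _ in range(5):
    prefs.record("local news", clicked=True)         # user keeps clicking
    prefs.record("celebrity gossip", clicked=False)  # user keeps swiping away

ranked = prefs.rank(["celebrity gossip", "weather", "local news"])
print(ranked)  # ['local news', 'weather', 'celebrity gossip']
```

A topic the user has never reacted to ("weather" here) stays in the middle, which matches the intuition that the system only gets sharper as you feed it more signals.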
Google Maps and Google Local: Leave Feedback on Places You’ve Been To
Google Maps is one of the company’s older products, but you can see clear signs that Google is leveraging machine learning techniques now that they can push notifications to you through the mobile interface. Here is an example of how Google asks for feedback after inferring, with reasonable confidence, that you visited a local business.
This data can help Google with semantic associations that they could later leverage for greater personalization or enhanced results.
Can you think of any product enhancements Google might be leveraging through Machine Learning and other AI techniques?
Update: Further reading from Bill Slawski: Machine Learning Inside Google