Search engines love human-written content. This is true for all of them, but especially for Google. It is one of the reasons they recruit text analysis specialists and develop context analysis tools. It is assumed that synonyms and the overall context surrounding a link give that link greater weight. Google can recognize sets of words that are used in the same context and are similar in meaning. You can easily test this by going to google.com (English version) and typing “remove spyware” into the query box. You will see not only the query words highlighted, but also the word “removal”.
That is the simplest use of context analysis, and there is much more. The whole search algorithm is one large formula of filters, weighting factors, context and links. The Google patent covers only some of the factors used for ranking pages. However, it clarifies some tendencies. The first is that Google is looking for authoritative sites and content. The second is that Google is trying to reduce the number of spam pages in the SERPs.
This brings us to the question of the day: how can spam pages be distinguished? There are many ways, and some of them are more overlooked than others. One kind of spam is scraper pages, that is, pages with content taken from other sites. Google is successfully eliminating them from the index. They are easy to identify when the search engine cache holds the page history.
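To make the scraper case concrete, here is a minimal sketch of one common way to compare a page against an earlier copy: word shingles plus Jaccard similarity. This is not Google's actual method, which is not public. The function names, the shingle size and the 0.8 threshold are all assumptions chosen for illustration.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def looks_scraped(candidate, original, threshold=0.8):
    """Flag candidate as a likely copy if it shares most shingles with original.

    The threshold is a hypothetical value, not a known search engine setting.
    """
    return jaccard(shingles(candidate), shingles(original)) >= threshold

# Made-up example texts: an "original" page, a near copy, and an unrelated page.
original = "How to remove spyware from your computer in five simple steps"
copy = "How to remove spyware from your computer in five simple steps today"
unrelated = "A short review of several code editors for web developers"

print(looks_scraped(copy, original))       # near copy of the original
print(looks_scraped(unrelated, original))  # shares no three-word shingles
```

If the engine has an older cached version of the original page, the copy that appeared later is the likely scraper.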
However, there is another way of spamming: automatically generated pages. Depending on the generation algorithm, their detection is harder, but solvable with text analysis. Google assumes that automatically generated pages always deserve a lower ranking than other pages, and perhaps even penalizes sites that use such approaches. But there is a catch: sometimes such pages can be fully legitimate and interesting for searchers. Let's imagine a simple scenario: a website owner has a database of component parts, for example 1000 of them. Each part has only a few parameters. The owner puts them on the web, each one on a separate page. There is no way to write an additional description for each component, as they are quite similar, perhaps differing in a couple of numbers. However, each of these pages will rank lower for its component name than pages that merely ask about the component (e.g. forums). The more interest there is in the data, the lower the chance that the answer will be found using Google.
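The parts database scenario can be sketched in a few lines. The template, the part names and the parameters below are all invented for illustration, and difflib's SequenceMatcher is used only as a convenient stand-in for whatever text analysis a search engine really applies. The point is that legitimate pages built from one template look almost identical to each other.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical page template and parts data (not from any real site).
TEMPLATE = "Part {name}: resistance {ohms} ohm, tolerance {tol} percent, in stock."

parts = [
    {"name": "R-100", "ohms": 100, "tol": 5},
    {"name": "R-220", "ohms": 220, "tol": 5},
    {"name": "R-470", "ohms": 470, "tol": 1},
]

# One generated "page" per part, split into word tokens.
pages = [TEMPLATE.format(**part).split() for part in parts]

def similarity(a, b):
    """Similarity ratio between two token lists, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()

# Compare every pair of generated pages.
scores = [similarity(a, b) for a, b in combinations(pages, 2)]
for (i, j), score in zip(combinations(range(len(pages)), 2), scores):
    print(f"page {i} vs page {j}: {score:.2f}")
```

Every pair scores high because only the name and a couple of numbers change, which is exactly the pattern a filter tuned against generated spam would pick up.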
Thus there is an issue with Google's spam filter today. Hopefully it will be solved.