
SEO

For a general introduction to SEO you should read the respective article on Wikipedia. The techniques to improve the probability that your site will appear at the top of the search results of search engines fall into two categories:

  • off-page optimization
  • on-page optimization

Off-Page Optimization

Probably the most important factor is inbound links to your site (so-called backlinks), so you should strive to get these backlinks. However, you should avoid link farms or otherwise bought backlinks, as the modern algorithms of the search engines might detect these and, as a result, penalize your site. Modern social networking services might help in getting those links: make your site and the services you're offering known to others. If others find them useful, it's quite likely that they will “spread the word” (i.e. set links back to your site). Another option is to get backlinks from a superordinate or related organization. E.g. a sports club might get backlinks from related local or regional sports clubs, or from the respective sports association of the region or country.

On-Page Optimization

It's hard to tell exactly what you should do to optimize your site's pages for search engines, as the search engine vendors don't disclose their algorithms, and these are likely to change over time (e.g. in the early days of the web the keywords meta element was very important, but due to its misuse it's quite likely to be ignored by search engines altogether these days).

A general rule of thumb: a search engine tries to “read” a page as much like a human being as possible. So try to optimize your pages for humans, not for search engines! E.g. a human being as well as a search engine will regard headings (<h1> etc.) as more important than the following paragraphs, as a heading is typically displayed with a larger font. A particularly bad practice is to include (nearly) invisible information on a page that is solely meant to manipulate search engine results. The search engine might well detect this manipulation attempt and might penalize your site.
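To illustrate, a crawler essentially sees the markup below; a clear hierarchy of descriptive headings followed by related paragraphs helps readers and search engines alike (the club name and texts are, of course, made-up examples):

<!-- example only: descriptive headings followed by related paragraphs -->
<h1>Football Club Exampletown</h1>
<p>News, match schedule and contact information of the club.</p>
<h2>Next matches</h2>
<p>The first team plays at home next Saturday …</p>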

Besides the actual page content, there is some additional information that is important for search engines. This information can be set for the complete website (note that you have to do it for each language separately) under “Settings→Website”. Particularly important is the “Site→Title”, which is used for the <title> element and is displayed in the browser's title bar. The <title> is typically displayed as the heading in the search engine results. The “Meta→Keywords” are probably ignored by search engines, but you might nonetheless enter reasonable keywords there. The “Meta→Description” may be used by a search engine to display additional information in the search results. Usually this text is trimmed to at most 150–160 characters.
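In the generated page source these settings end up in the <head> of every page, roughly like this (all values shown are merely examples):

<head>
  <!-- example values only -->
  <title>My Website – Welcome</title>
  <meta name="keywords" content="football, club, exampletown" />
  <meta name="description" content="News, match schedule and contact information of the football club of Exampletown." />
</head>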

It's possible (and often reasonable) to override these settings for single pages. This can be done with the standard plugin Meta Tags.

A note on the <title>: this is actually constructed from the Site→Title setting plus the heading of the page, for example “My Website – Welcome”. Since CMSimple_XH 1.6 you can customize the general format by changing the configuration option Title→Format. Before CMSimple_XH 1.6.5, Meta→Title could be used to override the Site→Title (not the page heading!) for individual pages; since CMSimple_XH 1.6.5, Meta→Title directly sets the full title.

Duplicate Content

Duplicate content (DC) results from (nearly) the same page being available under different URLs. This is bad for SEO and has to be avoided. The first thing you should do is to rule out one of the two typical ways to access your site: http://www.example.com vs. http://example.com. This can typically be done in the website administration tool offered by your provider. It's important to use a 301 redirect (301 Moved Permanently) to signal to search robots that the redirect is permanent, not just temporary.
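If your provider's administration tool doesn't offer such a redirect, and your site runs on Apache with mod_rewrite and .htaccess files enabled (an assumption you have to verify with your provider), a rule along these lines in the .htaccess of the web root redirects the non-www host to the www host with a 301 status (swap the host names if you prefer the variant without www):

# assumes Apache with mod_rewrite; example.com is a placeholder for your domain
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]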

Particularly for CMSimple_XH you have to take additional measures. E.g. the first page of your website can be accessed with or without giving the page name in the URL (http://www.example.com/ and http://www.example.com/?Welcome). To cater for this and similar situations you can use a Canonical link element. You might want to use the Canonical plugin to automate this.
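Such a canonical link is just an additional element in the page's <head> that tells search engines which URL you regard as the preferred one, e.g.:

<!-- example: declares http://www.example.com/?Welcome as the preferred URL -->
<link rel="canonical" href="http://www.example.com/?Welcome" />

If this element is present on both http://www.example.com/ and http://www.example.com/?Welcome, search engines will treat the two as a single page under the given address; the Canonical plugin inserts such elements for you.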

Sitemaps

The sitemap protocol was introduced by Google and is recognized by the most important search engines. It is an important part of SEO for sites that couldn't be completely indexed otherwise (e.g. because they are built with Flash), and it is not strictly necessary for CMSimple_XH sites. However, a sitemap might be useful for CMSimple_XH sites too, and it can be created automatically by the Sitemapper_XH plugin.
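For illustration, a minimal hand-written sitemap is a simple XML file listing the URLs of your pages (the URL and date below are just examples; Sitemapper_XH generates the real thing for you):

<?xml version="1.0" encoding="UTF-8"?>
<!-- example only: one entry per page of the site -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/?Welcome</loc>
    <lastmod>2013-01-15</lastmod>
  </url>
</urlset>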

robots.txt

The Robot Exclusion Standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web crawlers and other web robots from accessing all or part of a website which is otherwise publicly viewable. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code. The standard is different from, but can be used in conjunction with, Sitemaps, a robot inclusion standard for websites. (from Wikipedia)

CMSimple_XH is not shipped with a robots.txt (the shipping with CMSimple_XH 1.5 and 1.5.1 was done by accident), as its content depends entirely on which contents the individual site owner wants to hide from or make available to crawlers. Furthermore, robots.txt has to be put in the root folder of your webserver, even if CMSimple_XH is installed in a subfolder.

It's not strictly necessary to use a robots.txt, but if you want to use one, the following might serve as a base, provided you have installed CMSimple_XH in the web root:

User-agent: *
Disallow: /2lang/
Disallow: /2site/
Disallow: /cmsimple/
Disallow: /content/
Disallow: /plugins/
Disallow: /templates/

This will explicitly deny crawlers access to the mentioned folders, as you'll probably not want the information in these folders to be read by them. But if no links to files in those folders exist anywhere on the WWW, crawlers won't scan those folders anyhow. Badly behaved crawlers might do so nonetheless, but those crawlers will probably ignore the robots.txt anyway. In particular, you have to decide for yourself whether to disallow access to /plugins/: some plugins might put files into their folder that are worth being crawled.
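If you also publish a sitemap (see above), you can additionally point crawlers to it from within robots.txt by adding a Sitemap line; the URL below is just a placeholder for wherever your sitemap is actually available:

# example: adjust the URL to the actual location of your sitemap
Sitemap: http://www.example.com/sitemap.xml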

There are several validators available which will check your robots.txt. You may try e.g. http://whois.net/robots-txt-validator/.

Further details on using robots.txt can be found on The Web Robots Pages.

Search Engine Submission

Explicitly submitting your site to search engines is not strictly necessary, as a search robot that finds a backlink to your site somewhere else will typically follow this link, and so your site will be indexed. But this might take some time, so you might consider submitting your site to different search engines. How this can be done is explained on Wikipedia.

Checking your Success

From time to time you should check the success of your SEO. You can do so by simply entering appropriate keywords into any search engine and seeing whether your site is already at the top of the search results. Don't expect too much for general keywords. Particularly if you offer services for your local surroundings, you should consider including the location as a search keyword. For example, your local football club will probably not be at the top of a worldwide search for “football club”, but it might well be at the top of a search for “football club {name of your city}”, which should be sufficient.

For a more general overview of how your site is ranking, you might consider checking its PageRank (there are several free online checkers available) and its ranking on Alexa, for example.

When all else fails

If you're not able to achieve the desired SEO results, you should consider hiring an SEO expert. This will probably not be cheap, but if a good ranking in the search engines is mandatory for your business, it might well pay off.
