Wednesday, August 04, 2010

Blogger's Font Controls and HTML Generation

Not exactly the post I started with... but while putting together a new guide I got fed up with Google's new in-Blogger picture loading system. Quite simply, the newly exposed in-Blogger composer is not ready for any kind of usage. After filing a bug report in the Blogger help section for the stuff that's broken, I decided to take on the in-Blogger font controls as well. In all my time on Blogger I've never really said much about the behind-the-scenes controls exposed for fonts in the content generator. I've generally accepted that Blogger is a relatively free-of-financial-cost product that is made available because it leverages Google's core competency: Advertising Distribution through Search.

Google's continued development of Blogger, as well as its exposure as a tool for Commercially Oriented Blogs, finally warranted something being said on the sad state of the internal font controls as they are applied to user generated content. The font controls of the old Blogger interface were pathetic at best. Let me demonstrate with a picture of the old in-Blogger composer as accessed through Firefox.

Here I've typed out 3 lines of text. Now I am going to apply a single Font change to these 3 lines:

Now I'm going to access the Edit Html tab at the top of the composer:

Each line of text has been given its own separate span tag. Now I'm going to go back to the Compose tab and select Font Size.

With my font size set as small, I'm now going to click back on the Edit Html tab:

My entire file size has exploded. I checked in Open Office. There are 339 characters in this HTML generation. The original text only had 92 different characters. Blogger's composing system added 247 new characters, more than tripling the raw text. I now have 5 different sets of commands to set the font size to 85% and 3 different commands to set the font itself to Courier.

The actual amount of HTML code needed to generate the text I want is significantly smaller:

As can be seen, with just a total of 149 characters I have the same exact text formatting output.
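To make the difference concrete without the screenshots, here is a minimal Python sketch of the same idea. The sample lines and both markup strings are made up for illustration; they are not Blogger's actual output, and the `MarkupStats` and `measure` names are my own. It only assumes the pattern described above: one set of span tags per line versus a single wrapper around all of the text.

```python
from html.parser import HTMLParser


class MarkupStats(HTMLParser):
    """Collect visible text length and start-tag counts from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.tag_counts = {}

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] = self.tag_counts.get(tag, 0) + 1

    def handle_data(self, data):
        self.text_chars += len(data)


def measure(fragment):
    """Return total size, visible text size, markup size, and tag counts."""
    parser = MarkupStats()
    parser.feed(fragment)
    parser.close()
    return {
        "total": len(fragment),
        "text": parser.text_chars,
        "markup": len(fragment) - parser.text_chars,
        "tags": parser.tag_counts,
    }


# Hypothetical sample lines; not the text from the screenshots.
lines = ["First line", "Second line", "Third line"]

# Redundant style (assumed): every line gets its own font span and size span.
redundant = "<br />".join(
    '<span style="font-family: courier;"><span style="font-size: 85%;">'
    + line
    + "</span></span>"
    for line in lines
)

# Compact style (assumed): one wrapper carries both declarations for all lines.
compact = (
    '<span style="font-family: courier; font-size: 85%;">'
    + "<br />".join(lines)
    + "</span>"
)

redundant_stats = measure(redundant)
compact_stats = measure(compact)
print("redundant:", redundant_stats)
print("compact:  ", compact_stats)
```

Both fragments carry the same visible text, so any difference in the `total` figure is pure formatting overhead. Pasting real composer output into `measure` would give the same breakdown for an actual post.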

So that's the old Composer. One would think that Google's new Composer system would fix some of this ludicrous over-use of HTML tags.

The short answer is no: the new composer actually makes the HTML generation even WORSE than before. The browser that you use will also influence just how many redundant HTML commands are included in the Blogger HTML generation. So here we have the Blogger composer under Google Chrome using the older editing interface.


Our Text here uses 62 characters. The Blogger HTML generator uses significantly more characters:

No, your eyes are not deceiving you. That is 1,097 characters. More than 1,000 characters have been added to the document source just to describe the formatting on 4 lines comprising 62 characters.

Is the new Composer system any better? Well, here's the plain text in the new Composer Interface:

Again, this is just 4 lines of text with just 62 characters set to Font Courier with a size of small. This is the HTML generation in the new Composer:

Believe it or not, the amount of HTML has actually gone up, with a total character count of 1,168 characters. The new Composer system adds 71 more characters on top of the old one. The entire system is OUT OF CONTROL.

Now, I realize that all of the pictures shown thus far deal with converting existing text into the text format I want to use. Is the system really this bad when I am just generating text as a new input? To find out I'm going to use Seamonkey's Composer system to generate the html, and then do the same generation in Google's online system.

With Seamonkey I was able to specify the font type of Courier and font size of small before I actually entered the text:

This is the HTML Source that Seamonkey Composer generated from 63 characters:

As can be seen, Seamonkey Composer took 297 characters to fully describe the text, the line breaks, the page, and the pertinent meta data needed to present the text as a standalone HTML page. Of this, only 162 characters were used to actually describe the text itself, including the line breaks. Of those 162 characters, 134 characters are used to describe the text without the line breaks.

Of the actual HTML commands, Seamonkey Composer correctly used only two commands to dictate the text formatting. One 'small' command to dictate relative font size, and one font-change command. The result is a relatively efficient use of HTML.

Now, for the comparison, here's how Google Chrome does with Google Blogger. There is another bug with the new content generation system right here. I cannot dictate a font size and type of font before I start entering text. I have to actually enter some text first, then I can specify the size of the font and the type of the font.

As can be seen, I have not typed out the full 4 line character test from before. Rather, I am going to make a single carriage return and type a single letter:

With only 13 characters in the edit box, this is what my HTML generation looks like:

Google Blogger has used 497 characters to describe just 13 characters.
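For anyone who wants the numbers side by side, the counts reported so far can be tabulated with a few lines of Python. The figures are the ones quoted in this post; the labels and the `overhead` helper are my own naming, and for Seamonkey I used the 162 characters that describe the text rather than the 297 for the whole standalone page.

```python
# (label, characters of text, characters of generated HTML), as quoted in the post.
measurements = [
    ("Old composer, Firefox", 92, 339),
    ("Hand-written HTML", 92, 149),
    ("Old composer, Chrome", 62, 1097),
    ("New composer", 62, 1168),
    ("Seamonkey, text portion only", 63, 162),
    ("New composer, typed fresh", 13, 497),
]


def overhead(text_chars, html_chars):
    """Return (characters added, HTML size as a multiple of the raw text)."""
    return html_chars - text_chars, html_chars / text_chars


for label, text_chars, html_chars in measurements:
    added, ratio = overhead(text_chars, html_chars)
    print(f"{label:<30} {text_chars:>4} -> {html_chars:>5}  "
          f"(+{added}, {ratio:.1f}x)")
```

The ratio column is the one to watch: it shows how many characters of HTML are stored for every character I actually typed.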

Again, I realize that by and large Google's Blogger system is a free-of-financial-cost tool. Google does expose it as a professional option, which has been used by CEOs like Rahul Sood of VoodooPC. The old still directs to, which at first glance still seems to be using the Google Blogger interface. Google also apparently utilizes the content generation engine in some of its Premier Business edition tools.

By the same token, Mozilla Seamonkey Composer is also free-of-financial-cost. Yet the HTML generation of the two software systems is dramatically different. Google's Blogger Composer is a bloated wreck of an application that adds browser-context-sensitive HTML code.

From a small-scale viewpoint, the massive train-wreck that is Google's Blogger HTML generation isn't that big of an issue. Internet speeds are increasing across the board, and Microsoft has finally figured out that upholding Industry Standards in its browser might stop the flood of users to competing browsers. What's the big deal if Google Blogger generates a massive amount of redundant and useless HTML code?

The problem is the scale of Google itself. Google has tens of thousands, if not hundreds of thousands, if not potentially millions of active users worldwide. Google goes to great lengths to design datacenters to hold and sort through information. Google goes through a great deal of engineering work to cool these datacenters and exhaust the massive amount of heat that they can generate.

Okay, fine, one person turning 64 characters of text into over One Thousand Characters of Text is not really a bad thing. 10,000 users turning 64 characters into over One Thousand Characters IS A BAD THING. 100,000 users is even worse than that. 1,000,000 users would generate over 1,000,000,000 characters of text for 64,000,000 characters. This now leaves the realm of small potatoes and becomes a genuinely large amount of bandwidth through the network and storage space on the hard drives.
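The scaling argument is simple multiplication, sketched below in Python. It assumes one such post per user and takes "over One Thousand" as a flat 1,000 characters, so the HTML totals are a floor rather than an exact figure; the constant and function names are just for illustration.

```python
TEXT_CHARS = 64      # characters of actual text per post, from the paragraph above
HTML_CHARS = 1000    # "over One Thousand", rounded down (assumption)


def totals(users):
    """Return (total text characters, total HTML characters) for `users` posts."""
    return users * TEXT_CHARS, users * HTML_CHARS


for users in (1, 10_000, 100_000, 1_000_000):
    text_total, html_total = totals(users)
    print(f"{users:>9,} users: {text_total:>12,} chars of text "
          f"-> {html_total:>14,} chars of HTML")
```

Every one of those extra characters has to be stored, served, and parsed each time the post is viewed, which is where the cost shows up.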

Google's formatting system for Blogger NEEDS TO BE FIXED. Getting a handle on HTML formatting and fixing the underlying formatting controls would have a direct and immediate effect on network bandwidth, processing time, and file storage. I really would have thought that an upgrade to Blogger's content generation would have focused on streamlining the HTML generation. Instead, the system has gotten even more bloated and more ludicrous in the HTML it generates.

This type of generation is something I would expect from a MICROSOFT PRODUCT.
