Sunday, May 02, 2010

Using Squeak with Blogger Export Files

My recent efforts in the Squeak community are different from what I had expected. I imagined that I would be contributing code rather than an installer, a survey and so forth. There are conversations about integrating a documentation system, known as HelpSystem, into Squeak Smalltalk, and I will write documentation once it is up and running.

Meanwhile, I have decided to write another article on how to retrieve data from Blogger export files, this time using Squeak. My previous article showed how it's done in Ruby (see here). It's a great opportunity to destigmatize Smalltalk programming for the lay programmer.


Blogger offers export and import features in its Dashboard (Settings > Basic). Exporting generates an enormous XML file that can be used to move your blog to another one or to save your data for later. There is a Developer's Guide: Protocol available online.

The approach I have taken is straightforward for the lay programmer. Don't be distracted by the elements you don't understand; there are many others that you already know, at least vaguely. I am using a Workspace window as a scripting area where I write, inspect and evaluate code, much like Ruby's IRB.

Smalltalk tip: The beauty of Smalltalk lies in the ability to evaluate, inspect, print and debug code anywhere. Squeak embraces this perfectly.

The following code snippet allows one to inspect the data contained in an XML file. Open a Workspace window from the world menu by left-clicking on the Squeak desktop. Select all the code (using either the left-click menu or alt-a) and evaluate it (do it; using either the left-click menu or alt-d) within the Workspace window.
| xml |
"Parse the export file into an XML document."
FileStream
    readOnlyFileNamed: 'blog-04-29-2010.xml' do: [ :stream |
        xml := XMLDOMParser parseDocumentFrom: stream.
    ].
"Open an Inspect window on the parsed document."
xml inspect.
The resulting Inspect window is shown below. Unfortunately, unlike its Ruby counterpart, it is hardly comprehensible as far as the XML is concerned.


Change the last line from xml inspect. to xml explore. and evaluate the code again. The resulting Explore window is more satisfying in this particular instance.


We learned from the Inspect window that xml is an XMLDocument, and its structure is outlined in the Explore window. The code that looked like gibberish is about to become clear. Open a Browser window from the world menu, right-click in the first pane of the Browser window, select find class (or use alt-f) and type XMLDocument to search for its class. However, I suggest reading more about XMLParser and XMLParserTest first.


The following code snippet allows one to display every entry available. On Blogger, entries hold comments among other things.
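| xml |
"Parse the export file into an XML document."
FileStream
    readOnlyFileNamed: 'blog-04-29-2010.xml' do: [ :stream |
        xml := XMLDOMParser parseDocumentFrom: stream.
    ].
"Show every entry element on the Transcript, one after another."
(xml elements at: 2) tagsNamed: #entry do: [:e |
    Transcript show: (e asString); cr.
].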
Smalltalk tip: In Ruby, such code writes its output to a console window. Fortunately, Smalltalk has a similar concept: the Transcript. Open a Transcript window using the world menu and evaluate the last code snippet in a Workspace window.
An entry produced by the last code snippet looks like this:
<entry>
<id>tag:blogger.com,1999:blog-29346655.post-5518814233253242399</id>
<published>2009-03-18T12:50:06.694-04:00</published>
<updated>2009-09-28T17:25:26.425-04:00</updated>
<category scheme="http://schemas.google.com/g/2005#kind" term="http://schemas.google.com/blogger/2008/kind#comment"/>
<title type="text"/>
<content type="html"/>
<link href="http://www.blogger.com/feeds/29346655/2316597115724243670/comments/default/5518814233253242399" rel="edit" type="application/atom+xml"/>
<link href="http://www.blogger.com/feeds/29346655/2316597115724243670/comments/default/5518814233253242399" rel="self" type="application/atom+xml"/>
<link href="http://mecenia.blogspot.com/2009/03/blog-post.html?showComment=1237395006694#c5518814233253242399" rel="alternate" title="" type="text/html"/>
<author>
<name>Dogan</name>
<uri>http://www.blogger.com/profile/06563922196121231187</uri>
<email>noreply@blogger.com</email>
</author>
<thr:in-reply-to href="http://mecenia.blogspot.com/2009/03/blog-post.html" ref="tag:blogger.com,1999:blog-29346655.post-2316597115724243670" source="http://www.blogger.com/feeds/29346655/posts/default/2316597115724243670" type="text/html"/>
</entry>
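The snippets in this article reach the entries through xml elements at: 2. One way to see what sits at each top-level position of the document is to print the class of every top-level node on the Transcript. This is only a minimal sketch: it assumes the same export file name used throughout this article, and it relies on nothing beyond elements (already used here) and the standard class name message.

```smalltalk
| xml |
"Assumption: the export file has the same name as in the other snippets."
FileStream
    readOnlyFileNamed: 'blog-04-29-2010.xml' do: [ :stream |
        xml := XMLDOMParser parseDocumentFrom: stream ].
"Print the class of each top-level node, one per line."
xml elements do: [ :each |
    Transcript show: each class name; cr ].
```

Evaluate it in a Workspace window with a Transcript window open, then compare the printed list with what the Explore window shows.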
The following code snippet uses the information learned from inspecting to retrieve each comment and display it in a Transcript window.
| xml term |
"Parse the export file into an XML document."
FileStream
    readOnlyFileNamed: 'blog-04-29-2010.xml' do: [ :stream |
        xml := XMLDOMParser parseDocumentFrom: stream.
    ].
(xml elements at: 2) tagsNamed: #entry do: [:e |
    "The category term tells what kind of entry this is."
    term := (e firstTagNamed: #category) attributeAt: #term.
    "Keep only the entries whose term ends with 'comment'."
    ('*comment' match: term)
        ifTrue: [
            Transcript
                show: 'Date: ', ((e firstTagNamed: #published) contentString); cr;
                show: 'Author: ', ((e firstTagNamed: #author) contentStringAt: #name); cr;
                show: 'Content: ', ((e firstTagNamed: #content) contentString); cr; cr.
        ].
].
A partial output with processed entries is shown in a Transcript window:

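The same loop can feed a collection instead of the Transcript. As a sketch of that idea, the snippet below counts comments per author with a Bag. It assumes the same export file name and the same XML calls as the comment snippet above; the authors variable is a name introduced here for illustration only.

```smalltalk
| xml term authors |
"Assumption: the export file has the same name as in the other snippets."
FileStream
    readOnlyFileNamed: 'blog-04-29-2010.xml' do: [ :stream |
        xml := XMLDOMParser parseDocumentFrom: stream ].
"A Bag remembers how many times each element was added."
authors := Bag new.
(xml elements at: 2) tagsNamed: #entry do: [ :e |
    term := (e firstTagNamed: #category) attributeAt: #term.
    "Only comment entries contribute to the tally."
    ('*comment' match: term)
        ifTrue: [ authors add: ((e firstTagNamed: #author) contentStringAt: #name) ] ].
"Print each distinct author name followed by its comment count."
authors asSet do: [ :name |
    Transcript
        show: name, ': ', (authors occurrencesOf: name) printString;
        cr ].
```

A Bag fits here because it does the counting itself, so no dictionary bookkeeping is needed. Replacing the final loop with authors inspect or authors explore is another way to look at the result.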

Programming in Squeak is an exciting experience, and one should not be put off because it looks and feels different. It is certainly possible to program as one is used to in other languages, whether they have a write-compile-run cycle or are interpreted. One will, however, come to realize that programming can be more productive when done the Smalltalk way.
