An SVN Browser Using Scraping and WinForms in COBOL for .NET

23 Dec 2009  
This simple project shows just how capable Micro Focus COBOL for .NET is for modern development.

Introduction - Why?

This project sprang from a real need. Whilst it is possible to browse Subversion from a web browser, the handling of non-HTML files is a pain. We wanted a program which would show HTML files as web pages and all other files as text. This approach allows people to just click around the SVN tree seeing what is there.

The reason for doing this in COBOL is simple - why not? Out of the big three .NET business languages, COBOL handles this sort of thing just as well as VB and C#. Not only that, good examples of using COBOL for .NET to interact with the rest of the CLR and class libraries are hard to come by.


This project is built around two major pieces: the main form and the HTTPGetter. The HTTPGetter (which started life as an RSS reader) uses System.Net.HttpWebRequest to send GET requests to the web server interface of the SVN installation. These requests are managed by the MainForm, which displays the directory structure reported by the SVN server in a tree view in the left-hand pane. When a node in the tree is clicked, the form shows the contents of that node in a web browser control in the right-hand pane:

Image 1

If a node in the tree is a directory, this is indicated by a trailing /. To avoid having to download the entire tree structure up front, the tree is not populated with the child nodes of a directory until that directory node is clicked upon. When a directory which has been populated is clicked upon a second time, the node is expanded. This means double clicking on a directory node populates it and expands it all at once.

Image 2

Image 3

Image 4
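
The populate-on-first-click, expand-on-second-click behaviour described above can be sketched outside COBOL. The following is a minimal Python illustration, not the article's code: the `Node` class, `on_click` and `fake_list_dir` are hypothetical names, and the fake listing function stands in for the HTTP request to the SVN server.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One entry in the tree; directory names end with '/'."""
    name: str
    children: List["Node"] = field(default_factory=list)
    populated: bool = False
    expanded: bool = False


def on_click(node: Node, list_dir: Callable[[Node], List[str]]) -> None:
    """First click on a directory fetches its children; a later click expands it."""
    if not node.name.endswith("/"):
        return  # a file: there is nothing to populate
    if not node.populated:
        # Skip the parent link, as the article's AddToTree does with "../".
        node.children = [Node(name) for name in list_dir(node) if name != "../"]
        node.populated = True
    else:
        node.expanded = True


# Usage with a fake listing function standing in for the HTTP request.
calls = []


def fake_list_dir(node: Node) -> List[str]:
    calls.append(node.name)
    return ["../", "trunk/", "README.txt"]


root = Node("repo/")
on_click(root, fake_list_dir)  # first click: populate
on_click(root, fake_list_dir)  # second click: expand, no new request
```

Because the children are only requested once per directory, a large repository costs one request per directory actually visited rather than one crawl of the whole tree.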

When the application is run, it will ask for the server and login details for the SVN server. The code will handle the URL with or without the http:// prefix. If the SVN server does not require a user name and password, then they do not have to be supplied. In the examples I have used here, I have connected to the publicly available server. I have put in a user name and password to illustrate the screenshot - but actually, the server ignores these!

Image 5
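
The URL handling mentioned above (accepting the address with or without the http:// prefix, and making sure it ends in a slash) is done in the MainForm constructor in the source below. As a rough Python sketch of the same two checks, with `normalize_repo_url` and the example host being hypothetical:

```python
def normalize_repo_url(repo: str) -> str:
    """Add the http:// prefix and a trailing slash when they are missing."""
    url = repo
    if not url.lower().startswith("http://"):
        url = "http://" + url
    if not url.endswith("/"):
        url = url + "/"
    return url


# Both spellings of the (made-up) address end up identical.
print(normalize_repo_url("svn.example.org/repos/demo"))
print(normalize_repo_url("HTTP://svn.example.org/repos/demo/"))
```

Like the check in the article's source, this sketch only looks for http://, so an https:// address would get a second prefix; handling that would need an extra test.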


The entire application (100% of the code) is written in Micro Focus COBOL for .NET. The example here is based on Studio Enterprise Edition 6.0. However, the code should work with no source code alterations in the free academic version of the Net Express product.

Here we can see the code being stepped through in Visual Studio 2008. Yes - this really is COBOL! It is amazing how far the language has come from its humble 1950s beginnings.

Image 6

For me, the most interesting parts are the interaction with the HTTP protocol and parsing the HTML which comes back from the SVN web server. However, the Windows Forms code may be of interest as well. The full HTTP code is in the source below, but I would like to highlight the following:

perform varying httpKey thru response::"Headers"::"Keys"
    set header to String::"Format"("{0}: {1}"
        httpKey response::"Headers"::"Item"(httpKey))
    If httpKey::"ToLower" equals "content-encoding" Then
        set contentEncoding to response::"Headers"::"Item"(httpKey)
    end-if
end-perform

This is a nice example of iterating through a collection. We get the content encoding from the response keys in a completely civilised way! By so doing, we are then able to handle compressed streams using standard .NET classes:

set rs to response::"GetResponseStream"
If contentEncoding not equals null Then
    If contentEncoding::"ToLower" equals "gzip" Then
        set rs to New "System.IO.Compression.GZipStream"
            (rs type "System.IO.Compression.CompressionMode"::"Decompress")
    else
        If contentEncoding::"ToLower" equals "deflate" Then
            set rs to New "System.IO.Compression.DeflateStream"
                (rs type "System.IO.Compression.CompressionMode"::"Decompress")
        end-if
    end-if
end-if
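
For readers who want to experiment with the same idea outside .NET, here is a small Python sketch of choosing a decompressor from the Content-Encoding value. It is an illustration under assumptions, not a port of the article's code: `decode_body` is a hypothetical name, and it works on a complete byte string rather than wrapping a stream.

```python
import gzip
import zlib
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo gzip or deflate compression according to the Content-Encoding header."""
    if content_encoding is None:
        return body  # header absent: the body was sent as-is
    encoding = content_encoding.lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            return zlib.decompress(body)  # zlib-wrapped deflate
        except zlib.error:
            return zlib.decompress(body, -zlib.MAX_WBITS)  # raw deflate
    return body  # unknown encoding: leave the bytes untouched


page = b"<li><a href=\"trunk/\">trunk/</a></li>"
assert decode_body(gzip.compress(page), "GZIP") == page
assert decode_body(zlib.compress(page), "deflate") == page
```

The fallback for raw deflate is there because servers differ on whether "deflate" bodies carry the zlib wrapper; the sketch simply tries both.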

Next, we can look at the HTML parser. This is really simple because the SVN web server puts out the structure of the SVN tree using HTML lists, with each list element on a new line. However, it does work out as a really nice demonstration of using CLR generics inside COBOL. We use the String::Split method to get an array of lines, then take the bits we want from them and append them to a "System.Collections.Generic.List"[string]. We can see here that the generic type is String, which in COBOL is set using the [] syntax. By using a list, we avoid all the trouble of having to know how many elements there might be up front. Working in COBOL for .NET really is nowhere near as hard as working in classic COBOL!

method-id ParseHtml.
local-storage section.
   01 rawLines string occurs any.
   01 rawLine  string.
   01 chars    character occurs any.
   01 rets     type "System.Collections.Generic.List"[string].
   01 blocks   string occurs any.
procedure division using by value htmlToParse as string
   returning urls as string occurs any.
   set content of chars to (x'0A' as character)
   *> One array element per line of HTML
   set rawLines to htmlToParse::"Split"(x'0A' as character)
   set rets to new "System.Collections.Generic.List"[string].
   perform varying rawLine through rawLines
       if rawLine::"Trim"::"ToLower"::"StartsWith"("<li>") Then
           *> Split on the double quote; the second block is the href value
           set blocks to rawLine::"Split"('"' as character)
           invoke rets::"Add"(blocks(2))
       end-if
   end-perform
   set urls to rets::"ToArray"
end method ParseHtml.
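
To make the splitting trick concrete, here is the same approach as a short Python sketch. The sample HTML is an assumption about what a directory listing looks like (one list item per line, with the link target in the first quoted attribute), and `parse_listing` is a hypothetical name. Note that the second block is `blocks(2)` in the COBOL above, assuming one-based subscripts there, but `parts[1]` in zero-based Python.

```python
from typing import List


def parse_listing(html: str) -> List[str]:
    """Collect the first quoted value from every line that starts with <li>."""
    urls = []
    for raw_line in html.split("\n"):
        if raw_line.strip().lower().startswith("<li>"):
            parts = raw_line.split('"')
            if len(parts) > 1:  # guard against a list item with no quotes
                urls.append(parts[1])
    return urls


# Made-up listing in the assumed shape: one <li> per line.
sample = (
    "<html><body>\n"
    " <ul>\n"
    '  <li><a href="../">..</a></li>\n'
    '  <li><a href="trunk/">trunk/</a></li>\n'
    '  <li><a href="README.txt">README.txt</a></li>\n'
    " </ul>\n"
    "</body></html>"
)
print(parse_listing(sample))  # prints ['../', 'trunk/', 'README.txt']
```

This is scraping rather than real HTML parsing, so it depends on the server keeping each list item on its own line; that is the trade-off the article accepts for simplicity.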

The Source


$set sourceformat(variable).
class-id. HTTPGetter as "COBOLSVNBrowser.HTTPGetter".
working-storage section.

method-id DoRSSRequest.
local-storage section.
   01 request   type "System.Net.HttpWebRequest".
   01 encoder   type "System.Text.ASCIIEncoding".
   01 response  type "System.Net.WebResponse".
   01 httpKey   string.
   01 exp       type "System.InvalidOperationException".
   01 rs        type "System.IO.Stream".
   01 respSt    type "System.IO.StreamReader".
   01 contentEncoding string.
   01 header    string.
procedure division  using by value
   url      as string
   username as string
   password as string
   returning retHtml as string.

   set request to type "System.Net.WebRequest"::"Create"(url)
       as type "System.Net.HttpWebRequest"
   set request::"Method" to "GET"
   set request::"ContentType" to "application/x-www-form-urlencoded"
   set request::"Credentials" to new "System.Net.NetworkCredential"(username password)
   invoke request::"Headers"::"Add"("Accept-Encoding" "gzip,deflate")
   set request::"ProtocolVersion" to type "System.Net.HttpVersion"::"Version11"
   set request::"KeepAlive" to false
   set request::"ServicePoint"::"Expect100Continue" to false

   *> Get results
   try
       perform varying httpKey thru request::"Headers"::"Keys"
           set header to String::"Format"("{0}: {1}"
               httpKey request::"Headers"::"Item"(httpKey))
       end-perform
       set response to request::"GetResponse"
   Catch exp
       *> Debug only - should handle with a form really!
       display "RSS Request Failed:"
       perform varying httpKey thru request::"Headers"::"Keys"
           set header to String::"Format"("{0}: {1}"
               httpKey request::"Headers"::"Item"(httpKey))
       end-perform

       *> try again
       set response to request::"GetResponse"
   end-try
   perform varying httpKey thru response::"Headers"::"Keys"
       set header to String::"Format"("{0}: {1}"
           httpKey response::"Headers"::"Item"(httpKey))
       If httpKey::"ToLower" equals "content-encoding" Then
           set contentEncoding to response::"Headers"::"Item"(httpKey)
       end-if
   end-perform

   set rs to response::"GetResponseStream"
   If contentEncoding not equals null Then
       If contentEncoding::"ToLower" equals "gzip" Then
           set rs to New "System.IO.Compression.GZipStream"
               (rs type "System.IO.Compression.CompressionMode"::"Decompress")
       else
           If contentEncoding::"ToLower" equals "deflate" Then
               set rs to New "System.IO.Compression.DeflateStream"
                   (rs type "System.IO.Compression.CompressionMode"::"Decompress")
           end-if
       end-if
   end-if
   set encoder to type "System.Text.ASCIIEncoding"::"New"
   set respSt to type "System.IO.StreamReader"::"New"(rs encoder)
   set retHtml to respSt::"ReadToEnd"
   invoke respSt::"Close"
end method DoRSSRequest.
method-id ParseHtml.
local-storage section.
   01 rawLines string occurs any.
   01 rawLine  string.
   01 chars    character occurs any.
   01 rets     type "System.Collections.Generic.List"[string].
   01 blocks   string occurs any.
procedure division using by value htmlToParse as string
   returning urls as string occurs any.
   set content of chars to (x'0A' as character)
   set rawLines to htmlToParse::"Split"(x'0A' as character)
   set rets to new "System.Collections.Generic.List"[string].
   perform varying rawLine through rawLines
       if rawLine::"Trim"::"ToLower"::"StartsWith"("<li>") Then
           set blocks to rawLine::"Split"('"' as character)
           invoke rets::"Add"(blocks(2))
       end-if
   end-perform
   set urls to rets::"ToArray"
end method ParseHtml.
end object.
end class HTTPGetter.


*> TODO: Insert code to perform custom authentication
*> using the provided username and password
*> The custom principal can then be attached
*> to the current thread principal as follows:
*>     My.User.CurrentPrincipal = CustomPrincipal
*> where CustomPrincipal is the IPrincipal
*> implementation used to perform authentication.
*> Subsequently, My.User will return identity information
*> encapsulated in the CustomPrincipal object
*> such as the username, display name, etc.
class-id. LoginForm1 as "COBOLSVNBrowser.LoginForm1" is partial
       inherits type "System.Windows.Forms.Form".
environment division.
configuration section.
working-storage section.
   01 repo string public.
   01 username string public.
   01 password string public.

method-id. NEW.
procedure division.
   invoke self::"InitializeComponent"
end method NEW.

method-id.  "btnOK_Click" final private.
procedure division using by value sender 
         as object e as type "System.EventArgs".
   invoke self::"AllDone"
end method "btnOK_Click".

method-id.  "btnCancel_Click" final private.
procedure division using by value sender 
         as object e as type "System.EventArgs".
   set self::"repo" to null
   set self::"username" to null
   set self::"password" to null
   invoke self::"Close"
end method "btnCancel_Click".

method-id.  "LoginForm1_KeyPress" final private.
procedure division using by value sender as object e 
            as type "System.Windows.Forms.KeyPressEventArgs".
   if e::"KeyChar" equals 13 then
       invoke self::"AllDone"
   end-if
end method "LoginForm1_KeyPress".

method-id. "AllDone" final private.
   set self::"repo" to self::"tbRepo"::"Text"
   set self::"username" to self::"tbUserName"::"Text"
   set self::"password" to self::"tbPassword"::"Text"
   invoke self::"Close"
end method "AllDone".
end object.
end class LoginForm1.


class-id. LoginForm1 as "COBOLSVNBrowser.LoginForm1" is partial
   inherits type "System.Windows.Forms.Form".
environment division.
configuration section.
working-storage section.
01 label1 type "System.Windows.Forms.Label".
01 label2 type "System.Windows.Forms.Label".
01 btnOK type "System.Windows.Forms.Button".
01 btnCancel type "System.Windows.Forms.Button".
01 tbUserName type "System.Windows.Forms.TextBox".
01 tbPassword type "System.Windows.Forms.TextBox".
01 label3 type "System.Windows.Forms.Label".
01 tbRepo type "System.Windows.Forms.TextBox".
01 components type "System.ComponentModel.IContainer".
*> Required method for Designer support - do not modify
*> the contents of this method with the code editor.
method-id.  "InitializeComponent" private.
procedure division.
set btnOK to new "System.Windows.Forms.Button"
set btnCancel to new "System.Windows.Forms.Button"
set label1 to new "System.Windows.Forms.Label"
set label2 to new "System.Windows.Forms.Label"
set tbUserName to new "System.Windows.Forms.TextBox"
set tbPassword to new "System.Windows.Forms.TextBox"
set tbRepo to new "System.Windows.Forms.TextBox"
set label3 to new "System.Windows.Forms.Label"
invoke self::"SuspendLayout"
*> btnOK *>
set btnOK::"Location" to new "System.Drawing.Point"( 12 165)
set btnOK::"Name" to "btnOK"
set btnOK::"Size" to new "System.Drawing.Size"( 75 23)
set btnOK::"TabIndex" to 4
set btnOK::"Text" to "OK"
set btnOK::"UseVisualStyleBackColor" to True
invoke btnOK::"add_Click"(new "System.EventHandler"(self::"btnOK_Click"))
*> btnCancel *>
set btnCancel::"Location" to new "System.Drawing.Point"( 147 165)
set btnCancel::"Name" to "btnCancel"
set btnCancel::"Size" to new "System.Drawing.Size"( 75 23)
set btnCancel::"TabIndex" to 5
set btnCancel::"Text" to "Cancel"
set btnCancel::"UseVisualStyleBackColor" to True
invoke btnCancel::"add_Click"(new "System.EventHandler"(self::"btnCancel_Click"))
*> label1 *>
set label1::"AutoSize" to True
set label1::"Location" to new "System.Drawing.Point"( 12 56)
set label1::"Name" to "label1"
set label1::"Size" to new "System.Drawing.Size"( 58 13)
set label1::"TabIndex" to 0
set label1::"Text" to "&User name"
*> label2 *>
set label2::"AutoSize" to True
set label2::"Location" to new "System.Drawing.Point"( 12 110)
set label2::"Name" to "label2"
set label2::"Size" to new "System.Drawing.Size"( 53 13)
set label2::"TabIndex" to 0
set label2::"Text" to "&Password"
*> tbUserName *>
set tbUserName::"Location" to new "System.Drawing.Point"( 12 75)
set tbUserName::"Name" to "tbUserName"
set tbUserName::"Size" to new "System.Drawing.Size"( 210 20)
set tbUserName::"TabIndex" to 1
invoke tbUserName::"add_KeyPress"
(new "System.Windows.Forms.KeyPressEventHandler"(self::"LoginForm1_KeyPress"))
*> tbPassword *>
set tbPassword::"Location" to new "System.Drawing.Point"( 12 126)
set tbPassword::"Name" to "tbPassword"
set tbPassword::"PasswordChar" to '*'
set tbPassword::"Size" to new "System.Drawing.Size"( 210 20)
set tbPassword::"TabIndex" to 3
invoke tbPassword::"add_KeyPress"
(new "System.Windows.Forms.KeyPressEventHandler"(self::"LoginForm1_KeyPress"))
*> tbRepo *>
set tbRepo::"Location" to new "System.Drawing.Point"( 13 23)
set tbRepo::"Name" to "tbRepo"
set tbRepo::"Size" to new "System.Drawing.Size"( 210 20)
set tbRepo::"TabIndex" to 1
*> label3 *>
set label3::"AutoSize" to True
set label3::"Location" to new "System.Drawing.Point"( 13 7)
set label3::"Name" to "label3"
set label3::"Size" to new "System.Drawing.Size"( 57 13)
set label3::"TabIndex" to 0
set label3::"Text" to "&Repository"
*> LoginForm1 *>
set self::"ClientSize" to new "System.Drawing.Size"( 236 201)
invoke self::"Controls"::"Add"(tbRepo)
invoke self::"Controls"::"Add"(label3)
invoke self::"Controls"::"Add"(tbPassword)
invoke self::"Controls"::"Add"(tbUserName)
invoke self::"Controls"::"Add"(label2)
invoke self::"Controls"::"Add"(label1)
invoke self::"Controls"::"Add"(btnCancel)
invoke self::"Controls"::"Add"(btnOK)
set self::"Name" to "LoginForm1"
set self::"Text" to "SVN Login"
invoke self::"add_KeyPress"(new "System.Windows.Forms.KeyPressEventHandler"(self::"LoginForm1_KeyPress"))
invoke self::"ResumeLayout"(False)
invoke self::"PerformLayout"
end method "InitializeComponent".

*> Clean up any resources being used.
method-id. "Dispose" override protected.
procedure division using by value disposing as condition-value.
   if disposing then
     if components not = null then
       invoke components::"Dispose"()
     end-if
   end-if
   invoke super::"Dispose"(by value disposing)
end method "Dispose".

end object.
end class LoginForm1.


class-id. Main as "COBOLSVNBrowser.Main".
environment division.
configuration section.
method-id. Main
   custom-attribute is type "System.STAThreadAttribute".
local-storage section.
01 mainForm type "COBOLSVNBrowser.MainForm".
procedure division.
   set mainForm to new "COBOLSVNBrowser.MainForm"()
   invoke type "System.Windows.Forms.Application"::"Run"(mainForm)

end method "Main".
end static.
end class Main.


$set sourceformat(variable).
class-id. MainForm as "COBOLSVNBrowser.MainForm" is partial
         inherits type "System.Windows.Forms.Form".
working-storage section.
   01 root     string.
   01 userName string.
   01 password string.

method-id. NEW.
local-storage section.
   01 node type "System.Windows.Forms.TreeNode".
   01 login  type "COBOLSVNBrowser.LoginForm1".
procedure division.
   invoke self::"InitializeComponent"()
   set login to new type "COBOLSVNBrowser.LoginForm1"
   invoke login::"ShowDialog"
   if login::"username" equals null then
      move login::"repo" to root
   if not root::"ToLower"::"StartsWith"("http://") then
      move String::"Format"("http://{0}" login::"repo") to root
   if not root::"ToLower"::"EndsWith"("/") then
      move String::"Format"("{0}/" root) to root
   move login::"username" to userName
   move login::"password" to passWord
   set node to new "System.Windows.Forms.TreeNode"("root")
   invoke self::"treeView1"::"Nodes"::"Add"(node)
   invoke self::"AddToTree"(self::"GetUrls"(root) node)
end method NEW.
method-id GetUrls.
   01 getter type "COBOLSVNBrowser.HTTPGetter".
   01 txt    string.
procedure division using by value url as string
          returning urls as string occurs any.
   set getter to new "COBOLSVNBrowser.HTTPGetter"()
   set txt To getter::"DoRSSRequest"(url userName password)
   set size of urls to 0
   if url::"ToLower"::"EndsWith"("html") or
            url::"ToLower"::"EndsWith"("htm") then
       set self::"MainViewerBrowser"::"DocumentText" To txt
       if url::"EndsWith"("/") then
           set self::"MainViewerBrowser"::"DocumentText" To
               Directory:</h2>{0}</body></html>" url)
           set urls to getter::"ParseHtml"(txt)
           set self::"MainViewerBrowser"::"DocumentText"  To String::"Format"
                 txt::"Replace"("<" "&lt;"))
end method GetUrls.
method-id. AddToTree.
local-storage section.
   01 tnode type "System.Windows.Forms.TreeNode".
   01 url string.
procedure division using urls as string occurs 
         any node as type "System.Windows.Forms.TreeNode".
   perform varying url through urls
       if url not equals("../") then
           invoke node::"Nodes"::"Add"(new 
end method AddToTree.

method-id.  "treeView1_AfterSelect" final private.
local-storage section.
   01 url string value "".
   01 node type "System.Windows.Forms.TreeNode".
   01 ex type "System.Exception".
procedure division using by value sender as object e as
         type "System.Windows.Forms.TreeViewEventArgs".
   set node to e::"Node"
   perform until exit
       if node::"Parent" equals null then
           exit perform
       set url to string::"Concat"(node::"Text" url)
       move node::"Parent" to node
   set url to string::"Concat"(root url)
   invoke self::"AddToTree"(self::"GetUrls"(url) e::"Node")
end method "treeView1_AfterSelect".
end object.
end class MainForm.


class-id. MainForm as "COBOLSVNBrowser.MainForm" is partial
                 inherits type "System.Windows.Forms.Form".
   environment division.
   configuration section.
   working-storage section.
   01 splitContainer1 type "System.Windows.Forms.SplitContainer".
   01 treeView1 type "System.Windows.Forms.TreeView".
   01 MainViewerBrowser type "System.Windows.Forms.WebBrowser".
   01 components type "System.ComponentModel.IContainer".

  *> Required method for Designer support - do not modify
  *> the contents of this method with the code editor.
   method-id.  "InitializeComponent" private.
   procedure division.
   set splitContainer1 to new "System.Windows.Forms.SplitContainer"
   set treeView1 to new "System.Windows.Forms.TreeView"
   set MainViewerBrowser to new "System.Windows.Forms.WebBrowser"
   invoke splitContainer1::"Panel1"::"SuspendLayout"
   invoke splitContainer1::"Panel2"::"SuspendLayout"
   invoke splitContainer1::"SuspendLayout"
   invoke self::"SuspendLayout"
  *> splitContainer1   *>
   set splitContainer1::"Dock" to type "System.Windows.Forms.DockStyle"::"Fill"
   set splitContainer1::"Location" to new "System.Drawing.Point"( 0 0)
   set splitContainer1::"Name" to "splitContainer1"
  *> splitContainer1.Panel1   *>
   invoke splitContainer1::"Panel1"::"Controls"::"Add"(treeView1)
  *> splitContainer1.Panel2   *>
   invoke splitContainer1::"Panel2"::"Controls"::"Add"(MainViewerBrowser)
   set splitContainer1::"Size" to new "System.Drawing.Size"( 800 364)
   set splitContainer1::"SplitterDistance" to 266
   set splitContainer1::"TabIndex" to 0
  *> treeView1   *>
   set treeView1::"Anchor" to
    type "System.Windows.Forms.AnchorStyles"::"Top" b-or
    type "System.Windows.Forms.AnchorStyles"::"Bottom" b-or
    type "System.Windows.Forms.AnchorStyles"::"Left" b-or 
    type "System.Windows.Forms.AnchorStyles"::"Right" as 
          type "System.Windows.Forms.AnchorStyles"
   set treeView1::"Location" to new "System.Drawing.Point"( 0 0)
   set treeView1::"Name" to "treeView1"
   set treeView1::"Size" to new "System.Drawing.Size"( 263 364)
   set treeView1::"TabIndex" to 0
   invoke treeView1::"add_AfterSelect"
      (new "System.Windows.Forms.TreeViewEventHandler"(self::"treeView1_AfterSelect"))
  *> MainViewerBrowser   *>
   set MainViewerBrowser::"Anchor" to
    type "System.Windows.Forms.AnchorStyles"::"Top" b-or
    type "System.Windows.Forms.AnchorStyles"::"Bottom" b-or
    type "System.Windows.Forms.AnchorStyles"::"Left" b-or 
    type "System.Windows.Forms.AnchorStyles"::"Right" as
        type "System.Windows.Forms.AnchorStyles"
   set MainViewerBrowser::"Location" to new "System.Drawing.Point"( 3 0)
   set MainViewerBrowser::"MinimumSize" to new "System.Drawing.Size"( 20 20)
   set MainViewerBrowser::"Name" to "MainViewerBrowser"
   set MainViewerBrowser::"Size" to new "System.Drawing.Size"( 527 364)
   set MainViewerBrowser::"TabIndex" to 0
  *> MainForm   *>
   set self::"ClientSize" to new "System.Drawing.Size"( 800 364)
   invoke self::"Controls"::"Add"(splitContainer1)
   set self::"Name" to "MainForm"
   set self::"Text" to "COBOLSVNBrowser"
   invoke splitContainer1::"Panel1"::"ResumeLayout"(False)
   invoke splitContainer1::"Panel2"::"ResumeLayout"(False)
   invoke splitContainer1::"ResumeLayout"(False)
   invoke self::"ResumeLayout"(False)
   end method "InitializeComponent".
  *> Clean up any resources being used.
   method-id. "Dispose" override protected.
   procedure division using by value disposing as condition-value.
       if disposing then
         if components not = null then
           invoke components::"Dispose"()
         end-if
       end-if
       invoke super::"Dispose"(by value disposing)
   end method "Dispose".

end object.
end class MainForm.


I am now a Software Systems Developer - Senior Principal at Micro Focus Plc.

My past includes a Ph.D. in computational quantum mechanics, software consultancy and various software development and architecture positions.
