Posts tagged with 'Software Development'

DokuWiki

At work, we just switched to a new Wiki application for our team documentation: DokuWiki. So far it has been a pretty big hit. (For a wiki, anyway.)

It has several advantages over our old Wiki app:

  1. It is super easy to install. Drop it on disk, hit the install page, configure a few items, and you're up and running. It's PHP, so no container or application server is necessary.
  2. It is easy to administer. So far, the development team that I'm on has been able to administer the DokuWiki wiki ourselves. This makes our operations team happy since the old wiki required a bit of care and feeding to upgrade and to administer users. With DokuWiki, user admin is really easy.
  3. The syntax is simple and readable as text. I'm a big Markdown fan myself so I installed a Markdown plugin to allow me to create some pages in the format, but the rest of my team is using the vanilla DokuWiki syntax, and they seem to like it.
  4. It supports namespaces. The namespace support is not quite as intuitive as I'd like, but it's simple enough for us to have figured it out. I don't expect us to have many namespaces, but it's nice to be able to create a new space if another team in our department wants their own corner of the wiki.



Scraping Data from a Website using JSoup (Lightly Seasoned with SQLite)

Here's some Java code I wrote to scrape links from a site and insert them into a SQLite database. I use JSoup to do the scraping. It uses jQuery-style CSS selectors to pick out the content to scrape. I think I prefer XPath, but it's good to have another tool in the toolbox. I use a Google Chrome extension called Scraper to help me form the selectors. In this example, I'm using the SQLite FTS4 full-text search extension to make the values easily searchable in the DB.

package com.sample;

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

/**
 *
 * @author Oscar Grouch
 */
public class SampleScraper {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) 
    {
        /* Pull the all posts page */
        Document doc = null;
        try 
        {
            doc = Jsoup.connect("http://localhost/all.html").get();
        } 
        catch (IOException ex) 
        {
            Logger.getLogger(SampleScraper.class.getName()).log(Level.SEVERE, null, ex);
            return; // nothing to scrape if the all posts page could not be loaded
        }

        /* Get a list of all of the links on the all posts page */
        Elements links = doc.select("h2 > a");
        if (links != null)
        {
            /* Initialize DB driver. */
            try 
            {
                /* load the sqlite-JDBC driver using the current class loader */
                Class.forName("org.sqlite.JDBC");
            } 
            catch (ClassNotFoundException ex) 
            {
                System.err.println("Class not found: org.sqlite.JDBC: " + ex.getMessage());
            }

            Connection connection;
            try
            {
                /* create a database connection */
                connection = DriverManager.getConnection("jdbc:sqlite:C:/Temp/sample.db");
                Statement statement = connection.createStatement();
                statement.setQueryTimeout(30);  // set timeout to 30 sec.
                statement.executeUpdate("drop table if exists content");
                statement.executeUpdate("CREATE VIRTUAL TABLE content " + 
                                        "USING fts4(title, url, post)");
            }
            catch(SQLException e)
            {
                // if the error message is "out of memory", 
                // it probably means no database file is found
                System.err.println(e.getMessage());
                return;
            }


            /* Loop through the links */
            for (Element link : links)
            {
                /* Print out the link text */
                System.out.println("Processing " + link.text() + "...");

                Document currentDoc = null;
                try 
                {
                    /* Pull the current link page */
                    currentDoc = Jsoup.connect(link.attr("abs:href")).get();
                } 
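                catch (IOException ex) 
                {
                    Logger.getLogger(SampleScraper.class.getName()).log(Level.SEVERE, null, ex);
                    continue; // skip this link if its page could not be loaded
                }

                /* Get the post text; skip pages that have no div.post element */
                Element post = currentDoc.select("div.post").first();
                if (post == null)
                {
                    continue;
                }
                String postText = post.text();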

                /* Print the post text */
                System.out.println("Post Text: " + postText);

                /* try-with-resources closes the statement after each insert */
                try (PreparedStatement preparedStatement 
                       = connection.prepareStatement("insert into content values(?, ?, ?)"))
                {
                    preparedStatement.setString(1, link.text());
                    preparedStatement.setString(2, link.attr("abs:href"));
                    preparedStatement.setString(3, postText);
                    boolean isResultSet = preparedStatement.execute();
                    if (isResultSet)
                    {
                        System.out.println("A ResultSet was returned.");
                    }
                    else
                    {
                        System.out.println("No ResultSet was returned.");
                    }
                }
                catch(SQLException e)
                {
                    // if the error message is "out of memory", 
                    // it probably means no database file is found
                    System.err.println(e.getMessage());
                }

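            }

            /* Close the connection once all links have been processed */
            try
            {
                connection.close();
            }
            catch(SQLException e)
            {
                System.err.println(e.getMessage());
            }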
        }
    }
}
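
The scraper above only fills the table; it never actually searches it. As a rough sketch of the other half, here's roughly how a full-text query against that FTS4 table could look. The class name SampleSearch, the formatHit helper, and the default search term are my own placeholders for illustration, and it assumes the same sqlite-jdbc driver and database path as the scraper.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class SampleSearch {

    /** Formats one search hit for display (hypothetical helper). */
    static String formatHit(String title, String url) {
        return title + " -> " + url;
    }

    /** Runs a full-text MATCH query against the FTS4 table created by the scraper. */
    static List<String> search(Connection connection, String terms) throws SQLException {
        List<String> hits = new ArrayList<>();
        // In FTS4, "<table name> MATCH ?" searches every column of the virtual table.
        String sql = "SELECT title, url FROM content WHERE content MATCH ?";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, terms);
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    hits.add(formatHit(rows.getString("title"), rows.getString("url")));
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: search terms come from the command line; "sqlite" is a placeholder default.
        String terms = args.length > 0 ? String.join(" ", args) : "sqlite";
        try (Connection connection = DriverManager.getConnection("jdbc:sqlite:C:/Temp/sample.db")) {
            for (String hit : search(connection, terms)) {
                System.out.println(hit);
            }
        } catch (SQLException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Binding the search terms as a parameter keeps the query text fixed, the same way the scraper binds its insert values.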

Modern.ie: Internet Explorer Testing & Troubleshooting

Microsoft has created a site with tools and information to aid in testing and troubleshooting web sites in Internet Explorer. Check it out here: http://www.modern.ie/.


Git Stuff

One of my co-workers was looking to begin managing source code for his personal projects using Git. Here are the links I sent to him to help get him started:

Tutorials & Reference

GUI Tools


Software Development Blogs and Sites that I Follow

  • Martin Fowler - This guy talks a lot about continuous integration, continuous delivery, and continuous deployment. The leading edge of the industry is using these techniques along with DevOps to improve reliability. He also talks about NoSQL and design patterns a lot. This guy is one of the pioneers of design patterns.

  • Rands in Repose - This guy is great. He has lots of insight about being a development manager. I've read both of his books. If I get a chance, I'll list some of his better blog posts here.

  • Ajaxian and Webappers - Blogs about free and open source web development resources. I've found some good tools here.

  • Assembla.com blog - They talk a lot about Git and continuous integration/delivery/deployment. They also talk about distributed teams quite a bit.

  • Web Resources Depot - They list free development tools. Lots of good stuff here.

  • Artima - Software development blog.

Podcasts

Tools for subscribing to news feeds

Tools for listening to Podcasts