<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Open Source Solutions for Small Business Problems &#187; Technical</title>
	<atom:link href="http://opensourcesmall.biz/category/technical/feed/" rel="self" type="application/rss+xml" />
	<link>http://opensourcesmall.biz</link>
	<description>The living site of the book by John Locke</description>
	<lastBuildDate>Fri, 05 Dec 2008 23:00:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.2</generator>
		<item>
		<title>SOAP, Web Services, and PHP</title>
		<link>http://opensourcesmall.biz/2008/08/soap-web-services-and-php/</link>
		<comments>http://opensourcesmall.biz/2008/08/soap-web-services-and-php/#comments</comments>
		<pubDate>Sat, 23 Aug 2008 17:52:04 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[soap]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/?p=266</guid>
		<description><![CDATA[One of my projects in the past few weeks has been to put together a SOAP server for a client. So suddenly I&#8217;ve had to learn a lot of the nitty gritty details about what works and what doesn&#8217;t&#8230; While they&#8217;re fresh, let me jot them down here. WARNING: Extremely technical content ahead. First of [...]]]></description>
			<content:encoded><![CDATA[<p>One of my projects in the past few weeks has been to put together a SOAP server for a client. So suddenly I&#8217;ve had to learn a lot of the nitty gritty details about what works and what doesn&#8217;t&#8230;</p>
<p>While they&#8217;re fresh, let me jot them down here. WARNING: Extremely technical content ahead.<br />
<span id="more-266"></span><br />
First of all, SOAP is supposed to stand for &#8220;Simple Object Access Protocol.&#8221; It&#8217;s anything but simple. There is a lot of SOAP software out there, but there are subtle implementation gotchas that can be quite difficult to figure out.</p>
<p>We chose the native PHP SoapServer in PHP 5.2 to implement the project, mainly because we&#8217;re a PHP shop, and a little smoke testing revealed it was quite quick to get set up and going. It turns out that it&#8217;s quite hard to debug. For its good points, it can read in a WSDL and automatically map methods to methods on a class, and it converts arrays, simple objects, or complex objects to a valid response object, and request objects into simple or complex objects on the incoming side.<br />
<strong><br />
Problems with PHP&#8217;s SOAP Server:</strong></p>
<ul>
<li>No validation of incoming or outgoing documents.</li>
<li>No warnings, exceptions, or errors if it can&#8217;t convert a document to fit the schema&#8211;it just dies.</li>
<li>No debugging information about what it&#8217;s doing.</li>
<li>No ability to manage namespaces, especially if they need to be copied from the SOAP envelope into the payload.</li>
<li>Difficult to test.</li>
<li>No access to the raw XML of either the request or the response.</li>
</ul>
<p>Using PHP&#8217;s SoapServer is quite simple, except when things aren&#8217;t perfect&#8230;</p>
<p>Here&#8217;s what the code looks like for the simple case:</p>
<p><code><br />
&lt;?php<br />
  $xml = file_get_contents('php://input'); // raw POST body; $HTTP_RAW_POST_DATA was removed in PHP 7<br />
  // make sure you have something to process, throw an error if $xml is empty<br />
  $soap = new SoapServer('http://path/to/your.wsdl');<br />
  $soap->setClass('mySoapClass');<br />
  $soap->handle($xml);<br />
?><br />
</code><br />
That&#8217;s basically it. You declare methods on &#8216;mySoapClass&#8217; that correspond to the SOAP methods. These handler methods receive a simple object as a parameter, and you can do whatever you need to do with that data. Then it needs to return some data structure that can be serialized to the expected type defined in the WSDL. The return data structure can be an array, a simple object, or an object of a class you define that can serialize appropriately.</p>
<p>Great. With this much, it took me about a day to have a working web service with 8 methods and a bunch of complex data objects. The problems started when people connected with different SOAP software.</p>
<p>The web service I was implementing defines a specific SOAP Fault document, so if I did run across a problem, I could simply throw an exception of that type. My wrapper object kindly passed the custom fields defined in the WSDL.</p>
<p><strong>Problem #1: Validation</strong></p>
<p>As I said before, there is none. If the SoapServer gets anything it doesn&#8217;t like, it doesn&#8217;t send any response at all. And since all you get inside your method handlers is an already-converted object, you don&#8217;t have any way to validate the response without using a global variable or a call to a singleton.</p>
<p>In our case, the project specified that we strip out the payload from the SOAP envelope and store the payload XML on the disk as a document for several methods, and do processing on other methods. Processing a SOAP request was not a problem. Storing a valid XML document was. Several methods just stored the XML on the disk, with another method retrieving it and returning it to the caller. The problem we had was that the SOAP response was pickier than the SOAP request&#8211;so documents that we loaded from the disk and returned as the response would fail with no explanation.</p>
<p>Our solution was to load the raw XML into a DOM Document, and validate it against the schema. This mostly worked, until we had to deal with a request generated from Jitterbit. More on that later.</p>
<p>The question is, what to validate? We weren&#8217;t supposed to store the entire SOAP envelope&#8211;just the payload. So how to extract it? The simple way was to grab the first child of the Body element, append it to the DOMDocument itself, remove the original root, and call the normalize() method. This did generate a warning, but not a fatal error, and did the right thing. Furthermore, we could also access the raw libxml validation specifics, by calling libxml_use_internal_errors(true), and then when a document fails to validate, using libxml_get_errors() and libxml_get_last_error() to grab the details.</p>
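<p>A minimal sketch of the extract-and-validate step described above, assuming PHP 7 or later with the DOM extension. The schema, the <code>http://example.test/po</code> namespace, and the helper names <code>extractPayload()</code> and <code>validatePayload()</code> are hypothetical. Rather than moving nodes inside the original document, it imports the payload into a fresh DOMDocument and re-parses the serialized result before validating, which is a conservative choice and not necessarily what the project did.</p>

```php
<?php
// Hypothetical schema and request, inlined so the sketch is self-contained.
$xsd = <<<'XSD'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.test/po"
           elementFormDefault="qualified">
  <xs:element name="submitPO">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="DocumentID" type="xs:token"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

$request = <<<'XML'
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <po:submitPO xmlns:po="http://example.test/po">
      <po:DocumentID>ABC-123</po:DocumentID>
    </po:submitPO>
  </soapenv:Body>
</soapenv:Envelope>
XML;

// Copy the first element inside soapenv:Body into its own document.
function extractPayload(string $soapXml): DOMDocument
{
    $envelope = new DOMDocument();
    $envelope->loadXML($soapXml);
    $body = $envelope->getElementsByTagNameNS(
        'http://schemas.xmlsoap.org/soap/envelope/',
        'Body'
    )->item(0);
    if ($body === null) {
        throw new RuntimeException('No SOAP Body element found');
    }
    $first = null;
    foreach ($body->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $first = $child;
            break;
        }
    }
    if ($first === null) {
        throw new RuntimeException('SOAP Body has no payload element');
    }
    $payload = new DOMDocument('1.0', 'UTF-8');
    $payload->appendChild($payload->importNode($first, true));
    return $payload;
}

// Validate against the schema and return [bool $ok, string[] $messages].
function validatePayload(DOMDocument $payload, string $xsd): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    // Re-parse the serialized payload so namespace declarations are settled.
    $fresh = new DOMDocument();
    $fresh->loadXML($payload->saveXML());
    $ok = $fresh->schemaValidateSource($xsd);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return [$ok, $messages];
}
```

<p>Calling <code>validatePayload(extractPayload($request), $xsd)</code> returns the validation result together with any libxml messages, so a handler can log the reason for a failure instead of dying silently.</p>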
<p><strong>Problem #2: Returning valid XML responses</strong></p>
<p>The root of all of our problems in this project has to do with where namespaces are defined. One limitation of libxml appears to be that you can only point it to one schema for validation at a time. We can validate against the SOAP Schema, or our custom schema, but not both at the same time, unless one includes the other. So our validation options consist of:</p>
<ol>
<li>Include the SOAP Schema in the custom schema, and validate the entire SOAP body, or</li>
<li>Extract the payload from the SOAP body, and validate only that against our schema.</li>
</ol>
<p>#2 is clearly the correct way&#8211;we really don&#8217;t care about the SOAP envelope once we have the message. But the problem is, many SOAP clients put the namespace declarations on the SOAP Envelope, and not the payload root element. In fact, the XML generated by the PHP SoapServer class does this itself.</p>
<p>So our first task was to generate the proper XML Namespace declarations on our generated payloads.</p>
<p>To do this, we could no longer rely on the PHP SoapServer&#8217;s automatic conversion of simple objects or arrays to XML&#8211;we had to generate our own XML, and tell the SOAP server to use that instead. This turned out to be difficult to track down, so here&#8217;s the answer:</p>
<p><code><br />
&lt;?php<br />
  class dataClass{<br />
    /* Serialize XML as desired here.<br />
        Omit the XML declaration, start with the root element<br />
        You can also simply build the XML as a string from object properties<br />
    */<br />
     function toXml(){<br />
         $this->myDom->documentElement->setAttribute('xmlns','http://my.custom.namespace/version_1');<br />
         $xml = $this->myDom->saveXML();<br />
         preg_match('%^&lt;\?xml.*?\?>\s*(.*)$%s',$xml,$match); //strip XML declaration<br />
         return $match[1];<br />
   }<br />
}<br />
<br />
// snip to end of actual SOAP handler method:<br />
   $out = new SoapVar($data->toXml(),XSD_ANYXML);<br />
   return $out;<br />
}<br />
</code><br />
The SOAPVar object allows you to control the output of the SoapServer a bit better, with support for namespaces, raw XML, or casting objects in a certain way.</p>
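<p>As an alternative to stripping the declaration with a regular expression, <code>DOMDocument::saveXML()</code> accepts a node argument and then serializes only that node, with no XML declaration. A minimal sketch, assuming PHP 7 or later; the namespace, the element names, and the <code>payloadXml()</code> helper are hypothetical:</p>

```php
<?php
// Hypothetical payload namespace, used only for illustration.
const PAYLOAD_NS = 'http://example.test/po';

// Build a small response payload and return it as a string that starts
// at the root element rather than at an XML declaration.
function payloadXml(string $documentId): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->createElementNS(PAYLOAD_NS, 'submitPOResponse');
    $dom->appendChild($root);
    $root->appendChild(
        $dom->createElementNS(PAYLOAD_NS, 'DocumentID', $documentId)
    );

    // Passing a node to saveXML() serializes that node only.
    return $dom->saveXML($root);
}

// At the end of a SOAP handler method (requires the SOAP extension):
//   return new SoapVar(payloadXml('ABC-123'), XSD_ANYXML);
```

<p>Because the root is created with <code>createElementNS()</code>, the namespace declaration lands on the payload root itself rather than on the SOAP Envelope.</p>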
<p><strong>Problem #3: Jitterbit</strong></p>
<p>Validating a clean payload without namespace prefixes seems to work fine. Our technique of moving the nodes around with the DOM seemed to keep most of the namespaces intact, so even if there was no xmlns declaration on the payload root, we could still validate just the payload effectively, and serialize/de-serialize without issue.</p>
<p>Except for Jitterbit.</p>
<p>Now, I loaded up Jitterbit yesterday, and it didn&#8217;t do this&#8211;this might be a problem with an earlier version. The problem is, Jitterbit is extremely verbose, specifying not just a namespace prefix on every element, but also an xsi:type. And even that&#8217;s not enough to break it&#8211;except that the value for its xsi:type also contained a namespace prefix. And if that prefix was declared only on the SOAP Envelope and not on the payload root, suddenly our validation broke.</p>
<p>It broke for us on types declared as ns:token, ns:string, ns:integer &#8212; the simple types specified by XSD itself, which Jitterbit put into a namespace prefixed with ns: and declared on the SOAP Envelope.</p>
<p>For example, here&#8217;s the start of a problem document:</p>
<p><code><br />
&lt;?xml version="1.0" encoding="UTF-8" standalone="no" ?&gt;<br />
&lt;soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"<br />
                  xmlns:ns="http://www.w3.org/2001/XMLSchema"<br />
                  xmlns:oi="http://ws.outdoorindustry.org/v1_2/"<br />
                  xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"<br />
                  xmlns:ws="http://ws.outdoorindustry.org/v1_2/ws/"<br />
                  xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"<br />
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"<br />
&gt;<br />
&lt;soapenv:Body&gt;<br />
&lt;tns1:submitPO<br />
                  xmlns:tns1="http://ws.outdoorindustry.org/v1_2/"<br />
               xsi:type="tns1:PO"&gt;<br />
               &lt;tns1:DocumentID&gt;06AB4EF9-6AF157E7-3F8441AB-AB499933&lt;/tns1:DocumentID&gt;<br />
               &lt;tns1:POType&gt;Preseason&lt;/tns1:POType&gt;<br />
               &lt;tns1:Vendor xsi:type="tns1:VendorType"&gt;<br />
                   &lt;tns1:VendorID xsi:type="ns:token"&gt;999&lt;/tns1:VendorID&gt;<br />
                   &lt;tns1:VendorName xsi:type="ns:token"&gt;Sample Vendor&lt;/tns1:VendorName&gt;<br />
</code></p>
<p>The first validation error was on the VendorID, with xsi:type=&#8221;ns:token&#8221;. If I copied xmlns:ns=&#8221;http://www.w3.org/2001/XMLSchema&#8221; into the tns1:submitPO element, it validated fine. The PHP DOMDocument seems to be able to keep track of namespaces on elements and attribute names even after the envelope is gone. But not attribute values.</p>
<p>After hours of banging on this, we came up with 3 workarounds for this:</p>
<ol>
<li>Completely regenerate the XML, after processing. To do this, we would need to create a custom data class for each incoming object, provide a classmap to the SoapServer, and then generate brand new XML out of the data object. This is perhaps the best approach, but I didn&#8217;t think of it until the project was over&#8211;I was thinking about writing out the data to the database and then loading our custom objects and serializing them as we do for our responses. The biggest drawback here is that we need to model the entire complexity of the request, as allowed in the schema. And this was a really complex object&#8230; lots of work to implement, when we&#8217;re only going to store this XML for passing to other systems.</li>
<li>Hack the XML to get the offending namespace into the stored document. This turned out to be easy to program, but uses lots of CPU resources&#8211;DOMDocuments are expensive to use. It&#8217;s also the most brittle approach, only catching this single case&#8211;if the namespace prefix changes, or a different required namespace is necessary, it&#8217;ll break. To do this, we created a new DOM Document, imported the root node of the payload, appended it to the document, and used setAttribute to set an &#8220;xmlns:ns&#8221; attribute on the root. This did not actually get the namespace recognized for validation, and normalizeDocument did not fix it&#8211;but creating a third DOMDocument, and doing $doc->loadXML($doc2->saveXML()) did make the namespace recognized by the object so we could successfully validate.</li>
<li>Hack the XSD to validate the entire SOAP request. By including the SOAP schema in our custom schema (using xs:import), we could validate the raw SOAP request, then extract the payload and save it. The saved XML does not validate on its own, but we know it was valid as part of the original request. So we can remove our validation check on the outgoing document, and as long as the other system does not explicitly validate the standalone XML, we&#8217;re okay.</li>
</ol>
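<p>A minimal sketch of the second workaround, assuming PHP 7 or later. It differs from the description above in one respect: it declares the prefix with <code>setAttributeNS()</code> and the reserved <code>http://www.w3.org/2000/xmlns/</code> namespace instead of plain <code>setAttribute()</code>. The serialize-and-reload round trip is kept because that is the step reported to make validation succeed. The helper name and the sample payload are hypothetical.</p>

```php
<?php
const XSD_NS   = 'http://www.w3.org/2001/XMLSchema';
const XMLNS_NS = 'http://www.w3.org/2000/xmlns/';

// Hypothetical helper: declare $prefix for the XML Schema namespace on the
// payload root, then serialize and re-parse so the declaration is in scope.
function declareSchemaPrefix(DOMDocument $payload, string $prefix): DOMDocument
{
    $payload->documentElement->setAttributeNS(
        XMLNS_NS,
        'xmlns:' . $prefix,
        XSD_NS
    );

    $reloaded = new DOMDocument();
    $reloaded->loadXML($payload->saveXML());
    return $reloaded;
}

// Made-up payload whose xsi:type value uses the "ns" prefix
// without declaring it anywhere in the stored document.
$payload = new DOMDocument();
$payload->loadXML(
    '<po:VendorID xmlns:po="http://example.test/po"'
    . ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    . ' xsi:type="ns:token">999</po:VendorID>'
);

$fixed = declareSchemaPrefix($payload, 'ns');
echo $fixed->saveXML($fixed->documentElement), "\n";
```

<p>This is still brittle in the way described above: the prefix is hard-coded by the caller, so a client that picks a different prefix needs a different call.</p>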
<p>Whew. Hope this helps somebody&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/08/soap-web-services-and-php/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>What&#8217;s git, and why do you use it?</title>
		<link>http://opensourcesmall.biz/2008/06/whats-git-and-why-do-you-use-it/</link>
		<comments>http://opensourcesmall.biz/2008/06/whats-git-and-why-do-you-use-it/#comments</comments>
		<pubDate>Mon, 30 Jun 2008 19:46:17 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[01. Open Source]]></category>
		<category><![CDATA[09. Document Management]]></category>
		<category><![CDATA[Economic Musings]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[code management]]></category>
		<category><![CDATA[dojo]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[joomla]]></category>
		<category><![CDATA[ledgersmb]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/?p=250</guid>
		<description><![CDATA[At Freelock, we&#8217;re always trying to figure out ways to do things better. Recently I started digging into a developer tool that&#8217;s making, as Bryan over at the Linux Action Show would say, my head explode. For a long time, we&#8217;ve managed our custom code projects and business documents in a central repository, called Subversion [...]]]></description>
			<content:encoded><![CDATA[<p>At <a href="http://freelock.com">Freelock</a>, we&#8217;re always trying to figure out ways to do things better. Recently I started digging into a developer tool that&#8217;s making, as <a href="http://www.lunduke.com/" target="_blank">Bryan</a> over at the <a href="http://www.jupiterbroadcasting.com/?cat=4" target="_blank">Linux Action Show</a> would say, my head explode.</p>
<p>For a long time, we&#8217;ve managed our custom code projects and business documents in a central repository, called Subversion (also known as svn). Subversion is relatively easy to understand&#8211;it&#8217;s like having a library of files you can check a copy out of, do some work on it, and then check it back in. Subversion is the librarian that tracks who has copies of what, and when they checked it out. So if Erik checks in changes to a brochure, and then Jill goes to submit changes to the same document, Subversion will say &#8220;hey wait a minute, that document has already been changed&#8211;you need to make sure you put Erik&#8217;s changes in your document before I&#8217;ll let you put in your document.&#8221;</p>
<p>This is great for managing conflicts between people working on a single team, or for code that is being developed in relative isolation from the rest of the world.</p>
<p>The problem is, we&#8217;re doing more than that&#8211;we&#8217;re taking code from various open source projects and either customizing it or building new applications on top of it. And so when the outside projects get updated, we need to manually update anything we&#8217;ve written that depends on that code. There is no longer a single repository where we control our code&#8211;there is our code library, plus another one for every project we use.</p>
<p>This makes managing add-ons for projects like Joomla or ZenCart quite challenging, because our add-ons get scattered throughout the filesystem to be able to hook into the right place. And if we have to touch a core file, we&#8217;re going to end up needing to re-implement our change with any update to that core file.</p>
<p>There are other issues we run into, managing our code and hosting, all of which take fairly time-consuming, manual intervention. Here&#8217;s the list:</p>
<ul>
<li>Since we host and provide security updates for Joomla, WordPress, Zen Cart, Drupal, and others, we need to upgrade dozens of installations any time there&#8217;s a new release that has a fix for a security vulnerability. With Joomla this has happened quite a lot, and every Joomla installation needs to be upgraded individually&#8211;and tested. And since each installation is slightly different, we can&#8217;t manage them easily within a single repository, while updating the underlying code.</li>
<li>Templates, modules, components, blocks, themes, plugins, and whatever. Developing these types of add-ons is our bread-and-butter. But code for these often gets scattered across an installation, making it quite difficult to manage just our add-ons while we develop them, or roll back to earlier versions if there&#8217;s a problem.</li>
<li><a href="http://dojotoolkit.org" target="_blank">The Dojo Toolkit</a>, and builds. We&#8217;re doing a lot of development with Dojo right now, to add desktop-like functionality such as trees, sortable tables, right-click menus, animations, and lots of other really cool things. However, if you don&#8217;t &#8220;build&#8221; the code after you write it, it&#8217;s painfully slow in a web browser. And due to the nature of how Subversion works, you can&#8217;t easily store a built Dojo tree if you ever want to change it again. Which means you&#8217;d need to build it every place you deploy it. And on some computers, it can take a long time to build&#8211;on our demo server, one of our projects currently takes 8 minutes.</li>
<li>As we get more directly involved with open source projects like <a href="http://ledgersmb.org">LedgerSMB</a>, we&#8217;re finding the need to change core files while we hack away at some particular feature. To do this, you create a branch of the code, work on your feature, and then merge your changes back into the &#8220;trunk.&#8221; If you don&#8217;t have access to save directly to the project repository, doing this gets a lot more complicated.</li>
</ul>
<p>Git to the rescue. Git solves all of these issues. Read on for a technical discussion of how.<br />
<span id="more-250"></span><br />
<strong>Managing lots of installations with Git</strong><br />
We haven&#8217;t yet started doing this, but for projects like Joomla and ZenCart, this is going to be really effective. It&#8217;s not quite as helpful for Drupal, which allows for a bunch of sites to be centrally managed out of the box&#8211;though even there it will help us recover quickly from a failed upgrade.</p>
<p>Here&#8217;s the basic approach:</p>
<ol>
<li>Create a master git repository, either by pulling down the Subversion repository using git svn, or just unpacking tarballs of a release and importing into our git repository for the project.</li>
<li>Create a new git clone for each installation, and drop into each installation directory. Create a current branch for that repository, and rebase it to the current installed version.</li>
<li>Commit local modifications to the repository. Make sure the .git directory is getting backed up, and is not accessible through the web server.</li>
<li>When a new release is available, use git pull to update each local installation. Use git reset --hard HEAD^ to undo if there&#8217;s a problem.</li>
</ol>
<p><strong>Managing submodules, customizations</strong><br />
This we get pretty much for free by having dedicated repositories for particular problems. Managing add-ons we&#8217;d like to use elsewhere can still be challenging, but we can pull customizations from other repositories to assemble custom packages&#8211;it&#8217;s at least easier to manage these across installations.</p>
<p>The main thing to do is to make sure when you&#8217;re developing on a particular add-on, you&#8217;re always doing it in a branch dedicated to those changes. Then you can merge them into other projects more easily. It&#8217;s also possible to &#8220;cherry-pick&#8221; commits involving particular files, and pull those changes into a new branch if development ended up getting mixed in with different features.</p>
<p><strong>Dojo Toolkit builds</strong><br />
We&#8217;re huge fans of the Dojo Toolkit. It makes developing really sophisticated browser-based applications much easier, not just providing a large set of active widgets but also a data abstraction layer and an encapsulation system for your own code.</p>
<p>Its biggest downside is that every feature you want to use means adding another file that the browser needs to load. And we use a lot of features&#8230; the base package loads about 20 files, and in our applications there are well over a hundred. This takes a long time to load, and isn&#8217;t very scalable.</p>
<p>The solution? Use the Dojo build process. A Java program can build all the features you want to use into a single compressed, stripped JavaScript file. The difference in speed is incredible&#8211;applications suddenly load right up. And this build system can easily build your own JavaScript as well.</p>
<p>The problem is, this build system is not compatible with Subversion, because Subversion scatters its working copy data inside each directory, but the build system deletes the directories every time you build. So you can&#8217;t easily manage built code inside Subversion.</p>
<p>Git stores all of its repository information in the top level directory, so you can easily build and re-build Dojo and manage it without any issues.</p>
<p><strong>Do local development on open source projects</strong><br />
Now here&#8217;s the traditional reason to use git: to allow you to manage changes to open source projects without needing write access to their code repositories. We simply set up git repositories to track their Subversion repository, and then we can use all the features of git to manage our customizations.</p>
<p>When the original project has new releases, we just pull them into our repository and git handles most of the merging for us. If we want to contribute back to the project, we can copy them into our Subversion tracking branch and commit them straight back to Subversion&#8211;or send a patch.</p>
<p>We&#8217;re starting to set up public git repositories for projects we work with extensively. You can browse them here: <a href="http://git.freelock.com">http://git.freelock.com</a>. You can clone and pull updates from here: git://git.freelock.com/git/&lt;projectname&gt;.git. You can read more about how to manage non-standard Subversion repositories in git <a href="http://freelock.com/kb/Git">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/06/whats-git-and-why-do-you-use-it/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Developing a Simple Workflow within SugarCRM</title>
		<link>http://opensourcesmall.biz/2008/06/developing-a-simple-workflow-within-sugarcrm/</link>
		<comments>http://opensourcesmall.biz/2008/06/developing-a-simple-workflow-within-sugarcrm/#comments</comments>
		<pubDate>Fri, 27 Jun 2008 18:26:31 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[07. CRM]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[SugarCRM]]></category>
		<category><![CDATA[workflow]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/?p=249</guid>
		<description><![CDATA[Packtpub is running a sample from a developer&#8217;s guide for customizing SugarCRM. The author describes how to set up hooks for particular modules to build a custom workflow. Custom workflows are a feature that is limited to the proprietary version of SugarCRM&#8211;they have not been available in the open source version. With custom development using [...]]]></description>
			<content:encoded><![CDATA[<p>Packtpub is running a sample from a developer&#8217;s guide for customizing SugarCRM. The author describes how to set up hooks for particular modules to build a custom workflow.</p>
<p>Custom workflows are a feature that is limited to the proprietary version of SugarCRM&#8211;they have not been available in the open source version. With custom development using techniques illustrated here, you can add your own workflows.</p>
<p>This looks to me like it&#8217;s written specifically for versions of SugarCRM before version 5. I haven&#8217;t had a chance to find out whether the same basic techniques would apply&#8211;SugarCRM 5 changes a lot of things from earlier versions, primarily with email handling and storing customizations in the database rather than scattered around files. The basic approach should work, however&#8230;</p>
<p><a href="http://www.packtpub.com/article/developing-a-simple-workflow-within-sugarcrm">Developing a Simple Workflow within SugarCRM</a></p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/06/developing-a-simple-workflow-within-sugarcrm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Technical note: HTTP Auth with AJAX</title>
		<link>http://opensourcesmall.biz/2008/06/technical-note-http-auth-with-ajax/</link>
		<comments>http://opensourcesmall.biz/2008/06/technical-note-http-auth-with-ajax/#comments</comments>
		<pubDate>Sat, 07 Jun 2008 21:06:49 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[ajax]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/?p=246</guid>
		<description><![CDATA[I&#8217;ve been struggling to get Project Auriga to set HTTP Auth from a nice pretty login form, and think I have it working. What follows is a very technical discussion&#8211;if you&#8217;re a business reader, you should probably skip this post&#8230; HTTP Auth is a specific mechanism for handling authentication. HTTP Auth is built into Apache [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been struggling to get <a href="http://projectauriga.org">Project Auriga</a> to set HTTP Auth from a nice pretty login form, and think I have it working.</p>
<p>What follows is a very technical discussion&#8211;if you&#8217;re a business reader, you should probably skip this post&#8230;</p>
<p>HTTP Auth is a specific mechanism for handling authentication. HTTP Auth is built into Apache and IIS, and so the server can handle authentication purely through configuration, offering many different back ends for storing the data. Browsers also handle HTTP Auth natively, popping up a normal login box whenever they get a Basic Authentication challenge from the server. But this login box is ugly, and doesn&#8217;t provide a friendly experience to allow people to create an account, get a password resent, or anything&#8211;it falls back to a basic error page. You can, of course, customize the error page, but not necessarily help people with the password login itself.</p>
<p>There are several benefits to using HTTP Auth, though. First of all, other applications on the same server can accept the same credentials, allowing you to sign in once and access multiple applications without having to log into each one. Secondly, you can set up stronger authentication methods, such as client-side certificates. Also, you can configure the server to protect large parts of a web site very easily, reducing exposure to information disclosure.</p>
<p>So how do you make a sign-in form on a web application set HTTP Auth? Browsers do not allow you to access these settings via script. You can use an XMLHttpRequest object to set authentication, but only after the proper challenge has been sent from the server. The biggest problem is, if the server sends this challenge twice in a row, your browser will intercept the second response and pop up the ugly password prompt. So designing a form that keeps this login prompt from popping up under most circumstances is quite the challenge.</p>
<p>The gist of the issue is that while you can open an XMLHttpRequest object with a user and password for HTTP authentication, the browser will only actually use those credentials after the server has rejected a request. The process looks like this:</p>
<ol>
<li>Your script creates and sends an XMLHttpRequest with HTTP Auth username and password.</li>
<li>The browser submits the request to the server, without sending the username and password.</li>
<li>The server responds with 401 Unauthorized, and a WWW-Authenticate header specifying a realm.</li>
<li>The browser looks in its cache to see if it already has HTTP Auth set for that domain and realm. If it does, it sends those credentials, NOT THE ONES you specified in your XMLHttpRequest. If it does not have those credentials, only then will it set HTTP Auth to what your script asked for.</li>
<li>The server responds. Generally, if the username or password are incorrect, the server will repeat the 401 response, and WWW-Authenticate.</li>
<li>The browser gets its second 401 in a row, and pops up its password box. Your script never gets a chance to intercept this. So if the stored http auth credentials are wrong, or the user mistypes the password, their browser takes over and you get a password prompt.</li>
</ol>
<p>How do you handle this situation? It turns out you need to engage in some trickery on both the client and the server.</p>
<p>Here&#8217;s a basic flow of how you need to handle this, from both the server and the client perspective:</p>
<ol>
<li>First, collect the credentials from the user, and create your request as outlined above.</li>
<li>Browser sends request without credentials.</li>
<li>Server responds with 401 and WWW-Authenticate.</li>
<li>Browser sends cached credentials, if they exist, or your credentials if not.</li>
<li>If credentials are accepted, server allows log in and responds with 200. If credentials are not accepted, server returns an error code OTHER THAN 401, and does not send a WWW-Authenticate:
<ol>
<li>We use 403 Forbidden for a credential failure here. You might also use 400 Bad Request.</li>
<li>Because the response was something other than 401, your browser caches the bad credentials.</li>
<li>XMLHttpRequest status reflects the error condition.</li>
<li>Your script checks the result for the error your server has returned. Now comes the crucial part:</li>
<li>Your script submits a new request with different credentials to some server location that will return successfully. For example, we call a login method on our application, passing username &#8220;public&#8221; and password &#8220;?&#8221;.</li>
<li>The browser sends the new credentials and submits the request.</li>
<li>The server returns 200.</li>
<li>The browser updates its http auth cached credentials with the new bogus ones.</li>
</ol>
</li>
<li>Now you can present an error to the user, and ask for new credentials.</li>
</ol>
<p>The key to the above process is that if the browser gets two 401 responses without having a 200 somewhere between, it will pop up its password box and there&#8217;s nothing you can do about it. So the key is to use a different error code to indicate bad credentials, and do an intervening request that will return 200 so that you can re-authenticate.</p>
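<p>As a rough illustration of that rule, here is a small Python model of the browser behavior described above. It is only a sketch of the rule as stated in this post, not of any real browser&#8217;s internals, and the function name <code>browser_would_prompt</code> is made up for the example.</p>

```python
def browser_would_prompt(statuses):
    """Return True if two 401 responses arrive with no 200 between them.

    statuses is the sequence of HTTP status codes the browser receives,
    in order. Other codes (403, 400, ...) neither trigger nor reset the rule.
    """
    pending_401 = False
    for status in statuses:
        if status == 401:
            if pending_401:
                return True  # second 401 in a row: the password box appears
            pending_401 = True
        elif status == 200:
            pending_401 = False  # a success in between resets the count
    return False
```

<p>With this model, the sequence 401, 403, 200, 401, 200 (bad password, reset request, retry) never prompts, while 401 followed directly by 401 does.</p>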
<p><strong>Logging Out</strong><br />
You cannot really log out of HTTP Auth. But you can change the credentials to a known bad user. That&#8217;s a key technique we use to effectively log out of an application, and we re-use this method to reset after bad credentials.</p>
<p><strong>On the server</strong><br />
I&#8217;m very much still in development with this. You can see the server side code for Project Auriga logins <a href="https://baker.freelock.com/svn/auriga/trunk/include/session_login.php">here</a>.</p>
<p>In this system, we do set a cookie after successful login, to keep from having to check credentials again. This script also allows for cookie-only logins without using http auth. The important bits:</p>
<ul>
<li>action=logout: if this is called, the script always returns successfully. This allows the client script to provide new bogus credentials. It passes a username of &#8220;public&#8221; to log out completely.</li>
<li>action=httpauth: if this is called, and there are no http auth credentials or the http auth username is &#8220;public&#8221;, return a 401 and WWW-Authenticate. This is always the first request from a browser, and triggers the browser to re-request with the credentials.</li>
<li>action=httpauth, with http auth username set, and it&#8217;s not &#8220;public&#8221;: On the second or later requests, we never want to return a 401, or the browser will pop up its password prompt. So we return 403 (or 400) if the credentials are bad, or allow the script to continue processing if they&#8217;re good. In this case, our authenticate method returns true if credentials are good, false if the user is not found, and throws an exception if the credentials are bad.</li>
</ul>
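<p>The three cases above boil down to a small decision table. The sketch below is in Python rather than PHP and is not the actual session_login.php code; <code>choose_status</code> and the <code>credentials_ok</code> flag are hypothetical names, and returning 200 stands in for &#8220;let the script continue&#8221;.</p>

```python
BOGUS_USER = "public"  # placeholder username the post uses for "logged out"


def choose_status(action, auth_user, credentials_ok):
    """Pick an HTTP status for a login request, following the cases above.

    action         -- "logout" or "httpauth" (action names from the post)
    auth_user      -- username from the http auth header, or None if absent
    credentials_ok -- hypothetical flag: did the password check succeed?
    """
    if action == "logout":
        # Always succeed, so the browser caches the bogus credentials.
        return 200
    if action == "httpauth":
        if auth_user is None or auth_user == BOGUS_USER:
            # First request: ask the browser to re-send with credentials.
            return 401
        if credentials_ok:
            return 200
        # Never answer with a second 401, or the browser shows its own prompt.
        return 403
    # Unknown action: an assumption, the post does not cover this case.
    return 400
```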
<p>That&#8217;s basically what you need to do on the server side. Now for the client.</p>
<p><strong>Client-side logins</strong><br />
We&#8217;re using the <a href="http://dojotoolkit.org">Dojo Toolkit</a> extensively in Project Auriga, so the login functions are using dojo.xhr* requests to wrap the XMLHttpRequest objects and provide convenient callback functions. You can see our login code <a href="https://baker.freelock.com/svn/auriga/trunk/public_html/auriga/Login.js">here</a>. Key items:</p>
<ul>
<li>auriga.login is called by the login form. Note that if this is the first visit to this page, the dojo.xhrPost actually happens twice: the first time with no credentials, and the second time with them. If the second post is accepted, auriga.login_complete is called. If the second post returns any kind of error, auriga.login_err is called.</li>
<li>auriga.login_complete is easy&#8230; it just redirects to wherever the server response designates.</li>
<li>auriga.login_err is the real trick here. If it detects the error code we&#8217;ve chosen for bad passwords, it immediately calls the server logout method, to get a good response so the next time the browser gets a 401, it won&#8217;t immediately pop up the password box.</li>
</ul>
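<p>The control flow of those three functions can be sketched as follows. This is a Python outline of the logic only, not the Dojo code in Login.js; <code>handle_login_response</code> and the <code>send_request</code> callable are hypothetical stand-ins for the real callbacks and XHR calls.</p>

```python
BAD_CREDENTIALS_STATUS = 403  # the error code chosen for bad passwords


def handle_login_response(status, send_request):
    """Decide what the client does once the login request completes.

    send_request(action, user, password) is a hypothetical callable that
    performs a request and returns the HTTP status code it received.
    """
    if status == 200:
        return "redirect"  # the login_complete case
    if status == BAD_CREDENTIALS_STATUS:
        # Overwrite the cached credentials with the bogus "public" user and
        # collect a 200, so the next 401 does not directly follow another 401.
        reset_status = send_request("logout", "public", "?")
        if reset_status == 200:
            return "ask_again"
    return "show_error"
```

<p>The important design point is that the reset request happens before the user is asked to retry, so the browser never sees two 401 responses back to back.</p>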
<p>You can see the code in action on our <a href="http://demo.freelock.com">demo server</a>.</p>
<p><strong>Other notes</strong></p>
<ul>
<li>Actually doing single sign-on is hard. We&#8217;re trying out different strategies for detecting whether a user already has http auth set, by calling our login method once on page load, but haven&#8217;t gotten that figured out. In our current script, just clicking Login with the form blank but authenticated elsewhere on the same domain and Realm, will log you in with your existing credentials.</li>
<li>Because your browser stores credentials based on the domain and the realm together, all applications that you set to share these items must accept the same credentials. If you have a different password on a different system on the same server, you must set a different realm, or logging into one will log you out of the other.</li>
<li>If you want to require http auth, but not Javascript, I suggest submitting something different to the server using Javascript to identify this type of request. Perhaps show your form only when Javascript is available, and when it&#8217;s not, have a link to a protected page to let your browser go ahead and show the password dialog.</li>
<li>Using http auth can actually allow users to disable cookies, if your application is RESTful. In Project Auriga, the session login script supports either&#8211;the client pages and logins work with either a cookie or http auth. The login process attempts to set http auth and a session cookie. On subsequent attempts, it uses the cookie to avoid re-authenticating every request.</li>
<li>Finally, a note on security: Basic Authentication provides no protection against passwords being sniffed over the network. If you need a secure login, be sure the server conversation uses SSL&#8211;otherwise neighbors on your wireless network can easily sniff out your password. HTTP Auth does not make your application more secure&#8211;it just makes it easier to share authentication with other resources on the same server.</li>
</ul>
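<p>To make the security note concrete: a Basic Authentication header is just the username and password joined by a colon and base64-encoded, so anyone who can read the request can reverse it. The helper names in this Python sketch are made up for illustration.</p>

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an Authorization header for Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the two; the password may contain colons.
    username, _, password = decoded.partition(":")
    return username, password
```

<p>No key or secret is involved in decoding, which is why SSL is needed to protect the credentials in transit.</p>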
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/06/technical-note-http-auth-with-ajax/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Mythbusting PHP: 10 common myths about PHP</title>
		<link>http://opensourcesmall.biz/2008/02/mythbusting-php/</link>
		<comments>http://opensourcesmall.biz/2008/02/mythbusting-php/#comments</comments>
		<pubDate>Sat, 02 Feb 2008 23:16:01 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/archives/2008/02/mythbusting-php/</guid>
		<description><![CDATA[PHP development is one of our specialties at Freelock Computing. I&#8217;ve written quite a few PHP applications, some from scratch, some starting with other people&#8217;s code, some as extensions for open source projects. I&#8217;ve also read a lot of criticism of PHP, and while some of it comes from knowledgeable programmers expert at PHP, most [...]]]></description>
			<content:encoded><![CDATA[<p>PHP development is one of our specialties at <a href="http://freelock.com">Freelock Computing</a>. I&#8217;ve written quite a few PHP applications, some from scratch, some starting with other people&#8217;s code, some as extensions for open source projects. I&#8217;ve also read a lot of criticism of PHP, and while some of it comes from knowledgeable programmers expert at PHP, most of it is uninformed hogwash. So in this post, I&#8217;m going to dispel many of the myths about PHP code, and identify its real strengths and weaknesses. Most myths have a kernel of truth in them somewhere, so I&#8217;ll try to set the record straight by identifying why PHP has each myth. Ready? Let&#8217;s get started.</p>
<p><strong>1. PHP is crappy because there are so many crappy PHP programs.</strong><br />
This seems to be the biggest reason people think PHP is a bad language&#8211;there are a lot of bad PHP programs out there. Why is this? Probably because PHP is so accessible and ubiquitous that a lot of people without a programming background use it to learn programming. I&#8217;ve worked with programmers inside software companies who have much more formal background, or at least experience programming with others on a team. With somebody to guide them, they quickly learn the pitfalls to avoid, best coding practices, and development methodologies.</p>
<p>Most PHP coders, on the other hand, started out as web designers, putting together a web page for their neighbor, or their family, or a club of some kind. They have no formal training, no experience working on a development team, no guidance or knowledge about what makes for quality code. The result is inevitably spaghetti code, chunks cut and pasted into place without real understanding of how they work, people fiddling with lines until it gives them the result they&#8217;re looking for.</p>
<p>Naturally, this leads to a lot of crappy software out there, riddled with security holes, maintenance nightmares, poor performance, and many other problems.</p>
<p>That does not mean the language itself is at fault. There are plenty of well-written programs out there that do an excellent job of doing the task they&#8217;re designed to do.</p>
<p><em>Result: Busted. Bad programmers do not make for a bad language.</em></p>
<p>Let&#8217;s get a bit more specific about these code quality issues.</p>
<p><strong>2. PHP is crappy because it&#8217;s hard to read all that HTML mixed in with programming logic.</strong><br />
Some argue that PHP is this way because it is a template language&#8211;it was designed to be an easy way to add basic programming functionality to a web site. And while that was its heritage, PHP has grown into a full-fledged powerful language capable of most anything you&#8217;d do with any other language.</p>
<p>Nothing in the language dictates that presentation code (HTML, Javascript) needs to intermingle with business logic. I consider the best programs to have a clear division of responsibility between these areas. We use a strong Model-View-Controller (MVC) architecture when creating custom applications, the same architecture provided by many frameworks, and advocated by experts for many other languages. And we&#8217;re hardly alone in this.</p>
<p>We use the Smarty template system to separate out the templates into a presentation layer. Our model is usually made up of fairly lightweight data objects that own the corresponding database tables. The controller layer is typically a dispatcher of procedural code, often with helper controller objects. You can apply most design patterns to PHP as readily as other object-oriented languages.</p>
<p>Now, the tools you use to develop PHP don&#8217;t enforce any of this. Unless you&#8217;re using a framework, you need to create all this structure yourself. But we don&#8217;t like HTML mixed in with our business logic ourselves, so we don&#8217;t do it.</p>
<p><em>Result: Busted.</em></p>
<p><strong>3. PHP is crappy because it&#8217;s easy to hijack with all those global variables.</strong><br />
Funny how people try to create all these really easy ways to do things that turn out to be large mistakes from a security point of view. Microsoft has done this over and over. PHP has two particularly annoying &#8220;features&#8221; that have turned out to be security nightmares, originally there to make it simple to program: register_globals, and magic_quotes_gpc.</p>
<p>register_globals is a setting that takes any parameters passed in a request and automatically turns them into global variables you can use in your script. The problem with this is that it&#8217;s very easy for an attacker to pre-define a variable that the script assumes to be unset. As I was learning to program in PHP, I wrote a 500-line script to check and double-check that each variable I was expecting from the browser was legitimate, and that all of my other variables were suitably protected.</p>
<p>At its worst, register_globals turned out to allow an attacker to include a malicious PHP file from a remote web server before your script even started, by setting an autoload variable for a particular module.</p>
<p>register_globals is evil.</p>
<p>But its vulnerabilities are widely known, and PHP has been set with register_globals turned off for several years now. It&#8217;s going away entirely in PHP 6.</p>
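<p>To show the kind of hole this opens, here is a Python imitation of the pattern. It is not PHP and not real register_globals behavior in every detail; the function names, the <code>is_admin</code> flag and the password are invented for the example.</p>

```python
def run_with_register_globals(request_params):
    """Imitate a script whose variable scope is pre-filled from the request."""
    scope = dict(request_params)  # roughly what register_globals did
    if request_params.get("password") == "letmein":
        scope["is_admin"] = True
    # Bug: is_admin is never initialised, so the request can supply it.
    return bool(scope.get("is_admin"))


def run_with_explicit_input(request_params):
    """Same logic, but every variable is initialised by the script itself."""
    is_admin = False
    if request_params.get("password") == "letmein":
        is_admin = True
    return is_admin
```

<p>A request carrying <code>is_admin=1</code> and no password gets through the first version and is rejected by the second.</p>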
<p>magic_quotes_gpc is more of a pain. It was added to help prevent SQL injection attacks, and what it does is escape all of the values you receive from GET, POST, and COOKIE parameters, adding backslashes in front of any backslash or quote to make it so programmers who pass these variables straight into a database query have some protection built into the language. But it causes a lot of extra work, because your script doesn&#8217;t know whether this is on or off. If it&#8217;s on and you escape your strings, you end up with extra slashes in front of everything&#8211;and you end up with backslashes scattered all over your pages. We end up checking the setting of magic_quotes_gpc, and if it&#8217;s on, stripping the slashes before the rest of our script interacts with it.</p>
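<p>The double-escaping problem is easy to reproduce. The Python sketch below only approximates PHP&#8217;s addslashes and stripslashes (it ignores NUL bytes, for one), and <code>normalize_input</code> with its <code>magic_quotes_on</code> flag is a hypothetical stand-in for checking the real setting.</p>

```python
def add_slashes(value):
    """Backslash-escape backslashes and quotes, roughly like addslashes."""
    out = []
    for ch in value:
        if ch in ("\\", "'", '"'):
            out.append("\\")
        out.append(ch)
    return "".join(out)


def strip_slashes(value):
    """Undo one level of add_slashes."""
    out = []
    i = 0
    while i != len(value):
        if value[i] == "\\" and i + 1 != len(value):
            i += 1  # skip the escaping backslash, keep the next character
        out.append(value[i])
        i += 1
    return "".join(out)


def normalize_input(value, magic_quotes_on):
    """Return the raw value whether or not the input was pre-escaped."""
    return strip_slashes(value) if magic_quotes_on else value
```

<p>Escaping an already escaped value turns one backslash into three, which is where the stray slashes on the page come from. Normalizing first means the script escapes exactly once, at the point where it builds the query.</p>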
<p>For any experienced PHP programmer, these are solved problems. </p>
<p><em>Result: Busted, but there is valid criticism here.</em></p>
<p><strong>4. PHP code doesn&#8217;t scale well.</strong><br />
Nonsense. This is purely myth. Here are some of the most popular sites on the Internet that run on PHP: Facebook, Flickr, Wikipedia, Digg, parts of Yahoo. All of those are in the top 20 most visited sites on the Internet.</p>
<p><em>Result: Busted. Very busted.</em></p>
<p><strong>5. PHP is mainly a vehicle for Zend to get business.</strong><br />
I didn&#8217;t hear this one until just recently. Zend is a company with a strong stake in PHP. It controls a lot of the code, it has a decent editor with a debugger, a powerful framework, and a PHP accelerator available as proprietary add-ons. I&#8217;ve had a couple developers suggest that Zend has such a controlling interest in the language that it keeps others out, and you have to buy from Zend to make PHP work best.</p>
<p>Yet this ignores the other options out there. Zend does not have a monopoly in any of these areas. There are several other editors with good PHP debugging support, dozens of frameworks, and a handful of PHP accelerators out there, several of them completely free and open source. Now if you&#8217;re trying to change the core PHP language, you may need to work with Zend, and I have heard they aren&#8217;t necessarily the easiest to work with&#8211;they don&#8217;t readily accept changes to core features, and a few developers have left the PHP project because of disagreements over the direction of PHP. And some of these have been serious, related to hardening PHP to prevent some of the preventable security attacks through the language itself.</p>
<p>But as a PHP user, these issues seem far removed. PHP 6 is in development now, and promises some decent improvements such as Unicode support and removal of some of the vulnerable settings.</p>
<p><em>Result: plausible, but not relevant to most PHP developers</em></p>
<p><strong>6. You can&#8217;t compile PHP, so it will always be slow.</strong><br />
PHP is an interpreted language, and it doesn&#8217;t have a built-in compiler. The same is true of other web languages, at least Perl. Python has a built-in runtime compiling system, so you get compiled byte-code without having to do anything. I don&#8217;t know that much about Ruby in this area.</p>
<p>But you can accelerate PHP much as Python does, by adding an accelerator. Zend has a proprietary one. We use eAccelerator on our servers, and there are several others out there. These provide what is called an &#8220;opcode cache.&#8221; When PHP is executed, the interpreter makes two passes: first a conversion of the source to bytecode (opcodes), and then execution of that bytecode. An opcode cache stores the result of that first pass in shared memory or on disk, so subsequent calls can use what is essentially the same as compiled code. While it&#8217;s not permanent, and probably not as efficient as other compiled languages, it does seem to allow our servers to accommodate about 40% more traffic before bogging down.</p>
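<p>The idea behind an opcode cache can be shown in a few lines of Python, using the built-in <code>compile</code> and <code>exec</code>. This is a toy illustration of the concept, not how eAccelerator or any PHP accelerator is implemented; the <code>BytecodeCache</code> class is invented here, and it recompiles when the source text changes where a real cache would typically check the file&#8217;s modification time.</p>

```python
class BytecodeCache:
    """Compile each script once and reuse the bytecode on later runs."""

    def __init__(self):
        self._compiled = {}  # script name -> (source text, code object)
        self.compile_count = 0

    def run(self, name, source, variables):
        entry = self._compiled.get(name)
        if entry is None or entry[0] != source:
            # First pass: turn source into bytecode (the expensive step).
            code = compile(source, name, "exec")
            self.compile_count += 1
            self._compiled[name] = (source, code)
        else:
            code = entry[1]
        # Second pass: execute the bytecode against this request's variables.
        exec(code, variables)
        return variables
```

<p>Running the same script twice through one cache compiles it only once; every later request skips straight to execution.</p>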
<p>Combining this with other caching strategies can allow PHP sites to scale up to serve the largest sites.</p>
<p><em>Result: Plausible, but workarounds available.</em></p>
<p><strong>7. You can&#8217;t develop in PHP as fast as other languages. Like Ruby on Rails.</strong><br />
Ok. Now we&#8217;re getting to the ridiculous one. First off, Rails isn&#8217;t a language, it&#8217;s a framework. And by many accounts, it&#8217;s a good one, providing a lot of really powerful features right out of the box. It might have set a new high standard for developer-friendly frameworks. But it&#8217;s hardly the only one out there, and because it&#8217;s open source, many of the conventions it established have spread widely to other frameworks as well. CakePHP is a framework that aims to be the Rails for PHP.</p>
<p>Rails has its downsides as well. The CEO of Dreamhost has an interesting post about his experiences trying to get Rails to scale. While it may be fast to develop in, it may be at the expense of running fast enough to handle large loads. You also need to learn Ruby, which has quite a bit different syntax than PHP. PHP is quite similar to C, Java, Perl, and other popular languages, so it&#8217;s immediately familiar to many other programmers.</p>
<p>The biggest problem I have with Rails is the dogmatic nature of many of its practitioners. And it has gotten such widespread buzz in such a short period of time that in some ways it&#8217;s become the new PHP, the new pet technology of a lot of inexperienced programmers due to a low barrier to entry. If you&#8217;re a web designer and not already a programmer, you would probably choose Rails to get started with today, instead of PHP, because of all the hype. I think that&#8217;s going to lead to the same proliferation of lousy code that permeates the PHP landscape now.</p>
<p>While Ruby may be a nice language, there&#8217;s a lot more support for PHP right now, in available talent, web servers, scaling experience, and breadth of libraries available. And by starting with an application that meets 90% of your needs today, you can work on what makes your particular problem unique. Since so many applications and libraries are available for PHP that need very little customization to meet many business problems, developing from scratch with a powerful framework isn&#8217;t necessarily the fastest way to get the job done.</p>
<p><em>Result: Busted. While Ruby on Rails is nice, it&#8217;s not the only way to build an application quickly.</em></p>
<p><strong>9. PHP is only good for web applications&#8211;it&#8217;s no good for anything else.</strong><br />
PHP was built to be a web application language, but it has a command line interface, a GUI toolkit based on GTK, and other features that mean you can feasibly write just about any kind of application you can think of in PHP.</p>
<p>However, nobody does. I have not seen a single PHP desktop application in use. While we do use it for scripting a few web server-related tasks, they tend to be maintenance tasks or a forked process from a web application. There aren&#8217;t lightweight PHP libraries optimized to run on embedded devices. </p>
<p>As I look over options to write software for the OpenMoko platform, for example, PHP does not appear to be a compelling option. Likewise, it seriously lacks the ability to interact with hardware or much on the OS level without calling a shell to start some other program. Perl has long been used for these environments, but Python has been taking them by storm.</p>
<p>So while it&#8217;s possible to use PHP for purposes other than web applications, it&#8217;s not convenient, conventional, easy, or widely done.</p>
<p><em>Result: Confirmed.</em></p>
<p><strong>10. It&#8217;s not a real language&#8211;you can&#8217;t do proper object-oriented designs, objects are copied, etc.</strong><br />
PHP was never designed by computer scientists. You could argue it wasn&#8217;t designed at all. It was built from the beginning to solve a specific problem: to make active web sites. And it&#8217;s successful because it&#8217;s done that exceedingly well.</p>
<p>Over time, it has accumulated modules that do just about anything you might want to do in a web application, from talking to just about any database system out there to requesting pages from other servers to processing financial transactions to generating images and even PDF files. It added object orientation in PHP 4, and made it much more robust and similar to other languages with PHP 5. While it still doesn&#8217;t do multiple inheritance, or threading, or similar advanced programming techniques, you can implement most common design patterns, objects are now passed by reference, there are constructors and destructors and all sorts of things that give it as much power as any other language for most web applications.</p>
<p>While PHP certainly has its shortcomings, for the vast majority of web applications, it provides exactly the right combination of sufficient power to do the job, and a relatively straightforward way of getting the job done.</p>
<p><em>Result: Busted, for all but the most specialized applications.</em></p>
<p>That&#8217;s enough for now. In a future post, I&#8217;ll discuss the major drawbacks and benefits of PHP. Stay tuned!</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/02/mythbusting-php/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>The three spheres of web application platforms</title>
		<link>http://opensourcesmall.biz/2008/02/the-three-spheres-of-web-application-platforms/</link>
		<comments>http://opensourcesmall.biz/2008/02/the-three-spheres-of-web-application-platforms/#comments</comments>
		<pubDate>Sat, 02 Feb 2008 17:42:33 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/archives/2008/02/the-three-spheres-of-web-application-platforms/</guid>
		<description><![CDATA[There are thousands of languages out there, but only a couple handfuls are used for web applications. Of these, PHP is a runaway success. Yet I constantly see it criticized by developers of other languages, often for completely untrue reasons. PHP has a bad rap, and while it certainly has its pitfalls, there&#8217;s many good [...]]]></description>
			<content:encoded><![CDATA[<p>There are thousands of languages out there, but only a couple handfuls are used for web applications. Of these, PHP is a runaway success. Yet I constantly see it criticized by developers of other languages, often for completely untrue reasons. PHP has a bad rap, and while it certainly has its pitfalls, there are many good reasons it has become such a popular language for web applications.</p>
<p>I consider there to be three major sets of languages currently used for web development. When talking with developers, you&#8217;ll usually find them gravitating to one of these three spheres: the Windows world of Microsoft ASP, ASP.NET, Cold Fusion, C#; the Java world; and the LAMP world. While some programmers cross between these, you&#8217;ll usually find people that are best in one particular area.</p>
<p>The Microsoft world grew out of ASP and Cold Fusion into the current .NET technologies. There is now an open source version of .NET called Mono, backed by Novell, which makes these technologies cross-platform. They&#8217;re mainly used by Microsoft and its partners, and small proprietary software companies in all sorts of vertical industries. Very few .NET applications are open source, compared to the other technologies.</p>
<p>The Java world seems to dominate the large enterprises. Companies that work with IBM extensively end up with Java-based enterprise applications&#8211;and there are a lot of them. Java was the &#8220;next big thing&#8221; in the second half of the 1990s, but it only seemed to gain a real foothold in large business. Quite a few of these applications are open source, and there&#8217;s a lot of applications large and small you can download freely and deploy&#8211;or pay thousands of dollars to a middleware vendor to have them get you running. Java has a wide mix of open source and proprietary applications available.</p>
<p>LAMP stands for &#8220;Linux, Apache, MySQL, and PHP,&#8221; though there are other P&#8217;s out there like Perl and Python. This describes the other major technology stack used in the web world, and follows the Unix design of small pieces loosely joined&#8211;you can substitute PostgreSQL for MySQL, another web server for Apache, and many other languages for PHP. There are far more open source applications available on the LAMP stack than the other two combined, mainly because the barrier to entry is really low&#8211;all you need is a spare old computer to install the stack, and all the software is free.</p>
<p>There used to be another popular language, TCL, running on the AOLServer, but you really don&#8217;t see much of that these days.</p>
<p>If you&#8217;re developing a web application, you can use any of these technology platforms to get the job done&#8211;in a web environment, they are all pretty much equivalent. Java and .NET have better support for desktop applications, but if your main interface is a web browser, there&#8217;s nothing you can&#8217;t do in LAMP that you can in the others.</p>
<p>LAMP is a family of technologies, with more variety than the other stacks. For the language, besides the &#8216;P&#8217; languages of PHP, Perl, and Python, there&#8217;s also Ruby that has gained a lot of popularity lately. MySQL and PostgreSQL regularly vie for the database slot. Apache pretty much has the web server part locked up, but Linux can even be replaced with Windows to make it the WAMP stack and you can still run most of the same programs.</p>
<p>So why group technologies into these stacks? Mainly because they work well together on the same system. This boils down to the web server part of the system. If you&#8217;re using Microsoft IIS for your web site, you&#8217;ve got .NET, and while it&#8217;s possible to add PHP or Perl, it&#8217;s not commonly done. For Java, you need an application server. But Apache makes it pretty easy to plug in all sorts of the open source languages as modules, and run a bunch of them simultaneously. Many of these differences are historical and cultural, not really technical. It&#8217;s just that these particular sets of technology are regularly used with each other, so they&#8217;re going to be easier to get running and working correctly.</p>
<p>Let&#8217;s take a closer look at the LAMP family. Like many families, there&#8217;s in-fighting and bickering over who is best at what job. PostgreSQL people look down their noses at MySQL, which they clearly consider to be inferior in just about every way (with some justification). Perl people wonder why others program in anything else, Python people think the other languages make programming too difficult, and Ruby programmers pride themselves on writing the shortest code to get the problem solved. They all sneer at PHP, regarding it as a toy language not capable of real programming. Yet you&#8217;ll find more open source web applications written in PHP than all of the rest of them. Why is this?</p>
<p>Read <a href="/archives/2008/02/mythbusting-php/">my next post</a> to find out why.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/02/the-three-spheres-of-web-application-platforms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reliable code: building in robustness</title>
		<link>http://opensourcesmall.biz/2008/01/reliable-code-building-in-robustness/</link>
		<comments>http://opensourcesmall.biz/2008/01/reliable-code-building-in-robustness/#comments</comments>
		<pubDate>Sat, 19 Jan 2008 20:08:09 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[01. Open Source]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[reliability]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/archives/2008/01/reliable-code-building-in-robustness/</guid>
		<description><![CDATA[Ok. Last post on the quality code series. One of the downsides of getting older is realizing you do have shortcomings. You know how when you&#8217;re young, going into a job interview, the toughest question is the one about your weaknesses? We&#8217;re all quite blind to our weaknesses, until experience comes up and forces you [...]]]></description>
			<content:encoded><![CDATA[<p>Ok. Last post on the quality code series. One of the downsides of getting older is realizing you do have shortcomings. You know how when you&#8217;re young, going into a job interview, the toughest question is the one about your weaknesses? We&#8217;re all quite blind to our weaknesses, until experience comes up and forces you to realize you&#8217;re not perfect. Sometimes this happens early, sometimes late, but it happens to everyone sometime.</p>
<p>My coding weakness, it turns out, is reliability. I&#8217;m terrible at handling errors, building test frameworks, doing unit testing. I find all of that stuff quite boring. But it&#8217;s essential to building a reliable application.</p>
<p>Reliability and security go hand in hand. In security, you&#8217;re looking at the attacks, and making sure your code is secure against them. In reliability, you&#8217;re identifying what each chunk of code expects to get, and then defining how to handle exceptions and unexpected input. Done correctly, reliable code is secure. But it&#8217;s a total pain to do, and it takes a lot longer to get there.</p>
<p>One of the code samples I examined recently was set up in a completely class-driven way, though I would not call it object oriented because none of the classes extended other classes. It was a rather simple, flat collection of objects and helpers and interfaces. It was not powerful. My guess is, it is not fast. It did not look very customizable. But it was certainly clear, and every single method inspected every single parameter, making sure the input was valid. Calls to other objects had extensive error handling built-in &#8212; this application looked like it could not fail without notifying the programmer exactly where the failure was, with helpful feedback.</p>
<p>This is tedious work. I save it for the polishing phases of a project, focusing on getting things to work in the first place. But there&#8217;s a strong argument to be made for building reliability into each module from the start. It&#8217;s a very different style of programming, and takes a lot longer to get there, but the end result will inevitably be more secure, less buggy, and more able to account for every possible scenario&#8211;even if it handles a scenario by saying &#8220;I can&#8217;t do that yet.&#8221;</p>
<p>I think there&#8217;s a personality difference between these development styles. The artist figures out some innovative way of solving the problem, gets a proof-of-concept working brilliantly quickly, and cranks through code producing a huge amount in a short amount of time. The craftsman takes a slower, methodical approach, crafting each module individually, building unit tests to make sure it works correctly as he goes, and building a system piece by polished piece.</p>
<p>Successful projects need both. The artist/hacker provides vision, drive, and momentum. The craftsman makes sure the system can handle the load, and can prove it&#8217;s doing what it&#8217;s designed to do.</p>
<p>The 80/20 rule comes into play here. 80% of the features can be hacked together very quickly, in the first 20% of the project. To make the project stand the tests of time, handle everything that might be thrown at it, and act as a foundation for a business or a mission-critical part, you need the craftsman to do the remaining 80% of the work to finish the job and get that final 20% of the functionality complete.</p>
<p>So here&#8217;s a checklist for evaluating reliability of a project:</p>
<ul>
<li>Is the program broken up into discrete modules that can be completely tested one at a time?</li>
<li>Are there unit tests built for each module, testing the output for normal and exceptional conditions?</li>
<li>Is the input to each module validated and properly tested to handle all possible things that may be passed to it?</li>
<li>Does the module handle non-normal input, and raise the appropriate errors?</li>
<li>Are there regular tests of the software as a whole, and each module, to identify tests that fail, or regressions in the code?</li>
</ul>
<p>The only way to ensure reliability is through rigorous testing. Some of the newer programming practices rely on test-driven development&#8211;first you define what a module does, then you develop a test for it, and then only after all that do you finally develop the module until it passes all the tests.</p>
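<p>Here&#8217;s a minimal sketch of what that looks like in PHP, assuming PHP 7 or later. The function <code>parse_quantity()</code> and the helper <code>rejects()</code> are hypothetical names invented for this illustration, and the plain pass/fail checks stand in for a real test framework such as PHPUnit.</p>
<pre><code>&lt;?php
// Hypothetical example: validate input before using it, and fail loudly.
function parse_quantity($raw): int
{
    if (!is_string($raw) || !ctype_digit($raw)) {
        throw new InvalidArgumentException('Quantity must be a string of digits.');
    }
    $quantity = (int) $raw;
    if ($quantity === 0) {
        throw new InvalidArgumentException('Quantity must be at least 1.');
    }
    return $quantity;
}

// Returns true if calling $fn raises the expected validation error.
function rejects(callable $fn): bool
{
    try {
        $fn();
    } catch (InvalidArgumentException $e) {
        return true;
    }
    return false;
}

// Tests for normal and exceptional conditions.
$results = [
    'accepts digits' => parse_quantity('12') === 12,
    'rejects zero' => rejects(function () { parse_quantity('0'); }),
    'rejects negative' => rejects(function () { parse_quantity('-3'); }),
    'rejects non-string' => rejects(function () { parse_quantity(null); }),
];

foreach ($results as $name => $passed) {
    echo ($passed ? 'PASS' : 'FAIL') . ": $name\n";
}
</code></pre>
<p>In a test-driven workflow, the <code>$results</code> list would be written first and would fail until the function was filled in.</p>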
<p>In a small business environment, this all may be too much overhead. 80% of an application may be enough, and at 20% of the cost, much more in line with the budget. But when you need something to be completely reliable, take a look at the testing framework, how much it covers, and how much of the application passes the tests.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/01/reliable-code-building-in-robustness/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Customizable code: writing future-proof code</title>
		<link>http://opensourcesmall.biz/2008/01/customizable-code-writing-future-proof-code/</link>
		<comments>http://opensourcesmall.biz/2008/01/customizable-code-writing-future-proof-code/#comments</comments>
		<pubDate>Sat, 19 Jan 2008 17:15:29 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[01. Open Source]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[customizing]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/archives/2008/01/customizable-code-writing-future-proof-code/</guid>
		<description><![CDATA[Before code can be customizable, it must be clear. But clarity is not enough, if you&#8217;re going to be using a codebase in multiple places. Many open source projects excel at customization. People have enough different uses for an application that very few work perfectly out of the box for everybody. Most companies want to [...]]]></description>
			<content:encoded><![CDATA[<p>Before code can be customizable, it must be <a href="/archives/2008/01/clear-code-building-understandable-applications/">clear</a>. But clarity is not enough, if you&#8217;re going to be using a codebase in multiple places.</p>
<p>Many open source projects excel at customization. People have enough different uses for an application that very few work perfectly out of the box for everybody. Most companies want to apply their branding to the software they use. Some people need an application localized and translated for their audience. Sometimes a company just needs a small change to make the software better fit their needs.</p>
<p>It&#8217;s relatively simple to customize any application, if you have the source code. What becomes a huge challenge is maintaining your customizations when the underlying software is updated. If the software is not designed with specific ways of customizing it, it&#8217;s going to end up being difficult to maintain, unless you have gotten your changes incorporated back into the original software.</p>
<p><strong>Architecting for customization</strong><br />
Applications that are designed for customization have clear divisions of code. This can happen for several different areas:</p>
<ul>
<li>Templates or Themes. Most people want to be able to change the look and feel of a web application. If it has a template or theme system, you can just create a new theme and turn it on. Upgrades can then happen without clobbering your changes.</li>
<li>Language. Most successful open source projects have separate language files containing all of the labels, instructions, menus, and other text the application shows. Many come with multiple translations, and accept others as people contribute them.</li>
<li>Add-ons, plugins, modules, and components. Content management systems like Joomla and Drupal are particularly strong at this. SugarCRM is, too. They have a well-defined way of adding new functionality to the application, keeping it self-contained in a separate unit of code that a site administrator can manage through the interface.</li>
<li>An override mechanism. Some programs make it easy to replace the default behavior with your own version. ZenCart does this well&#8211;you can take many different core files, copy them into a particular directory associated with your site, and change them to make it do what you want. Upgrades to ZenCart will still use your versions of the files, even if the underlying file changes.</li>
</ul>
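<p>An override mechanism like the one in the last item can be sketched in a few lines. This is not ZenCart&#8217;s actual implementation, just an illustration of the idea, assuming PHP 7 or later; <code>resolve_file()</code> and the directory layout are hypothetical.</p>
<pre><code>&lt;?php
// Hypothetical override lookup: prefer the site's copy of a file, fall back to core.
function resolve_file(string $name, string $coreDir, string $overrideDir): string
{
    // basename() strips directory parts, so "../" cannot escape the two directories.
    $name = basename($name);
    $override = $overrideDir . '/' . $name;
    if (is_file($override)) {
        return $override;
    }
    $core = $coreDir . '/' . $name;
    if (is_file($core)) {
        return $core;
    }
    throw new RuntimeException("No file named $name in core or overrides.");
}

// Demo with throwaway directories.
$base = sys_get_temp_dir() . '/override_demo_' . uniqid();
mkdir("$base/core", 0777, true);
mkdir("$base/overrides", 0777, true);
file_put_contents("$base/core/header.php", 'core header');
file_put_contents("$base/core/footer.php", 'core footer');
file_put_contents("$base/overrides/header.php", 'custom header');

echo file_get_contents(resolve_file('header.php', "$base/core", "$base/overrides")), "\n"; // custom header
echo file_get_contents(resolve_file('footer.php', "$base/core", "$base/overrides")), "\n"; // core footer
</code></pre>
<p>An upgrade can replace everything under the core directory, and the customized header still wins.</p>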
<p>When you&#8217;re customizing an application, all of the other aspects of <a href="/archives/2008/01/quality-code-how-do-you-judge/">quality code</a> apply to your customizations, as well as the original code. Your add-on is faster and more secure if you use the application&#8217;s interface for retrieving data instead of including your own. Your add-on is more powerful,  clear, maintainable, and reliable if it uses the application&#8217;s defined ways of customizing it.</p>
<p>While not all open source is designed to be customized, it&#8217;s a strong consideration we&#8217;re looking at when we evaluate a project. So what do you do if you need to customize something that&#8217;s core to an application?</p>
<p><strong>Customizing software not designed to be customized</strong><br />
If you need to make changes to the core part of an open source project, you&#8217;re setting yourself up for a maintenance nightmare. All active server software has updates. No program is perfect. Somebody, somewhere, will find a way to crack into it, and if you have business data or unethical competitors or disgruntled customers or employees, you will get targeted eventually. In the security community, people publish vulnerabilities in programs, so that they may be fixed. That means if you&#8217;re using common software packages, somebody needs to keep them up to date.</p>
<p>If you&#8217;re using software designed to be customized, and all your customizations are outside of the core code, this is not a major problem. A system administrator updates the core software, and if any of your customizations break, your developers update your customizations. However, if you had to make a lot of changes to core files, you&#8217;re in trouble. You either need to re-implement the security fixes in your code, or re-implement your customizations in the updated code.</p>
<p>There are basically 3 strategies for minimizing these issues:</p>
<ol>
<li>Use strong source-code management tools to manage your changes as patch-sets, and re-apply them at each upgrade, rewriting sections that no longer work.</li>
<li>Fork the project, and take over responsibility for managing your branch. You&#8217;ll need to track the vulnerabilities in the parent project, and re-implement security fixes in your own.</li>
<li>Contribute your changes back to the original project, and persuade the maintainer to incorporate them into the main code tree.</li>
</ol>
<p>When you look at these alternatives, clearly #3 is far less expensive for you than the other two&#8211;your customizations are no longer customizations, but part of the core software. This is actually how open source develops, and how you may change from being an open source consumer to an open source contributor.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/01/customizable-code-writing-future-proof-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Clear code: Building understandable applications</title>
		<link>http://opensourcesmall.biz/2008/01/clear-code-building-understandable-applications/</link>
		<comments>http://opensourcesmall.biz/2008/01/clear-code-building-understandable-applications/#comments</comments>
		<pubDate>Wed, 16 Jan 2008 06:18:09 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[01. Open Source]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/archives/2008/01/clear-code-building-understandable-applications/</guid>
		<description><![CDATA[Programming is an exercise in understanding a problem. To program effectively, you need to fully understand, in intricate detail, the problem your program is solving. Sometimes as a programmer you don&#8217;t fully understand the problem until you&#8217;ve wrestled with it a few times in code. Most experienced programmers will tell you that when creating a [...]]]></description>
			<content:encoded><![CDATA[<p>Programming is an exercise in understanding a problem. To program effectively, you need to fully understand, in intricate detail, the problem your program is solving. Sometimes as a programmer you don&#8217;t fully understand the problem until you&#8217;ve wrestled with it a few times in code.</p>
<p>Most experienced programmers will tell you that when creating a large program, you almost always have to scrap your work at least once. At some point, you find that you&#8217;ve programmed your way into a dead end, that you just can&#8217;t quite get where you&#8217;re trying to go without doing it again. This is part of the process of understanding the problem, and usually once you&#8217;ve made this leap, you can visualize the whole thing laid out before you, and the next go around leads to a useful, functioning program. Not only that, but the next go-around has a much higher percentage of clear, understandable code.</p>
<p>Clarity in code is a sign of the maturity of the application. It&#8217;s also a sign of requirements that haven&#8217;t changed from the original. Inevitably, in the real world, code accumulates hairy sections to deal with changing requirements, accreting moss, dirt, and all sorts of cruft as the real world steps in to make things messy. The more clear, organized, well-defined, and well-documented a code base is, the longer it will last in the real world before needing a major revision.</p>
<p>If you see a project that seems completely transparent, easy to figure out, and easy to change, you&#8217;re probably looking at code that has been through some serious revision, and has been recently refactored to reflect the problem it&#8217;s trying to solve. As long as the fundamental assumptions of the design do not change, clean code is easy to enhance, extend, and otherwise adjust to meet new requirements. Until it gets hairy again and is time to start again.</p>
<p>Clean code is elegant. Clean code is flexible. Clean code is related to powerful code, but <a href="http://99-bottles-of-beer.net/language-perl-737.html">code can be powerful without being clean</a>.</p>
<p>Here are some principles we use to develop or identify clean code.</p>
<p><strong>Use a good overall architecture for your application.</strong><br />
Like many other software companies, we use a Model-View-Controller architecture for most of our projects. The Model defines the problem space, what data needs to be stored, and how it&#8217;s broken down. The View is the human interface, the presentation of the software to the user. The Controller connects the model to the view, and often enforces authorization rules and the interface to other systems.</p>
<p>In our applications, the model is almost always object-oriented. We build up classes of objects that correspond to what we&#8217;re modeling. We like using template systems like Smarty for the view, so our designers and front-end coders can change the presentation without affecting core business logic. Our controllers are a mix of objects and functional code, whatever seems most appropriate for the overall system.</p>
<p><strong>Normalize data as much as practical.</strong><br />
In database terms, normalization is the process of identifying all the properties of all the objects that have a one-to-one relationship to each other, that fit cleanly in the same database table. For example, a contact has only one first name and one last name, one father, and one mother (at least in the biological sense), but might have more than one email address, mailing address, and phone number. When modeling this data structure, you might decide to have one contact table that allows for 3 email addresses. Or you might have a separate email address table that allows any number of email addresses associated with a contact. If you were going to fully normalize this data, you would have separate email address tables, phone number tables, and physical address tables. But is this really practical? Does your particular system need to track all the email addresses of a user, or is one (or two) enough? If you can limit it to one email address, it might make a fine unique identifier for your system, if you know your users don&#8217;t share email addresses.</p>
<p>But if you&#8217;re going to track three contacts for a company, why not normalize this into a separate table, and remove the arbitrary limitation? I shudder when I see fields named &#8220;email1, email2, email3, email4.&#8221;</p>
<p><strong>Each database table should be owned by a single class.</strong><br />
If you have a contact table, you should probably have a contact class to manage it. While other classes may query this table in a join, those classes should be getting only specific fields from the table. Only the contact class should write to the contact table, and in most cases, all requests for any contact details should go through the contact class. The rest of your application should talk to a contact object, rather than the underlying data, except when you&#8217;re trying to optimize for speed.</p>
<p>The main benefit of this approach is that you can more easily change the structure of your database tables with minimal impact to your application. If you decide that you really do need more than one email address for a contact, you can do most of the heavy lifting in the contact class, and only need to make small changes to the template to show the new data. The other parts of your application should be unaffected, because they simply request the default email address from your contact object&#8211;which is smart enough to know that&#8217;s now coming from a different table.</p>
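<p>As a rough sketch of that idea (assuming PHP 7.1 or later, with a plain array standing in for the database so the example runs on its own), a hypothetical <code>Contact</code> class might look like this. The class and method names are made up for illustration.</p>
<pre><code>&lt;?php
// Hypothetical Contact class: the only code that knows how emails are stored.
class Contact
{
    private $name;
    private $emails = []; // stand-in for rows in a separate email table

    public function __construct(string $name, string $email)
    {
        $this->name = $name;
        $this->addEmail($email);
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function addEmail(string $email): void
    {
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException("Invalid email address: $email");
        }
        $this->emails[] = $email;
    }

    // Callers used this before the storage change and keep using it after.
    public function getDefaultEmail(): string
    {
        return $this->emails[0];
    }

    public function getEmails(): array
    {
        return $this->emails;
    }
}

$contact = new Contact('Ada', 'ada@example.com');
$contact->addEmail('ada@work.example.com');
echo $contact->getDefaultEmail(), "\n"; // ada@example.com
</code></pre>
<p>Code elsewhere that only calls <code>getDefaultEmail()</code> does not need to know whether one address or many are stored.</p>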
<p>If you really need to do sophisticated table joins to make your application fast, consider setting up a query builder structure. We sometimes set up static methods on a class that modify the different parts of a query to add the desired fields and do the appropriate joins. </p>
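<p>One way such a query builder might look, as a sketch only: each class contributes its own fields and joins to a shared query array. The class names, table names, and columns here are all hypothetical, and the code assumes PHP 7 or later.</p>
<pre><code>&lt;?php
// Hypothetical query builder: each class adds its own fields and joins.
class ContactQuery
{
    // Starts a query that selects the basic contact fields.
    public static function start(): array
    {
        return [
            'fields' => ['contact.id', 'contact.last_name'],
            'from' => 'contact',
            'joins' => [],
        ];
    }
}

class EmailQuery
{
    // Adds the default email address to a query that already selects from contact.
    public static function addTo(array $query): array
    {
        $query['fields'][] = 'email.address';
        $query['joins'][] = 'LEFT JOIN email ON email.contact_id = contact.id AND email.is_default = 1';
        return $query;
    }
}

// Assembles the parts into a single SQL string.
function build_sql(array $query): string
{
    $sql = 'SELECT ' . implode(', ', $query['fields']) . ' FROM ' . $query['from'];
    if ($query['joins']) {
        $sql .= ' ' . implode(' ', $query['joins']);
    }
    return $sql;
}

echo build_sql(EmailQuery::addTo(ContactQuery::start())), "\n";
</code></pre>
<p>The knowledge of how the email table joins to contacts stays in one place, even though the query itself spans both tables.</p>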
<p><strong>Define who is responsible for what.</strong><br />
I&#8217;m not talking about people here&#8211;I&#8217;m talking about classes, files, and functions. Just like classes in the model own particular database tables, you should define which part of the application is responsible for all of the major parts of an application: authentication, authorization, state, the structure of the URL, form handling, initialization, etc. Each one of these functions should be owned by a particular part of the application. This &#8220;meta&#8221; stuff about the system we usually leave in the controller, often with included files dedicated to particular features. We usually build helper methods into base classes inherited by all of our data objects in the model, specifically for state and authorization.</p>
<p>Authentication, verifying that a user is who they say they are, should be consistent across your application. You usually have people log in with a username and password. The problem is, because the Web is stateless, you need to verify that you&#8217;re still talking with the same user on every single request. To do this, you either use HTTP authentication, which passes the same credentials with each request, or you give the browser a token that you match up in a session. Your web application needs to verify the session or credentials with every single request, if it does anything that you don&#8217;t want the Internet at large to be able to do.</p>
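<p>The token approach can be sketched as follows, assuming PHP 7.1 or later. <code>SessionStore</code> is a hypothetical class that keeps tokens in memory; a real application would more likely lean on PHP&#8217;s built-in session handling or a session table.</p>
<pre><code>&lt;?php
// Hypothetical in-memory token store, for illustration only.
class SessionStore
{
    private $sessions = []; // SHA-256 hash of token => username

    // Call after the username and password have been verified.
    public function start(string $username): string
    {
        $token = bin2hex(random_bytes(32)); // 64 hex characters, unguessable
        $this->sessions[hash('sha256', $token)] = $username;
        return $token; // sent to the browser, typically in a cookie
    }

    // Call on every request; returns the username, or null for an unknown token.
    public function verify(string $token): ?string
    {
        $hash = hash('sha256', $token);
        return $this->sessions[$hash] ?? null;
    }
}

$store = new SessionStore();
$token = $store->start('alice');
var_dump($store->verify($token));   // string(5) "alice"
var_dump($store->verify('forged')); // NULL
</code></pre>
<p>Storing only a hash of the token means a leaked copy of the store does not hand out working tokens.</p>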
<p>Authorization, granting access to particular objects and methods for particular users, can be a bit more complicated. There are several different models for authorization: simple ownership, group ownership, user levels, and full-fledged access control lists. Authorization can either be handled by the controller or by the model itself. If the code is clear, it should be apparent where authorization is handled, and how it may be changed.</p>
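<p>As an illustration of the two simplest models, here is a sketch that combines simple ownership with group ownership in one function. The function name and array keys are hypothetical, and the code assumes PHP 7 or later.</p>
<pre><code>&lt;?php
// Hypothetical check: the owner may edit, and so may anyone in the record's group.
function can_edit(array $user, array $record): bool
{
    if ($user['id'] === $record['owner_id']) {
        return true;
    }
    return in_array($record['group'], $user['groups'], true);
}

$task = ['owner_id' => 7, 'group' => 'support'];
$owner = ['id' => 7, 'groups' => []];
$teammate = ['id' => 8, 'groups' => ['support']];
$stranger = ['id' => 9, 'groups' => ['sales']];

var_dump(can_edit($owner, $task));    // bool(true)
var_dump(can_edit($teammate, $task)); // bool(true)
var_dump(can_edit($stranger, $task)); // bool(false)
</code></pre>
<p>Keeping the rule in one function like this is what makes it apparent where authorization is handled.</p>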
<p><strong>Small Pieces Loosely Joined.</strong><br />
Even more than powerful programming, clear programming means breaking things up into manageable, understandable chunks. Each class in the model should correspond to the objects in the real world you&#8217;re modeling. Typical methods on classes in our models are between 5 and 25 lines of PHP code. Some reach 30 or 40 lines, and only the really ugly ones reach 100 lines. If a method is reaching that threshold, it can probably be broken into several smaller helper methods that make the main method more readable. If these helper methods can be reused by other methods, well, you&#8217;re killing two birds with one stone. More often than not, this level of refactoring distills the essence of the problem down into components that make your code more powerful.</p>
<p>Most of the long methods in our code seem to be related to form processing, parsing different parameters to insert or update data across multiple database tables. Through a combination of setting up property maps inside the object, clever getter and setter methods, and utility methods that iterate across relevant properties, these long methods can be reduced to a few calls that make the method much more portable, resilient to bad data, and more easily overridden from subclasses, too.</p>
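<p>Here&#8217;s a minimal sketch of the property map idea, assuming PHP 7.1 or later. The <code>Task</code> class, its fields, and <code>setFromForm()</code> are hypothetical; the point is that one loop over a map replaces a long run of per-field code.</p>
<pre><code>&lt;?php
// Hypothetical data object: a property map drives form handling.
class Task
{
    // form field => [property name, function that casts and cleans the value]
    private static $propertyMap = [
        'title' => ['title', 'trim'],
        'hours' => ['hours', 'floatval'],
        'done' => ['done', 'boolval'],
    ];

    public $title = '';
    public $hours = 0.0;
    public $done = false;

    // Copies only known fields from form input; anything unexpected is ignored.
    public function setFromForm(array $input): void
    {
        foreach (self::$propertyMap as $field => [$property, $clean]) {
            if (array_key_exists($field, $input)) {
                $this->$property = $clean($input[$field]);
            }
        }
    }
}

$task = new Task();
$task->setFromForm(['title' => '  Write tests ', 'hours' => '1.5', 'is_admin' => '1']);
var_dump($task->title); // string(11) "Write tests"
var_dump($task->hours); // float(1.5)
</code></pre>
<p>Adding a field means adding one line to the map, and a subclass could supply its own map without touching the loop.</p>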
<p><strong>Create effective documentation.</strong><br />
I&#8217;m just starting to get into the habit of creating JavaDoc/PHPDoc style of comments, documenting each function and method. I&#8217;m a long time user of the Komodo IDE from ActiveState, and it kindly shows you the comment immediately preceding a function you type, in a tooltip as you provide parameters. Being able to see what parameters your method is expecting, what it returns, and any gotchas about using it without opening the file containing the class, saves a lot of time during development. Those kinds of comments I consider to be required.</p>
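<p>For reference, a PHPDoc-style comment looks something like the sketch below. The function <code>line_total()</code> is a made-up example, and the code assumes PHP 7 or later; the <code>@param</code>, <code>@return</code>, and <code>@throws</code> tags are the standard ones an IDE can pick up.</p>
<pre><code>&lt;?php
/**
 * Calculates the total for an invoice line.
 *
 * @param float $unitPrice Price of one unit, before tax.
 * @param int   $quantity  Number of units; must be at least 1.
 * @param float $taxRate   Tax rate as a fraction, e.g. 0.1 for 10 percent.
 *
 * @return float Line total including tax, rounded to 2 decimal places.
 *
 * @throws InvalidArgumentException If $quantity is zero or negative.
 */
function line_total(float $unitPrice, int $quantity, float $taxRate = 0.0): float
{
    if (1 > $quantity) {
        throw new InvalidArgumentException('Quantity must be at least 1.');
    }
    return round($unitPrice * $quantity * (1 + $taxRate), 2);
}

echo line_total(9.99, 3, 0.1), "\n"; // 32.97
</code></pre>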
<p>On the other hand, a comment that states the obvious is a waste of space. Comment anything unusual or unexpected. For example, if I assign a variable in an &#8220;if&#8221; expression, I&#8217;ll put a comment that I meant to assign it, that it&#8217;s not just missing the extra =.<br />
<code>if ($a = $b->value) // assigns value to $a, skips section if value is false</code></p>
<p>Related to inline code comments, use descriptive variable names, and consistent placeholders. I use $i, $j, $k for loops, $ar for generic arrays in helper functions, $obj for an unknown object, $t for a global Smarty template object. Otherwise I&#8217;m referring to $task, $oldtask, $project, $user, and $todotomorrow.</p>
<p>For complex projects, inline comments are not enough. You need a solid architectural document that illustrates objects and their relationships, workflow, and how to customize. Diagrams are good.</p>
<p>Finally, clear code is tidy code. While PHP isn&#8217;t as picky about tabs and whitespace as Python, properly nested code blocks promote readability, help keep your code valid, and give you a quick indication about how deep you are inside a function.</p>
<p>Clear code invites customization, enhancement, and further development. Clear code is maintainable, and a sign that an application can likely be kept up-to-date for quite a while to come. Clear code takes more time to develop, but usually indicates a better understanding of the problem. Clear code is more portable, more reusable for other purposes, and more powerful.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/01/clear-code-building-understandable-applications/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Powerful code: Get more out of every line</title>
		<link>http://opensourcesmall.biz/2008/01/powerful-code-get-more-out-of-every-line/</link>
		<comments>http://opensourcesmall.biz/2008/01/powerful-code-get-more-out-of-every-line/#comments</comments>
		<pubDate>Mon, 14 Jan 2008 22:28:50 +0000</pubDate>
		<dc:creator>freelock</dc:creator>
				<category><![CDATA[01. Open Source]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[power]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://opensourcesmall.biz/archives/2008/01/powerful-code-get-more-out-of-every-line/</guid>
		<description><![CDATA[Programming borrows a lot from the construction industry. Many programming terms derive from construction: hacking, builds, development, architecture, scaffolding, frameworks, and dozens of others. But in some ways, programming has an element of power beyond construction. Take, for example, a building. When you build a building, you start by pouring a foundation. On top of [...]]]></description>
			<content:encoded><![CDATA[<p>Programming borrows a lot from the construction industry. Many programming terms derive from construction: hacking, builds, development, architecture, scaffolding, frameworks, and dozens of others. But in some ways, programming has an element of power beyond construction.</p>
<p>Take, for example, a building. When you build a building, you start by pouring a foundation. On top of that, you construct a skeleton, add walls, a roof, sheetrock, siding, and all the plumbing and electrical. Each one of these details needs to be built by somebody&#8211;all four walls of each room need to be framed in, wired, and finished.</p>
<p>In the world of programming, however, you really only need to build one wall, and then the computer can create as many copies as you need. So when building your program, you might create a &#8220;wall&#8221; class, which is composed of a bunch of two by fours, sheathing, sheetrock, wiring, and outlets. You might give your wall a set of properties: width between studs, overall width, overall height, position of outlets, the number and dimensions of windows and doors, etc.</p>
<p>Once you have a wall defined with a bunch of appropriate variables, you can then work up to defining a room. Your room might have four walls, with windows and doors in particular positions. Obviously, there are new levels of complexity here, but you don&#8217;t have to build every single wall if you can just specify a new wall with particular characteristics.</p>
<p>Now that we have a generic room, we can extend our room model by creating specific types, or sub-classes, of rooms: bedroom, bathroom, kitchen, utility room. And then we can define an apartment as a particular combination of rooms, and an apartment building as a particular combination of apartments.</p>
<p>A powerful program is one that allows you to say, &#8220;give me an apartment building with this many apartments of this base floorplan, and put it here.&#8221; A few lines of code specifying any details that vary from your standard, and you&#8217;re done with the basic system&#8211;you can start creating custom trim.</p>
<p>Object-oriented programming is powerful because it lets you start with a basic model, and extend it to create variations. Each variation (or subclass) inherits all the hard work that went into the underlying class, but adds only the details that make it different. The bathroom extends a generic room by adding plumbing and fixtures. </p>
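<p>To put the analogy into code, here&#8217;s a small sketch assuming PHP 7 or later. The <code>Wall</code>, <code>Room</code>, and <code>Bathroom</code> classes are hypothetical and far simpler than a real model would be.</p>
<pre><code>&lt;?php
// Hypothetical classes following the wall and room analogy.
class Wall
{
    public $width;
    public $height;

    public function __construct(float $width, float $height)
    {
        $this->width = $width;
        $this->height = $height;
    }

    public function area(): float
    {
        return $this->width * $this->height;
    }
}

class Room
{
    protected $walls = [];

    // Define a wall once and let the computer make the copies.
    public function __construct(float $length, float $width, float $height)
    {
        $this->walls = [
            new Wall($length, $height), new Wall($length, $height),
            new Wall($width, $height), new Wall($width, $height),
        ];
    }

    public function wallArea(): float
    {
        return array_sum(array_map(function (Wall $wall) {
            return $wall->area();
        }, $this->walls));
    }

    public function describe(): string
    {
        return 'room with ' . count($this->walls) . ' walls';
    }
}

// The subclass inherits everything above and adds only what makes it different.
class Bathroom extends Room
{
    private $fixtures = ['sink', 'toilet', 'shower'];

    public function describe(): string
    {
        return parent::describe() . ' plus ' . implode(', ', $this->fixtures);
    }
}

$bathroom = new Bathroom(3.0, 2.0, 2.5);
echo $bathroom->describe(), "\n"; // room with 4 walls plus sink, toilet, shower
echo $bathroom->wallArea(), "\n"; // 25
</code></pre>
<p><code>Bathroom</code> never mentions walls or area, yet it gets both from <code>Room</code>.</p>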
<p>To me, this ability to inherit properties from other objects is the main reason to write object-oriented code. Some languages (like Java) force you to do everything in an object-oriented way, which strikes me as less practical&#8211;you need to find design patterns that work with that model to accomplish what you&#8217;re trying to do. But object orientation provides a powerful way of modeling a system.</p>
<p>When I review code, I&#8217;m looking for object orientation used in an effective, sensible way. Each real world object being modeled in a system should have a corresponding class in the underlying system. Classes should extend some basic data class to avoid repeating the same methods in a bunch of separate classes. Code should be built up into units that can become parts of other units, so that individual chunks can be kept small and understandable. If any PHP file ends up longer than a thousand lines, I start looking for ways of simplifying, streamlining, sharing code with other modules. If any individual method ends up longer than a hundred lines, it should be doing something extremely unusual that isn&#8217;t necessary anywhere else.</p>
<p>The Unix architecture is often summarized as &#8220;small pieces loosely joined.&#8221; Each identifiable chunk should be small and have a clearly defined purpose. Assembling these small pieces into a larger system results in great power while also allowing for reliability, security, and actually getting the project finished.</p>
<p>It&#8217;s all a matter of scope. When you&#8217;re looking at a wall object, you are working with two by fours, nails, and sheetrock. When examining a room, you&#8217;re working with walls, a ceiling, and a floor. Programming should hide the details of lower layers, and allow the programmer to focus on the necessary detail for the scope of the module she&#8217;s working on. The result is powerful code.</p>
<p><strong>Why would you not need powerful code?</strong><br />
Pascal (and <a href="http://dangerousintersection.org/?p=84">many others</a>) is credited with the idea that it takes longer to write shorter code. This series of blog entries certainly illustrates the concept&#8230; The same principle holds true in code. If you&#8217;re creating a web application that&#8217;s never going to need revision, it can be much quicker to just write as you go and end up with some big long pile of spaghetti code. But the instant you need to change it, or worse, somebody else needs to change it, that quickly written, long-winded code takes a lot more time to update.</p>
<p>As far as I&#8217;m concerned, the only reason to not take a structured, measured, powerful approach to coding is that you need something temporary working today, and don&#8217;t care that you&#8217;ll probably have to scrap it and do it right later.</p>
<p><strong>How do you create powerful code?</strong><br />
Powerful code comes from structure. Frameworks deliver structure. This does not mean a particular framework is powerful for your application.</p>
<p>A skyscraper needs a much stronger foundation, and far better design to prevent collapse than a house. In programming, you can either use somebody else&#8217;s framework, or build or grow your own.</p>
<p>Developers love building frameworks. It&#8217;s fun to think of all the things that people might someday do with your framework, and build in a mechanism that provides useful ways of doing those things. The problem is, build too many features into the framework and you just end up with a large bloated blob of code that nobody uses entirely, that nobody even knows how to use properly. Make your framework too small, and people end up having to do more work in the actual application.</p>
<p>The hot framework right now is Rails. It has a lot going for it&#8211;a solid philosophy of convention over configuration, auto-creation of all sorts of things like database tables you otherwise have to build yourself, and other features I&#8217;m sure you&#8217;ve heard about already from all the Rails developers out there.</p>
<p>Personally, I think frameworks like Rails are overrated, hiding too much of the implementation to be valuable. The perfect analogy for this is photography. If you take a basic photography course, you learn about the fundamentals: lens focal length, aperture, shutter speed, focus distance, and film speed. That&#8217;s all you need to know to take great pictures with any camera&#8211;at least any that allows you to set these things manually. Most cameras these days try to automate all of this for you, and most of the time they do a reasonable job. But most cameras also have a whole set of special settings. My Casio has a &#8220;Best Shot&#8221; mode, designed to set the camera up for different scenarios: landscapes, portraits, evening shots, indoors, backlit, etc. Some of these modes do really sophisticated things, but is it better for a photographer to understand all the different programmed modes, or the fundamentals of photography? I would argue the latter&#8211;with an understanding of how photography works, you can operate any camera. With an understanding of the programmed settings of a particular camera, you&#8217;re lost as soon as you move to another.</p>
<p>That&#8217;s the problem with frameworks&#8211;you spend more time learning all the ins and outs and arbitrary ways of tweaking it, instead of focusing on the actual task at hand&#8211;taking good pictures. Then again, I prefer a stick shift to an automatic every time&#8230;</p>
<p>When it comes to frameworks, less is more. The simplest possible framework that fits your application requirements is the one to use. If you can&#8217;t find one that fits, start with some simple data objects, an effective template library, and build your own, but don&#8217;t spend too much time on it&#8211;let it grow as you need it.</p>
<p>In the grand scheme of things, I don&#8217;t need a framework to create a database table for me&#8211;that&#8217;s a lot of extra code for something that only happens once. But for all those things you do need more than once&#8211;for the walls, rooms, and apartments in your building, design with care and power in mind.</p>
<p>For more about power, go read Paul Graham&#8217;s essay, <a href="http://www.paulgraham.com/power.html">Succinctness is Power</a>. Then follow it up with <a href="http://www.paulgraham.com/head.html">Holding a Program in One&#8217;s Head</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://opensourcesmall.biz/2008/01/powerful-code-get-more-out-of-every-line/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

