01 Oct 2007

Anything For Sale By Owner

1 Comment Uncategorized

Anything For Sale By Owner LogoAs I alluded in a post a couple of weeks ago, I have been a bad blogger. And I have neglected my community of readers. However I would like to tell you what I have been doing in the last couple of months while I have been neglecting my blog.

I recently got involved in creating a startup as the lead developer for an online classifieds site called Anything For Sale By Owner. From the ground up this was conceived as a middle-ground between craigslist and ebay where every listing would be charged at a static rate of $1.00/month. The $1.00 is a way to week out the crap from craigslist and the death-by-fees from ebay.

The description is pretty straight forward but the more interesting tid-bits that I know my readers are interested in are as follows:

.NET/C#

We choose .NET mostly because it is what I knew and .NET for me has always been stable, predictable, and performed really well on server-side applications. The alternative was PHP and even through we wrote many of the processing layers by our self (i.e. REST Web Service Handler) the time to deployment was greatly accelerated because of all the work the Microsoft ASP.NET Team has put in to the product. The user of Master Pages and Web Services made for an easy separation between content and display.

MySQL

We choose MySQL for a whole host of reasons mostly based around the costs associated with a single Microsoft SQL Server Standard Edition license. Other reasons we choose MySQL was for scalability, because not only could we install 5 database servers (hardware included) for the same price we could purchase 1 MSSQL database server for but also because the master/slave replication of the databases seemed to be an easier process when we needed to scale horizontally.

REST Web Services/AJAX

We followed the Digg Model for exposing web services and each web service could be changed around to provide output through JSON, RSS, ATOM, KML (where applicable), and XML. I even did a write up about a month ago on the JSON Serializer that I developed for this website. This was very important for the AJAX we needed to control many aspects of the user experience.

Open Search

Open search was one of the value add features that wasn’t in the original spec but was an easy add-on because we already had the Web Services in place to leverage. You can view our open search XML definition file here. If you are unfamiliar with Open Search this is how A9.com defines it:

OpenSearch is a set of simple formats for the sharing of search results. Any website that has a search feature can make their results available in OpenSearch format. Other tools can then read those search results.

So that little search box in the upper right hand corner of your browser is an example of an Open Search client.

SEO

Search Engine Optimization is a very important part of any website today. Because for most new sites and even some of the older ones, Google is going to be the largest most ignored user of the site. Not only will Google look at every single page on your site every week, which I dare any human to try and accomplish, they will also be the largest organic promoter of your site.

One of the corner stones of SEO is easily readable URL’s that contain descriptive keywords. This means you have to have a good URL Rewriter in place. We choose Ionic’s ISAPI Rewriter that integrated nicely with IIS 6.0. However that left a big gap between using Visual Studio’s Intigrated Web Server (which doesn’t support ISAPI) and a full blown IIS Server. The benifits are pretty obvious for running the Intigrated Web Server that comes with Visual Studio, for one you don’t always have to have IIS that comes with Windows XP always running and the host of security problems that comes with it, two I was running Windows Vista and IIS 7.0 has some huge differences from IIS 6.0.

So I sat down one night and wrote my own Apache mod_rewrite compatible HttpModule for running the same rule-set that I defined for Ionic’s ISAPI Rewriter, to fill in the gap and make developing on my local machine as close as running the live web site on IIS 6.0.

If anybody is interested I offer the Url Rrewriter as a free download:

Front End Design

I am a software developer and usually don’t get invovled in the artsy end of the web site design. So I will let my co-worker Tom Lauck describe how he developed the front-end for Anything For Sale By Owner.

So all in all I believe that this website has a very good chance of making in the Wild Wild West that the Internet is, however that is probably just the ramblings of a proud father. Be on the look out for some major marketing campaigns in the NYC Times Square region.

31 Jul 2007

Control Google Bot With The New X-Robots-Tag

No Comments Uncategorized

Google has extended its support for Google Bot restriction by giving us web developers a new tool to stick in our belt. It was announced today on the Google Blog that you can now control access to your non-HTML files on your website with a simple header. The header X-Robots-Tag will allow you to do everything the normal Robots Meta tag will, but now you can do it for the PDF, Word, Image, and any other document you can think of that is served via HTTP. They also announced on the same post a new type of exclusion cause that lets you set when the document will be unavailable, see below for more information on this new feature as well as currently supported ones for use with X-Robots-Tag:

  • INDEX|NOINDEX – Tells whether the page may be indexed or not
  • FOLLOW|NOFOLLOW – Tells whether crawlers may follow links provided on the page or not
  • ALL|NONE – ALL = INDEX, FOLLOW (default), NONE = NOINDEX, NOFOLLOW
  • NOODP – tells search engines not to use page titles and descriptions from the ODP on their SERPs.
  • NOYDIR – tells Yahoo! search not to use page titles and descriptions from the Yahoo! directory on the SERPs.
  • NOARCHIVE – Google specific, used to prevent archiving (cached page copy)
  • NOSNIPPET – Prevents Google from displaying text snippets for your page on the SERPs
  • UNAVAILABLE_AFTER: RFC 850 formatted timestamp – Removes an URL from Google’s search index a day after the given date/time

So how can X-Robots-Tags help you better control the content that is indexed by Google? Well you can now tell the Google Bot that you do not want specific non-HTML documents like PDF, Word, and Image documents that you don’t want them cached on the Google Server or that a paper you have released on your website in PDF format should only be good until a specific date. So now you just need to force you server to include an addition X-Robots-Tag in the header which can be done with any of the modern languages and server, the header would look something like this:

Date: Tue, 31 Jul 2007 21:41:38 GMT
Server: Apache/1.3.37 (Unix) PHP/4.4.4
X-Powered-By: PHP/4.4.4
X-Robots-Tag: index, noarchive, nosnippet
Connection: close
Transfer-Encoding: chunked
Content-Type: application/pdf

You can do this with anything that can be served over HTTP now, so this is a huge boost for any of us control freaks that like to have our content easily organized and controlled on what is searchable on Google.

15 May 2007

A Blog Owners Best Friend Google Analytics

No Comments Uncategorized

A major update has been pushed out for Google Analytics, as described in a post on Google Webmaster:

Webmaster tools from Google are indispensable for people who optimize their site for indexing in Google. Eighteen months ago, Google launched another free tool for webmasters – Google Analytics – which tells you about your visitors and the traffic patterns to your site using a JavaScript code snippet to execute tracking and reporting. This past Tuesday, Google Analytics launched a new version, with an easier-to-use interface that has more intuitive navigation and greater visibility for important metrics. We also introduced some collaboration and customization features such as email reports and custom dashboards.

I simply love this tool, and the data it provides is invaluable to my day to day operations of this website.

New Google Analytics