In the (good ol') days when FTP and archie were king, it was fairly simple for developers to spread their offerings far and wide. I had scripts set up to drop the right files in the right locations, and it didn't much matter if there were two or twenty archives.
Enter the Web, and the focus shifts from pushing software out to archives to pulling people into web sites. I think that's a good thing because it puts more information into users' hands, but in the process developers have lost the ability to easily push anything out. Instead, we have to visit a number of tracking sites by hand (the more the better, usually), set up accounts, and enter essentially the same information on each of them.
So my long-winded question is essentially this: Is there any interest in automating this process? I am proposing that an XML file format be published that contains all, or nearly all, of the information gathered at your various software tracking sites. If a general software description file format can be agreed on, simply making that file available would give your sites all the information they need to update their database entries. No fuss, no muss. Minimizing the administrative effort should lower the barrier to entry for all sites.
You can find the SPIF XML DTD and some reference files for Subsume Technologies software at
Keep in mind that this is a work in progress and represents only a first-pass effort at a format that contains sufficient information to satisfy most tracking software. For our purposes, a piece of software seems to have the following basic elements: description, version, author, system, license, package, changelog, external, category, icon, and purchase.
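To make the list above concrete, here is a minimal Python sketch that builds an empty record with one child per basic element. The element names come from the list; the root name "software" is purely a placeholder of mine, and the actual DTD may nest or name things differently.

```python
import xml.etree.ElementTree as ET

# Top-level element names as listed in the text. The root name "software"
# is a placeholder assumption; the real DTD may use something else.
BASIC_ELEMENTS = [
    "description", "version", "author", "system", "license", "package",
    "changelog", "external", "category", "icon", "purchase",
]

def build_skeleton(root_name="software"):
    """Return an empty record with one child per basic element."""
    root = ET.Element(root_name)
    for name in BASIC_ELEMENTS:
        ET.SubElement(root, name)
    return root

skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```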
The description element identifies the software in increasing levels of detail. Currently that is done with 3 sub-elements: name, short, and long. It is possible that it would be cleaner to break out the name element and have multiple simple description elements with a type attribute of short, long, etc. Any preference?
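For comparison, the two description layouts might look roughly like the sketch below. Both samples are my own illustration (root name "software" and all sample text are made up), and the small reader simply shows that a scanner could accept either form and end up with the same data.

```python
import xml.etree.ElementTree as ET

# Layout A: the current form, with name, short, and long sub-elements.
LAYOUT_A = """<software>
  <description>
    <name>Example Tool</name>
    <short>A one-line summary.</short>
    <long>A longer paragraph describing the tool.</long>
  </description>
</software>"""

# Layout B: name broken out, plus simple description elements with a type.
LAYOUT_B = """<software>
  <name>Example Tool</name>
  <description type="short">A one-line summary.</description>
  <description type="long">A longer paragraph describing the tool.</description>
</software>"""

def read_description(xml_text):
    """Return {"name": ..., "short": ..., "long": ...} for either layout."""
    root = ET.fromstring(xml_text)
    name = root.findtext("name")
    if name is not None:
        # Layout B: name is a sibling of the typed description elements.
        result = {"name": name}
        for desc in root.findall("description"):
            result[desc.get("type")] = desc.text
        return result
    # Layout A: everything lives under a single description element.
    return {child.tag: child.text for child in root.find("description")}

print(read_description(LAYOUT_A) == read_description(LAYOUT_B))
```

Layout B makes it easy to add new description types later without touching the DTD's element list, at the cost of a slightly less obvious structure.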
The version element identifies the version of the packaged software. It also has a status attribute to identify how mature the software is (e.g., "Beta").
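As a rough idea of how a tracking site might use this element, the sketch below compares the version and status in a fetched file against a stored database entry. The root name "software", the sample version number, and the shape of the database entry (a plain dict) are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of a fetched description file.
record = ET.fromstring(
    '<software><version status="Beta">1.3</version></software>'
)

def needs_update(root, db_entry):
    """True if the file's version or status differs from the stored entry."""
    version = root.find("version")
    current = (version.text.strip(), version.get("status"))
    stored = (db_entry.get("version"), db_entry.get("status"))
    return current != stored

print(needs_update(record, {"version": "1.2", "status": "Beta"}))
```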
The author element identifies who owns the software. This is done with 2 sub-elements: name and location.
The system element identifies which operating systems the software runs on. This is done with 2 sub-elements: name and version. It may also be a good idea to add further system requirements (RAM, HD, etc.), but that does not seem to be a major consideration from a tracking point of view.
The license element identifies the license the software is distributed under. This is done with 2 sub-elements: name and location.
The package element identifies all the files that are associated with this software. Each type of package is identified (logically) with the type attribute (currently restricted to binary, source, and info; should it just be left open?). The sub-elements of a package element are location, size, checksum, and contact. I included contact in case the contact for, say, the binary differs from the contact for the source, but I'm not sure that's really necessary (maybe just an email element in the author element?).
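One thing the size and checksum sub-elements would allow is automatic verification of a downloaded file. The sketch below assumes the size is in bytes and the checksum is a SHA-256 hex digest; neither is stated in the format, so treat both as guesses. The URL, contact address, and payload are made up.

```python
import hashlib
import xml.etree.ElementTree as ET

# Stand-in for a downloaded archive.
payload = b"example archive contents"
digest = hashlib.sha256(payload).hexdigest()

# Hypothetical package entry; size unit (bytes) and checksum algorithm
# (SHA-256) are assumptions, not something the format specifies.
package = ET.fromstring(f"""<package type="source">
  <location>http://example.org/example-tool-1.0.tar.gz</location>
  <size>{len(payload)}</size>
  <checksum>{digest}</checksum>
  <contact>maintainer@example.org</contact>
</package>""")

ALLOWED_TYPES = {"binary", "source", "info"}

def verify_package(pkg, data):
    """Check the type attribute, then the size and checksum of the data."""
    if pkg.get("type") not in ALLOWED_TYPES:
        return False
    if int(pkg.findtext("size")) != len(data):
        return False
    return pkg.findtext("checksum") == hashlib.sha256(data).hexdigest()

print(verify_package(package, payload))
```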
The changelog element identifies the changes that have been made since the previous version. This is done with one or more change sub-elements.
The external element identifies other software that is in some way related to this software. That relationship is identified with the dependency attribute (currently restricted to requires, suggests, and conflicts; it will probably get more choices or be left open in the future). The sub-elements of an external element are name, version, and location. In the future, maybe we can modify the DTD to also allow external elements to be identified with just a spif sub-element.
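To illustrate how the dependency attribute could be consumed, here is a small sketch that checks a record's external elements against a set of installed package names. The package names, URLs, root name "software", and the idea of matching by name only (ignoring version) are all simplifying assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical record with one external element per dependency type.
record = ET.fromstring("""<software>
  <external dependency="requires">
    <name>libfoo</name><version>1.2</version>
    <location>http://example.org/libfoo</location>
  </external>
  <external dependency="suggests">
    <name>libbaz</name><version>0.9</version>
    <location>http://example.org/libbaz</location>
  </external>
  <external dependency="conflicts">
    <name>libbar</name><version>2.0</version>
    <location>http://example.org/libbar</location>
  </external>
</software>""")

def check_externals(root, installed):
    """Sort external entries into missing, suggested, and conflicting."""
    report = {"missing": [], "suggested": [], "conflicting": []}
    for ext in root.findall("external"):
        name = ext.findtext("name")
        kind = ext.get("dependency")
        if kind == "requires" and name not in installed:
            report["missing"].append(name)
        elif kind == "suggests" and name not in installed:
            report["suggested"].append(name)
        elif kind == "conflicts" and name in installed:
            report["conflicting"].append(name)
    return report

print(check_externals(record, installed={"libbar"}))
```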
The category element identifies how the software should be organized. Looking at the various tracking sites, I found no real consistency in the arrangement or naming of software categories. Most sites also had extra fields for keyword descriptions of software (for search purposes). I'm hoping these features can be subsumed by this one category element. It should be considered a prioritized list of organization and search keywords. The software that scans the file should be able to look at this list and determine where the entry fits among all the other software being tracked. If it fails, I suppose it would be up to the tracker to either adjust the scanning software to be more robust or inform the developer of the error. Maybe attributes can be defined in cases where specific (read: lazy) sites need specific category names.
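Here is one way a scanner might treat the category list as prioritized keywords. I am assuming the list is expressed as repeated category elements in priority order, which is only one possible reading; the section names and keyword table belong to an imaginary tracking site.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: repeated category elements, highest priority first.
record = ET.fromstring("""<software>
  <category>network</category>
  <category>ftp</category>
  <category>client</category>
</software>""")

# An imaginary site's mapping from keyword to its own section names.
SITE_SECTIONS = {
    "ftp": "Internet/File Transfer",
    "network": "Internet",
    "editor": "Text Tools",
}

def place(root, sections):
    """Return (section, keywords); section is None if nothing matches."""
    keywords = [c.text.strip().lower() for c in root.findall("category") if c.text]
    for keyword in keywords:
        # The first keyword the site recognizes decides the section.
        if keyword in sections:
            return sections[keyword], keywords
    return None, keywords

section, keywords = place(record, SITE_SECTIONS)
print(section, keywords)
```

All keywords are returned regardless of the match, so the site can still index them for search; a None section is the failure case where the tracker would have to step in.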
The icon element identifies a graphic associated with the software. This is done with 3 sub-elements: icon/location, icon/height, and icon/width. I'm thinking height and width should maybe be attributes. Thoughts?
The purchase element gives information on purchasing the software. This is done with 2 sub-elements: price and location.
Conclusion? We've just begun!