Project HOSPEXerver
=====================

written by: Ian Feldman
Sept-Oct 94

Abstract
----------

HOSPEX is a database of individuals (from all parts of the world)
who have pledged to host one another in their own homes during
tourist stays. Anyone may join by mailing a filled-in form to the
database owner/ maintainer, who then validates it and appends it
to a log file. Information about new applicants is disseminated
manually to all members via a related HOSPEX-L mailing list.

However, currently the only way for members to search the database
and extract records is by fetching the entire accumulated logfile
via FTP, then searching it manually (grep etc.) for entries of
hosts in countries/ regions/ areas of interest, i.e. a highly
tedious and restrictive procedure.

Objective
-----------

To widen the appeal of the HOSPEX service by linking it to the
WorldWideWeb, which should simplify access to it for members and
non-members alike. It would allow realtime browsing and, for
members, retrieval of individual records.

An additional target is to ease the administration of the service
by providing the Hospex-adm. with a front-end capable of automating
most of the administrative chores (record validation, posting,
updating).

Problem description
---------------------

* Technically HOSPEX is a database of page-size ASCII records
  containing a number of free-length fields indexed by plaintext
  named labels. A label is delimited from the content of its field
  by a trailing colon. There are no explicit field separators,
  only implicit ones.

* Because access to the database is restricted to its members,
  any linking of it to the WWW will by necessity require
  _selective_ suppression (or other transformation) of parts of
  the output. Else there would be no incentive for non-members to
  join up.

* Any changes **must** still allow for non-anon FTP accesses using
  the simplest of tools: a TTY ftp client and automatic ftpmail
  operations (non-anonymous).

* Although it is possible to configure an HTTP server to provide
  full user authentication to allow or deny access to a service
  in full, our solution must be able to distinguish between
  privileged and non-privileged accesses. On detection of the
  latter the database must still be readable, but perhaps with
  certain key member-identifying fields suppressed or encrypted.

The proposed solution
-----------------------

* The records will continue to be plaintext files, one record/
  member, not marked up beyond what's already in place (this
  takes care of simple-ftp accesses).

* Instead of the current flat-file organization, existing records
  will be stored in individual files (= nodes in the future record
  tree), and new records _automatically_ added to relevant
  directories. Such files, one per member, will uniformly be named
  '[nn].txt' where nn = a two-digit --leading 0 where so required--
  sequential number of the record in its directory.

* Files will be arranged in a nested, hierarchical directory
  structure reflecting the geographical and regional divisions in
  the real world. The organization will allow parallel control/
  navigation files, so browsing can be done along the lines of
  'Continent/ Country/ Town' as well as by 'Continent/ Region/
  Country/ State/ Town/' where so required.

* The navigation/ browsing structure will be stored in separate
  documents marked up as HTML and updated/ regenerated
  automatically by the HOSPEXerver each time the database has been
  changed. In this manner the database will always stay up to
  date. The entire _primary_ structure will be made up ONLY of
  text/plain member-records named '[nn].txt' and 'index.html'
  documents at _all_ levels of it.

* Some indexes, eg. those for geographic regions like
  'Scandinavia' in Europe and 'EastCoast' in the USA, will not
  EVER be changed/ updated. Such documents can be considered
  'static', and should therefore stay locked to prevent their
  accidental regeneration when the whole control structure is
  being updated.

* When the database is accessed by a member, the HOSPEXerver will
  in realtime transform the requested text/plain record into a
  fully-qualified HTML data stream, with the EMAIL: field in the
  record made into a clickable HTML-mailto: anchor.

* When the database is accessed by a non-member, the HOSPEXerver
  will in realtime selectively suppress the content of the NAME:,
  ADDRESS: and PHONE: fields, while changing the EMAIL: field into
  a clickable HTML-mailto: anchor pointing to the hospex-request
  or equivalent address. See the samples (ALL leading-dot-files
  are simulations of the relevant [nn].txt and index.html files).

Proposed directory structure
------------------------------

   Continent1/
   Continent2/
   ContinentN/
      index.html
      Country1/
      Country2/
      Country3/
      RegionA.html
      RegionB/
      |  index.html
      CountryN/
      |  index.html
      |  Town1/
      |  Town2/
      |  RegionA/
      |  |  index.html
      |  RegionB.html
      |  TownN/
      |  |  index.html
      |  |  01.txt
      |  |  02.txt
      |  |  03.txt
      |  |  nn.txt

* Continents: Africa, AU-NZ, Asia, Europe, NAmerica, SAmerica

* Country-Level Regions: eg. Scand{inavia} in Europe; E{ast}Coast
  in NAmerica; FarEast in Asia. (more)

* Town-Level Regions: contain mainly indexes pointing to countries
  one level up (eg. American states), and towns on the same level.
  (more).

* Observe that there may be both Region-directories with a single
  index.html document in them --AND-- RegionN.html indexes in
  their own right.

Maintainer usage scenario
---------------------------

( 1) A new HOSPEX member application arrives via mail. The
     maintainer ('M') saves it in a separate file, here called
     file0.

( 2) Using a dedicated shell script or command, M submits file0
     to the HOSPEXerver for validation, cleaning up and
     reformatting. Because records will later be partly enhanced/
     transformed to HTML in realtime, all files must be
     reformatted acc. to supplied (or other) samples. Output is
     'file1'.

( 3) As part of the validation process the HOSPEXerver extracts a
     few keywords with which to construct the suitable path to
     the file-record.
     If the database does not contain the necessary directories
     it asks the maintainer whether to create them. Else it
     checks the number of [nn].txt files in the target directory,
     increments nn by 1, stores file1 automatically as
     [nn+1].txt and sets up a flag for update of the relevant
     'index.html' file or files along the path.

( 4) Periodically M may issue an 'update [dir]' command to rewrite
     the index file(s), which will then reset the flag and clear
     any associated memory registers etc. Alternatively, on quit
     the HOSPEXerver checks the status of the flag and asks
     whether to update/ rewrite the indicated file(s).

( 5) After file1 has been incorporated in the database the
     HOSPEXerver automatically posts it to the HOSPEX-L mailing
     list.

( 6) Occasionally M will need to remove members, for which there
     will also be a special command.

End-usage scenario
--------------------

* Members are sent instructions on how to configure their WWW or
  other front-ends to access the database in privileged fashion.
  More specifically, they are told how to construct a base URL
  with their password already inserted, and how to ensure that
  any {Mosaic | browser} .global-browse-history files or
  equivalent, normally readable by all, are made accessible only
  to themselves.

* Non-members are provided with the URL to the top-level index.

* The HTTP server (not HOSPEXerver!) listens to both the default
  HTTP and FTP ports, recognizes and notes which is which (member
  or non-member, here referred to as ftp- and http-type accesses).
  As long as the URL of the pointed-to file does not end in
  '.txt', it provides normal http-server services (navigation and
  delivery of 'index' and other .html documents). Once the URL is
  that of a .txt-suffixed file it calls the HOSPEXerver with the
  argument matching the type of the current access, plus the URL.

* The HOSPEXerver extracts the requested .txt file and transforms
  it into an HTML datastream acc. to supplied rules (samples),
  which it then returns to the HTTP server for forwarding to the
  WWW client.

HOSPEXerver scripts--commands
-----------------------------

REALTIME cmds issued by the  !
HTTP server in response to   !
request from HTTP / FTP port !

% enhance -http path-to-file

     # on detection of a request for a HOSPEX file
     # incoming from the HTTP port (default '80'),
     # ie a common, non-privileged access attempt,
     # the HTTP server calls the HOSPEXerver with
     # the 'http' argument and the partial-URL

% enhance -ftp path-to-file

     # on detection of a request for a file coming
     # from the FTP port (default '21'), ie access
     # from a HOSPEX member, the HTTP server issues
     # a call for acceptable MIME types, and checks
     # if 'text/html' is returned. If not then it
     # assumes that the request came from an FTP
     # client, calls the ftp daemon and delivers the
     # file as plaintext. No transformations are
     # attempted. Else it calls the HOSPEXerver with
     # the 'ftp' argument and the partial-URL.

HOSPEXerver BATCH Commands   !  Raw source file is assumed to be
suitable for assembly into   !  a single RFC 822 mail-message
an executable shell script   !  with a HOSPEX member form.
for web-administration duty  !
____________________________    Unix file syntax

% formatRecord file

     # verify/ clean-up/ modify a raw source text;
     # save in work directory (name doesn't matter).
     # Cleaning up means stripping off unwanted
     # mailheaders and reformatting the text acc.
     # to sample. Output file is all-plaintext,
     # not HTML.

% addRecord file [ dir ]

     # if dir not present parse the content of the
     # COUNTRY: and TOWN: items in file, extract
     # and construct a path in the form of
     # /hospex/country/town/, then attempt to store
     # file as "nn.txt", where nn = number of
     # [nn].txt files already in that directory
     # plus 1; automatically create new
     # subdirectories as needed, named after each
     # (new) country/ region/ town; if unable to
     # determine the destination return a verbose
     # error message "Don't know where to place
     # country/town"

% deleteRecord file [ dir ]

     # parse the content of Reply-To: or, if absent,
     # From: header in file, extract and search
     # the database for presence of it and, if found,
     # delete the record in question and set up a
     # flag for later regeneration of index.html
     # in found directory. If unable to find, exit
     # with a verbose error message "Member
     # not found in database"

% update [ dir ]

     # if unlocked regenerate the index.html for
     # the given dir; differing formatting rules
     # depending on the level on which update is
     # being attempted.

% update all

     # regenerate the entire control hierarchy
     # of the database, but replace only the
     # unlocked index.html documents. Differing
     # formatting rules depending on the level
     # on which update is being attempted.

% lock [ file ] [ dir ]

     # toggle lock of file (any index) in current
     # or indicated directory; return a verbose
     # message "country/town/index.html NOW locked"
     # or "country/town/index.html NOW unlocked"

% announce file

     # sends file to the HOSPEX-L mailing list

% quit

     # check status of the regenerate-structure?
     # flag prior to exiting. If set, ask whether
     # to update/ regenerate the control structure
     # of the database (unlocked documents only)
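
The realtime 'enhance' transformation described under "The proposed
solution" might be sketched roughly as follows. This is only an
illustration in Python, not the HOSPEXerver itself: the function
names, the label pattern (a line starting with capitals and ending
in a colon), the '[members only]' placeholder text, the definition-
list HTML layout and the hospex-request@example.org address are all
assumptions made for the example. Only the field names NAME:,
ADDRESS:, PHONE: and EMAIL: and the member/ non-member rules come
from the text above.

```python
import html
import re

# Assumption: a label is a run of capitals at the start of a line,
# delimited from its field content by a trailing colon.
LABEL = re.compile(r"^([A-Z][A-Z0-9 _-]*):\s*(.*)$")

# Fields hidden from non-members, as listed in the proposal.
SUPPRESSED = {"NAME", "ADDRESS", "PHONE"}

# Hypothetical stand-in for "the hospex-request or equivalent address".
REQUEST_ADDRESS = "hospex-request@example.org"


def parse_record(text):
    """Split a plaintext record into ordered (label, content) pairs.

    Fields have no explicit separators, so a field is taken to run
    until the next line that starts with a label.
    """
    fields = []
    for line in text.splitlines():
        match = LABEL.match(line)
        if match:
            fields.append([match.group(1), match.group(2).strip()])
        elif fields and line.strip():
            # Continuation line: append it to the current field.
            joined = fields[-1][1] + " " + line.strip()
            fields[-1][1] = joined.strip()
    return [(label, content) for label, content in fields]


def enhance(text, member):
    """Return the record as HTML, suppressing fields for non-members."""
    rows = []
    for label, content in parse_record(text):
        if label == "EMAIL":
            # Members get the real address, others the request address.
            target = content if member else REQUEST_ADDRESS
            value = '<a href="mailto:%s">%s</a>' % (
                html.escape(target, quote=True),
                html.escape(target),
            )
        elif not member and label in SUPPRESSED:
            value = "[members only]"
        else:
            value = html.escape(content)
        rows.append("<dt>%s:</dt><dd>%s</dd>" % (html.escape(label), value))
    return "<html><body><dl>\n%s\n</dl></body></html>" % "\n".join(rows)
```

Under these assumptions, enhance(record_text, member=True) would
correspond to '% enhance -ftp path-to-file' and
enhance(record_text, member=False) to '% enhance -http path-to-file'.
The stored [nn].txt file is never modified, so plain FTP retrieval
keeps working as before.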