The HTML version of this document has - of course - been produced using AscToHTM itself. No post-processing has been done to the HTML pages produced. The contents list, the navigation bar and all the hyperlinks have been generated from a single [[SOURCE_FILE]] a2hdoco.txt and a number of small configuration files.

The source text file for this manual is over 5,000 lines and *still* growing, having spawned the 6,500 line [Policy Manual] and a 3,800 line [Tag Manual]. See section 6.1 of this document to see a list of the actual files involved.

Any RTF version has been generated by the new text-to-RTF program AscToRTF which uses the same analysis engine as AscToHTM.

This document describes AscToHTM version [[TEXT 4.1]], which is available from August 2001 onwards.

$_$_CONTENTS_LIST

1 Introduction
------------------

AscToHTM is an ASCII to HTML conversion tool. It has, of course, been used to generate the HTML version of this document from the text file a2hdoco.txt (see [[GOTO an example conversion]] for more details).

The HTML version of this document is presented "as is". That is, *no post-production of the HTML has occurred*. This should give you a flavour of what AscToHTM is capable of.

Any RTF version of this document will have been made by [AscToRTF], the sister product that shares the same text analysis engine.

AscToHTM is made available for download via the Internet from [download location].

1.1 AscToHTM's design objectives

1.1.1 Intelligent analysis.

AscToHTM is designed to analyse a document to determine its structure and layout. This analysis allows AscToHTM to decide how best to mark up the HTML so as to accurately represent the author's original meaning as far as possible.

This analysis helps AscToHTM to reduce errors by allowing it to spot anomalies in the document source. This is important in minimising the amount of any post-production work required to fix errors.

1.1.2 Human-readable HTML

AscToHTM tries to create HTML that can be easily read and modified in an editor. This is useful if corrections are necessary, or further development is required. For example AscToHTM

a) produces short (usually <80 character) output lines
b) attempts to indent the HTML to match the output indentation
c) adds comments to the HTML to indicate include files etc.
d) uses tags for indentation, rather than placing the whole file in ... tags
e) produces "clean" HTML without large numbers of unnecessary tags.

Note that later moves towards more standards-compliant and browser-compatible HTML tend to work against producing human-readable code. For example most browsers have rendering problems when newline characters are placed in certain key locations, whereas adding newline characters can make the HTML easier to read.

1.1.3 Simple user input

Inevitably users have to supply additional information to tell AscToHTM where its analysis has gone wrong, and to add details such as a document title.

AscToHTM offers a large number of options (also known as "policies") that the user can modify. Broadly speaking, these policies fall into two camps:

- *Analysis policies*. These policies affect the way AscToHTM analyses your file, and can be used to disable searches for things like bullets, or to specify whether or not underlined headings are to be expected.

- *Output policies*. These policies influence the types of HTML markup that are produced. They also allow you to add colour, headers, footers, background images and much more to your pages.

AscToHTM can save your policies to a file, so that next time you run it you can load this information back from the "policy" file. This also allows you to create different sets of policies (e.g. to use different colour schemes). Policies are described fully in the [Policy Manual].

You can further refine the conversion by placing special lines and tags into your source file. These are known as pre-processor commands (see [[GOTO Using the preprocessor]]) and in-line tags (see [[GOTO In-line tags]]). The preprocessor tags are described fully in the [Tag Manual].

To help users formulate and modify their document's policy, AscToHTM can be made to create an output policy file (see 4.2.2.9). Users can then simply edit this file and feed it back into the conversion process.

A summary of the recognised policy lines is given in the [Policy manual].

1.1.4 Standards compliance.

Earlier versions of AscToHTM (before version [[TEXT 3.2]]) made no real attempt to be standards compliant. Now standards compliance is a stated goal of the program.

Sadly I can't _guarantee_ standards compliance because the HTML generation is so complex that errors can and do occur, but it _is_ a goal, and usually documents will validate with few problems.

Compliance has proved to be vital to get cross-browser compatibility, and to stand a chance of successfully applying CSS to created pages.

Original versions of AscToHTM were (loosely) targeted at producing HTML [[TEXT 3.2]] code. Currently the software is targeted at "HTML [[TEXT 4.0]] Transitional", which allows CSS, but also permits some deprecated tags. This is a compromise standard that is best placed to be well viewed by V3 and V4 browsers.

Future versions of the program may attempt to generate stricter HTML [[TEXT 4.0]] code, while still offering production of the earlier HTML standards.

The policy [[HYPERLINK POLICY,"HTML version to be targeted"]] offers some ability to choose the style of HTML generated.

1.2 Expected uses of AscToHTM

- Placing text files quickly and easily on the web

  Plain text is still a very popular data format. It is easy to generate, and easy to read. However text files when placed on the web don't look as nice as normal web pages.

  AscToHTM will allow you to quickly add the HTML markup required to turn a plain text page into a nice looking HTML page. Because it is an automated conversion it will save you time, and ensure you avoid typos in HTML tags that could stop the page displaying correctly in some web browsers.

- Migration of "legacy" text to HTML.

  Large amounts of unconverted text exist. As people plan to put this information on the Web, conversion to HTML will become necessary. This can be a tedious and time-consuming task.

  AscToHTM will do much of the work for you.

  AscToHTM is priced to be worth an hour or two of your time.
  This means that the "pay back" time is negligible (we only mention this in case you have bean-counters to convince :). If you don't think AscToHTM will save you hours, then by all means don't buy it.

- Facilitate mastering of HTML pages in ASCII

  The HTML created by AscToHTM may not be as pretty or as clever as that generated by a full blown HTML editor (read as "bloated"). But... It'll be easier to write, edit and spell-check, and it may have a hyperlinked contents list generated.

- Automated conversions

  AscToHTM can be used to automatically convert text documents that you receive. For this we usually suggest you run in command line mode.

- Conversion of reports to HTML

  Many people have legacy systems that generate printed reports that may be saved to file. AscToHTM can help extend the lifetime of such systems by turning their output to HTML.

  You may need some help in getting the best results from the program in such cases, since many reports consist of complex tables.

- Conversion of print spool files to HTML

  Printer spool files are not strictly speaking plain text, but often - especially in older software systems - these files are plain text with a few printer controls added.

  Some users have had great success converting such files using AscToHTM, and to support this we have added a limited ability to recognise and strip out Unix control characters, VT escape sequences and PCL printer codes.

  If you have a requirement in this area, contact the author at jaf@jafsoft.com to discuss whether the software can be made to meet your needs.

1.3 Other uses of AscToHTM

- Convert Word documents

  Please note, AscToHTM *DOES NOT* convert Word's .doc or .rtf file formats.

  AscToHTM was _never_ intended to handle Word documents. We fully expect HTML export and import filters to appear (they have in Word '97), and we would advise anyone whose master document is in Word to search out these filters and give them a try.

  That said...
  a lot of people seem unhappy with what's already available, and AscToHTM does a reasonable job if you save the file as text with line breaks, though obviously tables and figures will get lost (in the case of tables, because Word throws them away).

  The main problem is that Word produces lousy looking text. This is one area where AscToHTM does a little better than "garbage in, garbage out".

- Pre-process text for import to Word.

  (This is a bit cheeky, but does actually work.)

  Use AscToHTM to convert text to HTML, then import this into your word processing package. Since the text analysis engine in AscToHTM out-performs that in Word in many respects (URL, table and heading detection to name but three), you can often get better results than importing from text direct.

  That's because AscToHTM's analysis engine is *smarter*. That's not just our view (see http://www.jafsoft.com/asctohtm/reviews.html).

  NOTE: The same text analysis engine is used in the text-to-RTF program [AscToRTF], which is more suited to this purpose.

- Pre-process text for printing

  Use AscToHTM to convert text to HTML, then print the file from within Netscape or whatever. The result is a much nicer looking document with fonts'n'stuff.

- Add hyperlinks to fairly ordinary pages.

  AscToHTM has a "link dictionary" feature that can be used to add hyperlinks to any word or phrase (see the [Policy Manual]). This can greatly enhance an otherwise dull set of text pages.

2 Installation
------------------

The shareware version of AscToHTM is made available over the web from [Download location].

Once you register you can download the full version (no nags, no limits), and will be notified of upgrades. To date all upgrades have been free, but we cannot guarantee that this policy will continue for ever.

So far I've *never* requested payment for any [updates] in over 4 years of continuous development.
You can see this history in the [[GOTO Change History]].

Installation will vary according to the type of install kit you've downloaded, but in each case you first download the .ZIP file appropriate to your system and unzip.

2.1 VMS installation

Unzip the files. If you've taken an executable version, that's it. If you've taken an object library version, execute the build command file.

You might want to define a foreign command to get better use out of the program.

Have a cup of coffee and relax :)

2.2 Windows installation

The current version of the software makes updates to your Registry. See the Install notes that come with the software for a description of the registry settings used.

The Windows version requires certain .DLLs. In earlier versions these were supplied separately, with the result that the download size was doubled. The software is now "statically linked", meaning you don't need to worry about having the correct DLLs. The price is a slightly larger .exe file.

2.2.1 Install/uninstall version

If you've taken a version with install/uninstall help, unzip the file and then run the setup.exe program. This will move the files to a directory, and create all icons etc.

Once installed, your computer will offer an uninstall option. You can access this via Control Panel | Add/remove software.

The install/uninstall version is substantially larger than the manual install versions. This means some people experience problems downloading the file. For this reason smaller kits without the install/uninstall help are available (see 2.2.2).

2.2.2 Simple .ZIP file version

This version should be used by anyone who anticipates problems with the install/uninstall version, or who simply wishes to upgrade an existing version. It's several times smaller than the install/uninstall version.

Once you've unzipped the files, move them to your preferred directory. There will be no uninstall command in this case (unless you are overlaying an earlier installed version).

2.2.3 Console application

Originally AscToHTM was only available as a console application. Now it is available as a fully-windowed application. Consequently the console version is not so widely available.

The console version is made available to registered users who wish it, as it is more suited for automated conversions inside batch jobs etc. The conversion engine is identical in each case, it's just GUI-less.

Please note, this is *not* the same as the DOS version that we once produced, and it will require a Win32 environment. The DOS version is no longer available.

Note: In later versions of the program the Windows version supports both the console and Windowed interfaces in a single program. As a result only the Windows version is shipped by default. A separate console version (A2HCONS) will be supplied free of charge to registered users who have a specific requirement (e.g. they want to run AscToHTM from a command file).

3 How AscToHTM works
------------------------

3.1 The big assumption

AscToHTM makes one big assumption :-

*Each text file has been laid out in a consistent manner by its author in a way that makes it easy for a human reader to understand.*

Given this, AscToHTM tries to read the text file and mark it up in HTML accordingly. This is achieved by making three passes through the document: an analysis pass (see 3.2), a collating pass (see 3.3), and an output pass (see 3.4).

Note: Sadly this assumption is not always true.

3.2 The analysis pass

During the analysis pass AscToHTM gathers together all the statistics that it needs to analyse how the author has laid out the file. For example, the distribution of line indentations and line lengths is observed, together with the number and types of bullets, section headings and lots of other stuff.

Once this has been done, the program uses this data to determine the rules used by the author in structuring their document. For example are the section headings underlined, capitalised or numbered?
If numbered, what style of numbering is used, and by how many characters is each type of heading indented?

This information is then used to set the analysis policies (see the [Policy Manual]) which may then be overridden by the user (to correct errors), or by loading a policy file with different values.

3.3 The collating pass

Having performed the analysis, the program makes a second "collating" pass. This is effectively a dry run for the output pass.

During this pass the program determines how the file will be output into one or more output files and where certain key in-line tags occur. It also assembles any contents list.

This information is then used during the output pass to reduce the likelihood of errors, and to ensure all internal hyperlinks are valid and will point to the correct anchor point in the correct output file.

3.4 The output pass

During the output pass AscToHTM

- generates the HTML

and (optionally)

- creates a suite of inter-linked HTML pages
- creates a set of FRAMES to place the HTML pages into
- copies the HTML to the Windows clipboard
- generates a contents list
- generates a directory page

3.4.1 Generating HTML

The HTML generated depends on

- the original document, including any preprocessor tags placed in the source document.
- the calculated document policy, modified by any user policies supplied
- any [HTML fragments] that are defined

[[GOTO HTML markup produced]] describes the markup produced in more detail.

3.4.2 Generating a contents list

AscToHTM can detect the presence of a (numbered) contents list in the original document. Alternatively you can choose (see [[GOTO Contents generation policies]]) to have AscToHTM generate a contents list for you, in which case any original list is omitted from the output HTML document.

Regardless of whether the original or generated contents list is used, AscToHTM will turn the contents list into hyperlinks that will take you to the correct HTML file and location.
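
To give a feel for what turning numbered headings into a hyperlinked contents list involves, here is a minimal Python sketch. It is not AscToHTM's own code: the function name build_contents, the "section_" anchor naming and the list markup are all assumptions made for this illustration, and the heading test is far cruder than the analysis described in 3.2.

```python
import html
import re

# A numbered heading such as "3.4.2 Generating a contents list".
# Hypothetical, deliberately simple pattern; real detection needs more checks.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")


def build_contents(lines, target="doc.html"):
    """Return HTML contents-list lines that link to each numbered heading.

    'target' is the (assumed) output file that holds the anchors.
    """
    entries = []
    for line in lines:
        match = HEADING.match(line.strip())
        if match:
            number, title = match.groups()
            # Assumed anchor naming scheme: 1.1 -> section_1_1
            anchor = "section_" + number.replace(".", "_")
            entries.append(
                '<LI><A HREF="%s#%s">%s %s</A></LI>'
                % (target, anchor, number, html.escape(title))
            )
    return ["<UL>"] + entries + ["</UL>"]


if __name__ == "__main__":
    sample = ["1 Introduction", "Some text", "1.1 Design objectives"]
    print("\n".join(build_contents(sample)))
```

When a document is split into several files, each entry would need its own target file name, which is the sort of information the collating pass (see 3.3) gathers.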

There is a fuller discussion of [[GOTO contents lists]]. The policies that influence contents list production are listed in [[GOTO Contents generation policies]], whilst the pre-processor commands are described in 7.1.3.

3.4.3 Splitting the document into many HTML pages

By default AscToHTM creates a single .HTML file. However, through file organisation document policies (see [[GOTO File generation policies]]) it is possible to

a) Split the document into a number of smaller .HTML files (see the policy [[HYPERLINK POLICY,"Split Level"]]).

b) Insert standard JavaScript into the ... section of each page (see also the policy [[HYPERLINK POLICY,"HTML script file"]]).

c) Add a HTML "header" to the top of each generated file (see also the policy [[HYPERLINK POLICY,"HTML header file"]])

d) Add a navigation bar at the foot of each page with links to the Next/Previous .HTML page and the contents list (see also the policy [[HYPERLINK POLICY,"Add navigation bar"]]).

e) Add a HTML "footer" to the end of each generated page (see also the policy [[HYPERLINK POLICY,"HTML header file"]])

3.4.4 Generating a set of FRAMES

*New in version 4*

AscToHTM can place the HTML into a set of FRAMES. This is described fully in the chapter on [[GOTO Frames]].

3.4.5 Generating HTML for the Windows clipboard

*New in version 4*

The Windows version of the software can place the HTML generated into the clipboard, rather than outputting it into a file. This makes it easier to paste the HTML into another application (such as a HTML editor).

When this type of conversion is selected, the tags that enclose a complete page are omitted from the output.

The use of the clipboard is made even more powerful if a clipboard extender such as ClipMate is used. See http://www.jafsoft.com/clipmate.html

4 Running AscToHTM
----------------------

4.1 Windows version

4.1.1 Launching the program

4.1.1.1 Normal activation

Just run the program as you would any other Windows program, i.e. by clicking on its icon, or launching it from the Start menu.

4.1.1.2 Execution from a command line

From a Windows console command prompt you can type

     C:> AscToHTM

or

     C:> AscToHTM ...

In the first case, AscToHTM is launched as normal. In the second case AscToHTM will convert the specified files, briefly displaying a status window, and then exit. In this case, one of the named files can be a .pol policy file.

Note, a console version of AscToHTM (A2HCONS) exists that can only be run from the command line prompt. The Windows version also supports this command syntax, but users wanting to do large conversions, or conversions from within a batch file, are advised to use the console version.

Currently the console version is only being made available to registered users.

4.1.1.3 Drag'n'Drop execution

Create an Icon for AscToHTM, and simply drag'n'drop files onto it. The results are identical to those obtained by typing in the filenames as described in 4.1.1.2.

Alternatively, run the program as normal and then drag files onto the running program.

You can configure the program's behaviour in drag'n'drop operation by using the Settings | Drag'n'Drop menu.

One useful suggestion is to add AscToHTM to your "SendTo" menu (shown when you right-click on a file). See the Windows help file for more details.

See 4.4.6 for more suggestions on using desktop icons.

4.1.1.4 Output to the Windows clipboard

*New in version 4*

You can use the program to convert the file to HTML, and to copy the HTML into the Windows clipboard, ready for pasting into other applications (such as a HTML editor).

To do this, launch the Windows program as normal, and set the Conversion Type on the main screen to "Output HTML to clipboard".

The HTML copied to the clipboard omits the tags that enclose a complete page, making it more suited to pasting into an existing HTML page.

4.1.2 Using the Windows Interface

The Windows interface was re-vamped in version [[TEXT 3.0]] and further enhanced in version [[TEXT 3.2]].
The main changes are

- Introduction of a Windows Menu to replace the old button bar
- More options on this menu, making features easier to locate.
- The policy property sheets no longer have to be closed when doing a conversion.
- Use of DDE to view results in browsers already running.

4.1.2.1 Doing a straightforward conversion

To do a simple conversion, enter the name of the file to be converted or use the "Browse" button to locate it. Then press the "Convert file(s)" button.

A status screen is displayed whilst the conversion is in progress. For small files this may flash up so fast you can't actually read it. (If you want to see what it said, go to the View...Messages menu option.)

To view the HTML, press the "View results" button. This should launch your preferred HTML viewer to display the newly created HTML page.

If you want to automate that process, edit the program's viewer settings (see 4.1.2.4).

4.1.2.2 The File menu

The File menu has the following options:

- *Convert*

  Initiates the conversion. If you already have a file selected, this file is converted. If you don't, then a browse window will open allowing you to choose a file to convert.

  This option is identical to pressing the "Convert files" button.

- *Load policies from file*

  This option allows you to load a set of policies previously saved to a policy file. This allows a conversion to be repeatedly done the same way, or a set of conversions to be done the same way (see 6.5).

  Note, you can set a policy file to be used by default (see 4.1.3).

- *Save policies to file*

  This option allows you to save your current set of policies to a policy file for later re-use.

  It is recommended that only a partial set of policies (i.e. any loaded policies and manually set policies) be saved to allow the program maximum flexibility when converting future files.
  See section 6.5 and the discussion in 6.5.2.1.

- *Exit*

  Exits the program.

4.1.2.3 The Conversion options menu

AscToHTM offers the advanced user a large number of program options. These are called policies, and may be saved in policy files for later re-use. Policy files are described in detail in Chapter 6 of this document.

Policies broadly come in two sorts.

- *Analysis policies* represent a description of what the source file does and does not contain. These policies are usually set to default values and/or calculated by analysing the source document.

  They should only ever need to be manually adjusted if you wish to correct the analysis, or override the detection of certain typographical features.

- *Output policies* represent styling and other options that cannot be inferred from the source document. These include styling and markup options, and allow the user to "add value" to the HTML generated.

The Conversion Options menu has options to allow you to view and change many of the program's policies (but not all, see the [Policy Manual] for details).

The menu also has options to

- *Load, or change, the policy file used*

  These options allow you to browse for and open the policy file that you want to use. Essentially they are identical to the load option on the File menu (see 4.1.2.2).

- *Reload policies from file*

  This allows you to reload the policy file (e.g. because you've just edited it by hand).

- *Save policies to file*

  This allows you to save the current policies to file for later reuse.

- *Re-analyze the file*

  This option forces AscToHTM to re-analyse the current source file to (re-)calculate the analysis policies.

- *Reset to defaults*

  This option forces all policies back to their AscToHTM defaults. This will negate the effect of any manually set policies, or policies loaded from a policy file.

4.1.2.4 The Settings menu

The Settings menu allows you to tailor the way in which the program executes.
These settings will usually be saved in your Registry so that they are remembered for next time.

The Settings menu includes options for

- *Documentation*

  Specifies the location of the program's documentation on your hard disk (see 4.1.3.1).

- *Diagnostics*

  Specifies the level and type of error reporting wanted during the conversion (see 4.1.3.2).

- *Drag and drop execution*

  Specifies the program's behaviour when invoked by dragging files onto the program's icon (see 4.1.3.3).

- *Viewers for results files*

  Specifies the browser to be used to view results files, and how it should be invoked (see 4.1.3.4).

- *Use of policy files*

  Specifies any default policy file to be used (see 4.1.3.5).

4.1.2.5 The Language menu

The Language menu allows you to change the language used in the user interface. These translations are provided by a group of volunteers.

Currently translations exist for :-

- English
- German
- *New in Version 4* Italian
- Spanish
- Portuguese
- *New in Version 4.1* Swedish

The software supports the concept of language "skins" which allow the user interface to be exported to an external file which may be edited and then re-loaded. This allows users to offer their own translations, or to correct errors in the existing translations.

You can read more in [[GOTO Language support]].

4.1.2.6 The View menu

- *Messages from last conversion*

  This option allows you to re-view the Messages window displayed during file conversion. On small files this window can sometimes be shown too briefly to view the messages.

- *Results of last conversion*

  This option will launch the preferred browser for the last file converted. If a wildcard conversion was done, the last file in the group is shown. This option has the same effect as the "View results" button.

4.1.2.7 The Help menu

- *Contents*

  This option brings up the Windows help file. This offers a lot of context-sensitive help which can usually be accessed by pressing F1 or "Help" anywhere in the program.
  Over time the Windows Help file has adopted a secondary role compared to the HTML documentation.

- *Register/check updates*

  This option takes you to the web page offering registration details or (if you've already registered) listing recent updates.

  Currently the registration page is http://www.jafsoft.com/asctohtm/register_online.html?from=doco

- *HTML documentation*

  This option allows you to view the HTML documentation for the software. You can either view your local copy from your hard disk, or read the version on the web site (you'll need to connect to the Internet in that case).

  Each installation of AscToHTM comes complete with HTML documentation. Should you decide to move your copy of the documentation, you'll need to alter the settings (see 4.1.3).

- *Other products*

  A list of web pages describing other JafSoft Limited products that may be of interest to you.

- *About*

  This option launches the About screen. This gives program version information, shows your registration status, and provides a couple of buttons to access the home page and other pages on the Web.

4.1.3 Program settings

These settings allow you to customize the program's behaviour to a limited extent.

4.1.3.1 Documentation

Allows you to specify the location of the HTML documentation for the program on your machine. By default this is the same as the program directory, and you should only need to change this if you move it.

4.1.3.2 Diagnostics

Allows you to select the level of detail you want in the messages displayed during conversion. You can also elect to suppress messages by type.

4.1.3.3 Drag and drop execution

Allows you to specify how you want the program to behave when it is launched by dropping files onto the program, or onto its icon on the desktop.

4.1.3.4 Results viewers

Allows you to specify the HTML browser to be used to view the created HTML. You can elect to always invoke a results viewer after conversion, and to use DDE to achieve this.
DDE allows the program to tell an existing browser to display the results. Without DDE a new instance of the browser is launched each time. The behaviour of the browser when sent the results file varies from browser to browser.

If the DDE call fails for any reason, a new instance of the default browser is launched, so you should ensure this is the same browser as that identified for DDE.

This dialog also allows an RTF viewer to be selected. This may be used for viewing RTF files, although your version of the program may not require this yet.

NOTE: On some systems DDE doesn't always work properly. This would cause the program to hang when it attempted to display results. In such cases you would need to stop the program from the task manager. The program will now detect when this has happened and disable use of DDE next time it runs. You can re-enable it using the _Settings | Viewers_ menu option.

NOTE: Whereas DDE works fine with Netscape versions up to and including [[TEXT 4.7]], it doesn't work with Netscape [[TEXT 6.0]] since initial versions of that browser don't support DDE under Windows.

4.1.3.5 Use of policy files

Allows a default policy file to be specified. This is not normally desirable, but if you always use the same policy file, this will save you having to load it each time you run the program.

You can also elect to always reload the policy file during conversion. This should only be necessary if you're repeatedly changing the policy file in a text editor between conversions while the program is running.

Note, an alternative to using a default policy file is to define a desktop icon with the policy file specified on the command line (see 4.4.6).

4.1.4 Language support

There is an ongoing effort to make AscToHTM available in more languages. This effort is being undertaken by a number of volunteers.
It's unlikely that full translations of User Interface, error messages, help files and documentation will be available in all languages, but the hope is to make the program a little easier for those whose first language is not English.

Please note that the author only speaks English, and thus can only offer support in English.

Elements that may be converted include :-

- Menu Text
- Window Text
- ToolTips

Less likely to be translated are :-

- Messages
- Windows help file
- HTML documentation

4.1.4.1 Existing translations

Depending on how far the process has gone (and how many changes have been made recently) not all text may be in your selected language.

Currently translations exist for :-

- English (British)
- German
- Portuguese
- Spanish
- *New in version 4* Italian
- *New in version 4.1* Swedish

My thanks to all those involved. If you'd like to get involved in this effort, visit http://www.jafsoft.com/products/translations.html

4.1.4.2 Adding translations using "Language skins"

The program now supports the concept of language "skins". This allows the existing translation to be exported to a text file called a "language skin".

This file consists of one line per translated string, with a unique number at the start of the line identifying the text, and then the text itself.

You can edit this file - conventionally with a .lng extension - to contain your own versions of the strings, and then reload this back into the program via the Language menu. The changes will take immediate effect, and are remembered next time you run the program.

Using this technique we can now offer, via an American spellchecker and the translation services of http://babelfish.altavista.com/, three new languages :-

- American English
- "Babelfish" French

These translations are not expected to be ideal, but they offer a starting point. If anyone cares to mail me a corrected version I'll happily make it available... with full credit to the translator.
Mail updates to translations@jafsoft.com

4.2 VMS and console application versions

The VMS version and Windows console version behave identically in terms of their use of command arguments. A Linux version has also been beta tested.

The Windows console version performs identically to the Windows version (which supports the command line operation), but is more suited to use inside batch operations. For example the Windows version is likely to gain focus when it executes, which can be distracting.

The Windows console version is called A2HCONS (to distinguish it from the fully windowed version AscToHTM), but is only available to registered users of the software.

4.2.1 Command line arguments

The command line should be of the form

     AscToHTM <Filespec> [<Policy_file>] [<Qualifiers>]

Where

     Filespec      Any valid file specification for the system you're using. This can include wildcards. In the Windows version, this can also be a space-separated list of files.

     Policy_file   The name of any policy file (see [[GOTO Using Document Policy files]]) you want to use for the conversion. Policy files are recognised by having a .pol file extension. For this reason you cannot convert .pol files to HTML.

     Qualifiers    Extra commands that may be passed in via the command line. In most cases these are equivalent to policies, they're just made available on the command line for your convenience.

4.2.2 Command line qualifiers

Certain aspects of AscToHTM's behaviour can be changed by adding qualifiers to the command line.

Qualifiers must begin with the slash (/) character but may be of mixed case and may be shortened provided they remain unique. So /H will get you help, whereas you can't use /S since that could be /SILENT or /SIMPLE.

4.2.2.1 The /COMMA qualifier

*New in version 4*

Specifies that the source file is a comma-delimited table. In this case each line will become a row in a table, and each value separated by a comma will become a cell in the table.
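
As a rough illustration of the row-and-cell mapping just described, the following Python sketch turns comma-delimited text into an HTML table. It is not AscToHTM's own code: the function name csv_to_html_table and the exact markup emitted are assumptions made for this example, and the real program may well handle quoting, headers and styling differently.

```python
import csv
import html
import io


def csv_to_html_table(text):
    """Turn comma-delimited text into an HTML table (illustrative only)."""
    rows = csv.reader(io.StringIO(text))
    lines = ["<TABLE>"]
    for row in rows:
        if not row:
            continue  # skip blank source lines
        # Each comma-separated value becomes one cell; escape <, > and &.
        cells = "".join(
            "<TD>%s</TD>" % html.escape(value.strip()) for value in row
        )
        # Each source line becomes one table row.
        lines.append("<TR>%s</TR>" % cells)
    lines.append("</TABLE>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_html_table("Name,Qty\nWidgets,3\n"))
```

Using the csv module rather than a plain split(",") means quoted values that themselves contain commas stay in a single cell.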
4.2.2.2 The /CONSOLE qualifier

      Specifies that the HTML generated should be directed to the output
      stream, rather than to an output file.  This is a step towards making
      the program more suited for use inside a web server, e.g. to
      dynamically convert text to HTML on demand, although it is expected
      this process has some distance to go yet.


4.2.2.3 The /CONTENTS qualifier

      This has exactly the same effect as the
      [[HYPERLINK POLICY,"add contents list"]] policy line.


4.2.2.4 The /DEBUG and /LIST qualifiers

      These qualifiers cause AscToHTM to generate some diagnostic files,
      which have extensions

          .LIS1     an analysis before policy is set
          .LIS      an analysis after policy is set
          .STATS    a statistics file

      The list files can assist in understanding how AscToHTM has
      interpreted your file.

      The .stats file is neither pretty, nor easy to read, but can in
      extreme cases assist in diagnosing faults should you wish to report
      them.

      If the /LIST qualifier is used, only the list files are created.
      If the /DEBUG qualifier is used the .stats file is also created.


4.2.2.5 The /DOS qualifier

      This has exactly the same effect as the
      [[HYPERLINK POLICY,"Use DOS filenames"]] policy line.


4.2.2.6 The /INDEX qualifier

      This has exactly the same effect as the
      [[HYPERLINK POLICY,"Make Directory"]] policy line.


4.2.2.7 The /LOG[=filespec] qualifier

      This specifies that a .log file should be created.  This will contain
      a copy of all messages generated during the conversion, together with
      some that may have been suppressed.

      You can specify the log filespec.  This can include wildcards, with
      the input file being used to replace any parts of the filename not
      specified.

      If omitted, the default log file name is AscToHTM.log


4.2.2.8 The /OUT=filespec qualifier

      This specifies where the output file(s) should be placed.  It can
      include wildcards, with the input file being used to replace any parts
      of the filename not specified.

      Thus "/OUT=*.shtml" will result in a file with the same name, but
      a .shtml extension.
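As a rough picture of the wildcard substitution just described, here is a
minimal Python sketch.  The exact rules AscToHTM applies are not spelt out
in this manual, so the function name, the treatment of "*" and the default
extension handling below are all assumptions made for illustration.

```python
import os


def resolve_output_name(input_file, out_spec, default_ext=".html"):
    """Fill unspecified parts of out_spec from input_file (assumed rules)."""
    in_dir, in_base = os.path.split(input_file)
    in_name, _ = os.path.splitext(in_base)
    out_dir, out_base = os.path.split(out_spec)
    out_name, out_ext = os.path.splitext(out_base)
    if out_name in ("", "*"):
        out_name = in_name        # name not specified: reuse the input name
    if out_ext in ("", ".*"):
        out_ext = default_ext     # extension not specified: use the default
    return os.path.join(out_dir or in_dir, out_name + out_ext)


if __name__ == "__main__":
    print(resolve_output_name("report.txt", "*.shtml"))
```

The sketch ignores VMS directory syntax entirely; it only covers the
name-and-extension case.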
      In VMS "/OUT=[.sub]" will place the output in a sub-directory called
      "sub".

      If omitted, the output file is given the same name as the input file
      but with a .html extension.  That behaviour may change depending on
      the values of a number of other policies.


4.2.2.9 The /POLICY qualifier

      This has exactly the same effect as the
      [[HYPERLINK POLICY,"Output Policy file"]] policy line.

      When used it will *generate* a new policy file (possibly overwriting
      an existing file) which completely documents the policy used in the
      conversion.

      This file will be a *full* policy file, and should not normally be
      used as an input policy file, as it will overly-constrain the
      program's ability to adapt.  Instead you should edit this file to
      remove all bar the most important lines.

      NOTE: If you want to supply an *input* policy file to the conversion
            you do this by supplying the name of the policy file (which
            must have a ".pol" extension) after the names of the files to
            be converted.  For example

            _AscToHTM file.txt input_policy.pol /pol=output_policy.pol_

      See the discussion of "full" and "partial" policy files in
      Chapter 3 of the [Policy manual]


4.2.2.10 The /SILENT qualifier

      This specifies that no messages should be displayed on the console.
      When used with the /CONSOLE qualifier (see 4.2.2.2) this makes the
      program suitable for use in a web server, although you may need to
      use redirection under Windows.


4.2.2.11 The /SIMPLE qualifier

      This has exactly the same effect as the
      [[HYPERLINK POLICY,"Keep it simple"]] policy line.


4.2.2.12 The /TABBED qualifier

      *New in version 4*

      Specifies that the source file is a tab-delimited table.  In this
      case each line will become a row in a table, and each value separated
      by a tab will become a cell in the table.


4.2.2.13 The /TABLE qualifier

      *New in version 4*

      Specifies that the source file is a plain text table.  In this case
      the program will do its best to analyse the table structure, and
      reproduce it.
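The rule from 4.2.2 that qualifiers may be in mixed case and shortened
"provided they remain unique" can be sketched in Python as a unique-prefix
lookup.  This is not AscToHTM's own code: the list below is simply the
qualifiers named in this chapter, "HELP" is an assumed full name for /H,
and the function name is made up for the example.

```python
# Qualifiers named in this chapter; "HELP" is an assumed spelling for /H.
QUALIFIERS = [
    "COMMA", "CONSOLE", "CONTENTS", "DEBUG", "DOS", "HELP", "INDEX", "LIST",
    "LOG", "OUT", "POLICY", "SILENT", "SIMPLE", "TABBED", "TABLE",
]


def resolve_qualifier(arg):
    """Expand a possibly shortened qualifier such as '/pol=x.pol'."""
    if not arg.startswith("/"):
        raise ValueError("qualifiers must begin with '/': " + arg)
    # Drop any '=value' part and ignore case.
    name = arg[1:].split("=", 1)[0].upper()
    if name in QUALIFIERS:
        return name
    matches = [q for q in QUALIFIERS if q.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown qualifier: " + arg)
    raise ValueError("ambiguous qualifier %s: could be %s"
                     % (arg, ", ".join("/" + m for m in matches)))


if __name__ == "__main__":
    print(resolve_qualifier("/H"))
    print(resolve_qualifier("/pol=output_policy.pol"))
```

With this list /S is rejected as ambiguous (/SILENT or /SIMPLE), as is /TAB
(/TABBED or /TABLE), while /SIL and /TABL are accepted.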
4.3   Getting the most from AscToHTM

4.3.1 Making your first attempt

4.3.1.1 From the command line

      To run AscToHTM simply type

          AscToHTM Input_file.name

      at the command line.  This will create a file :-

      - input_file.html     An output file which will have the same file
                            name with a .html extension

      The program may display a number of status messages indicating source
      lines that it rejects because they "fail policy".  Source lines that
      fail policy are usually simply copied to the output file with no
      markup applied.

      These messages are largely informational, and can be ignored if the
      conversion worked okay.  If it didn't, these messages may give a clue
      as to where the analysis went wrong.


4.3.1.2 From Windows

      Enter the name of the file to be converted in the text field.  If you
      wish, use the browse button to search for the file to be converted.

      Once you've chosen the file, the output filename and input and output
      directories are inferred from the filename.  If you wish, you may
      edit the output filename and directory.

      Press the *Convert file(s)* button.

      The Messages window will briefly display.  If you wish to view these
      messages later, select the *View | Show Messages from last conversion*
      menu option.

      To view the last file converted, select the
      *View | Results of last conversion* menu option.  This should launch
      your default browser for the file types (.htm or .html) just created.

      If you get the message "cannot detect default browser", use the
      *Settings* menu to set up the path to the browser you wish to use and
      try again.


4.3.2 Refining your results

      If all goes well the resultant HTML will be satisfactory and all in
      one file.

      You can further refine the conversion by creating your own document
      policy.  In the Windows version, this is done by editing policies via
      the *Conversion Options* menu, which is fully described in the
      context-sensitive Windows Help file (press F1 at any point).
      However, in all versions the policies can be saved to a text policy
      file and it is the format of that file that is shown and discussed in
      this document.


4.3.2.1 Using a policy file

      If your initial results are a little strange, then review the policies
      calculated by the program, and create a "policy file" to tell the
      program how to do the conversion differently.

      You can do this as follows :-

      a) _By creating a "sample" policy file_

         You can create a sample .pol policy file that documents the
         policies used.  Do this either by using the command line

             AscToHTM Input_file.name /policy

         or by ticking "Generate a sample policy file" on the
         Conversion Options->File Generation tabbed dialogue.

         When this is done then the next time you convert the file, in
         addition to the .html file generated, you will now have an output
         policy file "input_file.POL" which describes the document policy
         calculated by AscToHTM (see 3.2) and used by it during the
         conversion.

         This file will contain one line each for all the program policies,
         *most of which should be correct*.

         Review the contents of this file, deleting all lines that look
         correct, and editing all lines that appear to be wrong.  Save the
         modified .POL file which should only contain lines for those
         policies you think are wrong or want to override.

         You may need to review the [Policy Manual] in order to understand
         the policies to do this fully.

      b) _By re-analysing the file_

         Under Windows a slightly easier option is to select
         Conversion Options -> Re-analyse the file.  This will analyse the
         file and change all the policy values currently on display to be
         the values calculated by the program.

         You can then review and change these values using the tabbed
         dialogues.  Once you're happy with your changes, select
         "Save policies to file" from the menu, saving only the changed
         policies.

         You can review this file in a normal text editor.

      Once you've produced your new input policy file, re-run the conversion
      using the new policy file.
      The program will now override aspects of the calculated document
      policy with the input policy you've supplied.

      Each document policy file consists of a number of lines of data.
      Each line has the form

          Keywords : Data value(s)

      For clarity a number of section headers are added like this :

          [Analysis]

      Such headings are ignored, as are any lines whose keywords are not
      recognised or not yet supported.  The order of policies within the
      file is usually unimportant, and the placement relative to the
      "headings" is ignored.  The headings are simply there to make the
      file easier to read in a text editor.

      A sample fragment from a calculated policy file looks like this

$_$_BEGIN_PRE
      [Hyperlinks]
      ------------
      Create hyperlinks     : Yes
      Create mailto links   : Yes
      Create NEWS links     : Yes

      [Added HTML]
      ------------
      Document Title        : (none)
$_$_END_PRE

      These are all default values used by AscToHTM.  If, for example, you
      want to add a title to your page and prevent email addresses being
      turned into hyperlinks, simply create a policy file containing the
      lines

$_$_BEGIN_PRE
      [Hyperlinks]
      ------------
      Create mailto links   : No

      [Added HTML]
      ------------
      Document Title        : Title text for the HTML page
$_$_END_PRE

      (Remember the insertion of section headings is optional, as is the
      ordering of policies within the file).

      By refining the input policy file, you can greatly influence the
      output that AscToHTM generates.


4.3.2.2 Using a link dictionary

      In addition to adding hyperlinks for all URLs, email addresses,
      section references and contents list entries, AscToHTM allows users
      to specify key phrases that should be turned into hyperlinks.
      This is achieved by adding lines to the input policy of the form

$_$_BEGIN_PRE
      [Link Dictionary]
      -----------------
      Link definition : "[AV]" = "AltaVista" + "Using_AltaVista.html"
$_$_END_PRE

      The syntax used here is

$_$_BEGIN_PRE
      Link definition : "match phrase" = "replacement phrase" + "link"
$_$_END_PRE

      In this case the string "[AV]" is replaced by a link to a web page
      "Using_AltaVista.html" with the text "AltaVista" being highlighted.

      The link dictionary used for this documentation can be seen in the
      file A2HLINKS.DAT.


4.3.2.3 Using multiple policy files

      If you wish to use AscToHTM to support several text files e.g. for a
      set of Intranet documentation, it may be useful to share some common
      document policies, e.g. colour, headers and footers and particularly
      the link dictionary.

      To support this AscToHTM allows two special types of line in the
      policy file.

      a) Include files

             include file : Link_Dictionary.dat

         If a line of this type is encountered, the contents of the file
         Link_dictionary.dat are included in the current policy file.  This
         is the best way of sharing data across many converted files.

      b) "daisy-chain" files

             switch to file : Other_policy_file.dat

         If a line of this type is encountered, the processing of the
         current file terminates, and continues in the named file.  This is
         a way of "daisy-chaining" policy files together which may be
         useful if you wish to group files together at different levels.


4.3.2.4 Creating DOS-compatible files

      Occasionally it may be necessary to create files consistent with the
      DOS nnnnnnnn.nnn naming convention.  This can happen when working on
      a DOS or Windows 3.n machine, or via a network that has this
      limitation e.g. Pathworks.

      AscToHTM supports this.  There are two ways to achieve this.  Either
      use the command

          AscToHTM input_file.name /DOS

      Alternatively, simply add the lines

$_$_BEGIN_PRE
      [File generation]
      -----------------
      Use DOS filenames  : Yes
      DOS filename root  : A2H
$_$_END_PRE

      to your policy file.
AscToHTM will then create a base file called (in this case) A2H.HTM. If you're splitting a large document into many files, subsequent files have the form _
.HTM

      When this name becomes too long, AscToHTM will create a name of the
      form

          AAANNNNN.HTM

      Where AAA comes from the file root, and NNNNN is a 5-digit code
      derived from the rest of the file name.


4.3.2.5 Use the pre-processor and in-line tags

      AscToHTM has a built-in pre-processor.  This allows you to add special
      codes to your source file that tell the program what you'd like it to
      do.

      Examples include delimiting tables, embedding raw HTML or adding a
      timestamp to the file being converted.

      See [[GOTO Using the preprocessor]] and [[GOTO In-line tags]] for
      more details.


4.3.3 Processing several files at once

4.3.3.1 Using wildcards

      You can convert multiple files at one time by specifying a wildcard
      describing the files to be converted.

      The wildcard has to be meaningful to the operating system you are
      using, and is expanded in alphabetical order.  Under Windows this
      ordering may be case-sensitive.

      At present we recommend that wildcards are only used on the contents
      of a single directory.  Indeed wildcards spanning directories are
      probably not supported (let's just say it's untested :-)

      Note, the same policies will apply to all files being converted.  If
      you wish different policies to apply, use a steering command file
      (see 4.3.3.2)

      Note: In the shareware version, wildcard conversions are limited to
            only 10 files


4.3.3.2 Using a steering command file

      In the console version you can convert several files at the same time
      in the order and manner of your choosing.  To do this use the command

          AscToHTM @List.file [rest of command line]

      Where the file "list.file" is a steering file which contains a list
      of AscToHTM commands, and the "@" in front indicates it is a list
      file, rather than a file to be converted.

      An example list file might look like

$_$_BEGIN_PRE
      ! this is the main document
      DOCO.TXT  IN_DOCO.POL /DOS
      #
      # These are the other chapters
      CHAPTER2.TXT
      CHAPTER3.TXT /SIMPLE
$_$_END_PRE

      Note the use of "!" or "#" at the start of a line signifies it's a
      comment line to be ignored.
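To make the list file layout concrete, here is a small Python sketch that
reads such a file.  It is an illustration only, not how AscToHTM parses the
file: the function name and the returned dictionary shape are assumptions,
and it relies on the conventions stated in this chapter (comment lines begin
with "!" or "#", qualifiers begin with "/", policy files end in .pol).

```python
def parse_list_file(text):
    """Split steering-file text into one job per non-comment line."""
    jobs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "!#":
            continue  # blank line or comment line: ignored
        files, policy, qualifiers = [], None, []
        for token in line.split():
            if token.startswith("/"):
                qualifiers.append(token)        # e.g. /DOS, /SIMPLE
            elif token.lower().endswith(".pol"):
                policy = token                  # policy files end in .pol
            else:
                files.append(token)             # anything else is a file
        jobs.append({"files": files, "policy": policy,
                     "qualifiers": qualifiers})
    return jobs


if __name__ == "__main__":
    sample = "! main document\nDOCO.TXT IN_DOCO.POL /DOS\n#\nCHAPTER2.TXT\n"
    for job in parse_list_file(sample):
        print(job)
```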
      Any qualifiers used on the original AscToHTM line are used as
      defaults for each conversion, but are overridden by any listed in the
      list file.  In this way it would be possible to specify a default
      policy file for a bunch of similar conversions.

      Note: In the shareware version, batch conversions are limited to
            only 10 files


4.3.4 Generating log files

      If you want a log of what has been done, you can create a log file.
      This can be done in a number of ways :-

      - *From the command line*
        On the command line you use to launch the program, add the /LOG=
        qualifier (see 4.2.2.7).

      - *From the policy file*
        Use the [[HYPERLINK POLICY,"Create a log file"]] policy.  You will
        need to manually edit this into your .pol file, as it can't be set
        via the user interface.

      - *From the Status Dialog*
        In the Windows version, the Status Dialog now contains a
        "Save to file" option to save the displayed messages.  This dialog
        is currently limited to 32,676 characters.


4.4   Other tips and tricks

4.4.1 General

      - Read the [FAQ]

      - Browse the [Policy Manual] to gain a feel for what options the
        program supports.  Similarly the [Tag Manual]

      - If you can, try to use as much white space as possible, e.g. before
        paragraphs and new sections and at the end of the document.  This
        makes it easier for AscToHTM to place things in context, reduces
        ambiguity and increases the chances of correct HTML being generated.

      - Ensure you have consistency in your use of indentation, bullets
        etc.  On the output pass AscToHTM rejects lines that "fail policy",
        so any inconsistencies are liable to lead to errors in the HTML.

      - Try to avoid lines that may confuse AscToHTM.  For example numbers
        at the start of a line of text may be interpreted as a section
        heading.  If the number is out of sequence, or at an incorrect
        indentation this will "fail policy".  However, it may cause
        confusion and is best avoided wherever possible.
        Where a number has to be at the start of a line, try using an
        indentation level that doesn't match that used by your headings.

      - Review the messages displayed during conversion.  Often these will
        highlight problems perceived by the software.


4.4.2 Link dictionary

      - Try to avoid using match words that are substrings of other match
        words.  If you can't avoid this, then list the longer entries first.

      - Try to ensure your match words will only match the places you want
        them to match.  This means avoiding overly short match words.

      - If you can, bracket your match words or phrases [like this].  This
        makes for fewer mistakes, and makes it clearer in the original that
        you expect a link adding at that point.


4.4.3 Contents List detection

      Contents list detection is tricky at the best of times.  It becomes
      even trickier if

      a) There isn't one :-)
      b) The list only contains chapters and no sub-sections

      If the program wrongly determines that there is/isn't a contents
      list, use the following policy line

          Expect Contents List : No

      to tell AscToHTM how it has gone wrong.  The usual error is to decide
      there is a contents list where none exists.


4.4.4 Using "Send to" in Windows 95/NT

      AscToHTM can be invoked (without policy file) from Windows in a
      number of additional ways as follows

      - Drag and drop your text file onto the AscToHTM icon.  If you want a
        policy file, drag that at the same time.

      - Create a shortcut in your "SendTo" folder in Windows.  This is
        under C:/WINDOWS or C:/WINNT/PROFILES/ depending which system
        you're using.

        Once this has been done, right-clicking on a file in Explorer brings
        up the "Send to" menu, and you can now "send" your text file to
        AscToHTM directly.

        If you want a policy file, add it as an argument to the shortcut's
        command line.

        Better still, create a .BAT file to invoke AscToHTM with a default
        policy file - e.g. with your favourite colour scheme, and some
        standard link definitions
        (see [[GOTO Link dictionary policies]]) - and add this to the
        "SendTo" folder.
        In this way you can easily convert text files in any number of
        pre-defined manners.


4.4.5 Tables

      AscToHTM does a reasonable job of detecting and analysing Tables, but
      the following tips can be useful.

      - If the extent of the table is wrongly calculated, mark it up using
        TABLE pre-processor commands described in 7.1.4, or insert an extra
        blank line before and/or after the table.  Tables will rarely
        bridge a two-line gap.

        If the table extent is wrong, try adjusting the
        [[HYPERLINK POLICY,"Table extending factor"]] (shown as
        "Extend preformatted regions" on the
        Conversion Options -> Analysis policies -> Tables menu option).

      - If AscToHTM places a code fragment or diagram in TABLE markup, mark
        the source using [[HYPERLINK TAGGING,BEGIN/END_CODE]] or
        [[HYPERLINK TAGGING,BEGIN/END_DIAGRAM]] pre-processor commands.

      - Avoid mixing tabs and spaces.  This makes spotting column alignment
        positions more difficult.  If you do mix them, check that the
        [[HYPERLINK POLICY,"TAB size"]] policy has a suitable value.

      - If you want the heading in bold, try drawing a line all the way
        across, separating header from data.

      - If too many columns are created, adjust the
        [[HYPERLINK POLICY,"Minimum TABLE column separation"]] to be greater
        than 1, and ensure there are at least two spaces between columns
        (see also 7.1.4).  Alternatively "break formation" by inserting a
        space at the start of every second or third line.

      - If too few columns are created try adjusting the
        [[HYPERLINK POLICY,"Column merging factor"]] policy.

      - If AscToHTM puts lines in tables when they shouldn't be, increase the
        [[HYPERLINK POLICY,"Minimum automatic <PRE> size"]] value.  This is
        a common problem in email digests with people's .sigs in them.

      - If you wish to fine-tune a particular table, use the pre-processor
        commands described in 7.1.4.

      - If the table layout is approximately correct, switch off the table
        border (set the value to 0).  Often this will look acceptable, even
        though the analysis has gone wrong.
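
The "too many columns" tip above is easier to see with a toy example.  The
Python sketch below is not AscToHTM's table analysis; it simply splits one
line wherever a run of spaces is at least a chosen width, which is the
assumed meaning of a minimum column separation for this illustration.  The
function and parameter names are made up.

```python
import re


def split_columns(line, min_separation=2):
    """Split a table line on runs of at least min_separation spaces."""
    pattern = r" {%d,}" % min_separation
    return [cell for cell in re.split(pattern, line.strip()) if cell]


if __name__ == "__main__":
    row = "First name  Age   Home town"
    print(split_columns(row, 1))   # every single space splits a column
    print(split_columns(row, 2))   # only gaps of two or more spaces split
```

With a separation of 1 the words inside "First name" and "Home town" are
split apart, giving five columns; with a separation of 2 the row yields the
three columns intended.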


4.4.6 Using desktop icons and policy files

      *New in version 4*

      AscToHTM can support arguments being passed on the command line.  One
      useful way to use the program is to add an icon to the desktop, allowing
      you to "drop" files onto the icon to get them converted.

      If you use policy files, edit the icon properties so that the
      command line reads something like

      	"c:\program files\jafsoft\asctohtm.exe" "c:\mydir\mypolicy.pol"

      This will ensure the policy file mypolicy.pol is used in the conversion.
      You may also need to set the working directory to something suitable.

      If you have multiple policy files (e.g. different colour schemes), simply
      create additional icons with different policy files.


5     HTML markup produced
--------------------------

5.1   Text layout

5.1.1 Indentation

      AscToHTM performs statistical analysis on the document to determine
      at what character positions indentations occur.  This information is
      used on the output pass to determine the indentation level for each
      source line.

      AscToHTM attempts to indent the HTML code to match the output
      indentation level, to make it easier to read.

      The indentations themselves are marked up using
      
...
tags. Future versions of AscToHTM may offer you the option of using
tags.


5.1.2 Hanging paragraph indents

      Some documents, especially ones dumped from Word, have hanging
      paragraph indents.  That is, each paragraph starts at an offset to
      the rest of the paragraph.

      AscToHTM struggles heroically with this, and tries not to treat this
      as text at two indent levels, but it does occasionally get confused.

      If writing a text file from scratch with AscToHTM in mind, then it is
      best to avoid this practice.


5.1.3 Bullets

      AscToHTM detects and supports several types of bullets.


5.1.3.1 Bullet chars

      Bullet chars are lines of the type

$_$_BEGIN_PRE
      - this is a bullet line

      - this is a bullet paragraph because it carries over onto
        more lines
$_$_END_PRE

      That is, a single character followed by the bullet line.  AscToHTM
      can determine via statistical analysis which character, if any, is
      being used in this way.  Special attention is paid to the '-' and
      'o' characters.

      Bullets of this type are given a <UL> ... <LI> ... </UL> markup.


5.1.3.2 Numbered bullets

      AscToHTM can spot numbered bullets.  These can sometimes be confused
      with section headings in some documents.  This is one area where the
      use of a document policy really pays dividends in sorting the sheep
      from the goats.

      Numbered bullets are given a <OL> ... <LI> ... </OL> markup.

      Note: Not all browsers support this type of markup.  In such cases,
            it's possible that the numbering of bullets will get reset to
            1 every so often.  However, this isn't a problem with either
            Netscape or Internet Explorer.


5.1.3.3 Alphabetic bullets

      AscToHTM detects upper and lower case alphabetic bullets.  These are
      marked up like numbered bullets, with TYPE=a.


5.1.3.4 Roman Numeral bullets

      AscToHTM detects upper and lower case roman numeral bullets.  These
      are marked up like numbered bullets, with TYPE=i.


5.1.4 Centred text

      AscToHTM can attempt to spot sections of centred text.  However,
      because this can easily go wrong this option is normally switched off.

      Centering is only switched on for single isolated lines, or any group
      of at least two lines.

      ... markup is used.

      See the policy [[HYPERLINK POLICY,"Allow automatic centring"]]


[[LINKPOINT "Definitions"]]
5.1.5 Definitions

5.1.5.1 Definition lines

      A definition line is a single line that appears to be defining
      something.  Usually this is a line with either a colon (:) or an
      equals sign (=) in it.  For example

$_$_BEGIN_PRE
      IMHO = In my humble opinion
$_$_END_PRE

      or

$_$_BEGIN_PRE
      Address : Somewhere over the rainbow.
$_$_END_PRE

      AscToHTM attempts to determine what definition characters are used
      and whether they are strong (only ever used in a definition) or weak
      (only sometimes used in a definition).

      AscToHTM marks up definition lines by placing a <BR> on the end of
      the line to preserve the original line structure.  Where this
      decision is made incorrectly unexpected breaks can appear in text.

      AscToHTM offers the option of marking up the definition term in bold.
      This is not the default behaviour however.


5.1.5.2 Definition paragraphs

      AscToHTM also recognises the use of definition paragraphs such as :-

$_$_BEGIN_PRE
      Note: This is a "definition" paragraph, i.e. the whole paragraph
            defines the term shown on the first line.  Unfortunately
            AscToHTM currently only copes with single paragraphs (i.e. not
            with continuation paragraphs), and only with single word
            definitions.
$_$_END_PRE

      This gets marked up in a <DL> <DT> ... <DD> ... </DL> sequence

      Note: This is a "definition" paragraph, i.e. the whole paragraph
            defines the term shown on the first line.  Unfortunately
            AscToHTM currently only copes with single paragraphs (i.e. not
            with continuation paragraphs), and only with single word
            definitions.


5.2   Text formatting

5.2.1 Quoted lines

      AscToHTM recognises that, especially in Internet files, it is
      increasingly common to quote from other text sources such as e-mail.
      The convention used in such cases is to insert a quote character such
      as > at the start of each line.

      Consequently, AscToHTM adds a <BR> tag at the end of such lines to
      preserve the line structure of the original, and wraps the line in a
      pair of tags to differentiate the quoted text.


5.2.2 Emphasis

      AscToHTM can look for text emphasised by placing asterisks (*) either
      side of it, or underscores (_).  AscToHTM will convert the enclosed
      text to *bold* and _italic_ respectively, using the corresponding
      bold and italic tags.

      AscToHTM will also look for combinations of asterisks and underscores,
      which are placed in _*bold italic*_.  The asterisks and underscores
      should be properly nested.

      The emphasised word or phrase should span no more than a few lines,
      and in particular should *not* span a blank line.  If the phrase is
      longer, or if AscToHTM fails to match opening and closing emphasis
      marks, the characters are left unconverted.

      To a limited extent the software can detect and handle nested
      emphasis of different types.  For example _a phrase that is mostly in
      italics may contain *a few bold words* within it_.  Nested emphasis
      of the same type is not supported.

      Tests are made to

      - ignore double asterisks and underscores
      - ignore phrases with underscores in the middle (these may become
        underlined)
      - *(new in version 4)* allow hyphenated words to be
        *part*-emphasised.


5.2.3 Fonts

      The program allows you to select the default font for your files.
      Fonts are implemented using Cascading Style Sheets (CSS) by default,
      although this can be changed to the <FONT> tag should you wish.

      The <FONT> tag is allowed in "HTML [[TEXT 3.2]]", but is "deprecated"
      in favour of CSS in "HTML [[TEXT 4.0]] Transitional".  It is
      completely disallowed under "HTML [[TEXT 4.0]] Strict".

      Some older features of the software still use <FONT> tags.  These
      will be changed to also use CSS in later releases as support for
      "HTML [[TEXT 4.0]] Strict" is added.

      See [[GOTO Font Policies]] for more details.


5.2.4 Special characters

      The program will detect special characters and symbols such as á and
      © and will replace these by the correct HTML entity codes &aacute;
      and &copy;.
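The replacement of special characters by entity codes can be sketched in a
few lines of Python.  This is an illustration rather than AscToHTM's own
logic: it uses the entity table from Python's standard html.entities
module, and the function name and the numeric fallback for characters with
no named entity are assumptions.

```python
from html.entities import codepoint2name


def encode_special_characters(text):
    """Replace non-ASCII characters with HTML entity codes."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            name = codepoint2name.get(code)
            # Use the named entity when one exists, else a numeric one.
            out.append("&%s;" % name if name else "&#%d;" % code)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(encode_special_characters("caf\u00e9 \u00a9 2001"))
```

Plain ASCII text passes through unchanged, so the function can be applied to
whole lines.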

      By using the correct HTML codes, the HTML produced is guaranteed to
      display correctly on all computers.  Some symbols use different
      character codes on different machines (e.g. Mac and PC).


5.3   Added hyperlinks

5.3.1 Contents List lines

      Contents list lines are marked up in bold, and turned into a
      hyperlink pointing at the section referenced.  The text is sized
      according to heading type in the range +/- 1 font size from
      normal (3).


5.3.2 Cross-references

      AscToHTM can convert cross-references to other sections into
      hyperlinks to those sections.

      Unfortunately this is currently only possible for second, third,
      fourth... level numeric headings (n.n, n.n.n, n.n.n.n etc).  This is
      because the error rate becomes too high on single numbers/letters or
      roman numerals.

      This _may_ be refined in future releases, although it's hard to see
      how that would work.


5.3.3 URLs

      AscToHTM can convert any URLs in the document to hyperlinks.  This
      includes http and ftp URLs and any web addresses beginning with www.

      The domain name part of the URL will be checked against the known
      domain name structures and country codes to check it falls within an
      allowed group.  So www.somewhere.thing won't be allowed as ".thing"
      isn't a proper top level domain.

      URLs that use IP addresses or some more obscure methods of specifying
      domain names will also be recognised, but the link will be changed
      wherever possible to either a domain name or an IP address.  This
      will de-obfuscate any obscure references so beloved by spammers.


5.3.4 Usenet Newsgroups

      AscToHTM can convert any newsgroup names it spots into hyperlinks to
      those newsgroups.  Because this is prone to error, AscToHTM currently
      only converts newsgroups in known USENET hierarchies such as
      rec.gardens by default.

      This can be overcome either by

      a) placing "news:" in front of the newsgroup name (e.g.
         news:this.is.a.newsgroup.honest)

      b) relaxing this condition via a document policy (see the policy
         [[HYPERLINK POLICY,"Only use known groups"]])

      c) specifying the newsgroup hierarchy as recognised via a policy
         [[HYPERLINK POLICY,"Recognised USENET groups"]].


5.3.5 E-mail addresses

      AscToHTM can convert any email addresses into hypertext mailto:
      links.  As with URLs (see 5.3.3), the domain name is checked to see
      it falls into a recognised group.


5.3.6 User-specified keywords

      AscToHTM can convert user-specified keywords into hyperlinks.  The
      words or phrase to be converted must lie on a single line in the
      source document.

      Care should be taken to ensure keywords are unambiguous.  Normally I
      mark my keywords in [] brackets if authoring for conversion by
      AscToHTM.

      See the discussions on "link dictionaries" in 4.3.2.2 and 4.4.2.


5.3.7 Other sections and URLs

      AscToHTM offers a number of pre-processor commands and in-line tags
      that allow you to add hyperlink commands to your source text.

      For example the [[HYPERLINK TAGGING,GOTO]] tag allows hyperlinks to
      be created to named sections in the same document.  So that

          [[OT]]GOTO Other sections and URLs[[CT]]

      becomes [[GOTO Other sections and URLs]]

      In the same way the [[HYPERLINK TAGGING,HYPERLINK]] tag allows
      hyperlinks to be inserted


5.4   Section headings

      AscToHTM recognises various types of headings.  Where headings are
      found, and deemed to be consistent with the prevailing document
      policy (correct indentation, right type, in numerical sequence etc),
      AscToHTM will use the standard <Hn> ... </Hn> markup.

      In addition to this, AscToHTM will insert a named Anchor tag
      (<A NAME="..."> ... </A>) to allow hyperlink jumps to this point.
      These anchors are used for example in the contents list and
      cross-reference hyperlinks that AscToHTM generates.


5.4.1 Numbered headings

      This is the preferred heading type and the type that AscToHTM has
      most success with.
      Sections of type N.N.N can be checked for consistency, and references
      to them can be spotted and converted into hyperlinks.

      At present more exotic numbering schemes using roman numerals and
      letters of the alphabet are not fully supported.  This is planned to
      be implemented soon, possibly via user policy files.


5.4.2 Capitalised headings

      AscToHTM can treat wholly capitalised lines as headings.  It also
      allows for such headings to be spread over more than one line.


5.4.3 Underlined headings

      AscToHTM can recognise underlined text, and optionally promote the
      preceding line to be a section header.

      The "underlining" line should have no gaps in it.


5.4.4 Embedded headings

      *New in version 4*

      The program can look for headings "embedded" in the first paragraph.
      Such headings are expected to be a complete sentence or phrase in
      UPPER CASE at the start of a paragraph.

      Where detected the heading will be marked up in bold, rather than
      <Hn> markup, although it will still be added to, and accessible from
      any hyperlinked contents list you generate for the document.

      At present such headings are not auto-detected... you need to switch
      on the [[HYPERLINK POLICY,"Expect embedded headings"]] policy.


5.4.5 Key phrase headings

      *New in version 4*

      The program can now look for lines that start with particular words
      or phrases (such as "Chapter", "Part", "Title") of your choice and
      treat these lines as headings.  Previously this only worked in a
      limited way if the heading line was also *numbered* ("Chapter 1")
      etc.

      To use this feature, set the policy
      [[HYPERLINK POLICY,"Heading key phrases"]]


5.4.6 Numbered paragraphs

      Some types of documents use what look like section numbers to number
      paragraphs (e.g. legal documents, or sets of rules).

      AscToHTM can recognise this, and mark up such lines by placing the
      number in bold, and not using <Hn> ... </Hn> markup on the whole
      line.


5.4.7 Mail and USENET headers

      Some documents, especially those that were originally email or
      USENET posts, come with header lines, usually in the form of a
      number of lines with a keyword followed by a colon and then some
      value.

      AscToHTM can recognise these (to a limited extent).  Where these
      are detected the program will parse the header lines to extract
      the Subject, Author and Date of the article concerned.  A 2-line
      heading containing this information will then be generated to
      replace all the unsightly header lines.


5.5 Pre-formatted text

5.5.1 Lines and form feeds

      Lines are interpreted in context.  If they appear to be underlining
      text, or part of some pre-formatted structure such as a table, then
      they are treated as such.

      Otherwise they become horizontal rules (<HR>).  An attempt is made
      to interpret half-lines etc as such, although the effect is only
      approximate.

      Form feeds or page breaks also become <HR> markups.

      *New in version 4*

      You can use the [HTML fragment] HORIZONTAL_RULE to customize how
      lines are displayed, e.g. you can substitute your own image file.


5.5.2 User defined pre-formatted text

      AscToHTM normally ignores any HTML markup in the original text.
      The sole exceptions are any preprocessor tags which a user may
      insert into their text document
      (see [[GOTO Using the preprocessor]]).

      For example :-

$_$_BEGIN_PRE
      The use of BEGIN_PRE and END_PRE preprocessor commands (see 7.1)
      in the text documents tells AscToHTM that this portion of the
      document has been formatted by the user and should be left
      unchanged.
$_$_END_PRE


5.5.3 Automatically detected pre-formatted text

      AscToHTM attempts to spot sections of preformatted text.  This can
      vary from a single line (e.g. a line with a page number on the
      right-hand margin) to a complete table of data.

      Where such text is detected AscToHTM analyses the section to
      determine what type of pre-formatted text it is.  Options include

      - Tables
      - Code samples
      - Ascii Art and diagrams
      - some other formatted text

      A number of policies allow you to control

      - whether or not the program looks for such text
      - how sensitive it is to "pre-formatted" text
      - how inclined the program is to "extend" the region to adjacent
        lines
      - whether or not table generation should be attempted
      - various aspects of any table analysis that is carried out.

      See [[GOTO "Pre-formatted text policies"]] for full details.


5.5.3.1 Tables

      Tables are marked out by their use of white space, and a regular
      pattern of gaps or vertical bars being spotted on each line.
      AscToHTM will attempt to spot the table, its columns, its headings,
      its cell alignment and entries that span multiple columns or rows.

      Should AscToHTM wrongly detect the extent of a table, you can mark
      up a section of text by using the
      [[HYPERLINK TAGGING,BEGIN/END_TABLE]] pre-processor commands.
      Alternatively you can try adding blank lines before and after, as
      the analysis uses white space to delimit tables.

      You can alter the characteristics of all tables via the table
      policies (see [[GOTO Table generation policies]]).

      You can alter the characteristics of all or individual tables via
      the table pre-processor commands (see 7.1.4).

      Or you can suppress the whole thing altogether via the
      [[HYPERLINK POLICY,"Attempt TABLE generation"]] policy.


5.5.3.2 Code samples

      AscToHTM attempts to recognise code fragments in technical
      documents.  The code is assumed to be "C++" or "Java"-like, and key
      indicators are, for example, the presence of ";" characters on the
      end of lines.

      Should AscToHTM wrongly detect the extent of a code fragment, you
      can mark up a section of text by using the
      [[HYPERLINK TAGGING,BEGIN/END_CODE]] pre-processor commands.

      You can choose what type of markup is used for the code fragment
      (see the policy [[HYPERLINK POLICY,"Use .. markup"]]).

      Or you can suppress the whole thing altogether via the policy
      [[HYPERLINK POLICY,"Expect code samples"]].


5.5.3.3 Ascii art and diagrams

      AscToHTM attempts to recognise Ascii art and diagrams in documents.
      Key indicators include large numbers of non-alphanumeric characters
      and the use of white space.

      However, some diagrams use the same mix of line and alphabetic
      characters as tables, so the two sometimes get confused.

      Should AscToHTM wrongly detect the extent or type of a diagram, you
      can mark up a section of text by using the
      [[HYPERLINK TAGGING,BEGIN/END_DIAGRAM]] pre-processor commands.


5.5.3.4 Text blocks

      *New in version 4*

      If AscToHTM detects a block of text at a large indent, it will now
      place that text in a table in such a way as to preserve as
      faithfully as possible the original indent.

      Of course, sometimes this text is, in fact, centred on the page.
      In that case you could consider switching on the policy
      [[HYPERLINK POLICY,"Allow automatic centring"]]

      See 5.1.4


5.5.3.5 Other formatted text

      If AscToHTM detects formatted text, but decides that it is neither
      table, code nor art (and it knows what it likes), then the text may
      be put out "as normal", but with <BR> added to each line.

      In such regions other markup (such as bullets) may not be processed
      as it would be elsewhere.


5.6 Added value markup

5.6.1 Document Title

      AscToHTM can calculate - or be told - the title of a document.
      This is placed in <TITLE>...</TITLE> markup in the <HEAD> section
      of each HTML page produced.

      The title is calculated in the order shown below.  If one step
      returns a value, the subsequent ones are ignored.

      1) If a $_$_TITLE pre-processor command (see 7.1.2) is placed in
         the source text, that value is used

      2) If the [[HYPERLINK POLICY,"Use first header as title"]] policy
         is set then the first heading (if any) encountered is used as
         the title.

         Note: Depending on your document structure, this is prone to
         give bland titles like "Introduction", "Overview" and "Summary"

      3) If the [[HYPERLINK POLICY,"Use first line as title"]] policy is
         set then the first line in the file is used as the title.

      4) If the [[HYPERLINK POLICY,"Document title"]] policy is set then
         this value is used.

         Note: If this is the value you want, ensure the other policies
         outlined above are disabled.

      5) Finally, if none of the above result in a title the text
         "Converted from <filename>" is used.


5.6.2 Contents lists

      AscToHTM can detect the presence of a contents list in the original
      document, or it can generate a contents list for you from the
      headings that it observes.  There are a number of policies that
      give you control over how and where a contents list is generated
      (see [[GOTO Contents generation policies]]).

      There are four different situations in which contents lists may,
      or may not, be generated.  These are :-

      - Default conversions (see 5.6.2.1).
      - Conversion to a single file, using policies (see 5.6.2.2).
      - Conversion to multiple files, using policies (see 5.6.2.3).
      - Conversion to a set of frames (see 5.6.2.4)


5.6.2.1 Contents lists in default conversions

      By default AscToHTM will not generate a contents list for a file
      unless it already has one.
      If it should detect a contents list in the document, then that
      list is changed into hyperlinks to the named sections.  This only
      works currently for files with numbered headings.

      Where an existing list is detected, headings shown in the contents
      list are converted into links, and the link text is that in the
      original contents list, and not the text in the actual heading
      (often they are different).

      Note: AscToHTM currently only detects numbered contents lists, and
            is occasionally prone to error when they are present.  If
            you experience problems, either delete the contents list and
            get AscToHTM to generate one for you, or mark up the existing
            list using the contents pre-processor commands (see 7.1.1)


5.6.2.2 Contents lists in conversions to a single HTML file

      As described in 5.6.2.1, AscToHTM will not generate a contents
      list by default unless the document already has one.

      *Requesting a contents list* [[BR]]
      You can request that a contents list is always generated, by using
      the [[HYPERLINK POLICY,"Add contents list"]] policy.  In this case
      a contents list is either

      a) made from the existing contents list, or

      b) generated from the observed headings.  In this case the
         contents list will only be as good as the detection of headings
         in the rest of the document permits.

      *Forcing a generated contents list* [[BR]]
      You can force a generated list to be used by disabling the
      [[HYPERLINK POLICY,"Use any existing contents list"]] policy.  If
      an existing contents list is present, it is deleted from the
      output.

      Normally it's best to either use the existing contents list, or to
      delete it from the source text and request a generated list.

      *Contents lists placement* [[BR]]
      By default the contents list is placed at the top of the output
      file.  In earlier versions of AscToHTM the contents list was always
      placed in a separate file.

      You can cause contents lists to be placed wherever you want by
      using the [[HYPERLINK TAGGING,CONTENTS_LIST]] preprocessor command.
      If you do this, then the contents list is placed *only* where you
      place CONTENTS_LIST markers.

      *Generating a contents list in a separate file* [[BR]]
      If you select the
      [[HYPERLINK POLICY,"Generate external contents list"]] policy the
      contents list is placed in a separate file, and a hyperlink to that
      file called "Contents List" is placed at the top of the HTML page
      generated from the document.

      You can choose the name of the external file using the
      [[HYPERLINK POLICY,"External contents list filename"]] policy.  If
      omitted, the file is called "Contents_<filename>", where <filename>
      is the name of the document being converted.


5.6.2.3 Contents lists in conversions to multiple HTML files

      AscToHTM can be made to split the output into many files.  At
      present this is only possible at detected section headings.  Each
      generated page usually has a navigation bar, which includes a
      hyperlink back to the following section in any contents list.

      The behaviour is identical to that in 5.6.2.2 except that

      a) the output is now split into several files.

      b) the options to generate an external contents list in a separate
         file are no longer available.

      c) if the contents list is being generated, it is now placed at
         the foot of the first document, rather than at the top (unless
         the CONTENTS_LIST preprocessor command is used)

         This is usually *before* the first heading (which now starts
         the second document), and *after* any document preamble.

      Note: Where the original contents list is used when splitting
            files it is possible that not every file is directly
            accessible from the contents list, and that the back links
            to the contents list may not function as expected.

            In such cases you can go from the contents list to a major
            section, and then use the navigation bars to page through to
            the minor section.


5.6.2.4 Contents lists in conversions to frames

      *New in version 4*

      Contents list generation for the main document will proceed as
      described in the previous sections.
      When making a set of frames, you can elect to have a contents
      frame generated (the default behaviour), and this will have a
      generated list placed in a frame on the left.

      This can mean you have a contents list in the contents frame on
      the left, and also at the top of the first page in the main
      document.  For this reason the main frame often starts by
      displaying the second page.

      The number of levels shown in the contents frame list can be
      controlled by policy.  Alternatively you can replace the whole
      contents of the contents frame by defining a CONTENTS_FRAME
      [HTML fragment].


5.6.3 Directory page

      When converting several files at once, AscToHTM can be made to
      generate a "Directory Page".  This is an HTML index of all the
      files converted and their contents.

      The policies available for controlling generation of a directory
      page are explained in "[[GOTO Directory Page policies]]".

      The directory page will consist of an entry for each file
      converted, in the order that files are converted (usually
      alphabetic).  Each entry will (optionally) contain :-

      - A link to the file being converted.  The link will either be
        the converted file's HTML title, or failing that, the filename
        itself.

      - Links to each of the sections of the converted file as detected
        by AscToHTM.


5.6.4 Headers, footers and JavaScript

      AscToHTM can be made to add standard headers, footers and
      JavaScript to each page generated.  It does this by allowing you
      to specify include files to be copied into the generated HTML.
      These include files can contain any valid HTML commands.

      The program supports three types of such files :-

      i)   Header files.  These contain any HTML you want placing
           immediately after the output's <BODY> tag.  A good example
           might be a standard header, with a logo and links back to
           the home page.

      ii)  Footer files.  These contain any HTML you want placing
           immediately before the closing </BODY> tag.

      iii) Script files.  These contain any HTML you want placing inside
           the <HEAD>...</HEAD> portion of the generated file.
           Such tags are not usually visible.  You should place in here
           any JavaScript you want, although it is difficult to make
           this apply to the converted text.

      You can specify include files for the converted files, as well as
      for any directory page (see 5.6.3) that you create.  If you don't
      specify values for the directory page, then it will use the same
      files as the generated files.

      HTML headers and footers can also be defined as [HTML fragments]
      (see 5.6.5).


5.6.5 HTML fragments

      *New in version 4*

      The "HTML fragments" feature allows you to define a block of HTML
      that you can elect to have used in place of the default HTML that
      the program would generate.

      Some reserved names are recognised by the software and used to
      override the default HTML that is generated.  For example the
      fragment HORIZONTAL_RULE can be used to define a tag that displays
      an image and which will be used in a number of situations where a
      simple <HR> tag would have been used otherwise.

      They are described fully in the Tag Manual (see [HTML fragments]).


$_$_TABLE_WIDTH 60%

6 Using Document Policy files
---------------------------------

      *This chapter has been largely superseded by the [Policy Manual]*

      Document policy files are ordinary text files that list the
      "policies" that AscToHTM should implement when converting your
      document.  The file can have added comment lines (starting with a
      "!" or "#" character) and headings for clarity.

      A summary of the recognised policy lines is given in the
      [Policy manual].

      In most cases recognised policy lines are identical to those
      listed in the generated policy file (see 4.1).  This is usually a
      good place to start when making your own policy.

      Only those lines that are recognised policies are acted upon.

      To use a policy file, simply list it on the command line after the
      name of the file being converted (see 4.2.2.3).

      Document policies have two main uses :

      a) To correct any failure of analysis that AscToHTM makes.
         Hopefully this won't be needed too much as the core analysis
         engine improves.

         Examples include page width, whether or not underlined section
         headings are expected etc.

      b) To tell the program how to produce a better HTML end product in
         ways that couldn't possibly be inferred from the original text.

         Examples include adding colour and titles to the page, as well
         as requesting that a large document is split into several
         pages, and a contents list created.

      The document sections in this chapter that described the policies
      in detail have been moved to a standalone document called the
      "[Policy Manual]".  That document describes the scope, effect,
      location and default values for all policies recognised by the
      program.


6.1 An example conversion

      This documentation has itself been converted using AscToHTM.  The
      files used were

      - a2hdoco.txt.  This is the text version of the documentation.
        The text version is kept as the master copy and updated as
        required.  It's then converted to HTML.

      - ia2hdoco.pol.  This is the policy file used to create the HTML
        version of this document.  Only those policies that differ from
        the defaults have been added.  This policy file "includes" the
        link dictionary A2HLINKS.DAT.

      - a2hlinks.dat.  This is the link dictionary used for this
        document and is used to add hyperlinks to the main text file.

      - html_fragments.inc.  This file contains the definitions of the
        [HTML fragments] used in this conversion.

      These files are included in the distribution kit as an example set
      of documentation.  You can, of course, use AscToHTM to convert
      this doco into whatever format, colour etc that you wish.


6.2 Analysis policies

      These policies are used to control and correct the analysis of
      files during conversion.  Full descriptions of these policies can
      be found in the [Policy Manual].

6.2.1 Overview ("look for") policies

      The following analysis policies give you an overview of what the
      program is looking for, and allow you to enable/disable what is
      being looked for.

      [[HYPERLINK POLICY,"Look for indentation"]] [[BR]]
      [[HYPERLINK POLICY,"Look for hanging paragraphs"]] [[BR]]
      [[HYPERLINK POLICY,"Look for white space"]] [[BR]]
      [[HYPERLINK POLICY,"Look for short lines"]] [[BR]]
      [[HYPERLINK POLICY,"Look for horizontal rulers"]] [[BR]]
      [[HYPERLINK POLICY,"Minimum ruler length"]] [[BR]]
      [[HYPERLINK POLICY,"Look for bullets"]] [[BR]]
      [[HYPERLINK POLICY,"Search for definitions"]] [[BR]]
      [[HYPERLINK POLICY,"Look for quoted text"]] [[BR]]
      [[HYPERLINK POLICY,"Look for MAIL and USENET headers"]] [[BR]]
      [[HYPERLINK POLICY,"Look for preformatted text"]] [[BR]]
      [[HYPERLINK POLICY,"Attempt TABLE generation"]] [[BR]]
      [[HYPERLINK POLICY,"Look for diagrams"]]


6.2.2 General Layout policies

      The following analysis policies help control general layout
      parameters :-

      [[HYPERLINK POLICY,"Page width"]] [[BR]]
      [[HYPERLINK POLICY,"TAB size"]] [[BR]]
      [[HYPERLINK POLICY,"Short line length"]] [[BR]]
      [[HYPERLINK POLICY,"Min chapter size"]] [[BR]]
      [[HYPERLINK POLICY,"Expect blank lines between paras"]] [[BR]]
      [[HYPERLINK POLICY,"Hanging paragraph position(s)"]] [[BR]]
      [[HYPERLINK POLICY,"Search for Definitions"]] [[BR]]
      [[HYPERLINK POLICY,"New Paragraph Offset"]] [[BR]]
      [[HYPERLINK POLICY,"Definition Char"]] [[BR]]
      [[HYPERLINK POLICY,"Indent position(s)"]] [[BR]]


6.2.3 Bullet policies

      AscToHTM has the following bullet point policies that will
      normally be correctly calculated on the analysis pass :-

      [[HYPERLINK POLICY,"Look for bullets"]]

      [[HYPERLINK POLICY,"Expect alphabetic bullets"]] [[BR]]
      [[HYPERLINK POLICY,"Expect numbered bullets"]] [[BR]]
      [[HYPERLINK POLICY,"Expect roman numeral bullets"]]

      [[HYPERLINK POLICY,"Recognise '-' as a bullet"]] [[BR]]
      [[HYPERLINK POLICY,"Recognise 'o' as a bullet"]] [[BR]]
      [[HYPERLINK POLICY,"Bullet char"]]

      AscToHTM tries hard not to get confused by the "1", "a" and "I"
      that happen to end up at the start of lines by chance.  These
      could get mistaken for bullet points.


6.2.4 Contents analysis policies

      There is only one contents analysis policy :-

      [[HYPERLINK POLICY,"Expect contents list"]]

      This is described together with all the output contents list
      policies in [[GOTO Contents generation policies]]

      For more information on contents list generation see 5.6.2.


6.2.5 File Structure policies

      AscToHTM has the following file structure policies that will
      normally need to be set manually :-

      [[HYPERLINK POLICY,"Keep it simple"]]

      [[HYPERLINK POLICY,"Expect code samples"]] [[BR]]
      [[HYPERLINK POLICY,"Input file contains DOS characters"]] [[BR]]
      [[HYPERLINK POLICY,"Input file contains MIME encoding"]] [[BR]]
      [[HYPERLINK POLICY,"Input file contains PCL codes"]] [[BR]]
      [[HYPERLINK POLICY,"Input file contains Japanese characters"]] [[BR]]
      [[HYPERLINK POLICY,"Input file has change bars"]] [[BR]]
      [[HYPERLINK POLICY,"Input file has page markers"]] [[BR]]
      [[HYPERLINK POLICY,"Page marker size (in lines)"]] [[BR]]
      [[HYPERLINK POLICY,"Text Justification"]] [[BR]]
      [[HYPERLINK POLICY,"Input file is double spaced"]]


6.2.6 Heading policies

      AscToHTM has the following section heading policies that will
      normally be correctly calculated on the analysis pass :-

      [[HYPERLINK POLICY,"Expect Numbered Headings"]] [[BR]]
      [[HYPERLINK POLICY,"Expect Underlined Headings"]] [[BR]]
      [[HYPERLINK POLICY,"Expect Capitalised Headings"]] [[BR]]
      [[HYPERLINK POLICY,"Expect Embedded Headings"]] [[BR]]
      [[HYPERLINK POLICY,"Heading key phrases"]]

      [[HYPERLINK POLICY,"Check indentation for consistency"]]

      [[HYPERLINK POLICY,"Expect Second Word Headings"]] [[BR]]
      [[HYPERLINK POLICY,"First Section Number"]] [[BR]]
      [[HYPERLINK POLICY,"Smallest possible section number"]] [[BR]]
      [[HYPERLINK POLICY,"Largest possible section number"]] [[BR]]
      [[HYPERLINK POLICY,"Preserve underlining of headings"]]

      Section headers are far and away the most complex things the
      analysis pass has to detect, and the most likely area for errors
      to occur.

      AscToHTM will also document to a policy file the headings it
      finds.
      This is still to be finalised, but currently has the format

$_$_BEGIN_PRE
      We have 4 recognised headings
      Heading level 0 = "" N at indent 0
      Heading level 1 = "" N.N at indent 0
      Contents level 0 = "" N at indent 0
      Contents level 1 = "" N.N at indent 2
$_$_END_PRE

      AscToHTM will read in such lines from a policy text file, but does
      not yet fully support editing these via the Windows interface.

      The syntax is explained below, but this will probably change in
      future releases.  You can edit these lines in your policy file,
      and through the policy options in Windows.

      The lines are currently structured as follows

$_$_BEGIN_TABLE
      Line component   Value
      --------------------------------------------
      xxxx             Either "Heading" or "Contents" according to
                       the part of the policy being described

      Level n          Level number, starting at 0 for chapters,
                       1 for level 1 headings etc.

      "Some_word"      Any text that may be expected to occur before
                       the heading number.  E.g. "Chapter" or
                       "Section" or "[".  The case is unimportant.

      N.Nx             The style of the heading number.  This will
                       ultimately (in later versions) be read as a
                       series of number/separator pairs.  The
                       proposed format is
                       "N" = number
                       "i" / "I" = lower/upper case roman numeral
                       with an 'x' at the end signalling that
                       trailing letters may be expected
                       (e.g. 5.6a, 5.6b)

      at indent n      The indentation that this heading is expected
                       at.  This is important in helping to
                       eliminate false candidates.
$_$_END_TABLE


6.2.7 Pre-formatted text policies

      AscToHTM has the following pre-formatted text policies that will
      normally be correctly calculated on the analysis pass :-

      [[HYPERLINK POLICY,"Minimum automatic <PRE> size"]]
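
      Returning to the heading description lines shown in 6.2.6, the
      sketch below shows one way such a line could be broken into its
      components.  It is an illustration only: the regular expression
      and the field names are assumptions based on the example lines
      above, not the parser AscToHTM itself uses.

```python
import re

# Matches lines such as:   Heading level 1 = "" N.N at indent 0
HEADING_LINE_RE = re.compile(
    r'^\s*(Heading|Contents)\s+level\s+(\d+)\s*=\s*'
    r'"([^"]*)"\s+(\S+)\s+at\s+indent\s+(\d+)\s*$',
    re.IGNORECASE,
)


def parse_heading_line(line):
    """Split a heading description line into its parts, or return None."""
    m = HEADING_LINE_RE.match(line)
    if m is None:
        return None
    kind, level, prefix, style, indent = m.groups()
    return {
        "kind": kind,           # "Heading" or "Contents"
        "level": int(level),    # 0 for chapters, 1 for level 1 headings
        "prefix": prefix,       # text expected before the number
        "style": style,         # e.g. "N", "N.N" or "N.Nx"
        "indent": int(indent),  # expected indentation of the heading
    }
```

      A line that does not fit the pattern simply yields None, which
      mirrors the rule that only recognised policy lines are acted upon.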


6.2.8 Table analysis policies

      *New in version 4*

      AscToHTM uses the following policies to control the detection and
      analysis of tables :-

      [[HYPERLINK POLICY,"Attempt TABLE generation"]]

      [[HYPERLINK POLICY,"Table extending factor"]]

      [[HYPERLINK POLICY,"Expect sparse tables"]] [[BR]]
      [[HYPERLINK POLICY,"Ignore table header during analysis"]] [[BR]]
      [[HYPERLINK POLICY,"Column merging factor"]] [[BR]]
      [[HYPERLINK POLICY,"Minimum TABLE column separation"]]

      [[HYPERLINK POLICY,"Default TABLE layout"]] [[BR]]
      [[HYPERLINK POLICY,"Tables could be blank line separated"]]


6.3 Output policies

      These policies are used to control the output and generation of
      files during conversion.  Full descriptions of these policies can
      be found in the [Policy Manual].

6.3.1 Added HTML policies

      AscToHTM has the following HTML policies that will only ever take effect
      if supplied in a user policy file :-

      [[HYPERLINK POLICY,"Use first heading as title"]] [[BR]]
      [[HYPERLINK POLICY,"Use first line as title"]] [[BR]]
      [[HYPERLINK POLICY,"Document title"]]

      [[HYPERLINK POLICY,"Document description"]] [[BR]]
      [[HYPERLINK POLICY,"Document keywords"]] [[BR]]
      [[HYPERLINK POLICY,"Background Image"]] [[BR]]

      [[HYPERLINK POLICY,"HTML header file"]] [[BR]]
      [[HYPERLINK POLICY,"HTML footer file"]] [[BR]]
      [[HYPERLINK POLICY,"HTML Script file"]]

      [[HYPERLINK POLICY,"Omit  and  from output"]] [[BR]]
      [[HYPERLINK POLICY,"Document Base URL"]] [[BR]]
      [[HYPERLINK POLICY,"Comment generation code"]] [[BR]]
      [[HYPERLINK POLICY,"HTML fragments file"]] 

      These "policies" allow you to start "adding value" to the HTML
      generated.  That is, they allow you to specify things that cannot
      be inferred from the original text.

      You can also add HTML to your files by using the HTML preprocessor
      command (see 7.1.1)
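
      Since these policies only take effect when supplied in a policy
      file, a rough sketch of reading such a file may help.  It assumes
      that each policy line has the form "name : value" (as the "Link
      definition" line in 6.3.10 suggests) and that comment lines start
      with "!" or "#".  The function name is hypothetical, and the
      sample values in the usage note are made up for illustration.

```python
def read_policies(text):
    """Return (name, value) pairs from the text of a policy file.

    Blank lines, comment lines and anything without a colon (such as
    headings and their underlining) are skipped.
    """
    policies = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "!#":
            continue  # blank line or comment
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "name : value" line, e.g. a heading
        policies.append((name.strip(), value.strip()))
    return policies
```

      For example, a file containing a "!" comment, an underlined
      heading and the two lines "Document title : My manual" and
      "Document keywords : text, html" would yield just those two
      (name, value) pairs.  A list of pairs is used rather than a
      dictionary because some lines, such as link definitions, may
      legitimately occur more than once.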


6.3.2 Cascading Style sheet policies (CSS)

      *New in version 4*

      AscToHTM has the following HTML policies that influence the use
      of CSS in the HTML generated :-

      [[HYPERLINK POLICY,"Document Style Sheet"]]

      Not visible in the user interface is :-

      [[HYPERLINK POLICY,"Create embedded style sheet"]] [[BR]]


6.3.3 Contents generation policies

      AscToHTM has the following HTML policies that influence the detection
      and generation of contents lists :-

      [[HYPERLINK POLICY,"Expect Contents List"]] [[BR]]

      [[HYPERLINK POLICY,"Add contents list"]] [[BR]]
      [[HYPERLINK POLICY,"Maximum level to show in contents"]]

      [[HYPERLINK POLICY,"Use any existing contents list"]]

      [[HYPERLINK POLICY,"Generate external contents file"]] [[BR]]
      [[HYPERLINK POLICY,"External contents list filename"]]

      [[HYPERLINK POLICY,"Hyperlinks on numbers"]] [[BR]]

      See also the discussion in 5.6.2


6.3.4 Document Colour policies

      *New in version 4*

      AscToHTM has a large number of HTML policies that can control the
      colouring of the files.  These policies are spread across a number
      of areas of functionality.

      _General_

      [[HYPERLINK POLICY,"Suppress all colour markup"]]

      [[HYPERLINK POLICY,"Active Link Colour"]] [[BR]]
      [[HYPERLINK POLICY,"Background Colour"]] [[BR]]
      [[HYPERLINK POLICY,"Text Colour"]] [[BR]]
      [[HYPERLINK POLICY,"Unvisited Link Colour"]] [[BR]]
      [[HYPERLINK POLICY,"Visited Link Colour"]]

      _Frames_

      [[HYPERLINK POLICY,"Header frame background colour"]] [[BR]]
      [[HYPERLINK POLICY,"Header frame text colour"]] [[BR]]
      [[HYPERLINK POLICY,"Contents frame background colour"]] [[BR]]
      [[HYPERLINK POLICY,"Contents frame text colour"]] [[BR]]
      [[HYPERLINK POLICY,"Footer frame background colour"]] [[BR]]
      [[HYPERLINK POLICY,"Footer frame text colour"]]

      _Tables_

      [[HYPERLINK POLICY,"Colour data rows"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE border colour"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE colour"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE even row colour"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE odd row colour"]]


6.3.5 Directory Page policies

      AscToHTM has the following policies that can be used to influence
      whether or not AscToHTM will attempt to generate a Directory page
      for the files being converted.  This is really only appropriate when
      converting more than one file at once (see 4.3.3).

      The Directory Page will consist of entries for each file being
      converted (in order of conversion), and can have hyperlinks to the
      files, and to recognised headings in the files.  This makes it suitable
      for use as a master index to a set of files converted in a single
      directory.

      [[HYPERLINK POLICY,"Make Directory"]] [[BR]]
      [[HYPERLINK POLICY,"Indent headings in Directory"]] [[BR]]
      [[HYPERLINK POLICY,"Show file titles in Directory"]] [[BR]]
      [[HYPERLINK POLICY,"Directory filename"]]

      [[HYPERLINK POLICY,"Directory title"]] [[BR]]
      [[HYPERLINK POLICY,"Directory description"]] [[BR]]
      [[HYPERLINK POLICY,"Directory keywords"]] [[BR]]
      [[HYPERLINK POLICY,"Directory return hyperlink text"]]

      [[HYPERLINK POLICY,"Directory header file"]] [[BR]]
      [[HYPERLINK POLICY,"Directory footer file"]] [[BR]]
      [[HYPERLINK POLICY,"Directory script file"]]
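
      As an illustration of the directory page structure described
      above, the sketch below builds a simple nested HTML list from a
      set of converted files.  The function, its parameters and the
      anchor names are all hypothetical; the HTML that AscToHTM actually
      generates may differ.

```python
from html import escape


def directory_page(entries, show_titles=True):
    """Build an HTML index for a set of converted files.

    entries: list of (html_filename, title_or_None, headings) in order
    of conversion, where headings is a list of (anchor, heading_text).
    """
    out = ["<ul>"]
    for filename, title, headings in entries:
        # Use the file's HTML title if there is one, else the filename.
        text = title if (show_titles and title) else filename
        out.append(f'<li><a href="{escape(filename)}">{escape(text)}</a>')
        if headings:
            out.append("<ul>")
            for anchor, heading in headings:
                href = f"{escape(filename)}#{escape(anchor)}"
                out.append(f'<li><a href="{href}">{escape(heading)}</a></li>')
            out.append("</ul>")
        out.append("</li>")
    out.append("</ul>")
    return "\n".join(out)
```

      Each file contributes one entry, in the order given, with optional
      links to its recognised headings nested beneath it.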


6.3.6 File generation policies

      AscToHTM has the following HTML policies that affect the file generation
      process :- 

      [[HYPERLINK POLICY,"Input directory"]] [[BR]]
      [[HYPERLINK POLICY,"Output directory"]] [[BR]]
      [[HYPERLINK POLICY,"Use .HTM extension"]] [[BR]]
      [[HYPERLINK POLICY,"Output file extension"]]

      [[HYPERLINK POLICY,"Preserve file structure using <PRE>"]] [[BR]]
      [[HYPERLINK POLICY,"Preserve line structure"]] [[BR]]
      [[HYPERLINK POLICY,"Treat each line as a paragraph"]]

      [[HYPERLINK POLICY,"Generate diagnostics files"]] [[BR]]
      [[HYPERLINK POLICY,"Output policy file"]] [[BR]]
      [[HYPERLINK POLICY,"Output policy filename"]] [[BR]]

      [[HYPERLINK POLICY,"DOS filename root"]] [[BR]]
      [[HYPERLINK POLICY,"Use DOS filenames"]] [[BR]]

      [[HYPERLINK POLICY,"Split level"]] [[BR]]
      [[HYPERLINK POLICY,"Min HTML File size"]] [[BR]]
      [[HYPERLINK POLICY,"Add navigation bar"]] [[BR]]
      [[HYPERLINK POLICY,"Minimise HTML file size"]] [[BR]]

      [[HYPERLINK POLICY,"Break up long HTML lines"]] [[BR]]

      These policies specify how your document is divided into one or more HTML
      files, and how those files are to be named and linked together with
      hyperlinks.
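
      The idea of dividing a document at section headings and linking
      the resulting files can be sketched as follows.  This is a
      simplified illustration, not AscToHTM's implementation: it assumes
      that level 1 is a chapter, that a split level of 0 means "do not
      split", and it uses made-up output file names.

```python
def split_into_pages(sections, split_level):
    """Group headings into pages.

    sections: list of (level, heading) in document order.
    A new page starts at every heading whose level <= split_level.
    """
    pages = [[]]
    for level, heading in sections:
        if level <= split_level and pages[-1]:
            pages.append([])  # start a new output file here
        pages[-1].append(heading)
    return pages


def navigation(pages, root="doc"):
    """Return (filename, previous, next) for each page, for a nav bar."""
    names = [f"{root}_{i}.html" for i in range(len(pages))]
    nav = []
    for i, name in enumerate(names):
        prev_name = names[i - 1] if i > 0 else None
        next_name = names[i + 1] if i + 1 < len(names) else None
        nav.append((name, prev_name, next_name))
    return nav
```

      Raising the split level in this sketch produces more, smaller
      files, each of which knows its neighbours so that a navigation bar
      can be added.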


6.3.7 Font policies

      AscToHTM supports the implementation of fonts via either Cascading 
      style sheets (CSS) or via the <FONT> tag.

      Related policies are :-

      [[HYPERLINK POLICY,"Use CSS to implement fonts"]] [[BR]]
      [[HYPERLINK POLICY,"Default font"]] 


6.3.8 Frames policies

      *New in version 4*

      From version [[TEXT 4]] onwards AscToHTM will support the output of HTML
      as a set of HTML FRAMES.  A large number of policies support this
      process.

      _General_

      [[HYPERLINK POLICY,"Place document in frames"]]

      [[HYPERLINK POLICY,"Output frame name"]]
      [[HYPERLINK POLICY,"Add Frame border"]]

      [[HYPERLINK POLICY,"Open frame links in new window"]] [[BR]]
      [[HYPERLINK POLICY,"New frame link window name"]]

      [[HYPERLINK POLICY,"Add NOFRAMES links"]] [[BR]]
      [[HYPERLINK POLICY,"NOFRAMES link URL"]]

      _Header and Footer frame policies_

      [[HYPERLINK POLICY,"Use main header in header frame"]] [[BR]]
      [[HYPERLINK POLICY,"Header Frame depth"]]

      [[HYPERLINK POLICY,"Use main footer in footer frame"]] [[BR]]
      [[HYPERLINK POLICY,"Footer Frame depth"]]

      _Contents frame_

      [[HYPERLINK POLICY,"Add contents frame if possible"]] [[BR]]
      [[HYPERLINK POLICY,"Contents Frame width"]] [[BR]]
      [[HYPERLINK POLICY,"Number of levels in contents frame"]]

      _Main Frame_

      [[HYPERLINK POLICY,"Split level"]] [[BR]]
      [[HYPERLINK POLICY,"Min HTML File size"]] [[BR]]
      [[HYPERLINK POLICY,"First frame page number"]]

      _Frame colours_

      [[HYPERLINK POLICY,"Header frame background colour"]] [[BR]]
      [[HYPERLINK POLICY,"Header frame text colour"]] [[BR]]
      [[HYPERLINK POLICY,"Contents frame background colour"]] [[BR]]
      [[HYPERLINK POLICY,"Contents frame text colour"]] [[BR]]
      [[HYPERLINK POLICY,"Footer frame background colour"]] [[BR]]
      [[HYPERLINK POLICY,"Footer frame text colour"]]


6.3.9 Hyperlink policies

      AscToHTM has the following hyperlink policies set as defaults :-

      [[HYPERLINK POLICY,"Create hyperlinks"]] [[BR]]
      [[HYPERLINK POLICY,"Create mailto links"]] [[BR]]
      [[HYPERLINK POLICY,"Allow email beginning with numbers"]] [[BR]]
      [[HYPERLINK POLICY,"Check domain name syntax"]]

      [[HYPERLINK POLICY,"Create gopher links"]] [[BR]]
      [[HYPERLINK POLICY,"Create FTP links"]] [[BR]]
      [[HYPERLINK POLICY,"Only allow explicit FTP links"]]

      [[HYPERLINK POLICY,"Create NEWS links"]] [[BR]]
      [[HYPERLINK POLICY,"Only use known groups"]] [[BR]]
      [[HYPERLINK POLICY,"Recognised USENET groups"]]

      [[HYPERLINK POLICY,"Add <BR> to lines with URLs"]]
      [[HYPERLINK POLICY,"Cross-refs at level"]]
      [[HYPERLINK POLICY,"Open link in new browser window"]] [[BR]]
      [[HYPERLINK POLICY,"new browser window name"]]

      Hyperlinks can also be added by using a link dictionary (see 4.3.2.2
      and 4.4.2).

6.3.10 Link Dictionary policies

      Link definitions appear in a policy file as follows :-

$_$_BEGIN_PRE
      [Link Dictionary]
      -----------------
      Link definition : "a2hdoco.txt" = "Source text" + "/~jaf/A2HDOCO
$_$_END_PRE

      That is, the text to be matched, the text to be used in its place as
      the highlighted text, and the URL this link is to point to (in this
      case a relative URL).

      See the discussions in 4.3.2.2 and 4.4.2.

6.3.11 Preprocessor policies

      AscToHTM has the following policies that can be used to influence the
      preprocessor (see [[GOTO Using the preprocessor]]), and hence the
      HTML output :-

      [[HYPERLINK POLICY,"Use Preprocessor"]] [[BR]]
      [[HYPERLINK POLICY,"Include document section(s)"]]
      [[HYPERLINK POLICY,"Allow definitions inside PRE"]]

6.3.12 HTML styling policies

      AscToHTM has the following "styling" policies that can be used to
      influence the HTML output :-

      [[HYPERLINK POLICY,"Allow automatic centring"]] [[BR]]
      [[HYPERLINK POLICY,"Automatic centring tolerance"]] [[BR]]
      [[HYPERLINK POLICY,"Ignore multiple blank lines"]] [[BR]]
      [[HYPERLINK POLICY,"Highlight definition text"]] [[BR]]
      [[HYPERLINK POLICY,"Use <DL> markup for defn. paras"]] [[BR]]
      [[HYPERLINK POLICY,"Largest allowed tag"]] [[BR]]
      [[HYPERLINK POLICY,"Smallest allowed tag"]] [[BR]]
      [[HYPERLINK POLICY,"Headings colour"]] [[BR]]
      [[HYPERLINK POLICY,"Preserve underlining of headings"]]
      [[HYPERLINK POLICY,"Search for emphasis"]]
      [[HYPERLINK POLICY,"Use <EM> and <STRONG> markup"]]
      [[HYPERLINK POLICY,"Preserve New Paragraph Offset"]]

      Also, not available in the user interface is :-

      [[HYPERLINK POLICY,"First line indentation (in blocks)"]]

6.3.13 Table Generation policies

      AscToHTM has the following policies that can be used to influence
      whether or not AscToHTM will attempt to detect and generate HTML
      tables, and the attributes of any tables generated.

      Tables may be tailored individually by adding pre-processor commands
      to your source text (see 7.1.4).

      [[HYPERLINK POLICY,"Attempt TABLE generation"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE cell spacing"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE cell padding"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE border size"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE width"]]
      [[HYPERLINK POLICY,"Default TABLE colour"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE border colour"]] [[BR]]
      [[HYPERLINK POLICY,"Colour data rows"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE even row colour"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE odd row colour"]]
      [[HYPERLINK POLICY,"Default TABLE alignment"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE cell alignment"]]
      [[HYPERLINK POLICY,"Convert TABLE X-refs to links"]]

      The following policies can only be changed through a policy file,
      but are probably best avoided in favour of their equivalent
      preprocessor tags.

      [[HYPERLINK POLICY,"Default TABLE caption"]]
      [[HYPERLINK POLICY,"Default TABLE header rows"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE header cols"]]
      [[HYPERLINK POLICY,"Column boundaries have zero width"]]
      [[HYPERLINK POLICY,"Use .. markup"]]

6.3.14 Miscellaneous policies

      AscToHTM supports the following policies which currently can only be
      added by editing the .policy file.

      _Contents List_

      [[HYPERLINK POLICY,"Add mail headers to contents list"]]

      _CSS_

      [[HYPERLINK POLICY,"Create embedded style sheet"]]

      _File generation_

      [[HYPERLINK POLICY,"Break up long HTML lines"]] [[BR]]
      [[HYPERLINK POLICY,"HTML version to be targeted"]] [[BR]]
      [[HYPERLINK POLICY,"Lines to ignore at end of file"]] [[BR]]
      [[HYPERLINK POLICY,"Lines to ignore at start of file"]]

      _Fonts_

      [[HYPERLINK POLICY,"Suppress all font markup"]]

      _Headings_

      [[HYPERLINK POLICY,"Expect Second Word Headings"]]
      [[HYPERLINK POLICY,"First Section Number"]]
      [[HYPERLINK POLICY,"Number of words to include in filename"]]

      _HTML Generation_

      [[HYPERLINK POLICY,"HTML version to be targeted"]]

      _Style_

      [[HYPERLINK POLICY,"First line indentation (in blocks)"]]

      _Tables_

      [[HYPERLINK POLICY,"Default TABLE caption"]]
      [[HYPERLINK POLICY,"Default TABLE header rows"]] [[BR]]
      [[HYPERLINK POLICY,"Default TABLE header cols"]]
      [[HYPERLINK POLICY,"Column boundaries have zero width"]]
      [[HYPERLINK POLICY,"Use .. markup"]]

6.4 Settings policies

      *New in version 4*

      These policies are used to control the behaviour of the program
      during the conversion process. Most program settings are not
      available as policies, but those that are available are listed here.

      Full descriptions of these policies can be found in the
      [Policy Manual].

6.4.1 Error reporting

      The following policies can be used to tailor the number and type of
      messages displayed during conversion.
      [[HYPERLINK POLICY,"Error reporting level"]]
      [[HYPERLINK POLICY,"Suppress INFO messages"]]
      [[HYPERLINK POLICY,"Suppress TAG ERROR messages"]]
      [[HYPERLINK POLICY,"Suppress URL messages"]]
      [[HYPERLINK POLICY,"Suppress WARNING messages"]]
      [[HYPERLINK POLICY,"Suppress program ERROR messages"]]

6.5 Saving and loading policy files

      This section has been copied into the Policy manual section on
      [[HYPERLINK URL,"policy_manual_3.html#Section_3.1","placing policies in a file"]]

6.5.1 Overview

      AscToHTM allows you to save policies to file so that you can later
      reload them. This allows you to easily define different ways of
      doing conversions, either for different types of files, or to
      produce different types of output.

      The policy files have a .pol extension by default, and are simple
      text files, with one policy on each line. You can, if you wish, edit
      these policies in a text editor... this is sometimes easier than
      using all the dialogs in the Windows version.

      When editing policies, it is important not to change the key phrase
      (the bit before the ":" character), as this needs to be matched
      exactly by AscToHTM.

      For best results, it is advisable to put in your policy file only
      those policies you want to fix. This leaves AscToHTM to calculate
      document-by-document policies that suit the files being converted.

      Note: Avoid using a "full" policy file for your conversions. Such
            files prevent the program from adjusting to each source file,
            often leading to unwanted results.

6.5.2 Generating policy files for your document

      The normal way to create a policy file is by setting options and
      then saving them using the "save policy file" dialog. This will
      offer you the choice of creating a partial policy file or a full
      policy file (see 6.5.2.1 and 6.5.2.2).

      Alternatively, you can set the [[HYPERLINK POLICY,"Output policy file"]]
      policy which will generate a full policy file resulting from the
      analysis of the converted document.
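
      The file layout described in 6.5.1 (one policy per line, with the
      key phrase before the ":" character) is simple enough to handle with
      a short script. The following Python sketch is not part of AscToHTM;
      it only illustrates how a full policy file could be trimmed down to
      a partial one outside the program. The function names are
      hypothetical, the value "2" for "Split level" is just a sample, and
      the handling of "[Section]" headings is an assumption based on the
      Link Dictionary example in 6.3.10.

```python
def parse_policy_lines(lines):
    """Return {key phrase: value} for lines of the form 'key phrase : value'."""
    policies = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, "[Section]" headings and their "-----" underlines
        # (assumed layout, modelled on the Link Dictionary example).
        if not line or line.startswith("[") or set(line) == {"-"}:
            continue
        # The key phrase is everything before the first ":" character.
        key, sep, value = line.partition(":")
        if sep:
            policies[key.strip()] = value.strip()
    return policies


def keep_only(policies, wanted):
    """Keep just the policies you want to fix, giving a partial policy set."""
    return {key: value for key, value in policies.items() if key in wanted}


def format_policy_lines(policies):
    """Write policies back out, one per line, leaving key phrases untouched."""
    return [f"{key} : {value}" for key, value in policies.items()]


# Sample input: the Link Dictionary lines from 6.3.10 plus one sample policy.
sample = [
    "[Link Dictionary]",
    "-----------------",
    'Link definition : "a2hdoco.txt" = "Source text" + "/~jaf/A2HDOCO',
    "Split level : 2",
]
full = parse_policy_lines(sample)
partial = keep_only(full, {"Split level"})
print(format_policy_lines(partial))
```

      Note that a dictionary keeps only one value per key phrase, so a
      file with several "Link definition" lines would need a list instead.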
      Once a file is generated you can either edit it in a text editor -
      deleting policies that are of little interest to you, and editing
      those that are - or reload it into the program, change it and save
      it again.

6.5.2.1 Partial policy files

      Partial policy files are files which have values for some, but not
      all, policies. These are recommended because they leave AscToHTM
      free to adjust all the other policies not set in the file, allowing
      it to adapt to the details of the document being converted.

      For example, you should only set the indentation policy if you
      *know* what indents you are using, or if you want to override those
      calculated by AscToHTM. Normally it is best to omit this policy, and
      allow AscToHTM to work it out itself.

      When you save a policy file from inside AscToHTM, a partial policy
      file will contain

      - all policies loaded from the current policy file (if any)
      - all policies changed in AscToHTM during the current session
        (if any)

6.5.2.2 Full policy files

      A "full" policy file contains a value for almost every possible
      policy. Such files are usually only useful for documentation and
      analysis purposes, and should almost never be reloaded as input into
      a conversion, as this would totally fix the conversion details.

6.5.3 Naming policy files

      Whenever the [[HYPERLINK POLICY,"Output policy file"]] policy is set
      the generated "full" policy file is usually called <filename>.pol
      where <filename> is the name of the file being created. When this
      happens any existing file of that name will be overwritten.

      For this reason we *strongly* advise you adopt a naming convention
      of the form in_<filename>.pol or i<filename>.pol, or place your
      input policies in a different directory and ensure they are backed
      up.

$_$_TABLE_WIDTH

7 Using the preprocessor
----------------------------

      The preprocessor was introduced to allow users more flexibility in
      the HTML they generate. As such it allows AscToHTM to be used as an
      HTML authoring tool, as opposed to a simple text conversion or
      migration tool.
      The preprocessor looks for

      i)  "Directives". These are lines that begin with a special
          character sequence. Presently this is "$_$_".

      ii) "Tags" that are enclosed in "[[OT]]" and "[[CT]]".

      A separate document - the [Tag Manual] - has been produced to
      describe the pre-processor commands in detail. That document
      replaces much of the content that was originally in this section
      and the next.

      Some commands may be either tags or directives; this is explained
      more fully in the Tag Manual.

7.1 Directives

7.1.1 Marking up sections of text

      The pre-processor can be used to mark sections in your document so
      that AscToHTM will process them as you wish. These include :-

      [[HYPERLINK TAGGING,"BEGIN/END_CODE"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_CONTENTS"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_DIAGRAM"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_HTML"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_IGNORE"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_PRE"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_TABLE"]] [[BR]]
      [[HYPERLINK TAGGING,"SECTION"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_COMMA_DELIMITED_TABLE"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_DELIMITED_TABLE"]] [[BR]]

7.1.2 Commands that influence the <HEAD>..</HEAD> of a file

      Some commands can be used to control the contents of the HTML <HEAD>
      section. These attributes are largely non-visual, but may influence
      how the document is indexed (e.g. by search engines).

      Commands include :-

      [[HYPERLINK TAGGING,"TITLE"]] [[BR]]
      [[HYPERLINK TAGGING,"DESCRIPTION"]] [[BR]]
      [[HYPERLINK TAGGING,"KEYWORDS"]] [[BR]]
      [[HYPERLINK TAGGING,"STYLE_SHEET"]] [[BR]]
      [[HYPERLINK TAGGING,"BASEHREF"]]

      To fully understand how titles are calculated, see the discussion
      in 5.6.1.

7.1.3 One line pre-processor commands

      These commands exist on a line by themselves. They either cause
      something to be executed at the point they occur, or are used to
      mark that location in the file in some way.
      The commands include :-

      [[HYPERLINK TAGGING,"CONTENTS_LIST"]] [[BR]]
      [[HYPERLINK TAGGING,"HTML_LINE"]] [[BR]]
      [[HYPERLINK TAGGING,"INCLUDE"]] [[BR]]
      [[HYPERLINK TAGGING,"LINERULE"]] [[BR]]
      [[HYPERLINK TAGGING,"NAVIGATION_BAR"]] [[BR]]
      [[HYPERLINK TAGGING,"TOC"]] [[BR]]

7.1.4 The TABLE commands

      A large number of commands influence the detection, analysis and
      generation of tables. These are discussed in the [Tag manual] in the
      [[HYPERLINK URL,"tag_manual_2.html#Section_2.5","The TABLE commands"]]
      section.

      The table commands include :-

      [[HYPERLINK TAGGING,"BEGIN/END_TABLE"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_BGCOLOR"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_BORDER"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_BORDERCOLOR"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_CAPTION"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_CELLPADDING"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_CELLSPACING"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_CELL_ALIGN"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_COLO(U)R_ROWS"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_CONVERT_XREFS"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_EVEN_ROW_COLO(U)R"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_HEADER_COLS"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_HEADER_ROWS"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_MAY_BE_SPARSE"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_MIN_COLUMN_SEPARATION"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_ODD_ROW_COLO(U)R"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_COMMA_DELIMITED_TABLE"]] [[BR]]
      [[HYPERLINK TAGGING,"BEGIN/END_DELIMITED_TABLE"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_ALIGN"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_IGNORE_HEADER"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_LAYOUT"]] [[BR]]

7.1.5 The CHANGE_POLICY command

      The CHANGE_POLICY command allows you to change policies, which are
      described in the [Policy Manual].

      You can read more about this in the Tag Manual under
      [[HYPERLINK URL,"tag_manual_2.html#Section_2.6","The CHANGE_POLICY command"]]

7.1.6 Block definition

      The program allows you to define "blocks" of text that can later be
      inserted or embedded wherever you want in the text.
      For more details read the Tag Manual section
      [[HYPERLINK URL,"tag_manual_2.html#Section_2.7","Definition blocks and variables"]]

      Commands include :-

      [[HYPERLINK TAGGING,"DEFINE/END_BLOCK and RESET_BLOCK"]] [[BR]]
      [[HYPERLINK TAGGING,"DEFINE_VARIABLE"]] [[BR]]
      [[HYPERLINK TAGGING,"EMBED_BLOCK"]] [[BR]]
      [[HYPERLINK TAGGING,"INSERT_BLOCK"]] [[BR]]
      [[HYPERLINK TAGGING,"SAVE_CONTEXT...RESTORE_CONTEXT"]] [[BR]]

7.1.7 HTML Fragments

      *New in version 4*

      It is possible to define blocks of HTML that the program will use
      instead of the HTML it normally produces in certain circumstances.
      These blocks of HTML can themselves contain special HTML fragment
      tags to allow the HTML to be customized.

      For example the block HORIZONTAL_RULE can be used to replace the
      <HR> tag produced wherever lines are detected with a .gif of your
      choice. If you want this .gif to adopt a suitable length and
      alignment (to match those in the original text) you can use the HTML
      fragment tags RULEWIDTH and RULEALIGN.

      A full description of [HTML fragments] can be found in the
      [Tag Manual].

7.2 In-line tags

      In-line tags were introduced in version [[TEXT 3.2]]. They allow you
      to place special codes in your source text to achieve effects that
      couldn't be done by converting plain text alone. As such they
      greatly enhance AscToHTM's ability to be used as a web authoring
      tool.

7.2.1 Format of in-line tags

      In-line tags are an extension of the preprocessor tags introduced in
      earlier versions. As the name implies, in-line tags may be placed
      "in line" with the source text, giving greater flexibility for their
      use.

      In-line tags are signified by the start and end delimiters "[[OT]]"
      and "[[CT]]", for example :-

      [[OT]]HTML_COMMENT this will become an HTML comment[[CT]]

      The whole tag must be contained within a single line of the source
      file, that is, there cannot be any newline characters in the middle
      of a tag.

      You can have as many tags as you like on any given line, but they
      may not be nested.

      In-line tags are fully documented in the [Tag Manual]. A summary of
      some of the in-line tags available is given in 7.2.2.

7.2.2 Summary of in-line tags

      This is a brief list of the available in-line tags.
      A fuller description of using in-line tags is available in the
      [tag manual].

      Tags include :-

      [[HYPERLINK TAGGING,"BR (line break)"]] [[BR]]
      [[HYPERLINK TAGGING,"COLO(U)R"]] [[BR]]
      [[HYPERLINK TAGGING,"CONTENTS_LIST"]] [[BR]]
      [[HYPERLINK TAGGING,"ENTITY"]] [[BR]]
      [[HYPERLINK TAGGING,"FONT"]] [[BR]]
      [[HYPERLINK TAGGING,"GOTO"]] [[BR]]
      [[HYPERLINK TAGGING,"HTML"]] [[BR]]
      [[HYPERLINK TAGGING,"HTML_COMMENT"]] [[BR]]
      [[HYPERLINK TAGGING,"HYPERLINK"]] [[BR]]
      [[HYPERLINK TAGGING,"LINKPOINT"]] [[BR]]
      [[HYPERLINK TAGGING,"SPACES"]] [[BR]]
      [[HYPERLINK TAGGING,"NB ""non-breaking spaces"""]] [[BR]]
      [[HYPERLINK TAGGING,"SUPER and SUB"]] [[BR]]
      [[HYPERLINK TAGGING,"TEXT"]] [[BR]]
      [[HYPERLINK TAGGING,"TIMESTAMP"]] [[BR]]
      [[HYPERLINK TAGGING,"TABLE_LAYOUT"]] [[BR]]
      [[HYPERLINK TAGGING,"FILENAME"]] [[BR]]
      [[HYPERLINK TAGGING,"FRACTION"]] [[BR]]
      [[HYPERLINK TAGGING,"IGNORE_THIS"]] [[BR]]
      [[HYPERLINK TAGGING,"VARIABLE"]] [[BR]]
      [[HYPERLINK TAGGING,"VERSION"]] [[BR]]

8 Frames
------------

8.1 Overview

      *New in version 4*

      New in version 4 is the ability to have the program generate a set
      of frames from your source file. The program works to a model set of
      frames as shown below, but you have a great degree of control over
      how the frames are laid out, and what their contents are.

$_$_BEGIN_DIAGRAM
      +------------------------------------------------------------+
      |                        Header frame                        |
      |                         (optional)                         |
      +-------------+----------------------------------------------+
      | NOFRAMES    |                                              |
      |   link      |                                              |
      |             |                                              |
      |             |                                              |
      |             |                                              |
      |  Contents   |                    Main                      |
      |   Frame     |                    Frame                     |
      | (optional)  |                                              |
      |             |                                              |
      |             |                                              |
      |             |                                              |
      |             |                                              |
      |             |                                              |
      +-------------+----------------------------------------------+
      |                        Footer frame                        |
      |                         (optional)                         |
      +------------------------------------------------------------+
$_$_END_DIAGRAM

8.2 The frames generated

8.2.1 The master document

      Frames are implemented under HTML by having a document that
      describes the frame layout by using one or more nested <FRAMESET>
      tags.
      These tags group together <FRAME> tags that identify other HTML
      files that describe the contents of the individual frames or panes.

      The HTML page containing the <FRAMESET> doesn't normally contain any
      visible content. The source of this HTML page looks something like
      this :-

$_$_BEGIN_PRE
      <BODY>
      <p>This browser does not support FRAMES</p>
      <p>Visit <A TARGET="_top" HREF="noframes_main.html">this link</A></p>
      </BODY>
$_$_END_PRE

      This example produces a layout similar to that shown in the diagram
      in 8.1. There are four frames as follows :-

      - "header" at the top of the screen with content taken from the
        HTML page header.html

      - "footer" at the bottom of the screen with content taken from the
        HTML page footer.html

      - the two frames "contents" and "main" side by side in the middle
        of the screen, between the "header" and "footer" frames. The
        "contents" frame is on the left, the "main" frame on the right.
        The contents of these frames are held in the HTML files
        "contents.html" and "main.html".

      The <NOFRAMES> tag describes the content to be displayed if the
      browser doesn't support frames. This is less common now, but is
      still important as many search engines don't understand frames, and
      will only index the pages linked to in the <NOFRAMES> tag.

      In HTML the frame names and source file names can be whatever you
      like. AscToHTM uses the frame names "header", "footer", "contents"
      and "main", but will vary the source file names according to the
      name of your input file.

      Depending on the details of your conversion, not all of the above
      frames are generated, in which case the <FRAMESET> tags will look
      slightly different. You don't need to worry about any of this as
      AscToHTM will determine what layout is required and will generate
      the necessary HTML <FRAMESET> code.

      By default if you convert a file called "myfile.txt" the files
      created are named as follows :-

      myfile_frame.html           - Master <FRAMESET> file
      myfile_header_frame.html    - "header" source file.
      myfile_contents_frame.html  - "contents" source file.
      myfile_footer_frame.html    - "footer" source file.
      myfile.html                 - "main" source file.

8.2.2 The "main" frame

      The "main" frame will contain the conversion of your source file. If
      you elect to split a document into many pages, then this will show
      the start page (which will have links to any next/previous page).

      See also [[GOTO Splitting the document into many HTML pages]]

8.2.3 The "contents" frame

      If your document has recognised headings, then the program is able
      to generate a contents list (see 5.6.2). In such cases a "contents"
      frame is generated and the contents list is placed in a file called
      "myfile_contents_frame.html".

      If no contents list can be generated, then no contents frame is
      created unless you supply a CONTENTS_FRAME [HTML fragment] to be
      used as the contents of the "contents" frame (see 8.4).

      The contents frame is placed to the left of the main frame. It will
      include a hyperlink labelled "NOFRAMES" (see 8.5) and the generated
      contents list. This is different from the <NOFRAMES> tag described
      in 8.2.1.

      You can use policies (see 8.3) to suppress the creation of a
      contents frame or to control the following :-

      - width of the frame
      - colours of background and text
      - number of levels shown in the generated contents list
      - whether a "NOFRAMES" link is shown, and what URL it links to

      You can also customize the frame's appearance using the following
      [HTML fragments] (see 8.4)

      - CONTENTS_FRAME
      - START_TOC / END_TOC

8.2.4 The "header" and "footer" frames

      The software cannot "detect" headers and footers in your source
      text, so you will only get a header or footer frame if you supply
      the HTML yourself.

      Header and footer frames can be useful as they provide you with the
      opportunity to supply titles, navigation links or copyright notices
      that are always visible.

      Prior to version 4 the software already had the ability to add HTML
      headers and footers to each page generated using HTML supplied in
      separate files identified by policy values.
      From version 4 onwards [HTML fragments] may also be used.

      *NOTE: We recommend that, where possible, you use [HTML fragments] to define any header and footer HTML*

      It's expected that you may want to convert the same source into both
      frames and non-frames forms, using the same policy file. Given this
      the program has the ability to "promote" the HTML headers and
      footers used in non-frames production into their own always-visible
      frames. Equally there may be times when this behaviour is not
      wanted.

      The relationships between headers and footers used in non-frames
      conversion and those used in frames-based conversion are quite
      complex. In the following sections we describe how headers
      (footers) are calculated. The logic is described for headers, but
      applies equally well to footers if you make the necessary name
      changes.

8.2.4.1 Non-frames use of HTML headers

      In non-frames conversion each page created will get an HTML header
      if either

      a) The policy [[HYPERLINK POLICY,"HTML header file"]] is set

      b) The [HTML fragment] HTML_HEADER is defined

      If both are set, the HTML_HEADER fragment is used in preference.

      The selected header is referred to as the "standard" header in the
      discussion in the next two sections.

      Note: For HTML footers the fragment HTML_FOOTER is used, and the
            policy [[BR]]
            [[HYPERLINK POLICY,"HTML footer file"]] is tested.

8.2.4.2 "main" frame header

      In frames conversion the HTML header added to each page is
      determined by three things

      - Any "standard" HTML header defined for non-frames conversion
        (see 8.2.4.1)

      - the policy [[HYPERLINK POLICY,"Use main header in header frame"]]

      - whether or not a [HTML fragment] MAIN_FRAME_HEADER is defined

      If the fragment MAIN_FRAME_HEADER is defined, then that is used.

      If the fragment MAIN_FRAME_HEADER is not defined, and there is no
      "standard" header, then the main frame gets no HTML header.

      If the fragment MAIN_FRAME_HEADER is not defined, and the policy is
      *not* set, then the "standard" header is used as in non-frames
      conversion.
      If the fragment MAIN_FRAME_HEADER is not defined, and the policy is
      set, then the "standard" header is promoted into its own "header"
      frame, and the main frame gets no HTML header.

      Note: For HTML footers the fragment MAIN_FRAME_FOOTER is used, and
            the policy [[BR]]
            [[HYPERLINK POLICY,"use main footer in footer frame"]] is tested.

8.2.4.3 "header" frame

      In frames conversion whether or not a "header" frame is created is
      determined by three things

      - Any "standard" HTML header defined for non-frames conversion
        (see 8.2.4.1)

      - the policy [[HYPERLINK POLICY,"Use main header in header frame"]]

      - whether or not a [HTML fragment] HEADER_FRAME is defined

      If the fragment HEADER_FRAME is defined, then that is used as the
      contents of a "header" frame.

      If the fragment HEADER_FRAME is not defined, and there is no
      "standard" header, then no "header" frame is created.

      If the fragment HEADER_FRAME is not defined, and the policy is not
      set, then no "header" frame is created.

      If the fragment HEADER_FRAME is not defined, and the policy is set,
      then the "standard" header is used as the contents of the "header"
      frame. In other words the "standard" header is promoted from the
      "main" frame into its own "header" frame.

      Note: For HTML footers the fragment FOOTER_FRAME is used, and the
            policy [[BR]]
            [[HYPERLINK POLICY,"use main footer in footer frame"]] is tested.

8.3 Using policies to control the frame structure

      A large number of policies influence frames generation. These are
      described more fully in the [Policy Manual].
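
      Before listing the policies, it may help to see the header rules
      from 8.2.4.1 to 8.2.4.3 written out as code. The following Python
      sketch is only a restatement of those rules, not AscToHTM's actual
      implementation. The function and parameter names are hypothetical;
      "promote" stands for the "Use main header in header frame" policy,
      and None means "not defined" or "nothing generated". The same logic
      would apply to footers with the names changed.

```python
def standard_header(html_header_fragment, header_file_html):
    """8.2.4.1: the HTML_HEADER fragment wins over the header file policy."""
    if html_header_fragment is not None:
        return html_header_fragment
    return header_file_html  # None when the policy is not set either


def main_frame_header(standard, promote, main_frame_header_fragment=None):
    """8.2.4.2: HTML header added to each page shown in the "main" frame."""
    if main_frame_header_fragment is not None:
        return main_frame_header_fragment
    if standard is None or promote:
        # No standard header, or it has been promoted to the "header" frame.
        return None
    return standard


def header_frame_contents(standard, promote, header_frame_fragment=None):
    """8.2.4.3: contents of the "header" frame, or None for no frame."""
    if header_frame_fragment is not None:
        return header_frame_fragment
    if standard is None or not promote:
        return None
    return standard


# Example: a header file is configured and the promotion policy is set.
std = standard_header(None, "<p>My site</p>")
print(main_frame_header(std, promote=True))
print(header_frame_contents(std, promote=True))
```

      With the policy set, the sketch gives the main frame no header and
      places the "standard" header in the "header" frame, which is the
      promotion described in 8.2.4.3.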
      _general_

      [[HYPERLINK POLICY,Place document in frames]]
      [[HYPERLINK POLICY,Output frame name]]
      [[HYPERLINK POLICY,Add Frame border]]
      [[HYPERLINK POLICY,New frame link window name]]
      [[HYPERLINK POLICY,Open frame links in new window]]

      _contents frame_

      [[HYPERLINK POLICY,Add contents frame if possible]]
      [[HYPERLINK POLICY,Add NOFRAMES links]]
      [[HYPERLINK POLICY,NOFRAMES link URL]]
      [[HYPERLINK POLICY,Number of levels in contents frame]]
      [[HYPERLINK POLICY,Contents Frame width]]
      [[HYPERLINK POLICY,Contents frame background colour]]
      [[HYPERLINK POLICY,Contents frame text colour]]

      _main frame_

      [[HYPERLINK POLICY,First frame page number]]

      A number of [[GOTO file generation policies]] affect the main
      frame's appearance, including :-

      [[HYPERLINK POLICY,"Split level"]] [[BR]]
      [[HYPERLINK POLICY,"Min HTML File size"]] [[BR]]
      [[HYPERLINK POLICY,"Add navigation bar"]] [[BR]]

      _header and footer frames_

      [[HYPERLINK POLICY,Use main header in header frame]]
      [[HYPERLINK POLICY,Header Frame depth]]
      [[HYPERLINK POLICY,Header frame background colour]]
      [[HYPERLINK POLICY,Header frame text colour]]
      [[HYPERLINK POLICY,Use main footer in footer frame]]
      [[HYPERLINK POLICY,Footer Frame depth]]
      [[HYPERLINK POLICY,Footer frame background colour]]
      [[HYPERLINK POLICY,Footer frame text colour]]

8.4 Using HTML fragments to override frame contents

      [HTML fragments] were introduced in version 4 as a means of allowing
      users to customize some of the HTML generated by the software. This
      feature is heavily used in frames generation.

      The fragment names used in frames production include

      HEADER_FRAME
            If defined, this fragment is used as the contents of a header
            frame at the top of the screen

      FOOTER_FRAME
            If defined, this fragment is used as the contents of a footer
            frame at the bottom of the screen

      CONTENTS_FRAME
            If defined, this fragment is used as the contents of the
            "contents" frame on the left of the screen.
            If not defined the "contents" frame will contain a generated
            contents list

      MAIN_FRAME_FOOTER
            If defined, this fragment is used as the HTML footer of each
            page that appears in the main frame, overriding any
            HTML_FOOTER or value defined via policy file.

      MAIN_FRAME_HEADER
            If defined, this fragment is used as the HTML header of each
            page that appears in the main frame, overriding any
            HTML_HEADER or value defined via policy file.

      Other HTML fragments may have an effect. For example :-

      START_TOC
            A fragment to be output before any generated table of
            contents. If not defined the default behaviour is to output
            the title "Table of Contents"

      END_TOC
            A fragment to be output after any generated table of contents.
            If not defined the default behaviour is to simply put out a
            horizontal rule <HR>

      For a fuller description of [HTML fragments] see the [tag Manual].

8.5 NOFRAMES tag and NOFRAMES link

      There are several reasons why providing a non-frames alternative to
      your pages is a good idea. These include

      - Not all browsers support frames. This is rarer these days, but
        there are still people who use text-based or non-visual browsers
        that can get confused by frames.

      - Not all people like frames. This is understating it, as many
        people *loathe* frames. This is because frames pages are hard to
        bookmark and the navigation can confuse some people.

      - Many search engines won't access the HTML pages used inside
        frames. This means your pages will go un-indexed, making it hard
        for people to find them.

      To help with these problems the software supplies a <NOFRAMES> tag
      in the main <FRAMESET> document, and a visible "NOFRAMES" hyperlink
      in the contents frame.

8.5.1 The "NOFRAMES" hyperlink

      The program can place a hyperlink in the contents frame. This link
      is labelled "NOFRAMES" and will link to the first main page. This
      will allow users who don't like frames to view your pages in a
      non-frames window.
      You can control this link to a limited extent using policies
      (see 8.3).

8.5.2 The <NOFRAMES> tag

      HTML provides a <NOFRAMES> tag whose contents are displayed by any
      browser that doesn't support the <FRAMESET> tag. The program will
      automatically generate a <NOFRAMES> tag that displays a message
      saying the page requires frames, and offering a link to the first
      main page.

      This will allow users with non-frames browsers, and search engines,
      to access your main pages.

8.5.3 Generating frames and non-frames versions

      You should consider whether or not your pages are suitable for both
      frames and non-frames viewing. If they are, then you can use the
      first page displayed in the main frame as your NOFRAMES hyperlink
      target. This is, in fact, the default behaviour.

      There are a number of reasons that you might want to maintain two
      sets of pages :-

      - You don't want to have the non-frames version split into as many
        small pages as the frames version (different
        [[HYPERLINK POLICY,"Split level"]] policy values)

      - You want to place different headers and footers on the two
        versions to allow for different methods of navigation.

      If you do want two sets of files, simply convert the file twice,
      with and without frames generation selected. You can either move the
      files into different directories, or change the output filename for
      one of the sets. Other than these changes you should be able to use
      the same policy file.

      If you create two sets of files, make sure you set the
      [[HYPERLINK POLICY,NOFRAMES link URL]] policy to point to the first
      non-frames HTML page.

8.6 Hyperlink targets

      One of the reasons people dislike frames is that when they click on
      a hyperlink the selected page can end up being displayed inside the
      frame, rather than in a full window. Alternatively the selected link
      is displayed in a new browser window.

      AscToHTM defaults to the following behaviour

      - links in the "header", "footer" and "contents" frames are all
        displayed in the "main" frame.
      - links in the "main" frame that belong to the current document
        (although possibly in a different HTML source file) are displayed
        in the "main" frame.

      - links in the "main" frame that *do not* belong to the current
        document (e.g. they are to another site) may be displayed in a
        full browser window. You can control this behaviour using the
        policies

        [[HYPERLINK POLICY,Open frame links in new window]] [[BR]]
        [[HYPERLINK POLICY,New frame link window name]]

      The default behaviour for links in the last category is to display
      them in the "_top" window. The name "_top" is reserved by browsers
      to mean the main browser window, so in most cases clicking on a link
      to an external site will cause the current set of frames to be
      replaced by the selected page without creating any additional
      browser windows.

8.7 Splitting large files

      When generating frames documents the area of screen allocated to the
      "main" frame is necessarily smaller than the whole browser window.
      For this reason you may want to split your document into many
      smaller pages to reduce the need for scrolling.

      This can work well with a contents list on the left, and is in fact
      the main reason people like frames as a means of navigating through
      a large set of information.

      See [[GOTO File generation policies]] for details on the policies
      that may be used to split large files into a set of smaller pages
      all linked together.

8.8 Selecting "Output HTML as a set of FRAMES" in the Windows version

      For your convenience the Windows version of the software allows you
      to select a conversion type and includes an "Output HTML as a set of
      FRAMES" option.

      If you select this option a number of policies are set for you.
      These include

      [[HYPERLINK POLICY,"Place document in frames"]] [[BR]]
      [[HYPERLINK POLICY,"Split Level"]] [[BR]]

      The former is set to "yes", and the latter is set to "2", which
      should hopefully prove a suitable default.
      When this option is first selected the "Frames" properties sheet is
      displayed to allow you to review and edit the selected
      [[GOTO frames policies]].

9 Purchasing AscToHTM, and contacts on the web
--------------------------------------------------

9.1 Purchasing AscToHTM

9.1.1 Why should I purchase AscToHTM?

      You need a reason? :-)

      Oh well... here are some reasons to visit the [Reg location] :-

      - You'll get a warm glow from supporting the author financially.

      Not enough eh? Okay...

      - Although he doesn't make a living from this software, the author
        appreciates the support that the program gets.

      - You'll get support from the author, especially in your early days
        as a new user.

      - You'll be notified of any [upgrades]. To date all upgrades have
        been *free* to _registered_ users. This means people who paid last
        year's price are getting this year's software at a discount (plus
        they've had a year's use).

        You can read details of the 11 or more upgrades made over 3-4
        years in the [[GOTO "Change History"]]. Every single one of these
        has been free to those who have registered to date.

      - You'll be able to ask the author for new features. You won't
        always get them, but you'd be *amazed* at how much of the
        functionality arose directly from _registered_ users' requests.

      Finally...

      - No nag lines at top and bottom of each page, and no nag screen
        when you start the program up.

      - All limitations on the program (file size etc) are removed

9.1.2 What happens if I don't register the shareware version of AscToHTM?

      Originally I wanted to produce a fully-featured, but time-limited
      shareware version. However, for various reasons we've had to move to
      producing a largely-featured version with a 30 day time limit.

      Sorry 'bout that.

      At present the shareware version of the program will expire after 30
      days plus 5 days usage. Each time the program runs it will tell you
      how many days use you have.
During this period the program inserts a one line reminder to register at the top and bottom of each HTML page generated. This line is omitted in the registered version.

This line is easily deleted from the output source file, but we expect this to become quite tedious if the program is used repeatedly, particularly when large documents are split into a number of small files, or a large number of files are being converted.

*Some people have actually put such pages on the web*.

There are other limitations of the shareware version :-

- If you don't register, it will cease to function properly after 30 days. Specifically, after 30 days any conversions will convert all your text to random case. This will still allow you to evaluate the software, but the resulting HTML will be of little use to you. This mixed case will go away if you register the software.

- In the shareware version you're limited to only the first 500 lines of any source file. After 500 lines a warning is placed in the output, and all subsequent lines are converted to upper case. This allows you to gain an impression of what the HTML will look like for evaluation purposes.

- In the shareware version, wildcard conversions are limited to only 5 files.

- Certain other policies are not supported in the shareware version.

I don't like limiting the software, but there you go.

9.1.3 Can't I get something for nothing?

Only if you fall into either of the following two categories:

- You're an FAQ maintainer. FAQ maintainers add a lot of value to the Internet. As a little "thank you" I'm making this software free to anyone who maintains an FAQ. See the [Reg location] for more details.

- VMS users. The software is largely developed and tested under VMS. Although the VMS version doesn't have the windowed interface, it does share the underlying conversion software.

  *<SOAPBOX>* We VMS users pay too much for software (that's when we can get it), so the VMS version of this software is made available for free.
  *</SOAPBOX>*

  If you really want a free copy, buy an OpenVMS system :)

  If you find any for less than $40, let me know :)

9.1.4 I'm convinced. How do I purchase AscToHTM?

First we recommend you try out the product to convince yourself that it meets your needs.

You've done that, right? Okay, visit the [Reg location] and follow the instructions there.

You can buy AscToHTM online through a couple of third party registration services. These services can accept payment through a number of methods including credit cards, wire transfer or snail mail.

Once the registration service has received payment you will be sent details of how to get your registered version. Often the whole process can be completed in minutes.

If you experience any problems registering, email sales@jafsoft.com with details.

9.2 Contacts on the Web

9.2.1 The home page

The AscToHTM [home page] is hosted on the [JafSoft] web site.

If you have problems locating the home page and suspect it has moved, go to [AltaVista] and enter

        +"John A Fotheringham" +AscToHTM

to locate any new home page.

9.2.2 E-mail

E-mail any feedback to info@jafsoft.com.

Most people will get a reply within 24 hours, although we cannot guarantee this given holidays and the like.

9.2.3 Support

Support is available to registered users by emailing support@jafsoft.com.

Any enquiries should be directed to the same address. Sadly, we cannot guarantee any replies, though we do try to be helpful. Priority is given to people who have registered copies.

Recently a [FAQ] has been created. Although not complete, it may help to answer some of the commoner questions people have.

10 Known problems
---------------------

We listen to all suggestions, and indeed many of the features added have been in direct response to customer feedback. (You couldn't expect us to *invent* all this stuff on our own now, could you? :)

10.1 Bug reports

_Registered_ users are free to make bug reports or suggestions for enhancements to support@jafsoft.com.
We try to fix bugs ASAP, and to date have usually shipped fixes to specific problems within 72 hours, but we can't promise this response time.

We used to maintain a bug list, but we prefer spending the time fixing them rather than documenting them.

10.2 Features

All good software has features (ask Microsoft).

- Links in the link dictionary that have some common text may get confused. The problem generally is that having put some HTML markup into your line, it becomes hard not to search the contents of that markup subsequently.

  This is my problem not yours. Unless it catches you out when, of course, it becomes your problem, not mine.

- Bullet characters inside section headings used to cause confusion.

The following are consequences of how the program works, and may take longer to "fix"

- The program currently assumes a structure of contents list and/or main body. It doesn't (yet) cope with Appendices. Nor does it cope with several sections all with their own numbering systems.

- Certain algorithms put a <BR> on the end of the line. Where these algorithms "misfire" you may find unexpected breaks in large paragraphs.

Over time more of these will be configurable via the document policy, but originally we tried to avoid the need for such micro-control.

10.3 Coming soon... or not.

Many, many features have been added in direct response to (_registered_) users' requests. Many thanks to all those who've come up with suggestions and feedback (you all know who you are).

11 Change History
--------------------

11.1 Version 4.1 (August 2001)

11.1.1 New functions

- New /TABLE (see 4.2.2.13) command line qualifier that allows the input file to be treated as a single plain text table

- Added support for HEAD_SCRIPT [HTML fragment]. This allows HTML to be defined that can be copied into the <HEAD> of a document. This can include <META> tags or <SCRIPT>...</SCRIPT> sections.

- Added Swedish interface. Many thanks to Dan Sverraby.
- Added new policy [[HYPERLINK POLICY,"Only allow pages to be viewed in frames"]]

- New utility A2HDETAG is available to registered users so they can "de-tag" their source files to remove all AscToHTM pre-processor tags, leaving a plain text fit for publishing, e.g. on Usenet.

- Added BEGIN_ASCII ... END_ASCII pre-processor tags. These identify text that will be copied to the output of A2HDETAG. It is ignored in all other conversions, and is intended to allow alternative text to be placed in text and HTML versions of a document.

- Added [[HYPERLINK POLICY, "character endcoding"]] policy to allow the character encoding of a document to be set. The software has limited ability to detect Japanese ("x-sjis") and Cyrillic ("koi-8") text, but in some cases this will need to be set.

  The auto-detect of character sets can be switched off by using the [[HYPERLINK POLICY, "Look for character encoding"]] policy

- Added policies to allow different fonts to be applied to different types of text as follows

        Normal text         [[HYPERLINK POLICY,"Default font"]]
        Headings            [[HYPERLINK POLICY,"Heading Font"]]
        Text in tables      [[HYPERLINK POLICY,"Table font"]]
        Table of contents   [[HYPERLINK POLICY,"TOC Font"]]
        Fixed-pitch text    [[HYPERLINK POLICY,"Fixed font"]]

  The "Default Font" policy existed previously; the other four policies are new in this version.

- Added [[HYPERLINK TAGGING,PAGE]] directive. This marks a page boundary. In HTML this simply results in a <HR> tag, since HTML doesn't really support pages. This may be expanded in future to allow page numbers and the like to be displayed.

11.1.2 Other Changes

_Windows version_

- Loading a policy file with "place policy in frames" policy will now toggle the Conversion type

- You no longer get prompted to "save policy" just because you pressed OK on one of the policy sheets. Now this only happens when something has been changed.

- The main menu now has a "check for updates" option.
  If you select this you'll be taken to the JafSoft website where you'll be told if any newer versions of the software have been released.

_Documentation_

- The list of bug fixes is removed from this document and is now to be found on-line at

        http://www.jafsoft.com/doco/asctohtm_bug_history.html

_All versions_

- Added support for HTML fragment files to $_$_INCLUDE other HTML fragment files. This allows common fragments to be shared.

- Fine-tuned the detection of whether or not a file has an in-situ contents list

- When Frames generation is selected the default "Split level" is set to 1 instead of 2. This means you'll get fewer files generated and - depending on the type of headings you have - no splitting may occur unless you manually increase the split level.

- The [[HYPERLINK TAGGING,"LINKPOINT"]] pre-processor tag can now be used as a directive as well as an in-line tag. (see the [Tag Manual] for details).

- Added a "Range" attribute to the [[HYPERLINK TAGGING,"CONTENTS_LIST"]] tag. This allows mini-contents lists to be generated which contain only entries for a part of the document, rather than the whole document, e.g. for just a single chapter.

  This should help those who want to split large files into pages and to have a mini-contents list for each section.

- Improved handling of VT escape characters. These are either removed from the output or converted to "line" characters

- Added auto-detect of double spaced files (files where every second line is blank). This will set the [[HYPERLINK POLICY,"Input file is double spaced"]] policy whenever double-spaced text is detected (unless the policy has already been set).

11.2 Version 4 (May 2001)

11.2.1 New functions

_API version_

- For those wishing to call AscToHTM programmatically, an API has been developed. This is sold under separate license. Contact sales@jafsoft.com if you're interested.

_Linux version_

- A Linux command line version will soon be available.
  Beta versions have been tested, and I hope to do a Linux command line release just after version 4 is released.

_Windows version_

- You can now choose from the main screen whether you want your HTML output as one or more HTML file(s), sent to the Windows Clipboard (see 3.4.5 and 4.1.1.4), or turned into a set of HTML frames (see [[GOTO Frames]]).

- Program now remembers positions of windows from one invocation to the next.

- The user interface is now available in Italian and Swedish

_All versions_

- Version 4 introduces frames support (see [[GOTO Frames]]). This introduces a large number of supporting policies :-

        [[HYPERLINK POLICY,"Place document in frames"]]
        [[HYPERLINK POLICY,"Output frame name"]]
        [[HYPERLINK POLICY,"Header Frame depth"]]
        [[HYPERLINK POLICY,"Footer Frame depth"]]
        [[HYPERLINK POLICY,"Contents Frame width"]]
        [[HYPERLINK POLICY,"Use main header in header frame"]]
        [[HYPERLINK POLICY,"Use main footer in footer frame"]]
        [[HYPERLINK POLICY,"Add contents frame if possible"]]
        [[HYPERLINK POLICY,"Add Frame border"]]
        [[HYPERLINK POLICY,"Open frame links in new window"]]
        [[HYPERLINK POLICY,"New frame link window name"]]
        [[HYPERLINK POLICY,"Add NOFRAMES links"]]
        [[HYPERLINK POLICY,"NOFRAMES link URL"]]
        [[HYPERLINK POLICY,"Number of levels in contents frame"]]
        [[HYPERLINK POLICY,"First frame page number"]]
        [[HYPERLINK POLICY,"Header frame background colour"]]
        [[HYPERLINK POLICY,"Header frame text colour"]]
        [[HYPERLINK POLICY,"Contents frame background colour"]]
        [[HYPERLINK POLICY,"Contents frame text colour"]]
        [[HYPERLINK POLICY,"Footer frame background colour"]]
        [[HYPERLINK POLICY,"Footer frame text colour"]]

- Added [HTML fragments] feature, with [[HYPERLINK POLICY,"HTML fragments file"]] policy and DEFINE_HTML_FRAGMENT, RESET_HTML_FRAGMENT pre-processor commands. This allows you to define HTML fragments that can be used to replace the standard HTML generated by the program.
  This allows you to customize headers, footers, horizontal rules, contents lists, navigation bars and more.

- Added support for URL parsing, including :-

    - new top level domains (.info, .biz etc) are supported
    - the "snews://" secure news server protocol type is now supported
    - URLs of the form http://username@domain_name/... are now supported

- Added [[HYPERLINK POLICY,"Check domain name syntax"]] policy

- Added [[HYPERLINK POLICY,"Create Telnet links"]] policy

- Added support for "obfuscated" URLs such as

        http://3640005069/
        http://7934972365/
        http://0330.0366.0021.0315/
        http://%6c%6f%63%6b%65%72%67%6e%6f%6d%65%2e%63%6f%6d/

  Although the display text is left unchanged, the hyperlink will point to a non-obfuscated URL (either the domain name, or an IP address).

  This is because obfuscated URLs such as these are often used by spammers, and the author has no intention of allowing his software to aid spammers in their goals. If someone cares to give me a valid reason for using such URLs I may reconsider this behaviour.

- Added support for embedded headings with the [[HYPERLINK POLICY,"Expect embedded headings"]] policy (see 5.4.4). These are "headings" that are embedded as the first sentence in a paragraph.

- Added support for headings that start with particular words or phrases via the [[HYPERLINK POLICY,"Heading key phrases"]] policy (see 5.4.5).

- New /COMMA (see 4.2.2.1) and /TABBED (see 4.2.2.12) command line qualifiers that allow comma delimited and tab delimited files to be converted into tables.

- Added [[HYPERLINK POLICY,"Check indentation for consistency"]] policy to allow checking of headings to be relaxed (e.g. when they're centred on the page).

- Added [[HYPERLINK POLICY,"Look for diagrams"]] policy

- Added [[HYPERLINK POLICY,"Input file contains PCL codes"]] policy

- Added [[HYPERLINK POLICY,"Input file contains Japanese characters"]] support.
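
As a rough illustration of the "obfuscated" URL forms listed above (a single decimal number, dotted octal, and a percent-encoded host name), the following Python sketch reduces each one to a plain host. It is not AscToHTM's own code: the function names are hypothetical, and wrapping over-large decimal values modulo 2^32 is an assumption about how browsers of the time read such numbers.

```python
from urllib.parse import unquote, urlsplit


def _parse_number(part):
    """Parse one host component the way old IP parsers did (assumed rules):
    a leading "0x" means hex, a leading "0" means octal, otherwise decimal."""
    if part.lower().startswith("0x"):
        return int(part, 16)
    if len(part) > 1 and part.startswith("0"):
        return int(part, 8)
    return int(part, 10)


def deobfuscate_host(url):
    """Return the host of `url` as a dotted IP address or a plain domain name."""
    # Undo any %xx escapes first, e.g. "%6c%6f..." becomes "lo...".
    host = unquote(urlsplit(url).hostname or "")
    try:
        values = [_parse_number(p) for p in host.split(".")]
    except ValueError:
        return host  # not purely numeric, so treat it as a domain name
    if len(values) == 1:
        # A single number is read as a 32-bit address (assumed to wrap).
        n = values[0] % 2**32
        return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))
    if len(values) == 4 and all(0 <= v <= 255 for v in values):
        return ".".join(str(v) for v in values)
    return host  # anything else is left unchanged


for example in (
    "http://3640005069/",
    "http://7934972365/",
    "http://0330.0366.0021.0315/",
    "http://%6c%6f%63%6b%65%72%67%6e%6f%6d%65%2e%63%6f%6d/",
):
    print(example, "->", deobfuscate_host(example))
```

Under these assumptions the three numeric examples all reduce to the same dotted address, and the percent-encoded one to an ordinary domain name.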
- Added [[HYPERLINK POLICY,"Preserve new paragraph offset"]] policy

- Added [[HYPERLINK POLICY,"Omit <HEAD> and <BODY> from output"]] policy

- Added [[HYPERLINK POLICY,"Document Base URL"]] policy

- Added [[HYPERLINK POLICY,"Comment generation code"]] policy

- Added [[HYPERLINK POLICY,"Number of words to include in filename"]] policy to allow filenames to be generated from the first few words of the title when splitting documents with underlined or capitalised headings at each heading.

- Added [[HYPERLINK POLICY,"Lines to ignore at end of file"]] and [[HYPERLINK POLICY,"Lines to ignore at start of file"]] policies to allow lines at the start and end of the source file to be discarded. This can be useful if your source text is coming from a third party source that adds extra, unwanted, lines.

- Added [[HYPERLINK POLICY,"Suppress all colour markup"]] policy

- Added [[HYPERLINK POLICY,"Suppress all font markup"]] policy

- Added [[HYPERLINK POLICY,"Mirror margins"]] policy (RTF only)

- Added [[HYPERLINK POLICY,"First line indentation (in blocks)"]] policy

- Added [[HYPERLINK POLICY,"Column boundaries have zero width"]] policy

11.2.2 Other changes

_Windows version_

- On some systems DDE doesn't always work properly. This would cause the program to hang when it attempted to display results. In such cases you would need to stop the program from the task manager.

  In version 4 the program will now detect when this has happened and disable use of DDE next time it runs.

  NOTE: DDE won't work with Netscape 6.0 (it doesn't support it)

- Added the policy [[HYPERLINK POLICY,"Suppress URL messages"]] to the Settings | Diagnostics menu option. When disabled all URLs, email addresses etc will be listed in the log file. Since this file can be saved to disk, this is one way of identifying all the candidate hyperlinks from your text file.

_All versions_

- Improved analysis for tables using bar ('|') column separators

- Improved detection of ASCII art diagrams.
- Improved handling of heavily indented blocks of text. Previously these were (poorly) rendered as tables. Now the tables more accurately preserve the large indentation (see 5.5.3.4).

- The first three words of an underlined heading are now used to generate the filename. Previously only the first word was used, leading to less meaningful names, with more chances of duplication.

- VMS command line now allows multiple filespecs, separated by spaces. Policy file must now be a .pol file, rather than the second argument.

- Anchor names from filename are now lower case (to reduce possible mismatches)

- Shareware version now expires after 30 days + 5 uses. This will allow people to use the software on 5 different days after the first 30 days, giving people more time to evaluate the software at their leisure.

- Now strip out leading and trailing "---" from heading text to make them more presentable in HTML or RTF

- Added support for headings that span up to 3 lines; previously this was only 2.

- Changed heading to allow <H4> markup to be used. Previously "level 4" headings would get <H3> markup since anything smaller would end up smaller than the main text. With the advent of CSS style sheets this should be less of a problem.

- Changed emphasis handling to allow hyphenated parts to be emphasised independently, e.g. pre-_formatted_ or _pre_-formatted.

11.3 Version 3.3 (June 2000)

The AscToHTM [[TEXT 3.3]] release follows 6 "micro-releases" announced via the updates page on the Web. As such it will appear as a small step forward over [[TEXT 3.2.06]], but in fact it offers a fair amount of new functionality over version [[TEXT 3.2]].

Major changes in version [[TEXT 3.3]] include :-

- *Support for fonts*. You can now choose a font for the whole document. By default this is implemented using CSS, but you can elect to use <FONT> tags should you prefer.

- *Enhanced Language support*. The Spanish and German interfaces added in the last version have had Portuguese added.
  Also a new feature allows you to save the interface to a "language skin" text file which may be edited and then reloaded. Using this feature we can now offer

    - American English (simply a spell-checked UK English file)
    - "Babelfish" French. A French translation from http://babelfish.altavista.com/

  If anyone wants to correct these files and send them back to me, feel free.

- *More table generation controls*. Several new controls have been added to give you more control over the detection, analysis and generation of tables in the text.

- *Support for comma and tab delimited tables*. Pre-processor commands have been added to allow you to mark up a section of comma-delimited or tab-delimited data you want turning into a table.

- *Support for preserving file/line structures*. You can now elect to preserve the original line structure of a file, or to place the whole file in <PRE> markup (which is a little defeatist, but has its uses)

- *Support for non-standard characters*. The program can now recognise, to a limited extent, DOS line-drawing characters, MIME-encoded text and text documents with "change bars" in them.

- *New "Tag manual"*. The [[GOTO Using the preprocessor]] and [[GOTO In-line tags]] sections of this document have now been re-merged and their contents largely moved to a new document called the [Tag Manual].

11.3.1 New functions

_Fonts_

- The default font for the whole document can now be set via the [[HYPERLINK POLICY,"Default font"]] policy.

  Headings will also adopt the selected font, and will scale with the selected font size, although the <H1> headers are slightly smaller than the default.

  You can choose to have the fonts implemented using <FONT> tags or CSS (e.g. according to your target audience) using the [[HYPERLINK POLICY,"Use CSS to implement fonts"]] policy.

_Definition Blocks_

- Definition blocks allow you to define blocks of text that you may then insert at any point in the text (e.g. to give an "end of page" effect).
  You can also "define variables" whose value is then inserted wherever a VARIABLE tag is used.

  The pre-processor commands involved are:

        [[HYPERLINK, TAGGING,"DEFINE/END_BLOCK and RESET_BLOCK"]] [[BR]]
        [[HYPERLINK, TAGGING,"DEFINE_VARIABLE"]] [[BR]]
        [[HYPERLINK, TAGGING,"EMBED_BLOCK"]] [[BR]]
        [[HYPERLINK, TAGGING,"INSERT_BLOCK"]] [[BR]]
        [[HYPERLINK TAGGING,"VARIABLE"]] [[BR]]
        [[HYPERLINK, TAGGING,"SAVE/RESTORE_CONTEXT"]] [[BR]]

_Tables_

- Added several new policies and tags to help with table analysis. Policies added include

  [[HYPERLINK POLICY,"Default TABLE layout"]] (also pre-processor tag [[HYPERLINK TAGGING,"TABLE_LAYOUT"]])

        This allows you to specify the number of columns in each table, and the attributes of each column, specifically the character position that marks the end of each column. Rather than use this policy, it is probably better to use the related directive $_$_TABLE_LAYOUT in the source text on a per-table basis.

  [[HYPERLINK POLICY,"Default TABLE alignment"]] (also pre-processor tag [[HYPERLINK TAGGING,"TABLE_ALIGN"]])

        Allows the alignment of the table to be specified (left, right, center)

  [[HYPERLINK POLICY,"Ignore table header during analysis"]] (also pre-processor tag [[HYPERLINK TAGGING,"TABLE_IGNORE_HEADER"]])

        Specifies that table headers should be ignored when columns are being auto-detected. Some tables have complex headers that confuse the analysis. This policy can be used to help them be ignored.

  [[HYPERLINK POLICY,"Table extending factor"]]

        Controls the degree to which pre-formatted lines should be expanded into adjacent text.

  [[HYPERLINK POLICY,"Column merging factor"]]

        Controls the degree to which columns which don't appear to be very clear should be "merged" together

  [[HYPERLINK POLICY,"Tables could be blank line separated"]]

        Indicates that tables could be using blank lines to separate rows of data. This affects the analysis and detection of the table's extent.
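
The idea behind a table layout in which each column is described by the character position where it ends can be sketched in a few lines of Python. This is only an illustration of the principle: the function name, the sample row and the positions are hypothetical, and nothing here reflects the actual syntax of the TABLE_LAYOUT tag or how AscToHTM implements it.

```python
def split_columns(line, column_ends):
    """Cut `line` into cells, where column_ends[i] is the character
    position (exclusive) at which column i ends."""
    cells = []
    start = 0
    for end in column_ends:
        # Slicing past the end of a short line just gives an empty cell.
        cells.append(line[start:end].strip())
        start = end
    return cells


# A hypothetical fixed-width row: columns end at positions 10, 14 and 18.
row = "Apples".ljust(10) + "12".ljust(4) + "0.50"
print(split_columns(row, [10, 14, 18]))
```

Giving the end positions explicitly avoids having to guess column boundaries from the white space, which is where automatic analysis tends to go wrong on ragged tables.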
- Added support for embedding comma-delimited and tab-delimited table data in your source file (e.g. data exported from Excel and the like). The new pre-processor directives are :-

        [[HYPERLINK TAGGING,"BEGIN/END_COMMA_DELIMITED_TABLE"]] [[BR]]
        [[HYPERLINK TAGGING,"BEGIN/END_DELIMITED_TABLE"]] [[BR]]

_Other_

- Added options to allow more control over how the original document's file structure should be preserved

  [[HYPERLINK POLICY,"Treat each line as a paragraph"]]

        If this option is selected, every line in the source file is treated as a paragraph. This may be suitable if the file has been authored using an editor that wraps the lines (i.e. doesn't put in hard breaks) and which doesn't add blank lines between paragraphs.

  [[HYPERLINK POLICY,"Preserve line structure"]]

        If this option is selected a <BR> is added to every line, thereby preserving the line structure of the original and giving the resulting HTML file an "A4 look" that hugs the left margin regardless of how wide the window is made.

  [[HYPERLINK POLICY,"Preserve file structure using <PRE>"]]

        If this option is selected the whole document is placed in <PRE> markup, and very few conversions are attempted. This is really a "last resort" option that you may want to use if the file has complex structures which the program is failing to understand.

        This option was added for a customer who wanted to convert all 2800 RFCs without having to manually correct each one.

- Added support for parsing files with some Mime-encoded quotable strings in them. The new policy [[HYPERLINK POLICY,"Input file contains mime encoding"]] can be found under _Analysis->File structure_. At present there is some (very limited) auto-detect for this feature.

- Added support for documents with change bars. By default change bars are stripped out, and the changed text coloured red; this behaviour may be changed in later versions.
  Added the new policy [[HYPERLINK POLICY,"Input file has change bars"]] which can be found under _Analysis->File Structure_.

- Added support for converting DOS characters. The new policy [[HYPERLINK POLICY,"Input file contains DOS characters"]] can be found under _Analysis->File Structure_. There is a limited auto-detect of DOS characters when diagrams are present.

- Changed hyperlink detection to only allow explicit FTP URLs and email addresses that don't start with numbers. These behaviours can be reversed using the new policies [[HYPERLINK POLICY,"Only allow explicit FTP links"]] and [[HYPERLINK POLICY,"Allow email beginning with numbers"]], both of which are on the _Output->Hyperlinks_ tab.

- Added the policy [[HYPERLINK POLICY,"Create gopher links"]] to toggle the conversion of gopher links into hyperlinks.

- Added the policy [[HYPERLINK POLICY,"Check indentation for consistency"]] so that it could be disabled in documents where headings were centred (and thus all at different indentations)

- Added several new pre-processor in-line tags :-

        [[HYPERLINK TAGGING,"FILENAME"]] - output name of converted file[[BR]]
        [[HYPERLINK TAGGING,"FRACTION"]] - output a fraction[[BR]]
        [[HYPERLINK TAGGING,"VERSION"]] - output program version number[[BR]]
        [[HYPERLINK TAGGING,"IGNORE_THIS"]] - for comments in the source code[[BR]]

- Added policy to allow selection of which version of HTML should be generated. Policy is [[HYPERLINK, POLICY,"HTML version to be targeted"]]. Only "HTML [[TEXT 3.2]]" and "HTML [[TEXT 4.0]] Transitional" are currently supported.

11.3.2 Other changes

_Windows_

- The main screen now allows access to Policy file selection. Previously this was only available on the menu structure. The Menu structure has been left unchanged, meaning you now have two ways of choosing your policy files.

_All_

- The contents list styling has been changed slightly. For example only the major section headings are now shown in bold.
  People were complaining :-)

- Now add BORDER=0 attribute to tables with no border, rather than just omitting the attribute. This is a workaround for a bug in Netscape where a gap appears where a border would be when coloured rows are selected.

- Support for IE [[TEXT 3.0]] as the browser of choice is added, by allowing the filename rather than file URL to be passed to the browser. To do this disable the "file://localhost/" option on the _Settings->Viewers_ dialog screen.

- More changes on bullet characters, in particular to disallow 'O' (upper case) from becoming a bullet character through analysis. This really doesn't work in Portuguese documents :-)

  'o' (lower case) may still be detected. If upper case 'O' is wanted this can still be manually switched on.

- Increased maximum width allowed in tables to 200 (after encountering a sample at 165). Lines longer than this are disregarded as candidate table lines.

- Introduction of German and Portuguese user interface, with extension of the Spanish user interface.

- Horizontal lines are now implemented as <HR> tags whose length attempts to approximate the original (e.g. 50% or whatever). Previously lines would become full width.

- Chapters 7 and 8 of this document were merged into a single chapter 7 (about the pre-processor). Most of that material has now been moved to the new [Tag Manual].

  Subsequent chapters have thus been renumbered which may lead to invalid references to chapter 11... especially if you keep old versions of the doco lying around.
  Also reversed the order of sections in this "Change History" section.

11.4 Version 3.2 (October '99)

(Version [[TEXT 3.1]] was never released, but a release of [AscToTab] occurred sometime after version [[TEXT 3.0]], and so in keeping with the policy of synchronizing version numbers *that* was labelled version [[TEXT 3.1]])

Over a year after the last release, version [[TEXT 3.2]] is a major upgrade, but is only given a minor version number change because the remainder of the functionality produced in that time will be revealed in version [[TEXT 4.0]].

Version [[TEXT 3.2]] starts to prepare the groundwork for Cascading Style Sheet (CSS) and general font support that will be introduced in version [[TEXT 4.0]]. This has required a fairly radical change to the type of HTML code generated and how this is put together.

For example the HTML is now more standards compliant (this is now a stated goal of the software, although I can't always promise full compliance, see 1.1.4), and as an aid towards CSS support "optional" end tags such as </P> are now being placed in the generated HTML.

Note that the use of the <FONT> tag is deprecated in HTML [[TEXT 4.0]], and if you choose to add FONT markup to your pages they'll become much bigger, especially if they contain tables. This is because the HTML standard requires the FONT tag to continually be re-expressed to achieve the right appearance in all browsers (believe me, I only accepted this through bitter experience and grudgingly).

Major changes in version [[TEXT 3.2]] include :-

- The program now *always* makes three passes through the document - previously it only did this if a contents list was requested (see 3.3). This may make the conversion a little slower.

  The middle pass calculates how the file will be split into sections, where all the hyperlinks should point to and what the contents list should be. This approach should be less error prone than previously.

- New "overview" options (see 6.2.1).
  These allow you to easily enable and disable the program's search for certain features.

- Introduction of in-line tagging (see [[GOTO "In-line tags"]]). These allow you to get more out of your conversion by inserting commands into your source text.

- Addition of DDE support (in Windows) (see 4.1.3.4)

- New and improved command line options, and full command line support built into the Windows version (see 4.2.2)

- Improved message filtering. Each message is now labelled according to its type (information, warning etc), and may be optionally suppressed or filtered by severity. A new /SILENT command qualifier (see 4.2.2.10) allows complete suppression of messages.

- Improved log file capability (see 4.3.4)

- Added support for mail and USENET headers (see 5.4.7)

- (Limited) support added for stripping out page markers, converting "double spaced" files, and converting .prn and VT escape sequences. This functionality may be improved in later versions.

- New options to colour the odd and even rows of tables differently (see [[GOTO Table generation policies]] and 7.1.4)

11.4.1 New functions

_Windows Version_

- Added "Save" option to status dialog, so that the messages can be saved into a .log file

- Added DDE support to display results in existing browser window

- Full drag and drop support added. You can now drag files onto the program when it is visible.

- New "browse for directory" buttons added.

- More menu options added to make finding policies easier.
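
The message filtering mentioned in the list of major changes above (each message labelled with a type, then optionally suppressed by type or filtered by severity) might look something like the following Python sketch. The names and the severity ordering are assumptions made for illustration only; they are not taken from the program itself.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical message types, ordered from least to most severe."""
    INFO = 1
    WARNING = 2
    ERROR = 3


def filter_messages(messages, minimum=Severity.INFO, suppressed=frozenset()):
    """Keep (severity, text) pairs that are at least `minimum` severe
    and whose type has not been explicitly suppressed."""
    return [
        (severity, text)
        for severity, text in messages
        if severity >= minimum and severity not in suppressed
    ]


log = [
    (Severity.INFO, "Detected contents list"),
    (Severity.WARNING, "Heading out of sequence"),
    (Severity.ERROR, "Unable to open include file"),
]

# Show warnings and above only.
for severity, text in filter_messages(log, minimum=Severity.WARNING):
    print(f"{severity.name}: {text}")
```

In this sketch, suppressing every type would play the role of a "silent" mode, while a minimum severity corresponds to filtering by severity.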
_All versions_

- Now support tab-delimited tables (mainly for AscToTab) (see 7.1.1)

- Support for stripping out mail and USENET headers (see 5.4.7)

- New pre-processor directives :-

    - [[HYPERLINK TAGGING,BEGIN/END_DELIMITED_TABLE]] section delimiters
    - [[HYPERLINK TAGGING,BEGIN/END_IGNORE]] command
    - [[HYPERLINK TAGGING,CONTENTS_LIST]] command
    - [[HYPERLINK TAGGING,NAVIGATION_BAR]] command
    - [[HYPERLINK TAGGING,LINERULE]] command
    - [[HYPERLINK TAGGING,TOC]] command

- New and improved command line qualifiers

    - /CONSOLE (see 4.2.2.2)
    - /LIST (see 4.2.2.4)
    - /OUT=filename (see 4.2.2.8)
    - /SILENT (see 4.2.2.10)
    - (improved) /LOG=<filespec> (see 4.2.2.7). You can now specify the log filename
    - (improved) /POLICY=filename (see 4.2.2.9). You can now specify the created policy filename

- New overview "look for" analysis policies :-

    - [[HYPERLINK POLICY,"Look for indentation"]]
    - [[HYPERLINK POLICY,"Look for paragraphs"]]
    - [[HYPERLINK POLICY,"Look for short lines"]]
    - [[HYPERLINK POLICY,"Look for quoted text"]]
    - [[HYPERLINK POLICY,"Look for preformatted text"]]
    - [[HYPERLINK POLICY,"Look for mail headers"]]
    - [[HYPERLINK POLICY,"Look for horizontal rules"]] and [[HYPERLINK POLICY,"Minimum ruler length"]]
    - [[HYPERLINK POLICY,"Look for MAIL and USENET headers"]]
    - [[HYPERLINK POLICY,"Look for bullets"]]
    - [[HYPERLINK POLICY,"Look for hanging paragraphs"]]
    - [[HYPERLINK POLICY,"Look for white space"]]

- Other new analysis policies :-

    - [[HYPERLINK POLICY,"Input file has page markers"]] and [[HYPERLINK POLICY,"Page marker size (in lines)"]]
    - [[HYPERLINK POLICY,"Input file is double spaced"]]
    - [[HYPERLINK POLICY,"Recognise '-' as a bullet"]]
    - [[HYPERLINK POLICY,"Recognise 'o' as a bullet"]]

- New diagnostic policies :-

    - [[HYPERLINK POLICY,"Monitor tag generation"]]
    - [[HYPERLINK POLICY,"GOTO Display messages"]] policy and /SILENT qualifier (see 4.2.2.10)
    - [[HYPERLINK POLICY,"Suppress INFO messages"]],
    - [[HYPERLINK POLICY,"Suppress TAG ERROR messages"]]
    - [[HYPERLINK
POLICY,"Suppress URL messages"]] - [[HYPERLINK POLICY,"Suppress WARNING messages"]] - [[HYPERLINK POLICY,"Suppress program ERROR messages"]] - Other new output policies :- - [[HYPERLINK POLICY,"Create a log file"]] and [[HYPERLINK POLICY,"Output log filename"]] - [[HYPERLINK POLICY,"Maximum level to show in contents"]] - [[HYPERLINK POLICY,"Preserve underlining of headings"]] - [[HYPERLINK POLICY,"Use <EM> and <STRONG> markup"]] - [[HYPERLINK POLICY,"Colour data rows"]] and related policies (see [[GOTO TABLE generation policies]]). - [[HYPERLINK POLICY,"Default TABLE cell alignment"]] and [[HYPERLINK TAGGING,TABLE_CELL_ALIGN]] directive - [[HYPERLINK POLICY,"Suppress all colour markup"]] - [[HYPERLINK POLICY,"Open link in new browser window"]] and [[HYPERLINK POLICY,"new browser window name"]] - [[HYPERLINK POLICY,"Break up long HTML lines"]] 11.4.2 Other changes _On the web site, and documentation_ - A dedicated site www.jafsoft.com now deals with AscToHTM and related products. - An [updates] page has been added to the Web site. This will list all the updates available for AscToHTM, although in most cases you'll need to be a registered user to receive details for you to obtain the update. - An [FAQ] has been added to the web site. It's not finished yet (what part of the web is?), but it may help answer some of your questions. - Created a new document called "The [Policy Manual]". This replaces what was becoming the largest section of this document. _Windows version_ - The Windows help file now has a better Index. It also has a full contents list as a topic, showing you the structure of the RTF file used to generate the Help file. Unfortunately I've been unable to hyperlink this topic. - The Windows version now "remembers" which options page you were on so that each time you go back there the same sheet is shown. - The Windows version is now "statically linked" against the necessary .DLLs. 
This makes the program slightly larger, but makes the download smaller as it is no longer necessary to ship .DLLs with the program. This makes overall version management simpler. _VMS version_ - The VMS version now converts all filenames to lower case internally. This is so that all hyperlinks and references to the file are in lower case, making them more Internet-friendly and portable to other systems. _All versions_ - Changes to the tagging to aid standards compliance and CSS support. This includes the addition of the </P> tag, which was previously omitted. These changes have introduced slight differences in the amount of vertical white spacing produced in places. - Improvements have been made to the file splitting algorithms. In particular - The program will no longer generate two output pages with the same name. Where duplicate names are detected, the second file is given a generated name, usually by appending "_n" (n=1,2,3...) to the filename. All hyperlinks pointing to sections in the duplicate file will be adjusted accordingly. - A file with underlined headings can now be split into pages at the heading boundaries. The subsequent pages have _U1, _U2... appended to the name of the first page. - Local links (i.e. to anchors in the same file) are now recognised as such, and the filename is omitted. This should make it easier to rename files after production without breaking local hyperlinks. Links to/from other files would still stop working though. - Link names for underlined or capitalised headings that are more than 60 characters long are now truncated. They are given a link name derived from the first 30 characters of the section name with a unique identifier tagged on the end. This avoids long link names being split over two or more lines and becoming unusable. - Allow relative links to subtract out filename (e.g.
in contents list) when target is in same file - Can now recognise URLs with commas in them, such as http://cgi.pathfinder.com/netly/opinion/0,1042,1692,00.html in addition to comma separated lists of URLs. - The KEYWORDS, DESCRIPTION and TITLE pre-processor commands can now be multi-line. This allows long lists of keywords to be placed over several lines (each beginning with the command), making them easier to manage. - The default name for the directory index file is now "dirindex.html" rather than "index.html" to prevent overwriting of any existing index file. - Program now always does a "contents pass". Benefits of this are - can now generate in situ contents lists/contents bars - can now generate navigation bars wherever wanted - can now eliminate duplicate filename generation - can check hyperlink cross references are correct - Improved table/diagram recognition - Now support conversion of tab-delimited data into tables, provided it's placed inside [[HYPERLINK TAGGING,BEGIN/END_DELIMITED_TABLE]] directives - Relaxed indentation test on "n.n" headings. A heading can now be 2 characters to the left, or 1 character to the right, of the expected position - Now recognise use of asterisk and underscore combined to produce bold-italic emphasis. Previously only asterisk (bold) and underscore (italic) by themselves were recognised. - Now recognise "]" as a possible "quoting" character. - Now recognise '+' as an underlining character - Improved error reporting when file errors occur. The program will now abort the conversion on error, instead of continuing and reporting errors for each line. - Now detect read-only output directories and abort conversion. This would occur if you tried to convert a file on CD. - Definitions now use <DL compact> offering a more-faithful rendition of the original text - Underlined heading and text will now be rendered as underlined by default. Previously this either promoted the previous line to be a heading, or was drawn as a line.
- Improved handling of first line indents on paragraphs. Now these are preserved in the output by the inclusion of &nbsp; characters, and the error whereby the following line was deemed to have a different indentation (and thus acquired a <BLOCKQUOTE>) has been largely solved. - Introduction of the TEXT in-line tag (see 7.2.2) now allows numbers like Windows [[TEXT 3.1]] to be protected from conversion into a hyperlink to section 3.1. 11.5 Version 3.0 (August '98) There are a fair number of small changes in functionality over V2.3, together with a fair number of bugfixes and refined algorithms. A lot of development during this time was directed towards the production of a text-to-RTF converter ([AscToRTF]) using the same analysis engine. Consequently there are a lot of changes "under the bonnet". The main functional change has been the revamp of the Windows User Interface. A new section (4.1.2) has been added to this document describing the Windows interface in some detail. The changes include :- - the button bar is replaced by a proper Windows menu, allowing easier access to the program's functions. - under the Help menu a link to the HTML documentation shipped with the software is now provided. - the policy sheets are now "non-modal". This means you no longer have to dismiss them in order to do a conversion; you can leave them up whilst the conversion is going on, making it easier to go through the convert-change policy-convert cycle. 11.5.1 New functions _Windows Version_ - Major re-structuring of the user interface (see 4.1.2) - Program's Help options now provide access to the online and offline versions of the HTML doco. A lot of people were downloading the software and then picking up a version of the doco, unaware that they already had it. Don't you people read README.TXT files or what?
_All Versions_ - New [[HYPERLINK POLICY,"Search for Definitions"]] policy - New [[HYPERLINK POLICY,"TAB size"]] policy - New [[HYPERLINK POLICY,"Expect sparse tables"]] policy and [[HYPERLINK TAGGING,TABLE_MAY_BE_SPARSE]] pre-processor command - New [[HYPERLINK POLICY,"Add <BR> to lines with URLs"]] policy - New [[HYPERLINK POLICY,"Output file extension"]] policy - New [[HYPERLINK POLICY,"Minimise HTML file size"]] policy - New [[HYPERLINK POLICY,"Headings colour"]] policy. Eventually I hope to add a whole suite of heading styling options, as these have been requested by a number of people. - New [[HYPERLINK POLICY,"Convert TABLE X-refs to links"]] policy and [[HYPERLINK TAGGING,TABLE_CONVERT_XREFS]] pre-processor command - New [[HYPERLINK TAGGING,CHANGE_POLICY]] pre-processor command - New [[HYPERLINK POLICY,"Error reporting level"]] policy 11.5.2 Other changes - Improved Windows interface - Empty lines in a table cell now get an extra &nbsp; added, in addition to the <BR>. This is to compensate for a bug in Internet Explorer 3 which would ignore the <BR> otherwise, leading to alignment errors. - Now treat phrases with all the words connected by underscores, and with underscores at both ends as well as underlined e.g. _this_type_of_thing_ - Improved handling of tables with long urls in them. Previously these would not be recognised as part of a table. Increased "long line" limit inside tables to 110 characters - Improved error reporting/handling - Report unrecognised pre-processor lines - Report results of table analysis (e.g. if diagrams are detected) - Report failure to find requested files - Abort conversion if can't find requested policy file - Improved detection of "mal-formed" tables. Previously this was over-cautious, especially on short tables. - Now add a trailing "/" to www etc URLs if none present (e.g. www.jafsoft.com). This is a more correct URL, which should be accessed slightly more efficiently. - Now recognised "....." 
underlining, although why people do this is beyond me :) - Improved contents list detection in short documents with only level one headings, and documents with a chapter "0". - Improved headings detection in small files. Made this less trigger happy. - Improved code detection, and now add bold emphasis of C++ like comments inside a code section - No longer allow "{" and "}" to be detected as probable bullet characters when code is expected - I've produced (with help from antipodean friends) an icon for files converted by AscToHTM. It's called a2hlogo.gif. Feel free to use it should you wish on any pages created with AscToHTM. An example piece of HTML code would be $_$_BEGIN_PRE <A HREF="http://www.jafsoft.com/asctohtm/?from=doco"> <IMG SRC="a2hlogo.jpg" WIDTH=100 HEIGHT=36 BORDER=0 ALT="Converted by AscToHTM"></A> $_$_END_PRE - With the introduction of the [[HYPERLINK POLICY,"Add <BR> to lines with URLs"]] policy this behaviour is no longer default. That is, if you *do* want <BR> added at the end of all lines containing URLs you will need to switch this behaviour on using the new policy. - With the introduction of the [[HYPERLINK POLICY,"Convert TABLE X-refs to links"]] policy this behaviour is no longer default. That is, if you *do* want section links inside your tables, you will need to switch this behaviour on using the new policy. - ".htm" files are now with a lowercase extension, unless [[HYPERLINK POLICY,"Use DOS filenames"]] policy selected 11.6 Version 2.3 (late April '98) Minor bugfixes and upgraded functionality over V2.2. The main functional changes have been a) The introduction of wildcard support to allow conversion of multiple files at once. b) (related to the above) the introduction of the [[GOTO Directory Page]] feature that allows the generation of a hyperlinked document spanning all the files in a directory. c) Major re-write of the contents-list generating routines. 
The program now makes a third, intermediate, pass through the document to analyse the contents structure. This means that contents lists are now placed at the top of the HTML file by default, rather than in a separate file as previously - though that behaviour is still supported if wanted. This approach is expected to pay further dividends in later releases. 11.6.1 New functions _Windows Version_ - Added a "Preform simple conversion" tick box on the front panel. This does exactly the same as the [[HYPERLINK POLICY,"Keep it simple"]] policy. - Improved the Headings dialog to allow headings policies to be edited more easily. - Pre-processor document sections now working. _All versions_ - Wildcard support has been added (see 4.3.3.1). - Major re-writing of contents list generation has occurred (see 3.4.2). Includes new [[HYPERLINK POLICY,"Use any existing contents list"]] and [[HYPERLINK POLICY,"Generate external contents file"]]. More changes are expected here in later versions. - New [[GOTO Directory Page]] feature. Supporting policies include:-
$_$_CHANGE_POLICY preserve line structure: yes
[[HYPERLINK POLICY,"Make Directory"]]
[[HYPERLINK POLICY,"Directory filename"]]
[[HYPERLINK POLICY,"Show file titles in Directory"]]
[[HYPERLINK POLICY,"Indent headings in Directory"]]
[[HYPERLINK POLICY,"Directory title"]]
[[HYPERLINK POLICY,"Directory keywords"]]
[[HYPERLINK POLICY,"Directory description"]]
[[HYPERLINK POLICY,"Directory return hyperlink text"]]
[[HYPERLINK POLICY,"Directory Script file"]]
[[HYPERLINK POLICY,"Directory header file"]]
[[HYPERLINK POLICY,"Directory footer file"]]
$_$_CHANGE_POLICY preserve line structure: no
- New [[HYPERLINK POLICY,"Minimum TABLE column separation"]] policy and [[HYPERLINK TAGGING,TABLE_MIN_COLUMN_SEPARATION]] pre-processor command to allow some tuning of table analysis.
- New [[HYPERLINK POLICY,"Use first heading as title"]] policy - New [[HYPERLINK POLICY,"Use first line as title"]] policy - New [[HYPERLINK POLICY,"Recognised USENET groups"]] policy - New [[HYPERLINK POLICY,"Automatic centring tolerance"]] policy - New [[HYPERLINK POLICY,"Use <P> markup for paragraphs"]] policy to allow choice of either <P> or <BR> markup to be used for paragraphs. - New [[HYPERLINK POLICY,"Default table width"]] policy and [[HYPERLINK TAGGING,TABLE_WIDTH]] pre-processor command to allow table widths to be specified as percentages - New pre-processor command [[HYPERLINK TAGGING,HTML_LINE]] 11.6.2 Other changes - Reinstated some of the "error" messages removed in the last version, to do with section numbering. This should make it more visible when the section heading analysis goes wrong. - Added error reporting to file open. You should now get an error message if the program fails to find/open a file somewhere. - Now support headings down to 5 levels (previously this was 4). Note, if you only have a couple at this level, the program may still ignore them as statistically insignificant. - Removed certain policies (such as "generate policy file") from the output when generating a full policy file. This is because, when they were read back in, they could cause problems. - The "Include document section" policy is now renamed to "Include document section(s)" reflecting the fact that you can now enter multiple values on one line, rather than requiring multiple lines with one value each as previously. - Major re-structuring and additions to [[GOTO HTML markup produced]] to make the section more coherent and up to date. Some of the sections marked as new in this version are simply the documentation catching up on the features added in earlier releases. Sometimes I just work too hard :^) 11.7 Version 2.20 (Feb '98) First major release after V2.0 (when AscToHTM first went fully-Windowed). 
The major change this time has been the introduction of TABLE generating algorithms. These were first made available as a separate freeware utility [AscToTab]. This version was reviewed by ZDNet and awarded 5 stars, their highest award. 11.7.1 New functions _Table generation_ This is the biggest change in this version. AscToHTM now incorporates the technology first introduced in [AscToTab]. To support this the detection of pre-formatted text has been improved, new policies added, and new preprocessor commands added. New policies include :- [[HYPERLINK POLICY,"Attempt TABLE generation"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE border size"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE header rows"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE header cols"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE cell spacing"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE cell padding"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE colour"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE border colour"]] [[BR]] [[HYPERLINK POLICY,"Default TABLE caption"]] [[BR]] New Pre-processor commands include :- [[HYPERLINK TAGGING,BEGIN/END_CODE]] [[BR]] [[HYPERLINK TAGGING,BEGIN/END_DIAGRAM]] [[BR]] [[HYPERLINK TAGGING,BEGIN/END_TABLE]] [[BR]] [[HYPERLINK TAGGING,TABLE_BORDER]] [[BR]] [[HYPERLINK TAGGING,TABLE_BORDERCOLOR]] [[BR]] [[HYPERLINK TAGGING,TABLE_BGCOLOR]] [[BR]] [[HYPERLINK TAGGING,TABLE_CAPTION]] [[BR]] [[HYPERLINK TAGGING,TABLE_CELLSPACING]] [[BR]] [[HYPERLINK TAGGING,TABLE_CELLPADDING]] [[BR]] [[HYPERLINK TAGGING,TABLE_HEADER_ROWS]] [[BR]] [[HYPERLINK TAGGING,TABLE_HEADER_COLS]] [[BR]] _Other changes_ - Added a policy to allow <CODE> markup to be used for code fragments in the document (see [[GOTO HTML Styling Policies]]) - Added pre-processor [[HYPERLINK TAGGING,BEGIN/END_CODE]] commands to allow sections of code samples to be identified and distinguished from tables - Added pre-processor [[HYPERLINK TAGGING,BEGIN/END_DIAGRAM]] commands to allow diagrams and sections of ASCII art to be identified and
distinguished from tables 11.7.2 Other changes _Documentation_ - Added the "Policy Dictionary" (since superseded by the [Policy Manual]), and renumbered the document accordingly. _All versions_ - "tables/pre-formatted text" - Various improvements to detecting the start and end of pre-formatted regions of text. - Shareware now expires after 30 days, rather than after a fixed date. - Headings policies have been revised. Still more work to be done in this area. - Slight improvement in detection of centred text. Still not good enough to offer as a default though (too prone to errors). - Added section on saving/using policy files (see 6.5) - Shareware version now adds nag lines at top and bottom of the page, instead of just the top. - A number of improvements in code sample detection - Reduced number of "error" messages reported. These may be made optional in a later version, and are still placed in the diagnostic files if these are created. _Windows version_ - Added a "Settings" dialog to allow you to configure various aspects of how the program runs, such as what browser to view files with, what policy file to use as default, etc. - New /COMMA (see 4.2.2.1) and /TABBED (see 4.2.2.12) command line qualifiers that allow comma delimited and tab delimited files to be converted into tables. 11.8 Version 2.10 (never officially released) V2.1 was never officially released, but much of this functionality "crept out" as the shareware version was updated. Some of these versions were shown as V2.01 instead of V2.1. There's nothing like a bit of consistency (and yeah, this was *nothing* like a bit of consistency). 11.8.1 New functions - New [[HYPERLINK POLICY,"Document keywords"]] policy and pre-processor [[HYPERLINK TAGGING,KEYWORDS]] command. - New [[HYPERLINK POLICY,"Document description"]] policy and pre-processor [[HYPERLINK TAGGING,DESCRIPTION]] command.
- New [[HYPERLINK POLICY,"Hyperlinks on Numbers"]] contents policy - New [[HYPERLINK POLICY,"Document style sheet"]] policy and pre-processor [[HYPERLINK TAGGING,STYLE_SHEET]] command. 11.8.2 Other changes _All versions_ - Now recognise domain names without a protocol specified (such as http:// or ftp:// etc.) that end in standard domains (e.g. .edu, .net, .org etc) as probable FTP sites. This allows references to sites like rtfm.mit.edu to be correctly turned into hyperlinks. - Some renumbering of this document has occurred - Quoted text is now marked up using <em>..</em> markup _Windows version_ - Now stores data in the Registry under the HKEY_CURRENT_USER root with a "\Software\JafSoft\AscToHTM\..." key - Now supports "most recently used" lists for both policy files and files to be converted. These are accessed via a drop-down Combo box. - Now remembers last source directory each time the program is run. This is used as the initial directory next time the Browse button is pressed. - The filenames now include the path. This is to allow the most recently used (MRU) file drop-down list to function correctly. 11.9 Version 2.00 (October '97) Version 2.0 marks the production of the first fully-windowed version for Windows 95/NT. This took a few months to be produced, so a fair number of other features have been added over this time. 11.9.1 New functions - New [[HYPERLINK POLICY,"Output policy filename"]] policy - New [[HYPERLINK POLICY,"Use .HTM extension"]] policy - New [[HYPERLINK POLICY,"Generate diagnostics files"]] policy - New [[HYPERLINK POLICY,"External contents list filename"]] policy - New [[HYPERLINK POLICY,"Use <DL> markup for defn.
paras"]] policy - New [[HYPERLINK POLICY,"Ignore multiple blank lines"]] policy - New [[HYPERLINK POLICY,"Search for emphasis"]] policy - New [[HYPERLINK POLICY,"Allow definitions inside PRE"]] policy - New Pre-processor [[HYPERLINK TAGGING,BEGIN/END_CONTENTS]] command - New Pre-processor [[HYPERLINK TAGGING,BEGIN/END_HTML]] command - New Pre-processor [[HYPERLINK TAGGING,TITLE]] command - New Pre-processor [[HYPERLINK TAGGING,INCLUDE]] command 11.9.2 Other changes - White space immediately adjacent to PRE sections now ignored. - Changed anchor names to contain no spaces (makes URLs easier to quote) - Title defaults to "Converted from filename" instead of "No title" (see also 7.1.2) - Introduced some support for use of ctrl-H (backspace) in Unix documents to underline and highlight words - Automated "simple" file detection now attempted - Automated "code samples" detection now attempted - Some policies have been renamed as follows :-

    Was                         Now
    ---                         ---
    Expect Numbered sections    Expect Numbered Headings
    HTML header                 HTML header file
    HTML footer                 HTML footer file

- The policy section headings have been renamed as well. This may cause "ignored policy line" messages when old policy files are used. 11.10 Version 1.1 (August '97) 11.10.1 New functions - Added a [[HYPERLINK POLICY,"Only use known groups"]] policy to improve accuracy of newsgroup hyperlink detection. - Added more document colour policies - Added a /POLICY and [[HYPERLINK POLICY,"Output Policy file"]] option (see 4.2.2.9) to make the generation of an output policy file optional - Added preprocessor support for user-formatted sections (see 7.1.1) 11.10.2 Other changes - Indentation is now done using <BLOCKQUOTE> markup. - Changed default background colour to white. - Generation of a .pol file is no longer default (see 4.2.2.9) - The use of <PRE> ...
</PRE> to mark up user-formatted text is replaced by the new preprocessor commands [[HYPERLINK TAGGING,BEGIN/END_PRE]] - re-write of section 4.1 - Improved error reporting. The .LIS file created if the /DEBUG qualifier is used (see 4.2.2.4) now has error and information messages included in it. 11.11 Version 1.05 (late July '97) 11.11.1 New functions - Added an [[HYPERLINK POLICY,"Output directory"]] policy. This allows redirection of output to a directory different from that containing the source files. Note: This functionality may not be available in the shareware version of the software. - Added an [[HYPERLINK POLICY,"Output policy"]] policy. This allows the suppression of output policy files where not wanted. - Added a [[HYPERLINK POLICY,"Expect code samples"]] policy. This helps in technical documents that include samples of C code. - Added preprocessor support to allow variant documents to be produced (see [[GOTO Preprocessor policies]] and [[GOTO Using the preprocessor]]) 11.11.2 Other changes - Policies now accept "Yes/No" as well as "True/False". "Yes/No" is now the default when outputting policies. - shareware version now limited to processing the first 500 lines only. - Lines with email addresses no longer have <BR>'s forced on the end. Lines with http, ftp and news links still do. This will become fully configurable in later versions. 11.12 Version 1.04 (early July '97) 11.12.1 New functions - Added policy [[HYPERLINK POLICY,"Minimum automatic <PRE> size"]]. This replaces the policy "Allow automatic 1-line <PRE>" - Added policies [[HYPERLINK POLICY,"Largest allowed <Hn> tag"]] and [[HYPERLINK POLICY,"Smallest allowed <Hn> tag"]] to allow control over generated heading sizes. - Added policy [[HYPERLINK POLICY,"Short line length"]] - Added Batch processing to allow multiple files to be converted at the same time. (see 4.3.3.2) 11.12.2 Other changes - Created a 16-bit DOS version - VMS version now available as freeware. 
- Added "SendTo" tips for Windows 95/NT users section to the documentation (see 4.4.4) 11.13 Version 1.01 (April '97) New functions - Added the /CONTENTS qualifiers (see 4.2.2.3). - Added the /SIMPLE qualifier (see 4.2.2.11).
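
As an illustration of two of the naming rules listed in section 11.4.2 (generated names for duplicate output pages, and truncation of long link names), the following Python sketch shows one way such rules could be expressed. It is not AscToHTM's actual code. The function names, the placement of the "_n" suffix before the file extension, the use of underscores in place of spaces, and the form of the unique identifier are all assumptions made for the example.

```python
# Illustrative sketch only; names and details are hypothetical, not AscToHTM's.

MAX_LINK_NAME = 60   # link names longer than this are truncated (11.4.2)
KEPT_PREFIX = 30     # number of leading characters kept after truncation


def unique_page_name(name, used):
    """Return a page name not already in `used`, recording it in `used`.

    A duplicate gets "_n" (n=1,2,3...) appended. Putting the suffix before
    the extension is an assumption; the manual only says "to the filename".
    """
    stem, dot, ext = name.rpartition(".")
    if not dot:                      # no extension present
        stem, ext = name, ""
    candidate = name
    n = 0
    while candidate in used:
        n += 1
        candidate = f"{stem}_{n}{dot}{ext}"
    used.add(candidate)
    return candidate


def link_name(heading, serial):
    """Build a link name for a heading.

    Spaces become underscores (an assumed way of keeping anchor names free
    of spaces). Names over MAX_LINK_NAME characters are cut to KEPT_PREFIX
    characters with `serial` (an assumed unique identifier) tagged on the end.
    """
    name = heading.strip().replace(" ", "_")
    if len(name) <= MAX_LINK_NAME:
        return name
    return f"{name[:KEPT_PREFIX]}_{serial}"


if __name__ == "__main__":
    pages = set()
    for page in ("intro.html", "intro.html", "intro.html"):
        print(unique_page_name(page, pages))
    print(link_name("An extremely long underlined heading that runs well past sixty characters", 12))
```

Keeping the set of used names outside the function means one set can be shared across a whole conversion run, which is what allows hyperlinks to the renamed page to be adjusted consistently afterwards.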