FileMerlin™
Advanced File Conversion Software for Windows Computers
version 8.0
by

656 Kreag Road
Pittsford, NY 14534-3730
Phone: (585)-385-3810
Fax: (585)-385-6822
Copyright © Advanced Computer Innovations, Inc., 2008
FileMerlin™ is a trademark of Advanced Computer Innovations, Inc.
This chapter provides general information about FileMerlin™ and why you should use it.
FileMerlin™ software accurately converts a very wide variety of file types between different applications, including all popular legacy and modern word processors, as well as some of the popular spreadsheet and database formats.
In addition to applications such as word processors, it can convert to other important file formats such as PDF (Adobe Portable Document Format), HTML (for web publishing), XML (for structured processing of documents or data) and simple text or comma-separated values (e.g., for programmatic examination of content).
It also converts many embedded or linked objects such as spreadsheets and presentations embedded in MS Word and MS Works documents, as well as graphics and pictures embedded in or linked to many kinds of word processor documents.
Further, if the file type is not recognized or supported, FileMerlin™ offers the option to extract whatever text content is possible and put it into a file format of your choice.
Although some programs may provide the ability to import or export other file formats, FileMerlin™ offers many significant advantages over these built-in filters:
Very accurate and complete conversion - FileMerlin™ converts not only the text, but in most cases also practically all formatting and special functions such as tables, graphics, autonumbering, fonts, hanging indents, etc. to the extent that they are supported or permitted in the destination format. In general, the conversion is considerably superior to anything available using word processor built-in filters or other alternatives.
Stand-alone conversion – Unlike some other conversion products on the market, FileMerlin™ does not require the source or destination application to be installed. It is a purely stand-alone and self-sufficient convertor. For example, other PDF convertors which work as printer drivers require you to open the document to be converted in the application that created it, then “print” it to a PDF file. With FileMerlin™, the application that created the document is not needed. Instead, FileMerlin™ reads the document and converts it directly.
Very easy to use - FileMerlin™ presents a straightforward and easy user interface with no training required. Various levels of online help are available, including instant "quick-help" using the right mouse button. You may convert a single file or thousands of files with a few simple mouse clicks, including files in nested folders.
Automatic determination of source file format - FileMerlin™ can automatically determine the source file type and revision of most file formats, and provide the appropriate conversion without any effort on your part.
Extensive and flexible batch conversion support - In addition to conversion of multiple files through the user interface, FileMerlin™ also supports a command-line batch mode for unattended operation.
Extremely fast – With conversion speed in the range of 10 to 100 pages per second, FileMerlin™ is faster than any other product offering this level of conversion detail and accuracy.
Unique AutoServe mode - In this mode, FileMerlin™ can run silently in the background and convert files as and when they arrive without any user or programmatic intervention.
Very customizable - FileMerlin™ provides two levels of customization: (1) a very easy-to-use and efficient user interface for commonly desired conversion customization; and (2) an advanced command-based interface for advanced users that enables customization of over a thousand conversion parameters.
Full Unicode Support - for proper handling of many non-Latin character sets such as Hebrew, Greek and Cyrillic.
Also converts graphics formats - FileMerlin™ also converts most graphics images. This includes graphics images that are linked to the original document, as well as those embedded into the source document.
Can extract text and data - FileMerlin™ includes the ability to extract document content to simple formatted or unformatted text files, as well as the ability to extract tabular content into a generic text data file such as comma-separated, tab-separated, quote-delimited, etc. This includes extraction of content to generic Unicode for languages using non-Latin character sets.
Integrates well into your system - FileMerlin™ integrates with your Windows system. For example, you may send a file to FileMerlin™ for conversion while browsing in Windows Explorer using the Send To option; or, you may drag a file to the FileMerlin™ icon for instant conversion.
API available for programmers - If you are a programmer or software developer, you may purchase a FileMerlin™ developer or AutoServe license, which lets you access the conversion functionality from your own application by making direct calls to the conversion engine DLLs using the documented API. This opens up a host of possibilities and relieves you from having to write and maintain your own conversion code.
FileMerlin™ can convert your documents to well-formatted HTML level 4 for instant web publishing. It supports the advanced features afforded by this level of HTML, including the facility for complex formatting using cascading style sheets (CSS).
FileMerlin™ can automatically embellish the converted web pages with features such as background textures, selective text coloring, table of contents and index, etc., letting you create attractive and professional-looking web pages without any additional effort.
When converting graphics for web publishing, FileMerlin™ automatically extracts or converts the pictures to .JPG or .GIF files if necessary, so that all popular browsers can display them correctly. It also places the graphics files into appropriate folders and automatically generates the links to them in the HTML file.
FileMerlin™ can convert your documents to MS Word 97, 2000, 2002 (XP), 2003 or 2007, taking advantage of the advanced formatting and layout features offered by this word processor. FileMerlin™ generates true MS Word files in the native .DOC file format, so no additional conversion is required. It also gives you the option of using Windows (ANSI) or Unicode as the primary coding mode, thus enabling accurate conversion of documents using non-Latin character sets such as Hebrew, Greek and the Cyrillic languages.
In addition to converting to the native Microsoft Word .DOC file format, FileMerlin™ provides the option of converting to the Microsoft Office 2003 XML format as well as to the Microsoft Rich Text format. This opens up the possibility of converting documents to any software application that can read one of these file formats.
FileMerlin™ can convert your documents to WordPerfect 6 or higher (including all versions of Corel WordPerfect), taking advantage of the advanced formatting and layout features offered by this word processor. FileMerlin™ generates true WordPerfect files in the native .WPD file format, so no additional conversion is required. Files produced by FileMerlin™ are also compatible with WordPerfect 6 for DOS.
FileMerlin™ can convert your documents to PDF (Adobe Portable Document Format) without requiring you to open and print the document in the application that created it. It is up to 100 times faster than convertors that work as printer drivers, and in addition offers customizability letting you tweak the conversion to suit your particular requirements.
FileMerlin™ can convert documents, HTML files, data files and spreadsheets to unformatted text (e.g., for content examination) as well as to formatted text (e.g., to create a presentable plain text file with appropriate indents, tab positioning, tables with layout, etc.). It can also convert word processing or HTML tables and spreadsheets to a plain text flat file format such as comma-separated, quote-delimited, tab-separated, etc. In all cases, you can select between pure ASCII-only output, output conforming to any popularly used code page (such as Windows, ANSI, code page 850 and many others), as well as any Unicode output format.
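To give a feel for what these flat-file outputs look like, here is a minimal Python sketch. It is not FileMerlin™ code and does not use FileMerlin™ in any way; the sample table and the helper name are invented for illustration. It writes one small table as comma-separated, tab-separated and quote-delimited text, then encodes the result for code page 850, Unicode (UTF-16) and pure ASCII.

```python
import csv
import io

# Hypothetical sample table, invented for this illustration.
TABLE = [["Name", "City"], ["Zoë", "Québec"]]


def table_to_flat_text(rows, delimiter=",", quote_all=False):
    """Render rows as delimited text, in memory."""
    buffer = io.StringIO()
    quoting = csv.QUOTE_ALL if quote_all else csv.QUOTE_MINIMAL
    writer = csv.writer(
        buffer, delimiter=delimiter, quoting=quoting, lineterminator="\r\n"
    )
    writer.writerows(rows)
    return buffer.getvalue()


comma_separated = table_to_flat_text(TABLE)
tab_separated = table_to_flat_text(TABLE, delimiter="\t")
quote_delimited = table_to_flat_text(TABLE, quote_all=True)

# The same text can be encoded for a particular code page or for Unicode.
as_cp850 = comma_separated.encode("cp850")
as_utf16 = comma_separated.encode("utf-16")

# ASCII-only output must decide what to do with characters it cannot hold;
# in this sketch they are replaced with "?".
as_ascii = comma_separated.encode("ascii", errors="replace")
```

The choice of encoding matters most for accented and non-Latin characters: code page 850 can hold the accented letters in this sample, while ASCII cannot.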
FileMerlin™ is modular in functionality, and you may purchase only the functions you need. Currently, FileMerlin™ is available with the following functions:
Web - This function lets you convert documents to HTML for publishing well-formatted pages on the web. It also lets you convert to simple text (e.g., for programmatic examination of file content).
Word - This function lets you convert documents to MS Word 97, 2000, 2002 (XP), 2003 and 2007, with text and formatting intact (including complex and advanced formatting). It supports conversion to the Microsoft Word native .DOC format, as well as to Microsoft Office 2003 XML format and Microsoft Rich Text format (RTF). It also lets you convert to simple text.
WordPerfect - This function lets you convert documents to WordPerfect 6 or higher, with text and formatting intact (including complex and advanced formatting). It supports conversion to the WordPerfect native .WPD format. It also lets you convert to simple text.
PDF - This function lets you convert documents to PDF (Adobe Portable Document Format). It also lets you convert to simple text.
When you order FileMerlin™, you may select one or more of these functions. You pay only for the functions you order, and there is a discount for ordering more than one function.
FileMerlin™ is available in single-user as well as multi-user (network) versions. Site licensing is available to users requiring a large number of copies. Programmers, software developers and webmasters wishing to incorporate this conversion technology into their own programs or web sites may license it from Advanced Computer Innovations, Inc. under very flexible and reasonable terms incorporated into a Developer, AutoServe or OEM license.
FileMerlin™ runs on any computer using Windows 95, 98, Me, XP, NT, 2000, 2003 or Vista, as well as 32-bit Windows emulators. It imposes no special memory or other requirements.
This chapter discusses how to install FileMerlin™ from CD or by downloading, how to run it and, if applicable, purchase and enable the fully working copy.
Insert the Advanced Computer Innovations, Inc. CD into your drive. In most cases, an installation dialog box appears automatically. If it does not appear in a few seconds, run the program SETUP.EXE on the CD.
In the installation dialog box, click FileMerlin™.
This starts the FileMerlin™ Setup Program. Respond to the dialogs on the screen, and FileMerlin™ will be installed on your hard disk.
If you download the FileMerlin™ software, you obtain a file named FMERLIN.EXE. Run this file by locating it with Windows Explorer and double-clicking it. This starts the FileMerlin™ Setup Program. Respond to the dialogs on the screen, and FileMerlin™ will be installed on your hard disk.
The Setup Program creates a desktop group named FileMerlin. In this group it places various icons/shortcuts that let you easily run the program, bring up the program documentation as well as network usage documentation and, if required, uninstall the program. The Setup Program then starts FileMerlin™.
FileMerlin™ starts automatically immediately after installation. When it exits, it leaves an icon for its folder on the desktop. To run it later, open this folder and click the FileMerlin™ program in it; or, click Start in the Windows taskbar, then click Programs, FileMerlin and FileMerlin Program. When FileMerlin™ is run, it presents its Main Window which looks like this:

The factory-shipped copy of FileMerlin™ first runs in the Trial Mode. In this mode, it introduces spelling and numeric inaccuracies in the converted file, but operates in every other way like the unrestricted version. This lets you see how the program works and judge the quality of conversion. Once you decide to purchase the software, you may convert the installed software to the fully functional mode by entering a special 10-character key code. To do this, click Purchase Now in the upper right corner of the main panel. This brings up the following dialog box:

If you have already purchased the software, you will find the key code in a sealed envelope that came with the product, or you may have obtained this key code by phone, email or other means. In this case, click Manual Purchase, and enter the key code into the slot that appears. This automatically enables FileMerlin™ for full functionality.
If you do not have the key code but would like to purchase FileMerlin™ instantly with a credit card at any time, click Internet Purchase or Modem Purchase. This brings up an order form, then encrypts and sends the information to Advanced Computer Innovations, Inc.'s secure transaction servers over the internet or with a direct telephone call. Since your credit card number is either heavily encrypted or does not go over the Internet, there is no security compromise. As soon as the transaction is completed (which takes only a few seconds), FileMerlin™ is automatically enabled for full functionality. You should note down the key code which the program gives you, and keep it in a safe place. If you ever need to reinstall FileMerlin™ in the future, you may at that time enable FileMerlin™ for full functionality by executing a manual purchase using this key code that you have already paid for.
This chapter briefly discusses the general sequence of steps when using FileMerlin™, as well as some of its other features. More details on all these functions are given in later chapters.
In general, when using FileMerlin™ to convert files, you carry out the following steps:
Select or specify the source file(s) you wish to convert by clicking Files in the Source area.
Select or specify the destination format you wish to convert to by clicking Format in the Destination area.
Specify the destination for the converted file(s) by clicking Files in the Destination area.
Do the conversion by clicking Convert.
You can view any of the specified source or destination files by clicking the corresponding View button.
Beyond the basic steps, FileMerlin™ gives you a lot more options. Some of them are summarized below:
You may specify the wild card characters "*" and "?" as part of the source file name(s) to convert literally thousands of files in a single operation.
FileMerlin™ gives you a lot of flexibility in specifying how converted files should be named, including the use of "wild cards", the option to replace or extend the file name extension, as well as embedded sequence numbers to distinguish converted files that may otherwise have the same name. This is described in more detail later.
The check boxes in the Source and Destination areas of the main window let you convert files in the specified source folder as well as in all its nested folders automatically. Further, you have the option of automatically replicating the source folder tree structure in the destination folder.
The check boxes in the Options area, as well as further options brought up by clicking More Options, let you customize how FileMerlin™ operates. The easiest way to find out more about an option is to click the right mouse button on it.
To customize the way FileMerlin™ converts your files, click Customize to customize commonly used conversion parameters using a graphical user interface, or Advanced to effect advanced and extremely detailed customization of more than a thousand conversion parameters using scripted commands.
To view a file whose name appears in the Source or Destination area of the main window, click the associated View button. FileMerlin™ immediately launches, if possible, the application associated with the file.
FileMerlin™ may be run in the pure batch mode where all required parameters are specified on the command line and no further user interaction is required. This is described later.
FileMerlin™ may also be started indirectly to convert a file as follows:
When exploring in Windows Explorer, you may right-click a file to bring up a pop-up menu. One of the options in that menu is "Send To". If you select this option, FileMerlin™ is one of the targets to which you can send the file. Doing so sends that file to FileMerlin™, which opens up the FileMerlin™ main dialog box with the source file name already filled in, ready to be identified and converted.
You may drag a file to the FileMerlin™ icon. This also sends the file to FileMerlin™ for conversion, i.e., it opens up the FileMerlin™ main dialog box with the source file name already filled in, ready to be identified and converted.
You may copy a file and paste it on the FileMerlin™ icon. This also sends the file to FileMerlin™ for conversion, i.e., it opens up the FileMerlin™ main dialog box with the source file name already filled in, ready to be identified and converted.
Though very easy and intuitive to use, FileMerlin™ provides plenty of help online.
You may click the right mouse button on most dialog items to bring up quick help on that control. For example, right-clicking the Files button in the Source area brings up a quick help panel like this:

Some dialog items (such as edit fields) normally bring up a Windows properties menu when right-clicked. In such cases, right-click some other item first and then right-click that item.
When a quick help panel is displayed, the mouse cursor temporarily changes to a dot. To exit the quick help, simply click inside the displayed help panel.
For more detailed help, click the Help button. That brings up detailed hyperlinked and cross-indexed help which may be reviewed serially or searched by keywords as well as phrases.
This chapter discusses how you specify the source file(s) to be converted when using FileMerlin™ interactively. It also discusses how you may specify the source type (i.e., its file format) or have FileMerlin™ determine it automatically - a process known as AutoRecognition (see "Autorecognizing the Source File Type").
You select the source file(s) to be converted by clicking the File(s) button in the Source area of the main window. This displays the following standard Windows dialog box:

First use the buttons at the top of this dialog to navigate to the drive and directory containing your source files. The files and folders in this directory will be displayed in the large dialog box window.
To convert a single file, double-click it (or click it and then the OK button). To convert several consecutive files, click the first file, then hold down the Shift key and click the last file. To convert several non-consecutive files, hold down the Ctrl key while clicking them in turn.
You may also enter the source file name, or a file name template with * and ? wild cards, into the File Name entry field. Any wild card characters entered become a part of your specification and are subsequently used to select files during the conversion process.
Once the source file(s) have been selected, FileMerlin™ returns to the main window. At this point the Source area of the main window displays the source file name (if only one file was selected), or the source path with a count of files selected (if multiple files were selected), or the filename template if one was entered.
If you are sure of the source file(s) path and filename, you may enter it directly into the main window edit field without going through the above dialog box. This may also include the wild card characters * and ?.
If your source file specification includes one or more of the wild cards * and ?, then the Include Nested Folders check box in the Source area is enabled. Checking this box causes FileMerlin™ to convert all matching files in your specified folder as well as in all nested folders.
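The effect of a wild card template combined with the Include Nested Folders check box can be sketched in Python as follows. This is only an illustration of the idea, not how FileMerlin™ selects files; the function name is made up, and the matching rules of Python's fnmatch module (which also accepts [ ] character sets) may differ in detail from those FileMerlin™ applies.

```python
import fnmatch
import os


def select_source_files(folder, template, include_nested=False):
    """Return the sorted paths under `folder` whose name matches `template`."""
    matches = []
    for current, subfolders, files in os.walk(folder):
        for name in files:
            # Windows file names are case-insensitive, so compare in lower case.
            if fnmatch.fnmatchcase(name.lower(), template.lower()):
                matches.append(os.path.join(current, name))
        if not include_nested:
            # Emptying the list in place stops os.walk from descending further.
            subfolders.clear()
    return sorted(matches)
```

For example, select_source_files(r"C:\Docs", "*.doc", include_nested=True) would list every .doc file in C:\Docs and in all folders below it, while the default call looks only in C:\Docs itself.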
FileMerlin™ can directly access and convert files that reside on other computers networked to your computer. The computer may be specified using UNC (Universal Naming Convention) or by mapping it to a drive letter.
If the Auto-recognize Source File Format option has been checked in the Options area of the main window, FileMerlin™ attempts to determine the source file type automatically as discussed below:
If only one source file has been selected, FileMerlin™ attempts to determine its file type immediately when the selection is made. If the determination is successful, the recognized file format is displayed in the Source area of the main window.
If multiple source files have been selected, or if a wildcard template was entered for the source file(s), FileMerlin™ displays (Auto) as the source file(s) format in the main window. The autorecognition then takes place automatically when you try to convert the selected files using the Convert button.
FileMerlin™ recognizes file types by examining the contents of the file as well as the file name. In some cases (e.g., InterScript™ files) autorecognition is not possible because the source file does not contain any reliable signature or other indication of its origin. In such cases, the source file type has to be specified manually as discussed in the next section.
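The general idea of recognizing a file by its contents and name can be illustrated with a short Python sketch. This is not FileMerlin™'s recognition logic, which is not documented here. It simply checks the well-known leading "signature" bytes of a few formats and falls back to the file name extension; the function names and the format labels are invented for the example.

```python
# Well-known leading signature bytes of a few file formats.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"{\\rtf", "Rich Text Format"),
    (b"\xffWPC", "WordPerfect (5.x or later)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g., MS Word 97-2003)"),
]


def guess_format(data, filename=""):
    """Return a format label for the leading bytes, or None if unrecognized."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    # Some formats carry no reliable signature; fall back to the file name.
    if filename.lower().endswith((".htm", ".html")):
        return "HTML"
    return None


def guess_format_of_file(path):
    """Read just enough of the file to compare against the signatures."""
    with open(path, "rb") as handle:
        return guess_format(handle.read(16), path)
```

A file with no signature and an uninformative name returns None here, which corresponds to the case where the format has to be specified manually.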
If the Auto-recognize Source File Format option has not been checked in the Options area of the main window, or if FileMerlin™ is unable to determine the source file(s) type, you must specify the source format manually. To do this, click the Format button in the Source area of the main window. A dialog box that looks something like this is displayed:

Select the source file format by clicking it. At this point the explanation window on the right may display some relevant information about this format. Then click the OK button to complete your selection and return to the main window.
You may also select the file format by double-clicking it. FileMerlin™ immediately returns you to the main window. This is easier but does not give you a chance to look at any information that may be displayed in the explanation window.
Once you return to the main window after selecting the source file format manually, this file format is displayed in the Source area of the main window.
If you explicitly specify the source file(s) format as described in the previous section, FileMerlin™ normally assumes that all matching files to be converted are truly in that format and proceeds to convert them based on this assumption. This can cause a problem if your source directory or directories contain the desired files along with other kinds of files. In such a situation, if the file format you have specified is autorecognizable (see "Autorecognizing the Source File Type"), you may click the check box marked Ignore files that are not in above format. If this box is checked, FileMerlin™ checks each file before conversion to ensure that it is in the stated format, and skips it if it is not. This is useful if some undesired files match your specified filename selection template. However, it also results in redundant file checking. If you are sure that all files matching your specification are indeed in the correct file format, you may uncheck this box.
This chapter discusses how you specify the name and location of destination file(s) when using FileMerlin™ interactively. It also discusses how FileMerlin™ can automatically compute destination file name(s) for you, a process known as auto-naming the destination file(s).
You specify the destination file(s) name and location by clicking the File(s) button in the Destination area of the main window. When you do this, FileMerlin™ displays the following dialog window:

Click the buttons at the top of the dialog box to navigate to the drive and directory where you wish to save the converted file(s). Then enter the destination filename (single file) or file naming template (multiple files) into the file name field. A template may include special characters as described below:
You may use * and ? wild card characters to keep portions of a destination file name the same as the matching portions of the source file name. The ? character matches one single character, while the * character matches the remainder of the name or extension field. Thus for example, to preserve the name field but use .HTM for the extension field, the destination filename template would be *.HTM.
As a special case, you may enter "*.*.ext" in the name and extension portion of the destination file name. This causes the extension you specify (ext) to be appended to the source file name, rather than replacing the source extension. For example, if the destination specification is given as C:\Converted\*.*.htm, a source file named abcdef.123 is converted to abcdef.123.htm in the C:\Converted directory.
You may use a string of ">" (greater than symbol) characters to represent a sequence number. This number starts with 1 for the first file converted, and increments by one with each file. This is useful to ensure that each converted file has a unique name and so does not overwrite other converted files. Let's say, for example, you want to convert files named DOCUMENT.XX1, DOCUMENT.XX2, DOCUMENT.XX3 ... from WordPerfect to HTML, and you would like all converted files to be named with .HTM extension. If you were to use the destination filename template *.HTM, all converted files would have the same name (DOCUMENT.HTM), and would overwrite each other or prompt you each time for an action. In such a situation, you could specify the destination filename template as DOCU>>>>.HTM. This would result in the converted files being named DOCU0001.HTM, DOCU0002.HTM, DOCU0003.HTM, ... , each file having a unique name. The sequence number is formatted with leading zeroes to have the same number of characters as in the string of ">" symbols. Note that the string of ">" characters may appear in the name and/or extension portion of the destination filename template.
You may use a string of "<" (less than symbol) characters to represent the sequence number. This works just like the ">" character described above, except that the sequence number is not padded with leading zeroes.
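The template rules above can be summarized in a small Python sketch. It is an approximation written from the description in this section, not FileMerlin™'s own code, and it covers only the file name (no drive or directory). Two details are assumptions: wild cards are matched by character position within the name or extension field, and anything that follows a * within a field is ignored. The function names are invented.

```python
import re

# A run of ">", a run of "<", a single "*" or "?", or a run of literal text.
TOKEN = re.compile(r">+|<+|\*|\?|[^<>*?]+")


def split_name(filename):
    """Split at the last dot into (name, extension)."""
    stem, dot, ext = filename.rpartition(".")
    return (stem, ext) if dot else (filename, "")


def expand_field(template, source, seq):
    """Expand one field (name or extension) of a destination template."""
    out = []
    pos = 0  # current character position in the source field
    for token in TOKEN.findall(template):
        if token == "*":
            out.append(source[pos:])  # remainder of the source field
            break  # assumption: the rest of the template field is ignored
        if token == "?":
            out.append(source[pos:pos + 1])  # one source character
        elif token[0] == ">":
            out.append(str(seq).zfill(len(token)))  # zero-padded sequence number
        elif token[0] == "<":
            out.append(str(seq))  # unpadded sequence number
        else:
            out.append(token)  # literal text
        pos += len(token)
    return "".join(out)


def expand_template(template, source_name, seq=1):
    """Compute a destination file name from a template and a source name."""
    src_stem, src_ext = split_name(source_name)
    tpl_stem, tpl_ext = split_name(template)
    if tpl_stem == "*.*":
        # Special case: append the new extension to the whole source name.
        return source_name + "." + expand_field(tpl_ext, src_ext, seq)
    stem = expand_field(tpl_stem, src_stem, seq)
    ext = expand_field(tpl_ext, src_ext, seq)
    return stem + "." + ext if ext else stem
```

With this sketch, expand_template("DOCU>>>>.HTM", "DOCUMENT.XX2", seq=2) gives DOCU0002.HTM, and expand_template("*.*.htm", "abcdef.123") gives abcdef.123.htm, matching the examples in the text.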
If the Auto-name destination file option has been checked in the Options area of the main window, FileMerlin™ automatically computes the name(s) of the destination file(s) by keeping the name portion of the name the same as that of the source file while using an extension dictated by or descriptive of the destination file format. In other words, it uses a destination filename template of the form *.ext where ext depends on the destination file format. For example, if the destination file format is HTML, the template used is *.htm.
A destination filename therefore depends on the source filename and the destination file format. So if auto-naming has been enabled, the destination filename(s) are computed whenever the source filename(s) or the destination file format is changed. If you do not like this behavior, you may turn auto-naming off by unchecking the Auto-name destination file option in the Options area of the main window. If you do this, you always have to specify the destination file(s) name manually as described above.
Note that auto-naming computes only the destination filename component, not the drive and directory. These latter components stay the same as most recently specified.
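As a rough model of auto-naming, the following Python sketch keeps the source name, swaps in an extension chosen by destination format, and leaves the destination drive and directory as last specified. It is an illustration only. In the extension table, only the HTML entry comes from the text above; the other entries and all the names used are assumptions.

```python
import ntpath  # Windows path rules, usable on any platform

# Only the HTML entry is taken from this manual; the rest are assumptions.
DEFAULT_EXTENSION = {
    "HTML": "htm",
    "PDF": "pdf",
    "MS Word": "doc",
    "WordPerfect": "wpd",
}


def auto_name(source_path, destination_format, destination_folder):
    """Build the destination path: same name, format-specific extension."""
    stem, _ = ntpath.splitext(ntpath.basename(source_path))
    extension = DEFAULT_EXTENSION[destination_format]
    # The folder is not computed; it is whatever was most recently specified.
    return ntpath.join(destination_folder, stem + "." + extension)
```

Because the result depends on both the source name and the destination format, it has to be recomputed whenever either one changes.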
If the name of a destination file works out to be the same as that of the source file it is being converted from (including all the path components), then the converted file replaces the source file, i.e., the original file is overwritten. By default, a warning message is displayed before any source file is overwritten. If you like to live on the edge, you may suppress this warning by clicking More Options in the main window.
Under certain conditions, a single source file can potentially produce multiple destination files. For example, if a Word document named TEST.DOC containing 3 tables is converted to a comma-separated data format, three destination files are produced (one for each table). In such a situation, each destination file name has a sequential number added to it. In the above example, these files may be named, e.g., TEST-1.CSV, TEST-2.CSV and TEST-3.CSV. Note that only the root of these file names (i.e., TEST.CSV) is displayed in the main window dialog box and in the conversion log.
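The numbering in this example can be sketched in Python as below. The "-n" suffix simply follows the TEST-1.CSV example above, which the text presents only as a possibility, so treat the exact naming pattern as an assumption; the function name is invented.

```python
import ntpath  # Windows path rules, usable on any platform


def numbered_outputs(root_name, count):
    """List the per-table file names derived from one root file name."""
    stem, extension = ntpath.splitext(root_name)
    return [f"{stem}-{n}{extension}" for n in range(1, count + 1)]
```

Calling numbered_outputs("TEST.CSV", 3) yields TEST-1.CSV, TEST-2.CSV and TEST-3.CSV.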
If multiple files are being converted by using "*" and/or "?" wild card characters in the source file specification and nested folders in the source folder are included, then you have the option of automatically building nested destination folders to match the source nested folders structure. In order to do this, check the Replicate Nested Folders box in the Destination area of the main window. Converted files are then placed in the appropriate destination folder to replicate the original structure.
If the Replicate Nested Folders box is not checked and there are nested source folders, files from all the source folders are placed in the single destination folder that you have specified.
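The difference between the two settings can be shown with a short Python sketch. It is a simplified model, not FileMerlin™ code: the function name is made up, and the file name is left unchanged so that only the folder handling is visible.

```python
from pathlib import PureWindowsPath


def destination_path(source_file, source_root, destination_root, replicate):
    """Return where a converted file lands, with or without replication."""
    source_file = PureWindowsPath(source_file)
    destination_root = PureWindowsPath(destination_root)
    if replicate:
        # Keep the part of the path that lies below the source folder.
        relative = source_file.relative_to(PureWindowsPath(source_root))
        return destination_root / relative.parent / source_file.name
    # Otherwise every file goes into the single destination folder.
    return destination_root / source_file.name
```

For a source file C:\In\a\b\x.doc with source folder C:\In and destination folder D:\Out, replication gives D:\Out\a\b\x.doc, while the unreplicated result is D:\Out\x.doc. Without replication, files with the same name in different source folders map to the same destination name, which is one situation where a sequence number in the destination template helps.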
FileMerlin™ can directly place converted files on other computers networked to your computer. The computer may be specified using UNC (Universal Naming Convention) or by mapping it to a drive letter. Please note, however, that FileMerlin™ will not create folders on a remote computer unless it is mapped to a drive letter.
Unlike the source file(s) format, which may be autorecognized (see "Autorecognizing the Source File Type"), the destination file(s) format must be specified. To do this, click the Format button in the Destination area of the main window. The actual specification process using a dialog box works identically to specifying the source file(s) format, and so is not discussed here again.
Once the source and destination files and their types have been specified as described in the previous chapters, you may click the Convert button to execute the conversion.
When all source files have been converted, FileMerlin™ returns to the main window. At this point you may view the conversion log (as described later), convert additional files or exit the program.
Normally, FileMerlin™ displays a progress meter as each file is being converted. This is reassuring when converting large files. However, the display and maintenance of a progress meter slows down the program somewhat. This slowdown is insignificant when converting large files, but is more prominent when converting large numbers of very small files. If desired, display of the progress meter may be turned off by clicking the More Options button in the FileMerlin™ main window.
Some features in a document may not be convertible because they are not supported by the destination word processor or file format. FileMerlin™ can record such exceptions in a log file containing an audit trail. This file may be viewed after doing a conversion by clicking the View Log button in the FileMerlin™ main window, which shows the conversion log like this:

Use the scroll bar if necessary to scroll through the log. You may also use the UpArrow, DownArrow, PgUp, PgDn, Home and End keys. The log is actually stored in a text file named CONV_LOG.TXT in your temporary folder. You may click the Print or Edit Log button to bring this file up in a word processor or text editor. This lets you edit, highlight, search or print the log file.
Normally FileMerlin™ maintains a log file only for the current session. In other words, the log is cleared each time FileMerlin™ is started. However, you may accumulate the log over successive sessions by clicking the More Options button in the FileMerlin™ main window. You may also delete the log at any time by clicking the Delete Log button in order to start a new log for subsequent conversions.
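The difference between a per-session log and an accumulated log is essentially the difference between overwriting a text file and appending to it. The Python sketch below models that behavior under stated assumptions: only the file name CONV_LOG.TXT and its location in the temporary folder come from this manual, while the function name, the folder parameter and the text encoding are invented for the illustration.

```python
import os
import tempfile

LOG_NAME = "CONV_LOG.TXT"


def open_log(accumulate, folder=None):
    """Open the conversion log at the start of a session."""
    folder = folder or tempfile.gettempdir()
    # "a" keeps earlier sessions' entries; "w" starts a fresh log.
    mode = "a" if accumulate else "w"
    return open(os.path.join(folder, LOG_NAME), mode, encoding="utf-8")
```

A session that opens the log with accumulate set to True adds its entries after those already present, which is why an accumulated log keeps growing until it is deleted.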
When you're all done converting, click the Exit button to close FileMerlin™. Normally FileMerlin™ remembers your program settings from one session to the next. This is done by saving these settings when you exit FileMerlin™. If you do not wish to save these settings, uncheck the Save Settings on Exit box in the FileMerlin™ main window.
In addition to the options that can be set in the main window, FileMerlin™ provides more options that may be accessed by clicking More Options. This brings up the Additional Options window, which looks like this:

This window lets you specify various operational parameters.
If destination folder does not exist ... - You may select what FileMerlin™ should do if a destination folder does not exist. If you opt not to create it, conversion of that file will not take place. If you opt to prompt and create it, FileMerlin™ prompts you every time a new destination folder needs to be created, and gives you the option to create it or abort the conversion. Or, you may opt to let FileMerlin™ create the destination folder and proceed with conversion without prompting.
When converting multiple files ... - When converting multiple files, FileMerlin™ can (1) pause and have you acknowledge each conversion; (2) pause and have you acknowledge only those conversions that were not successful; or (3) not pause for any acknowledgement. Select the option that matches how much interaction you want with FileMerlin™.
Warn if destination file overwrites existing file - In its factory-shipped condition, this checkbox is checked, and FileMerlin™ warns you if an output (converted) file is going to overwrite a previously existing file. It proceeds to write over the existing file only if you explicitly authorize it to do so. You may uncheck this box, in which case FileMerlin™ overwrites the previously existing file without warning. We strongly suggest that this box stay checked.
Accumulate Log over multiple sessions - Normally, FileMerlin™ starts a fresh conversion log with each session, i.e., a previously existing conversion log is deleted at the start of a new session. You may check this box to suppress this deletion, and have the log accumulate over multiple sessions. If this box is checked, please be aware that the conversion log can grow significantly over time.
Preserve File Date and Time - Normally, the date and time stamp of a converted file represents the time it was created (converted). You may check this box to stamp converted files with the date and time of the original files they were converted from.
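The effect of this option can be pictured with a short Python sketch. This is not how FileMerlin™ itself is implemented; the function name stamp_like_source and the file names are hypothetical, and the sketch only illustrates copying the time stamp of an original file onto its converted counterpart.

```python
import os
import tempfile
import time


def stamp_like_source(source_path, converted_path):
    """Copy the source file's access and modification times onto the converted file."""
    info = os.stat(source_path)
    os.utime(converted_path, (info.st_atime, info.st_mtime))


# Two temporary files stand in for an original document and its conversion.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "letter.sam")
    converted = os.path.join(folder, "letter.rtf")
    for path in (source, converted):
        with open(path, "w") as handle:
            handle.write("placeholder")

    # Pretend the original was last modified a day ago.
    yesterday = time.time() - 86400
    os.utime(source, (yesterday, yesterday))

    stamp_like_source(source, converted)
    # Allow a small tolerance for file systems with coarse time stamps.
    same_stamp = abs(os.stat(converted).st_mtime - os.stat(source).st_mtime) < 2
```

With the option unchecked, the converted file would simply keep the time at which it was written.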
Show Progress Meter - Normally, FileMerlin™ displays a progress meter as each file is being converted. This is reassuring when converting large files. However, the display and maintenance of a progress meter slows down the program somewhat. This slowdown is insignificant when converting large files, but is more prominent when converting large numbers of very small files. If desired, display of the progress meter may be turned off by unchecking this option box.
Enable SmartMouse - SmartMouse is a FileMerlin™ feature whereby the mouse cursor automatically positions on a default button every time a new window opens. Some users love this feature, others find it distracting. This feature is turned on by default, and you may turn it off by unchecking this option box.
Enable Sound - FileMerlin™ uses a few sound signals as part of its normal operation (e.g., after a batch of files has been converted). You may uncheck this option box for silent operation.
Place FileMerlin™ icon on desktop - If this option box is checked (default setting), an icon for the FileMerlin™ folder is kept on the Windows Desktop. From this folder icon, you can easily run FileMerlin™, view the user manual and access other FileMerlin™ support functions. If you do not wish to have this icon on your Windows Desktop, you may uncheck this option box. The FileMerlin™ functions are still accessible by clicking the Start button in the Windows taskbar.
Include FileMerlin in "Send To" menu - If this box is checked, an entry for FileMerlin™ is placed in the Windows "Send To" menu, which appears if a file is right-clicked, for example, in My Computer or Windows Explorer. This lets you convert a file instantly as you are exploring folders. If you do not wish to include FileMerlin™ in the Send To list, you may uncheck this option box.
AutoServe - This function, which lets you set up FileMerlin™ to convert automatically in the background, is described in detail in the next chapter.
Once you have set up these options the way you want, click OK to save these settings and close the Additional Options window. If you wish to discard the changes made to these options, click Cancel instead.
FileMerlin™ is primarily a very powerful word processor document convertor. It converts between a very large array of word processor file types. Further, it handles even complex document formatting and layout as well as advanced modern-day document functions.
In addition to various word processing program file types, FileMerlin™ also supports conversions involving other important document file formats such as HTML, XML, formatted or unformatted text (ASCII, ANSI, Unicode or code page based), RTF (Rich Text Format), DCA/RFT (Document Content Architecture – Revisable Form Text), and several others.
FileMerlin™ can convert data base files (such as Microsoft Access, FoxPro and dBase) to tables in word processing document files (e.g., MS Word tables), as well as to generic data base file formats such as comma-separated, tab-separated, quote-delimited, etc.
Similarly, FileMerlin™ also converts spreadsheet files (such as Microsoft Excel and Microsoft Works) to tables or to generic data base files. When converting Excel files or MS Works spreadsheets to Word or HTML tables, detailed layout information such as cell widths, merged cells, background colors, borders, fonts, etc., are all preserved. The converted file is therefore very suitable for presentation. Note that FileMerlin™ does not convert formulas or the logic of a spreadsheet. Rather, it converts the displayable data into a well-formatted document, or into a flat file suitable for appropriate processing.
FileMerlin™ is not a stand-alone graphics file convertor as such. However, when converting documents containing embedded or linked graphics, it converts the graphics to whatever is needed by the destination program or file format being converted to. For example, if converting from a word processor to HTML, it can extract graphics images from the source word processor document and convert them to .GIF or .JPG as required for web publishing.
FileMerlin™ offers the option to treat any file as an unknown collection of bytes, and recover whatever text is possible from it. The recovered text is placed in the output file and destination format that you select. This is most useful for unsupported or unknown source file types, or corrupted files. Note that this works for any kind of file, not just the source file types supported by FileMerlin™. To use this option, select "Unknown for Text Recovery" as the source file type.
Please note that under this option, FileMerlin™ looks for anything in the source file that looks like text, extracts it, and puts it into the output file. So all text content will be extracted from the source file (unless it is stored in an encrypted or non-ASCII form). However, there may also be a fair amount of "garbage" and irrelevant text interspersed with the actual text content. In some cases, text that was "deleted" may resurface. Further, the extracted text may not necessarily be in the correct sequence. So you may have to substantially edit and touch up the converted file. However, this is still a useful last-resort option.
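The general idea of text recovery can be sketched in a few lines of Python. This is only an illustration of the technique, not the method FileMerlin™ uses: the function name recover_text and the minimum run length of four characters are assumptions made for the example.

```python
import re


def recover_text(raw, min_run=4):
    """Return runs of printable ASCII at least min_run bytes long, in file order."""
    # Printable ASCII (space through tilde) plus the tab character.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_run)
    return [match.group().decode("ascii") for match in pattern.finditer(raw)]


# A made-up byte sequence: two text fragments surrounded by binary codes,
# plus a two-letter run ("ab") that is too short to be kept.
sample = b"\x00\x01Dear Mr. Smith,\x00\xff\x13ab\x00Thank you for your letter.\x02\x90"
recovered = recover_text(sample)
# recovered == ["Dear Mr. Smith,", "Thank you for your letter."]
```

A scan of this kind cannot tell real content from binary data that happens to look like text, which is why some "garbage" may appear in the output and why the fragments follow file order rather than reading order.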
FileMerlin™ converts documents produced by all versions of Ami Professional (also called Ami Pro), and automatically figures out the version.
Ami Pro uses the ANSI (for American National Standards Institute) character set. Some symbols (such as the PC box- and line-drawing characters) do not exist in the ANSI set. Similarly, some ANSI characters have no counterpart in the PC set. When converting from Ami Pro, FileMerlin™ replaces any character that cannot be converted by a dummy character. You may alter or omit this replacement, as described later in the chapter on Customization.
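The replacement of unconvertible characters can be illustrated with the following Python sketch. It assumes that the ANSI set corresponds to Windows code page 1252 and the PC set to code page 437, and it uses "?" as the dummy character; the function name to_pc_set is hypothetical, and the actual dummy character used by FileMerlin™ is not stated here.

```python
def to_pc_set(ansi_bytes, dummy="?"):
    """Re-encode ANSI (assumed cp1252) text in the PC set (assumed cp437).

    Characters with no counterpart in the PC set are replaced by `dummy`.
    """
    text = ansi_bytes.decode("cp1252")
    converted = []
    for character in text:
        try:
            converted.append(character.encode("cp437"))
        except UnicodeEncodeError:
            converted.append(dummy.encode("cp437"))
    return b"".join(converted)


# The copyright sign exists in the ANSI set but not in the PC set;
# the accented "e" exists in both, at different code positions.
ami_text = "Copyright \u00a9 1992, caf\u00e9".encode("cp1252")
pc_bytes = to_pc_set(ami_text)
# pc_bytes == b"Copyright ? 1992, caf\x82"
```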
Ami Pro supports Text Frames, which are basically rectangular regions in a document where text is inserted. Due to technical considerations, FileMerlin™ converts text frames to paragraphs placed at the end of the document.
Ami Pro always numbers footnotes (or endnotes) sequentially. Hence when converting documents with non-sequential note numbers to Ami Pro, the numbers may change. Note, though, that the converted document is still consistent, i.e., note numbers appearing in the body text match those in the notes.
Ami Pro implements outline numbering by defining special paragraph styles which automatically produce outline tags for paragraphs they apply to. When converting Ami Pro documents to other formats, FileMerlin™ correctly transfers such outline tags in the correct format.
Ami Pro requires .SAM as the filename extension for its documents.
FileMerlin™ converts documents produced by all Brother Word Processors using 720K or 1.44M disks. This includes the daisy wheel, notebook and ink jet models. Some older Brother machines used 240K disks which do not work with FileMerlin™.
Brother WP requires .WPT as the filename extension for its documents.
"DCA/RFT" stands for Document Content Architecture, Revisable Form Text. This is a relatively standardized logical document file format used by IBM in many of its products. FileMerlin™ automatically identifies and converts files in this format.
DCA/RFT enables the exchange of documents between diverse IBM products (dedicated word processors, mainframe, mini- or micro-computers), as well as some non-IBM products which export to or import from this format. This lets FileMerlin™ convert from such systems. Such conversions generally preserve formatting and print controls. Please note, however, that some products import from or export to DCA/RFT incompletely, and most of them do not handle advanced functions.
FileMerlin™ automatically detects and directly converts documents produced by DisplayWrite-2, -3, -4, -4.2 and -5, without requiring them to be saved in an intermediate format like DCA/RFT.
Page breaks produced automatically or created with Ctrl E in DisplayWrite are soft page breaks and not permanent, i.e., they may be removed by repagination. Permanent breaks are produced by the Required Page End code (Ctrl R), and are hard page breaks. Normally FileMerlin™ converts hard page breaks, but not the soft ones since these are inserted by the destination word processor and may not be in the same place as in the original document. However, you can set up FileMerlin™ to translate soft page breaks as hard breaks as described later in the chapters on Customization.
Please also note that DisplayWrite documents do not paginate to their final form until you explicitly paginate them. FileMerlin™ correctly converts paginated and unpaginated DisplayWrite documents.
By default, DisplayWrite-2 and 3 use .TXT as the filename extension for their document files, while DisplayWrite 4.x and 5 use .DOC. However, users may use different extensions if they wish by specifying them explicitly.
DisplayWrite Assistant document files use the original IBM PC character set (also called the OEM character set), where special characters are identified by means of a Dos code page. By default, FileMerlin™ assumes the US Dos Code page (code page 437) for DisplayWrite Assistant documents. However, you may change this for documents created in other locales or environments, as described later in the chapter on Customization (see "Text Tab").
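Why the code page matters can be seen from a small Python sketch, given here only as an illustration. It decodes the same five bytes under code page 437 and under code page 850 (a Dos code page used in many Western European locales); the byte values were chosen for the example.

```python
# Three box-drawing bytes, a space, and one byte whose meaning
# differs between the two code pages.
raw = bytes([0xC9, 0xCD, 0xBB, 0x20, 0x9B])

as_437 = raw.decode("cp437")  # US Dos code page
as_850 = raw.decode("cp850")  # Western European Dos code page

# The box-drawing characters agree, but byte 0x9B is the cent sign
# under code page 437 and the letter "o with stroke" under code page 850.
differs = as_437[-1] != as_850[-1]
```

A document created under a different locale can therefore show wrong special characters if it is converted with the default code page.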
FileMerlin™ converts documents produced by the word processing modules of all versions of Enable, including 1.x, 2.x, 3 (OA) and 4.x.
Mail-Merge field names in Enable are enclosed in square brackets, and are recognized only by implicit reference to an Enable data base. Therefore, FileMerlin™ cannot tell if a word enclosed in square brackets is a field name or just text, and so converts these codes literally.
Enable implements a Shorthand function which lets phrases or sentences be referenced using a two-letter code. The translation from the two-letter code to the phrase it represents is given in a Shorthand Table. Before printing a document in Enable, it must be expanded so that the shorthand codes are replaced by their corresponding phrases. When converting these documents to other formats, it is recommended that the document be similarly expanded. Otherwise, the converted document will contain the two-letter shorthand codes instead of the phrases they represent. In this situation, FileMerlin™ includes the shorthand table into the converted document as a non-printing comment.
Enable always numbers footnotes/endnotes sequentially using superscripted arabic numerals, and FileMerlin™ uses this convention when converting documents from other programs to Enable.
Enable WP document files use the original IBM PC character set (also called the OEM character set), where special characters are identified by means of a Dos code page. By default, FileMerlin™ assumes the US Dos Code page (code page 437) for Enable WP documents. However, you may change this for documents created in other locales or environments, as described later in the chapter on Customization (see "Text Tab").
By default, Enable WP uses .WPF as the filename extension for its document files. However, users may use a different extension if they wish by specifying it explicitly.
GeoWrite is the word processor application distributed with GeoWorks for handheld devices and dedicated word processors such as the Brother GeoBook. Although GeoWorks itself runs under the Geos operating system, FileMerlin™ can convert certain versions of GeoWrite files to other word processors.
FileMerlin™ can convert HTML files (including HTML 4 files containing CSS styles) to other file formats. However, there are a few considerations you should keep in mind.
Flat HTML files which present as formatted documentation may convert very well, including such features as style formatting, special characters, lists, etc. You may convert such files to formatted or unformatted text (e.g., for content extraction), to MS Word or to other formats.
Tables in HTML files may be converted to comma-separated text files for importing into data base programs. The internet provides access to a lot of data formatted as HTML tables. FileMerlin™ provides a very convenient mechanism to import such data into data base or other data processing applications.
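The kind of conversion described here can be sketched with Python's standard library. This is an illustrative sketch only, not FileMerlin™'s own logic: the class name TableRows, the function name html_table_to_csv and the sample table are all made up for the example, and the sketch handles only simple, non-nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g., line breaks between rows) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as comma-separated text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


page = """<table>
<tr><th>Item</th><th>Price</th></tr>
<tr><td>Widget, large</td><td>1,250</td></tr>
</table>"""
csv_text = html_table_to_csv(page)
```

Cells that themselves contain commas are enclosed in quotes in the output, so that a data base program importing the file still sees two columns.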
Many modern web pages are produced by HTML editors that use scripts, nested tables and such other functions to produce interactive pages with fairly complex layout. FileMerlin™ is not intended to convert such pages, and some of these features are not supported.
FileMerlin™ converts IBM Personal Typing System (PTS) documents to other word processors. IBM PTS document files use the original IBM PC character set (also called the OEM character set), where special characters are identified by means of a Dos code page. By default, FileMerlin™ assumes the US Dos Code page (code page 437) for IBM PTS documents. However, you may change this for documents created in other locales or environments, as described later in the chapter on Customization (see "Text Tab").
FileMerlin™ converts documents produced by all versions of IBM Signature. This file format is similar to that of XyWrite 4.0, and the same considerations apply. However, there are also significant differences between XyWrite and Signature, and FileMerlin™ correctly takes them into account.
FileMerlin™ converts IBM Writing Assistant documents to other file formats.
Writing Assistant documents store very little formatting information. For example, there is no differentiation between hard and soft returns, nor are there any codes indicating centering, right-alignment or tab alignment. So when converting these documents to other formats, FileMerlin™ deduces such attributes, sometimes using statistical properties of documents in general. Experience shows that these decisions are mostly valid, but some touching up may occasionally be required. This limitation is due to the nature of the Writing Assistant file format.
IBM Writing Assistant document files use the original IBM PC character set (also called the OEM character set), where special characters are identified by means of a Dos code page. By default, FileMerlin™ assumes the US Dos Code page (code page 437) for IBM Writing Assistant documents. However, you may change this for documents created in other locales or environments, as described later in the chapter on Customization (see "Text Tab").
FileMerlin™ converts Leading Edge Word Processor (LEWP) documents (all versions from 1.3 to 1.5a) to other word processors.
LEWP used long file names as well as short (Dos) file names to reference its documents. FileMerlin™ uses the Dos file names. The long file names were used by LEWP for document management purposes, and these names are stored not in the document files but in separate (folder) files. FileMerlin™ does not have access to this information, and uses the short (Dos) file names to identify and convert files.
LEWP document files use the original IBM PC character set (also called the OEM character set), where special characters are identified by means of a Dos code page. By default, FileMerlin™ assumes the US Dos Code page (code page 437) for LEWP documents. However, you may change this for documents created in other locales or environments, as described later in the chapter on Customization (see "Text Tab").
LEWP requires .DOC as the filename extension for its documents. Other files produced by LEWP (such as those with .DDR extension) are not document files, and you should not attempt to convert them.
FileMerlin™ converts documents from all versions of Lotus Manuscript (i.e., 1.0, 2.0 and 2.1) to other file formats, and automatically figures out the version.
When converting from Manuscript, FileMerlin™ handles both structured and unstructured documents, and fully supports document structure formatting (such as automatic section numbers and section indents).
FileMerlin™ does not support the Manuscript Equation Editor, but equations entered as text using subscripts, superscripts and extended characters are correctly converted.
Manuscript defines two kinds of tabs for numbers: decimal tabs (which align on decimal points), and numeric tabs (which align on the rightmost digit). Since other word processors do not implement numeric tabs in this manner, FileMerlin™ converts both these tab types as decimal tabs. The impact of this approximation on most documents is negligible.
Manuscript allows up to twenty font specifications in a document (ten in version 1.0), which includes specification of font sizes. These are entered in a global font table. The first three entries are reserved for specific document components, and the remaining are available for font changes throughout the document. When converting from Manuscript to other formats, FileMerlin™ correctly translates the font sizes for all 20 entries.
By default, Lotus Manuscript uses .DOC as the filename extension for its document files. However, users may use a different extension if they wish by specifying it explicitly.
FileMerlin™ converts documents produced by all PC and VAX versions of Mass-11. Documents from non-PC environments (e.g., DEC Vax computers) must be downloaded to (or be available on) a PC-compatible system before processing with FileMerlin™. They may be downloaded either as native Mass-11 files, or in Mass-11 export format, which basically uses a carriage return and/or line feed to terminate each Mass-11 record.
Mass-11 uses the concept of "documents and folders" for file management. Each native-format Mass-11 document belongs to a "folder", and a disk or subdirectory may contain several folders. Each folder is a Dos file having filename extension .000. For example, a folder named SMITH would physically be a file named SMITH.000. The folder file contains information to manage the documents associated with it.
A native-mode document associated with a folder has the same Dos file name as the folder name, but a different extension. Document extensions use an alphanumeric numbering system starting with .AA0 and going on to .AA1, .AA2, etc. The Dos name is different from the Mass-11 name for the document, which may be up to 30 characters long but must be different for each document in a folder. Mass-11 identifies documents by their 30-character names, and the Dos name of any document may be found by examining its folder index.
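The naming convention described above can be pictured with a short Python sketch that groups a directory listing into folders and their documents. The function name group_mass11 and the sample listing are hypothetical. The sketch relies only on the two rules stated above (a folder file has extension .000, and its documents share its Dos file name); it does not read the folder index, so it cannot recover the 30-character Mass-11 names.

```python
from collections import defaultdict
from pathlib import PurePath


def group_mass11(filenames):
    """Map each folder name to the Dos names of the documents that share it."""
    names = [PurePath(name) for name in filenames]
    # A folder is any file whose extension is .000.
    folders = {path.stem.upper() for path in names if path.suffix == ".000"}
    documents = defaultdict(list)
    for path in names:
        stem = path.stem.upper()
        # Rough rule: same Dos file name as a folder, different extension.
        if stem in folders and path.suffix != ".000":
            documents[stem].append(path.name)
    return {folder: sorted(documents[folder]) for folder in sorted(folders)}


listing = ["SMITH.000", "SMITH.AA0", "SMITH.AA1", "JONES.000", "JONES.AA0", "README.TXT"]
grouped = group_mass11(listing)
# grouped == {"JONES": ["JONES.AA0"], "SMITH": ["SMITH.AA0", "SMITH.AA1"]}
```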
When converting documents from Mass-11 to other programs with FileMerlin™, you must specify them using their Dos name(s). FileMerlin™ encodes the 30-character Mass-11 name as the long file name in the converted document's summary area if possible. As described in chapter 6, FileMerlin™ can automatically identify native-mode Mass-11 documents, thus excluding irrelevant files in a disk or subdirectory.
Export-format Mass-11 files usually have a filename extension .TXT, even though they are not Dos text files and include binary codes. These files are useful when working with Mass-11 on non-PC platforms.
Mass-11 uses a concept similar to "printer definition files" for certain printer-specific attributes, such as font size and line height. FileMerlin™ does not have access to this information when converting Mass-11 documents, and so cannot convert these parameters automatically. However, you may convert font information in a customized manner as described later in the chapter on Customization.
Mass-11 numbers footnotes, endnotes, pages and outline tags in arabic, roman or alphabetic fashion. FileMerlin™ correctly translates such numbering styles. Mass-11 also allows "customized" numbering schemes, which are defined external to the Mass-11 document, are not available to FileMerlin™ and hence are not converted.
Mass-11 numbers footnotes/endnotes sequentially. Non-sequential numbering is only possible by using customized numbering as described earlier. Therefore, when converting documents with non-sequential footnote numbers to Mass-11, footnote numbers may change. Note, however, that the converted document is still consistent, i.e., footnote numbers appearing in the body text match the corresponding numbers in the footnotes.
Mass-11 supports automatic paragraph numbers (outline tags), but provides limited control over their format. Also, it does not support non-sequential numbering except using customized numbering as described earlier. As shipped, FileMerlin™ transfers outline tags using true Mass-11 outline codes with full functionality, which is usually preferred. But if you want an exact match with the original document which is not possible in this manner, you may customize FileMerlin™ to "expand" each outline tag into literal text as described in later chapters on customizing.
RTF (Rich Text Format) is a file format introduced by Microsoft to store document content in a relatively standardized manner. This file format has features to store advanced document formatting and layout functions. It is a text-based format in that it does not include any binary codes.
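To make the text-based nature of RTF concrete, the following Python sketch builds a very small RTF document and checks that it consists entirely of 7-bit ASCII characters. The sample content is invented for illustration; real RTF files written by word processors carry far more control words.

```python
# A minimal RTF document: a header, a one-entry font table, and a
# single paragraph with bold and italic runs.
rtf_text = (
    r"{\rtf1\ansi{\fonttbl{\f0 Times New Roman;}}"
    r"\f0\fs24 Plain, \b bold\b0  and \i italic\i0  text.\par}"
)

rtf_bytes = rtf_text.encode("ascii")

# Every byte is below 0x80, i.e., the file contains no binary codes.
is_seven_bit = all(byte < 0x80 for byte in rtf_bytes)

# Groups are delimited by braces, which must balance.
balanced = rtf_text.count("{") == rtf_text.count("}")
```

Because the format is plain text, such a file can be opened in any text editor to inspect its formatting codes.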
As word processors get more powerful and include more features, this file format has been upgraded over time and has gone through many revisions.
Most Microsoft products can read and write the RTF format. Many non-Microsoft products also include filters to read and write this file format with varying degrees of accuracy and completeness.
FileMerlin™ can convert documents from Microsoft RTF to other file formats, as well as from other file formats to RTF. This provides an alternative for converting to or from word processors and programs that may not be supported directly by FileMerlin™ but which can read from or write to the RTF format.
Microsoft RTF requires .RTF as the filename extension for its documents.
FileMerlin™ automatically identifies and converts all Dos and Windows versions of Microsoft Word. It does not directly handle Word for Macintosh (MacWord) files. However, MacWord can save its files in Word for Dos or Word for Windows format, in which case such files can be converted if they are brought over to a Windows PC.
The table below lists the Dos and Windows versions of Microsoft Word that FileMerlin™ can convert.
| Revision | Minimal Operating System | Commercial Name |
| 4 | Dos | Word for Dos 4.0 |
| 5 | Dos | Word for Dos 5.0 |
| 5.5 | Dos | Word for Dos 5.5 |
| 6 | Dos | Word for Dos 6.0 |
| 1 | Windows 3.x | Word for Windows 1.0 |
| 2 | Windows 3.x | Word for Windows 2.0 |
| 6 | Windows 3.x | Word for Windows 6 |
| 7 | Windows 95 | Word 95 |
| 8 | Windows 95 | Word 97 |
| 9 | Windows 98 | Word 2000 |
| 10 | Windows 98 | Word 2002 or Word XP |
| 11 | Windows XP | Word 2003 |
| 12 | Windows XP | Word 2007 |
The Dos and Windows versions of Word are dealt with separately in the following subsections.
Word for Dos allows style sheets to format a document. However, style sheet files are separate from document files, and style formatting is not explicitly recorded in the documents; instead, the document file only makes a reference to the style sheet. So when converting Word for Dos documents, style sheet formatting is not converted.
Word for Dos uses side-by-side paragraphs as equivalent to parallel columns (also called synchronized columns) provided by some other word processors. Wherever possible, therefore, FileMerlin™ translates them to parallel columns (or tables) in other formats.
Outline mode paragraphs in Word for Dos may be flush-left or indented, but this selection is made on the fly, and not recorded in the document file. When converting these documents, FileMerlin™ does not normally indent these paragraphs. However, you may opt to have them indented as described later in the chapter on Customization.
Microsoft Word for Dos document files use the original IBM PC character set (also called the OEM character set), where special characters are identified by means of a Dos code page. By default, FileMerlin™ assumes the US Dos Code page (code page 437) for Microsoft Word for Dos documents. However, you may change this for documents created in other locales or environments, as described later in the chapter on Customization (see "Text Tab").
By default, Microsoft Word uses .DOC as the filename extension for its document files. However, users may use a different extension if they wish by specifying it explicitly.
FileMerlin™ converts all versions of Word for Windows (also called Microsoft Office Word, WinWord, MS Word or just Word). These include versions 1, 2, 6, 7 (95), 97, 2000, XP, 2003 and 2007.
Word for Windows versions 6 and 7 use the ANSI (for American National Standards Institute) character set. This set does not include box- and line-drawing characters used by many older word processors, but these characters are available through a special font known as the MS Line Draw font. Word for Windows versions 8 (97) and later use both the ANSI and Unicode character sets, and encode the line drawing characters in Unicode. They do not need the MS Line Draw font. FileMerlin™ recognizes and converts box-drawing characters in all of these cases.
Word can save its documents in full-save or fast-save modes. Documents should be full-saved when completed. FileMerlin™ converts only full-saved Word documents for versions 1 through 7 (95), but for versions 8 (97) or higher it can handle full- as well as fast-saved files.
Although Word can save its documents in several different formats, each version has a preferred format. It is the format to which a document is saved by default, and the one that the program is optimized for.
The native file format for Word versions prior to 2007 is the binary .DOC file format. Although there have been many revisions of this format over time, it has been fairly stable for Word revisions 97 through 2003. Documents saved in this format have .DOC filename extension. FileMerlin™ can convert all of these .DOC format revisions.
Starting with Word 2000, documents could also be saved in XML format. The support for XML has increased with each successive revision of Word, although the implementation has also changed substantially with each revision. Documents saved in this format have .XML filename extension, and in some cases the content of a single document is distributed over multiple files. FileMerlin™ is not designed to convert Word 2000 or 2002 XML, but it can accurately convert Word 2003 XML files.
Word 2007 introduces a new document file format, which is a compressed file format that packages all document content into a single file. Files saved in this format, which is the default format for Word 2007, use a filename extension of .DOCX or .DOCM. The former is used for documents that do not contain macros, the latter for documents containing macros. FileMerlin™ can accurately convert .DOCX and .DOCM files as well.
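The compressed package used by this format is a ZIP archive, so its contents can be listed with ordinary ZIP tools. The Python sketch below illustrates this with a tiny stand-in archive built in memory. The stand-in is not a valid Word document; its two part names merely follow the layout commonly seen in .DOCX files and are included for illustration.

```python
import io
import zipfile

# Build a small stand-in package in memory (not a valid Word document).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("word/document.xml", "<document/>")

# Check that the data is a ZIP archive, then list the parts it contains.
buffer.seek(0)
is_zip = zipfile.is_zipfile(buffer)
with zipfile.ZipFile(buffer) as package:
    parts = package.namelist()
```

The same two calls, zipfile.is_zipfile and ZipFile.namelist, can be pointed at a real .DOCX file path to see how the document content is divided into parts.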
FileMerlin™ converts documents produced by all versions of MS Works 2 through 4.x, and can automatically figure out the version. Note that MS Works versions 1 and 2 were used under Dos, while versions 3 and higher are Windows versions. The Windows version documents may contain embedded spreadsheets (tables), pictures and other objects. Please note that FileMerlin™ does not convert these embedded objects.
Microsoft Works versions 1 and 2 document files use the original IBM PC character set (also called the OEM character set), where special characters are identified by means of a Dos code page. By default, FileMerlin™ assumes the US Dos Code page (code page 437) for Microsoft Works versions 1 and 2 documents. However, you may change this for documents created in other locales or environments, as described later in the chapter on Customization (see "Text Tab").
By default, Microsoft Works uses .WPS as the filename extension for its document files. However, users may use a different extension if they wish by specifying it explicitly.
FileMerlin™ converts documents produced by Microsoft Write, which is the "desk accessory" word processor included with Microsoft Windows 3.x and prior. Like Word for Windows, MS Write uses the ANSI character set, and the related considerations mentioned under Word for Windows apply to MS Write as well.
By default, Microsoft Write uses .WRI as the filename extension for its document files. However, users may use a different extension if they wish by specifying it explicitly.
FileMerlin™ converts Microsoft Excel files to tabular layout as appropriate for the destination format. If converting to a document format, the spreadsheets in the Excel file are laid out as formatted tables. If converting to a generic format such as comma-separated, each spreadsheet in the Excel file is output to a separate comma-separated file. If such multiple destination files are produced, FileMerlin™ adds sequential numbers to their file names.
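The one-file-per-spreadsheet behavior can be sketched in Python as follows. This is an illustration under stated assumptions: the function name sheets_to_csv is hypothetical, the spreadsheets are represented as plain lists of rows rather than read from a real Excel file, and the naming pattern (base name, underscore, sequence number) is invented for the example and may differ from the names FileMerlin™ actually produces.

```python
import csv
import io


def sheets_to_csv(base_name, sheets):
    """Return a mapping of output file name to comma-separated text, one per sheet."""
    outputs = {}
    for number, rows in enumerate(sheets, start=1):
        out = io.StringIO()
        csv.writer(out, lineterminator="\n").writerows(rows)
        # Assumed naming pattern: BASE_1.csv, BASE_2.csv, ...
        outputs[f"{base_name}_{number}.csv"] = out.getvalue()
    return outputs


# Two made-up spreadsheets holding displayable values only (no formulas).
workbook = [
    [["Region", "Sales"], ["East", 100], ["West", 250]],
    [["Month", "Units"], ["Jan", 12]],
]
files = sheets_to_csv("BUDGET", workbook)
```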
It is important to note that FileMerlin™ does not do a spreadsheet-to-spreadsheet conversion. In other words, formulas and the spreadsheet logic are not converted. Rather, the data values are converted in a formatted manner.
FileMerlin™ converts stand-alone Excel files, as well as those embedded in or linked to Word documents.
Excel versions 2003 and prior save workbooks in a binary file format by default. These files have .XLS filename extension. FileMerlin™ converts these files.
Excel 2003 can also save workbooks in an XML-based file format, although this is not its native default file format. Files saved in this manner have .XML filename extension. FileMerlin™ can also convert these files.
Excel 2007 saves workbooks by default in a new compressed file format. These files have a filena