I've decided to update and repost my original "The Anatomy of a Positional Flat-File" post as the first in a three-part series on developing flat-file schemas in BizTalk Server. If you're like most BizTalk developers, one of the first challenges you'll face is developing a schema for a positional flat-file output from some legacy system. The first thing I want you to understand is that “I'm no expert” at flat-file schema development, but almost all of the integrations I've been involved in during the past five years have involved a flat-file schema of some sort. This series of posts outlines some of the basics of positional flat-files that I've learned and would like to pass on.
The key point is that most (if not all) “positional” flat-files are actually delimited flat-files with positional records. The most common record delimiter is CRLF (carriage return-line feed), the standard Windows text-file line ending, as shown in the example below:
N1 ST MY COMPANY, INC.
N3 123 ANY ROAD P.O. BOX ABC
N4 HOUSTON TX77061

Figure 1: Positional Record Example
Each record in the positional flat-file is delimited with a new line (CRLF), and each field occupies a fixed area. The “Record Type” field in the example above is three characters long, left justified and padded with spaces. The “ID Code” field has the same layout, and the “Address Name” field is very similar except that its length is thirty characters. Each positional record is followed by the CRLF delimiter, which in BizTalk Server is denoted by the properties listed below:
Child Delimiter - 0x0D 0x0A
Child Delimiter Type - Hexadecimal
Child Order - Postfix
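To make the "delimited file with positional records" idea concrete, here is a rough sketch in Python (not BizTalk itself, and not how the BizTalk parser is implemented). It splits a file on the postfix CRLF delimiter and then slices each record by position. The field widths (3, 3 and 30) come from the description above; the field names, the parse_records helper and the padded sample string are my own assumptions for illustration.

```python
# Hypothetical layout: (field name, width). Widths are taken from the text;
# the names and this helper are illustrative assumptions only.
FIELDS = [("RecordType", 3), ("IDCode", 3), ("AddressName", 30)]
CRLF = "\r\n"  # 0x0D 0x0A


def parse_records(text):
    """Split on the postfix CRLF delimiter, then slice each record by position."""
    records = []
    for line in text.split(CRLF):
        # With a postfix delimiter every record ends in CRLF, so the last
        # element produced by split() is empty and is skipped here.
        if not line:
            continue
        record, pos = {}, 0
        for name, width in FIELDS:
            # Fields are left justified and padded with spaces.
            record[name] = line[pos:pos + width].rstrip(" ")
            pos += width
        records.append(record)
    return records


# Assumed sample: an N1 record padded out to its full 36 characters.
sample = "N1 " + "ST " + "MY COMPANY, INC.".ljust(30) + CRLF
print(parse_records(sample))
```

The point of the sketch is only the two-step shape: the delimiter finds the record boundaries, and the positions find the fields inside each record.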
The other schema properties that are important to each positional record are the “Tag Identifier” and the “Tag Offset”. The Tag Identifier is the character string used to identify the record, and the Tag Offset specifies the starting position of the Tag Identifier. Both are vital to the flat-file parser in order to identify each new positional record. In the example shown above, the Tag Identifier is “N1” and the Tag Offset is “0”. One thing to keep in mind is that the Tag Identifier need not be the first field in the record; it just needs to be distinct.
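As a mental model of how a tag and its offset pick out a record type, here is a small Python sketch. It is only an illustration of the idea, not the BizTalk parser's actual logic; the match_tag and classify helpers are hypothetical, and the offset is treated as zero-based to match the “0” used above.

```python
def match_tag(record, tag_identifier, tag_offset=0):
    """Return True when tag_identifier appears in record starting at tag_offset."""
    return record.startswith(tag_identifier, tag_offset)


def classify(record, tags):
    """Return the first tag whose (identifier, offset) pair matches, else None."""
    for tag_identifier, tag_offset in tags:
        if match_tag(record, tag_identifier, tag_offset):
            return tag_identifier
    return None


# The three record types from Figure 1, each tagged at offset 0.
tags = [("N1", 0), ("N3", 0), ("N4", 0)]
print(classify("N4 HOUSTON TX77061", tags))
```

Because the match is anchored at a fixed offset, the tag can sit anywhere in the record as long as the string at that position is distinct from every other record type.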
Working Hint #1: I use a product called TextPad when working with positional and delimited flat-files since it can show the “padding” characters (spaces) as well as the CRLF delimiter. This makes it much easier to “see” everything in your instance document and helps prevent the most common “field length” errors.
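If you don't have an editor that shows whitespace handy, a few lines of Python can give a similar view. This is just a sketch; the show_whitespace name and the marker characters are arbitrary choices of mine.

```python
def show_whitespace(text):
    """Make padding spaces and CR/LF characters visible in an instance document."""
    return (
        text.replace(" ", "·")       # padding spaces become middle dots
            .replace("\r", "<CR>")   # carriage return, 0x0D
            .replace("\n", "<LF>\n") # line feed, 0x0A, keeping the line break
    )


print(show_whitespace("N1 ST MY COMPANY, INC.   \r\n"))
```

Counting the dots at the end of a field is a quick way to check it against the field length in the schema.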
Working Hint #2: The format of Windows and Unix text files differs slightly. In Windows, lines end with both the carriage return and line feed ASCII characters, but Unix uses only a line feed. If you use the BizTalk FTP Adapter to “get” the files (in FTP ASCII mode) from your Unix server, it will convert these from Unix text to Windows text and add the CR to the LF.
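If a Unix file reaches you with bare LF line endings and your schema expects CRLF, the same conversion can be sketched in Python. This is an assumption-laden illustration of the effect described above, not what the FTP Adapter does internally; unix_to_windows is a hypothetical helper.

```python
import re


def unix_to_windows(text):
    """Convert bare LF line endings to CRLF, leaving existing CRLF pairs alone."""
    # The negative lookbehind skips any LF that is already preceded by a CR,
    # so running this twice does not produce CR CR LF.
    return re.sub(r"(?<!\r)\n", "\r\n", text)


unix_text = "N1 ST MY COMPANY, INC.\nN4 HOUSTON TX77061\n"
print(repr(unix_to_windows(unix_text)))
```

When handling real files, open them in binary mode or with newline="" so that Python does not translate the line endings on its own before you get to see them.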
Working Hint #3: Read the posts in the microsoft.public.biztalk.nonxml newsgroup religiously! This is your best source of help in developing flat-file schemas correctly and the folks from MS are extremely helpful.