Batch and File Protocols (BHS and FHS)

When real-time exchange is the wrong model

A single HL7 V2 message begins with an MSH segment, and MLLP frames one such message at a time across a TCP connection for real-time delivery. That model fits events that must be acted on the moment they happen: an admission, a new order, a result that just verified. But not every exchange is event-driven. A laboratory might post a full day of finalized results in a single overnight run. A registry might send a reconciliation file that restates every patient record so the receiver can detect drift. A billing system might hand off thousands of charges once per night.

For these workloads, opening a connection and shipping one message at a time is awkward. Real-time delivery optimizes for low latency on each message, whereas bulk transfer cares about throughput, restartability, and confirming that the whole set arrived intact. To serve that need, HL7 V2 defines a batch and file protocol: a way to wrap many ordinary messages inside a labeled envelope that can be moved as a single unit [1].

The envelope hierarchy

The protocol nests three levels. A file contains one or more batches, and each batch contains one or more messages. Each message is exactly what it would be on its own: a normal V2 message beginning with MSH. The new pieces are the wrapper segments that mark where each level starts and ends.

  • FHS / FTS — File Header Segment and File Trailer Segment, the outermost wrapper.
  • BHS / BTS — Batch Header Segment and Batch Trailer Segment, wrapping the messages of one batch.

Laid out, the structure looks like this:

FHS                       (file header)
 BHS                      (batch header)
  MSH ... (message 1)
  MSH ... (message 2)
 BTS                      (batch trailer; often carries a message count)
FTS                       (file trailer; often carries a batch count)
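
As a rough illustration of that layout, the following Python sketch wraps already-built messages in the envelope. The function name wrap_in_file, the default application names LAB and EHR, and the choice to populate only a few header fields are assumptions made for this example; a production sender would fill in whatever fields its interface agreement requires.

```python
from datetime import datetime

SEG_END = "\r"  # HL7 V2 segments end with a carriage return


def wrap_in_file(batches, sender="LAB", receiver="EHR", when=None):
    """Wrap lists of complete V2 messages in FHS/BHS ... BTS/FTS.

    `batches` is a list of batches; each batch is a list of message strings.
    Hypothetical helper: only a handful of header fields are populated.
    """
    stamp = (when or datetime.now()).strftime("%Y%m%d%H%M%S")
    # Same leading layout as MSH: separators, sending application,
    # sending facility (blank), receiving application,
    # receiving facility (blank), creation timestamp.
    header_tail = f"|^~\\&|{sender}||{receiver}||{stamp}"

    segments = ["FHS" + header_tail]
    for messages in batches:
        segments.append("BHS" + header_tail)
        for message in messages:
            # Each message is left untouched; it is only split into segments.
            parts = message.replace("\n", "\r").split("\r")
            segments.extend(p for p in parts if p)
        segments.append(f"BTS|{len(messages)}")   # message count for this batch
    segments.append(f"FTS|{len(batches)}")        # batch count for the file
    return SEG_END.join(segments) + SEG_END
```

Because the trailers are written last and carry the counts, a file that ends without them is recognizably incomplete.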

The header segments mirror much of what MSH carries (sending and receiving applications, a timestamp, and identifiers) but describe the batch or file as a whole rather than any one message [1]. A single file commonly holds just one batch, but the two levels exist so that, for example, results from several departments can travel together while each department’s messages stay grouped.
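
To make the parallel with MSH concrete, here is a small sketch that reads the shared fields out of an FHS or BHS segment. The helper name read_header and the dictionary keys are invented for this example; the field positions used (3 through 7 for applications, facilities, and timestamp, and 11 for the control ID) follow the V2 segment definitions, but check them against the version your interface targets.

```python
def read_header(segment):
    """Pull the MSH-like fields out of one FHS or BHS segment (sketch)."""
    separator = segment[3]            # the character right after "FHS"/"BHS"
    fields = segment.split(separator)

    def field(n):
        # Field 1 is the separator itself, so field n sits at index n - 1.
        return fields[n - 1] if len(fields) > n - 1 else ""

    return {
        "segment": fields[0],
        "sending_application": field(3),
        "sending_facility": field(4),
        "receiving_application": field(5),
        "receiving_facility": field(6),
        "created": field(7),
        "control_id": field(11),      # batch or file control ID
    }


header = read_header("BHS|^~\\&|LAB|HOSP|EHR|HOSP|20240101020000||||B0001")
print(header["sending_application"], header["created"], header["control_id"])
```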

Counts and completeness checking

The trailer segments do more than mark an ending. BTS typically records the number of messages in its batch, and FTS typically records the number of batches in the file. These counts give the receiver a built-in integrity check: after parsing, it compares what it actually read against what the trailer declared.

This matters because bulk files travel without the per-message handshake that real-time exchange provides. A batch is usually delivered as a file dropped onto shared storage or pushed over SFTP, not framed message by message over MLLP. If a transfer is cut off, a disk fills, or a process writes only part of its output, the result is a truncated file. A missing trailer or a count mismatch (fewer messages than BTS promised, or fewer batches than FTS promised) is the signal that the file is partial and must not be processed as if it were complete [2].
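
A receiver-side check along these lines might look like the sketch below. The function name check_envelope and the wording of the problem messages are made up for illustration, and the code assumes the default "|" field separator and that the count is the first field of each trailer. Because the counts are optional, an empty count is skipped rather than treated as an error.

```python
def check_envelope(text):
    """Return a list of problems; an empty list means the envelope is consistent."""
    segments = [s.strip() for s in text.replace("\n", "\r").split("\r")]
    segments = [s for s in segments if s]

    problems = []
    saw_fhs = saw_fts = open_batch = False
    batches_seen = 0
    messages_in_batch = 0

    def declared_count(fields):
        value = fields[1].strip() if len(fields) > 1 else ""
        return int(value) if value.isdigit() else None

    for seg in segments:
        fields = seg.split("|")
        seg_id = fields[0]
        if seg_id == "FHS":
            saw_fhs = True
        elif seg_id == "BHS":
            open_batch = True
            batches_seen += 1
            messages_in_batch = 0
        elif seg_id == "MSH":
            messages_in_batch += 1
        elif seg_id == "BTS":
            declared = declared_count(fields)
            if declared is not None and declared != messages_in_batch:
                problems.append(
                    f"BTS declares {declared} messages, read {messages_in_batch}"
                )
            open_batch = False
        elif seg_id == "FTS":
            saw_fts = True
            declared = declared_count(fields)
            if declared is not None and declared != batches_seen:
                problems.append(
                    f"FTS declares {declared} batches, read {batches_seen}"
                )

    if open_batch:
        problems.append("BHS without a closing BTS; batch may be truncated")
    if saw_fhs and not saw_fts:
        problems.append("FHS without a closing FTS; file may be truncated")
    return problems
```

A caller would run this check before applying any message from the file and reject or quarantine the file if the returned list is not empty.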

The trade-off, and what it costs operationally

Batch processing trades latency for throughput. Nothing in a nightly file is timely in the way an admission alert is; in exchange, the sender produces one self-describing artifact, the receiver handles it in one pass, and the whole exchange is easy to schedule, retry, and audit. For high-volume, non-urgent data, that is the better deal.

The shift to files changes what can go wrong. Because a batch is reprocessed as a unit (after a failure, or because someone resends yesterday’s file), ordering and idempotency become real concerns. If messages within a batch depend on each other (for example, a patient must be created before an order can reference that patient), the receiver must honor that order. And if the same file can arrive twice, applying it twice must not double-count charges or duplicate records; safe reprocessing usually relies on stable message identifiers so repeats are recognized and ignored. Designing for partial files, replays, and ordering up front is what separates a batch interface that survives a bad night from one that quietly corrupts data.
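
One way to make replays harmless is to key each message on its message control ID (MSH-10) and skip any ID that has already been applied. The sketch below assumes that the sender keeps MSH-10 stable across resends and unique per message, which is an interface agreement rather than something the envelope guarantees. The names message_control_id and apply_batch, and the in-memory set standing in for a durable store, are illustrative only.

```python
def message_control_id(message):
    """Return MSH-10 from a message whose first segment is MSH, or ""."""
    msh = message.replace("\n", "\r").split("\r")[0]
    if len(msh) < 4 or not msh.startswith("MSH"):
        return ""
    fields = msh.split(msh[3])  # msh[3] is the field separator (MSH-1)
    # MSH-1 is the separator itself, so MSH-10 sits at index 9.
    return fields[9] if len(fields) > 9 else ""


def apply_batch(messages, seen_ids, handler):
    """Apply messages in file order, skipping control IDs already seen.

    `seen_ids` is a set here; a real receiver would use durable storage and
    record the ID in the same transaction as the handler's side effects.
    Returns the number of messages actually applied.
    """
    applied = 0
    for message in messages:          # file order is preserved
        control_id = message_control_id(message)
        if control_id and control_id in seen_ids:
            continue                  # replayed message: ignore it
        handler(message)
        if control_id:
            seen_ids.add(control_id)
        applied += 1
    return applied
```

With this in place, feeding the same file through apply_batch a second time applies nothing, because every control ID is already in seen_ids.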

References

  1. HL7 Standards — Section 1d: Version 2 (V2). HL7 International.
  2. Tim Benson, Grahame Grieve. Principles of Health Interoperability: FHIR, HL7 and SNOMED CT. 4th ed. Springer. 2021.