EDIFACT format: sample order file

The task in brief. The language in use is PHP, but the answer need not be tied to a language; I just want to grasp how the mechanism works.
There are 10 companies that will be sending us EDI files. Their standards differ (they have, however, provided specifications). Industry: freight transport. We have never had EDI here and I have not worked with EDI before, which is why I am asking for any hints at all. Thanks in advance.
I want to write classes with an abstract factory so that I can hand them an EDI file and have it chewed through and parsed correctly. So here is a short list of questions =)
About the EDI file itself.
1) UNH (as far as I understand, this is like the start of an envelope). Some companies send everything in a single UNH (the information about every shipment), while others send each shipment in its own UNH - UNT pair.
2) The specification has a "Position" field, but I never fully understood how best to work with it. I understand that it determines the order of the fields, but many fields are skipped (Conditional).
3) How do I determine exactly which segment group a given line (data segment) belongs to? I am also interested in nested segment groups.

About the code
Since not all of the companies use EDIFACT, there is another standard (I could not find a proper name for it) with a structure like this:
18AAAA1234567 7229932 2H000386CTNS 0000000752200006.86
Since the original task was to parse only one format (the line above), which was used fairly rarely, and the deadline was tight, the code was thrown together in a hurry.
The idea was roughly this:
There is a class containing a "specification" array, which holds arrays keyed by code

and so on for every code in the specification. We then went through each line, took the first 2 characters (the code) and looked up the start and end positions from which to extract the data.
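A minimal sketch of that lookup idea, written in Python since the language does not matter here. The field names and offsets are hypothetical (the real specification array is not shown), and `parse_line` is just an illustrative helper name:

```python
# Hypothetical layout: the field names and offsets are illustrative only.
# Each record code maps field names to (start, end) offsets, end exclusive.
SPECIFICATION = {
    "18": {
        "container": (2, 13),
        "reference": (14, 21),
        "quantity": (24, 30),
        "package_type": (30, 34),
    },
}


def parse_line(line, specification=SPECIFICATION):
    """Cut one fixed-width line into named fields using its record code."""
    code = line[:2]  # the first 2 characters select the layout
    layout = specification.get(code)
    if layout is None:
        raise ValueError(f"no layout for record code {code!r}")
    fields = {name: line[start:end].strip() for name, (start, end) in layout.items()}
    fields["code"] = code
    return fields


record = parse_line("18AAAA1234567 7229932 2H000386CTNS 0000000752200006.86")
print(record["container"], record["quantity"], record["package_type"])
```

Keeping the layout as plain data means a new record code only needs a new dictionary entry, not new parsing code.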
When we were asked to add EDIFACT support, it was done under the same burning deadlines and in the same hurried way.

During the line-by-line pass we simply called this method and pulled out all the required values with regular expressions.
I do not need a 100% universal solution for every case. I would just like to hear opinions and advice from professionals who might suggest how best to work with these EDI files, because I have not found any decent literature. The only thing that gave me at least some idea of what EDI is and how to deal with it was
"A GXS TUTORIAL
EDIFACT Standards Overview Tutorial."

An EDI file may seem like a random jumble of characters at first glance. On closer examination, however, a well-thought-out schematic structure is revealed, which makes the processing of the message by computer programs possible in the first place. Behind an EDI file is a specific EDI standard that dictates how the file must be structured. Typical standard formats that can be used here are, for example, EDIFACT, XML, CSV files or flat file formats.

An EDI standard usually builds on the following four principles:

Syntax rules

The syntax rules define the allowed characters and the allowed order in which the individual characters may be used.

Codes

Within an EDI file most of the information is accurately identified using codes – for example, currency codes, country codes, but also codes for identifying a particular date format, etc.

Message design

The message design defines the structure of a particular message type. Message types include, for example, purchase orders, delivery notes, invoices, etc.

Identification of values in an EDI message

Depending on the standard used, there are three ways in which a value can be identified in an EDI file:

a) Implicitly by the position in the message. This technique is used in flat file formats and CSV. The exact position and semantic of a given value is defined in the accompanying documentation. For example: A line beginning with the characters 100 stands for the header line. The characters from position 4 – 17 in a line beginning with 100 represent the GLN of the sender, etc.

b) Implicitly through the use of separators. Using a set of predefined separator characters, a message structure is explicitly defined. The message structure defines which building blocks are to be used and in which order the building blocks must be aligned to form a correct EDI message, e.g., an orders message. This technique is used in EDIFACT files.

c) Explicitly through the application of metadata. With the help of additional data, the meaning of the individual information fragments in a file is specified more precisely. This technique is used in XML, for example, where the actual information is enclosed by markup elements and attributes.
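The three approaches can be compared in a short Python sketch that pulls the same GLN out of three representations. The flat-file line, the segment string and the XML element names are all made up for illustration; only the idea (position, separators, markup) comes from the text above:

```python
import xml.etree.ElementTree as ET

# a) By position: characters 4-17 (1-based) of a "100" line hold the sender GLN.
flat_line = "100" + "8773456789012".ljust(14) + "REST OF HEADER"
by_position = flat_line[3:17].strip()

# b) By separators: a hypothetical EDIFACT-style segment, with "+" between
#    data elements and ":" between the components of a composite element.
segment = "NAD+SU+8773456789012::9"
by_separator = segment.split("+")[2].split(":")[0]

# c) By markup: the element names are hypothetical.
document = "<Header><SenderGLN>8773456789012</SenderGLN></Header>"
by_markup = ET.fromstring(document).findtext("SenderGLN")

print(by_position, by_separator, by_markup)
```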

UN/EDIFACT – A standard of the United Nations

UN/EDIFACT is the abbreviation for United Nations Electronic Data Interchange for Administration, Commerce, and Transport. The standardization organization behind UN/EDIFACT is UN/CEFACT (United Nations Center for Trade Facilitation and Electronic Business). UN/EDIFACT is one of the most widely used EDI standards today alongside ANSI ASC X12. ANSI ASC X12 is very common in the North American market, whereas in Europe UN/EDIFACT prevails (or one of the various subdialects).

As the following figure shows, the UN/EDIFACT standard is based on four different pillars.

Syntax

The syntax defines the exact rules for message building, as well as the characters used to separate individual message segments and elements.

Data elements

A data element is the smallest unit in an EDIFACT file.

Segments

Groups of similar data elements form so-called segments.

Messages

Messages represent an ordered sequence of segments – for example, a DESADV file represents a delivery note.
Additionally, the EDIFACT standard defines message delivery requirements, for example the exact structure of a specific message exchange, which in turn may contain several EDIFACT files.

EDIFACT standards

The exact structure of an EDIFACT message is defined in official standard documents, which are also available online. UN/CEFACT approves two EDIFACT standard versions per year, each marked with the year followed by “A” (for the first release of a year) and “B” (for the second release of a year). D01B is therefore the second standard version from the year 2001.

In addition, separate subsets of the official UN/EDIFACT standard exist for the various industry sectors and domains. In the consumer goods sector the EANCOM subset is very common, which is also used for the example of this article. EANCOM is the world’s most widely used standard for electronic data interchange. The most frequently used EANCOM message types are ORDERS, DESADV and INVOIC.

Structure of an EDIFACT file

An EDIFACT file follows an exact hierarchy, which is denoted below.

The topmost unit of an EDIFACT message is the Interchange (UNB), which can be thought of as an envelope. The interchange defines the message recipient, the message sender, the message number, the message date, etc.

An interchange can in turn contain several individual Groups (UNG) representing message groups. Alternatively, an interchange can also contain individual messages (concrete messages). Mixing of individual messages and message groups within an interchange is not allowed.

A message itself is enclosed by a header segment (UNH) and a trailer segment (UNT). A message group is enclosed by a UNG and a UNE segment.
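As a rough sketch of how this envelope structure can be used when reading a file, the Python function below cuts a list of segments into messages at each UNH ... UNT pair. The sample interchange is invented (it reuses identifiers from this article), `split_messages` is a hypothetical helper, and the naive split on `'` assumes the release character is not used:

```python
def split_messages(segments):
    """Group segments into messages delimited by UNH ... UNT."""
    messages, current = [], None
    for segment in segments:
        tag = segment[:3]
        if tag == "UNH":
            if current is not None:
                raise ValueError("UNH found inside an open message")
            current = [segment]
        elif current is not None:
            current.append(segment)
            if tag == "UNT":
                messages.append(current)
                current = None
        # Segments outside UNH..UNT (UNB, UNG, UNE, UNZ) belong to the envelope.
    if current is not None:
        raise ValueError("message not closed by UNT")
    return messages


raw = (
    "UNB+UNOA:3+8773456789012:14+9123456789012:14+140218:1552+MSGNR4711'"
    "UNH+1+DESADV:D:96A:UN:EAN005'BGM+351+DOCNR4712+9'UNT+3+1'"
    "UNH+2+DESADV:D:96A:UN:EAN005'BGM+351+DOCNR4713+9'UNT+3+2'"
    "UNZ+2+MSGNR4711'"
)
segments = [s for s in raw.split("'") if s]
messages = split_messages(segments)
print(len(messages))
```

Whether a partner sends one message per shipment or everything in a single message, the caller can then process each returned message in the same loop.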

Within a message, there are several segments and segment groups, which represent individual related message parts (for example, information about the biller, a specific invoice line, etc.). A segment group is initiated by a so-called trigger element.

Segments consist of data elements and composite data elements.

The smallest units of an EDIFACT file are simple data elements.

Simple Data Elements

Simple data elements form the basic building blocks of an EDIFACT file and represent – as the name already suggests – simple data values.

An example of a simple data element is a party name.

Description: Name of a party.

Representation: an..35

Example:

The abbreviation an..35 means that a maximum of 35 alphanumeric characters may be used for the party name.
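A small Python sketch of how such a representation string could be checked. `matches_representation` is a hypothetical helper, the party name is invented, and the character checks are deliberately simplified (real numeric elements may also carry a decimal mark and a sign):

```python
import re


def matches_representation(value, representation):
    """Check a value against a representation such as 'an..35', 'n..17' or 'an3'."""
    match = re.fullmatch(r"(an|a|n)(\.\.)?(\d+)", representation)
    if match is None:
        raise ValueError(f"unsupported representation {representation!r}")
    kind, variable, limit = match.group(1), match.group(2), int(match.group(3))
    # ".." means "up to limit characters"; without it the length is fixed.
    length_ok = 1 <= len(value) <= limit if variable else len(value) == limit
    if kind == "n":
        chars_ok = value.isascii() and value.isdigit()  # simplification: digits only
    elif kind == "a":
        chars_ok = value.isascii() and value.replace(" ", "").isalpha()
    else:
        chars_ok = True  # "an": any permitted character
    return length_ok and chars_ok


print(matches_representation("ACME Ltd", "an..35"))
print(matches_representation("X" * 36, "an..35"))
```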

Simple data element with code list

A simple data element with a code list does not accept free text; its value must be taken from a list of predefined values (= codes).

An example of a simple data element with a code list is a coded document name.

Description: Code specifying the document name.

Representation: an..3

Example:

Composite Data Element

A composite data element consists of individual simple data elements and represents data with additional metadata (= additional data describing the actual data).

The individual components within a composite data element (typically simple data elements and simple data elements with code lists) are separated using the : character.

An example of a composite data element is a duty/tax or fee type.

Tag	Name	Status	Representation
5153	Duty or tax or fee type name code	C	an..3
1131	Code list identification code	C	n..17
3055	Code list responsible agency code	C	an..3
5152	Duty or tax or fee type name	C	an..35

Example:

The following example shows an exemplary duty/tax/fee type.

AAA = Petroleum tax
52 = Value added tax identification
1 = CCC (Customs Co-operation Council)
tax type xyz = Free text description of tax
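A minimal Python sketch of splitting such a composite into its components. The string `AAA:52:1:tax type xyz` is assembled here from the four values listed above, `parse_composite` is a hypothetical helper, and the release character is ignored for brevity:

```python
# Component tags of the duty/tax/fee type composite, in table order.
COMPONENTS = ["5153", "1131", "3055", "5152"]


def parse_composite(text, component_ids, separator=":"):
    """Map the ':'-separated values of a composite onto its component tags."""
    values = text.split(separator)
    if len(values) > len(component_ids):
        raise ValueError("more components than the composite defines")
    # Trailing conditional components may simply be omitted by the sender.
    values += [""] * (len(component_ids) - len(values))
    return dict(zip(component_ids, values))


tax_type = parse_composite("AAA:52:1:tax type xyz", COMPONENTS)
print(tax_type)
```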

Segments

Segments consist of simple data elements and composite data elements and represent compound data, such as an address.

The individual data elements in a segment are separated by the + character. A segment starts with the three-letter segment identifier and ends with the ' character.

The exact structure of a segment is described by means of so-called segment tables. The following segment table describes the structure of the LIN (line item) segment.

M indicates “mandatory”, i.e., the simple data element or composite data element must be specified. C stands for “conditional” and means that the data element can optionally be specified. The values an..3, an..17, etc. represent the maximum number of permitted alphanumeric characters.

Assume we want to represent the following line item information using the LIN segment.

Pineapples
Line item number: 2
GTIN: 9393398439325
Type: article has been added

The resulting EDIFACT segment is:
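A minimal Python sketch of how such a segment could be split back into its data elements. It assumes the string `LIN+2+1+9393398439325:EN` (action code 1 taken to mean "added", and EN as the qualifier for a GTIN; both are assumptions here), the default separators, and hypothetical helper names:

```python
def split_escaped(text, separator, release="?"):
    """Split on separator, ignoring separators protected by the release character."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair for later passes
            i += 2
            continue
        if ch == separator:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def unescape(text, release="?"):
    """Drop release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_segment(segment):
    """Return (tag, elements); every element is a list of component values."""
    tag, *elements = split_escaped(segment, "+")
    return tag, [[unescape(c) for c in split_escaped(e, ":")] for e in elements]


print(parse_segment("LIN+2+1+9393398439325:EN"))
```

Splitting with plain string functions works until a value contains `?+` or `?:`, which is why the sketch walks the characters instead.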

Segment groups are used to aggregate several individual segments into groups of related segments.

For example, the following segment group allows contact details to be specified by combining the segments CTA (Contact information) and COM (Communication contact).

Some possible segment sequences would be for example:

As indicated by C 5, the segment group itself is optional and may occur up to 5 times. The segment group is initiated by a so-called trigger segment: the first segment within the group, which usually has cardinality M 1 (that is, it must occur exactly once).
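A rough Python sketch of how a parser might assign segments to such a group: a new group instance starts at every trigger segment and collects the member segments that follow it. The sample segments and `collect_groups` are hypothetical, and a real parser would take the trigger, members and repeat limit from the message specification rather than hard-code them:

```python
def collect_groups(segments, trigger, members, max_repeat):
    """Collect instances of one segment group from a flat list of segments."""
    groups, current = [], None
    for segment in segments:
        tag = segment[:3]
        if tag == trigger:
            current = [segment]  # the trigger segment opens a new instance
            groups.append(current)
        elif tag in members and current is not None:
            current.append(segment)
        else:
            current = None  # any other segment closes the open instance
    if len(groups) > max_repeat:
        raise ValueError(f"group occurs {len(groups)} times, limit is {max_repeat}")
    return groups


sample = [
    "NAD+SU+8773456789012::9",
    "CTA+IC+:J. Smith",
    "COM+0043123456:TE",
    "COM+smith@example.com:EM",
    "CTA+OC+:A. Jones",
    "COM+0043654321:TE",
    "LIN+1++4260304623843:EN",
]
contacts = collect_groups(sample, trigger="CTA", members={"COM"}, max_repeat=5)
print(contacts)
```

Because the same tag can appear in several groups of one message, the tag alone is not enough in general; the position within the message definition decides, and nested groups can be handled by applying the same idea recursively.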


Messages

A message is a related sequence of segments and represents a concrete business document, for example a DESADV message (dispatch advice). The following section shows the first part of a DESADV definition.

EDIFACT sample dispatch advice

In the following EDIFACT dispatch advice example we will represent the delivery structure shown in the figure below.

Based on the previous concepts, the example now shows a concrete EDIFACT message for a dispatch advice. To increase readability, line breaks have been added after each segment. In a regular EDIFACT file no line breaks shall be used.

In the following, we examine the structure of the individual segments in detail.

The UNA segment stands for the “Service String Advice” and describes the separators used in the message. Usually, the following separators are used (syntax version 3).

: Composite element delimiter
+ Data element delimiter
. Decimal mark (point or comma)
? Release character (escape character)
(space) Reserved, remains empty
' Segment delimiter
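A minimal Python sketch of reading these separators from the start of an interchange, falling back to the defaults when no UNA segment is present. `Separators` and `read_separators` are hypothetical names:

```python
from typing import NamedTuple


class Separators(NamedTuple):
    component: str = ":"
    element: str = "+"
    decimal: str = "."
    release: str = "?"
    reserved: str = " "
    segment: str = "'"


def read_separators(interchange):
    """Return (separators, remaining text) for an interchange string."""
    if interchange.startswith("UNA"):
        if len(interchange) < 9:
            raise ValueError("UNA must be followed by six service characters")
        # The six characters after "UNA" are positional, in the order above.
        return Separators(*interchange[3:9]), interchange[9:]
    return Separators(), interchange


separators, rest = read_separators("UNA:+.? 'UNB+UNOA:3")
print(separators, rest)
```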

UNB segment

The UNB segment represents the interchange header, and contains information about the message sender, the message recipient, the message date, and so on.

UNOA = UN/ECE level A; complies with ISO 646 – also called International Alphabet No. 5 – except lowercase letters.
UNOB = UN/ECE level B; like UNOA but also lowercase letters.
UNOC = UN/ECE level C; complies with ISO8859-1
UNOD = UN/ECE level D; complies with ISO8859-2
UNOE = UN/ECE level E (Cyrillic)
UNOF = UN/ECE level F (Greek)

3 identifies the UN/EDIFACT syntax version. There are four different UN/EDIFACT syntax versions. Nowadays mostly syntax version 3 and 4 are used.

8773456789012:14 corresponds to the sender of the message.

9123456789012:14 corresponds to the recipient of the message.

The identifier 14 indicates that the number is a GLN.

140218:1552 stands for February 18, 2014, 15:52.

MSGNR4711 is the unique number of the interchange. It is used in particular in the context of message routing for the unique identification of a message transmission.

The last 1 indicates that the test indicator is set and that the given interchange is a test message. This information is also important for message routing, because the recipient can distinguish production messages from test messages and process them differently.
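The values described above can be read out with a short Python sketch. The segment string is assembled here from those values (with the test indicator placed as the eleventh data element), `parse_unb` is a hypothetical helper, and the default separators without release characters are assumed:

```python
from datetime import datetime


def parse_unb(segment):
    """Extract the main fields of a UNB segment (terminator already removed)."""
    elements = [element.split(":") for element in segment.split("+")]
    elements += [[""]] * (12 - len(elements))  # pad omitted trailing elements
    _, syntax, sender, recipient, prepared, control_ref = elements[:6]
    return {
        "syntax_id": syntax[0],
        "syntax_version": syntax[1],
        "sender": sender[0],
        "recipient": recipient[0],
        # YYMMDD:HHMM, joined and parsed with a two-digit year
        "prepared_at": datetime.strptime("".join(prepared), "%y%m%d%H%M"),
        "control_reference": control_ref[0],
        "test": elements[11][0] == "1",
    }


unb = parse_unb(
    "UNB+UNOA:3+8773456789012:14+9123456789012:14+140218:1552+MSGNR4711++++++1"
)
print(unb["prepared_at"], unb["test"])
```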

The UNH segment represents the header of a document. The number 1 is the unique number of the document within the interchange. The number is assigned by the sender.

DESADV:D:96A:UN indicates that the document is a dispatch advice and that the document type is from UN/EDIFACT Directory D96A. EAN005 indicates that it is an EANCOM document type and identifies the EANCOM version of the D96A EANCOM standard used.

BGM segment

The BGM (beginning of message) segment initiates the actual document.

351 = Despatch advice
35E = Returns advice (EAN Code)
YA5 = Cross dock despatch advice (EAN Code)

DOCNR4712 is the unique number of the document given by the sender.

DTM segment

The DTM segment is used to specify date and time information.

The first part of this composite data element identifies the type of the date (date/time/period qualifier). For example:

137 = Document/message date/time. Date/time when a document/message has been issued.
2 = Delivery date/time, requested date on which buyer requests goods to be delivered

The second part represents the actual date value:

20180218, for example, stands for February 18, 2018.

The third part specifies the pattern for the date (date/time/period format qualifier).

102 corresponds to CCYYMMDD
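A minimal Python sketch of applying the format qualifier when reading a DTM segment. The segment string is assembled from the values above, `parse_dtm` is a hypothetical helper, and qualifier 203 (CCYYMMDDHHMM) is added as a second entry that is not part of the example:

```python
from datetime import datetime

# Format qualifier -> strptime pattern.
DATE_FORMATS = {
    "102": "%Y%m%d",      # CCYYMMDD
    "203": "%Y%m%d%H%M",  # CCYYMMDDHHMM
}


def parse_dtm(segment):
    """Return (date/time qualifier, datetime) for a DTM segment without terminator."""
    _, composite = segment.split("+")
    qualifier, value, format_qualifier = composite.split(":")
    return qualifier, datetime.strptime(value, DATE_FORMATS[format_qualifier])


print(parse_dtm("DTM+137:20180218:102"))
```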

The NAD segment is used to indicate the names and addresses of the companies involved. Instead of names and addresses, however, in most cases the unique identification by means of numbers, such as the GLN (global location number), is used.

BY = Buyer
DP = Delivery party
SU = Supplier
WH = Warehouse keeper

The second part represents the 13-digit GLN, and 9 indicates that the provided number is a GLN.

A dispatch advice is structured using the concept of consignment packing sequences. Thereby, a CPS represents a specific layer in the hierarchy of a shipment, e.g., a pallet, a box, a carton, etc.

The number 1 indicates the hierarchy level, which in this case is level 1. Further CPS sequences may then refer to the upper layer, using the second digit. For example:

indicates consignment packing sequence 2. The parent layer is consignment packing sequence 1.
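A rough Python sketch of rebuilding this hierarchy from the CPS segments of a message. The segment strings are invented for illustration, and `build_cps_tree` is a hypothetical helper that assumes the default separators:

```python
def build_cps_tree(segments):
    """Map each parent sequence number (None for the top level) to its children."""
    children = {}
    for segment in segments:
        if not segment.startswith("CPS+"):
            continue
        parts = segment.split("+")
        sequence = parts[1]
        # The second data element, when present, names the parent sequence.
        parent = parts[2] if len(parts) > 2 and parts[2] else None
        children.setdefault(parent, []).append(sequence)
    return children


tree = build_cps_tree(["CPS+1", "PAC+1++201", "CPS+2+1", "PAC+1++PK", "CPS+3+1"])
print(tree)
```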

The PAC segment is used to specify the number and the type of packages. In the example above 1 packaging of type PK (package) is denoted. Other packaging types are for instance:

09 = Returnable pallet (EAN Code)
201 = Pallet ISO 1 – 1/1 EURO Pallet (EAN Code)
PK = Package
SL = Slipsheet

The PCI segment is used to specify markings, which are used to uniquely identify the packages in the shipment. In retail mostly SSCC (serial shipping container codes) are used.

The code 33E indicates: marked with serial shipping container code.

GIN stands for goods identity number and is used to specify the code that is attached to the packaging.

The LIN segment represents a line item position in a dispatch advice. Thereby, 1 is the line item number assigned by the sender.

The following number 4260304623843 represents the item number. The third part EN indicates the type of number – in this case a GTIN (global trade item number) was used.

QTY segment

The quantity segment is used for the definition of shipping quantities.

The first part of the composite data element indicates the type of quantity.

12 = Despatch quantity
21 = Ordered quantity
59 = Numbers of consumer units in the traded unit

The second part is used to provide the actual quantity – in this case 110 .

The third part indicates the unit of measure in which the quantity is provided. Possible values are for instance:

PCE = Piece
KGM = Kilogram
PND = Pound

RFF segment

The RFF segment is used to provide reference numbers.

ON indicates a reference to an order message. 8493848394 is the referenced order number and 1 is the position number in the referenced order message.

CNT segment

The CNT segment is used to specify control values that can be used to check the integrity of the message upon receipt.

In the example above, 2 indicates that the following value represents the number of line items in the message. The control value 3 is correct, since the message actually contains three line items.

UNT segment

The UNT segment represents the message trailer.

The first number, 34, is the number of segments in the message from the UNH to the UNT segment (both included) and thus also serves as a control value. The number 1 must be the same message number as used in the UNH segment. This also serves to check the integrity of the EDI message.

UNZ segment

The UNZ segment represents the interchange trailer and is the last segment in an EDI interchange.

The first number 1 represents the number of messages contained in the interchange. The second entry MSGNR4711 is the same interchange number as in the UNB segment and also serves to check the integrity of the message.
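The control values in CNT, UNT and UNZ lend themselves to simple checks on receipt. A minimal Python sketch follows; the sample segments are invented, the helper names are hypothetical, and the default separators without release characters are assumed:

```python
def check_message(message):
    """message: list of segment strings from UNH to UNT inclusive."""
    unh = message[0].split("+")
    unt = message[-1].split("+")
    # UNT carries the segment count (UNH and UNT included) and the UNH reference.
    return int(unt[1]) == len(message) and unt[2] == unh[1]


def check_line_count(message):
    """Compare a CNT+2 control total with the number of LIN segments."""
    lines = sum(1 for segment in message if segment.startswith("LIN+"))
    for segment in message:
        if segment.startswith("CNT+2:"):
            return int(segment.split(":")[1]) == lines
    return True  # no CNT+2 control total present


def check_interchange(unb, messages, unz):
    """UNZ repeats the message count and the interchange reference from UNB."""
    reference = unb.split("+")[5]
    count, unz_reference = unz.split("+")[1:3]
    return int(count) == len(messages) and unz_reference == reference


message = [
    "UNH+1+DESADV:D:96A:UN:EAN005",
    "BGM+351+DOCNR4712+9",
    "LIN+1++4260304623843:EN",
    "CNT+2:1",
    "UNT+5+1",
]
unb = "UNB+UNOA:3+8773456789012:14+9123456789012:14+140218:1552+MSGNR4711"
print(check_message(message), check_line_count(message),
      check_interchange(unb, [message], "UNZ+1+MSGNR4711"))
```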

Summary

Although incomprehensible at first glance, a closer look at an EDIFACT message shows the information hidden inside. In contrast to markup-based approaches such as XML, the size of an EDIFACT message is very small, since only coded information is transmitted and no space-intensive markup is used.

Although one might think that space does not matter with the storage capacities and Internet bandwidths available today, EDI message size is of major relevance. For example, Deutsche Telekom still calculates X.400 traffic on a kilobyte basis.

EDIFACT and its subsets are the most widely used EDI data exchange formats of companies worldwide. As a supplier to large companies or as a buyer of large companies, one therefore often has no choice but to use EDIFACT.


Document standards are an essential part of electronic data interchange (EDI). In short, EDI standards (aka EDI file formats) are the specific guidelines that govern the content and format of B2B documents such as orders, invoices and order responses. These documents are then sent via EDI protocols to the service provider / business partner.

How EDI file formats work

Sending documents according to an EDI standard ensures that the machine receiving the message is able to interpret the information correctly as each data element is in its expected place. Without such standards the receiver’s system will be unable to identify what part of the message is what, making automated data exchange impossible.

Although EDI documents may seem like a random mix of letters and symbols, all EDI messages conform to very strict rules. Typically EDI standards are based on the following four principles:

Syntax

Syntax rules determine what characters can be used and in what order.

Codes

Codes are used to identify common information such as currencies, country names or date formats.

Message designs

The message design defines how a particular message type (e.g. invoice or purchase order) is structured and what subset of rules from the prescribed syntax it uses.

Identification values

The means by which values in an EDI file are identified, e.g. by their position in the file or via the use of separators. This changes from standard to standard.

Most EDI standards also include the following three components:

Elements – The smallest part of a message, providing submitted values (e.g. “50” or “KGM” or “Potatoes”)
Segments – A collection of elements or values logically combined to convey a piece of information (e.g. Quantity 50 kilograms of potatoes)
Transaction sets – A collection of segments, composing a message (e.g. an Invoice for the sale of 50 kilograms of potatoes)

Essentially different formats are like different languages, with the elements and segments of a certain standard mirroring the words and sentences of a regular language.

A brief history of EDI document standards

In the very early days of EDI it quickly became evident that document standards were required to avoid confusion and improve the efficiency of supply chain communication, even while it was still paper-based. Once file transfer between computers (FTP) became possible, the very first EDI standards were published in 1975 by the Transportation Data Coordinating Committee (an organisation formed by US automotive transport organisations in 1968). In 1981 the American National Standards Institute then published the first multi-industry national standard, X12. In turn this was followed by the creation by the UN of a global standard, EDIFACT, in 1985.

No one attempt to unify standards has ever been completely successful, however. As technology has evolved and industry specific needs have become increasingly disparate, new standards have steadily been introduced over the decades. Somewhat counterintuitively, therefore, today there is no such thing as a single all-encompassing EDI standard for every document type. Instead, businesses choose their preferred document standard from a number of options (usually opting for the one most widely used in their industry). When trading with partners using different standards, businesses then have to ensure that their messages are correctly converted to the recipient’s required format. This process is called mapping.

The 5 most used EDI file format standards:

1) UN/EDIFACT

The most popular EDI file format standard today outside North America is UN/EDIFACT, which stands for United Nations rules for Electronic Data Interchange for Administration, Commerce and Transport. These international B2B message guidelines are extremely widely used across many industries.

In fact, given the scope of EDIFACT’s adoption, several industries have developed subsets of the main standard which allow for automation of industry-specific information. One well-known subset is EANCOM, for example, which is used in the retail industry.

Each EDIFACT message type is identified by a six-letter reference, for example:

ORDERS (for purchase orders)
INVOIC (for invoices)
DESADV (for despatch advice)

All of these EDIFACT messages have the same basic structure, consisting of a sequence of segments:

UNA – separators, delimiters and special characters are defined for the interpreting software
UNB – file header (with the file end UNZ this makes up the envelope, containing basic information)
UNG – group start
UNH – message header
UNT – message end
UNE – group end
UNZ – file end


As well as transmission format, the EDIFACT standard also defines delivery requirements. EDIFACT provides set guidelines, for example, concerning the exact structure of specific message exchanges, which might themselves contain several individual EDIFACT files.


2) TRADACOMS

Despite being less widely-used than EDIFACT, the TRADACOMS standard was released several years before the UN standard. Designed primarily for UK domestic trade (and particularly popular in the retail industry), TRADACOMS is made up of a hierarchy of 26 messages. Like EDIFACT, each message is also given a six letter reference.

TRADACOMS does not use a single message format. Instead, a transmission to a trading partner will consist of a number of messages. For example, one purchase order will often contain an Order Header (ORDHDR), several Orders (ORDERS) and an Order Trailer (ORDTLR). Multiple individual order messages can repeat between the ORDHDR and the ORDTLR.

As with EDIFACT standards, the TRADACOMS standard uses segments for ease of translation. Below are four of the most common segments:

STX – Start of Interchange
MHD – Message start
MTR – Message end
END – End of Interchange


3) ANSI ASC X12

ANSI ASC X12 stands for American National Standards Institute (ANSI) Accredited Standards Committee (ASC) X12, though (for obvious reasons) this is often abbreviated to just X12.

Though initially developed in 1979 to help achieve EDI document standardisation across North America, X12 has been adopted as the preferred standard by approaching half a million businesses worldwide.

Compared to other EDI standards organisations, X12 has a particularly comprehensive transaction set. There are over 300 X12 standards, all of which are identified by a three digit number (e.g. 810 for invoices) rather than the six letter code system used by EDIFACT and TRADACOMS. These EDI file format standards fall under X12’s different industry-based subsets:

AIAG – Automotive Industry Action Group
CIDX – Chemical Industry Data Exchange
EIDX – Electronics Industry Data Exchange Group (CompTIA)
HIPAA – Health Insurance Portability and Accountability Act
PIDX – American Petroleum Institute
UCS – Uniform Communication Standard
VICS – Voluntary Interindustry Commerce Standards

With each containing slight variations, these subsets are used by different industries as appropriate (apparel retail businesses use VICS for example).

In addition, the hundreds of document types are divided into 16 helpful message series, from ‘order’ to ‘transportation’, with each containing the relevant individual message types.

As with EDIFACT, documents conforming to X12 standards are made up of several segments, some of which are optional and some mandatory. Below is a list of the mandatory segments:

ISA – Start of interchange
GS – Start of functional group
ST – Start of transaction set
SE – End of transaction set
GE – End of functional group
IEA – End of Interchange
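A rough Python sketch of reading this envelope. It relies on the ISA segment being fixed-length (106 characters including its terminator), so the delimiters can be read from fixed offsets. All field values in the sample are invented, and `read_x12_delimiters` is a hypothetical helper:

```python
# Widths of the 16 ISA data elements (ISA01 to ISA16).
ISA_WIDTHS = [2, 10, 2, 10, 2, 15, 2, 15, 6, 4, 1, 5, 9, 1, 1, 1]


def read_x12_delimiters(data):
    """Read the delimiters from the fixed-length ISA segment."""
    if not data.startswith("ISA") or len(data) < 106:
        raise ValueError("not an X12 interchange")
    return {"element": data[3], "component": data[104], "segment": data[105]}


# Build a sample ISA by padding invented values to the fixed widths.
values = ["00", "", "00", "", "ZZ", "SENDER", "ZZ", "RECEIVER",
          "210101", "1200", "U", "00401", "000000001", "0", "T", ">"]
isa = "ISA" + "".join("*" + v.ljust(w) for v, w in zip(values, ISA_WIDTHS)) + "~"
data = isa + ("GS*PO*SENDER*RECEIVER*20210101*1200*1*X*004010~"
              "ST*850*0001~SE*2*0001~GE*1*1~IEA*1*000000001~")

delimiters = read_x12_delimiters(data)
tags = [segment.split(delimiters["element"])[0]
        for segment in data.split(delimiters["segment"]) if segment]
print(delimiters, tags)
```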

As with all EDI documents, these segments are in turn made up of elements.

4) VDA

The Association of the Automotive Industry (Verband der Deutschen Automobilindustrie in German, or VDA for short) was established in 1901 by German automobile businesses.

The VDA was one of the first associations to develop EDI file formats in 1977, making VDA standards even older than EDIFACT.

Like X12 standards, every VDA message standard has a unique identification number (four digits long in this case). For example, VDA EDI file format 4905 is a delivery forecast.

As worldwide use was not expected when the standards were developed, all VDA standards were published in German – something that continues to this day. This can make interpretation difficult, particularly concerning German business terms. Likewise, as VDA also does not use a naming convention for each element, knowledge of German is required to identify them.

Unlike EDIFACT and X12 standards, VDA standards do not use segments or separators. Instead data elements with a constant length, known as fixed length format elements, are used. When the data to be transmitted is shorter than the required length, spaces are used to fill the gaps. Unfortunately, this fixed length format means that the amount of data that can be transmitted is limited, meaning conversion to/from other EDI standards can be difficult.

Because of these issues, VDA fixed length document standards are slowly being replaced by EDIFACT document types. Today, VDA standards are effectively a subset of the EDIFACT standards used extensively by the automotive industry (much like the EDIFACT subset EANCOM is used by retail businesses).

To aid those using their standards, the VDA have published suggestions regarding transition to EDIFACT.


5) UBL

The Universal Business Language, or UBL, is a library of standard XML-based business document formats. UBL is owned by the Organisation for the Advancement of Structured Information Standards (OASIS), which has made it available to all businesses for free.

As UBL uses an XML structure it differs from other more traditional EDI document formats. Perhaps the biggest difference is the fact that XML-based transmissions are easier to read than other EDI file formats. However, XML file sizes are considerably larger than those of other EDI file formats, though this is no longer a problem following the advent of broadband internet.

When first established in 2003, UBL had seven EDI file format standards. By the time version 2.1 was released over a decade later this number had increased to 65, and the release of version 2.2 in 2018 further increased the number of document types to over 80.

Significantly CEN/TC434 recently named UBL as one of two EDI syntaxes which comply with new EU regulations regarding e-invoicing. As such, as the use of PEPPOL grows, so the use of UBL is also likely to increase.

Like X12, UBL message types are split into higher level categories. These categories include pre-award sourcing, post-award sourcing, procurement and transportation. Tools available for UBL, meanwhile, include validators, generators, parsers and authoring software.


How to exchange different EDI file formats with your trading partners

Whilst each of the above document standards is widely used, particularly within certain industries, unfortunately no one set of document standards is universally used by all supply chain businesses. As a result, if you wish to grow your business and to automate B2B data exchange with your partner network to achieve the cost benefits of automation you will need the ability to translate data between numerous formats.

Given the amount of technical expertise EDI file format translation requires, the fastest and most efficient way to gain this capability is to enlist the help of a managed EDI solution provider such as ecosio. In addition to enabling you to transfer documents in any required format and over any EDI protocol via a single connection to our Integration Hub , ecosio’s managed services remove virtually all internal effort concerning EDI. For example, new onboardings are handled by a dedicated project manager who is experienced in liaising between both sides to achieve fast, hassle-free and successful connections. Similarly, ecosio’s unique API ensures users are able to send and receive data directly from their ERP’s existing user interface.


What is EDIFACT? | UN / EDIFACT standard overview

What is UN/EDIFACT standard?

United Nations/Electronic Data Interchange for Administration, Commerce and Transport (UN/EDIFACT) is the international EDI standard ISO 9735-1987, developed under the UN.

The general standard is adopted by national and sectoral standards bodies to better reflect the needs of each industry.

The standard is updated globally at least twice a year. The purpose of these updates is to create a new directory of data and messages and to improve the usability of existing EDIFACT messages.

The UN/EDIFACT standard has been developed for trade and transport management. The concept of “trade” was interpreted in a broad sense (orders, deliveries, insurance, payment of goods, customs formalities). Currently, the use of UN/EDIFACT has expanded to include accounting, customs control, pensions, health care, social insurance, judicial practice, employment, statistics, construction, finance, insurance, manufacturing, tourism, trade, freight, and container transportation.

The UN/EDIFACT standard is developed and maintained by two international organizations: the United Nations Economic Commission for Europe (UNECE) and the International Organization for Standardization (ISO).

EDIFACT subsets

EDIFACT is predominant outside of North America. Due to its complexity, industry-specific subsets of EDIFACT have been developed. Each subset contains only the functions relevant to a specific user group, such as:

EANCOM: consumer goods industry
ODETTE: European automotive industry
CEFIC: chemical industry
EDICON: construction industry
RINET: insurance industry
HL7: healthcare
IATA: air transportation
SWIFT: banking
UIC 912: rail transport
EDIFICE: electronics, software, and telecommunications industry. EDIFICE has played an important role in the implementation of RosettaNet standards in Europe and became the European RosettaNet User Group.

EDIFACT messages: Structure and syntax of the standard

EDIFACT is a specialized, structured data language that describes all types of commercial activity, based on information logistics. Using the elements and segments of standard messages, you can describe any document, generate its electronic form, and transmit it over open telecommunication networks without fear of interception of private commercial information.

UN / EDIFACT Structure

Any document in the UN / EDIFACT standard has a hierarchical structure. The entire electronic document is called a message. A message consists of data groups combined in a defined way, for example a data group describing customs payments, a data group describing the attributes of documents, and so on. Each group in turn consists of standard data segments that describe document attributes in more detail. The standard provides about 200 different types of segments from which messages are composed. The segments themselves also have a hierarchical structure and consist of data elements that can be simple (a single data field) or composite (usually 2-3 data fields).

The following is the structure of an EDIFACT transmission:

Service String Advice (UNA, optional)
Interchange Header (UNB)
Functional Group Header (UNG, optional)
Message Header (UNH)
User Data Segments
Message Trailer (UNT)
Functional Group Trailer (UNE, optional)
Interchange Trailer (UNZ)
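The layering above is carried entirely by delimiters, so a first parsing step can be sketched as a tokenizer. The following Python sketch is illustrative only: the names `Delimiters`, `read_delimiters`, `split_escaped`, `unescape` and `parse_interchange` are made up for this example, and it assumes the standard default service characters (`:` `+` `.` `?` `'`) unless a UNA segment overrides them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delimiters:
    # Default EDIFACT service characters, used when no UNA segment is present.
    component: str = ":"
    element: str = "+"
    decimal: str = "."
    release: str = "?"
    segment: str = "'"


def read_delimiters(raw):
    """Return (delimiters, remaining text), honouring an optional UNA segment."""
    if raw.startswith("UNA") and len(raw) >= 9:
        component, element, decimal, release, _reserved, segment = raw[3:9]
        return Delimiters(component, element, decimal, release, segment), raw[9:]
    return Delimiters(), raw


def split_escaped(text, sep, release):
    """Split on sep, skipping any character that follows the release character."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape for deeper levels
            i += 2
            continue
        if ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def unescape(text, release):
    """Remove release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_interchange(raw):
    """Return segments as lists of elements, each element a list of components."""
    d, body = read_delimiters(raw)
    segments = []
    for seg in split_escaped(body, d.segment, d.release):
        if not seg.strip():
            continue
        seg = seg.lstrip("\r\n")  # tolerate line breaks between segments
        segments.append([
            [unescape(c, d.release) for c in split_escaped(el, d.component, d.release)]
            for el in split_escaped(seg, d.element, d.release)
        ])
    return segments
```

With this shape, `segment[0][0]` is the segment tag (UNB, UNH, ...) and the remaining entries are the elements in positional order. Mapping those positions to names is a separate step that needs the partner's specification.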

Example EDIFACT

UNB+UNOB:2+XYZCORPORATION:ZZ+COMPANYX:ZZ+190521:1604+906019++++++1'
UNH+1+ORDERS:D:96A:UN'
BGM+220+4500265532+9'
DTM+137:20190425:102'
RFF+CT:CompanyX'
NAD+BY+2010::91'
CTA+OC+2010:G. Smith'
COM+044-1010605:TE'
COM+044-1010662:FX'
NAD+SE+0000906300::92'
CTA+SC+0000906300'
NAD+DP+++Consulting Inc St+ Begun + Laval++8003+CA'
CUX+2:CHF:9'
LIN+10++TH300010:BP'
PIA+1+000000000000500807:SA'
IMD+A++::92:HIR0010H12'
QTY+22:1:PCE'
DTM+2:20190423:102'
LIN+20++T0004671:BP'
PIA+1+000000000000501516:SA'
IMD+A++::92:CCGT060204NS LT1110S'
QTY+22:10:PCE'
DTM+2:20190423:102'
LIN+30++T2001171:BP'
PIA+1+000000000000501328:SA'
IMD+A++::92:LTPNG-R20-3.0'
QTY+22:1:PCE'
DTM+2:20190423:102'
UNT+28+1'
UNZ+1+906019'
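The trailers in the example carry control totals: UNT states the number of segments in the message (UNH and UNT included) and repeats the message reference from UNH, while UNZ states the number of messages and repeats the interchange reference from UNB. A minimal Python sketch of that consistency check follows. The function names are illustrative, and it assumes default delimiters, no release characters, and no functional groups (when UNG/UNE are used, UNZ counts groups instead).

```python
EXAMPLE = """UNB+UNOB:2+XYZCORPORATION:ZZ+COMPANYX:ZZ+190521:1604+906019++++++1'
UNH+1+ORDERS:D:96A:UN'
BGM+220+4500265532+9'
DTM+137:20190425:102'
RFF+CT:CompanyX'
NAD+BY+2010::91'
CTA+OC+2010:G. Smith'
COM+044-1010605:TE'
COM+044-1010662:FX'
NAD+SE+0000906300::92'
CTA+SC+0000906300'
NAD+DP+++Consulting Inc St+ Begun + Laval++8003+CA'
CUX+2:CHF:9'
LIN+10++TH300010:BP'
PIA+1+000000000000500807:SA'
IMD+A++::92:HIR0010H12'
QTY+22:1:PCE'
DTM+2:20190423:102'
LIN+20++T0004671:BP'
PIA+1+000000000000501516:SA'
IMD+A++::92:CCGT060204NS LT1110S'
QTY+22:10:PCE'
DTM+2:20190423:102'
LIN+30++T2001171:BP'
PIA+1+000000000000501328:SA'
IMD+A++::92:LTPNG-R20-3.0'
QTY+22:1:PCE'
DTM+2:20190423:102'
UNT+28+1'
UNZ+1+906019'
"""


def split_segments(raw):
    """Naive split: assumes default delimiters and no release characters."""
    return [s.strip().split("+") for s in raw.split("'") if s.strip()]


def check_envelope(segments):
    """Return a list of problems; an empty list means the totals are consistent."""
    problems = []
    unb = unz = None
    open_ref, start, messages = None, None, 0
    for index, seg in enumerate(segments):
        tag = seg[0]
        if tag == "UNB":
            unb = seg
        elif tag == "UNZ":
            unz = seg
        elif tag == "UNH":
            open_ref, start = seg[1], index
        elif tag == "UNT":
            if open_ref is None:
                problems.append("UNT without a preceding UNH")
                continue
            messages += 1
            counted = index - start + 1  # UNH and UNT are part of the count
            if int(seg[1]) != counted:
                problems.append(f"UNT declares {seg[1]} segments, counted {counted}")
            if seg[2] != open_ref:
                problems.append(f"UNT reference {seg[2]} does not match UNH {open_ref}")
            open_ref = None
    if open_ref is not None:
        problems.append("UNH without a closing UNT")
    if unb is None or unz is None:
        problems.append("missing UNB or UNZ")
    else:
        if int(unz[1]) != messages:
            problems.append(f"UNZ declares {unz[1]} messages, counted {messages}")
        if unz[2] != unb[5]:
            problems.append("UNZ control reference does not match UNB")
    return problems
```

For the interchange above the check returns an empty list: there are 28 segments from UNH to UNT, and the reference 906019 appears in both UNB and UNZ.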

Principles and technologies of application of the UN / EDIFACT standard

The EDIFACT standard uses three types of code lists (directories):

The first type is based on ISO standards. It includes code lists for currencies, countries, units of measurement, modes of transport, delivery terms, and some others.

The second type consists of the code lists included in the EDIFACT standard itself.

The third type is developed by the various organizations responsible for issuing codes. These organizations are enumerated in the code list of data element 3055 (Code list responsible agency code).

There are four main components in EDIFACT that are subject to standardization when preparing documents for exchange between business partners:

data elements
standard data segments
standard messages
syntax rules

Data elements are the smallest, indivisible units of information, for example the document date, the name of the destination, or the amount of tax. More than 600 data elements used in international trade and transport have been published in the UNTDID (United Nations Trade Data Interchange Directory).

EDIFACT standard principles

The UN / EDIFACT standard is based on the following principles:

1. Standardize data at the segment and element level. Any document intended for electronic exchange should consist of standard segments. This means that the segment for the supplier's address or the delivery address is described by the same elements regardless of the kind of document: invoice, order, declaration, etc. Practice has shown that no more than 100 standard segments are enough to describe almost any document. The fields inside the segments are standardized the same way, and the ratio of fields to segments is one-to-many, i.e. the same field can be included in different segments.

2. Record the fields used in segments as codes. It is assumed that the partners exchanging electronic documents have identical code tables (directories). The composition and content of the code tables are standardized at three levels: international, national and corporate.

3. The independence of standards from the language of communication. The peculiarity of the UN / EDIFACT standards is that more than 90% of the electronic message consists of different codes. Another feature is that only the content of the document is transmitted, without a form. The document form is restored when the message is decoded.
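As a small illustration of the second principle, the DTM segments in the example above carry two codes around the actual value: a qualifier saying which date it is and a format code saying how it is written. The Python sketch below decodes them with two tiny lookup tables. The tables are deliberately incomplete extracts (qualifiers 137 and 2, formats 102 and 203), the English labels are abbreviated, and the function name `decode_dtm` is made up for this example; a real implementation would load the full code lists named in the partner's specification.

```python
from datetime import datetime

# Partial extract of the date/time qualifier code list (labels abbreviated).
DATE_QUALIFIERS = {
    "137": "document/message date",
    "2": "requested delivery date",
}

# Partial extract of the date format code list, mapped to strptime patterns.
DATE_FORMATS = {
    "102": "%Y%m%d",      # CCYYMMDD
    "203": "%Y%m%d%H%M",  # CCYYMMDDHHMM
}


def decode_dtm(segment):
    """Decode one DTM segment such as "DTM+137:20190425:102'"."""
    tag, composite = segment.rstrip("'").split("+")
    if tag != "DTM":
        raise ValueError(f"not a DTM segment: {segment!r}")
    qualifier, value, format_code = composite.split(":")
    meaning = DATE_QUALIFIERS.get(qualifier, f"unknown qualifier {qualifier}")
    return meaning, datetime.strptime(value, DATE_FORMATS[format_code])
```

The same pattern (a qualifier code plus a value) recurs in NAD, QTY, RFF and most other segments, which is why both partners need identical code tables.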

EDIFACT messages

The EDIFACT standard, which provides a set of standard messages, has greatly simplified international and cross-industry trade and the exchange of electronic business documents between countries and industries.

A standard UN / EDIFACT message has a six-letter identifier that abbreviates the message name, for example ORDERS (purchase order) or INVOIC (invoice).

Some standard EDIFACT messages and their X12 equivalents are listed in the table below.

X12 Transaction Number	EDIFACT Transaction ID	Transaction name
850	ORDERS	Purchase order message
855	ORDRSP	Purchase Order Acknowledgment
846	INVRPT	Inventory Inquiry/Advice
856	DESADV	Shipment Notification ASN
810	INVOIC	Invoice
997	CONTRL	Functional acknowledgment
860	ORDCHG	Purchase Order Change – Buyer Initiated
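When several partners send different message types, the identifier in the header is a natural key for choosing a handler. The Python sketch below turns the table above into a lookup and reads the message type from a UNH segment. The names `X12_TO_EDIFACT`, `EDIFACT_TO_X12` and `message_type_from_unh` are illustrative, and the naive split assumes default delimiters without release characters.

```python
# The rows of the table above as a lookup in both directions.
X12_TO_EDIFACT = {
    "850": "ORDERS",
    "855": "ORDRSP",
    "846": "INVRPT",
    "856": "DESADV",
    "810": "INVOIC",
    "997": "CONTRL",
    "860": "ORDCHG",
}
EDIFACT_TO_X12 = {edifact: x12 for x12, edifact in X12_TO_EDIFACT.items()}


def message_type_from_unh(unh):
    """Return the message type from a header such as "UNH+1+ORDERS:D:96A:UN'"."""
    elements = unh.rstrip("'").split("+")
    if elements[0] != "UNH":
        raise ValueError(f"not a UNH segment: {unh!r}")
    # The second element is a composite; its first component is the message type.
    return elements[2].split(":")[0]
```

A factory that picks a parser class per partner and message type could use `message_type_from_unh` as part of its key.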

Because the standard is language-independent and only the content of the document is transmitted, the receiving side reconstructs the form of the document according to its own local rules.

Benefits of EDIFACT

EDIFACT improves a company's efficiency and its business processes. The main advantages of EDIFACT are:

Profitability – reducing the volume of paper to be processed lowers personnel and administrative costs.

Efficiency – large volumes of commercial data can be transferred from one computer to another within minutes.

Accuracy – the use of EDIFACT eliminates human errors that are inevitable when manually keying in data.

EDIFACT is a key component of a just-in-time strategy that ensures prompt customer satisfaction. EDIFACT in conjunction with the Internet allows real-time electronic transactions and accelerates the interaction between trading partners.


The EDI (Electronic Data Interchange) standard is part of old, well-established systems. Yet we constantly see EDI presented as a modern standard. Is that so? Should we consider EDI as a base technology for new projects?
Let's look at EDI from a technical point of view, setting everything else aside.

Data format in EDI

EDI uses a delimited text format. It works well for flat data structures such as tables. It is not as good at representing hierarchical data structures. Nested objects are better serialized with tagged formats such as XML and JSON.
Strangely enough, a document definition language was never created for EDI. So many years have passed since EDI appeared and so much effort has been spent on it, yet there is still no definition language. A definition language makes it possible to automate data processing: generation, validation, transformation, serialization, deserialization. For comparison, to validate XML data we take a data schema (XML Schema, xsd) and the parser automatically checks the data against that schema.
It is possible to do without a schema, but then the document should carry markup. XML and JSON documents can be deserialized without a schema because the data itself contains the tags (names) of the data elements. EDI has tags only for segments and none for elements. Elements are identified by their position within the segment. A generic EDI parser can only break a document down into primitive collections, because the document contains neither names nor types for its data elements.
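Because elements are identified only by position, the names have to come from the specification. The Python sketch below shows the idea for one X12 segment: a generic split yields a plain list, and a small table supplies a name for each position. The table `SPEC`, its field names and the function `name_elements` are illustrative assumptions, not part of any standard API; a real table would be transcribed from the partner's specification.

```python
# Hypothetical position-to-name table for the X12 N1 (party identification) segment.
SPEC = {
    "N1": ["entity_identifier_code", "name", "id_code_qualifier", "id_code"],
}


def name_elements(segment, spec, element_sep="*"):
    """Map positional elements of one segment to names taken from a spec table."""
    tag, *values = segment.split(element_sep)
    names = spec[tag]
    # An omitted conditional element still occupies its position as an empty
    # string, so later elements keep their index. Trailing omitted elements are
    # simply absent, and zip() stops at the shorter of the two lists.
    return {name: value for name, value in zip(names, values) if value != ""}
```

For `"N1*ST**92*12345"` the name is omitted, yet the qualifier and the code still land in positions 3 and 4. This is how a "Position" column in a specification is typically used: it is an index into the split segment, and empty separators hold the place of skipped conditional elements.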

Let's turn to the details.

Envelope format

EDI defines envelopes for sets of documents, groups of documents, and the documents/transactions themselves (Interchange, Group, and Transaction/Document). The envelopes are delimited by the ISA/IEA, GS/GE, and ST/SE segment pairs, respectively.
Note: for illustration I use the EDI X12 variant of the standard, which is common in North America. The other variant, EDIFACT, is common in Europe and does not differ from X12 in principle.
Here is an example of the very first segments of all three envelopes: ISA, GS, and ST. The example is taken from here:

ISA*00*          *00*          *ZZ*RECEIVERID     *12*SENDERID       *100325*1113*U*00403*000011436*0*T*>~
GS*FA*RECEIVERID*SENDERID*20100325*1113*24712*X*004030~
ST*997*1136~
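One practical detail of the X12 envelope: the ISA segment has a fixed length, so the delimiters used in the rest of the file can be read from fixed positions in it before any splitting is done. The Python sketch below assumes an ISA segment padded to the full 106 characters. The sample rebuilds the header shown above with explicit padding, and the function names are illustrative.

```python
def x12_delimiters(raw):
    """Read (element separator, component separator, segment terminator) from ISA."""
    if not raw.startswith("ISA") or len(raw) < 106:
        raise ValueError("expected a fixed-length ISA segment at the start")
    return raw[3], raw[104], raw[105]


def split_x12(raw):
    """Split an X12 interchange into segments, each a list of elements."""
    element_sep, _component_sep, terminator = x12_delimiters(raw)
    return [seg.strip().split(element_sep)
            for seg in raw.split(terminator) if seg.strip()]


# The ISA header from the example, with its fixed-width fields padded explicitly.
ISA = "*".join([
    "ISA", "00", " " * 10, "00", " " * 10,
    "ZZ", "RECEIVERID".ljust(15), "12", "SENDERID".ljust(15),
    "100325", "1113", "U", "00403", "000011436", "0", "T", ">",
]) + "~"

SAMPLE = (
    ISA + "\n"
    "GS*FA*RECEIVERID*SENDERID*20100325*1113*24712*X*004030~\n"
    "ST*997*1136~\n"
)
```

Reading the delimiters from the header, rather than hard-coding `*` and `~`, lets the same code handle partners that use different separator characters.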

As we can see, practically all elements in the envelope segments are either useless or, worse, dangerous if we use them the way the standard prescribes.
Please do not try to use the data from the envelope segments for authentication and addressing.
EDI was created at a time when placing this information in the envelopes was the only option. Today we transmit documents over the internet and use a large set of standards and protocols for packaging, addressing, authentication, authorization, reliability, encoding, serialization, segmentation, and so on. Protocol-specific information is added and removed along the entire path of the data, and this information is independent of the data itself.

Is EDI a data format standard or a protocol?

EDI tries to be a protocol, which is why we see these elements for addressing, authorization, and acknowledgment requests. I do not know how this information could be mapped onto the OSI protocol layer model.
Still, most of the EDI standard is devoted to data formats.

Document formats

Inside the envelopes we find the documents themselves. But we will not find a standard for a universal, generic document. The standard defines numerous formats for all kinds of document types: orders, invoices, packing lists... Here you will find a small part of the huge list of standardized documents.
EDI follows a well-known myth: "Somewhere out there is an ideal format that describes every scenario in the world. We will certainly find this format. We just need to keep adding new scenarios and adjusting the old ones."
As a result, the standard EDI documents (specifications) are excessively complex.
Take one example: we need a purchase order for a small local bookstore. We found a suitable standard specification, EDI 850, Purchase Order. At first glance it looks far too detailed. We are not going to buy food, coal, grain, liquids, hazardous goods, or medical supplies. We do not need international addresses. We will not use express delivery services. The EDI specification describes all these possible variants, so it contains far too many fields that we will never use. It is too complex for our simple document.
There are many industry (domain) standards that are used as a kind of knowledge repository. But these standards are not used as data exchange standards. (See this article describing the problem of industry standards.)

Loops inside documents

The structure of individual documents is fairly simple. Documents are composed of a series of segments, which hold the document data.
It turns out, however, that segments can be combined into groups or into repeating groups, the so-called loops. The catch is that these loops are not marked in the document in any way. We can learn that a loop exists only from the specification of that particular document. Segments of the same type (with the same tag) can appear both on their own and inside loops. Building a parser that recognizes loops (which, again, are not marked in the document) is a fairly nontrivial task.
XML and JSON do not have this problem: hierarchical objects or collections of objects at any nesting level are easily expressed with opening and closing tags, named or unnamed.
EDI tried to sit on two chairs at once. On the one hand, its document format resembles csv and is convenient for tabular data. On the other hand, it tried to describe hierarchical objects, and that attempt ended very unconvincingly. Of course, we understand this now, with JSON in front of us. But let's remember that EDI was made not for transmitting tabular data but for transmitting documents, whose structure is hierarchical.
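Since loops are not marked in the data, a parser has to carry the grouping rules itself. A common simplification is to treat the first segment of a group as a trigger: every occurrence of that tag opens a new group, and the segments that follow belong to it until the next trigger or a closing segment. The Python sketch below applies this to the line items of an EDIFACT ORDERS message. The function name `group_segments` and its parameters are illustrative, and it handles only a single level of grouping.

```python
def group_segments(segments, trigger, closers=frozenset({"UNS", "UNT"})):
    """Split a message into header segments, repeating groups, and a trailer.

    trigger is the tag that opens a new group; closers are tags that end the
    repeating section. Both must be taken from the message specification.
    """
    header, groups, trailer = [], [], []
    for seg in segments:
        tag = seg[0]
        if trailer or tag in closers:
            trailer.append(seg)
        elif tag == trigger:
            groups.append([seg])
        elif groups:
            groups[-1].append(seg)
        else:
            header.append(seg)
    return header, groups, trailer


RAW = (
    "BGM+220+4500265532+9'DTM+137:20190425:102'"
    "LIN+10++TH300010:BP'QTY+22:1:PCE'DTM+2:20190423:102'"
    "LIN+20++T0004671:BP'QTY+22:10:PCE'DTM+2:20190423:102'"
    "UNS+S'"
)
SEGMENTS = [s.split("+") for s in RAW.split("'") if s]
HEADER, GROUPS, TRAILER = group_segments(SEGMENTS, trigger="LIN")
```

Here DTM appears both in the header (the document date) and inside each LIN group (the delivery date of that line). The tag is the same and only the position relative to a trigger tells them apart. Nested groups need the same idea applied recursively, or a state machine driven by the specification's group table.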

A non-technical look at EDI

As we can see, the EDI standard is outdated in practically every respect if we look at it from a technical standpoint. There are hardly any rational technical reasons to use it today. Nevertheless, EDI is still widely used.
In the next part we will try to find the reasons for this. Most likely they will not be technical.
