Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Programming with M


SP 96: Programming with M

Dan, an engineer with InterSystems, previously worked at the Laboratory of Computer Science at Massachusetts General Hospital, the birthplace of the M language.


As the modern day successor to MUMPS, M is an ANSI-standard application-development environment incorporating a programming and data-manipulation language and integrated database. M was traditionally used to complement SQL by providing an extra dimension of data-access control that can prove especially useful in select, high-volume, transaction-processing applications. M's direct-access approach to data lets you exert greater control over performance issues, thereby letting you manage update concurrency at a finer level of granularity. For example, M database structures can be designed to support field-level locking and the ability to frame a transaction (TSTART ... TCOMMIT) to include only those database structures involved in processing the current transaction. You can also control the physical proximity of related data, such as grouping header and line-item data together in processing invoice or order data.

Conversely, SQL abstracts data structures from the programmer, so that the function of database administration (DBA) is more centralized. By optimizing the physical data model, the database administrator strives to optimize performance across a breadth of programs. This philosophical center-of-control difference is one key reason why M is more commonly used by application developers than by end-user organizations and why M applications exhibit very good performance, operational, and cost-of-ownership characteristics. Whether deployed by an application developer or end-user organization, performance is, to a large degree, a function of direct program control and M system processes--not DBA activity. DBA functions still exist but are not dominated by database-tuning activities.

M: Past and Present

Starting in the mid-1970s, MUMPS gained widespread use in health-care applications. M evolved from the need for mainframe-like transaction-processing applications for autonomous departmental computing. M is an ANSI/FIPS/ISO/JIS (Japan Industrial Standard) applications environment and is accepted by the ANSI SQL Standards committee as a host language for embedded SQL (ESQL). M integrates a programming language and database into a single application-development environment. It is common to access the same data with both SQL and M. Indeed, M and SQL data accesses can even be mixed within the same programs. As a result of the integrated programming/database standard, M-based programs are highly portable across operating platforms. M is most commonly used on UNIX and Windows, but it can also be found in DOS, VMS, and other platforms. At the same time, M is an environment particularly suited to high-volume, online transaction-processing applications.

M database structures are implemented as highly optimized B-tree structures on disk. To the application programmer, the database presents itself as user-designed, multidimensional arrays ("globals" in M parlance). All database designs (hierarchical, network, relational, and so on) are supported, and the design is under the control of the application developer. The ANSI/MDC X11.1-1990 specification of the M language component describes 21 commands, 17 intrinsic functions, and 7 intrinsic variables and reserves a "namespace" for vendor-specific extensions to the ANSI specifications. (The next M ANSI standard specification is now in the formal acceptance stage.) Additionally, there is a full complement of unary, arithmetic binary, numeric- and string-relational, pattern-matching, and concatenation operators. The ANSI standard also describes extrinsic functions and user-authored variables. M data typing is variant. A type is applied based on the context in which the data is referenced. Both private routine and database structures are dynamically allocated, supporting both scalar and array structures. All structures are of variable length. Arrays support numeric and string subscripting, and allocation of these structures is sparse--only those nodes defined by application logic exist.

A complete description of M is beyond the scope of this article. However, Table 1 presents a summary of ANSI M's operators, commands, and intrinsic functions and variables. (You can obtain a complete copy of the ANSI specification by contacting the M Technology Association.)

M is implemented by a variety of commercial vendors as well as academia. Many facilities are associated with each implementation of M, including networking (two- and three-tier client/server and peer-to-peer), system-management facilities (bulletproof database, concurrent read/ write incremental backup), application libraries and generators, debuggers, relational/SQL implementations describing M databases (supporting such APIs as the ODBC), and a host of evolving technologies (object orientation, for instance).

M Benefits

The major benefit of M for the programmer is increased direct control over data structures. In M, you can define data in terms of multidimensional arrays that map to an intended transaction process. It is easy to represent data in a hierarchical, network, or relational model. Because M provides direct control over logical data structures, a "flat relational structure" can be implemented. In any case, the M relational data dictionary provides a superimposed relational view of this same data. In this regard, M is identical to most relational DBMS products. After describing a developer-designed M data structure in the Open M relational data dictionary, the application developer will see in the M environment all characteristics expected of relational DBMS products.

The relational model is optimized for change. It excels at serving unanticipated information requirements, such as the ad hoc query. It is ideal for decision support systems (DSSs), executive information systems (EISs), and data-warehousing applications. However, the relational model can present significant challenges to high-volume transaction processing of complex data structures. M gives you an alternative, one that increases programmer control over key performance considerations such as concurrency, dependency, and data proximity.

Extending Visual Basic

To contrast SQL with M in a complex client/server transaction environment, consider a banking database containing banking products (checking, loans, mortgages, safe deposit, savings, credit cards, and investments) and customers (who individually or jointly may use several of the bank's products). Each product has multiple accounts. Each account may have multiple owners and many transactions a month. There are complex relationships between products, customers, transactions, and accounts. Aggregate transaction rates and data volumes can be very high, and security and accuracy are critical.

Representing an integrated banking model as a set of database tables is not trivial. Achieving acceptable performance can be even more daunting--both for online processing, where customers conduct transactions at a teller or an ATM, and for overnight batch runs, where voluminous data must be processed in a narrow time window. Figure 1 is a metadescription of the tables involved in the scenarios demonstrating Open M and SQL.

Listing One is a portion of a Visual Basic front-end application. The code shows the typical method of using SQL statements to retrieve data. Listing Two is the same program using M. As the user enters characters into a combo box, customer names starting with those characters populate the combo box. During the Form_Load event, Listing One retrieves all rows into the combo box. By setting the Sort property of the combo box to True, Visual Basic automatically sorts the list of names. SQL provides a perfectly acceptable scripting mechanism for this, and performance should not be an issue as long as the number of rows being retrieved is reasonable for the application configuration (available memory, network bandwidth, server performance, and so on). As the number of rows grows larger, however, it becomes more difficult to get the database to load in one shot, and repetitive queries against the server are required.

Note in Listing Two that having the data structures under the control of the application developer allows logical partitioning of data to suit the application. This also permits clustering of tables to any depth by using multidimensional array characteristic of M globals. The number and performance of JOINS are a function of the data structure. Inherent'Natural' ASCII collation of data in Open M globals (for English-speaking markets; other collation schemes are available for other markets) provides ordering of rows, and newly inserted data is automatically collated. This also allows for finer granularity in locking, transaction-processing control, and distribution of data. This improves database throughput by, for example, allowing data to serve concurrently as both a primary and index key.

I've taken liberties with the M code in Listing Two to assist those unfamiliar with M programming. The code is logically and syntactically correct but deoptimized for comprehensibility. Many commands and programming techniques that could have been used for this exercise were avoided for this reason.

Table 2 provides a quick description of the M commands used in Listing Two, while Table 3 describes the data structures (restated in Open M globals). In this database structure, the numeric and string-collation characteristics of Open M globals facilitate access to the upright data through its primary key or through a foreign key. The structures used here are simple: Each record is described as a collection of columns whose names are self descriptive. In this simple structure, data for a record is treated as a string with a caret (^) delimiting each column. The design of the global structure is analogous to defining a table in another DBMS environment. Listing Two is actually implemented in InterSystems' Visual-M, in which the Open M language and database are directly accessible to Visual Basic. With Visual-M, M and Basic are colleague languages. VB controls are registered with M and are programmatically introduced by an underscore character.

When to Use M

You can access the same data with either M or SQL. M provides both the "physical layer" and its own "intermediate logical representation layer." M also efficiently implements SQL tables, which are considered the "logical layer." Therefore, you are not forced to make a systemic choice between access alternatives.

Although, relative to SQL, M provides access that appears closer to the physical layer, data in M is still accessed at a very high level of programming--far above the level of disk pages, B-trees, and the like. M data is accessed using multidimensional arrays that provide associative memory and flexible, convenient data organization, where memory and disk allocation are dynamic and automatic.

SQL's strengths clearly lie in data analysis, decision support, and ad hoc queries. For very complex data structures and demanding transaction-processing requirements (high volumes, large databases, or critical response times), M's level of control provides generally better performance.

Many programming requirements will lie between the two polarities for which SQL and M are each optimized. In this middle ground, a wide variety of factors affect the choice.

Programmer preference is one obvious factor--experienced SQL programmers may prefer to stick to what they know best. On the other hand, data complexity and performance demands may lead you to choose M over SQL as the data-manipulation language. Because they are interchangeable, both M and SQL can be tried, and empirical evidence (user-specific benchmarks) can guide the final decision.

For many organizations, the interoperability and portability of M's development environment are significant. Standard SQL is available from numerous vendor sources, many of whom emphasize their own proprietary client/server program development environment to build SQL systems. In contrast, client/server architecture is embedded within the ANSI M language itself, which avoids proprietary programming requirements. Additionally, certain M vendors have embraced emerging de facto client/server standards, especially Visual Basic, as key development platforms. M extensions to Visual Basic can add a high-performance, transaction-processing dimension to this widely used GUI system-building tool. The growing market for commercially available Visual Basic custom controls adds acquired component software to the program-development arsenal.

Conclusion

M's historical focus on transaction processing with complex data in distributed implementations has lead to the development of several key characteristics. First, the programmer is given more direct control of data structures in which to optimize performance. Second, data can be organized to optimally suit program-performance considerations. And, third, requirements for systems and database administration, a key attribute in distributed systems, are minimized compared to traditional DBMS systems.

For More Information

M Technology Association

1738 Elton Road, Suite 205

Silver Spring, MD 20903-1725

301-431-4070

[email protected]

Table 1: Summary of ANSI M: (a) selected ANSI M commands; (b) selected M intrinsic functions; (c) selected intrinsic variables; (d) operators.

Command     Description

<b>(a)</b>

Break       Programmatically interrupt routine execution.

Close       Deallocate and release ownership of a device

             (terminal, file, tape, and so on).

Do          Inititate execution of a program, subroutine,

             extrinsic function.

Else        Introduce lines of code to execute if IF condition

             is not true.

For         Initiate looping.

Goto        Generalize transfer of control.

Halt        Terminate process.

Hang        Suspend for a specified period.

If          Introduce condition which, if true, causes rest of

             line to be run.

Job         Programmatically spawn another M process.

Kill        Remove data structure (scalar or array).

Lock        Programmatically interlock M structures among M

             processes.

New         Initiate variable scoping.

Open        Obtain ownership of a device.

Quit        Terminate execution of routines, subroutines, and

             For loops; unscope variables scoped by New, and

             so on.

Read        Take input from the current device (terminal, tape,

             file).

Set         Assign value to variable.

Use         Specify a device as the current device.

View        Peek into memory.

Write       Direct output to the current device.

Xecute      Execute the results of expression evaluation as M

             code.



<b>(b)</b>

$Ascii      Return the integer ASCII value of the character

             argument.

$Character  Return the ASCII character of the integer argument.

$Data       Return integer values denoting the argument's state

             of definition.

$Extract    Positionally assign or extract a substring from a

             string.

$Find       Locate the position of a substring within a string.

$FNumber    Apply format mask to data.

$Get        Return NULL or data depending on argument's

             existence.

$Justify    Apply justification mask to argument.

$Length     Return character length of argument.

$Order      Return the next subscript value at the current

             subscript level of the argument.

$Piece      Extract or assign a delimited substring from/to a

             string.

$Query      Return the next full local- or global-array

             reference.

$Random     Return a random number.

$Select     Return the evaluated expression of the first of

             many conditions that evaluates to true.

$Text       Return the source code of a program at the reference

             that is the argument of the function.

$TRanslate  Translate positionally one character for another in

             a string.

$View       Peek into memory.



<b>(c)</b>

$Horolog    System date and time.

$Io         Current device.

$Job        Current process identifier.

$Storage    Amount of private process memory available to the

             M process.

$Test       Contains truth value of last executed condition and

             other operations.

$X & $Y     Return the current x- and y-coordinates on a

             cursor-positionable device with some restrictions.



<b>(d)</b>

Arithmetic  +, -, /, \ (integer divide), # (modulo), *, **

             (exponentiation)

Unary       +, -, '(not)

Relational  =, >, <, ] (follows), [ (contains), ]] (sorts after)

Pattern

 Match      ?<argument>

Logical     & (and), ! (or)

Concatenation  _____LINEEND____

Table 2: M commands used in Listing Two.

Command        Function

Semicolon (;)  Introduces a comment.

SET            M's assignment command.

$$upper        Extrinsic function to convert characters to upper

                case.

$LENGTH        M intrinsic function returning string length.

FOR            Looping structure.

$ORDER         Traverses a specified branch of the B-tree

                presented logically as the M global structure

                until programmatically terminated.

QUIT           Terminates the For loop and the DO command.

DO             Calls an external routine, executes a subroutine

                or extrinsic function, and executes a

                "structured" block of code introduced by one or

                more periods denoting the nesting level of the

                block.

$PIECE         Intrinsic function which assists in extracting or

                assigning data from/to a specified delimited

                substring of the argument string. In this

                example, a caret is used as the delimiter: The

                first ^ piece of ^people("owners",pkey) is

                "name."

$GET           Intrinsic function that tests for the existence

                of the argument and returns either the data

                (exists), if true, or NULL (""), if not.

$EXTRACT       Extracts characters from a string from position

                N through optional position M.

$ASCII         Returns the ASCII value of a character position

                in the argument string.

$CHARACTER     Returns the character for the ASCII value.

Table 3: Data structures (restated in Open M globals). (a) People 'Table'; (b) Account 'Table'; (c) Service_Type 'Table'.

<b>(a)</b>
^people("owners",pkey)=name^date_of_birth^social_security_number
^people("by_name",<uppercase_name>,pkey)=<null>     (index)
^people("owners",pkey,"accts",account_number)=<null>    (index)

<b>(b)</b>
^accounts(account_number)=service_type^balance^open_date^close_date
^accounts(account_number,"owners",pkey)=<null>

<b>(c)</b>
^srvtypes(type_number)=type_name

Figure 1: Metadescription of the tables used in a typical banking application.

Tables: People
       Fields:
         pkey (number,10,unique)
         name (text,25)
         date_of_birth (date,8)
         social_security_number (text,11)
     Primary Key:
         pkey
     Index:  name (uppercase transformation applied)
     Accounts
     Fields:
         account_number (numeric, 10, unique)
         service_type (foreign key -> Service_Types)
         balance (float,10,2)
         open_date (date,8)
         close_date (date,8)
     Primary Key;
         account_number
     Acct_Owners
     Fields:
         account_number (foreign key -> Accounts)
         owner_id (foreign key -> People)
     Primary Key:
         account_number, owner_id
     Index:  owner_id
     Service_Types
     Fields:
         type_number (number, 10, unique)
         type_name (text, 25)
     Primary Key:
         type_number

Listing One

<Begin Combo1 KeyPress Event>
' ============================================================
' Bind to database via DSN in data control
' ============================================================
server$="ODBC;DSN=Server;UID=Username;PWD=Password"
Data1.Connect = server$
' ============================================================
' Set up query to find all rows where name 'starts with' character 
' input by user perhaps under the KeyPress event for Combo1
' ============================================================
query$="Select Acct_Owners.Owner_ID, Acct_Owners.Account_Number,"
query$=query$ & " People.Name,Service_Types.Type_Name
query$=query$ & " From Acct_Owners, People, Accounts, Service_Types"
query$=query$ & " Where People.Name Like '" & ucase$(combo1.text) & "%'"
query$=query$ & " And People.Pkey = Acct_Owners.Owner_Id"
query$=query$ & " And Account.Account_Number=Acct_Owners.Account_Number"
query$=query$ & " And Service_Types.Type_Number = Account.Service_Type"
query$=query$ & " Order by 3, 2"
' ============================================================
' Go to server to get rows
' ============================================================
Data1.Recordsource=query$
Data1.Refresh
' ============================================================
' Populate Combo Box
' ============================================================
Combo1.Clear
Do while not Data1.Recordset.EOF
    name$=Data1.Recordset.Name
    account=Data1.Recordset.Account
    srvtype$=Data1.RecordSet.Type_Name
    pkey=Data1.Recordset.Primary_Key

    Combo1.Additem name$ & " " & account & " " & srvtype$
    Combo1.ItemData(Combo1.NewIndex)= pkey
    Data1.Recordset.MoveNext
Loop
' ============================================================
' If the text area of the combo box is touched with new input, reexecute query 
' and do loop to recover rows from the database which satisfy the new input.
' If the end-user has double-clicked an item; issue new query to recover all 
' elements of row selected for presentation/manipulation by end-user
' ============================================================
<End Combo1 KeyPress Event>
<Begin Combo1 Double-Click Event>
' ============================================================
' detect which item in the combo box was selected
' ============================================================
selected=Combo1.ListIndex
' ============================================================
' get the row id in the people table for the item selected
' ============================================================
pkey=Combo1.ItemData(selected)
' ============================================================
' recover the info in the combo box to reuse it. piece is a user-written Basic 
' module which returns the portion of a string (1st arg) found between the 
' instances (specified by the 3rd and 4th args) of the delimiter-string 
' (2nd arg)
' ============================================================
x$=Combo1.List(selected)
name$=piece(x," ",1,1)
account=piece(x," ",2,2)
srvtype$=piece(x," ",3,3)
' ============================================================
' issue a query to recover all information to be displayed on the form
' ============================================================
query$="Select People.Date_of_Birth, People.Social_Security_Number,"
query$=query$ & " Accounts.Open_Date, Accounts.Close_Date, Accounts.Balance
query$=query$ & " From People, Accounts
query$=query$ & " Where People.Pkey = " & pkey
query$=query$ & " And Accounts.Account_Number = " & account
Data1.RecordSource=query$
Data1.Refresh
' ============================================================
' assign recovered information to appropriate controls on the form
' ============================================================
Pname.Text=name$
Pdob.Text=Data1.RecordSet.Date_of_Birth
Pssn.Text=Date1.RecordSet.Social_Security_Number
Pacct_num.Text=account
Pacct_odate.Text= Data1.RecordSet.Open_Date
Pacct_cdate.Text= Data1.RecordSet.Close_Date
Pacct_balance.Text=Data1.RecordSet.Balance
Pacct_srv_type.Text=srvtype$
<End Combo1 Double-Click Event>

Listing Two

; ============================================================
; get user input from combo box perhaps under the KeyPress event, transform
; to uppercase and get Length 
; $$upper is an extrinsic funtion which performs the work of UCASE$ 
; in Listing 1
; ============================================================
set input=$$upper(_combo1.text)
set inlng=$length(input)
; ============================================================
; initialize scanning variable to start at the string collating just before 
; user input, e.g., if combo1.text="smith", then input="SMITH" and 
; name="SMITG~" after the following code executes. It sets up to perform 
; the function of the LIKE operator in Listing One's SQL SELECT
; ============================================================
set name=input
set name=$extract(name,1,inlng-1)_$character($ascii(name,inlng)-1)_"~"
; ============================================================
; find all names in the data structure which start with characters input.  
; In Direct Database Access, this code actually does the SELECT, WHERE,
; and ORDER BY work described in Listing One by using Open M's ascii 
; collation, easily clustered tables, and $order function
; ============================================================
for  set name=$order(^people("by_name",name)) 
                           quit:$extract(name,1,inplng)'=input  set pkey="" do
. For  set pkey=$order(^people("by_name",name,pkey)) quit:pkey=""  
                                                             set account="" do
. . for set account=$order(^people("by_name",name,pkey,"accts",account)) 
                                                             quit:account="" do
. . . set name=$piece(^people("owners",pkey),"^")
. . . set acctype=$p($get(^accounts(account),"^"))
. . . set srvtype=^srvtypes(acctype)
. . . ; ============================================================
. . . ; The listbox portion of the combo box will hold peoples' names, account 
. . . ; numbers and service type. The DO command is used to execute the 
. . . ; Additem method to place these rows in the combo box.
. . . ; The innermost FOR loop (above) corresponds in a sense to the DO WHILE 
. . . ; loop in Listing One.
. . . ; ============================================================
. . . do _combo1.additem(name_" "_account_" "_srvtype)
. . . set _combo1.ItemAdd(_combo1.newindex)=pkey
. . . quit
. . quit
. Quit
quit
; ============================================================
; As a reminder, the above code is placed in the KeyPress event of combo box.
; If text area is touched with new input, data fetch and processing loop will
; execute again to recover rows from the database which satisfy the new input.
; This processing logic is that of VB and is independent of the 
; scripting/database language: BASIC/SQL vs. M
; ============================================================
<End of the KeyPress Event>
<Begin the Double-Click Event>
; ============================================================
; The following Visual-M code is placed under double-click event of combo box.
; If the end-user has double-clicked an item; a new query is issued to recover
; and display all elements of the row selected.
; First, detect which item in the combo box was selected just as in 
; Listing One but using Visual-M 
; ============================================================
set selected=_combo1.list
; ============================================================
; Now, get the row id from ItemData of the row selected.  Again, using 
; Visual-M
; ============================================================
set pkey=_Combo1.ItemData(selected)
; ============================================================
; recover the info in the combo box to reuse it.  This code uses the 
; $PIECE function emulated in VB by the 'piece' module
; ============================================================
set x=_Combo1.List(selected)
set name=$piece(x," ")
set account=$piece(x," ",2)
set srvtype=$piece(x," ",3)
; ============================================================
; issue M code to recover all information to be displayed on the form and
; display on the form.
; $$date is an extrinsic function to convert internal date format to 
; external format
; $$number is an extrinsic function to convert numbers to a desired 
; external format
; These might be defined in an Open M Include file and would 
; take arguments to define the output format.
; ============================================================
set _Pname.Text=name
; ============================================================
; Open M Direct Database Access is used to recover rows from the database 
; for the people 'table' and the account 'table'. The logic is to recover a 
; row into a private process variable (here 'x') and the assign the data to 
; the appropriate VB control. This code snippet and the one below combine the 
; function of the SELECT query in the Double-Click event of the combo box and 
; the assignments of the resulting data to the VB controls.
; ============================================================
set x=$get(^people("owners",pkey))
set _Pdob.Text=$$date($piece(x,"^",2),"dd/mm/yy")
set _Pssn.Text=$piece(x,"^",3)
set _Pacct_num.Text=account
; ============================================================
; recover 'account' row into local M variable and assign to VB controls.
; ============================================================
set x-$get(^account(account))
set _Pacct_odate.Text=$$date($piece(x,"^",3),"dd/mm/yy")
set _Pacct_cdate.Text=$$date($piece(x,"^",4),"dd/mm/yy")
set _Pacct_balance.Text=$$number($piece(x,"^",2),"$n.n")
set _Pacct_srvtype.Text=srvtype
quit
<End Combo Box Double-Click Event)
End Listings


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.