mvmf: mvmf language (MFL)

Executive summary: the mvmf language is called MFL, and is a mongrelization of a C-like language and a SIEVE-like language.

The syntax for mail filtering and mail disposition is mainly (entirely?) relegated to the "SIEVE" side of MFL. Sieve is a relatively easy to understand and easy to write language-- a lot of what you might want to do with your mail can be done entirely using Sieve constructs. It's easy enough that you can learn a lot about it just by browsing some examples. The C side of the MFL language is provided, in part, for those who want to orchestrate more elaborate control over mail delivery and over those SIEVE constructs.

This is not a language manual. This is more like a set of notes about MFL along with some simple examples. If you know "Sieve" (or know how to read the SIEVE RFC or look at some examples), and if you know "C" (or don't care about using the "C-like" side of MFL), these notes and examples should get you going.

You may want to skip directly to:

Background
MFL is like SIEVE
MFL is C-like
Misc notes
Future plans and ideas
Examples

Background

Although MFL may now be used in different utilities, it was developed for the mail delivery agent (now called mvmda). It was very tempting to invent something completely new for its language; for instance, logic-based or assertion-based languages seemed like they might fit the bill. But after a few of those flights of fancy, we decided that we wanted to find something that was easy to understand even by non-programmers, and yet might be of use to programmers as well. These utilities are really just for helping people deal with their mail, so we wanted something that at a basic level was fairly easy to use and configure and with which to achieve reasonable goals, but which could be used in more complicated ways for those who wanted to do so. We also wanted to get something accomplished and not go off on a language quest. So we decided to use a fairly simple syntax for the basic mail controls, but also allow the use of a procedural programming language, as well as other special extensions, to support more complex configuration. The procedural level we chose was like "C", so we call that part the "C-like" language.

We had run across the "SIEVE" language quite some years ago, when there was an internet draft put out by cyrusoft. In its form at that time, SIEVE looked reasonable: it provided some control structures that described some simple ways to look at mail, without being in itself a full-blown programming language. It seemed the ideal thing to wrap a procedural language around: making a nice union of a language providing control flow and complex evaluation and one providing basic mail handling syntax. There's been a bit of a cloud there, though: SIEVE was eventually codified into an RFC, and later modified and extended in various ways. Some of the extensions make the integration into an enclosing language more difficult. However, one does not have to accept all extensions, and indeed some of the extensions make a lot of sense.

At any rate, we combined a C-like procedural language and a SIEVE-like mail filtering language, and called it "MFL." To get the most benefit from MFL you need to use the SIEVE parts of the language-- and in fact the SIEVE parts can be used without using any C-like syntax. So we'll start there.

MFL is like SIEVE

The basic SIEVE definition is set out in RFC 5228, superseding RFC 3028 (see also the related reading area.) It is a control language that allows you to perform tests on parts of a mail message, and take actions that dispose of the mail message in various ways.

Because MFL combines a C-like syntax and a SIEVE syntax, all SIEVE language elements must be enclosed in a "sieve" block, which is the keyword sieve followed by a code block enclosed in curly braces. (Each utility using MFL may offer exceptions to this; for example the mvmda mail delivery agent can be instructed to assume that the script starts out in sieve mode.) A sieve block can appear anywhere in MFL that a C-like statement or an expression term can appear. (SIEVE constructs always return a value, even if that value is simply a completion status.)

Sieve statements fall into three broad categories: control, test, and action. A control statement affects the flow of control (e.g. by evaluating a test statement and conditionally executing other statements as a result). A test statement tests a condition, and an action statement performs some function such as saving mail into a mailbox. Any of these sorts of statements can be used as a SIEVE element, or SIEVE statements can be combined into a SIEVE program section.

For example, the following is a section of SIEVE code (enclosed, as normally required, in a "sieve" block):

    sieve {
        if header :is "From" "boss@example.com" {
            discard;
        }
        else {
            keep;
        }
    }

Whereas the following illustrates how a SIEVE element can be used as part of a C-like expression:

    int score;

    /* Assign a big score for this */
    score += 64 * sieve { header :is "from" "boss@example.com" };

    if (score > 500)
        sieve { discard; }
    else
        sieve { keep; }

Sieve implementation status

This is the current MFL implementation status for SIEVE language elements, and elements under consideration.

Statement Type Status Comments

RFC: 5228 (the fundamental SIEVE spec)

address Test Complete

allof Test Complete

anyof Test Complete

discard Action Complete

else Control Complete

elsif Control Complete

envelope Test Complete Requires capability "envelope"

exists Test Complete

false Test Complete

fileinto Action Complete Requires capability "fileinto";
Also see relevant MVMF Extensions notes below.

header Test Complete

if Control Complete

keep Action Complete

not Test Complete

redirect Action Complete

reject Action Moved "reject" was removed from the fundamental SIEVE language and moved to RFC5429. The action is still present in the language. But it's no longer rooted here.

require Action Complete

size Test Complete

stop Action Complete

text lexical Complete multi-line text literal using the keyword "text";

true Test Complete

# lexical Complete See "Misc Notes" section.

(encoded-character) character encoding Partially implemented Interpretation of special character encoding sequences such as ${hex:ab} was added to SIEVE in RFC5228.
"hex" encoded sequences are implemented; "unicode" sequences are not.
Note that strings containing nuls, which this permits, are not well supported in MFL and mvmf or in email messages, really.

[capability] encoded-character Required for the encoded-character sequences added in RFC5228
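
A minimal sketch of an encoded-character sequence, written as standard RFC 5228 SIEVE and not tested against mvmf. It assumes "require" is accepted at the start of a sieve block; the header name and the action are chosen only for illustration.

    sieve {
        require "encoded-character";

        # "${hex: 24 24}" is decoded to the two characters "$$"
        if header :contains "Subject" "${hex: 24 24}" {
            discard;
        }
    }

Without the "encoded-character" capability the same string is taken literally, so the test would look for the text "${hex: 24 24}" itself.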

RFC: 5231 (SIEVE extension: relational tests)

:count Tagged option Complete

:value Tagged option Complete

[capability] These elements require capability "relational"

[notes] Replaces RFC3431
It's unfortunate that :count and :value, defined in this extension, conflict with other match types, particularly in that they steal the comparison target from the second term of the test verb (e.g. "header"). This means you can't have :count applied to something using another match type, e.g. :contains, which would be more useful.
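
For reference, the two relational options look like this in standard RFC 5231 SIEVE. This is a sketch that has not been run under mvmf, and the mailbox names are hypothetical. Because :count and :value are themselves match types, there is no place left in the test for :is or :contains.

    sieve {
        require ["relational", "comparator-i;ascii-numeric", "fileinto"];

        # :count compares the number of addresses found, not their contents
        if address :count "gt" :comparator "i;ascii-numeric" ["to"] ["5"] {
            fileinto "SPAM";
        }
        # :value compares the header value itself
        elsif header :value "lt" :comparator "i;ascii-numeric" ["x-priority"] ["3"] {
            fileinto "Priority";
        }
    }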

RFC: 5233 (SIEVE extension: subaddress)

:detail Tagged option Complete

:user Tagged option Complete

[capability] Requires capability "subaddress"

[notes] Replaces RFC3598
Hardwired to recognize the '-' separator between detail and user (the last one, if there is more than one). Ideally, this would be more configurable.
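
A sketch of the two subaddress options in RFC 5233 form, untested here. The addresses and mailbox name are hypothetical, and the comments rest on an assumption: that the user part sits before the '-' separator and the detail part after it, mirroring the RFC's user+detail layout.

    sieve {
        require ["subaddress", "fileinto"];

        # Assumed to match a recipient such as ken-lists@example.org
        if address :detail :is "To" "lists" {
            fileinto "lists";
        }
        # Assumed to match ken@example.org as well as ken-anything@example.org
        elsif address :user :is "To" "ken" {
            keep;
        }
    }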

RFC: 3894 (SIEVE extension: copying without side effects)

:copy Tagged option Complete

[capability] Requires capability "copy"

RFC: 5293 (SIEVE extension: editheader)

addheader Action Complete

deleteheader Action Complete

replaceheader Action Complete was in early draft(s), no longer in the RFC

:index Tagged option Complete

:last Tagged option Complete

:newname Tagged option Complete was in draft, no longer in RFC

:newvalue Tagged option Complete was in draft, no longer in RFC

[capability] These elements require capability "editheader"

[variables] See the notes in the section about the Sieve "variables" extension.

[notes] Marked "complete" above, but implemented based on the draft-degener-sieve-editheader documents, not the RFC. That draft was eventually turned into draft-ietf-sieve-editheader, and then into RFC5293. Some of the later changes to the draft made it unsatisfactory to keep following it here. We kept the replaceheader stuff, as noted.
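
The following sketch uses the addheader and deleteheader forms as they appear in RFC 5293. Since the implementation follows the earlier drafts, details may differ under mvmf; replaceheader is left out because its draft-era syntax is not given in these notes.

    sieve {
        require "editheader";

        # Add a header field (placed at the start of the header unless :last is given)
        addheader "X-Sieve-Filtered" "<kim@job.example.com>";

        # Look only at the first Delivered-To field and delete it
        # if it contains this address
        deleteheader :index 1 :contains "Delivered-To" "bob@example.com";
    }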

RFC: 5230 (Vacation Extension)

vacation Action Complete

:addresses Tagged option Complete

:days Tagged option Complete

:from Tagged option Complete

:handle Tagged option Complete

:mime Tagged option Complete

:subject Tagged option Complete

[capability] Capability "vacation" required

[notes] This implementation requires the :from option as it does not want to guess the email address of the script owner.
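
Since :from is mandatory in this implementation, a vacation action would look roughly like the sketch below. The option syntax is that of RFC 5230; the address and texts are placeholders, and the script has not been run under mvmf.

    sieve {
        require "vacation";

        # :from is optional in the RFC but required here
        vacation :days 7
                 :from "me@example.com"
                 :subject "Out of office"
                 "I am away and will read your message when I return.";
    }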

Draft: draft-murchison-sieve-regex and draft-ietf-sieve-regex

:regex Tagged option Complete

[capability] Requires capability "regex"

[notes] Matched subparts are available via the C-like language elements.

[drafts] draft-murchison-sieve-regex expired and was later replaced by draft-ietf-sieve-regex, which then expired. The implementation in mfl was based on the murchison draft. mfl retains this, plus it has regex facilities in the C-like language side.
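
A small sketch of the :regex match type as described in the regex drafts, with a hypothetical mailbox name. It is not tested against mvmf, and it does not show how matched subparts are retrieved from the C-like side, since that interface is not described in these notes.

    sieve {
        require ["regex", "fileinto"];

        # True when the Subject starts with "Re: " or "Fwd: "
        if header :regex "Subject" "^(Re|Fwd): " {
            fileinto "replies";
        }
    }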

RFC: 5703 (MIME Part Tests, Iteration, Extraction, Replacement, and Enclosure)

break Action Complete Used in "foreverypart" which requires capability "foreverypart"

enclose Action ITIN Requires capability "enclose"

extracttext Action Unimplemented Requires capability "extracttext"
Requires capability "variables" and that the variables extension is configured into MFL.

foreverypart Control Completed Requires capability "foreverypart"

replace Action Unimplemented Requires capability "replace"

address Test Completed Enhanced by adding ":mime" and ":anychild" tagged options

exists Test Completed Enhanced by adding ":mime" and ":anychild" tagged options

header Test Completed Enhanced by adding ":mime" and ":anychild" tagged options, as well as ":type", ":subtype", ":contenttype", and ":param".

[capability] enclose Required for "enclose" action

[capability] extracttext Required for "extracttext" action

[capability] foreverypart Required for "foreverypart" and "break" actions

[capability] mime Required for ":mime" and ":anychild" tagged options in "header", "address", and "exists" tests

[capability] replace Required for "replace" action

[notes] MFL has its own ways of selecting MIME parts and once a part is selected, most SIEVE facilities pay attention (i.e., operate within that selected part). That conflicts with this extension, not to mention SIEVE principles. OTOH if you don't use the MFL methods of MIME selection, usually these conflicts become irrelevant.
One such conflict worth pointing out involves the options added to the "address", "exists", and "header" tests by this extension. If :mime and :anychild options are used in these tests, the RFC states that the anychild behaviour of looking at MIME components starts at the top part unless a "foreverypart" loop is in effect, in which case the descent starts at the current message part. This implementation always starts at the currently selected message part. This behaviour could conceivably be controlled by a parameter or by a compile-time decision in the future. As it is, the script writer could simply avoid selecting message parts, or always select the topmost part before doing whatever is being done.
A "foreverypart" loop can be used to aid in selecting parts where it is convenient. Such a selection becomes sticky, as I've said (probably too much), just like the MFL part selections are. That's a conflict with this RFC but it makes sense in the MFL environment (and would require some convolutions in code and in understanding not to do this).
As noted in the table, some elements, "enclose" and "extracttext" specifically, aren't implemented. Perhaps confusingly, these are recognized by the parser but result in "unimplemented" errors when it comes to execution. While "extracttext" might be implemented at some point, it will be tabled for now unless there's some urgent (or even any) need for it here. "enclose" - no idea.
That's brief: this probably needs its own page with more details.
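
To make the notes above a little more concrete, here is a sketch using the RFC 5703 test options and the "foreverypart" loop. The syntax is taken from the RFC, the mailbox names are hypothetical, and nothing here has been run under mvmf.

    sieve {
        require ["mime", "foreverypart", "fileinto"];

        # :anychild searches nested MIME parts. In MFL the descent starts at
        # the currently selected part, which is the top part only if nothing
        # else has been selected.
        if header :mime :anychild :contenttype :is "Content-Type" "text/html" {
            fileinto "html";
        }

        # Visit each part in turn; "break" leaves the loop early
        foreverypart {
            if header :mime :param "filename" :matches "Content-Disposition" "*.exe" {
                fileinto "suspect";
                break;
            }
        }
    }

Keep in mind that, as described in the notes, the part selected by the loop stays selected in MFL, so tests that follow it may operate on that part.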

RFC: 5229 (Variables Extension)

set Action Complete

string Test Complete

:length Tagged option Complete

:lower Tagged option Complete

:lowerfirst Tagged option Complete

:upper Tagged option Complete

:upperfirst Tagged option Complete

[capability] Requires capability "variables"

[notes] Some of the standard notes apply, such as lack of UTF-8 implementation and other such things.

[thoughts] Originally had mixed feelings about this. It provides a reasonable facility for the SIEVE language, but MFL already provides much more powerful access to variables. Still it was implemented relatively easily, partly because of existing MFL code. The "string" test was the most interesting thing.

[more] This is subject to conditional compilation, that is, compiled only if enabled when configured. Anything depending on SIEVE variables will also be subjected to that condition.
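
A short sketch of "set", its modifiers, and the "string" test, using the sample values from RFC 5229. It assumes the variables extension was enabled when MFL was configured, and it has not been run under mvmf.

    sieve {
        require "variables";

        set "a" "juMBlEd lETteRS";
        set :length "b" "${a}";              # b is "15"
        set :upperfirst :lower "c" "${a}";   # c is "Jumbled letters"

        # The "string" test compares an expanded string against a pattern
        if string :matches "${c}" "Jumbled*" {
            keep;
        }
    }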

RFC: 7352 ("duplicate" Extension)

duplicate Test Complete

[capability] Capability "duplicate" required

[notes] The RFC specifies that the registration of an ID into the duplicate tracking database is a side-effect of the duplicate test, and that the result of this registration will only affect duplicate tests in future script executions (not the current one). We honor this to a large degree, but there's an issue when the ":last" option is used. This involves using the last time the ID was checked in a previous script execution (not the current one) rather than the time the ID was first seen. We update the "last checked" time when we check it, so any additional duplicate checks of an ID within the current script are going to test against this new "last checked" time. This is contrary to the RFC. I get what the RFC wants here. However, deferring the update of the last checked time until after the script finishes means having some mechanism to remember these for later posting. That adds a whole lot of complication for very little gain, when a perfectly fine solution to this is for a script not to test the same ID using :last repeatedly.
Note that if the script fails or exits with an error, the transactions involving the duplicate history are reverted, so any "last time" is not remembered outside of this execution in that case. However, the changes are still visible within the script while the transaction is active.
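
A sketch of the pattern recommended above, where a given ID is tested with :last only once per script. The option syntax follows RFC 7352 and the 30-minute window is arbitrary; it has not been run under mvmf.

    sieve {
        require "duplicate";

        # One test per ID: a second ":last" test of the same ID later in
        # this script would see the "last checked" time updated here
        if duplicate :seconds 1800 :last {
            discard;
        }
    }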

Known SIEVE documents and drafts not implemented in mfl; starting with a few that are the most likely to be addressed

RFC: 5173 (Body Extension)

[capability] Capability "body" required

[thoughts] We may implement this, but have other ideas on the matter that fit into the MFL framework a little tighter. However, this could be useful.

[todo] TODO

RFC: 5429 (Reject and Extended Reject Extensions)

reject Action Redefines reject

ereject Action defines ereject

[capability] Requires capability "reject" and/or "ereject"

[notes] Concerns rejection of messages at SMTP time. Replaces the fundamental "reject" action and defines an "ereject" action.
Was originally draft-ietf-sieve-refuse-reject

[todo] not entirely supported.
mfl implemented "reject" that was in the original RFC (3028). The RFC 5228 update removed this action so that it could be more carefully introduced in another RFC (this one). The "reject" action remains implemented in mfl from the original implementation; other than that, this extension is unimplemented. The "ereject" action would be attractive, perhaps, where it could be implemented (e.g. in mvmtr).

RFC: 5235 (SIEVE extension: spamtest and virustest)

spamtest Test Unsupported

virustest Test Unsupported

[capability] Requires one or more of capabilities "spamtest," "spamtestplus," or "virustest."
Also requires that capability "relational" be enabled.

[notes] Supersedes RFC3685 and various draft-daboo-sieve-spamtest I-Ds.
This RFC specifies a couple of tests against any spam and virus analysis that may have been applied and normalized into simple status information by the underlying SIEVE implementation. There is no need for and therefore no plan to implement this, although it could be useful, and perhaps the need will arise.

RFC: 5435 (Extension for Notifications)

[todo] not supported.

RFC: 6609 (Include Extension)

[notes] "include" one SIEVE script from another.

[todo] not supported.
Have not looked at this since it was draft-daboo-sieve-include, and it was draft-ietf-sieve-include after that. It did not seem workable for mvmf at the time. MFL has its own include capability so there was (and is) no real personal urgency.

RFC: 6785 (Support for IMAP Events in SIEVE)

[n/a] Not applicable or no interest

RFC: 9042 (Delivery by IMAP MAILBOXID)

[n/a] Not applicable or no interest

RFC: 5232 (SIEVE extension: IMAP4flags)

[todo] Very low interest.

RFC: 5260 (SIEVE extension: Date and Index)

[todo] unimplemented

RFC: 8579 (Delivering to Special-Use Mailboxes)

[capability] Capability "special-use" required

[todo] Not implemented

RFC: 8580 (File Carbon Copy)

[capability] Capability "fcc" required

[todo] Not implemented

MVMF extensions and additions

MVMF Extension: C

C lexical Complete Introduces a block of C-like code, which must be enclosed in curly braces. This is not a Sieve extension, as it is part of MFL, and does not need to be enabled via a Sieve "require" statement.

MVMF Extension: dnsbl

dnsbl Test Complete Requires capability "vnd.mvmf.dnsbl" . See notes below.

:ip Tagged option Complete Specify an IP address to be tested, overriding the default.

MVMF Extensions to fileinto

:body Tagged option Complete File the body portion

:header Tagged option Complete File the header portion

:part Tagged option Complete File the currently selected MIME message part, rather than the entire message.
Message parts are selected explicitly via MFL functions outside of SIEVE, or as an implicit side-effect of SIEVE MIME extension (RFC5703) which is only done by MFL in contradiction to that RFC.

:raw Tagged option Complete When filing a body only (with tagged option :body and without :header), mvmf will attempt to decode the body if it is encoded in ways that mvmf can deal with. Use the :raw tagged option to prevent this.

[capability] Use of these options requires "vnd.mvmf.fileinto" capability.

[notes] These options are extensions provided by mvmf.
If neither :body nor :header is given, which is typical, both are assumed.
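
As an illustration of these options, the sketch below files different portions of a message. The placement of the tags before the mailbox name is an assumption based on how SIEVE tagged arguments normally work, and the mailbox names are hypothetical.

    sieve {
        require ["fileinto", "vnd.mvmf.fileinto"];

        # File only the header portion
        fileinto :header "headers-only";

        # File only the body, decoded where mvmf is able to decode it
        fileinto :body "bodies-decoded";

        # File only the body exactly as received, with no decoding
        fileinto :body :raw "bodies-raw";
    }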

MVMF Extensions in general

See below for notes about MVMF extensions.

MVMF Extension: sieve

sieve lexical Complete Introduces a block of sieve code, which must be enclosed in curly braces. This allows a script writer to write code that is guaranteed to be in "sieve" mode without having to know the encompassing context. Useful, for example, for script code that is meant to be included (via @include) by some other script. This is not a Sieve extension, as it is part of MFL, and does not need to be enabled via a Sieve "require" statement.

Misc Notes

:comparator "i;ascii-casemap" and "i;octet" are complete;
i;ascii-numeric is also implemented but is not appropriate in all cases. This comparator requires capability "comparator-i;ascii-numeric" .
i;ascii-casemap is the default.

# comments As of the 20050825 release, MFL supports the "#"-style end of line comment. This can conflict somewhat with MFL's preprocessor statements if you enable the '#' character as a preprocessor introducer character. However, also as of the 20050825 release the default preprocessor introducer character has been changed to '@' which does not conflict with this style of comment. Note that MFL also supports the C-like "//" syntax to begin a comment to the end of the line, as well as "/*..*/" bracketed comments.
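
The three comment styles mentioned above, shown together in a sketch. It assumes the default '@' preprocessor introducer, so that '#' is free to begin a comment.

    // C-like comment running to the end of the line
    /* C-like bracketed comment */
    sieve {
        # SIEVE-style comment running to the end of the line
        keep;
    }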

Statement	Type	Status	Comments
RFC: 5228 (the fundamental SIEVE spec)
address	Test	Complete
allof	Test	Complete
anyof	Test	Complete
discard	Action	Complete
else	Control	Complete
elsif	Control	Complete
envelope	Test	Complete	Requires capability "envelope"
exists	Test	Complete
false	Test	Complete
fileinto	Action	Complete	Requires capability "fileinto"; Also see relevant MVMF Extensions notes below.
header	Test	Complete
if	Control	Complete
keep	Action	Complete
not	Test	Complete
redirect	Action	Complete
reject	Action	Moved	"reject" was removed from the fundamental SIEVE language and moved to RFC5429. The action is still present in the language. But it's no longer rooted here.
require	Action	Complete
size	Test	Complete
stop	Action	Complete
text	lexical	Complete	multi-line text literal using the keyword "text";
true	Test	Complete
#	lexical	Complete	See "Misc Notes" section.
(encoded-character)	character encoding	Partially implemented	Interpretation of special character encoding sequences such as ${hex:ab} was added to SIEVE in RFC5228. "hex" encoded sequences is implemented. "unicode" is not. Note that strings containing nuls, which this permits, are not well supported in MFL and mvmf or in email messages, really.
[capability]	encoded-character	Required for the encoded-character sequences added in RFC5228
RFC: 5231 (SIEVE extension: relational tests)
:count	Tagged option	Complete
:value	Tagged option	Complete
[capability]	These elements require capability "relational"
[notes]	Replaces RFC3431 It's unfortunate that :count and :value, defined in this extension, conflict with other match types, particuarly in that they steal the comparison target from the second term of the test verb (e.g. "header"). This means you can't have :count applied to something using another match type, e.g. :contains, which would be more useful.
RFC: 5233 (SIEVE extension: subaddress)
:detail	Tagged option	Complete
:user	Tagged option	Complete
[capability]	Requires capability "subaddress"
[notes]	Replaces RFC3598 Hardwired to recognize '-' separator between detail and user (the last one, if there are more than one). Ideally, this would be more configurable.
RFC: 3894 (SIEVE extension: copying without side effects)
:copy	Tagged option	Complete
[capability]	Requires capability "copy"
RFC: 5293 (SIEVE extension: editheader)
addheader	Action	Complete
deleteheader	Action	Complete
replaceheader	Action	Complete	was in early draft(s), no longer in the RFC
:index	Tagged option	Complete
:last	Tagged option	Complete
:newname	Tagged option	Complete	was in draft, no longer in RFC
:newvalue	Tagged option	Complete	was in draft, no longer in RFC
[capability]	These elements require capability "editheader"
[variables]	See the notes in the section about the Sieve "variables" extension.
[notes]	Marked "complete" above, but implemented based on the draft-degener-sieve-editheader documents not the RFC. That was eventually turned into draft-ietf-sieve-editheader, and then to rfc5293. Some changes to the draft made continuing to follow some changes unsatisfactory here. We kept the replaceheader stuff, as noted.
RFC: 5230 (Vacation Extension)
vacation	Action	Complete
:addresses	Tagged option	Complete
:days	Tagged option	Complete
:from	Tagged option	Complete
:handle	Tagged option	Complete
:mime	Tagged option	Complete
:subject	Tagged option	Complete
[capability]	Capability "vacation" required
[notes]	This implementation requires the :from option as it does not want to guess the email address of the script owner.
Draft: draft-murchison-sieve-regex and draft-ietf-sieve-regex
:regex	Tagged option	Complete
[capability]	Requires capability "regex"
[notes]	Matched subparts are available via the C-like language elements.
[drafts]	draft-murchison-sieve-regex expired and was later replaced by draft-ietf-sieve-regex, which then expired. The implementation in mfl was based on the murchison draft. mfl retains this, plus it has regex facilities in the C-like language side.
RFC: 5703 (MIME Part Tests, Iteration, Extraction, Replacement, and Enclosure)
break	Action	Complete	Used in "foreverypart" which requires capability "foreverypart"
enclose	Action	ITIN	Requires capability "enclose"
extracttext	Action	Unimplemented	Requires capability "extracttext" Requires capability "variables" and that the variables extension is configured into MFL.
foreverypart	Control	Completed	Requires capability "foreverypart"
replace	Action	Unimplemented	Requires capability "replace"
address	Test	Completed	Enhanced by adding ":mime" and ":anychild" tagged options
exists	Test	Completed	Enhanced by adding ":mime" and ":anychild" tagged options
header	Test	Completed	Enhanced by adding ":mime" and ":anychild" tagged options, as well as ":type", ":subtype", ":contenttype", and ":param".
[capability]	enclose	Required for "enclose" action
[capability]	extracttext	Required for "extracttext" action
[capability]	foreverypart	Required for "foreverypart" and "break" actions
[capability]	mime	Required for ":mime" and ":anychild" tagged options in "header", "address", and "exists" tests
[capability]	replace	Required for "replace" action
[notes]	MFL has its own ways of selecting MIME parts and once a part is selected, most SIEVE facilities pay attention (i.e., operate within that selected part). That conflicts with this extension, not to mention SIEVE principles. OTOH if you don't use the MFL methods of MIME selection, usually these conflicts become irrelevant. Such a conflict that's worth pointing out is with the options added to "address", "exists", and "header" tests by this extension. If :mime and :anychild options are used in these tests, the RFC states that the anychild behaviour of looking at MIME components starts at the top part unless a "foreverypart" loop is in effect, in which case the descent starts at the current message part. This implementation always starts at the currently selected message part. This behaviour (these behaviours) could conceivably be controled by a parameter or by a compile-time decision in the future. As it is, the script writer could simply avoid selecting message parts, or always selecting the topmost before doing whatever is being done. "foreverypart" loop can be used to aid in selecting parts where it is convenient. Such a selection becomes sticky, as I've said (probably too much), just like the MFL part selections are. That's a conflict with this RFC but it makes sense in the MFL environment (and would require some convolutions in code and in understanding not to do this). As noted in the table some elements, "enclose" and "extracttext" specifically, aren't implemented. Perhaps confusingly, these are recognized by the parser but result in "unimplemented" errors when it comes to execution. While "extracttext" might be implemented at some point, it will be tabled for now unless there's some urgent (or even any) need for it here. "enclose" - no idea. That's brief: this probably needs its own page with more details.
RFC: 5229 (Variables Extension)
set	Action	Complete
string	Test	Complete
:length	Tagged option	Complete
:lower	Tagged option	Complete
:lowerfirst	Tagged option	Complete
:upper	Tagged option	Complete
:upperfirst	Tagged option	Complete
[capability]	Requires capability "variables"
[notes]	Some of the standard notes apply, such as lack of UTF-8 implementation and other such things.
[thoughts]	Originally had mixed feelings about this. It provides a reasonable facility for the SIEVE language, but MFL already provides much more powerful access to variables. Still it was implemented relatively easily, partly because of existing MFL code. The "string" test was the most interesting thing.
[more]	This is subject to conditional compilation, that is, compiled only if enabled when configured. Anything depending on SIEVE variables will also be subjected to that condition.
RFC: 7352 ("duplicate" Extension)
duplicate	Test	Complete
[capability]	Capability "duplicate" required
[notes]	The RFC specifies that the registration of an ID into the duplicate tracking database is a side-effect of the duplicate test, and that the result of this registration will only affect duplicate tests in future script executions (not the current one). We honor this to a large degree, but there's an issue when the ":last" option is used. This involves using the last time the ID was checked in a previous script execution (not the current one) rather than the time the ID was first seen. We update the "last checked" time when we check it, so any additional duplicate checks of an ID within the current script are going to test against this new "last checked" time. This is contrary to the RFC. I get what the RFC wants here. However, deferring the update of the last checked time until after the script finishes means having some mechanism to remember these for later posting. That adds a whole lot of complication for very little gain, when a perfectly fine solution to this is for a script not to test the same ID using :last repeatedly. Note that if the script fails or exits with an error, the transactions involving the duplicate history are reverted, so any "last time" is not remembered outside of this execution in that case. However the changes are still visible within script while the transaction is active.

Known SIEVE documents and drafts not implemented in mfl; starting with a few that are the most likely to be addressed
RFC: 5173 (Body Extension)
[capability]	Capability "body" required
[thoughts]	We may implement this, but have other ideas on the matter that fit into the MFL framework a little tighter. However, this could be useful.
[todo]	TODO
RFC: 5429 (Reject and Extended Reject Extensions)
reject	Action	Redefines reject
ereject	Action	defines ereject
[capability]	Requires capability "reject" and/or "ereject"
[notes]	Concerns rejection of messages at SMTP time. Replaces the fundamental "reject" action and defines an "ereject" action. Was originally draft-ietf-sieve-refuse-reject
[todo]	not entirely supported. mfl implemented "reject" that was in the original RFC (3028). The RFC 5228 update removed this action so that it could be more carefully introduced in another RFC (this one). The "reject" action remains implemented in mfl from the original implementation; other than that, this extension is unimplemented. The "ereject" action would be attractive, perhaps, where it could be implemented (e.g. in mvmtr).
RFC: 5235 (SIEVE extension: spamtest and virustest)
spamtest	Test	Unsupported
virustest	Test	Unsupported
[capability]	Requires one or more of capabilities "spamtest," "spamtestplus," or "virustest." Also requires that capability "relational" be enabled.
[notes]	Supersedes RFC3685 and various draft-daboo-sieve-spamtest I-Ds. This RFC specifies a couple of tests against any spam and virus analysys that may have been applied and normalized into simple status information by the underlying SIEVE implementation. There is no need for and therefore no plan to implement this, although it could be useful, and perhaps the need will arise.
RFC: 5435 (Extension for Notifications)
[todo]	not supported.
RFC: 6609 (Include Extension)
[notes]	"include" one SIEVE script from another.
[todo]	not supported. Have not looked at this since it was draft-daboo-sieve-include, and it was draft-ietf-sieve-include after that. It did not seem workable for mvmf at the time. MFL has its own include capability so there was (and is) no real personal urgency.
RFC: 6785 (Support for IMAP Events in SIEVE)
[n/a]	Not applicable or no interest
RFC: 9042 (Delivery by IMAP MAILBOXID)
[n/a]	Not applicable or no interest
RFC: 5232 (SIEVE extension: IMAP4flags)
[todo]	Very low interest.
RFC: 5260 (SIEVE extension: Date and Index)
[todo]	unimplemented
RFC: 8579 (Delivering to Special-Use Mailboxes)
[capability]	Capability "special-use" required
[todo]	Not implemented
RFC: 8580 (File Carbon Copy)
[capability]	Capability "fcc" required
[todo]	Not implemented

MVMF extensions and additions
MVMF Extension: C
C	lexical	Complete	Introduces a block of C-like code, which must be enclosed in curly braces. This is not a Sieve extension, as it is part of MFL, and does not need to be enabled via a Sieve "require" statement.
MVMF Extension: dnsbl
dnsbl	Test	Complete	Requires capability "vnd.mvmf.dnsbl" . See notes below.
:ip	Tagged option	Complete	Specify an IP address to be tested, overriding the default.
MVMF Extensions to fileinto
:body	Tagged option	Complete	File the body portion
:header	Tagged option	Complete	File the header portion
:part	Tagged option	Complete	File the currently selected MIME message part, rather than the entire message. Message parts are selected explicitly via MFL functions outside of SIEVE, or as an implicit side-effect of SIEVE MIME extension (RFC5703) which is only done by MFL in contradiction to that RFC.
:raw	Tagged option	Complete	When filing a body only (with tagged option :body and without :header), mvmf will attempt to decode the body if it is encoded in ways that mvmf can deal with. Use the :raw tagged option to prevent this.
[capability]	Use of these options requires "vnd.mvmf.fileinto" capability.
[notes]	These options are extensions provided by mvmf. If neither :body nor :header is given, which is typical, both are assumed.
MVMF Extensions in general
	See below for notes about MVMF extensions.
MVMF Extension: sieve
sieve	lexical	Complete	Introduces a block of sieve code, which must be enclosed in curly braces. This allows a script writer to write code that is guaranteed to be in "sieve" mode without having to know the encompassing context. Useful, for example, for script code that is meant to be included (via @include) by some other script. This is not a Sieve extension, as it is part of MFL, and does not need to be enabled via a Sieve "require" statement.
Misc Notes
:comparator	"i;ascii-casemap" and "i;octet" are complete; i;ascii-numeric is also implemented but is not appropriate in all cases. This comparator requires capability "comparator-i;ascii-numeric" . i;ascii-casemap is the default.
# comments	As of the 20050825 release, MFL supports the "#"-style end of line comment. This can conflict somewhat with MFL's preprocessor statements if you enable the '#' character as a preprocessor introducer character. However, also as of the 20050825 release the default preprocessor introducer character has been changed to '@' which does not conflict with this style of comment. Note that MFL also supports the C-like "//" syntax to begin a comment to the end of the line, as well as "/*..*/" bracketed comments.

* ITIN - Inexpedient To Implement Now

MVMF Language extensions

MVMF extension: C

The C statement allows C-like code to be parsed and executed inside a SIEVE block. C-like code is included in curly braces. E.g.:

    sieve {
       if header "to" "user@example.com" {
           C { to_me = 1; }
           keep;
       }
    }

This is not a Sieve extension; it is simply part of the MFL implementation of Sieve as an embedded language, and thus is not enabled via a Sieve "require" statement.

MVMF extension: dnsbl

The dnsbl statement is used to test against one or more DNS-based blocklists/blacklists (called DNSBLs). It takes an optional ":ip" tagged option, plus two arguments, each of which may be a string or string list:

    dnsbl  [:ip <ipaddr: string>] <blnames: string-list>  <result-codes: string-list>

With the :ip option, the given IP address ipaddr is tested against the specified DNSBLs. No mail message need be open to use this form.

With no :ip option, the dnsbl statement tests each responsible IP address (each IP address that is believed to be responsible for transporting the message to the local server) against the specified DNSBLs.

The statement returns true if the IP address was found in one of the DNSBLs, and false if it was not.

Note: a list of responsible IP addresses is maintained by any application that includes and supports this language construct. Your MFL code may call a built-in-function "$msg_rip_add()" to add an IP address to this list. This would normally happen when the application calls a specifically-named MFL function, i.e. a "hook," at a relevant point in its processing. For example, the mvmda (Mail Delivery Agent) calls a hook when it has opened and scanned the incoming message. See each application's documentation for descriptions of any hooks supported.

DNSBL blacklist names and result code types are registered in a system-wide file dnsbl.conf normally located in /usr/local/share/mvmf . The blacklist name identifies a domain name suffix to be used for DNSBL lookups, and a result code is a mnemonic name for a result returned by that DNSBL. Generally all blnames have a result code "std" defined as their standard result. Some DNSBLs have various results indicating various things. Result code "*" will match any result returned by the DNSBL lookup.

The code section:

    sieve {
        if dnsbl ["spamcop", "njabl"] "std" {
            discard; stop;
        }
    }

tests all responsible IP addresses against the standard result codes of both the "spamcop" and "njabl" DNSBLs, discarding the message if one of the IP addresses is listed.

This code:

    int flag;
    flag = sieve { dnsbl :ip "127.0.0.2" "spamhaus" "sbl" };

sets the variable "flag" depending on whether the specific IP address 127.0.0.2 is found in the "spamhaus" sbl DNSBL, as does this code:

    string prefix = "127.";
    int flag = sieve { dnsbl :ip [ prefix + "0.0.2" ] "spamhaus" "sbl" };

The "dnsbl" capability must be enabled via SIEVE's "require" statement in order to use this statement, using a capability name of "vnd.mvmf.dnsbl" . Earlier MFL implementations used a capability name of "dnsbl" instead of the more proper vendor-specific name. When you configure and build the mvmf package you can still choose to support the old capability name as an alias.
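
As a minimal sketch of how the pieces fit together (assuming "spamcop" is registered in your dnsbl.conf, as in the earlier example), a script would enable the capability and then test against any result code:

    sieve {
        require ["vnd.mvmf.dnsbl"];
        if dnsbl "spamcop" "*" {
            discard; stop;
        }
    }

Here a single string is given instead of a string list for the blocklist name, and "*" is used so that any result returned by the lookup counts as a hit.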

MVMF Extension: pipe in "fileinto"

If the target of a "fileinto" statement begins with a vertical bar ("|"), it's taken to mean that the mail message should be piped into the command following the vertical bar. This capability is restricted, though; it works only if the "pipe_allow" control has been administratively enabled. (Administrative controls are addressed elsewhere.)

    // Assumes that this has been executed somewhere in admin mode:
    $admin_int_set( "pipe_allow", 1 );
             .
             .
             .
    sieve {
        fileinto "|process-report";
    }

MFL also has an interface to system-defined plugins using the $cusp_ family of built-in-functions. The CUSP interface is intended for helper applications that have a more complex interface than simply piping a message into an external program. See, for example, the clamdif interface to a clamav anti-virus daemon.

MVMF Extension: expressions in SIEVE constructs

Some SIEVE constructs have been extended so that an MFL expression can occur in some places. In general, anywhere that a string can be used in a SIEVE statement, you can place an MFL expression. However, because of some syntax conflicts between SIEVE string lists and MFL expressions (see below), such expressions must appear only within the string list format (i.e., enclosed in square brackets). For a rather contrived example:

  string my_other_domain;
  my_other_domain = "example.com";
  sieve {
     if not address :domain "To" [ my_other_domain ] {
         keep;
     }
     else {
         redirect [ (string)"myself@" + my_other_domain ];
     }
  }

What's the syntax conflict that requires that expressions only be used inside square brackets? The problem comes down to the fact that in SIEVE statements, terms outside of square brackets are separated by spaces, while terms inside of square brackets are separated by commas. Imagine allowing expressions anywhere, such as in this potential case:

    string header_name = "subject";
    sieve {
        if header :contains header_name ["ADV"] { discard; }
    }

If the parser is allowed to look for an expression outside of a string list (i.e., outside of square brackets), it can easily think that

    header_name [ "ADV" ]

follows the syntax of an array reference. While this simple case might seem easy to resolve, more complex cases are not. Fortunately, terms inside of string lists are separated by commas, removing that kind of ambiguity there. (Use of commas in that part of the SIEVE language seems a mite inconsistent, but I'm not complaining.)
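
Following the bracket rule described above, the ambiguous case can presumably be written by putting the variable inside a string list of its own. This is a sketch based on that rule, not a tested script:

    string header_name = "subject";
    sieve {
        if header :contains [ header_name ] ["ADV"] { discard; }
    }

With the brackets in place, header_name is parsed as an expression term of the first string list, and ["ADV"] is a separate string list that follows it.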

MVMF Extension: message part selection

MFL knows about the MIME structure of messages, and has the concept of a "current message part." All header tests are done in the context of this current message part. In the default state, the top-level message part (i.e., the message headers) is selected. MFL scripts may select other message parts (e.g. the children of a multipart message part). Let's say you have a message whose top level content type is "multipart/alternative" with two children, one with content type "text/plain" and the next with "text/html". Consider these three statements:

    /* A */  sieve { header :matches "content-type" "multipart/*" }
    /* B */  sieve { header :matches "content-type" "text/plain*" }
    /* C */  sieve { header :matches "content-type" "text/html*" }

With the top (default) message part selected, only statement A evaluates to TRUE. With the first child selected, only B is true, and with the second child selected, only C is true.

MFL is like C

MFL's enveloping language for procedural and logic flow is C-like in nature. (We won't explain "C" here, but if you are reading this far you probably either know it or can find out about it.) We say "C-like" because it gets its data typing and control flow from C, but it doesn't implement a full C language.

What's in MFL's C-like component: fundamental and compound data types, expressions, control flow statements, variables, initializers, functions, and a cpp-like preprocessor.

What's not: switch statement (and case labels), function prototypes (except for function definitions), local variables inside any compound block (including functions).

Oddities: MFL C-like variables may contain "$". Thus "$a" and "a$" are legal variable names. Functions supplied as part of mvmf will always name variables and functions starting with '$' -- other script writers should avoid doing that.
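
Since there is no switch statement, value dispatch has to be spelled out as an if/else chain. A small sketch, with variable names invented for illustration and declared at the top level because local variables inside blocks are not available:

    int code;
    string folder;

    code = 2;
    if ( code == 1 )
        folder = "inbox";
    else if ( code == 2 )
        folder = "lists";
    else
        folder = "other";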

Status of C-like implementation

The following table shows the implementation status of various C-like elements of MFL.

Thing Status Comments

Fundamental data types and modifiers

unsigned Supported May be used as a modifier to an integer type, or by itself as an abbreviation for unsigned int

short Supported May be used as a modifier to int, or by itself as an abbreviation for short int

long Supported May be used as a modifier to int, or by itself as an abbreviation for long int

char Supported A 1-byte value

int Supported A natural integer (currently 2-byte value)

$int4$ Supported MFL extension to guarantee 4-byte int

(short int is 2 bytes; long int is 4 bytes.)

float Supported Floating point number

double Supported Double precision floating point number

string Supported MFL extension for character strings.

Aggregates and metatypes

typedef Supported Defines a new type in terms of another type definition

struct Supported A data structure

union Supported A data overlay

enum Supported Enumerated types. See note "E".

[] Supported Arrays.

* Supported Pointers. See note "P".

Control statements

break Supported Exit loop.

continue Supported Next loop iteration.

do Supported Loop control.

if Supported Conditional execution

else Supported (implemented as part of "if")

for Supported Loop control

return Supported Return from a function (with optional return value)

switch..case Not supported Value detection -- no plan to support this

while Supported Loop control

pv$ Supported MFL extension to print to stdout. See note "PV".

[built-in functions] Supported Described here

[MFL functions] Supported User-written functions (see below)

Expressions and evaluation

C Supported MFL extension to introduce a C-like code block, which must be enclosed in curly braces. Useful to guarantee that a script is in C-like mode, e.g. for a code snippet that is intended to be included by another script.

sizeof Supported Returns number of bytes of a variable, storage element, type, or expression. See note "SO".

sieve Supported MFL extension to introduce a SIEVE code block, which must be enclosed in curly braces.

( ) Supported Parenthetical grouping for explicit precedence

? : Supported Conditional expression: test ? truth : falsth

! Supported "!" Operator (boolean not)

~ Supported "~" Operator (bitwise complement)

, Supported "," Operator (return second of two expressions)

= Supported "=" Operator or assignment

== Supported "==" Operator (compare equal)

==^ Supported "==^" Operator (string compare equal, ignore case) See note "S".

!= Supported "!=" Operator (compare not equal)

!=^ Supported "!=^" Operator (string compare not equal, ignore case) See note "S".

=. Supported "=." Operator (regex matching, pattern on RHS) See note "S".

!=. Supported "!=." Operator (regex non-matching, pattern on RHS) See note "S".

=? Supported "=?" Operator (glob-style matching, pattern on RHS) See note "S".

=?^ Supported "=?^" Operator (glob-style matching, ignore case, pattern on RHS) See note "S".

!=? Supported "!=?" Operator (glob-style non-matching, pattern on RHS) See note "S".

!=?^ Supported "!=?^" Operator (glob-style non-matching, ignore case, pattern on RHS) See note "S".

< Supported "<" Operator (compare less than)

<^ Supported "<^" Operator (string compare less than, ignore case) See note "S".

<= Supported "<=" Operator (compare less than or equal)

<=^ Supported "<=^" Operator (string compare less than or equal, ignore case) See note "S".

<< Supported "<<" Operator (shift left)

<<= Supported "<<=" Assignment operator (shift left)

> Supported ">" Operator (compare greater than)

>^ Supported ">^" Operator (string compare greater than, ignore case) See note "S".

>= Supported ">=" Operator (compare greater than or equal)

>=^ Supported ">=^" Operator (string compare greater than or equal, ignore case) See note "S".

>> Supported ">>" Operator (shift right)

>>= Supported ">>=" Assignment operator (shift right)

+ Supported "+" Operator (add)

+= Supported "+=" Assignment operator (add)

++ Supported "++" Operator (increment)

- Supported "-" Operator (subtract)

-= Supported "-=" Assignment operator (subtract)

-- Supported "--" Operator (decrement)

* Supported "*" Prefix operator (pointer dereference)

* Supported "*" Infix operator (multiply)

*= Supported "*=" Assignment operator (multiply)

/ Supported "/" Operator (divide)

/= Supported "/=" Assignment operator (divide)

% Supported "%" Operator (modulo)

%= Supported "%=" Assignment operator (modulo)

[ Supported "[" Operator(kinda) (array reference)

& Supported "&" infix operator (bitwise AND)

& Supported "&" Prefix operator (address-of)

&& Supported "&&" Operator (boolean AND)

&= Supported "&=" Assignment operator (bitwise AND)

| Supported "|" Operator (bitwise OR)

|| Supported "||" Operator (boolean OR)

|= Supported "|=" Assignment operator (bitwise OR)

. Supported "." Operator (member reference)

-> Supported "->" Operator (member reference)

Preprocessor

MFL sports a basic cpp-like preprocessor; this section lists the preprocessor elements you might expect. The preprocessor is conceptually responsible for removing comments and interpreting preprocessor directives. Directives are indicated in a script by using '@' as the first character on the line (i.e., in the first column). (The use of '#', as with C, conflicts with the Sieve-mandated comment characters. Nevertheless when you configure mvmf you can enable the use of '#' instead of or in addition to the '@' character.)
There are more elaborate notes about the use of the preprocessor later in this document.

@define Supported Defines a preprocessor constant or macro.

@else Supported Starts the "else" part of a preprocessor conditional

@endif Supported Ends a preprocessor conditional block

@help Supported Prints the supported preprocessor statements (useful only in interactive mode)

@ifdef Supported Begins a conditional block that is executed if a preprocessor symbol is defined.

@ifndef Supported Begins a conditional block that is executed if a preprocessor symbol is not defined.

@include Supported Includes the contents of another MFL file at this point in the compilation/interpretation

/*..*/ Supported Block comment

// Supported Comment to end of line

Preprocessor extensions

MFL has some other preprocessor directives.

@ifdef_func Supported Begins a conditional block that is executed if an mfl function is defined.

@ifdef_var Supported Begins a conditional block that is executed if an mfl variable is defined.

@ifndef_func Supported Begins a conditional block that is executed if an mfl function is not defined.

@ifndef_var Supported Begins a conditional block that is executed if an mfl variable is not defined.

@include_noerr Supported Like @include, but doesn't complain if the file is not available. Useful for loading control files that don't have to exist.


Note E: A specific assignment to an enum member definition is not supported, e.g.:

    enum {
        aa,
        bb=3,
        cc }

does not work in MFL.
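
One way around this, sketched here with ordinary variables and initializers rather than an enum (the names are just the ones from the example above), is:

    int aa = 0;
    int bb = 3;
    int cc = 4;

This gives up the enum type but keeps the specific values.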

Note P: Pointers are supported inasmuch as you can point to some other data storage defined in an MFL program. Pointers are constrained at run-time only to reference a particular data object.

Note PV: pv$ is basically a hack to allow debugging printouts. You can print a single value, e.g.:

    int x;
    pv$ x;

or you can print a printf-like format string and a single argument, e.g.:

    int x;
    x = 23;
    pv$ "x is %d\n", x;

Note S: These string comparison operators are MFL extensions.

Note SO: The MFL interpreter does late type binding and late evaluation; there is currently no way for the interpreter to figure out the type of an expression without evaluating it. sizeof can give you the size of an expression, but note well that the expression will be evaluated in the process. E.g. in:

   int x = 0;
   int sx;

   sx = sizeof( x = 3 );

sx will be the size of the expression (an int), and x will be set to 3!
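
If you only want a size and no side effects, a sketch of the safer forms is to hand sizeof a type name or a plain variable, both of which are listed as accepted operands in the status table. The parenthesized type form is assumed to follow C:

    int x = 0;
    int sx;

    sx = sizeof( int );     // size of the type; nothing to evaluate
    sx = sizeof( x );       // evaluates x, but that does not change it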

Initializers

Definitions of most variable types can include initializers (one exception is unions). Examples:

    int x = 3;
    int y = {3};

instantiate x and y and set both values to 3. (It's an inconsistency of C syntax (so we follow it) that scalar initializers can optionally be enclosed in braces, yet initializers for scalars within aggregates cannot.)

    struct {
        int key;
        string val;
    } kvt[3] = {
        { 10, "key 1" },
        { 20, "key 2" },
        { 30, "key 3" } };
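
Once initialized, the kvt array can be walked with ordinary C-like constructs. A minimal sketch, using only elements from the status table (for, array reference, member reference) and the pv$ form that takes a format string and one argument:

    int i;
    for ( i = 0; i < 3; ++i ) {
        pv$ "key %d\n", kvt[i].key;
    }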

Statements and blocks in expressions

MFL allows compound blocks to be used as terms in an expression. The statements inside of the compound blocks still need to be fully-formed statements themselves. The value of the compound block is the value of the last statement executed in it.

    int i;
    i = { 3 + 7 };		// This is wrong
    i = { 3 + 7; };		// This is correct
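
A block may also hold several statements, in which case only the last one supplies the value. A sketch under the rule just stated (variable names are illustrative):

    int i;
    int j;
    i = { j = 3; j + 7; };	// value of the block is the value of "j + 7;"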

The mvmf application may also be configured (when it is compiled) to allow the use of some native C-like statements as expression terms. These statements include do, for, if, pv$, and while. sizeof is always available as a term, while break, return, and continue never are. As with statements inside of compound blocks, the statement as an expression term must still be fully-formed, which can result in some odd-looking code, as in this contrived example:

    int i;
    i = if ( foo() ) 3; else 4; ;	// looks odd
    i = if ( foo() ) {3;} else {4;} ;	// perhaps better.

Strings

Strings are a native type in MFL. A string's basic type is a fixed length even though it refers to a string that may change size. For this reason a string variable can easily be included as a member of an aggregate (e.g. arrays or structs) -- the string data element is an anchor for the string, and not the string itself. (This may sound like a pointer, but it's not: it obeys native type semantics, not pointer semantics.)

Strings are implemented using something called a refstr, which is a view of a referenced string. Multiple views to common strings may be obtained via string pointers (i.e. (string *)) which can be dereferenced to access their reference target. When a string is modified, any views into the underlying string object are modified to reflect the change. For example, consider this MFL code sequence:

    string s = "I am a test";
    string *sp = $str_sub(s, 7, 4);       // points to "test"
    string *s1P = $str_sub(s, 2, 2);      // points to "am"
    *s1P = "used to be";

The string s is now

    "I used to be a test"

and the string pointer sp still points to

    "test"

Every string, including the targets of string pointers, has the following attributes:

Start offset

The offset into the base string at which this refstr begins. This offset is absolute, but it may be subject to change either explicitly or implicitly when some other refstr alters the base string (see discussion above). A value of -1 means that the offset is anchored at the beginning of the string, despite any attempts to move it.

End offset

Just like the start offset, this is where in the base string that the refstr ends. A value of -1 anchors the end offset at the end of the base string.

Current byte index aka "bx"

Some string operations keep track of a current byte index. For example, built-in-functions to find or extract a token from a string will use the bx as a start point, and will update it to provide the next start point -- this allows repeated "find token" operations to step through the string. The bx may be used internally in some cases as well. Note that the bx is associated with each refstr, including the refstr that is the target of a pointer. So:

      string s = "hello there";
      string *sP;
      sP = $str_sub(s, 4, 4);	// "o th"
      $str_bx_set(*sP,3);	// Sets to 3, the 'h' position
      $str_bx(s);		// will initially be 0
      $str_bx(*sP);		// is still 3.

Some operations on strings:

Conversions.

Converting from a non-string to a string will attempt to do the right thing. e.g. '(string)3' yields the string "3". Converting from a string to a non-string will also attempt to do something reasonable: e.g. assigning from a string to an int will perform a C-like 'atol()' function on the string.

"+"

Adding something to a string will convert the second term to a string if possible, then return the concatenation of the two strings. So '(string)"hello" + 3' yields the string '"hello3"' .

Comparisons

'==', '!=', '<', '<=', '>=', '>' do what you might expect.

Regex matches

'=.' performs a regex match on a string, with the pattern on the right hand side. e.g. in:

       string s = "hello";
       if ( s =. "h.*o" )
           some code here;

the test would succeed. '!=.' is the negated version of the test.

Wild matches

'=?' and '!=?' are similar to regex matching, but using more familiar (to some) matching where '?' matches exactly one character and '*' matches zero or more characters. This is just like the sieve ":matches" match-type.

Case insensitivity

Some of the string match operators have a case-insensitive mode which is indicated by using a '^' after the operator. These are: "==^", "!=^", "=?^", "!=?^", "<^", "<=^", ">^", and ">=^" . The '^' is supposed to suggest some kind of case shifting.
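
To illustrate the glob-style operators, here is a small sketch. The subject text and variable names are made up, and it assumes, as in C, that a comparison yields an integer truth value:

    string subj = "Re: Quarterly Report";
    int is_reply;
    int has_report;

    is_reply   = ( subj =? "Re: *" );       // '*' matches the rest of the subject
    has_report = ( subj =?^ "*report*" );   // '^' ignores case, so "Report" matches

Without the '^', the second pattern would not be expected to match, since the subject has a capital "R".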

Notes about string pointers:

General

Every string is a refstr which is a view into an underlying string. The only way to manipulate this view is through string pointers. Various built-in functions such as $str_sub provide ways to create a refstr to another string; the reference is always done via a pointer. When you dereference the pointer, you are operating on the string.

"+" and "-"

If you add to or subtract from a string pointer, you shift its position into the underlying string. Both the start and ending positions will be adjusted, unless they run off the end of the reference string. Consider:

       string s = "abcdefgh";
       string *sp;
       sp = $str_sub( s, 3, 2 );	// Now points to "de"
       ++sp;				// Now points to "ef"
       sp - 4;				// points to "ab"
       sp - 5;				// points to "a"

Note that a string literal is not a string until it is coerced into one. Since those kinds of coercions happen automatically in many places you might not notice the need for it. But be aware that:

    "abcdefg" + 3

is essentially an array reference to the character at index 3 of the character array (not a string!) "abcdefg", while

    (string)"abcdefg" + 3

evaluates to the string "abcdefg3" since the first term is coerced via the typecast.

SIEVE strings vs MFL strings

This section has talked about strings in the C language mode of MFL. Making things a bit more confusing, SIEVE mode also has its own string type and string operations. (Much or maybe all of this was added to SIEVE well after MFL came along.) SIEVE strings may have some capabilities such as supporting references to SIEVE variables (which are also different from C-language variables) and special ways of embedding lexical elements (like those enabled with the "encoded-character" capability). The C side of MFL does not handle special SIEVE string characteristics by default. It may provide ways to do so explicitly, e.g. via the $str_sieve() built-in function.

Built-in functions

A limited number of built-in functions are supported. They are described in a separate document.

MFL functions

An MFL function has a C-like syntax, with a declaration of a return type, a formal argument list, and a function body. One quirk of MFL functions is that a function is treated syntactically like a variable declaration, one side-effect of which is that it has to be terminated with a semicolon (or a comma and another declaration using the same type). An MFL function therefore looks something like this:

    /* Recursive function to return digits of an integer
       separated by spaces
    */
    string dp( int n ) {
        if ( n < 10 ) return (string)n;
        return dp( n/10 ) + " " + (string)(n%10);
    };

Using the above, dp(25821) returns the string "2 5 8 2 1".
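
The comma form mentioned above is easy to overlook. Here is a sketch of it, assuming the rule works exactly as described; the names twice and total are invented for illustration.

```
/* A function followed by a comma and another declaration of
   the same type (int), instead of a terminating semicolon.
*/
int twice( int n ) {
    return n + n;
}, total = 0;
```

If this reading is right, the final semicolon ends one int declaration that introduces both the function twice and the variable total.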

Preprocessor

As with the "C-like" side of the language, we should call this "preprocessor-like" or more exactly "cpp-like" because it implements some of the functions of C's "cpp" preprocessor program. Because MFL includes a self-contained parser and interpreter, the preprocessor functions are built into the lexical input stage, and are thus not part of a separate preprocessor program. Nevertheless they do offer basic preprocessor capabilities. Here's a brief statement of what those capabilities are.

Comment removal. Conceptually, the preprocessor removes comments from the script before it is interpreted. There are two styles of comments: block comments and rest-of-line comments.

A block comment begins with the pair of characters /* and ends with the pair */. Comments do not nest: once a comment block is opened with /* the next */ closes it, even if another /* is encountered first.

A rest-of-line comment begins with the pair of characters // or with the single character # and ends at the end of the line. This is useful for annotating a single statement.

The following illustrates both kinds of comments:

    /* Basic SIEVE setup */
    sieve {
	require ["fileinto", "envelope"];
    }

    int score = 0;		// Declare and initialize score
    float f;			# A temporary

    /* Now look for a special "X-Spam-Score" header and adjust
       our integer score according to the floating point value
       found there.
    */
    if ( sieve { header :matches "X-Spam-Score" "*" } ) {
	f = $str_match(0);	// pick up the score value
	if ( ( f >= 9.0 ) && ( f <= 9.9 ) )
	    ++score;		// Significant value bumps score.
    }

As you can see: block comments can span lines, rest-of-line comments continue only to the end of the line, and (sometimes) too many comments can obscure the meaning rather than amplify it. (Unless we are illustrating how comments are implemented, of course.)
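
The rule that block comments do not nest is worth a small sketch of its own. The content is invented for illustration; the point is only where the comment ends.

```
/* Temporarily disabled:
   /* this second opener is just more comment text
*/
int score = 0;      // parsed normally: the block comment has already ended
```

The comment closes at the first closing pair, so the declaration of score is live code. If you had meant the inner opener to start a nested comment, a second closing pair placed after it would be left outside any comment and would presumably be seen by the parser.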

Macro substitution. The preprocessor has its own symbol table. Symbols in this table are variously thought of as preprocessor symbols, macro names, or manifest constants. The combination of a symbol and its value may be thought of as a macro. Whenever the preprocessor encounters one of these symbols in the input stream (e.g., in the script), the value that has been assigned to that symbol is used instead of the actual symbol. This is known as macro substitution or macro expansion. A macro is created via the "@define" preprocessor directive, described below.

There are two kinds of macros: those with arguments and those without. Actual arguments to a macro are supplied in a parenthesized list, with arguments separated by commas. (Depending on the way MFL is built, whitespace may or may not be allowed between the macro name and the opening parenthesis. To be safe, don't use whitespace here: follow the macro name immediately by an opening parenthesis.)

A simple example of macro definition and substitution:

    @define ALTADDR "fred@example.com"

    sieve {
        redirect ALTADDR;
    }

Macro substitution for ALTADDR occurs before the "redirect" statement is parsed: that statement is parsed exactly as if it were written:

	redirect "fred@example.com";

A macro with arguments:

    @define aab(a,b) (a+a+b)

    int i = aab(3,4);

The preprocessor turns that into:

    int i = (3 + 3 + 4);

initializing variable i with a value of 10.
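
The parentheses in the macro value above are not decoration. Because substitution works on tokens rather than values, a macro value without them can combine with its surroundings in unexpected ways. This is a sketch with invented macro names (BARE, SAFE), assuming C-like left-to-right evaluation of "-":

```
@define BARE(a,b) a - b
@define SAFE(a,b) (a - b)

int x = 10 - BARE(5,2);     // substituted as: 10 - 5 - 2
int y = 10 - SAFE(5,2);     // substituted as: 10 - (5 - 2)
```

Under that assumption x and y end up with different values, even though both macros look like "a minus b".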

Macro references cannot be recursive. While a macro is being expanded, it is prevented from further expansion until its value is completely substituted. (It's said that the macro is "painted blue" while it is ineligible for expansion.) This prevents a macro value from referring to the macro name itself, or to the name of another macro being expanded. Note also that macro substitution occurs on a token-by-token basis. Since a quoted string is an individual token, any macro names inside a quoted string are not substituted.
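
Two short sketches of these rules follow. The variable names are invented, and the second example reflects our reading of the "painted blue" rule rather than tested behavior.

```
@define ALTADDR "fred@example.com"

string a = ALTADDR;             // substituted: string a = "fred@example.com";
string b = "send to ALTADDR";   // unchanged: ALTADDR is inside a quoted string token

int limit = 10;
@define limit (limit + 1)
int n = limit;                  // expected to expand once, to: int n = (limit + 1);
```

In the last line the inner "limit" is not expanded again, because the macro limit is ineligible for expansion while its own value is being substituted.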

Preprocessor directives. The preprocessor is otherwise commanded via preprocessor directives. A preprocessor directive is indicated by a @ character at the first character position on a line, followed by a recognized preprocessor command. For compatibility with older mvmf releases, when you build and install mvmf you can choose to enable '#' as an alternative preprocessor introducer character, in place of or in addition to '@'.

Note that the indentation in various examples is for clarity only: the @ (or '#') must occur at column 1. Whitespace may occur between the @ and the command, and in fact is encouraged as a way to indicate a nesting level. However, when one talks about a preprocessor directive, it's usually by the concatenation of the @ and the command name.

Preprocessor directives:

@define macroname[(arguments)] [value]

Define a macro, either with or without formal arguments. Its simplest form is:

    @define NAME  value

so that whenever NAME is encountered in the script after this, the value is used instead. Formal arguments may also be given by including them in parentheses directly following the macro name. Each occurrence of a formal argument in the macro body will be replaced by the actual argument when the macro is invoked. For example:

    @define RS(rcpt) sieve { redirect rcpt; }

    RS( "bozo@example.com" )

will cause this code to be used:

    sieve { redirect "bozo@example.com"; }

Macros can be used to stand in for commonly used strings or sequences, particularly for code that might be changed from time to time (thus you'd only have to change the macro definition rather than changing code in multiple places in your script). As with C guidelines, making your macro names uppercase gives a visual clue that macro names are being used.

@else

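Reverses the sense of the enclosing conditional: the script between @else and the matching @endif is parsed only if the part before the @else was skipped, and is skipped otherwise.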

@endif

Ends a conditionally parsed block.

@help

Prints a list of valid preprocessor commands. Useful only in interactive mode.

@ifdef macroname

Tests whether a name exists in the preprocessor symbol table; if it does, allows interpretation of the script up to the next matching @else or @endif directive. If it doesn't, the script up to the next matching @else or @endif directive is not parsed. Example:

    /* @define DROPMAIL  */

    sieve {
    @ifdef DROPMAIL
	discard;
    @else
	keep;
    @endif
    }

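That is, if DROPMAIL is defined to the preprocessor, the discard statement is parsed (and therefore executed); otherwise the keep statement is. In the example as written, the @define is inside a comment, so DROPMAIL is not defined and the keep branch is used.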
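
Conditionals can be nested, and whitespace between the @ and the command name is a handy way to show the nesting while keeping the @ at column 1. This is a sketch with invented macro names (DEBUG, VERBOSE, LOGLEVEL):

```
@ifdef DEBUG
@  ifdef VERBOSE
@    define LOGLEVEL 2
@  else
@    define LOGLEVEL 1
@  endif
@else
@  define LOGLEVEL 0
@endif

int loglevel = LOGLEVEL;
```

Each @else and @endif pairs with the nearest unmatched @ifdef above it, so exactly one of the three definitions of LOGLEVEL is parsed.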

@ifdef_func funcname

Tests whether an MFL function exists.

@ifdef_var varname

Tests whether an MFL variable exists.

@ifndef macroname

Tests whether a name doesn't exist in the preprocessor symbol table, and performs conditional script parsing based on that. In other words, this is the inverse of @ifdef.
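
One common use, borrowed from C practice, is a guard that keeps a block from being parsed more than once. The macro name COMMONLOADED is invented for this sketch, and the @define is given no value, which the syntax summary for @define appears to allow.

```
@ifndef COMMONLOADED
@define COMMONLOADED

/* ... definitions that should be parsed only once ... */
int score = 0;

@endif
```

The first time this text is parsed, COMMONLOADED is not yet defined, so the block is parsed and the macro is defined. On any later pass the @ifndef test fails and the block is skipped.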

@ifndef_func funcname

Tests whether an MFL function doesn't exist.

@ifndef_var varname

Tests whether an MFL variable doesn't exist.
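
The function and variable tests are presumably used like @ifdef and @ifndef, ending with @endif. For instance, a script might supply a fallback definition of a function only when an earlier script has not already defined it. The sketch below reuses the dp function from the MFL functions notes and assumes that form.

```
@ifndef_func dp
string dp( int n ) {
    if ( n < 10 ) return (string)n;
    return dp( n/10 ) + " " + (string)(n%10);
};
@endif
```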

@include fileref

Inserts another script file at the point in the script where the @include is encountered. fileref is either a quoted string, in which case it refers to a user-level file, or a name enclosed in angle brackets, in which case it refers to a system-level file. Each sort of access has its own include path, i.e. a list of directories that will be searched for the file (for a user-level file, the file is always looked for relative to the current directory before trying the user-level include path). Every application that uses MFL will establish at least one directory in the system-level path, and you can add directories to both paths, e.g. by using the built-in function $mfl_incdir_add().

    @include "data.mfl"

inserts the contents of the file data.mfl located in the current directory (e.g., your home directory) or in the user-level include path, and

    @include <common.mfl>

inserts the contents of the file common.mfl that is found along the system-level include path.

@include_noerr fileref

Like @include, but ignores any error accessing the file.
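
This pairs naturally with @ifndef for optional local settings. In the sketch below the file name local.mfl is invented: if that file exists and defines ALTADDR, its value is used; if the file is missing (or doesn't define ALTADDR), the default applies.

```
@include_noerr "local.mfl"

@ifndef ALTADDR
@define ALTADDR "fred@example.com"
@endif

sieve {
    redirect ALTADDR;
}
```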

Preprocessor statements are always acted on at parse time. You might have an elaborate MFL function that you store in its own file; using something like

    @include "bigfunction.mfl"

will always load and parse that file whether or not you ever need the function (assuming that this does not occur in a false preprocessor condition). If you only want to load the function when you know you are going to need it, you can include the file using a run-time parse-and-execute function, e.g.:

    sieve {
	if envelope :is "from" "monthly-report@example.com" {
	    C {
		$mfl_exec_string( "@include \"bigfunction.mfl\"" );
		bigfunction();
	    }
	}
    }

Misc notes

Depending on how it was compiled, each mvmf application may have the capability of executing system-wide or user-level MFL scripts when it starts. These can be used to define commonly-used functions, hook functions that are called automatically by the application at certain stages, variables, and so forth.

Each application may also call specially-named MFL functions at particular points in the utility's execution. You may supply these hook functions to affect some aspect of the application's operation at the point the hook is called.

Application-specific details such as these are described along with each utility that incorporates MFL.

Future plans and ideas

This section has been incorporated into the To Do page.

Examples

Examples are primarily relevant to each utility; please see the documentation for each mvmf application that uses MFL.