Comments on EAP state machine v4
From: Florent Bersani (florent.bersanird.francetelecom.fr)
Date: Wed, 9 Jun 2004 05:10:40 -0400 (EDT)
Hi all,

Below are some comments that came to my mind while reading the latest version of the EAP state machine draft I am aware of (http://www.cs.umd.edu/~npetroni/EAP/drafts/ietf-eap-statemachine/04/draft-ietf-eap-statemachine-strawman-04.pdf).

I know that these comments may:
* Come much too late (the IETF last call ended on 2004-05-13, so IINM this is no longer the time to propose any "real" changes to this document)
* Be completely naive or off the mark
* Echo some comments that have already been made (although I tried my best to reread - and understand ;-) - the EAP state machine issues tracked by Bernard)


I just felt like making them:
* Because the probability that they be of some help somehow is - I hope - non-zero
* Because the probability that someone kindly replies to educate me, is - I hope - non-zero
* For the record


So here they go.

Comment #1 - Editorial

In section 3.2 state machine symbols, we can read:
"+ Arithmetic addition operator.
- Arithmetic subtraction operator."

I find these definitions useless. IINM we never see the notation + in the state machines, and the notation ++ (which is perhaps why this was included) should either be defined directly or be assumed understandable, since section 3.1 says: "The interpretation of the special symbols and operators used in the state diagrams is as defined in Section 3.2; these symbols and operators are derived from the notation of the C++ programming language, ISO/IEC 14882." BTW, I cannot find any use of - or -- in the document. And anyway, if + and - are nevertheless necessary to understand the document, it should be stated on which space they operate (N, Z/2**16Z, Z/2**32Z...).
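
To illustrate why the space matters, here is a minimal Python sketch. The function name increment is made up for this example and nothing in it comes from the draft; it only shows that the same ++ gives different results depending on the space it operates on.

```python
def increment(value, modulus=None):
    """Return value + 1 in N when modulus is None, otherwise in Z/modulusZ."""
    if modulus is None:
        # Natural numbers: no wrap-around.
        return value + 1
    # Modular arithmetic: the result wraps around at modulus.
    return (value + 1) % modulus


in_n = increment(65535)             # 65536
in_z16 = increment(65535, 2**16)    # 0, wraps around
in_z32 = increment(65535, 2**32)    # 65536, no wrap yet
```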

Comment #2 - Editorial

This is about portEnabled and eapRestart.

As far as portEnabled is concerned, this discussion is a follow-up on issues 198 and 203. I still find the name of this variable too .1X centric and, in the light of the recent debates on corner cases on EAP starts and restarts, I'd prefer a more explicit name (like Yoshi had proposed, for instance). I'd also like the current definition (e.g. in section 4.1.1 "portEnabled (boolean) Indicates that the EAP peer state machine should be ready for communication. This is set to TRUE when the EAP conversation is started by the lower layer. If at any point the communication port or session is not available, portEnabled is set to FALSE and the state machine transitions to DISABLED.") to be clarified (and why not with two example instantiations, one .1X and one IKEv2 for instance) to take the "new"? problems into account (e.g. section 7.12 of RFC 3748b "In IEEE 802.11, a "link down" indication is an unreliable indication of link failure, since wireless signal strength can come and go and may be influenced by radio frequency interference generated by an attacker. To avoid unnecessary resets, it is advisable to damp these indications, rather than passing them directly to the EAP. Since EAP supports retransmission, it is robust against transient connectivity losses.").


For eapRestart, my problem is much the same: two concrete examples (.1X, PPP or IKEv2 for instance) would considerably help to understand what this variable stands for.

Comment #3 - Editorial

This is about the initialization of lastId, which is not done in the EAP peer state machine. This had been pointed out in issue 229 but apparently not taken into account.
It should also be specified, e.g. in section 4.3.1, that lastId may take the value "NONE".


Comment #4 - Editorial

This is about idleWhile. From my understanding, this timer steadily decreases and when it reaches 0, the peer may time out. Clearly, knowing the initial value of this timer, by a simple subtraction, one can get to know how long the peer has been waiting.
So, contrary to the definition in section 4.1.1 ("idleWhile (integer) Outside timer used to indicate how long the peer has waited for a new (valid) request."), I'd rather say that this timer indicates how long remains before the peer may time out.
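
A minimal Python sketch of this reading of idleWhile follows. The class and method names are hypothetical and only illustrate the countdown semantics described above; they are not taken from the draft.

```python
class IdleWhileTimer:
    """Countdown timer: `remaining` is the value the state machine would see."""

    def __init__(self, initial_seconds):
        self.initial = initial_seconds
        self.remaining = initial_seconds

    def tick(self, seconds=1):
        # An outside clock decrements the timer; it never goes below 0.
        self.remaining = max(0, self.remaining - seconds)

    def waited(self):
        # Time already waited, recovered by a simple subtraction
        # (capped at the initial value once the timer has expired).
        return self.initial - self.remaining

    def may_time_out(self):
        # The peer may time out once the countdown has reached 0.
        return self.remaining == 0
```

With an initial value of 60, after 25 ticks the timer itself holds 35 (the time remaining), while the time waited, 25, is only a derived quantity.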


Comment #5 - Technical

This is just a triviality about, for instance, the peer state machine: thanks to the EAP idleWhile timer, the method does not have to set timers (EAP takes care of it). However, if the method wants to implement a "bad packet received" counter (e.g. it is waiting for a packet and, to provide DoS resilience, wants to tolerate a limited number of "bad packets" before the right one instead of going automatically to failure), it has to do so by itself (and will typically use altReject if it wants to fail before the timeout). This is not an issue, but perhaps it would be worth discussing the usefulness of such a behavior for EAP methods (see e.g. RFC 3748 section 7.5: "Whether a MIC validation failure is considered a fatal error or not is determined by the EAP method specification") and noting that it can indeed be implemented in the EAP state machines (with a little asymmetry between the timer implemented within EAP and the "bad packet received" counter implemented within the method). I guess this comment is a way for me to express that I wholeheartedly agree with point 3 Joe made in issue 203 (in other words, the intertwining of EAP and EAP methods borders on a layer violation).
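
The kind of counter meant here could look like the following Python sketch. Everything in it is an assumption made for illustration: the class TolerantMethod, the threshold, and the alt_reject attribute (which merely stands in for the altReject indication) are not defined by the draft or by RFC 3748.

```python
class TolerantMethod:
    """Hypothetical EAP method that tolerates a few bad packets."""

    def __init__(self, max_bad_packets=3):
        self.max_bad_packets = max_bad_packets
        self.bad_packets = 0
        self.alt_reject = False  # stand-in for the altReject indication
        self.done = False

    def process(self, packet_is_valid):
        """Handle one received packet and report what was done with it."""
        if self.done or self.alt_reject:
            return "closed"
        if packet_is_valid:
            self.done = True
            return "accepted"
        # The counter lives in the method, unlike the timer, which lives in EAP.
        self.bad_packets += 1
        if self.bad_packets > self.max_bad_packets:
            # Fail before the EAP timer expires.
            self.alt_reject = True
            return "failed"
        return "ignored"
```

The asymmetry is visible in the sketch: the method never touches a timer, but it owns the counter and the decision to give up early.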

Comment #6 - Technical

This is about DONE, CONT and MAY_CONT/UNCOND_SUCC, COND_SUCC and FAIL.

While I do not doubt that there are good technical reasons to use these variables (rather than simply CONT and DONE) and that the EAP state machine does not claim to be THE way to implement EAP (in its introduction "The State Machine and associated model are informative only. Implementations may achieve the same results using different methods"), I think that briefly giving the rationale behind this choice (which is not explicit in section 4.2 IMHO) would help the reader. In particular, an example of MAY_CONT's usefulness would be welcome.

About the decision variable, here also an explanation of the design (maybe with an example) could help. Indeed, it seems to me that not all pairs (state, decision) are acceptable so state/decision are not totally independent. Here again, giving an example why COND_SUCC was introduced could help.

I think this concern is also related to the conditions in the state machine that allow the peer to transition to success or failure. They do not appear to be either trivial or symmetric. The newbie that I unfortunately am needs much more time to (fully) understand them than any other transition condition in the state machine. Bernard, for instance, raised questions about these Success/Failure transitions in issue 229. For instance, I am wondering how the condition "altAccept && methodState != CONT && decision == FAIL" may occur.
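
To make the question concrete, the quoted condition can be enumerated mechanically. The Python sketch below is only an illustration: it lists just the methodState and decision values named in this message (the draft may define others), and the function names are made up.

```python
from itertools import product

# Only the values named in this message; the draft may define more.
METHOD_STATES = ("CONT", "MAY_CONT", "DONE")
DECISIONS = ("FAIL", "COND_SUCC", "UNCOND_SUCC")


def quoted_condition(alt_accept, method_state, decision):
    """altAccept && methodState != CONT && decision == FAIL"""
    return alt_accept and method_state != "CONT" and decision == "FAIL"


def satisfying_combinations():
    """Return every (altAccept, methodState, decision) triple that satisfies it."""
    return [
        (a, m, d)
        for a, m, d in product((False, True), METHOD_STATES, DECISIONS)
        if quoted_condition(a, m, d)
    ]
```

Under these assumptions only two triples remain, (TRUE, MAY_CONT, FAIL) and (TRUE, DONE, FAIL), so the question becomes: when can altAccept be TRUE although the method has already decided FAIL?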

Also in section 4.2, I tend to feel dizzy with some text in the paragraph methodState=DONE: "If both (a) the server has informed us that it will allow access and the next packet will be EAP Success, and (b) we're willing to use this access, set decision=UNCOND_SUCC." I guess that condition (a) should rather be formulated in terms of altAccept, shouldn't it? Indeed, while IIRC RFC 3748 mandates (in section 4.2: "The authenticator MUST transmit an EAP packet with the Code field set to 3 (Success)") that a success packet be sent, this does not guarantee that the peer will ever receive it.

Comment #7 - Editorial

I do not find any definition of m.buildResp (which is used in Figure 3, EAP peer state machine) in section 4.4 (Peer state machine procedures). Moreover, I would find it clearer if m.buildResp somehow indicated that it depends not only on reqId but also on some internal method state that has been calculated by m.process.
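
The dependency meant here can be sketched as follows in Python. The class EchoMethod and its byte-reversing "processing" are placeholders invented for this example; only the idea that build_resp reads state left behind by process reflects the comment above.

```python
class EchoMethod:
    """Hypothetical method whose response depends on state set by process()."""

    def __init__(self):
        self.pending_payload = None  # internal state computed by process()

    def process(self, request_payload):
        # Placeholder computation: a real method would do its crypto here.
        self.pending_payload = request_payload[::-1]

    def build_resp(self, req_id):
        # The response needs the identifier AND the state left by process().
        if self.pending_payload is None:
            raise RuntimeError("process() must run before build_resp()")
        return (req_id, self.pending_payload)
```

Calling build_resp(7) after process(b"abc") yields (7, b"cba"), whereas calling it first fails, which is exactly the hidden dependency a definition of m.buildResp could make explicit.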

Comment #8 - Editorial

Although section 4.4 says that parseEapReq() "checks that the length field is not longer than the received packet", it is not completely clear to me from Figure 3 what the peer does in case there is a parse error due to the length (or an invalid code, type or...).
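
One possible reading, sketched in Python: a packet that fails the length or code check is simply dropped. The function name parse_eap_req and the choice of returning None are assumptions for this example, not the draft's procedure; the header layout (Code, Identifier, 2-octet Length) and the code values are those of RFC 3748.

```python
import struct

REQUEST, SUCCESS, FAILURE = 1, 3, 4  # EAP codes a peer expects to receive


def parse_eap_req(data):
    """Return (code, identifier, payload), or None if the packet is dropped."""
    if len(data) < 4:
        return None  # too short to hold Code, Identifier and Length
    code, identifier, length = struct.unpack("!BBH", data[:4])
    if length < 4 or length > len(data):
        return None  # Length field inconsistent with what was received
    if code not in (REQUEST, SUCCESS, FAILURE):
        return None  # e.g. a Response (code 2) makes no sense for a peer
    # Octets beyond Length are treated as padding and ignored.
    return code, identifier, data[4:length]
```

Whatever the intended behavior is (silent discard, as here, or something else), spelling it out next to Figure 3 would remove the ambiguity.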

Comment #9 - Technical

Apparently, Figure 4 (EAP Standalone Authenticator State Machine) leaves the door open to a sequence of EAP authentication methods (which is explicitly forbidden by RFC 3748 section 2.1: "However, the peer and authenticator MUST utilize only one authentication method (Type 4 or greater) within an EAP conversation"). This behavior may be prevented thanks to Policy.getDecision or Policy.getNextMethod, but I do not think this is exactly a matter of policy and, at the very least, this should be pointed out (i.e. that the policy MUST forbid this behavior).
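
A policy that enforces the restriction could look like the Python sketch below. The class SingleMethodPolicy and its methods are hypothetical and are not the Policy procedures of the draft. The sketch assumes that a proposal the peer NAKed does not count as a method having been used, so another method may still be offered until the peer actually answers one.

```python
class SingleMethodPolicy:
    """Hypothetical policy: at most one authentication method per conversation."""

    def __init__(self, offered_types):
        self.offered = list(offered_types)  # method types, in preference order
        self.committed = None               # type the peer has actually answered

    def peer_answered(self, method_type):
        # Called when the peer responds with the same type as the request.
        if self.committed is None:
            self.committed = method_type

    def get_next_method(self):
        # Once a method has been used, never propose a second one.
        if self.committed is not None:
            return None
        if not self.offered:
            return None
        return self.offered.pop(0)
```

With offered types 4 and 13, the policy may propose 4, then 13 after a NAK, but returns nothing more once the peer has answered method 13.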

Comment #10 - Technical

Why include a separate TIMEOUT_FAILURE State? Why not use the FAILURE state?

Comment #11 - Editorial

eapTimeout does not seem to be defined in the text.

Comment #12 - Technical

This one is stupid, but what happens, according to Figure 4, when the standalone authenticator fails directly, i.e. starts in INITIALIZE, transitions to SELECT_ACTION where Policy.getDecision replies FAILURE, and thus transitions to FAILURE? In the FAILURE state, I bet there is some problem with eapReqData = buildFailure(currentId), since currentId=NONE.
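
The suspected problem is easy to reproduce in a Python sketch. The function build_failure below is a guess at what buildFailure has to do (an EAP Failure packet is Code 4, Identifier, Length 4 per RFC 3748); how the draft intends to handle a missing identifier is precisely the open question, so the explicit error is an assumption.

```python
import struct

NONE = None   # stand-in for the draft's NONE value
FAILURE = 4   # EAP code for Failure


def build_failure(current_id):
    """Build a 4-octet EAP Failure packet carrying current_id."""
    if current_id is NONE:
        # No request was ever sent, so there is no identifier to echo.
        raise ValueError("no identifier available for the Failure packet")
    return struct.pack("!BBH", FAILURE, current_id, 4)
```

build_failure(7) returns the four octets 04 07 00 04, while build_failure(NONE) has nothing sensible to put in the Identifier field.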

Comment #13 - Editorial

Why call section 5 "standalone authenticator"? I bet this is the old story of the glass half full or half empty, because I'd rather have standalone EAP server.

Comment #14 - Technical

I am a total novice when it comes to DoS (I found a lot of papers on the subject, for instance related to IKE - I plan to read them soon :-)), so this point is probably not very important (my understanding is that one of the difficulties with DoS is to understand what is really relevant and what rather belongs to the .11 microwave oven attack; another one could be to set the trade-off between DoS resilience and "efficiency").

It just seems to me that Figure 4 prevents the standalone authenticator from ignoring (bogus) NAKs. Indeed, let us consider a corporate WLAN deployment where exactly one EAP method is allowed, so that no valid user will ever NAK. In this setting, there is no point in processing the NAK, possibly losing the valid user's response if the attacker's NAK arrived first, and starting all over. I did not find text on this in RFC 3748 (the text I found was about preventing NAKs when a response to a method has already been received, which is not our case here).
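
The filtering suggested here could be as small as the Python sketch below. The function accept_response and its parameters are invented for this example; only the legacy Nak (Type 3) is handled, and nothing here is taken from Figure 4.

```python
NAK_TYPE = 3  # legacy Nak, RFC 3748


def accept_response(response_type, current_method, allowed_methods):
    """Return True if the response should be handed to the state machine."""
    if response_type == NAK_TYPE:
        # A NAK is only useful if there is another method to fall back to;
        # with a single allowed method it can be ignored outright.
        return any(m != current_method for m in allowed_methods)
    # Otherwise only a response to the method in progress is of interest.
    return response_type == current_method
```

In the single-method deployment described above, accept_response(3, 13, [13]) is false, so a forged NAK is dropped and the authenticator keeps waiting for the valid user's response.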

Comment #15 - Editorial

The title of Section 6 is "EAP Backend Authenticator" which I find quite strange. I'd rather suggest "backend authentication server".

Comment #16 - Editorial

In section 6.2 we find: "The only difference is that some methods on the backend may support "picking up" a conversation started by the pass-through. That is, the EAP Request packet was sent by the pass-through, but the backend must process the corresponding EAP Response. Usually only the Identity method supports this, but others are possible"
Would it be possible to explain whether this possibility is explicitly left open by some document (in this case, which one) or is implicitly allowed (and in this case, whether there are already settings/implementations which use such a possibility)?


Comment #17 - Technical

I fail to understand the transition in Figure 7 from INITIALIZE_PASSTHROUGH to AAA_IDLE when currentId==NONE, given that AAA_IDLE sets aaaEapResp=TRUE.

Comment #18 - Editorial

If the purpose of aaaIdentity is to allow encapsulation of the peer's identity in e.g. the RADIUS User-Name attribute, I'd rather have aaaIdentity be directly the identity of the peer than a whole EAP packet.
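
To show what "directly the identity" would save the AAA side from doing, here is a Python sketch that extracts the identity from a whole EAP-Response/Identity packet. The function name peer_identity is made up; the layout (Code 2, Identifier, Length, Type 1, then the identity octets) follows RFC 3748.

```python
import struct

RESPONSE = 2       # EAP code for Response
IDENTITY_TYPE = 1  # EAP type for Identity


def peer_identity(eap_packet):
    """Return the identity octets of an EAP-Response/Identity, or None."""
    if len(eap_packet) < 5:
        return None  # no room for the header plus the Type octet
    code, _identifier, length, eap_type = struct.unpack("!BBHB", eap_packet[:5])
    if code != RESPONSE or eap_type != IDENTITY_TYPE:
        return None
    if length < 5 or length > len(eap_packet):
        return None
    # Everything between the Type octet and Length is the identity.
    return eap_packet[5:length]
```

If aaaIdentity carried the result of such a function rather than the packet itself, the code filling in the User-Name attribute would not need to know the EAP packet format at all.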

Comment #19 - General

Just a late and useless remark: I tend to dislike EAP being too 802.1X centric, since EAP is (or at least should be) media independent. Therefore, the problems implied by splitting the server side into an authenticator and an authentication server probably do not (originally) belong to EAP.
Thus, for the sake of clarity, I would rather have had two separate documents: one for the EAP state machine (peer and standalone server) and one for the EAP state machines in case the server side is split (the split case is however also evoked in RFC 3748 - which I would also have avoided)...



Florent, at least you now know the reactions of a naive reader while reading the EAP state machine ;-)

