<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Dominic Sayers &#187; rfc5322</title>
	<atom:link href="http://blog.dominicsayers.com/tag/rfc5322/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.dominicsayers.com</link>
	<description></description>
	<lastBuildDate>Tue, 31 Aug 2010 09:12:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<atom:link rel='hub' href='http://blog.dominicsayers.com/?pushpress=hub'/>
		<item>
		<title>What does a double colon mean in IPv6 addresses?</title>
		<link>http://blog.dominicsayers.com/2010/08/24/what-does-a-double-colon-mean-in-ipv6-addresses/</link>
		<comments>http://blog.dominicsayers.com/2010/08/24/what-does-a-double-colon-mean-in-ipv6-addresses/#comments</comments>
		<pubDate>Tue, 24 Aug 2010 11:42:46 +0000</pubDate>
		<dc:creator>Dominic</dc:creator>
				<category><![CDATA[Email address validation]]></category>
		<category><![CDATA[address]]></category>
		<category><![CDATA[email]]></category>
		<category><![CDATA[ietf]]></category>
		<category><![CDATA[rfc2821]]></category>
		<category><![CDATA[rfc2822]]></category>
		<category><![CDATA[rfc3696]]></category>
		<category><![CDATA[rfc4291]]></category>
		<category><![CDATA[rfc5322]]></category>
		<category><![CDATA[rfc5952]]></category>
		<category><![CDATA[rfc822]]></category>
		<category><![CDATA[smtp]]></category>
		<category><![CDATA[validation]]></category>

		<guid isPermaLink="false">http://blog.dominicsayers.com/?p=521</guid>
		<description><![CDATA[My conclusion is this: my own validator is_email() will accept as valid any address that conforms to RFC 4291 (even though that is contradicted by RFC 5321). It will raise a warning if the double colon elides only one zero group.]]></description>
			<content:encoded><![CDATA[<h4>Quick links: <a href="http://code.google.com/p/isemail/source/browse/#svn/trunk" target="_blank">Source code</a> | <a href="http://www.dominicsayers.com/isemail/" target="_blank">Email address validators head-to-head</a></h4>
<p>This is a post for IPv6 geeks and people who care about email address validation. That&#8217;s probably not you, so I&#8217;m warning you that it gets a bit nerdy below. YHBW.</p>
<p>People keep saying the IPv4 address space is going to run out Real Soon Now but it&#8217;s still the protocol you are using right now to connect to the internet. It&#8217;s still working. When I was at school many years ago, I was told that oil would probably run out before the year 2000. People who believed this started investing in Alternative Energy such as solar power, wind power and, least successful of all, wave power. Most of the early investors lost their money, I guess.</p>
<p>The Alternative Energy of the internet is IPv6. This is the solution that people designed when they first thought the IPv4 address space was in danger of running out. It&#8217;s still a minority sport even though your Windows PC has it installed and running. It&#8217;s talking this language to nobody though. Even if you have a few computers at your house, the router you use to network them together is still only talking IPv4.</p>
<p>IPv6 is there and it&#8217;s real. One day we might start using it. Until then it remains a laboratory curiosity.</p>
<p>But it&#8217;s a valid part of an email address. So if you want to validate somebody&#8217;s email address in your registration form you shouldn&#8217;t go rejecting <em>jon.postel@[IPv6:1234::cdef]</em> just because it doesn&#8217;t match the usual <em>first.last@domain.com</em> format. It&#8217;s a valid address. Check out <a href="http://tools.ietf.org/html/rfc5321" target="_blank">RFC 5321</a> if you don&#8217;t believe me.</p>
<p>OK, let&#8217;s assume you actually clicked that link and read the RFC (I won&#8217;t tell if you don&#8217;t).</p>
<p>Now tell me whether this is a valid address: <em>jon.postel@[IPv6:1111:2222:3333:4444:5555::7777:8888]</em></p>
<p>The answer according to the bible of SMTP is <strong>no</strong>. I quote the comments to the definition of ipv6-comp: &#8220;<span style="font-family: Consolas, Monaco, 'Courier New', Courier, monospace; line-height: 18px; font-size: 12px;">The &#8220;::&#8221; represents at least 2 16-bit groups of zeros.  No more than 6 groups in addition to the &#8220;::&#8221; may be present.</span>&#8221;</p>
<p>But let&#8217;s look at the bible of <strong>IPv6</strong>, <a href="http://tools.ietf.org/html/rfc4291" target="_blank">RFC 4291</a>: &#8220;<span style="font-family: Consolas, Monaco, 'Courier New', Courier, monospace; line-height: 18px; font-size: 12px;">The use of &#8220;::&#8221; indicates one or more groups of 16 bits of zeros.</span>&#8221;</p>
<p>So RFC 4291 appears to disagree with RFC 5321. Thanks, IETF. Which should we use as our authority when validating email addresses? Perhaps RFC 5321 is documenting a <em>special case</em> of IPv6 that only applies to SMTP transactions. Hmmm.</p>
<p>Fortunately just when we feel like banging John Klensin&#8217;s head against RFC 4291 or (frankly) anything solid, along come Seiichi Kawamura and Masanobu Kawashima to our rescue. The brand-new <a href="http://tools.ietf.org/html/rfc5952" target="_blank">RFC 5952</a> gives us clear guidelines about the use of the double colon in IPv6 addresses:</p>
<blockquote><p>
<em>The symbol &#8220;::&#8221; MUST NOT be used to shorten just one 16-bit 0 field.</em>
</p></blockquote>
<p>Phew. We have clarity at last.</p>
<p>Or do we?</p>
<p>Remember Jon Postel&#8217;s <a href="http://en.wikipedia.org/wiki/Robustness_principle" target="_blank">Robustness Principle</a>? &#8220;Be conservative in what you do, be liberal in what you accept from others&#8221;. How might we apply that here? RFC 5952 still accepts the authority of RFC 4291. It is a <em>recommendation</em> for how IPv6 addresses might be standardised when written as text. The robustness principle would suggest we should ensure our own addresses conform to RFC 5952, but we should accept any addresses that conform to RFC 4291.</p>
<p>My conclusion is this: my own validator <em><a href="http://code.google.com/p/isemail/source/browse/trunk" target="_blank">is_email()</a></em> will accept as valid any address that conforms to RFC 4291 (even though that is contradicted by RFC 5321). It will raise a warning if the double colon elides only one zero group.</p>
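<p>To make that rule concrete, here is a minimal sketch in Python (not the PHP of <em>is_email()</em> itself). The function name <em>check_double_colon</em> and its three return values are assumptions made up for this illustration. It looks only at the bare IPv6 text, without the <em>IPv6:</em> tag or square brackets, and it ignores the form that ends in a dotted IPv4 address.</p>

```python
import re

HEX_GROUP = re.compile(r"[0-9A-Fa-f]{1,4}")


def check_double_colon(address):
    """Classify a bare IPv6 address by how it uses "::".

    Returns "invalid", "warning" (valid under RFC 4291, but "::" stands
    for a single zero group, which RFC 5952 forbids) or "ok".
    """
    if address.count("::") > 1:
        return "invalid"  # "::" may appear only once

    if "::" in address:
        left, right = address.split("::")
        # An empty side of "::" contributes no groups.
        groups = (left.split(":") if left else []) + (right.split(":") if right else [])
        if len(groups) > 7:
            return "invalid"  # "::" must stand for at least one group
        elided = 8 - len(groups)
    else:
        groups = address.split(":")
        if len(groups) != 8:
            return "invalid"
        elided = 0

    # Every written group must be one to four hex digits.
    if not all(HEX_GROUP.fullmatch(group) for group in groups):
        return "invalid"
    return "warning" if elided == 1 else "ok"
```

<p>Under these assumptions, <em>1111:2222:3333:4444:5555::7777:8888</em> comes back as a warning rather than a failure, while <em>1234::cdef</em> is simply accepted.</p>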
<p>As a final personal note, I would say that an address of the format <em>::1111:2222:3333:4444:5555:6666:7777</em> is nonsense. It&#8217;s valid according to RFC 4291 but it contains 8 colons. That&#8217;s just silly. I think it&#8217;s clear that the only sensible use of the double colon is to elide two or more zero groups and I certainly agree with RFC 5952 that that should be the standard.</p>
<p>Thanks to my correspondent <a href="mailto:michael@squiloople.com">Michael Rushton</a> for bringing my attention to this issue.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dominicsayers.com/2010/08/24/what-does-a-double-colon-mean-in-ipv6-addresses/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Email validation version 2.1</title>
		<link>http://blog.dominicsayers.com/2010/08/18/email-validation-version-2-1/</link>
		<comments>http://blog.dominicsayers.com/2010/08/18/email-validation-version-2-1/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 13:46:49 +0000</pubDate>
		<dc:creator>Dominic</dc:creator>
				<category><![CDATA[Email address validation]]></category>
		<category><![CDATA[address]]></category>
		<category><![CDATA[email]]></category>
		<category><![CDATA[ietf]]></category>
		<category><![CDATA[rfc2821]]></category>
		<category><![CDATA[rfc2822]]></category>
		<category><![CDATA[rfc3696]]></category>
		<category><![CDATA[rfc4291]]></category>
		<category><![CDATA[rfc5322]]></category>
		<category><![CDATA[rfc822]]></category>
		<category><![CDATA[smtp]]></category>
		<category><![CDATA[validation]]></category>

		<guid isPermaLink="false">http://blog.dominicsayers.com/?p=515</guid>
		<description><![CDATA[This has allowed me to make it a true validator - it follows the RFCs as precisely as I can make it - without losing real-world usefulness.

is_email() version 2.1 was released yesterday. Try it. Let me know if it works for you.]]></description>
			<content:encoded><![CDATA[<h4>Quick links: <a href="http://code.google.com/p/isemail/source/browse/#svn/trunk" target="_blank">Source code</a> | <a href="http://www.dominicsayers.com/isemail/" target="_blank">Email address validators head-to-head</a></h4>
<p>I&#8217;ve had a lot of correspondence about <em><a href="http://code.google.com/p/isemail/source/browse/#svn/trunk" target="_blank">is_email()</a></em>, the free PHP email address validation software that I maintain. The principal topics of debate were the edge cases where an email address is technically valid but extremely unlikely in the real world.</p>
<p>Examples of this sort of address would be <em>&#8220;&#8221;@example.com</em> or <em>benedictXIII@va</em> &#8211; the first because it doesn&#8217;t contain any text to identify the mailbox and the second because it&#8217;s at a Top Level Domain.</p>
<p>Both these addresses could exist but neither is likely to. If a user entered one of these addresses into your registration page it is much more likely to be a typo than a real address.</p>
<p>So in the first versions of <em>is_email()</em> I made the decision to call these addresses invalid because they were unlikely. It was this decision that generated most of the correspondence.</p>
<p>My learned correspondents were right. The purpose of <em>is_email()</em> is to determine whether an address is valid or not. It should not be rejecting valid addresses &#8211; this is the most common fault of other ways of validating email addresses.</p>
<p>But I wanted to identify unlikely addresses without declaring them invalid. For this reason I added a Warning feature to <em>is_email()</em>. Without losing any backward compatibility, I have enabled it to return a diagnostic code that identifies either the fault (if it&#8217;s invalid) or the reason it&#8217;s unlikely to be a real address (despite being valid).</p>
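<p>The idea of one diagnostic code that covers both faults and warnings can be sketched roughly as follows. This is Python rather than PHP, and the code names, the numeric values and the functions <em>diagnose</em> and <em>is_valid</em> are all hypothetical: they are not the codes that <em>is_email()</em> actually returns, and the checks are deliberately crude rather than a real RFC parse.</p>

```python
from enum import IntEnum


class Diagnosis(IntEnum):
    # Hypothetical codes for illustration only.
    VALID = 0
    WARN_EMPTY_QUOTED_LOCAL_PART = 1  # e.g. ""@example.com
    WARN_TOP_LEVEL_DOMAIN = 2         # e.g. benedictXIII@va
    ERR_NO_AT_SIGN = 100
    ERR_EMPTY_LOCAL_PART = 101
    ERR_EMPTY_DOMAIN = 102


def diagnose(address):
    """Return a code naming the fault, or the reason the address is unlikely."""
    local, at_sign, domain = address.rpartition("@")
    if not at_sign:
        return Diagnosis.ERR_NO_AT_SIGN
    if not local:
        return Diagnosis.ERR_EMPTY_LOCAL_PART
    if not domain:
        return Diagnosis.ERR_EMPTY_DOMAIN
    if local == '""':
        return Diagnosis.WARN_EMPTY_QUOTED_LOCAL_PART
    if "." not in domain and not domain.startswith("["):
        return Diagnosis.WARN_TOP_LEVEL_DOMAIN
    return Diagnosis.VALID


def is_valid(address):
    """Yes/no answer for older callers: a warning still counts as valid."""
    return not diagnose(address).name.startswith("ERR_")
```

<p>A caller that only wants yes or no keeps using the boolean wrapper, which is one way the backward compatibility could be preserved. A registration form that wants to catch probable typos can look at the code instead.</p>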
<p>This has allowed me to make it a true validator &#8211; it follows the RFCs as precisely as I can make it &#8211; without losing real-world usefulness.</p>
<p><a href="http://code.google.com/p/isemail/source/browse/#svn/trunk" target="_blank"><em>is_email()</em> version 2.1</a> was released yesterday. Try it. Let me know if it works for you.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dominicsayers.com/2010/08/18/email-validation-version-2-1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Comments in email addresses</title>
		<link>http://blog.dominicsayers.com/2009/02/27/comments-in-email-addresses/</link>
		<comments>http://blog.dominicsayers.com/2009/02/27/comments-in-email-addresses/#comments</comments>
		<pubDate>Fri, 27 Feb 2009 10:28:17 +0000</pubDate>
		<dc:creator>Dominic</dc:creator>
				<category><![CDATA[Email address validation]]></category>
		<category><![CDATA[Relevant to my work]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[address]]></category>
		<category><![CDATA[cal henderson]]></category>
		<category><![CDATA[comments]]></category>
		<category><![CDATA[email]]></category>
		<category><![CDATA[ietf]]></category>
		<category><![CDATA[paul gregg]]></category>
		<category><![CDATA[rfc2821]]></category>
		<category><![CDATA[rfc2822]]></category>
		<category><![CDATA[rfc3696]]></category>
		<category><![CDATA[rfc5322]]></category>
		<category><![CDATA[rfc822]]></category>
		<category><![CDATA[smtp]]></category>
		<category><![CDATA[validation]]></category>

		<guid isPermaLink="false">http://blog.dominicsayers.com/?p=408</guid>
		<description><![CDATA[I was turning a blind eye to the part of RFC5322 that allows you to put comments within an email address. But Cal H brought it up in an email so I had to bite the bullet.]]></description>
			<content:encoded><![CDATA[<h4>Quick links: <a href="http://code.google.com/p/isemail/source/browse/#svn/trunk" target="_blank">Source code</a> | <a href="http://www.dominicsayers.com/isemail/" target="_blank">Email address validators head-to-head</a></h4>
<p>I was turning a blind eye to the part of <a href="http://tools.ietf.org/html/rfc5322" target="_blank">RFC5322</a> that allows you to put comments <em>within</em> an email address. But <a href="http://www.iamcal.com/help/cal/" target="_blank">Cal H</a> brought it up in an email so I had to bite the bullet.</p>
<p>On reflection I think this was worthwhile. The most common error in email address validators is that they reject valid addresses. This really annoys people who like to put a &#8216;+&#8217; in their address and find they can&#8217;t because the registration form won&#8217;t allow it.</p>
<p>Why do they like putting a &#8216;+&#8217; in their address? Well, it effectively tags the incoming email for you automatically. Mail sent to <em>first.last+hello@example.com</em> will go to the <em>first.last</em> mailbox, tagged with &#8216;hello&#8217;. Gmail will do this for you &#8211; try it.</p>
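<p>As a rough illustration of what the receiving side might do with such a tag, here is a small Python sketch. The function name <em>split_tag</em> is made up for this example. Treating &#8216;+&#8217; as a tag separator is a convention of the mail provider, not something the RFCs define, and the sketch assumes a plain unquoted local part.</p>

```python
def split_tag(address):
    """Split 'mailbox+tag@domain' into (mailbox, tag, domain).

    The tag is an empty string when the local part has no '+'.
    """
    local, _, domain = address.rpartition("@")
    mailbox, _, tag = local.partition("+")
    return mailbox, tag, domain
```

<p>Called with <em>first.last+hello@example.com</em>, it yields the mailbox <em>first.last</em>, the tag <em>hello</em> and the domain <em>example.com</em>.</p>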
<p>So that&#8217;s why I think it&#8217;s worth allowing comments. The next Gmail might be able to do the same thing or something even more useful with comments:</p>
<p><em>first.last(notify IM)@example.com</em></p>
<p>Version 1.6 of my validator now passes all 222 unit tests. So does Cal&#8217;s. I see no reason why you wouldn&#8217;t use one of these in your project: they are free and they work. Why reinvent the wheel?</p>
<h4>RFC nerd notes</h4>
<p>Comments can contain folding white space and can be nested. This is the final nail in the coffin for regular expressions that claim to validate email addresses. Show me a regex that says this is a valid address:</p>
<p><em>first(Welcome to<br />
 the (&#8220;wonderful&#8221; (!)) world<br />
 of email)@example.com</em></p>
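<p>A small hand-written scanner copes with the nesting by keeping a depth counter. The Python sketch below is only an illustration under stated assumptions: the function name <em>strip_comments</em> is made up, it is not how <em>is_email()</em> is written, and it ignores quoted strings outside comments, where a parenthesis would be an ordinary character.</p>

```python
def strip_comments(text):
    """Remove RFC 5322 style comments, which may nest, from text."""
    result = []
    depth = 0        # how many comments we are currently inside
    escaped = False  # True right after a backslash (a quoted-pair)
    for char in text:
        if escaped:
            # The character after a backslash is literal, even "(" or ")".
            if depth == 0:
                result.append(char)
            escaped = False
        elif char == "\\":
            escaped = True
            if depth == 0:
                result.append(char)
        elif char == "(":
            depth += 1
        elif char == ")":
            if depth == 0:
                raise ValueError("unbalanced closing parenthesis")
            depth -= 1
        elif depth == 0:
            result.append(char)
    if depth != 0:
        raise ValueError("unclosed comment")
    return "".join(result)
```

<p>Everything inside the outermost parentheses is dropped, line breaks included, so the three-line address above reduces to <em>first@example.com</em>.</p>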
<p>A thank-you also to Paul Gregg who allowed me to add his validator to the head-to-head (and added mine to <a href="http://www.pgregg.com/projects/php/code/showvalidemail.php" target="_blank">his page</a>). He also provided some more unit tests.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dominicsayers.com/2009/02/27/comments-in-email-addresses/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>light@the.end.of.the.tunnel</title>
		<link>http://blog.dominicsayers.com/2009/02/24/lighttheendofthetunnel/</link>
		<comments>http://blog.dominicsayers.com/2009/02/24/lighttheendofthetunnel/#comments</comments>
		<pubDate>Tue, 24 Feb 2009 12:10:29 +0000</pubDate>
		<dc:creator>Dominic</dc:creator>
				<category><![CDATA[Email address validation]]></category>
		<category><![CDATA[Relevant to my work]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[cal henderson]]></category>
		<category><![CDATA[ietf]]></category>
		<category><![CDATA[rfc2821]]></category>
		<category><![CDATA[rfc2822]]></category>
		<category><![CDATA[rfc5322]]></category>
		<category><![CDATA[rfc822]]></category>
		<category><![CDATA[smtp]]></category>

		<guid isPermaLink="false">http://blog.dominicsayers.com/?p=406</guid>
		<description><![CDATA[I've released version 1.3 of my validator and you can download it with the test cases from Google Code.]]></description>
			<content:encoded><![CDATA[<h4>Quick links: <a href="http://code.google.com/p/isemail/source/browse/#svn/trunk" target="_blank">Source code</a> | <a href="http://www.dominicsayers.com/isemail/" target="_blank">Email address validators head-to-head</a></h4>
<p>The amazingly helpful Cal Henderson has nearly caught up in the arms race that is email address validation. After our latest round of discussions we disagree about only one of the 161 test cases in the test suite. Cal&#8217;s validator successfully validates all the test cases except that one (and he may be right about it &#8211; we&#8217;ll see).</p>
<p>I&#8217;ve released version 1.3 of my validator and you can <a href="http://code.google.com/p/isemail/source/browse/trunk" target="_blank">download</a> it with the test cases from Google Code.</p>
<h4>RFC nerd notes</h4>
<p>I had a lesson from Cal on Folding White Space. Who knew that you could have an email address that was split over several lines? Please don&#8217;t run out and try this &#8211; it&#8217;s strictly for completeness.</p>
<p>Secondly, all those validators out there that use <a href="http://tools.ietf.org/html/rfc2822" target="_blank">RFC2822</a> as their authority: those are <em>yesterday&#8217;s</em> validators. All the cool kids are validating using <a href="http://tools.ietf.org/html/rfc5322" target="_blank">RFC5322</a>. It&#8217;s the latest thing.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dominicsayers.com/2009/02/24/lighttheendofthetunnel/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
