Do you know how email address formats work?

Hello there,

If you’ve ventured here, it’s because you think you know everything about email addresses. But are you sure?

In this article, we’re going to talk about the syntax of email addresses, and we’ll try to define what is valid and what is not.

Let’s start with the simplest. For you, an email address is simply:

uppercase or lowercase characters with a dash or a dot @ domain name . tld? Well, no!

Many people settle for this definition, but it’s wrong!

I’m sure that, as the good pentester, CTF player, or infosec addict you are, you already know the “+” trick. You know, the one that lets you see which crappy site leaked your email address by adding a tag after a “+”!

For all the examples I’m going to cite in this article, let’s imagine that I’ve created a unique mailbox:

[email protected]

Example: I plan to create an account on Wish, so I’ll use a specific email (throwaway email) to see if Wish sells my information. But I don’t want to create a new email address every time! So, I use a tag:

[email protected]

Breaking it down: “sicalabe” is the name of my mailbox, the “+” sign introduces a tag, and “wish” is the tag itself. Together they form the “local-part” of the address (the term used by RFC 822 from 1982), and gmail.com is the domain.

If we send an email to [email protected], it will be delivered to the [email protected] inbox, but the recipient address given in the SMTP “RCPT TO:” command (part of the envelope, not a message header) will still be [email protected]!
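Here is a minimal sketch of how a receiving system might separate the tag from the mailbox name. It assumes everything after the first “+” is the tag (Gmail's behaviour; other providers may differ), and the function name `split_plus_tag` is hypothetical:

```python
def split_plus_tag(address: str) -> tuple[str, str, str]:
    """Return (mailbox, tag, domain) for a simple, unquoted address."""
    # Assumption: the last "@" separates local-part and domain, which holds
    # for ordinary unquoted addresses like the ones in this section.
    local, _, domain = address.rpartition("@")
    # Assumption: everything after the first "+" is the tag.
    mailbox, _, tag = local.partition("+")
    return mailbox, tag, domain


print(split_plus_tag("sicalabe+wish@gmail.com"))
# -> ('sicalabe', 'wish', 'gmail.com')
```

An address without a tag simply comes back with an empty string in the middle.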

Well, that’s where my personal knowledge of email addresses used to end.

But it’s important to know that an email address is much (much) more complex than that. Some perfectly valid email syntaxes pose a real potential threat to a system. It’s also worth noting that the “local-part” of an address is limited to 64 characters.
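As a quick sketch of that limit: RFC 5321 actually expresses it as 64 octets, so the check below measures the UTF-8 encoded length rather than the number of characters. The helper name `local_part_within_limit` is made up for this example, and it only handles simple addresses where the last “@” is the separator:

```python
MAX_LOCAL_PART_OCTETS = 64  # limit stated in RFC 5321


def local_part_within_limit(address: str) -> bool:
    """Check the local-part length of a simple address."""
    local, sep, _domain = address.rpartition("@")
    if not sep:
        raise ValueError("no '@' found")
    # Measured in UTF-8 octets, so a non-ASCII character counts for more than one.
    return len(local.encode("utf-8")) <= MAX_LOCAL_PART_OCTETS


print(local_part_within_limit("a" * 64 + "@gmail.com"))  # True
print(local_part_within_limit("a" * 65 + "@gmail.com"))  # False
```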

For the rest of the document, I will rely on RFCs 5322, 5321, and 822, as well as the Wikipedia page dedicated to email addresses.

The comments

Did you know that an email address can include comments?

Here are examples of valid emails:

(comment)[email protected]

sicalabe(comment)@gmail.com

These addresses are automatically transformed to [email protected] when sending an email, but if you register an account sicalabe(site)@gmail.com, in the majority of cases you will need to use the email WITH the comment to log in!

When is this useful? You can create a unique email address for each site you sign up to, with a comment that names the site. This way, you can easily identify the site that leaked your email address!
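To make the “transformed when sending” part concrete, here is a rough sketch of comment stripping. It is my own simplified scanner, not a full RFC 5322 parser: it removes parenthesised comments (including nested ones) and leaves quoted strings alone. The name `strip_comments` is hypothetical:

```python
def strip_comments(address: str) -> str:
    """Remove (comments) from an address, leaving "quoted strings" untouched."""
    out = []
    depth = 0          # nesting level of parentheses
    in_quotes = False  # inside a quoted string, parentheses are literal
    escaped = False    # previous character was a backslash
    for ch in address:
        if escaped:
            # A backslash-escaped character is never a delimiter.
            if depth == 0:
                out.append(ch)
            escaped = False
        elif ch == "\\":
            if depth == 0:
                out.append(ch)
            escaped = True
        elif in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif depth == 0 and ch == '"':
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


print(strip_comments("sicalabe(site)@gmail.com"))  # sicalabe@gmail.com
```

A site that stores the address as typed but compares it byte for byte at login is exactly why you end up needing the comment to sign in.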

The IPs

Did you know that the domain of an email address can also be an IP address?

The following examples are all theoretically valid:

[email protected]

sicalabe@[127.0.0.1]

sicalabe@[IPv6:2001:db8::1]
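The bracketed forms can be checked with Python's standard `ipaddress` module. This is only a sketch: the function name `parse_domain_literal` is hypothetical, and it assumes the two literal shapes shown above (a bare IPv4 address, or an address prefixed with `IPv6:`):

```python
import ipaddress


def parse_domain_literal(domain: str):
    """Return an ipaddress object for a bracketed literal, or None for a hostname."""
    if not (domain.startswith("[") and domain.endswith("]")):
        return None
    inner = domain[1:-1]
    if inner.lower().startswith("ipv6:"):
        # Raises ValueError if the address part is malformed.
        return ipaddress.IPv6Address(inner[5:])
    return ipaddress.IPv4Address(inner)


print(parse_domain_literal("[127.0.0.1]"))         # 127.0.0.1
print(parse_domain_literal("[IPv6:2001:db8::1]"))  # 2001:db8::1
print(parse_domain_literal("gmail.com"))           # None
```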

It’s worth noting that some domains are reserved so that they never resolve to a real mail server, which makes addresses under them safe to use as examples:

[email protected]

[email protected]|net|org

[email protected]
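For reference, RFC 2606 reserves the second-level names example.com, example.net, and example.org, plus the top-level names .test, .example, .invalid, and .localhost. A small sketch of a check against that list (the function name `is_reserved_domain` is made up for this example):

```python
# Names reserved by RFC 2606 for documentation and testing.
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}
RESERVED_SECOND_LEVEL = {"example.com", "example.net", "example.org"}


def is_reserved_domain(domain: str) -> bool:
    """True if the domain falls under a name reserved by RFC 2606."""
    name = domain.lower().rstrip(".")
    labels = name.split(".")
    if labels[-1] in RESERVED_TLDS:
        return True
    # Subdomains such as mail.example.org are covered too.
    return ".".join(labels[-2:]) in RESERVED_SECOND_LEVEL


print(is_reserved_domain("example.com"))  # True
print(is_reserved_domain("gmail.com"))    # False
```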

Lastly, just like with the local-part, comments in the domain name are theoretically allowed (but rarely accepted by websites, because in most cases, the site will attempt a DNS resolution of the domain before validating the email). At Google, it’s not allowed anyway:

[Screenshot: comment denied on Gmail]

Nor at HackerOne:

[Screenshot: HackerOne registration with a comment]

Special Characters

Now we get to the part that inspired me to write this article. An email address can actually include a rather alarming array of special characters. Here is the complete list:

! $ & * - = ^ ` | ~ # % ' + / ? _ { } .

But also, under certain circumstances (detailed in the Misc section):

@ " <space> < > : ; , 

All these special characters are totally valid in email addresses, which results in some really interesting email-payloads. Here’s one that I use quite often:

sicalabe+${7*7}{{7*7}}`id`|'or''='@gmail.com

This email address can test for SSTI, code injection, and SQLi all at once! Crazy, right?

It’s clear that some sites will apply email filters (often in JavaScript), but with this payload, you provide a completely valid email in the eyes of the SMTP server to receive your little test emails, while also testing the robustness of the front & back end code of the application.
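If you want to convince yourself that the payload really is a valid unquoted local-part, here is a small sketch of the RFC 5322 “dot-atom” rule: letters, digits, the special characters from the first list, and dots that neither lead, trail, nor repeat. The names `ATEXT` and `is_dot_atom` are mine, and the 64 limit is the local-part limit mentioned earlier:

```python
import string

# Special characters allowed without quoting (the first list above, minus the dot).
SPECIALS = "!#$%&'*+-/=?^_`{|}~"
ATEXT = set(string.ascii_letters + string.digits + SPECIALS)


def is_dot_atom(local: str) -> bool:
    """True if local is a valid unquoted ASCII local-part."""
    if not local or len(local) > 64:
        return False
    # Dots may separate atoms but cannot lead, trail, or appear twice in a row.
    atoms = local.split(".")
    return all(atom and set(atom) <= ATEXT for atom in atoms)


payload = "sicalabe+${7*7}{{7*7}}`id`|'or''='"
print(is_dot_atom(payload))      # True
print(is_dot_atom("sica labe"))  # False: a space needs quoting
```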

Misc

In this final section, we’ll discuss the syntaxes that drive me crazy.

As in many other syntaxes, wrapping a string of characters in double quotes lets it contain characters that would otherwise be forbidden. Email addresses are no exception.

Examples of emails (with descriptions if necessary):

Email Addresses and Their SMTP Equivalence

According to RFCs 6530, 6531, and 6532, these addresses are also valid:

  • fromagère.pelé[email protected] -> remember that in the 1982 RFC only the characters a-z and A-Z are allowed for letters
  • δοκιμή@παράδειγμα.δοκιμή
  • 我買@屋企.香港
  • संपर्क@डाटामेल.भारत
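A short sketch of what these addresses mean in practice: a non-ASCII domain can be converted to an ASCII “xn--” form, while a non-ASCII local-part requires the server to support the SMTPUTF8 extension from RFC 6531. The helper names are hypothetical, and note that Python's built-in `idna` codec follows the older IDNA 2003 rules:

```python
def to_ascii_domain(domain: str) -> str:
    """Convert an internationalized domain to its ASCII (punycode) form."""
    # Assumption: the built-in "idna" codec (IDNA 2003) is good enough here.
    return domain.encode("idna").decode("ascii")


def needs_smtputf8(address: str) -> bool:
    """True if the local-part contains non-ASCII characters."""
    local, _, _domain = address.rpartition("@")
    return not local.isascii()


print(to_ascii_domain("παράδειγμα.δοκιμή"))
print(needs_smtputf8("δοκιμή@παράδειγμα.δοκιμή"))  # True
print(needs_smtputf8("sicalabe@gmail.com"))         # False
```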

Feeling adventurous?

  • "sica.(),:;<>[]\".labe.\"sica@\ \"labe\".sicalabe"@gmail.com
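An address like this one breaks any code that splits on the first “@”. Here is a rough sketch of a splitter that respects double quotes and backslash escapes. It is my own simplified scanner (it ignores comments, for instance), and the name `split_address` is hypothetical:

```python
def split_address(address: str) -> tuple[str, str]:
    """Split at the first "@" that is outside a quoted string."""
    in_quotes = False
    escaped = False
    for i, ch in enumerate(address):
        if escaped:
            # The character after a backslash is taken literally.
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == "@" and not in_quotes:
            return address[:i], address[i + 1:]
    raise ValueError("no unquoted '@' found")


addr = r'"sica.(),:;<>[]\".labe.\"sica@\ \"labe\".sicalabe"@gmail.com'
local, domain = split_address(addr)
print(domain)  # gmail.com
```

The “@” inside the quotes is skipped, so the whole quoted string ends up in the local-part.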

There are also specific syntaxes that will be interpreted by systems:

It’s amazing how wide a realm of possibilities has just opened up before me. And you, did you learn anything new?

Cheers, Sicarius
