|
- Metadata-Version: 2.1
- Name: email-validator
- Version: 1.0.4
- Summary: A robust email syntax and deliverability validation library for Python 2.x/3.x.
- Home-page: https://github.com/JoshData/python-email-validator
- Author: Joshua Tauberer
- Author-email: jt@occams.info
- License: CC0 (copyright waived)
- Keywords: email address validator
- Platform: UNKNOWN
- Classifier: Development Status :: 5 - Production/Stable
- Classifier: License :: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
- Classifier: Intended Audience :: Developers
- Classifier: Topic :: Software Development :: Libraries :: Python Modules
- Classifier: Programming Language :: Python :: 2
- Classifier: Programming Language :: Python :: 2.7
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.4
- Classifier: Programming Language :: Python :: 3.5
- Classifier: Programming Language :: Python :: 3.6
- Requires-Dist: idna (>=2.0.0)
- Requires-Dist: dnspython (>=1.15.0)
-
- email\_validator
- ================
-
- A robust email address syntax and deliverability validation library
- for Python 2.7/3.4+ by `Joshua Tauberer <https://razor.occams.info>`__.
-
- This library validates that addresses are of the form ``x@y.com``. This is
- the sort of validation you would want for a login form on a website.
-
- Key features:
-
- * Good for validating email addresses used for logins/identity.
- * Friendly error messages when validation fails (appropriate to show to end users).
- * (optionally) Checks deliverability: Does the domain name resolve?
- * Supports internationalized domain names and (optionally) internationalized local parts.
- * Normalizes email addresses (super important for internationalized addresses! see below).
-
- The library is NOT for validating the To: line in an email message (e.g.
- ``My Name <my@address.com>``), for which `flanker <https://github.com/mailgun/flanker>`__
- is more appropriate. This library also does NOT permit obsolete
- forms of email addresses, so if you need strict validation against the
- email specs exactly, use `pyIsEmail <https://github.com/michaelherold/pyIsEmail>`__.
-
- The current version is 1.0.3 (Sept 12, 2017). The only changes since 1.0.0 (Sept 5, 2015)
- have been small bug and packaging fixes.
-
- Installation
- ------------
-
- This package is on PyPI, so:
-
- ::
-
-     pip install email_validator
-
- ``pip3`` also works.
-
- Usage
- -----
-
- If you're validating a user's email address before creating a user
- account, you might do this:
-
- ::
-
-     from email_validator import validate_email, EmailNotValidError
-
-     email = "my+address@mydomain.tld"
-
-     try:
-         v = validate_email(email)  # validate and get info
-         email = v["email"]  # replace with normalized form
-     except EmailNotValidError as e:
-         # email is not valid, exception message is human-readable
-         print(str(e))
-
- This validates the address and gives you its normalized form. You should
- put the normalized form in your database and always normalize before
- checking if an address is in your database.
-
- The validator will accept internationalized email addresses, but email
- addresses with non-ASCII characters in the *local* part of the address
- (before the @-sign) require the `SMTPUTF8 <https://tools.ietf.org/html/rfc6531>`__
- extension which may not be supported by your mail submission library or
- your outbound mail server. If you know ahead of time that SMTPUTF8 is
- not supported then **add the keyword argument allow_smtputf8=False
- to fail validation for addresses that would require SMTPUTF8**:
-
- ::
-
-     validate_email(email, allow_smtputf8=False)
-
- Overview
- --------
-
- The module provides a function ``validate_email(email_address)`` which takes
- an email address (either a ``str`` or ASCII ``bytes``) and:
-
- Raises an ``EmailNotValidError`` with a helpful, human-readable error
- message explaining why the email address is not valid, or
-
- - Returns a dict with information about the deliverability of the email
- address.
-
- When an email address is not valid, ``validate_email`` raises either an
- ``EmailSyntaxError`` if the form of the address is invalid or an
- ``EmailUndeliverableError`` if the domain name does not resolve. Both
- exception classes are subclasses of ``EmailNotValidError``, which in
- turn is a subclass of ``ValueError``.
-
- But when an email address is valid, a dict is returned containing
- information that might aid deliverability (see below).
-
- The validator doesn't permit obsoleted forms of email addresses that no one
- uses anymore even though they are still valid and deliverable, since they
- will probably give you grief if you're using email for login. (See later in the
- document about that.)
-
- The validator checks that the domain name in the email address resolves.
- There is nothing to be gained by trying to actually contact an SMTP
- server, so that's not done here. For privacy, security, and practical
- reasons, servers avoid revealing whether an address is
- deliverable: email addresses that appear to accept mail at first
- can bounce mail after a delay, and bounced mail may indicate a temporary
- failure of a good email address (sometimes an intentional failure, like
- greylisting).
-
- The function also accepts the following keyword arguments (default as
- shown):
-
- ``allow_smtputf8=True``
-     Set to ``False`` to prohibit internationalized
-     addresses that would require the `SMTPUTF8 <https://tools.ietf.org/html/rfc6531>`__
-     extension.
-
- ``check_deliverability=True``
-     Set to ``False`` to skip the domain name resolution check.
-
- ``allow_empty_local=False``
-     Set to ``True`` to allow an empty local
-     part (i.e. ``@example.com``), e.g. for validating Postfix aliases.
-
- Internationalized email addresses
- ---------------------------------
-
- The email protocol SMTP and the domain name system DNS have historically
- only allowed ASCII characters in email addresses and domain names,
- respectively. Each has adapted to internationalization in a separate
- way, creating two separate aspects to email address
- internationalization.
-
- Internationalized domain names (IDN)
- ''''''''''''''''''''''''''''''''''''
-
- The first is `internationalized domain names (RFC
- 5891) <https://tools.ietf.org/html/rfc5891>`__, a.k.a. IDNA 2008. The DNS has not
- been updated with Unicode support. Instead, internationalized domain
- names are converted into a special IDNA ASCII form starting with
- ``xn--``. When an email address has non-ASCII characters in its domain
- part, the domain part is replaced with its IDNA ASCII equivalent form
- in the process of mail transmission. Your mail submission library probably
- does this for you transparently. Note that most web browsers are currently
- in transition between IDNA 2003 (RFC 3490) and IDNA 2008 (RFC 5891) and
- `compliance around the web is not very good <http://archives.miloush.net/michkap/archive/2012/02/27/10273315.html>`__
- in any case, so be aware that edge cases are handled differently by different
- applications and libraries. This library conforms to IDNA 2008 using the
- `idna <https://github.com/kjd/idna>`__ module by Kim Davies.
-
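As a rough illustration of the ``xn--`` form described above, the sketch below
uses Python's built-in ``idna`` codec. Note the assumptions: that codec
implements the older IDNA 2003 rules, not the IDNA 2008 rules this library
follows through the ``idna`` module, so edge cases can differ. The helper names
``to_idna_ascii`` and ``from_idna_ascii`` are made up for this example and are
not part of this library.

```python
def to_idna_ascii(domain):
    """Convert a domain to its IDNA ASCII form (stdlib codec, IDNA 2003)."""
    # str.encode("idna") processes each dot-separated label and returns bytes.
    return domain.encode("idna").decode("ascii")


def from_idna_ascii(ascii_domain):
    """Convert an IDNA ASCII domain back to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")


ascii_form = to_idna_ascii("良好Mail.中国")
print(ascii_form)                   # every non-ASCII label now starts with xn--
print(from_idna_ascii(ascii_form))  # the Unicode form, lowercased
```

A plain ASCII domain such as ``example.org`` passes through unchanged.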
- Internationalized local parts
- '''''''''''''''''''''''''''''
-
- The second sort of internationalization is internationalization in the
- *local* part of the address (before the @-sign). These email addresses
- require that your mail submission library and the mail servers along the
- route to the destination, including your own outbound mail server, all
- support the `SMTPUTF8 (RFC
- 6531) <https://tools.ietf.org/html/rfc6531>`__ extension. Support for
- SMTPUTF8 varies.
-
- How this module works
- '''''''''''''''''''''
-
- By default all internationalized forms are accepted by the validator.
- But if you know ahead of time that SMTPUTF8 is not supported by your
- mail submission stack, then you must filter out addresses that require
- SMTPUTF8 using the ``allow_smtputf8=False`` keyword argument (see
- above). This will cause the validation function to raise an
- ``EmailSyntaxError`` if delivery would require SMTPUTF8, which is
- the case only when non-ASCII characters appear before the @-sign.
- If you do not set ``allow_smtputf8=False``, you can also check the
- value of the ``smtputf8`` field in the returned dict.
-
- If your mail submission library doesn't support Unicode at all --- even
- in the domain part of the address --- then immediately prior to mail
- submission you must replace the email address with its ASCII-ized
- form. This library gives you back the ASCII-ized form in the
- ``email_ascii`` field in the returned dict, which you can get like this:
-
- ::
-
-     v = validate_email(email, allow_smtputf8=False)
-     email = v['email_ascii']
-
- The local part is left alone (if it has internationalized characters,
- ``allow_smtputf8=False`` will force validation to fail) and the domain
- part is converted to `IDNA
- ASCII <https://tools.ietf.org/html/rfc5891>`__. (You probably should not
- do this at account creation time, so that you don't change the user's login
- information without telling them.)
-
- UCS-4 support required for Python 2.7
- '''''''''''''''''''''''''''''''''''''
-
- Note that Python 2.7 must have been built with
- UCS-4 support (see `here <https://stackoverflow.com/questions/29109944/python-returns-length-of-2-for-single-unicode-character-string>`__); otherwise email addresses with Unicode characters outside
- of the BMP (Basic Multilingual Plane) will not validate correctly.
-
- Normalization
- -------------
-
- The use of Unicode in email addresses introduced a normalization problem.
- Different Unicode strings can look identical and have the same semantic
- meaning to the user. The ``email`` field returned on successful validation
- provides the correctly normalized form of the given email address:
-
- ::
-
-     v = validate_email(email)
-     email = v['email']
-
- Because you may get an email address in a variety of forms, you ought to replace
- it with its normalized form immediately before storing it in your database
- (during account creation), querying your database (during login), or sending
- outbound mail.
-
- The normalizations include lowercasing the domain part of the email address
- (domain names are case-insensitive); `Unicode "NFC" normalization <https://en.wikipedia.org/wiki/Unicode_equivalence>`__
- of the whole address, which turns characters plus `combining characters <https://en.wikipedia.org/wiki/Combining_character>`__
- into precomposed characters where possible and replaces certain Unicode characters
- (such as angstrom and ohm) with other equivalent code points (a-with-ring and omega,
- respectively); replacement of `fullwidth and halfwidth characters <https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms>`__
- in the domain part; and possibly other `UTS46 <http://unicode.org/reports/tr46>`__ mappings
- on the domain part.
-
- (See `RFC 6532 (internationalized email) section 3.1 <https://tools.ietf.org/html/rfc6532#section-3.1>`__
- and `RFC 5895 (IDNA 2008) section 2 <http://www.ietf.org/rfc/rfc5895.txt>`__.)
-
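The NFC step described above can be tried out with the standard library's
``unicodedata`` module. This is only a sketch of the idea: the function
``normalize_sketch`` is hypothetical, it is not this library's implementation,
and it deliberately omits the UTS46 and fullwidth/halfwidth mappings applied to
the domain part.

```python
import unicodedata


def nfc(text):
    """Apply Unicode NFC normalization."""
    return unicodedata.normalize("NFC", text)


def normalize_sketch(address):
    """Illustrative only: NFC the whole address, lowercase the domain part."""
    local, at_sign, domain = nfc(address).rpartition("@")
    return local + at_sign + domain.lower()


angstrom_sign = "\u212b"   # ANGSTROM SIGN
ohm_sign = "\u2126"        # OHM SIGN
decomposed_e = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(nfc(angstrom_sign) == "\u00c5")  # a-with-ring
print(nfc(ohm_sign) == "\u03a9")       # Greek capital omega
print(len(nfc(decomposed_e)))          # a single precomposed code point
print(normalize_sketch("Te\u0301st@EXAMPLE.org"))
```

Only the domain part is lowercased here; the case of the local part is left as given.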
- Examples
- --------
-
- For the email address ``test@example.org``, the returned dict is:
-
- ::
-
-     {
-       "email": "test@example.org",
-       "email_ascii": "test@example.org",
-       "local": "test",
-       "domain": "example.org",
-       "domain_i18n": "example.org",
-
-       "smtputf8": false,
-
-       "mx": [
-         [
-           0,
-           "93.184.216.34"
-         ]
-       ],
-       "mx-fallback": "A"
-     }
-
- For the fictitious address ``example@良好Mail.中国``, which has an
- internationalized domain but ASCII local part, the returned dict is:
-
- ::
-
-     {
-       "email": "example@良好mail.中国",
-       "email_ascii": "example@xn--mail-p86gl01s.xn--fiqs8s",
-       "local": "example",
-       "domain": "xn--mail-p86gl01s.xn--fiqs8s",
-       "domain_i18n": "良好mail.中国",
-
-       "smtputf8": false,
-
-       "mx": [
-         [
-           0,
-           "218.241.116.40"
-         ]
-       ],
-       "mx-fallback": "A"
-     }
-
- Note that ``smtputf8`` is ``False`` even though the domain part is
- internationalized because
- `SMTPUTF8 <https://tools.ietf.org/html/rfc6531>`__ is only
- needed if the local part of the address is internationalized (the domain
- part can be converted to IDNA ASCII). Also note that the ``email`` and
- ``domain_i18n`` fields provide a normalized form of the email address
- and domain name (casefolding and Unicode normalization as required by
- IDNA 2008).
-
- For the fictitious address ``树大@occams.info``, which has an
- internationalized local part, the returned dict is:
-
- ::
-
-     {
-       "email": "树大@occams.info",
-       "local": "树大",
-       "domain": "occams.info",
-       "domain_i18n": "occams.info",
-
-       "smtputf8": true,
-
-       "mx": [
-         [
-           10,
-           "box.occams.info"
-         ]
-       ],
-       "mx-fallback": false
-     }
-
- Now ``smtputf8`` is ``True`` and ``email_ascii`` is missing because the
- local part of the address is internationalized. The ``local`` and ``email``
- fields return the normalized form of the address: certain Unicode characters
- (such as angstrom and ohm) may be replaced by other equivalent code points
- (a-with-ring and omega).
-
- Return value
- ------------
-
- When an email address passes validation, the fields in the returned dict
- are:
-
- ``email``
-     The canonical form of the email address, mostly useful for
-     display purposes. This merely combines the ``local`` and
-     ``domain_i18n`` fields (see below).
-
- ``email_ascii``
-     If present, an ASCII-only form of the email address, obtained
-     by replacing the domain part with `IDNA
-     ASCII <https://tools.ietf.org/html/rfc5891>`__. This field will be
-     present when an ASCII-only form of the email address exists
-     (including if the email address is already ASCII). If the local part
-     of the email address contains internationalized characters,
-     ``email_ascii`` will not be present.
-
- ``local``
-     The local part of the given email address (before the
-     @-sign) with Unicode NFC normalization applied.
-
- ``domain``
-     The `IDNA ASCII <https://tools.ietf.org/html/rfc5891>`__-encoded form of the
-     domain part of the given email address (after the @-sign), as it
-     would be transmitted on the wire.
-
- ``domain_i18n``
-     The canonical internationalized form of
-     the domain part of the address, obtained by round-tripping through IDNA ASCII.
-     If the returned string contains non-ASCII characters, either the
-     `SMTPUTF8 <https://tools.ietf.org/html/rfc6531>`__ feature of MTAs
-     will be required to transmit the message or else the domain part of
-     the email address must be converted to IDNA ASCII first (given in the
-     returned ``domain`` field).
-
- ``smtputf8``
-     A boolean indicating that the `SMTPUTF8 <https://tools.ietf.org/html/rfc6531>`__
-     feature of MTAs will be required to transmit messages to this address because the
-     local part of the address has non-ASCII characters (the local part
-     cannot be IDNA-encoded). If ``allow_smtputf8=False`` is passed as an
-     argument, this flag will always be false because an exception is raised
-     if it would have been true.
-
- ``mx``
-     A list of ``(priority, domain)`` tuples of MX records specified
-     in the DNS for the domain (see `RFC 5321 section
-     5 <https://tools.ietf.org/html/rfc5321#section-5>`__).
-
- ``mx-fallback``
-     ``None`` if an ``MX`` record is found. If no MX
-     records are actually specified in DNS and instead are inferred,
-     through an obsolete mechanism, from A or AAAA records, the value is
-     the type of DNS record used instead (``A`` or ``AAAA``).
-
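To make the relationship between ``email``, ``email_ascii`` and ``smtputf8``
concrete, here is a small sketch that picks which form of a validated address
to pass to a mail submission library. The function
``address_for_submission`` and its ``stack_supports_smtputf8`` parameter are
hypothetical, and the two dicts are trimmed copies of the results shown under
Examples, not live output from ``validate_email``.

```python
def address_for_submission(result, stack_supports_smtputf8):
    """Pick the form of a validated address to pass to a mail library.

    ``result`` is a dict shaped like the ones shown under Examples.
    """
    # Prefer the ASCII-only form when one exists; any mail stack can carry it.
    if "email_ascii" in result:
        return result["email_ascii"]
    # No ASCII form exists, so the local part is internationalized.
    if stack_supports_smtputf8:
        return result["email"]
    raise ValueError("this address requires SMTPUTF8, which is unavailable")


# Internationalized domain, ASCII local part.
idn_result = {
    "email": "example@良好mail.中国",
    "email_ascii": "example@xn--mail-p86gl01s.xn--fiqs8s",
    "smtputf8": False,
}

# Internationalized local part: no "email_ascii" field is present.
intl_local_result = {
    "email": "树大@occams.info",
    "smtputf8": True,
}

print(address_for_submission(idn_result, stack_supports_smtputf8=False))
print(address_for_submission(intl_local_result, stack_supports_smtputf8=True))
```

Calling the helper with ``intl_local_result`` and
``stack_supports_smtputf8=False`` raises ``ValueError``; passing
``allow_smtputf8=False`` to ``validate_email`` rejects such addresses earlier,
at validation time.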
- Assumptions
- -----------
-
- By design, this validator does not pass all email addresses that
- strictly conform to the standards. Many email address forms are obsolete
- or likely to cause trouble:
-
- - The validator assumes the email address is intended to be deliverable
- on the public Internet using DNS, and so the domain part of the email
- address must be a resolvable domain name.
- - The "quoted string" form of the local part of the email address (RFC
- 5321 4.1.2) is not permitted --- no one uses this anymore anyway.
- Quoted forms allow multiple @-signs, space characters, and other
- troublesome conditions.
- - The "literal" form for the domain part of an email address (an IP
- address) is not accepted --- no one uses this anymore anyway.
-
- Testing
- -------
-
- A handful of valid email addresses are pasted in ``test_pass.txt``. Run
- them through the validator (without deliverability checks) like so:
-
- ::
-
-     python3 email_validator/__init__.py --tests < test_pass.txt
-
- For Project Maintainers
- -----------------------
-
- The package is distributed as a universal wheel. The wheel is specified as
- universal in the file ``setup.cfg`` by the ``universal = 1`` key in the
- ``[bdist_wheel]`` section. To publish a universal wheel to PyPI::
-
-     pip3 install twine
-     rm -rf dist
-     python3 setup.py bdist_wheel
-     twine upload dist/*
-     git tag v1.0.XXX
-     git push --tags
-
|